From vlad at lists.openfabrics.org  Sat Dec  1 02:50:51 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sat,  1 Dec 2007 02:50:51 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071201-0200 daily build status
Message-ID: <20071201105051.8BDD2E601B7@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.12
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.14
Passed on powerpc with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.14
Passed on x86_64 with linux-2.6.19
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.19
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.15
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ia64 with linux-2.6.12
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on ppc64 with linux-2.6.18-8.el5

Failed:
Build failed on i686 with 2.6.15-23-server
Log:
		-I/usr/local/include/scst \
		-I/home/vlad/tmp/ofa_1_3_kernel-20071201-0200_check/drivers/infiniband/ulp/srpt \
		-I/home/vlad/tmp/ofa_1_3_kernel-20071201-0200_check/drivers/net/cxgb3 \
		-Iinclude \
		$(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \
		' \
		modules
make: *** /lib/modules/2.6.15-23-server/build: No such file or directory.  Stop.
make: *** [kernel] Error 2
----------------------------------------------------------------------------------


From hrosenstock at xsigo.com  Sat Dec  1 06:12:26 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sat, 01 Dec 2007 06:12:26 -0800
Subject: [ofa-general] Two element types on cancel_list in cancel_mads()?
In-Reply-To: <474F5CA4.8030604@ichips.intel.com>
References: <474F598D.70701@opengridcomputing.com>
	<474F5CA4.8030604@ichips.intel.com>
Message-ID: <1196518346.10845.155.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-11-29 at 16:43 -0800, Sean Hefty wrote:
> > In cancel_mads, elements from two different lists are added to the 
> > cancel_list:  wait_list and local_list.  Subsequent processing of the 
> > cancel_list treats all elements as struct ib_mad_send_wr_private, and 
> > uses the send_buf field of that structure.  But it appears to me that 
> > the items from local_list are actually of type struct 
> > ib_mad_local_private, and hence the reference to send_buf for these 
> > elements is incorrect.  Can you help me understand how this works?
> 
> I was looking at the local_list handling in cancel_mads() and the rest 
> of mad code myself.  Hal knows this part of the code better than I do, 
> maybe he can look here and see if there's a definite problem.  This 
> looks like the cause of the bug Dotan just reported.

Sorry for the slow response. I've been consumed with other matters for
the last couple days.

I started investigating this and found that this change was first
introduced over 2 years ago by the following:

commit 2c153b934dca08d58e0aafde18a182e0891aa201
Author: Hal Rosenstock <halr at voltaire.com>
Date:   Wed Jul 27 11:45:31 2005 -0700

    [PATCH] IB: Eliminate MAD cache leak associated with local completions
    
    Eliminate MAD cache leak associated with local completions.  Also, when
    canceling MAD, empty local completion list as well.
    
    Signed-off-by: Hal Rosenstock <halr at voltaire.com>
    Cc: Roland Dreier <rolandd at cisco.com>
    Signed-off-by: Andrew Morton <akpm at osdl.org>
    Signed-off-by: Linus Torvalds <torvalds at osdl.org>

More later...

-- Hal

> - Sean
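
For readers following the thread, the concern raised above can be
illustrated outside the kernel. The sketch below is a user-space
illustration only, not the mad.c code: struct list_node, struct send_item,
struct local_item and both helper functions are hypothetical names invented
for this example, and the field layouts are assumptions. It shows that
recovering the enclosing object from an embedded list link only works when
the link really belongs to the assumed type.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for an intrusive list link (not the kernel's list_head). */
struct list_node {
	struct list_node *next;
};

/* Recover the enclosing object from a pointer to its embedded link. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical element type A: the link sits at offset 0. */
struct send_item {
	struct list_node node;
	int send_buf;
};

/* Hypothetical element type B: the link sits after other fields. */
struct local_item {
	long cookie[2];
	struct list_node node;
};

/* Mimics a drain loop that assumes every link belongs to a send_item. */
static struct send_item *as_send_item(struct list_node *n)
{
	assert(n != NULL);
	return container_of(n, struct send_item, node);
}

/* Returns 1 if treating the link as a send_item lands on the real object start. */
static int recovered_correctly(struct list_node *n, void *real_start)
{
	return (void *)as_send_item(n) == real_start;
}
```

Calling recovered_correctly() with the link of a real send_item yields 1,
while passing the link of a local_item yields 0: the computed pointer no
longer refers to the start of the object, so any field read through it
would be meaningless.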


From hrosenstock at xsigo.com  Sat Dec  1 06:18:04 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sat, 01 Dec 2007 06:18:04 -0800
Subject: [ofa-general] nightly osm_sim report 2007-12-01:normal completion
In-Reply-To: <MTLEXCH01Z1UbDBRArV000092be@mtlexch01.mtl.com>
References: <MTLEXCH01Z1UbDBRArV000092be@mtlexch01.mtl.com>
Message-ID: <1196518684.10845.160.camel@hrosenstock-ws.xsigo.com>

Hi Yevgeny,

On Sat, 2007-12-01 at 07:09 +0200, kliteyn at mellanox.co.il wrote:
> OSM Simulation Regression Summary
>  
> [Generated mail - please do NOT reply]
>  
> 
> OpenSM binary date = 2007-11-30
> OpenSM git rev = Thu_Nov_29_19:37:20_2007 [498e13f7145f77d468054688d8cbea61677b624a]
> ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
>  
> 
> Total=480  Pass=479  Fail=1
>  
> 
> Pass:
> 36 Stability IS1-16.topo
> 36 Pkey IS1-16.topo
> 36 OsmTest IS1-16.topo
> 36 OsmStress IS1-16.topo
> 36 Multicast IS1-16.topo
> 36 LidMgr IS1-16.topo
> 12 Stability IS3-loop.topo
> 12 Stability IS3-128.topo
> 12 Pkey IS3-128.topo
> 12 OsmTest IS3-loop.topo
> 12 OsmTest IS3-128.topo
> 12 OsmStress IS3-128.topo
> 12 Multicast IS3-loop.topo
> 12 Multicast IS3-128.topo
> 12 FatTree merge-roots-4-ary-2-tree.topo
> 12 FatTree merge-root-4-ary-3-tree.topo
> 12 FatTree gnu-stallion-64.topo
> 12 FatTree blend-4-ary-2-tree.topo
> 12 FatTree RhinoDDR.topo
> 12 FatTree FullGnu.topo
> 12 FatTree 4-ary-2-tree.topo
> 12 FatTree 2-ary-4-tree.topo
> 12 FatTree 12-node-spaced.topo
> 12 FTreeFail 4-ary-2-tree-missing-sw-link.topo
> 12 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
> 12 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
> 12 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
> 11 LidMgr IS3-128.topo
> 
> Failures:
> 1 LidMgr IS3-128.topo

We've seen similar reports on other runs too. Is this a regression tool
issue or a real failure?

Thanks.

-- Hal

> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


From hrosenstock at xsigo.com  Sat Dec  1 07:19:29 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sat, 01 Dec 2007 07:19:29 -0800
Subject: [ofa-general] [PATCH] ib/mad: fix incorrect access to items on
	local_list
In-Reply-To: <000001c8337a$cdc18e60$ff0da8c0@amr.corp.intel.com>
References: <474BE237.8050602@dev.mellanox.co.il> <aday7cjntc9.fsf@cisco.com>
	<000001c8337a$cdc18e60$ff0da8c0@amr.corp.intel.com>
Message-ID: <1196522369.10845.168.camel@hrosenstock-ws.xsigo.com>

On Fri, 2007-11-30 at 09:59 -0800, Sean Hefty wrote:
> In cancel_mads(), MADs are moved from the wait_list and local_list
> to a cancel_list for processing.  However, the structures on these two
> lists are not the same.  The wait_list references struct
> ib_mad_send_wr_private, but local_list references struct
> ib_mad_local_private.  Cancel_mads() treats all items moved to the
> cancel_list as struct ib_mad_send_wr_private.  This leads to a system
> crash when requests are moved from the local_list to the cancel_list.
> 
> Fix this by leaving local_list alone.  All requests on the local_list
> have completed and are just awaiting processing by a queued worker thread.
> 
> Bug (crash) reported by Dotan Barak <dotanb at dev.mellanox.co.il>.
> Problem with local_list access reported by Robert Reynolds
> <rreynolds at opengridcomputing.com>.
> 
> Signed-off-by: Sean Hefty <sean.hefty at intel.com>
> ---
> This patch is untested.  Dotan, can you see if this fixes the crash that
> you were seeing?
> 
>  drivers/infiniband/core/mad.c |    2 --
>  1 files changed, 0 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 91e62c3..7ef2c7c 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -2284,8 +2284,6 @@ static void cancel_mads(struct ib_mad_agent_private *mad_agent_priv)
>  
>  	/* Empty wait list to prevent receives from finding a request */
>  	list_splice_init(&mad_agent_priv->wait_list, &cancel_list);
> -	/* Empty local completion list as well */
> -	list_splice_init(&mad_agent_priv->local_list, &cancel_list);

It may fix the crash but I think this reintroduces a memory leak so I
think the real fix is a little more complicated.

-- Hal

>  	spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
>  
>  	/* Report all cancelled requests */
> 


From hrosenstock at xsigo.com  Sat Dec  1 07:23:40 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sat, 01 Dec 2007 07:23:40 -0800
Subject: [ofa-general] Two element types on cancel_list in cancel_mads()?
In-Reply-To: <1196518346.10845.155.camel@hrosenstock-ws.xsigo.com>
References: <474F598D.70701@opengridcomputing.com>
	<474F5CA4.8030604@ichips.intel.com>
	<1196518346.10845.155.camel@hrosenstock-ws.xsigo.com>
Message-ID: <1196522620.10845.173.camel@hrosenstock-ws.xsigo.com>

On Sat, 2007-12-01 at 06:12 -0800, Hal Rosenstock wrote:
> On Thu, 2007-11-29 at 16:43 -0800, Sean Hefty wrote:
> > > In cancel_mads, elements from two different lists are added to the 
> > > cancel_list:  wait_list and local_list.  Subsequent processing of the 
> > > cancel_list treats all elements as struct ib_mad_send_wr_private, and 
> > > uses the send_buf field of that structure.  But it appears to me that 
> > > the items from local_list are actually of type struct 
> > > ib_mad_local_private, and hence the reference to send_buf for these 
> > > elements is incorrect.  Can you help me understand how this works?
> > 
> > I was looking at the local_list handling in cancel_mads() and the rest 
> > of mad code myself.  Hal knows this part of the code better than I do, 
> > maybe he can look here and see if there's a definite problem.  This 
> > looks like the cause of the bug Dotan just reported.
> 
> Sorry for the slow response. I've been consumed with other matters for
> the last couple days.
> 
> I started investigating this and found that this change was first
> introduced over 2 years ago by the following:
> 
> commit 2c153b934dca08d58e0aafde18a182e0891aa201
> Author: Hal Rosenstock <halr at voltaire.com>
> Date:   Wed Jul 27 11:45:31 2005 -0700
> 
>     [PATCH] IB: Eliminate MAD cache leak associated with local completions
>     
>     Eliminate MAD cache leak associated with local completions.  Also, when
>     canceling MAD, empty local completion list as well.
>     
>     Signed-off-by: Hal Rosenstock <halr at voltaire.com>
>     Cc: Roland Dreier <rolandd at cisco.com>
>     Signed-off-by: Andrew Morton <akpm at osdl.org>
>     Signed-off-by: Linus Torvalds <torvalds at osdl.org>
> 
> More later...

FWIW, I traced the origin of this change back to the following thread:

http://lists.openfabrics.org/pipermail/general/2005-May/thread.html

http://lists.openfabrics.org/pipermail/general/2005-May/005951.html

Subject:

slab error in kmem_cache_destroy(): cache `ib_mad': Can't free all
objects

-- Hal

> 
> -- Hal
> 
> > - Sean


From sashak at voltaire.com  Sat Dec  1 08:48:05 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 1 Dec 2007 16:48:05 +0000
Subject: [ofa-general] [PATCH] opensm: make osm_pkey_get_tables static
Message-ID: <20071201164805.GT375@sashak.voltaire.com>


Make osm_pkey_get_tables(), which is defined and used only in
osm_port_info_rcv.c, static. Also rename it to get_pkey_table().

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/include/opensm/osm_pkey.h  |   39 -------------------------------------
 opensm/opensm/osm_port_info_rcv.c |   16 +++++++-------
 2 files changed, 8 insertions(+), 47 deletions(-)

diff --git a/opensm/include/opensm/osm_pkey.h b/opensm/include/opensm/osm_pkey.h
index 3c84e4b..0dce001 100644
--- a/opensm/include/opensm/osm_pkey.h
+++ b/opensm/include/opensm/osm_pkey.h
@@ -696,44 +696,5 @@ boolean_t osm_physp_has_pkey(IN osm_log_t * p_log,
 *
 *********/
 
-/****f* OpenSM: osm_pkey_get_tables
-* NAME
-*  osm_pkey_get_tables
-*
-* DESCRIPTION
-*  Sends a request for getting the pkey tables of the given physp.
-*
-* SYNOPSIS
-*/
-void osm_pkey_get_tables(IN osm_log_t * p_log,
-			 IN osm_req_t * p_req,
-			 IN osm_subn_t * const p_subn,
-			 IN struct _osm_node *const p_node,
-			 IN struct _osm_physp *const p_physp);
-
-/*
-* PARAMETERS
-*  p_log
-*     [in] Pointer to osm_log object.
-*
-*  p_req
-*     [in] Pointer to osm_req object.
-*
-*  p_subn
-*     [in] Pointer to osm_subn object.
-*
-*  p_node
-*     [in] Pointer to osm_node object.
-*
-*  p_physp
-*     [in] Pointer to osm_physp_t object.
-*
-* RETURN VALUES
-*  None
-*
-* NOTES
-*
-*********/
-
 END_C_DECLS
 #endif				/* _OSM_PKEY_H_ */
diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
index dd3642d..9ea8738 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -370,11 +370,11 @@ __osm_pi_rcv_process_ca_or_router_port(IN const osm_pi_rcv_t * const p_rcv,
 #define IBM_VENDOR_ID  (0x5076)
 /**********************************************************************
  **********************************************************************/
-void osm_pkey_get_tables(IN osm_log_t * p_log,
-			 IN osm_req_t * p_req,
-			 IN osm_subn_t * const p_subn,
-			 IN osm_node_t * const p_node,
-			 IN osm_physp_t * const p_physp)
+static void get_pkey_table(IN osm_log_t * p_log,
+			   IN osm_req_t * p_req,
+			   IN osm_subn_t * const p_subn,
+			   IN osm_node_t * const p_node,
+			   IN osm_physp_t * const p_physp)
 {
 
 	osm_madw_context_t context;
@@ -384,7 +384,7 @@ void osm_pkey_get_tables(IN osm_log_t * p_log,
 	uint16_t block_num, max_blocks;
 	uint32_t attr_mod_ho;
 
-	OSM_LOG_ENTER(p_log, osm_pkey_get_tables);
+	OSM_LOG_ENTER(p_log, get_pkey_table);
 
 	path = *osm_physp_get_dr_path_ptr(p_physp);
 
@@ -452,8 +452,8 @@ __osm_pi_rcv_get_pkey_slvl_vla_tables(IN const osm_pi_rcv_t * const p_rcv,
 {
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_get_pkey_slvl_vla_tables);
 
-	osm_pkey_get_tables(p_rcv->p_log, p_rcv->p_req, p_rcv->p_subn,
-			    p_node, p_physp);
+	get_pkey_table(p_rcv->p_log, p_rcv->p_req, p_rcv->p_subn,
+		       p_node, p_physp);
 
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
-- 
1.5.3.4.206.g58ba4
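
As a side note on the change above, here is a minimal sketch of what
file-local linkage buys. The function names are hypothetical and are not
taken from OpenSM; the block size of 32 entries is used only for
illustration. The helper is marked static, so it needs no prototype in a
public header and its short name cannot collide with symbols in other
files.

```c
#include <assert.h>

/* File-local helper: 'static' gives it internal linkage, so it needs no
 * header declaration and its name is invisible to other translation units. */
static int get_block_count(int num_entries, int entries_per_block)
{
	assert(entries_per_block > 0);
	/* Round up so a partially filled block is still counted. */
	return (num_entries + entries_per_block - 1) / entries_per_block;
}

/* Externally visible entry point that uses the helper. */
int blocks_needed_for_table(int num_entries)
{
	return get_block_count(num_entries, 32);
}
```

Only blocks_needed_for_table() would appear in a header; the helper can be
renamed or removed later without touching any other file.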


From sashak at voltaire.com  Sat Dec  1 08:48:57 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 1 Dec 2007 16:48:57 +0000
Subject: [ofa-general] [PATCH] opensm: remove testability_mode option
Message-ID: <20071201164857.GU375@sashak.voltaire.com>


Remove the testability_mode option: it is not suitable for the mainstream
code.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/include/opensm/osm_subnet.h |   20 --------------------
 opensm/opensm/main.c               |    4 ----
 opensm/opensm/osm_state_mgr.c      |   10 ----------
 opensm/opensm/osm_subnet.c         |    1 -
 4 files changed, 0 insertions(+), 35 deletions(-)

diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h
index b67add3..cf52b49 100644
--- a/opensm/include/opensm/osm_subnet.h
+++ b/opensm/include/opensm/osm_subnet.h
@@ -156,22 +156,6 @@ typedef void
 *
 *********/
 
-/****d* OpenSM: Subnet/osm_testability_modes_t
-* NAME
-*	osm_testability_modes_t
-*
-* DESCRIPTION
-*	Enumerates the possible testability modes.
-*
-* SYNOPSIS
-*/
-typedef enum _osm_testability_modes {
-	OSM_TEST_MODE_NONE = 0,
-	OSM_TEST_MODE_EXIT_BEFORE_SEND_HANDOVER,
-	OSM_TEST_MODE_MAX
-} osm_testability_modes_t;
-/***********/
-
 /****s* OpenSM: Subnet/osm_qos_options_t
 * NAME
 *	osm_qos_options_t
@@ -270,7 +254,6 @@ typedef struct _osm_subn_opt {
 	osm_pfn_ui_mcast_extension_t pfn_ui_mcast_fdb_assign;
 	void *ui_mcast_fdb_assign_ctx;
 	boolean_t sweep_on_trap;
-	osm_testability_modes_t testability_mode;
 	char *routing_engine_name;
 	boolean_t connect_roots;
 	char *lid_matrix_dump_file;
@@ -440,9 +423,6 @@ typedef struct _osm_subn_opt {
 *	sweep_on_trap
 *		Received traps will initiate a new sweep.
 *
-*	testability_mode
-*		Object that indicates if we are running in a special testability mode.
-*
 *	routing_engine_name
 *		Name of used routing engine
 *		(other than default Min Hop Algorithm)
diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c
index 4b99dd0..0bc9238 100644
--- a/opensm/opensm/main.c
+++ b/opensm/opensm/main.c
@@ -754,10 +754,6 @@ int main(int argc, char *argv[])
 			 */
 			else if (dbg_lvl == 5)
 				vendor_debug++;
-			else if (dbg_lvl >= 10)
-				/* Please look at osm_subnet.h for list
-				 * of testability modes. */
-				opt.testability_mode = dbg_lvl - 9;
 			else
 				printf(" OpenSM: Unknown debug option %d"
 				       " ignored\n", dbg_lvl);
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 1ff8eb7..5c39f11 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1175,16 +1175,6 @@ __osm_state_mgr_send_handover(IN osm_state_mgr_t * const p_mgr,
 
 	OSM_LOG_ENTER(p_mgr->p_log, __osm_state_mgr_send_handover);
 
-	if (p_mgr->p_subn->opt.testability_mode ==
-	    OSM_TEST_MODE_EXIT_BEFORE_SEND_HANDOVER) {
-		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
-			"__osm_state_mgr_send_handover: ERR 3315: "
-			"Exit on testability mode OSM_TEST_MODE_EXIT_BEFORE_SEND_HANDOVER\n");
-		osm_exit_flag = TRUE;
-		sleep(3);
-		exit(1);
-	}
-
 	/*
 	 * Send a query of SubnSet(SMInfo) HANDOVER to the remote sm given.
 	 */
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index 5887819..17c166a 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -470,7 +470,6 @@ void osm_subn_set_default_opt(IN osm_subn_opt_t * const p_opt)
 	p_opt->pfn_ui_mcast_fdb_assign = NULL;
 	p_opt->ui_mcast_fdb_assign_ctx = NULL;
 	p_opt->sweep_on_trap = TRUE;
-	p_opt->testability_mode = OSM_TEST_MODE_NONE;
 	p_opt->routing_engine_name = NULL;
 	p_opt->connect_roots = FALSE;
 	p_opt->lid_matrix_dump_file = NULL;
-- 
1.5.3.4.206.g58ba4


From sashak at voltaire.com  Sat Dec  1 08:49:39 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 1 Dec 2007 16:49:39 +0000
Subject: [ofa-general] [PATCH] libibumad: fix memory leak
Message-ID: <20071201164939.GV375@sashak.voltaire.com>


Fix memory leak: free the namelist elements allocated by scandir().

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 libibumad/src/umad.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/libibumad/src/umad.c b/libibumad/src/umad.c
index 5b7b83e..1dc328d 100644
--- a/libibumad/src/umad.c
+++ b/libibumad/src/umad.c
@@ -177,6 +177,7 @@ get_port(char *ca_name, char *dir, int portnum, umad_port_t *port)
 		idx = strtoul(namelist[i]->d_name, NULL, 0);
 		sys_read_uint(port_dir, namelist[i]->d_name, &val);
 		port->pkeys[idx] = val;
+		free(namelist[i]);
 	}
 	port->pkeys_size = ret;
 	free(namelist);
@@ -188,8 +189,11 @@ get_port(char *ca_name, char *dir, int portnum, umad_port_t *port)
 	return 0;
 
 clean:
-	if (namelist)
+	if (namelist) {
+		for (i = 0; i < ret ; i++)
+			free(namelist[i]);
 		free(namelist);
+	}
 	if (port->pkeys)
 		free(port->pkeys);
 	return -EIO;
-- 
1.5.3.4.206.g58ba4
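
For reference, the cleanup pattern the patch above applies can be shown in
isolation. This is a stand-alone sketch, not libibumad code; count_entries()
is a hypothetical helper written for this example. scandir() allocates the
array and every entry in it separately, so both levels have to be freed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>

/* Count the entries of a directory, releasing everything scandir() allocated. */
static int count_entries(const char *dir)
{
	struct dirent **namelist = NULL;
	int n, i;

	assert(dir != NULL);
	n = scandir(dir, &namelist, NULL, alphasort);
	if (n < 0)
		return -1;		/* scandir() failed; nothing to free */

	for (i = 0; i < n; i++)
		free(namelist[i]);	/* each entry is allocated separately */
	free(namelist);			/* then the array itself */
	return n;
}
```

With a NULL filter the result includes the "." and ".." entries, so an
existing directory yields a count of at least 2, and a missing directory
yields -1.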


From sweitzen at cisco.com  Sat Dec  1 12:19:24 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Sat, 1 Dec 2007 12:19:24 -0800
Subject: [ofa-general] ofa_1_3_kernel 20071201-0200 daily build status
In-Reply-To: <20071201105051.8BDD2E601B7@openfabrics.org>
References: <20071201105051.8BDD2E601B7@openfabrics.org>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A3444F@xmb-sjc-216.amer.cisco.com>

I don't see any new builds in
http://www.openfabrics.org/builds/ofed-1.3.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems




From sean.hefty at intel.com  Sat Dec  1 16:43:27 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Sat, 1 Dec 2007 16:43:27 -0800
Subject: [ofa-general] [PATCH] ib/mad: fix incorrect access to items
	onlocal_list
In-Reply-To: <1196522369.10845.168.camel@hrosenstock-ws.xsigo.com>
References: <474BE237.8050602@dev.mellanox.co.il> <aday7cjntc9.fsf@cisco.com>
	<000001c8337a$cdc18e60$ff0da8c0@amr.corp.intel.com>
	<1196522369.10845.168.camel@hrosenstock-ws.xsigo.com>
Message-ID: <000001c8347c$5a9878b0$66258686@amr.corp.intel.com>

>It may fix the crash but I think this reintroduces a memory leak so I
>think the real fix is a little more complicated.

The thought was that the local_list would be processed normally through the
queued work item.  I need to spend some time looking at what the original memory
leak problem was.

- Sean


From kliteyn at mellanox.co.il  Sat Dec  1 21:04:08 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 2 Dec 2007 07:04:08 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-02:normal completion
Message-ID: <MTLEXCH01JxArGvnCjn000093f7@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-01
OpenSM git rev = Thu_Nov_29_19:37:20_2007 [498e13f7145f77d468054688d8cbea61677b624a]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=480  Pass=480  Fail=0
 
 
Pass:
36 Stability IS1-16.topo
36 Pkey IS1-16.topo
36 OsmTest IS1-16.topo
36 OsmStress IS1-16.topo
36 Multicast IS1-16.topo
36 LidMgr IS1-16.topo
12 Stability IS3-loop.topo
12 Stability IS3-128.topo
12 Pkey IS3-128.topo
12 OsmTest IS3-loop.topo
12 OsmTest IS3-128.topo
12 OsmStress IS3-128.topo
12 Multicast IS3-loop.topo
12 Multicast IS3-128.topo
12 LidMgr IS3-128.topo
12 FatTree merge-roots-4-ary-2-tree.topo
12 FatTree merge-root-4-ary-3-tree.topo
12 FatTree gnu-stallion-64.topo
12 FatTree blend-4-ary-2-tree.topo
12 FatTree RhinoDDR.topo
12 FatTree FullGnu.topo
12 FatTree 4-ary-2-tree.topo
12 FatTree 2-ary-4-tree.topo
12 FatTree 12-node-spaced.topo
12 FTreeFail 4-ary-2-tree-missing-sw-link.topo
12 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
12 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
12 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From ogerlitz at voltaire.com  Sat Dec  1 22:59:10 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Sun, 02 Dec 2007 08:59:10 +0200
Subject: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <OF7F9B4959.CD9666D5-ON872573A3.007FBDE3-882573A3.00809D5D@us.ibm.com>
References: <OF7F9B4959.CD9666D5-ON872573A3.007FBDE3-882573A3.00809D5D@us.ibm.com>
Message-ID: <475257BE.3090108@voltaire.com>

Shirley Ma wrote:
> I just did a quick test of OFED-1.3 beta IPoIB and found that a kernel
> parameter, hw_csum, has been added to IPoIB. I have several questions:
> 1. Why not use ethtool to set these HW_CSUM flags?
> 2. I haven't looked at the code in detail yet. Is it possible that, with
> this flag set, TCP/IP will not compute checksums for an HCA that has no
> TCP/IP offload support? If so, these packets should be limited to the IB
> network; if routed to an Ethernet network, they would be dropped. If not,
> then why did I see a 30% improvement when testing IPoIB-UD with hw_csum set
> in a non-ConnectX mthca SDR environment?
> 3. I saw that switching between IPoIB-CM and IPoIB-UD corrupted the
> interface IP address (unicast address, subnet mask, broadcast address).
> Has anybody else seen the same problem?

Shirley,

Please take a look at slide 13 in Dror's presentation at
http://openfabrics.org/archives/nov2007sc/IPoIB-UD%20SO.pdf

During the session, developers in the audience criticized the fact that the
patch merged into OFED 1.3 did not address the comments made on the list.
Dror apologized and said this will be taken care of.

Also please note that Roland is the maintainer of the upstream kernel IB
stack, whereas the process for getting a piece of code into OFED does not
require posting the patches for review or getting feedback from him.

Or.


From ogerlitz at voltaire.com  Sat Dec  1 23:04:43 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Sun, 02 Dec 2007 09:04:43 +0200
Subject: [ofa-general] RE: disconnect issues/questions
In-Reply-To: <000001c832ca$3bedec50$60d8180a@amr.corp.intel.com>
References: <15ddcffd0711142341g7b83d917t2fcc4b9a64e54f55@mail.gmail.com><15ddcffd0711142358m55192a25qaa2e419045f6d0ea@mail.gmail.com>	<000001c828e9$f0ad4f40$2ccc180a@amr.corp.intel.com>
	<000001c832ca$3bedec50$60d8180a@amr.corp.intel.com>
Message-ID: <4752590B.8040301@voltaire.com>

Sean Hefty wrote:
>>> B) will an RDMA_CM_EVENT_DISCONNECTED event --always-- be generated
>>> also for the side that called rdma_disconnect()? In both cases (yes
>>> and no), we need to document this.
> 
> Always is too strong, but this is typically the case for IB.  (A device removal
> event would prevent this from occurring.)

OK, thanks for the clarification. Can this be documented in the man
pages? Note that the device removal event is not exposed to user space.

Or.


From ogerlitz at voltaire.com  Sat Dec  1 23:14:20 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Sun, 02 Dec 2007 09:14:20 +0200
Subject: [ofa-general] RE: [PATCH] librdmacm/man: fix-up man pages
In-Reply-To: <000801c832b6$81feb850$f5d8180a@amr.corp.intel.com>
References: <000101c81a64$3582de80$9c98070a@amr.corp.intel.com>	<4726EEAC.3070105@voltaire.com>
	<472755C4.10600@ichips.intel.com>	<47285F53.4060402@voltaire.com>
	<4728BF4A.1060301@ichips.intel.com>	<15ddcffd0710311320v6b91b3cm3be0f7882e30ad2b@mail.gmail.com>	<000001c81cb5$4ce12160$9c98070a@amr.corp.intel.com>	<15ddcffd0711270435t12a18dc3waac2596b3884ac72@mail.gmail.com>	<000001c8311a$176cdbe0$63248686@amr.corp.intel.com>	<15ddcffd0711280307u7a89c6c2q2854b071f74d9123@mail.gmail.com>
	<000801c832b6$81feb850$f5d8180a@amr.corp.intel.com>
Message-ID: <47525B4C.9040808@voltaire.com>

Sean Hefty wrote:
>> Some users have approached me and said that its unclear from the man
>> pages for some values of the connection param structure what are their
>> legal values. Reviewing this a little, I think we should add the
>> maximum values for the retry_count and rnr_retry_count under the
>> infiniband specific section of the rdma_connect and rdma_accept pages.
> 
> These have been updated and pushed upstream.  Please let me know if you're aware
> of any other documentation changes.

Got it, thanks.

As for more documentation changes, let me do a short review and get back 
to you later this week.

Or.


From dotanb at dev.mellanox.co.il  Sun Dec  2 01:28:06 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Sun, 02 Dec 2007 11:28:06 +0200
Subject: [ofa-general] i got kernel oops in ib_umad when executing
	ULPstests
In-Reply-To: <000901c832bd$2f649db0$f5d8180a@amr.corp.intel.com>
References: <474BE237.8050602@dev.mellanox.co.il> <aday7cjntc9.fsf@cisco.com>
	<000901c832bd$2f649db0$f5d8180a@amr.corp.intel.com>
Message-ID: <47527AA6.2070807@dev.mellanox.co.il>


>     Signed-off-by: Roland Dreier <rolandd at cisco.com>
>
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 493f4c6..a72bcea 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -1750,7 +1750,7 @@ ib_find_send_mad(struct ib_mad_agent_private *mad_agent_pr
>                      */
>                     (is_direct(wc->recv_buf.mad->mad_hdr.mgmt_class) ||
>                      rcv_has_same_gid(mad_agent_priv, wr, wc)))
> -                       return wr;
> +                       return (wr->status == IB_WC_SUCCESS) ? wr : NULL;
>         }
>
>         /*
>
>
> Dotan, can you verify that this fix is still there after the backport patches
> are applied?
>
> - Sean
>
>   

Yes, this patch is still in the code after the backport patches are 
applied.

Dotan


From dotanb at dev.mellanox.co.il  Sun Dec  2 01:36:05 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Sun, 02 Dec 2007 11:36:05 +0200
Subject: [ofa-general] Question:  Verbs API Error code recover
In-Reply-To: <4750A321.5080406@hermes-microvision.com>
References: <4750A321.5080406@hermes-microvision.com>
Message-ID: <47527C85.8050801@dev.mellanox.co.il>

Hi.

Wei Fang wrote:
> Hi, All:
>
> I'm new here just some days ago. Right now I'm facing a problem to 
> using OFED 1.2.5's verb api.   In my programming, I use RDMA Write 
> function to transfer data ( ibv_post_send ). Then I use ibv_poll_cq to 
> get this CQ's finish.  Sometimes, ibv_poll_cq's return error is 
> IBV_WC_RETRY_EXC_ERR (error code is 12).  When this error code 
> happen,  any next transfer will always fail.  In this case, I have to 
> restart computer.  Anyone can tell me how to recover this error 
> without quit program or restart PC?
>

If you get a completion with status IBV_WC_RETRY_EXC_ERR, the QP is 
moved to the error state, so every WR that you post after this will 
fail too.
To recover from this failure you need to reconnect the QPs (I don't know 
why you would need to restart the computer in order to fix this).


I think that you need to check why you got this completion status in 
the first place (did the remote side close the QP?).

Dotan
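
A minimal sketch of the recovery order described above, assuming the
usual verbs QP state ladder: a QP in the error state is first returned
to RESET and then walked through INIT, RTR and RTS again. The enum and
helper names are hypothetical stand-ins; in a real program each step
would be an ibv_modify_qp() call with the matching IBV_QPS_* state and
the connection attributes exchanged with the peer again.

```c
#include <stddef.h>

/* Hypothetical mirror of the verbs QP states used in this sketch. */
enum qp_state { QP_RESET, QP_INIT, QP_RTR, QP_RTS, QP_ERR };

/*
 * Return the next state to request when bringing a QP back to RTS.
 * A QP in the error state has to go through RESET first.
 */
static enum qp_state next_recovery_state(enum qp_state cur)
{
	switch (cur) {
	case QP_ERR:
		return QP_RESET;
	case QP_RESET:
		return QP_INIT;
	case QP_INIT:
		return QP_RTR;
	default:
		return QP_RTS;	/* RTR -> RTS, RTS stays RTS */
	}
}

/*
 * Fill 'path' with the states to request, in order, until RTS is reached.
 * Returns the number of transitions written (at most 'max').
 */
static size_t recovery_path(enum qp_state cur, enum qp_state *path, size_t max)
{
	size_t n = 0;

	while (cur != QP_RTS && n < max) {
		cur = next_recovery_state(cur);
		path[n++] = cur;
	}
	return n;
}
```

Starting from QP_ERR, recovery_path() yields RESET, INIT, RTR, RTS. The
remote QP typically has to be taken through the same sequence, which is
why this amounts to reconnecting rather than fixing one side only.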


From vlad at lists.openfabrics.org  Sun Dec  2 02:53:39 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sun,  2 Dec 2007 02:53:39 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071202-0200 daily build status
Message-ID: <20071202105340.08BBDE60033@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.20
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.14
Passed on x86_64 with linux-2.6.18
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.13
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-53.el5

Failed:
Build failed on i686 with 2.6.15-23-server
Log:
		-I/usr/local/include/scst \
		-I/home/vlad/tmp/ofa_1_3_kernel-20071202-0200_check/drivers/infiniband/ulp/srpt \
		-I/home/vlad/tmp/ofa_1_3_kernel-20071202-0200_check/drivers/net/cxgb3 \
		-Iinclude \
		$(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \
		' \
		modules
make: *** /lib/modules/2.6.15-23-server/build: No such file or directory.  Stop.
make: *** [kernel] Error 2
----------------------------------------------------------------------------------


From kliteyn at dev.mellanox.co.il  Sun Dec  2 04:13:47 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Sun, 02 Dec 2007 14:13:47 +0200
Subject: [ofa-general] [PATCH 1/3] opensm: Remove unnecessary ntoh and hton
 conversions in LinkRecord processing
Message-ID: <4752A17B.1040804@dev.mellanox.co.il>

Remove unnecessary ntoh and hton conversions in LinkRecord processing.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_sa_link_record.c |   24 +++++++++++-------------
 1 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
index 8acacec..ba52aea 100644
--- a/opensm/opensm/osm_sa_link_record.c
+++ b/opensm/opensm/osm_sa_link_record.c
@@ -153,15 +153,13 @@ __osm_lr_rcv_build_physp_link(IN osm_lr_rcv_t * const p_rcv,
 /**********************************************************************
  **********************************************************************/
 static void
-__get_base_lid(IN const osm_physp_t * p_physp, OUT uint16_t * p_base_lid)
+__get_base_lid(IN const osm_physp_t * p_physp, OUT ib_net16_t * p_base_lid)
 {
 	if (p_physp->p_node->node_info.node_type == IB_NODE_TYPE_SWITCH)
-		*p_base_lid =
-		    cl_ntoh16(osm_physp_get_base_lid
-			      (osm_node_get_physp_ptr(p_physp->p_node, 0))
-		    );
+		*p_base_lid = osm_physp_get_base_lid
+			      (osm_node_get_physp_ptr(p_physp->p_node, 0));
 	else
-		*p_base_lid = cl_ntoh16(osm_physp_get_base_lid(p_physp));
+		*p_base_lid = osm_physp_get_base_lid(p_physp);
 }

 /**********************************************************************
@@ -177,8 +175,8 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 {
 	uint8_t src_port_num;
 	uint8_t dest_port_num;
-	ib_net16_t from_base_lid_ho;
-	ib_net16_t to_base_lid_ho;
+	ib_net16_t from_base_lid;
+	ib_net16_t to_base_lid;

 	OSM_LOG_ENTER(p_rcv->p_log, __osm_lr_rcv_get_physp_link);

@@ -269,12 +267,12 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 			cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
 			dest_port_num);

-	__get_base_lid(p_src_physp, &from_base_lid_ho);
-	__get_base_lid(p_dest_physp, &to_base_lid_ho);
+	__get_base_lid(p_src_physp, &from_base_lid);
+	__get_base_lid(p_dest_physp, &to_base_lid);

-	__osm_lr_rcv_build_physp_link(p_rcv, cl_ntoh16(from_base_lid_ho),
-				      cl_ntoh16(to_base_lid_ho),
-				      src_port_num, dest_port_num, p_list);
+	__osm_lr_rcv_build_physp_link(p_rcv, from_base_lid,
+				      to_base_lid, src_port_num,
+				      dest_port_num, p_list);

       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
-- 
1.5.1.4



From kliteyn at dev.mellanox.co.il  Sun Dec  2 04:15:23 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Sun, 02 Dec 2007 14:15:23 +0200
Subject: [ofa-general] [PATCH 2/3] opensm: adding missing comparison by
 to_lid/from_lid in LinkRecord processing
Message-ID: <4752A1DB.5010103@dev.mellanox.co.il>

Comparison for IB_LR_COMPMASK_FROM_LID/IB_LR_COMPMASK_TO_LID
component mask bits was missing in LinkRecord processing.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_sa_link_record.c |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
index ba52aea..0970ad7 100644
--- a/opensm/opensm/osm_sa_link_record.c
+++ b/opensm/opensm/osm_sa_link_record.c
@@ -256,6 +256,17 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 		if (dest_port_num != p_lr->to_port_num)
 			goto Exit;

+	__get_base_lid(p_src_physp, &from_base_lid);
+	__get_base_lid(p_dest_physp, &to_base_lid);
+
+	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
+		if (from_base_lid != p_lr->from_lid)
+			goto Exit;
+
+	if (comp_mask & IB_LR_COMPMASK_TO_LID)
+		if (to_base_lid != p_lr->to_lid)
+			goto Exit;
+
 	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_lr_rcv_get_physp_link: "
@@ -267,8 +278,6 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 			cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
 			dest_port_num);

-	__get_base_lid(p_src_physp, &from_base_lid);
-	__get_base_lid(p_dest_physp, &to_base_lid);

 	__osm_lr_rcv_build_physp_link(p_rcv, from_base_lid,
 				      to_base_lid, src_port_num,
-- 
1.5.1.4


From kliteyn at dev.mellanox.co.il  Sun Dec  2 04:16:21 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Sun, 02 Dec 2007 14:16:21 +0200
Subject: [ofa-general] [PATCH 3/3] opensm: Fixing broken logic in 'process
 world' part of LinkRecord processing
Message-ID: <4752A215.40809@dev.mellanox.co.il>

Fixing broken logic in 'process world' part of LinkRecord processing.
When both HCA's ports belong to the same subnet, OpenSM would scan
'half-world' for each port of this HCA, and then for each port it would
get the node and iterate again through all the ports of this node.
In addition to the time consumed by these unnecessary iterations, it
also caused some records to be found twice.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_sa_link_record.c |   37 ++++++++++++++++++++++-------------
 1 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
index 0970ad7..6ebf3d0 100644
--- a/opensm/opensm/osm_sa_link_record.c
+++ b/opensm/opensm/osm_sa_link_record.c
@@ -300,7 +300,8 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv,
 {
 	const osm_physp_t *p_src_physp;
 	const osm_physp_t *p_dest_physp;
-	const cl_qmap_t *p_port_tbl;
+	const cl_qmap_t *p_node_tbl;
+	osm_node_t * p_node;
 	uint8_t port_num;
 	uint8_t num_ports;
 	uint8_t dest_num_ports;
@@ -417,19 +418,27 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv,
 			/*
 			   Process the world (recurse once back into this function).
 			 */
-			p_port_tbl = &p_rcv->p_subn->port_guid_tbl;
-			p_src_port = (osm_port_t *) cl_qmap_head(p_port_tbl);
-
-			while (p_src_port !=
-			       (osm_port_t *) cl_qmap_end(p_port_tbl)) {
-				__osm_lr_rcv_get_port_links(p_rcv, p_lr,
-							    p_src_port, NULL,
-							    comp_mask, p_list,
-							    p_req_physp);
-
-				p_src_port =
-				    (osm_port_t *) cl_qmap_next(&p_src_port->
-								map_item);
+			p_node_tbl = &p_rcv->p_subn->node_guid_tbl;
+			p_node = (osm_node_t *)cl_qmap_head(p_node_tbl);
+
+			while (p_node != (osm_node_t *)cl_qmap_end(p_node_tbl)) {
+				/*
+				   Get only one port for each node.
+				   After the recursive call, this function will
+				   scan all the ports of this node anyway.
+				 */
+				p_src_physp = osm_node_get_any_physp_ptr(p_node);
+				if (osm_physp_is_valid(p_src_physp)) {
+					p_src_port = (osm_port_t *)
+					    cl_qmap_get(&p_rcv->p_subn->port_guid_tbl,
+					        osm_physp_get_port_guid(p_src_physp));
+					__osm_lr_rcv_get_port_links(p_rcv, p_lr,
+								    p_src_port, NULL,
+								    comp_mask, p_list,
+								    p_req_physp);
+				}
+				p_node = (osm_node_t *) cl_qmap_next(&p_node->
+								     map_item);
 			}
 		}
 	}
-- 
1.5.1.4
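
A toy model of the duplicated work described in the changelog, under
the assumption that each node is reduced to a count of its entries in
the port table. The data layout and function names are hypothetical and
much simpler than the real cl_qmap walk. It only counts how many port
scans each strategy performs.

```c
#include <stddef.h>

/*
 * Old walk: start from every port-table entry, and for each of them
 * scan all ports of the owning node.
 */
static size_t visits_per_port_walk(const unsigned *ports_per_node,
				   size_t nodes)
{
	size_t visits = 0;
	size_t n;
	unsigned start;

	for (n = 0; n < nodes; n++)
		for (start = 0; start < ports_per_node[n]; start++)
			visits += ports_per_node[n];	/* full node scan */
	return visits;
}

/* New walk: pick one port per node, then scan that node's ports once. */
static size_t visits_per_node_walk(const unsigned *ports_per_node,
				   size_t nodes)
{
	size_t visits = 0;
	size_t n;

	for (n = 0; n < nodes; n++)
		visits += ports_per_node[n];
	return visits;
}
```

For a subnet with one two-port HCA and two single-port nodes, the old
walk scans 6 ports and the new one scans 4; the two extra scans are the
HCA ports being processed a second time.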


From kliteyn at mellanox.co.il  Sun Dec  2 04:21:48 2007
From: kliteyn at mellanox.co.il (Yevgeny Kliteynik)
Date: Sun, 2 Dec 2007 14:21:48 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-01:normalcompletion
In-Reply-To: <1196518684.10845.160.camel@hrosenstock-ws.xsigo.com>
References: <MTLEXCH01Z1UbDBRArV000092be@mtlexch01.mtl.com>
	<1196518684.10845.160.camel@hrosenstock-ws.xsigo.com>
Message-ID: <6C2C79E72C305246B504CBA17B5500C902C5B508@mtlexch01.mtl.com>

Hi Hal,

It looks like a regression tool (simulator) issue. 
Didn't have a chance to debug it yet, but I have 
everything I need to recreate the failure.

 
Regards,
 
Yevgeny Kliteynik
    
Mellanox Technologies LTD
Tel: +972-4-909-7200 ext: 394
Fax: +972-4-959-3245
P.O. Box 586 Yokneam 20692 ISRAEL 

-----Original Message-----
From: Hal Rosenstock [mailto:hrosenstock at xsigo.com] 
Sent: Saturday, December 01, 2007 16:18
To: Yevgeny Kliteynik
Cc: sashak at voltaire.com; Eitan Zahavi; general at lists.openfabrics.org
Subject: Re: [ofa-general] nightly osm_sim report
2007-12-01:normalcompletion

Hi Yevgeny,

On Sat, 2007-12-01 at 07:09 +0200, kliteyn at mellanox.co.il wrote:
> OSM Simulation Regression Summary
>  
> [Generated mail - please do NOT reply]
>  
> 
> OpenSM binary date = 2007-11-30
> OpenSM git rev = Thu_Nov_29_19:37:20_2007
[498e13f7145f77d468054688d8cbea61677b624a]
> ibutils git rev = Tue_Sep_4_17:57:34_2007
[4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
>  
> 
> Total=480  Pass=479  Fail=1
>  
> 
> Pass:
> 36 Stability IS1-16.topo
> 36 Pkey IS1-16.topo
> 36 OsmTest IS1-16.topo
> 36 OsmStress IS1-16.topo
> 36 Multicast IS1-16.topo
> 36 LidMgr IS1-16.topo
> 12 Stability IS3-loop.topo
> 12 Stability IS3-128.topo
> 12 Pkey IS3-128.topo
> 12 OsmTest IS3-loop.topo
> 12 OsmTest IS3-128.topo
> 12 OsmStress IS3-128.topo
> 12 Multicast IS3-loop.topo
> 12 Multicast IS3-128.topo
> 12 FatTree merge-roots-4-ary-2-tree.topo
> 12 FatTree merge-root-4-ary-3-tree.topo
> 12 FatTree gnu-stallion-64.topo
> 12 FatTree blend-4-ary-2-tree.topo
> 12 FatTree RhinoDDR.topo
> 12 FatTree FullGnu.topo
> 12 FatTree 4-ary-2-tree.topo
> 12 FatTree 2-ary-4-tree.topo
> 12 FatTree 12-node-spaced.topo
> 12 FTreeFail 4-ary-2-tree-missing-sw-link.topo
> 12 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
> 12 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
> 12 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
> 11 LidMgr IS3-128.topo
> 
> Failures:
> 1 LidMgr IS3-128.topo

We've seen similar reports on other runs too. Is this a regression tool
issue or a real failure ?

Thanks.

-- Hal

> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit
http://openib.org/mailman/listinfo/openib-general


From vlad at dev.mellanox.co.il  Sun Dec  2 04:26:26 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Sun, 02 Dec 2007 14:26:26 +0200
Subject: [ofa-general] Re: [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes and 5.0
	firmware support
In-Reply-To: <47501F09.4060800@opengridcomputing.com>
References: <47501F09.4060800@opengridcomputing.com>
Message-ID: <4752A472.7030306@dev.mellanox.co.il>

Steve Wise wrote:
> Vlad, please pull cxgb3 fixes for ofed-1.2.5 from:
> 
> git://git.openfabrics.org/~swise/ofed-1.2.5 stevo
> 
> These are cxgb3 bug fixes and PPC64 additions that we need for 
> ofed-1.2.5  (stay tuned for ofed-1.3 patches soon).
> 
> The patches are all accepted upstream and were posted here:
> 
> http://www.spinics.net/lists/netdev/msg47492.html
> 
> and here:
> 
> http://www.spinics.net/lists/netdev/msg48240.html
> 
> 
> Also, please pull version 1.1.0 of libcxgb3 from:
> 
> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
> 
> The library and drivers need to be included together as they are both 
> needed to support the chelsio 5.0 firmware.
> 
> Alsoalso: After you integrate these, can you crank a daily OFED-1.2.5.3 
> build including all this?
> 
> 
> Thanks,
> 
> Steve.
> 

Done,

Regards,
Vladimir


From hrosenstock at xsigo.com  Sun Dec  2 04:48:41 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sun, 02 Dec 2007 04:48:41 -0800
Subject: [ofa-general] [PATCHv2] libibmad/dump.c: Use bit mask approach to
	decoding LinkWidth/Speed Enabled/Supported
Message-ID: <1196599721.10845.218.camel@hrosenstock-ws.xsigo.com>

libibmad/dump.c: Use bit mask approach to decoding LinkWidth/Speed
Enabled/Supported

Based on email from Jason Gunthorpe <jgunthorpe at obsidianresearch.com>

Signed-off-by: Hal Rosenstock <hal at xsigo.com>

diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c
index 9628eba..aa88582 100644
--- a/libibmad/src/dump.c
+++ b/libibmad/src/dump.c
@@ -227,39 +227,46 @@ mad_dump_linkwidth(char *buf, int bufsz, void *val, int valsz)
 	}
 }
 
+static void
+dump_linkwidth(char *buf, int bufsz, int width)
+{
+	char *s = buf, *e = s + bufsz;
+
+	if (width & 0x1)
+		s += snprintf(s, e - s, "1X or ");
+	if (s < e && (width & 0x2))
+		s += snprintf(s, e - s, "4X or ");
+	if (s < e && (width & 0x4))
+		s += snprintf(s, e - s, "8X or ");
+	if (s < e && (width & 0x8))
+		s += snprintf(s, e - s, "12X or ");
+
+	if ((width >> 4) || s == buf)
+		s += snprintf(s, e - s, "undefined (%d)", width);
+	else
+		s[-4] = 0;
+}
+
 void
 mad_dump_linkwidthsup(char *buf, int bufsz, void *val, int valsz)
 {
 	int width = *(int *)val;
 
-	switch (width) {
+	dump_linkwidth(buf, bufsz, width);
+
+	switch(width) {
 	case 1:
-		snprintf(buf, bufsz, "1X");
-		break;
-	case 2:
-		snprintf(buf, bufsz, "4X (IBA extension)");
-		break;
 	case 3:
-		snprintf(buf, bufsz, "1X or 4X");
-		break;
-	case 4:
-		snprintf(buf, bufsz, "8X (IBA extension)");
-		break;
 	case 7:
-		snprintf(buf, bufsz, "1X or 4X or 8X");
-		break;
-	case 8:
-		snprintf(buf, bufsz, "12X (IBA extension)");
-		break;
 	case 11:
-		snprintf(buf, bufsz, "1X or 4X or 12X");
-		break;
 	case 15:
-		snprintf(buf, bufsz, "1X or 4X or 8X or 12X");
 		break;
+
 	default:
-		IBWARN("bad width %d", width);
-		buf[0] = 0;
+		if (!(width >> 4))
+			snprintf(buf + strlen(buf), bufsz - strlen(buf),
+				 " (IBA extension)");
+		break;
 	}
 }
 
@@ -267,21 +274,8 @@ void
 mad_dump_linkwidthen(char *buf, int bufsz, void *val, int valsz)
 {
 	int width = *(int *)val;
-	char *s = buf, *e = s + bufsz;
 
-	if (width & 0x1)
-		s += snprintf(s, e - s, "1X or ");
-	if (s < e && (width & 0x2))
-		s += snprintf(s, e - s, "4X or ");
-	if (s < e && (width & 0x4))
-		s += snprintf(s, e - s, "8X or ");
-	if (s < e && (width & 0x8))
-		s += snprintf(s, e - s, "12X or ");
-
-	if ((width >> 4) || s == buf)
-		s += snprintf(s, e - s, "?(%d)", width);
-	else
-		s[-3] = 0;
+	dump_linkwidth(buf, bufsz, width);
 }
 
 void
@@ -300,75 +294,55 @@ mad_dump_linkspeed(char *buf, int bufsz, void *val, int valsz)
 		snprintf(buf, bufsz, "10.0 Gbps");
 		break;
 	default:
-		snprintf(buf, bufsz, "?(%d)", speed);
+		snprintf(buf, bufsz, "undefined (%d)", speed);
 		break;
 	}
 }
 
-void
-mad_dump_linkspeedsup(char *buf, int bufsz, void *val, int valsz)
+static void
+dump_linkspeed(char *buf, int bufsz, int speed)
 {
-	int speed = *(int *)val;
+	char *s = buf, *e = s + bufsz;
+
+	if (speed & 0x1)
+		s += snprintf(s, e - s, "2.5 Gbps or ");
+	if (s < e && (speed & 0x2))
+		s += snprintf(s, e - s, "5.0 Gbps or ");
+	if (s < e && (speed & 0x4))
+		s += snprintf(s, e - s, "10.0 Gbps or ");
+
+	if ((speed >> 3) || s == buf)
+		s += snprintf(s, e - s, "undefined (%d)", speed);
+	else
+		s[-4] = 0;
 
 	switch (speed) {
 	case 1:
-		snprintf(buf, bufsz, "2.5 Gbps");
-		break;
-	case 2:
-		snprintf(buf, bufsz, "5.0 Gbps (IBA extension)");
-		break;
 	case 3:
-		snprintf(buf, bufsz, "2.5 or 5.0 Gbps");
-		break;
-	case 4:
-		snprintf(buf, bufsz, "10.0 Gbps (IBA extension)");
-		break;
 	case 5:
-		snprintf(buf, bufsz, "2.5 or 10.0 Gbps");
-		break;
 	case 7:
-		snprintf(buf, bufsz, "2.5 or 5.0 or 10.0 Gbps");
 		break;
 	default:
-		snprintf(buf, bufsz, "?(%d)", speed);
+		if (!(speed >> 3))
+			snprintf(s, e - s, " (IBA extension)"); 
 		break;
 	}
 }
 
 void
+mad_dump_linkspeedsup(char *buf, int bufsz, void *val, int valsz)
+{
+	int speed = *(int *)val;
+
+	dump_linkspeed(buf, bufsz, speed);
+}
+
+void
 mad_dump_linkspeeden(char *buf, int bufsz, void *val, int valsz)
 {
 	int speed = *(int *)val;
 
-	switch (speed) {
-	case 1:
-		snprintf(buf, bufsz, "2.5 Gbps");
-		break;
-	case 2:
-		snprintf(buf, bufsz, "5.0 Gbps (IBA extension)");
-		break;
-	case 3:
-		snprintf(buf, bufsz, "2.5 or 5.0 Gbps");
-		break;
-	case 4:
-		snprintf(buf, bufsz, "10.0 Gbps (IBA extension)");
-		break;
-	case 5:
-		snprintf(buf, bufsz, "2.5 or 10.0 Gbps");
-		break;
-	case 6:
-		snprintf(buf, bufsz, "5.0 or 10.0 Gbps");
-		break;
-	case 7:
-		snprintf(buf, bufsz, "2.5 or 5.0 or 10.0 Gbps");
-		break;
-	case 15:
-		snprintf(buf, bufsz, "SpeedSupported");
-		break;
-	default:
-		snprintf(buf, bufsz, "?(%d)", speed);
-		break;
-	}
+	dump_linkspeed(buf, bufsz, speed);
 }
 
 void

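A standalone sketch of the bit mask approach used in the patch above,
assuming the same bit assignment (bit 0 = 1X, bit 1 = 4X, bit 2 = 8X,
bit 3 = 12X). It is not the libibmad code: the function names are
illustrative, it writes the " or " separator before each name instead
of trimming it afterwards, and it reports any value with bits outside
the low four as undefined without listing the known bits.

```c
#include <stdio.h>
#include <string.h>

/* Append 'text' to the string in buf; truncates if it does not fit. */
static void append(char *buf, size_t bufsz, const char *text)
{
	size_t used = strlen(buf);

	if (used < bufsz)
		snprintf(buf + used, bufsz - used, "%s", text);
}

/* Decode a LinkWidth bit mask into a string such as "1X or 4X". */
static void decode_linkwidth(char *buf, size_t bufsz, int width)
{
	static const char *const names[] = { "1X", "4X", "8X", "12X" };
	int bit;

	if (bufsz == 0)
		return;
	buf[0] = '\0';

	/* Bits above the low four, or no bit at all, are not defined. */
	if ((width >> 4) || !(width & 0xf)) {
		snprintf(buf, bufsz, "undefined (%d)", width);
		return;
	}

	for (bit = 0; bit < 4; bit++) {
		if (!(width & (1 << bit)))
			continue;
		if (buf[0] != '\0')
			append(buf, bufsz, " or ");
		append(buf, bufsz, names[bit]);
	}
}
```

With this shape a new width only needs a new table entry, rather than a
new case for every combination as in the replaced switch statements.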

From kliteyn at dev.mellanox.co.il  Sun Dec  2 05:04:09 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Sun, 02 Dec 2007 15:04:09 +0200
Subject: [ofa-general] SDP is leaving "orphan" connections in sdpnetstat
Message-ID: <4752AD49.2060709@dev.mellanox.co.il>

Jim,

I have a simple client/server application that is using SDP.
The client opens several threads. Each of these threads connects
to a server and sends a chunk of bits. Nothing fancy.

Now I noticed that when I run sdpnetstat on the server's side,
I see all the SDP connections disappear once the client is dead.
But on the client's side I see these lines remain in the sdpnetstat
output:

Proto Recv-Q Send-Q Local Address           Foreign Address        PID/Program name
...
sdp        0      0 11.4.3.156:54601        11.4.3.157:dnp         -
sdp        0      0 11.4.3.156:54602        11.4.3.157:dnp         -
sdp        0      0 11.4.3.156:54603        11.4.3.157:dnp         -
sdp        0      0 11.4.3.156:54604        11.4.3.157:dnp         -
...

While the client is alive, sdpnetstat looks ok:

Proto Recv-Q Send-Q Local Address           Foreign Address        PID/Program name
...
sdp        0 123004 11.4.3.156:54601        11.4.3.157:dnp         7773/client
sdp        0 237776 11.4.3.156:54602        11.4.3.157:dnp         7773/client
sdp        0 113072 11.4.3.156:54603        11.4.3.157:dnp         7773/client
sdp        0 229376 11.4.3.156:54604        11.4.3.157:dnp         7773/client
...

Once these orphan connections appear, they never disappear - only
new orphans are added to the list from time to time.

Any idea what is going on here? Is this a bug or a feature?

-- Yevgeny


From tziporet at dev.mellanox.co.il  Sun Dec  2 05:30:42 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Sun, 02 Dec 2007 15:30:42 +0200
Subject: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <OF7F9B4959.CD9666D5-ON872573A3.007FBDE3-882573A3.00809D5D@us.ibm.com>
References: <OF7F9B4959.CD9666D5-ON872573A3.007FBDE3-882573A3.00809D5D@us.ibm.com>
Message-ID: <4752B382.5080300@mellanox.co.il>

Shirley Ma wrote:
> I just touch tested ofed-1.3 beta IPoIB. And found there was a kernel 
> parameter hw_csum being added in IPoIB. I have several questions here:
> 1. Why not using ethtool to set up these HW_CSUM flags?
>   
This is experimental code. Dror explained it at the OFED devcon (Or sent 
you the pointer).
> 2. I haven't looked at the detailed code yet, is that possible with this 
> flag, TCP/IP will not do CSUM for HCA which has no TCP/IP offload support? 
> If so, then these packets should be limited to be IB network. Routing to 
> ethernet network, the packets would be dropped. If not, I tested IPoIB-UD, 
> why I saw 30% improvement with hw_csum set in none connectX mthca SDR 
> environment?
>   
The Arbel HCA also supports checksum offload, so you will see an 
improvement with this HCA too.
> 3. I saw switching between IPoIB-cm and IPoIB-ud corrupted interface IP 
> address (unicast address, subnet mask, broadcast address. Anybody saw the 
> same problem?
>   
Can you give more details? We switch between CM and UD mode here all the 
time.
What is the exact scenario you ran?

Tziporet


From eli at dev.mellanox.co.il  Sun Dec  2 05:45:59 2007
From: eli at dev.mellanox.co.il (Eli Cohen)
Date: Sun, 02 Dec 2007 15:45:59 +0200
Subject: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <OF7F9B4959.CD9666D5-ON872573A3.007FBDE3-882573A3.00809D5D@us.ibm.com>
References: <OF7F9B4959.CD9666D5-ON872573A3.007FBDE3-882573A3.00809D5D@us.ibm.com>
Message-ID: <1196603159.22671.11.camel@mtls03>

On Fri, 2007-11-30 at 15:28 -0800, Shirley Ma wrote:
> I just touch tested ofed-1.3 beta IPoIB. And found there was a kernel 
> parameter hw_csum being added in IPoIB. I have several questions here:
> 1. Why not using ethtool to set up these HW_CSUM flags?
There is no adequate interface in Ethtool for doing it so we use a
module parameter. This is because we see this as a static configuration
per host.

> 2. I haven't looked at the detailed code yet, is that possible with this 
> flag, TCP/IP will not do CSUM for HCA which has no TCP/IP offload support? 
Yes, the HCA need not have checksum offload support. The idea is that the
IB ICRC provides the assurance that the packets are not corrupted.

> If so, then these packets should be limited to be IB network. Routing to 
> ethernet network, the packets would be dropped. If not, I tested IPoIB-UD, 
> why I saw 30% improvement with hw_csum set in none connectX mthca SDR 
> environment?
> 3. I saw switching between IPoIB-cm and IPoIB-ud corrupted interface IP 
> address (unicast address, subnet mask, broadcast address. Anybody saw the 
> same problem?
I did not notice this, but it has happened to me a few times that I
wanted to change the MTU and forgot to specify "mtu" in the ifconfig
command; ifconfig then interpreted the MTU value as the IP address and
the interface ended up with a wrong IP address.

> 
> Thanks
> Shirley


From sashak at voltaire.com  Sun Dec  2 09:29:07 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 2 Dec 2007 17:29:07 +0000
Subject: [ofa-general] Re: [PATCH 1/3] opensm: Remove unnecessary ntoh and
	hton conversions in LinkRecord processing
In-Reply-To: <4752A17B.1040804@dev.mellanox.co.il>
References: <4752A17B.1040804@dev.mellanox.co.il>
Message-ID: <20071202172907.GD708@sashak.voltaire.com>

On 14:13 Sun 02 Dec     , Yevgeny Kliteynik wrote:
> Remove unnecessary ntoh and hton conversions in LinkRecord processing.
> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied. Thanks.

Sasha


From sashak at voltaire.com  Sun Dec  2 10:09:49 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 2 Dec 2007 18:09:49 +0000
Subject: [ofa-general] Re: [PATCH 2/3] opensm: adding missing comparison by
	to_lid/from_lid in LinkRecord processing
In-Reply-To: <4752A1DB.5010103@dev.mellanox.co.il>
References: <4752A1DB.5010103@dev.mellanox.co.il>
Message-ID: <20071202180949.GE708@sashak.voltaire.com>

Hi Yevgeny,

On 14:15 Sun 02 Dec     , Yevgeny Kliteynik wrote:
> Comparison for IB_LR_COMPMASK_FROM_LID/IB_LR_COMPMASK_TO_LID
> component mask bits was missing in LinkRecord processing.
> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> ---
>  opensm/opensm/osm_sa_link_record.c |   13 +++++++++++--
>  1 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
> index ba52aea..0970ad7 100644
> --- a/opensm/opensm/osm_sa_link_record.c
> +++ b/opensm/opensm/osm_sa_link_record.c
> @@ -256,6 +256,17 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>  		if (dest_port_num != p_lr->to_port_num)
>  			goto Exit;
> 
> +	__get_base_lid(p_src_physp, &from_base_lid);
> +	__get_base_lid(p_dest_physp, &to_base_lid);
> +
> +	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
> +		if (from_base_lid != p_lr->from_lid)
> +			goto Exit;
> +
> +	if (comp_mask & IB_LR_COMPMASK_TO_LID)
> +		if (to_base_lid != p_lr->to_lid)
> +			goto Exit;

Would this be correct with LMC > 0? As far as I understand, aliased (not
base) LIDs can be used in a query.

Sasha

> +
>  	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
>  		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
>  			"__osm_lr_rcv_get_physp_link: "
> @@ -267,8 +278,6 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>  			cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
>  			dest_port_num);
> 
> -	__get_base_lid(p_src_physp, &from_base_lid);
> -	__get_base_lid(p_dest_physp, &to_base_lid);
> 
>  	__osm_lr_rcv_build_physp_link(p_rcv, from_base_lid,
>  				      to_base_lid, src_port_num,
> -- 
> 1.5.1.4
> 


From kliteyn at mellanox.co.il  Sun Dec  2 21:12:49 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 3 Dec 2007 07:12:49 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-03:normal completion
Message-ID: <MTLEXCH01IClmpcQABZ00009578@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-02
OpenSM git rev = Sat_Dec_1_16:38:12_2007 [3ea95fb947863f0a0fe71af794b60993e9ce8b79]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=519  Fail=1
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
38 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:
1 LidMgr IS1-16.topo


From balaji at mcs.anl.gov  Sun Dec  2 23:59:47 2007
From: balaji at mcs.anl.gov (Pavan Balaji)
Date: Mon, 03 Dec 2007 01:59:47 -0600
Subject: [ofa-general] CFP: Workshop on High-Performance,
	Power-Aware Computing (HP-PAC)
In-Reply-To: <4725B504.5000204@mcs.anl.gov>
References: <4725B504.5000204@mcs.anl.gov>
Message-ID: <4753B773.805@mcs.anl.gov>

An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071203/f1c3e87e/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2008-sm-logo.jpg
Type: image/jpeg
Size: 6905 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071203/f1c3e87e/attachment.jpg>

From vlad at lists.openfabrics.org  Mon Dec  3 02:50:52 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Mon,  3 Dec 2007 02:50:52 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071203-0200 daily build status
Message-ID: <20071203105052.8828CE60053@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on x86_64 with linux-2.6.13
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.17
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:
Build failed on i686 with 2.6.15-23-server
Log:
		-I/usr/local/include/scst \
		-I/home/vlad/tmp/ofa_1_3_kernel-20071203-0200_check/drivers/infiniband/ulp/srpt \
		-I/home/vlad/tmp/ofa_1_3_kernel-20071203-0200_check/drivers/net/cxgb3 \
		-Iinclude \
		$(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \
		' \
		modules
make: *** /lib/modules/2.6.15-23-server/build: No such file or directory.  Stop.
make: *** [kernel] Error 2
----------------------------------------------------------------------------------


From koen.segers at vrt.be  Mon Dec  3 05:56:41 2007
From: koen.segers at vrt.be (Koen Segers)
Date: Mon, 03 Dec 2007 14:56:41 +0100
Subject: [ofa-general] IO Size more than 48K
In-Reply-To: <01B9E81EECACE94DBBD0A556E768FB8A01E3C7A0@NAMAIL2.ad.lsil.com>
References: <01B9E81EECACE94DBBD0A556E768FB8A01E3C7A0@NAMAIL2.ad.lsil.com>
Message-ID: <1196690201.6758.10.camel@koenVRT>

Can you give more information on which analyzer you used?


Regards,

Koen


On Fri, 2007-11-30 at 10:48 -0700, Batwara, Ashish wrote:
> This is what I did as suggested by Vu and it seems to be working.
> However, when I send 2MB IO, it gets broken into 512K+1MB+512K by SRP
> as seen on analyzer. I am just wondering what the logic is? On the
> other side, when we increase the srp_sg_tablesize beyond 256, we are
> seeing following message in /var/log/messages “Nov 29 21:17:50 p50
> kernel:   REJ reason 0x3” which indicates “IB_CM_REJ_NO_RESOURCES”, so
> not sure how to get around to this problem to send larger IO than 1MB
> in one shot.
> 
> 
> modprobe ib_srp srp_sg_tablesize=256
> 
> echo id_ext=200600A0B81138C9,max_sect=4096,ioc_guid=00a0b81112da0003,dgid=fe8000000000000000a0b81112da0001,pkey=ffff,service_id=200600a0b81138c9 > /sys/class/infiniband_srp/srp-mthca0-1/add_target
> 
> -----Original Message-----
> From: chas williams - CONTRACTOR [mailto:chas at cmf.nrl.navy.mil]
> Sent: Friday, November 30, 2007 11:43 AM
> To: Kevin Harms
> Cc: Vu Pham; openib-general at openib.org; Batwara, Ashish
> Subject: Re: [ofa-general] IO Size more than 48K
> 
> additionally, you might need to echo 'blocks' >
> /sys/block/<device>/queue/max_hw_segments to increase the size of the
> rdma segments.
> 
> max_hw_segments doesn't exist on all kernels i think.
> 
> In message <3A453CF1-5FFC-44BF-8F72-7E3EF5AA6E41 at alcf.anl.gov>, Kevin Harms writes:
> >
> >     you may also have to go to /sys/block/sdX/queue and echo 1024 > max_sectors_kb
> >     if you use the srp_daemon you can also add:
> >     a max_sect=2048 to /etc/srp_daemon.conf
> >
> >kevin
> >
> >On Nov 29, 2007, at 11:08 AM, Vu Pham wrote:
> >
> >>
> >>> Hi,
> >>> We are using OFED-1.2, and using xdd and some other tools, and trying
> >>> to send 1/2MB IOs, but what we are seeing in analyzer traces is that
> >>> the memory descriptor in the SRP command shows max. 48K, which means
> >>> 1MB I/Os have been broken into smaller SRP requests by the initiator.
> >>> How can I have this I/O going directly to the target? What parameter
> >>> do I need to change?
> >>>
> >>
> >> module param srp_sg_tablesize (default is 12 ie. 12 x 4K = 48K)
> >> and/or
> >> max_sect=yyy in echo id_ext=xxx,...,max_sect=1024,service_id= > /sys/class/infiniband_srp/...
> >>
> >> -vu
> >>
> >>> Thanks
> >>> Ashish
> >>> _______________________________________________
> >>> general mailing list
> >>> general at lists.openfabrics.org
> >>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >>>
> >>> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> >>>
> >>
> >> _______________________________________________
> >> general mailing list
> >> general at lists.openfabrics.org
> >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >>
> >> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> >>
> >
> >_______________________________________________
> >general mailing list
> >general at lists.openfabrics.org
> >http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >
> >To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> >
> 
> 
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
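
The size arithmetic quoted in this thread can be checked with a small
sketch. It assumes, as the thread's "12 x 4K = 48K" implies, that each
scatter/gather entry covers one 4 KiB page, and it further assumes that
max_sect is counted in 512-byte sectors. The helper names are
hypothetical and are not part of ib_srp.

```c
#include <stdint.h>

/* Assumption: one scatter/gather entry maps one 4 KiB page. */
#define SG_ENTRY_BYTES 4096u
/* Assumption: max_sect is expressed in 512-byte sectors. */
#define SECTOR_BYTES 512u

/* Largest request that fits in sg_tablesize entries (hypothetical helper). */
static uint64_t srp_sg_limit_bytes(uint32_t sg_tablesize)
{
	return (uint64_t)sg_tablesize * SG_ENTRY_BYTES;
}

/* Largest request allowed by max_sect (hypothetical helper). */
static uint64_t srp_sect_limit_bytes(uint32_t max_sect)
{
	return (uint64_t)max_sect * SECTOR_BYTES;
}

/* The smaller of the two limits bounds a single request. */
static uint64_t srp_request_limit_bytes(uint32_t sg_tablesize, uint32_t max_sect)
{
	uint64_t sg = srp_sg_limit_bytes(sg_tablesize);
	uint64_t sect = srp_sect_limit_bytes(max_sect);

	return sg < sect ? sg : sect;
}
```

Under these assumptions the default srp_sg_tablesize=12 gives 48 KiB, and
srp_sg_tablesize=256 with max_sect=4096 gives 1 MiB, because the
scatter/gather limit (1 MiB) is smaller than the sector limit (2 MiB).
That would be consistent with the largest piece seen on the analyzer
being 1MB, though the thread itself does not confirm the cause of the
split.
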
*** Disclaimer ***

Vlaamse Radio- en Televisieomroep
Auguste Reyerslaan 52, 1043 Brussel

nv van publiek recht
BTW BE 0244.142.664
RPR Brussel
http://www.vrt.be/disclaimer


From swise at opengridcomputing.com  Mon Dec  3 06:23:12 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Mon, 03 Dec 2007 08:23:12 -0600
Subject: [ofa-general] ofa_1_3_kernel 20071203-0200 daily build status
In-Reply-To: <20071203105052.8828CE60053@openfabrics.org>
References: <20071203105052.8828CE60053@openfabrics.org>
Message-ID: <47541150.6050707@opengridcomputing.com>

Vladimir Sokolovsky (Mellanox) wrote:
> This email was generated automatically, please do not reply
> 
> 
> git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
> git_branch: ofed_kernel
> 
> Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod
> 
> Passed:
> Passed on i686 with linux-2.6.22
> Passed on i686 with linux-2.6.21.1
> Passed on i686 with linux-2.6.18
> Passed on i686 with linux-2.6.17
> Passed on i686 with linux-2.6.19
> Passed on i686 with linux-2.6.13
> Passed on i686 with linux-2.6.16
> Passed on i686 with linux-2.6.15
> Passed on i686 with linux-2.6.14
> Passed on i686 with linux-2.6.12
> Passed on x86_64 with linux-2.6.22
> Passed on x86_64 with linux-2.6.21.1
> Passed on x86_64 with linux-2.6.16
> Passed on x86_64 with linux-2.6.20
> Passed on x86_64 with linux-2.6.18
> Passed on ia64 with linux-2.6.21.1
> Passed on x86_64 with linux-2.6.19
> Passed on ia64 with linux-2.6.23
> Passed on ia64 with linux-2.6.22
> Passed on x86_64 with linux-2.6.12
> Passed on x86_64 with linux-2.6.15
> Passed on powerpc with linux-2.6.12
> Passed on x86_64 with linux-2.6.13
> Passed on powerpc with linux-2.6.13
> Passed on x86_64 with linux-2.6.16.43-0.3-smp
> Passed on x86_64 with linux-2.6.14
> Passed on x86_64 with linux-2.6.17
> Passed on powerpc with linux-2.6.15
> Passed on ppc64 with linux-2.6.16
> Passed on ppc64 with linux-2.6.15
> Passed on powerpc with linux-2.6.14
> Passed on ppc64 with linux-2.6.14
> Passed on ppc64 with linux-2.6.12
> Passed on ppc64 with linux-2.6.19
> Passed on ia64 with linux-2.6.18
> Passed on x86_64 with linux-2.6.18-53.el5
> Passed on ppc64 with linux-2.6.13
> Passed on ia64 with linux-2.6.19
> Passed on ppc64 with linux-2.6.18
> Passed on x86_64 with linux-2.6.16.21-0.8-smp
> Passed on x86_64 with linux-2.6.9-55.ELsmp
> Passed on x86_64 with linux-2.6.18-8.el5
> Passed on x86_64 with linux-2.6.18-1.2798.fc6
> Passed on ia64 with linux-2.6.13
> Passed on ia64 with linux-2.6.12
> Passed on ia64 with linux-2.6.17
> Passed on ia64 with linux-2.6.14
> Passed on ia64 with linux-2.6.15
> Passed on ia64 with linux-2.6.16
> Passed on ppc64 with linux-2.6.17
> Passed on x86_64 with linux-2.6.9-42.ELsmp
> Passed on ppc64 with linux-2.6.18-8.el5
> Passed on ia64 with linux-2.6.16.21-0.8-default
> 
> Failed:
> Build failed on i686 with 2.6.15-23-server
> Log:
> 		-I/usr/local/include/scst \
> 		-I/home/vlad/tmp/ofa_1_3_kernel-20071203-0200_check/drivers/infiniband/ulp/srpt \
> 		-I/home/vlad/tmp/ofa_1_3_kernel-20071203-0200_check/drivers/net/cxgb3 \
> 		-Iinclude \
> 		$(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \
> 		' \
> 		modules
> make: *** /lib/modules/2.6.15-23-server/build: No such file or directory.  Stop.
> make: *** [kernel] Error 2
> ----------------------------------------------------------------------------------
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Jeff, I think this problem may have started when you updated the server.
The update must have removed /lib/modules/2.6.15-23-server/build.

Steve.


From vlad at dev.mellanox.co.il  Mon Dec  3 07:01:49 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Mon, 03 Dec 2007 17:01:49 +0200
Subject: [ofa-general] OFA server patching
In-Reply-To: <474DFA90.6070907@nasa.gov>
References: <474DFA90.6070907@nasa.gov>
Message-ID: <47541A5D.9090106@dev.mellanox.co.il>

Jeff Becker wrote:
> Hi all. In the interest of keeping our server up to date, I applied the
> latest Ubuntu patches. Several upgrades were made, including git. If you
> have any problems, let me know. Thanks.
> 
> -jeff

Hi,
OFED-1.3 daily builds have been broken since 29 Nov 2007.
Autotools, g++ and some other packages were removed by the server patching.
I have reinstalled autotools and the other missing packages on the OFA
server, so the OFED-1.3 daily builds are OK now.

Regards,
Vladimir


From kliteyn at dev.mellanox.co.il  Mon Dec  3 07:15:56 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Mon, 03 Dec 2007 17:15:56 +0200
Subject: [ofa-general] Re: [PATCH 2/3] opensm: adding missing comparison
 by	to_lid/from_lid in LinkRecord processing
In-Reply-To: <20071202180949.GE708@sashak.voltaire.com>
References: <4752A1DB.5010103@dev.mellanox.co.il>
	<20071202180949.GE708@sashak.voltaire.com>
Message-ID: <47541DAC.9000900@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> Hi Yevgeny,
> 
> On 14:15 Sun 02 Dec     , Yevgeny Kliteynik wrote:
>> Comparison for IB_LR_COMPMASK_FROM_LID/IB_LR_COMPMASK_TO_LID
>> component mask bits was missing in LinkRecord processing.
>>
>> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>> ---
>>  opensm/opensm/osm_sa_link_record.c |   13 +++++++++++--
>>  1 files changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
>> index ba52aea..0970ad7 100644
>> --- a/opensm/opensm/osm_sa_link_record.c
>> +++ b/opensm/opensm/osm_sa_link_record.c
>> @@ -256,6 +256,17 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>>  		if (dest_port_num != p_lr->to_port_num)
>>  			goto Exit;
>>
>> +	__get_base_lid(p_src_physp, &from_base_lid);
>> +	__get_base_lid(p_dest_physp, &to_base_lid);
>> +
>> +	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
>> +		if (from_base_lid != p_lr->from_lid)
>> +			goto Exit;
>> +
>> +	if (comp_mask & IB_LR_COMPMASK_TO_LID)
>> +		if (to_base_lid != p_lr->to_lid)
>> +			goto Exit;
> 
> Would this be correct with LMC > 0? As far as I understand, aliased (not
> base) LIDs can be used in a query.

Good catch, thanks.

-- Yevgeny

> Sasha
> 
>> +
>>  	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
>>  		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
>>  			"__osm_lr_rcv_get_physp_link: "
>> @@ -267,8 +278,6 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>>  			cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
>>  			dest_port_num);
>>
>> -	__get_base_lid(p_src_physp, &from_base_lid);
>> -	__get_base_lid(p_dest_physp, &to_base_lid);
>>
>>  	__osm_lr_rcv_build_physp_link(p_rcv, from_base_lid,
>>  				      to_base_lid, src_port_num,
>> -- 
>> 1.5.1.4
>>
> 


From kliteyn at dev.mellanox.co.il  Mon Dec  3 07:16:55 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Mon, 03 Dec 2007 17:16:55 +0200
Subject: [ofa-general] [PATCH 2/3 v2] opensm: adding missing comparison by
 to_lid/from_lid in LinkRecord processing
Message-ID: <47541DE7.4000106@dev.mellanox.co.il>

Comparison for IB_LR_COMPMASK_FROM_LID/IB_LR_COMPMASK_TO_LID
component mask bits was missing in LinkRecord processing.

Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_sa_link_record.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
index ba52aea..1230b91 100644
--- a/opensm/opensm/osm_sa_link_record.c
+++ b/opensm/opensm/osm_sa_link_record.c
@@ -177,6 +177,7 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 	uint8_t dest_port_num;
 	ib_net16_t from_base_lid;
 	ib_net16_t to_base_lid;
+	uint16_t lmc_mask;

 	OSM_LOG_ENTER(p_rcv->p_log, __osm_lr_rcv_get_physp_link);

@@ -256,6 +257,19 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 		if (dest_port_num != p_lr->to_port_num)
 			goto Exit;

+	__get_base_lid(p_src_physp, &from_base_lid);
+	__get_base_lid(p_dest_physp, &to_base_lid);
+
+	lmc_mask = ~((1 << p_rcv->p_subn->opt.lmc) - 1);
+
+	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
+		if (from_base_lid != (p_lr->from_lid & lmc_mask))
+			goto Exit;
+
+	if (comp_mask & IB_LR_COMPMASK_TO_LID)
+		if (to_base_lid != (p_lr->to_lid & lmc_mask))
+			goto Exit;
+
 	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_lr_rcv_get_physp_link: "
@@ -267,8 +281,6 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 			cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
 			dest_port_num);

-	__get_base_lid(p_src_physp, &from_base_lid);
-	__get_base_lid(p_dest_physp, &to_base_lid);

 	__osm_lr_rcv_build_physp_link(p_rcv, from_base_lid,
 				      to_base_lid, src_port_num,
-- 
1.5.1.4
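
For reference, the LMC masking used in the patch above can be sketched in
isolation. With LMC = n, a port answers to 2^n consecutive LIDs whose low
n bits are path bits, so clearing those bits yields the base LID. The
helper names below are hypothetical, and the sketch deliberately works on
host-byte-order values; if the LinkRecord LID fields are in network byte
order (as the ib_net16_t type name suggests), they would presumably need
converting, for example with cl_ntoh16, before a host-order mask is
applied.

```c
#include <stdint.h>

/*
 * Clear the LMC path bits to obtain the base LID.
 * Hypothetical helper; lid_host must be in host byte order
 * and lmc is expected to be in the range 0..7.
 */
static uint16_t lmc_base_lid(uint16_t lid_host, uint8_t lmc)
{
	uint16_t mask = (uint16_t)~((1u << lmc) - 1u);

	return lid_host & mask;
}

/*
 * Return 1 if query_lid_host is any LID (base or aliased) of the
 * port whose base LID is base_lid_host, 0 otherwise.
 */
static int lid_matches_port(uint16_t query_lid_host,
			    uint16_t base_lid_host, uint8_t lmc)
{
	return lmc_base_lid(query_lid_host, lmc) == base_lid_host;
}
```

With LMC = 2 and a base LID of 0x10, the LIDs 0x10 through 0x13 all match
and 0x14 does not; with LMC = 0 only the base LID itself matches.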


From hrosenstock at xsigo.com  Mon Dec  3 07:26:56 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 03 Dec 2007 07:26:56 -0800
Subject: [ofa-general] Re: [PATCH 2/3] opensm: adding missing
	comparison by	to_lid/from_lid in LinkRecord processing
In-Reply-To: <47541DAC.9000900@dev.mellanox.co.il>
References: <4752A1DB.5010103@dev.mellanox.co.il>
	<20071202180949.GE708@sashak.voltaire.com>
	<47541DAC.9000900@dev.mellanox.co.il>
Message-ID: <1196695616.10845.241.camel@hrosenstock-ws.xsigo.com>

On Mon, 2007-12-03 at 17:15 +0200, Yevgeny Kliteynik wrote:
> Sasha Khapyorsky wrote:
> > Hi Yevgeny,
> > 
> > On 14:15 Sun 02 Dec     , Yevgeny Kliteynik wrote:
> >> Comparison for IB_LR_COMPMASK_FROM_LID/IB_LR_COMPMASK_TO_LID
> >> component mask bits was missing in LinkRecord processing.
> >>
> >> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> >> ---
> >>  opensm/opensm/osm_sa_link_record.c |   13 +++++++++++--
> >>  1 files changed, 11 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
> >> index ba52aea..0970ad7 100644
> >> --- a/opensm/opensm/osm_sa_link_record.c
> >> +++ b/opensm/opensm/osm_sa_link_record.c
> >> @@ -256,6 +256,17 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
> >>  		if (dest_port_num != p_lr->to_port_num)
> >>  			goto Exit;
> >>
> >> +	__get_base_lid(p_src_physp, &from_base_lid);
> >> +	__get_base_lid(p_dest_physp, &to_base_lid);
> >> +
> >> +	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
> >> +		if (from_base_lid != p_lr->from_lid)
> >> +			goto Exit;
> >> +
> >> +	if (comp_mask & IB_LR_COMPMASK_TO_LID)
> >> +		if (to_base_lid != p_lr->to_lid)
> >> +			goto Exit;
> > 
> > Would this be correct with LMC > 0? As far as I understand, aliased (not
> > base) LIDs can be used in a query.
> 
> Good catch, thanks.

Note that:
In a query request, any LID of a port can be requested as the ToLID. In
a query response, only the base LID of a port is returned as the ToLID.

-- Hal

> 
> -- Yevgeny
> 
> > Sasha
> > 
> >> +
> >>  	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
> >>  		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
> >>  			"__osm_lr_rcv_get_physp_link: "
> >> @@ -267,8 +278,6 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
> >>  			cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
> >>  			dest_port_num);
> >>
> >> -	__get_base_lid(p_src_physp, &from_base_lid);
> >> -	__get_base_lid(p_dest_physp, &to_base_lid);
> >>
> >>  	__osm_lr_rcv_build_physp_link(p_rcv, from_base_lid,
> >>  				      to_base_lid, src_port_num,
> >> -- 
> >> 1.5.1.4
> >>
> > 
> 
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


From kliteyn at dev.mellanox.co.il  Mon Dec  3 07:52:04 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Mon, 03 Dec 2007 17:52:04 +0200
Subject: [ofa-general] Re: [PATCH 2/3] opensm: adding missing	comparison
	by	to_lid/from_lid in LinkRecord processing
In-Reply-To: <1196695616.10845.241.camel@hrosenstock-ws.xsigo.com>
References: <4752A1DB.5010103@dev.mellanox.co.il>	
	<20071202180949.GE708@sashak.voltaire.com>	
	<47541DAC.9000900@dev.mellanox.co.il>
	<1196695616.10845.241.camel@hrosenstock-ws.xsigo.com>
Message-ID: <47542624.1090003@dev.mellanox.co.il>

Hal Rosenstock wrote:
> On Mon, 2007-12-03 at 17:15 +0200, Yevgeny Kliteynik wrote:
>> Sasha Khapyorsky wrote:
>>> Hi Yevgeny,
>>>
>>> On 14:15 Sun 02 Dec     , Yevgeny Kliteynik wrote:
>>>> Comparison for IB_LR_COMPMASK_FROM_LID/IB_LR_COMPMASK_TO_LID
>>>> component mask bits was missing in LinkRecord processing.
>>>>
>>>> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>>>> ---
>>>>  opensm/opensm/osm_sa_link_record.c |   13 +++++++++++--
>>>>  1 files changed, 11 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
>>>> index ba52aea..0970ad7 100644
>>>> --- a/opensm/opensm/osm_sa_link_record.c
>>>> +++ b/opensm/opensm/osm_sa_link_record.c
>>>> @@ -256,6 +256,17 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>>>>  		if (dest_port_num != p_lr->to_port_num)
>>>>  			goto Exit;
>>>>
>>>> +	__get_base_lid(p_src_physp, &from_base_lid);
>>>> +	__get_base_lid(p_dest_physp, &to_base_lid);
>>>> +
>>>> +	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
>>>> +		if (from_base_lid != p_lr->from_lid)
>>>> +			goto Exit;
>>>> +
>>>> +	if (comp_mask & IB_LR_COMPMASK_TO_LID)
>>>> +		if (to_base_lid != p_lr->to_lid)
>>>> +			goto Exit;
>>> Would this be correct with LMC > 0? As far as I understand, aliased (not
>>> base) LIDs can be used in a query.
>> Good catch, thanks.
> 
> Note that:
> In a query request, any LID of a port can be requested as the ToLID. In
> a query response, only the base LID of a port is returned as the ToLID.

Right.
The current implementation does build the response with only base
LIDs, but the resulting list will contain more than one LinkRecord with
the same base LIDs: one for each LID when LMC > 0.

I'll repost the patch.
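
For reference, a minimal sketch of the comparison being discussed,
assuming LIDs in host byte order and assuming (as I understand the IBA
rules) that a base LID has its low LMC bits clear. The helper name is
hypothetical and this is not the OpenSM code; it only shows how a
requested LID could be matched against a port's whole LID range rather
than against the base LID alone.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of OpenSM. LIDs are in host byte order.
 * Assumption: a port with base LID `base_lid` and LMC `lmc` answers to
 * the 2^lmc consecutive LIDs starting at base_lid, and base_lid has its
 * low `lmc` bits clear.
 */
static bool lid_belongs_to_port(uint16_t lid, uint16_t base_lid, uint8_t lmc)
{
	/* Mask that clears the low `lmc` bits of a LID. */
	uint16_t mask = (uint16_t)~((1u << lmc) - 1u);

	return (uint16_t)(lid & mask) == base_lid;
}
```

With LMC = 0 this reduces to a plain equality test; with LMC = 2 and a
base LID of 0x10, LIDs 0x10 through 0x13 match and 0x14 does not.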

-- Yevgeny

> -- Hal
> 
>> -- Yevgeny
>>
>>> Sasha
>>>
>>>> +
>>>>  	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
>>>>  		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
>>>>  			"__osm_lr_rcv_get_physp_link: "
>>>> @@ -267,8 +278,6 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>>>>  			cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
>>>>  			dest_port_num);
>>>>
>>>> -	__get_base_lid(p_src_physp, &from_base_lid);
>>>> -	__get_base_lid(p_dest_physp, &to_base_lid);
>>>>
>>>>  	__osm_lr_rcv_build_physp_link(p_rcv, from_base_lid,
>>>>  				      to_base_lid, src_port_num,
>>>> -- 
>>>> 1.5.1.4
>>>>
>> _______________________________________________
>> general mailing list
>> general at lists.openfabrics.org
>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>>
>> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> 


From swise at opengridcomputing.com  Mon Dec  3 08:01:35 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Mon, 03 Dec 2007 10:01:35 -0600
Subject: [ofa-general] [ANNOUNCE] libcxgb3 1.1.0 released
Message-ID: <4754285F.1040801@opengridcomputing.com>

Version 1.1.0 of libcxgb3 is available from:

http://www.openfabrics.com/downloads/cxgb3/libcxgb3-1.1.0.tar.gz

This release is included in the latest ofed-1.2.5 and will be (soon) in 
ofed-1.3.  This version enables support for version 5.0 of the Chelsio 
RNIC firmware, and is required if you are upgrading to that firmware.


From tziporet at mellanox.co.il  Mon Dec  3 08:24:04 2007
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Mon, 3 Dec 2007 18:24:04 +0200
Subject: [ofa-general] Agenda for OFED meeting today
Message-ID: <6C2C79E72C305246B504CBA17B5500C90282E454@mtlexch01.mtl.com>

This is the agenda for OFED meeting today:

1. OFA server status (I asked Jeff Becker to join the meeting)
2. Beta testing status - several companies did not participate last
week, so the status of the beta release is not clear to me.
3. RC1 plans - due to the build issues on the OFA server, we have a new
build only today. We need to decide on the appropriate time for RC1.
    Also, without a functioning bugzilla, it is not clear how we can
track issues.
4. Open issues


Tziporet Koren
Software Director
Mellanox Technologies
mailto: tziporet at mellanox.co.il
Tel +972-4-9097200, ext 380


From chas at cmf.nrl.navy.mil  Mon Dec  3 09:14:28 2007
From: chas at cmf.nrl.navy.mil (chas williams - CONTRACTOR)
Date: Mon, 03 Dec 2007 12:14:28 -0500
Subject: [ofa-general] IO Size more than 48K 
In-Reply-To: <1196690201.6758.10.camel@koenVRT> 
Message-ID: <200712031714.lB3HESk5028599@cmf.nrl.navy.mil>

lecroy catc.  it's the only 4x analyzer on the market, to the best
of my knowledge.

In message <1196690201.6758.10.camel at koenVRT>,Koen Segers writes:
>Can you give more information on which analyzer you used?
>
>
>Regards,
>
>Koen
>
>
>On Fri, 2007-11-30 at 10:48 -0700, Batwara, Ashish wrote:
>> This is what I did as suggested by Vu and it seems to be working.
>> However, when I send 2MB IO, it gets broken into 512K+1MB+512K by SRP
>> as seen on analyzer. I am just wondering what the logic is? On the
>> other side, when we increase the srp_sg_tablesize beyond 256, we are
>> seeing following message in /var/log/messages “Nov 29 21:17:50 p50
>> kernel:   REJ reason 0x3” which indicates “IB_CM_REJ_NO_RESOURCES”, so
>> not sure how to get around to this problem to send larger IO than 1MB
>> in one shot.
>> 
>> modprobe ib_srp srp_sg_tablesize=256
>> 
>> echo id_ext=200600A0B81138C9,max_sect=4096,ioc_guid=00a0b81112da0003,dgid=fe8000000000000000a0b81112da0001,pkey=ffff,service_id=200600a0b81138c9 > /sys/class/infiniband_srp/srp-mthca0-1/add_target
>> 
>> -----Original Message-----
>> From: chas williams - CONTRACTOR [mailto:chas at cmf.nrl.navy.mil] 
>> Sent: Friday, November 30, 2007 11:43 AM
>> To: Kevin Harms
>> Cc: Vu Pham; openib-general at openib.org; Batwara, Ashish
>> Subject: Re: [ofa-general] IO Size more than 48K 
>> 
>> additionally, you might need to echo 'blocks' >
>> /sys/block/<device>/queue/max_hw_segments to increase the size of the
>> rdma segments.
>> 
>> max_hw_segments doesn't exist on all kernels, i think.
>> 
>> In message <3A453CF1-5FFC-44BF-8F72-7E3EF5AA6E41 at alcf.anl.gov>,Kevin Harms writes:
>> >
>> >     you may also have to go to /sys/block/sdX/queue and echo 1024 > max_sectors_kb
>> >     if you use the srp_daemon you can also add:
>> >     a max_sect=2048 to /etc/srp_daemon.conf
>> >
>> >kevin
>> >
>> >On Nov 29, 2007, at 11:08 AM, Vu Pham wrote:
>> >
>> >>
>> >>> Hi,
>> >>> We are using OFED-1.2, and using xdd and some other tools, and trying to
>> >>> send 1/2MB IOs, but what we are seeing in analyzer traces, that memory
>> >>> descriptor in SRP command shows max. 48K which means 1MB I/Os has broken
>> >>> into smaller SRP request from initiator.
>> >>> How can I have this I/O directly going to target? What parameter I need
>> >>> to change?
>> >>>
>> >>
>> >> module param srp_sg_tablesize (default is 12 ie. 12 x 4K = 48K)
>> >> and/or
>> >> max_sect=yyy in echo id_ext=xxx,...,max_sect=1024,service_id= > /sys/class/infiniband_srp/...
>> >>
>> >> -vu
>> >>
>> >>> Thanks
>> >>> Ashish
>> >>> _______________________________________________
>> >>> general mailing list
>> >>> general at lists.openfabrics.org
>> >>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>> >>>
>> >>> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>> >>>
>> >>
>> >> _______________________________________________
>> >> general mailing list
>> >> general at lists.openfabrics.org
>> >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>> >>
>> >> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>> >>
>> >
>> >_______________________________________________
>> >general mailing list
>> >general at lists.openfabrics.org
>> >http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>> >
>> >To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>> >
>> 
>> _______________________________________________
>> general mailing list
>> general at lists.openfabrics.org
>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>> 
>> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>*** Disclaimer ***
>
>Vlaamse Radio- en Televisieomroep
>Auguste Reyerslaan 52, 1043 Brussel
>
>nv van publiek recht
>BTW BE 0244.142.664
>RPR Brussel
>http://www.vrt.be/disclaimer
>
>


From Ashish.Batwara at lsi.com  Mon Dec  3 09:25:12 2007
From: Ashish.Batwara at lsi.com (Batwara, Ashish)
Date: Mon, 3 Dec 2007 10:25:12 -0700
Subject: [ofa-general] IO Size more than 48K 
In-Reply-To: <200712031714.lB3HESk5028599@cmf.nrl.navy.mil>
Message-ID: <01B9E81EECACE94DBBD0A556E768FB8A01E3CA31@NAMAIL2.ad.lsil.com>

Yes. We are using CATC. Our customers are using Finisar too, but none of the analyzer vendors has an IB DDR analyzer, and we are looking for one. 
We talked to CATC (LeCroy) two weeks ago; it is on their roadmap, with a tentative date of Q3'08. Does anyone know of a vendor that currently has a DDR analyzer?

Thanks
Ashish

-----Original Message-----
From: chas williams - CONTRACTOR [mailto:chas at cmf.nrl.navy.mil] 
Sent: Monday, December 03, 2007 11:14 AM
To: koen.segers at VRT.BE
Cc: Batwara, Ashish; Kevin Harms; openib-general at openib.org
Subject: Re: [ofa-general] IO Size more than 48K 

lecroy catc.  its the only 4x analyzer on the market to the best
of my knowledge.


From wei.fang at hermes-microvision.com  Mon Dec  3 09:36:29 2007
From: wei.fang at hermes-microvision.com (Wei Fang)
Date: Mon, 03 Dec 2007 09:36:29 -0800
Subject: [ofa-general] Question:  Verbs API Error code recover
In-Reply-To: <47527C85.8050801@dev.mellanox.co.il>
References: <4750A321.5080406@hermes-microvision.com>
	<47527C85.8050801@dev.mellanox.co.il>
Message-ID: <47543E9D.4010300@hermes-microvision.com>

Hi, Dotan:

Thank you for your answer. Actually, when I got that error, I quit my 
program and restarted it to reconnect the QPs, but I still got 
errors for all of the later WRs, so I had to restart the computer. Before I used 
OFED, my program was based on the old Mellanox verbs API, and I didn't see this 
problem there, so I don't know what causes it.

Dotan Barak wrote:
> Hi.
>
> Wei Fang wrote:
>> Hi, All:
>>
>> I joined this list just a few days ago. Right now I'm facing a problem 
>> using the OFED 1.2.5 verbs API.   In my program, I use RDMA Write 
>> (ibv_post_send) to transfer data. Then I use ibv_poll_cq 
>> to get the completion.  Sometimes the status returned by ibv_poll_cq is 
>> IBV_WC_RETRY_EXC_ERR (error code 12).  Once this error 
>> happens, every later transfer fails.  In this case, I have to 
>> restart the computer.  Can anyone tell me how to recover from this error 
>> without quitting the program or restarting the PC?
>>
>
> If you have a completion with status IBV_WC_RETRY_EXC_ERR your QP 
> state will be moved to error, so all of the WR that you will post 
> after this will fail too.
> If you have this failure you need to reconnect the QPs (i don't know 
> why you need to restart the computer in order to fix this ....).
>
>
> I think that you need to check why you got this completion status in 
> the first place (did the remote side close the QP?)
>

> Dotan
>
>

-- 
Best Regards

Wei Fang

Hermes Microvision Inc.

(Tel)       (408)597-8600
(Fax)       (408)597-8601
(Direct Tel)(408)597-8646



From Jeffrey.C.Becker at nasa.gov  Mon Dec  3 09:54:07 2007
From: Jeffrey.C.Becker at nasa.gov (Jeff Becker)
Date: Mon, 03 Dec 2007 09:54:07 -0800
Subject: [ofa-general] Re: Agenda for OFED meeting today
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C90282E454@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C90282E454@mtlexch01.mtl.com>
Message-ID: <475442BF.50103@nasa.gov>

Tziporet Koren wrote:
>
> This is the agenda for OFED meeting today:
>
> 1. OFA server status (I asked Jeff Becker to join the meeting)
>
I was unable to read my e-mail until just now. I tried joining the
meeting but it had ended. I'm working on fixing bugzilla and the wiki
along with other miscellaneous things. Is there a priority for these?
Thanks.

-jeff

> 2. Beta testing status - several companies did not participate last
> week, so the status of the beta release is not clear to me.
>
> 3. RC1 plans - due to the build issues on the OFA server, we have a new
> build only today. We need to decide on the appropriate time for RC1.
>
>     Also, without a functioning bugzilla, it is not clear how we can track issues.
> 4. Open issues
>
>
> Tziporet Koren
> Software Director
> Mellanox Technologies
> mailto: tziporet at mellanox.co.il
> Tel +972-4-9097200, ext 380
>


From mshefty at ichips.intel.com  Mon Dec  3 10:12:39 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Mon, 03 Dec 2007 10:12:39 -0800
Subject: [ofa-general] Two element types on cancel_list in	cancel_mads()?
In-Reply-To: <1196522620.10845.173.camel@hrosenstock-ws.xsigo.com>
References: <474F598D.70701@opengridcomputing.com>	
	<474F5CA4.8030604@ichips.intel.com>	
	<1196518346.10845.155.camel@hrosenstock-ws.xsigo.com>
	<1196522620.10845.173.camel@hrosenstock-ws.xsigo.com>
Message-ID: <47544717.3040705@ichips.intel.com>

>> commit 2c153b934dca08d58e0aafde18a182e0891aa201
>> Author: Hal Rosenstock <halr at voltaire.com>
>> Date:   Wed Jul 27 11:45:31 2005 -0700
>>
>>     [PATCH] IB: Eliminate MAD cache leak associated with local completions
>>     
>>     Eliminate MAD cache leak associated with local completions.  Also, when
>>     canceling MAD, empty local completion list as well.
>>     
>>     Signed-off-by: Hal Rosenstock <halr at voltaire.com>
>>     Cc: Roland Dreier <rolandd at cisco.com>
>>     Signed-off-by: Andrew Morton <akpm at osdl.org>
>>     Signed-off-by: Linus Torvalds <torvalds at osdl.org>
>>
>> More later...
> 
> FWIW, I traced the origin of this change back to the following thread:
> 
> http://lists.openfabrics.org/pipermail/general/2005-May/thread.html
> 
> http://lists.openfabrics.org/pipermail/general/2005-May/005951.html
> 
> Subject:
> 
> slab error in kmem_cache_destroy(): cache `ib_mad': Can't free all
> objects

Looking back at the patch above, I think that emptying the local 
completion list in cancel_mads() is separate from fixing the cache leak 
in local_completions().  By removing the local list handling from 
cancel_mads(), all handling of those mads should go through 
local_completions() instead, where the memory leak was addressed.

- Sean


From hrosenstock at xsigo.com  Mon Dec  3 10:47:38 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 03 Dec 2007 10:47:38 -0800
Subject: [ofa-general] Two element types on cancel_list in	cancel_mads()?
In-Reply-To: <47544717.3040705@ichips.intel.com>
References: <474F598D.70701@opengridcomputing.com>
	<474F5CA4.8030604@ichips.intel.com>
	<1196518346.10845.155.camel@hrosenstock-ws.xsigo.com>
	<1196522620.10845.173.camel@hrosenstock-ws.xsigo.com>
	<47544717.3040705@ichips.intel.com>
Message-ID: <1196707658.30768.39.camel@hrosenstock-ws.xsigo.com>

On Mon, 2007-12-03 at 10:12 -0800, Sean Hefty wrote:
> >> commit 2c153b934dca08d58e0aafde18a182e0891aa201
> >> Author: Hal Rosenstock <halr at voltaire.com>
> >> Date:   Wed Jul 27 11:45:31 2005 -0700
> >>
> >>     [PATCH] IB: Eliminate MAD cache leak associated with local completions
> >>     
> >>     Eliminate MAD cache leak associated with local completions.  Also, when
> >>     canceling MAD, empty local completion list as well.
> >>     
> >>     Signed-off-by: Hal Rosenstock <halr at voltaire.com>
> >>     Cc: Roland Dreier <rolandd at cisco.com>
> >>     Signed-off-by: Andrew Morton <akpm at osdl.org>
> >>     Signed-off-by: Linus Torvalds <torvalds at osdl.org>
> >>
> >> More later...
> > 
> > FWIW, I traced the origin of this change back to the following thread:
> > 
> > http://lists.openfabrics.org/pipermail/general/2005-May/thread.html
> > 
> > http://lists.openfabrics.org/pipermail/general/2005-May/005951.html
> > 
> > Subject:
> > 
> > slab error in kmem_cache_destroy(): cache `ib_mad': Can't free all
> > objects
> 
> Looking back at the patch above, I think that emptying the local 
> completion list in cancel_mads() is separate from fixing the cache leak 
> in local_completions().  By removing the local list handling from 
> cancel_mads(), all handling of those mads should go through 
> local_completions() instead, where the memory leak was addressed.

Yes, in looking at this again, that seems right. So the fix may be
just that simple...

It would be nice to rerun the script that was used to originally find
the cache leak to verify that aspect.

-- Hal

> - Sean


From ssufficool at rov.sbcounty.gov  Mon Dec  3 11:21:55 2007
From: ssufficool at rov.sbcounty.gov (Sufficool, Stanley)
Date: Mon, 3 Dec 2007 11:21:55 -0800
Subject: [ofa-general] RE: SRP Target Port
In-Reply-To: <47504A93.7090704@systemfabricworks.com>
Message-ID: <C2F174F99918D54CA2A96E57C5079B6F35518F@sbc-exmsg2.sbcounty.gov>

I have specified the initiator and target GUIDs in the group membership
so as to isolate initiators to specific target ports, and I get no
connections. However, when I leave this as the Default group and check the
initiator port using OSMTEST for WinIB, I get a port ID that is 1 higher than
that reported by /proc/scst_tgt/sessions:
 
WinIB osmtest:
        1: GUID = 0x1708ffffd0dd61, lid = 0x0008, state = ACTIVE
        2: GUID = 0x1708ffffd0dd62, lid = 0x0011, state = ACTIVE
 
SCST Session:
Target name          Initiator name                      Group name  Command Count
ib_srpt              0x001a4bffff0cd045001708ffffd0dd60  Default     0

Is it safe to assume the session (SCST Group USER) will always be one
less than the initiator port id? The OFED target code seems to be
current (git reports no updates). 
 
 

From hrosenstock at xsigo.com  Mon Dec  3 11:27:50 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 03 Dec 2007 11:27:50 -0800
Subject: [ofa-general] RE: SRP Target Port
In-Reply-To: <C2F174F99918D54CA2A96E57C5079B6F35518F@sbc-exmsg2.sbcounty.gov>
References: <C2F174F99918D54CA2A96E57C5079B6F35518F@sbc-exmsg2.sbcounty.gov>
Message-ID: <1196710070.30768.49.camel@hrosenstock-ws.xsigo.com>

On Mon, 2007-12-03 at 11:21 -0800, Sufficool, Stanley wrote:
> I have specified the initiator and target GUIDs in the group membership
> so as to isolate initiators to specific target ports, and I get no
> connections. However, when I leave this as the Default group and check the
> initiator port using OSMTEST for WinIB, I get a port ID that is 1 higher
> than that reported by /proc/scst_tgt/sessions:
>  
> WinIB osmtest:
>         1: GUID = 0x1708ffffd0dd61, lid = 0x0008, state = ACTIVE
>         2: GUID = 0x1708ffffd0dd62, lid = 0x0011, state = ACTIVE
>  
> SCST Session:
> Target name          Initiator name                      Group name  Command Count
> ib_srpt              0x001a4bffff0cd045001708ffffd0dd60  Default     0

It's likely the difference between node and port GUID that you are
seeing.

> Is it safe to assume the session (SCST Group USER) will always be one
> less than the initiator port id?

It is a common convention but not all vendors use it and it is _not_
guaranteed to work per IBA.

-- Hal

>  The OFED target code seems to be current (git reports no updates). 
>  
>  
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


From vuhuong at mellanox.com  Mon Dec  3 11:39:07 2007
From: vuhuong at mellanox.com (Vu Pham)
Date: Mon, 03 Dec 2007 11:39:07 -0800
Subject: [ofa-general] RE: SRP Target Port
In-Reply-To: <C2F174F99918D54CA2A96E57C5079B6F35518F@sbc-exmsg2.sbcounty.gov>
References: <C2F174F99918D54CA2A96E57C5079B6F35518F@sbc-exmsg2.sbcounty.gov>
Message-ID: <47545B5B.5080507@mellanox.com>


> I have specified the initiator and target GUID in the group membership 
> as to isolate initiators to specific target ports and I get no 
> connections.
What exactly is the group name that you created?
You need to create a group with the same name as the *initiator name* (the first 8 bytes are 
the target port GUID and the last 8 bytes are the second 8 bytes of the initiator port ID).

> However, when I leave this as the Default group and check the initiator 
> port using OSMTEST for WinIB, I get a port ID that is 1 higher than that 
> reported by /proc/scst_tgt/sessions:
>  
> WinIB osmtest:
>         1: GUID = 0x1708ffffd0dd61, lid = 0x0008, state = ACTIVE
>         2: GUID = 0x1708ffffd0dd62, lid = 0x0011, state = ACTIVE
>  
> SCST Session:
> Target name          Initiator name                      Group name  Command Count
> ib_srpt              0x001a4bffff0cd045001708ffffd0dd60  Default     0
> Is it safe to assume the session (SCST Group USER) will always be one 
> less than the initiator port id? The OFED target code seems to be 
> current (git reports no updates).
>

To create the *initiator name*, the SRP target uses the 8 bytes of the target port GUID 
and the second 8 bytes of the *initiator port ID* (the initiator port ID is sent in 
the SRP login, in the private data of the connection request).

My guess is that Win SRP uses the HCA GUID (not the port GUID) to construct the last 
8 bytes of the initiator port ID.
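
As an illustration only, here is a small sketch of how such a name
could be assembled from the two 8-byte halves. The function name and
its parameters are hypothetical, and this is not the ib_srpt code; the
values in the usage note are the ones from the session listing quoted
above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from ib_srpt. Builds the name as
 * "0x" + target port GUID (8 bytes) + second 8 bytes of the initiator
 * port ID, each printed as 16 hex digits. Returns what snprintf returns.
 */
static int format_initiator_name(char *buf, size_t len,
				 uint64_t target_port_guid,
				 uint64_t initiator_id_low)
{
	return snprintf(buf, len, "0x%016" PRIx64 "%016" PRIx64,
			target_port_guid, initiator_id_low);
}
```

Called with 0x001a4bffff0cd045 and 0x001708ffffd0dd60, this gives
0x001a4bffff0cd045001708ffffd0dd60, which is the name shown in
/proc/scst_tgt/sessions. A group named after the port GUID ending in
dd61 would therefore not match that session.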


>  
> ------------------------------------------------------------------------
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


From jim at mellanox.com  Mon Dec  3 13:35:06 2007
From: jim at mellanox.com (Jim Mott)
Date: Mon, 3 Dec 2007 13:35:06 -0800
Subject: [ofa-general] [PATCH 1/1] SDP - Various bzcopy bugs
Message-ID: <F57121538EA0C94F86018DDD40ADA1D19C1C5E@mtiexch01.mti.com>

The Mellanox regression tests posted a number of failures when
multiple threads were accessing the same sockets concurrently.  In
addition to test failures, there were log messages of the form:
  sdp_sock(54386:19002): Could not reap -5 in-flight sends

This fix handles all these failures and errors.
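
The core idea of the change, as a simplified model: the zero-copy
context is carried by each send buffer instead of by the socket, so a
completion credits the context that posted that particular buffer.
The types and function names below are hypothetical stand-ins (for
struct bzcopy_state and the skb), and incrementing the counter at post
time is an assumption of this sketch, not something shown in the diff.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-call zero-copy state. */
struct send_ctx {
	int busy;		/* sends posted but not yet completed */
};

/* Hypothetical stand-in for a send buffer. */
struct send_buf {
	struct send_ctx *owner;	/* NULL when no zero-copy context applies */
};

/* Post path: tag the buffer with the context of the call that built it. */
static void post_send(struct send_buf *buf, struct send_ctx *ctx)
{
	buf->owner = ctx;
	if (ctx)
		ctx->busy++;
}

/* Completion path: credit whichever context owns this buffer, if any. */
static void complete_send(struct send_buf *buf)
{
	if (buf->owner) {
		buf->owner->busy--;
		buf->owner = NULL;
	}
}
```

With a single per-socket pointer, two threads sending on the same
socket would overwrite each other's context, and a completion could
decrement the wrong counter; that seems consistent with the negative
count in the log message above.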

Signed-off-by: Jim Mott <jim at mellanox.com>
---

Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp.h
===================================================================
--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp.h	2007-12-03 11:34:45.000000000 -0600
+++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp.h	2007-12-03 13:44:43.000000000 -0600
@@ -173,7 +173,6 @@
 
 	/* BZCOPY data */
 	int   zcopy_thresh;
-	void *zcopy_context;
 
 	struct ib_sge ibsge[SDP_MAX_SEND_SKB_FRAGS + 1];
 	struct ib_wc  ibwc[SDP_NUM_WC];
Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c
===================================================================
--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp_bcopy.c	2007-12-03 11:34:45.000000000 -0600
+++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c	2007-12-03 13:44:30.000000000 -0600
@@ -242,15 +242,11 @@
 	++ssk->tx_tail;
 
 	/* TODO: AIO and real zcopy cdoe; add their context support here */
-	if (ssk->zcopy_context && skb->data_len) {
+	if (skb->h.raw) {
 		struct bzcopy_state *bz;
-		struct sdp_bsdh *h;
 
-		h = (struct sdp_bsdh *)skb->data;
-		if (h->mid == SDP_MID_DATA) {
-			bz = (struct bzcopy_state *)ssk->zcopy_context;
-			bz->busy--;
-		}
+		bz = (struct bzcopy_state *)skb->h.raw;
+		bz->busy--;
 	}
 
 	return skb;
@@ -751,12 +747,8 @@
 		sdp_post_recvs(ssk);
 		sdp_post_sends(ssk, 0);
 
-		if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) {
-			if (ssk->zcopy_context)
-				sdp_bzcopy_write_space(ssk);
-			else
-				sk_stream_write_space(&ssk->isk.sk);
-		}
+		if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
+			sk_stream_write_space(&ssk->isk.sk);
 	}
 
 	return ret;
Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_main.c
===================================================================
--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp_main.c	2007-12-03 11:34:51.000000000 -0600
+++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_main.c	2007-12-03 13:14:10.000000000 -0600
@@ -1203,10 +1203,24 @@
 
 static inline struct bzcopy_state *sdp_bz_cleanup(struct bzcopy_state *bz)
 {
-	int i;
+	int i, max_retry;
 	struct sdp_sock *ssk = (struct sdp_sock *)bz->ssk;
 
-	ssk->zcopy_context = NULL;
+	/* Wait for in-flight sends; should be quick */
+	if (bz->busy) {
+		struct sock *sk = &ssk->isk.sk;
+
+		for (max_retry = 0; max_retry < 10000; max_retry++) {
+			poll_send_cq(sk);
+
+			if (!bz->busy)
+				break;
+		}
+
+		if (bz->busy)
+			sdp_warn(sk, "Could not reap %d in-flight sends\n",
+				 bz->busy);
+	}
 
 	if (bz->pages) {
 		for (i = bz->cur_page; i < bz->page_cnt; i++)
@@ -1280,14 +1294,14 @@
 	}
 
 	up_write(&current->mm->mmap_sem);
-	ssk->zcopy_context = bz;
 
 	return bz;
 
 out_2:
 	up_write(&current->mm->mmap_sem);
+	kfree(bz->pages);
 out_1:
-	sdp_bz_cleanup(bz);
+	kfree(bz);
 
 	return NULL;
 }
@@ -1461,19 +1475,17 @@
 };
 
 /* like sk_stream_memory_free - except measures remote credits */
-static inline int sdp_bzcopy_slots_avail(struct sdp_sock *ssk)
+static inline int sdp_bzcopy_slots_avail(struct sdp_sock *ssk,
+					 struct bzcopy_state *bz)
 {
-	struct bzcopy_state *bz = (struct bzcopy_state *)ssk->zcopy_context;
-
-	BUG_ON(!bz);
 	return slots_free(ssk) > bz->busy;
 }
 
 /* like sk_stream_wait_memory - except waits on remote credits */
-static int sdp_bzcopy_wait_memory(struct sdp_sock *ssk, long *timeo_p)
+static int sdp_bzcopy_wait_memory(struct sdp_sock *ssk, long *timeo_p,
+				  struct bzcopy_state *bz)
 {
 	struct sock *sk = &ssk->isk.sk;
-	struct bzcopy_state *bz = (struct bzcopy_state *)ssk->zcopy_context;
 	int err = 0;
 	long vm_wait = 0;
 	long current_timeo = *timeo_p;
@@ -1481,7 +1493,7 @@
 
 	BUG_ON(!bz);
 
-	if (sdp_bzcopy_slots_avail(ssk))
+	if (sdp_bzcopy_slots_avail(ssk, bz))
 		current_timeo = vm_wait = (net_random() % (HZ / 5)) + 2;
 
 	while (1) {
@@ -1506,13 +1518,13 @@
 
 		clear_bit(SOCK_ASYNC_NOSPACE, &sk->sk_socket->flags);
 
-		if (sdp_bzcopy_slots_avail(ssk))
+		if (sdp_bzcopy_slots_avail(ssk, bz))
 			break;
 
 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
 		sk->sk_write_pending++;
 		sk_wait_event(sk, &current_timeo,
-			sdp_bzcopy_slots_avail(ssk) && vm_wait);
+			sdp_bzcopy_slots_avail(ssk, bz) && vm_wait);
 		sk->sk_write_pending--;
 
 		if (vm_wait) {
@@ -1603,7 +1615,8 @@
 			skb = sk->sk_write_queue.prev;
 
 			if (!sk->sk_send_head ||
-			    (copy = size_goal - skb->len) <= 0) {
+			    (copy = size_goal - skb->len) <= 0 ||
+			    bz != (struct bzcopy_state *)skb->h.raw) {
 
 new_segment:
 				/*
@@ -1614,7 +1627,7 @@
 				 * receive credits.
 				 */
 				if (bz) {
-					if (!sdp_bzcopy_slots_avail(ssk))
+					if (!sdp_bzcopy_slots_avail(ssk, bz))
 						goto wait_for_sndbuf;
 				} else {
 					if (!sk_stream_memory_free(sk))
@@ -1626,6 +1639,8 @@
 				if (!skb)
 					goto wait_for_memory;
 
+				skb->h.raw = (unsigned char *)bz;
+
 				/*
 				 * Check whether we can use HW checksum.
 				 */
@@ -1691,7 +1706,7 @@
 			if (copied)
 				sdp_push(sk, ssk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
 
-			err = (bz) ? sdp_bzcopy_wait_memory(ssk, &timeo) :
+			err = (bz) ? sdp_bzcopy_wait_memory(ssk, &timeo, bz) :
 				     sk_stream_wait_memory(sk, &timeo);
 			if (err)
 				goto do_error;
@@ -1704,24 +1719,10 @@
 out:
 	if (copied) {
 		sdp_push(sk, ssk, flags, mss_now, ssk->nonagle);
-		if (bz) {
-			int max_retry;
-
-			/* Wait for in-flight sends; should be quick */
-			for (max_retry = 0; max_retry < 10000; max_retry++) {
-				if (!bz->busy)
-					break;
-
-				poll_send_cq(sk);
-			}
-
-			if (bz->busy)
-				sdp_warn(sk,
-					 "Could not reap %d in-flight sends\n",
-					 bz->busy);
 
+		if (bz)
 			bz = sdp_bz_cleanup(bz);
-		} else
+		else
 			if (size > send_poll_thresh)
 				poll_send_cq(sk);
 	}


From xma at us.ibm.com  Mon Dec  3 14:45:01 2007
From: xma at us.ibm.com (Shirley Ma)
Date: Mon, 3 Dec 2007 14:45:01 -0800
Subject: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <4752B382.5080300@mellanox.co.il>
Message-ID: <OF6AF07FF5.D914FC1B-ON872573A6.007CCC04-882573A6.004B8A56@us.ibm.com>


Tziporet Koren <tziporet at dev.mellanox.co.il> wrote on 12/02/2007 05:30:42 AM:

> Shirley Ma wrote:
> > I just touch tested ofed-1.3 beta IPoIB. And found there was a kernel
> > parameter hw_csum being added in IPoIB. I have several questions here:
> > 1. Why not using ethtool to set up these HW_CSUM flags?
> >
> This is experimental code - Dror explained it in OFED devcon (Or sent
> you the pointer)

So this code won't be in mainline kernel.

> > 2. I haven't looked at the detailed code yet, is that possible with this
> > flag, TCP/IP will not do CSUM for HCA which has no TCP/IP offload support?
> > If so, then these packets should be limited to be IB network. Routing to
> > ethernet network, the packets would be dropped. If not, I tested IPoIB-UD,
> > why I saw 30% improvement with hw_csum set in none connectX mthca SDR
> > environment?
> Arbel HCA also support checksum offload - so you will see improvement in
> this HCA too.
> > 3. I saw switching between IPoIB-cm and IPoIB-ud corrupted interface IP
> > address (unicast address, subnet mask, broadcast address). Anybody saw the
> > same problem?
> >
> Can you give more details? We change here between CM and UD mode all the
> time.
> What is the exact scenario you run?
>
> Tziporet
>

Yes, I will rerun my test to find the sequence to reproduce this.

thanks
Shirley

From jon at opengridcomputing.com  Mon Dec  3 14:45:50 2007
From: jon at opengridcomputing.com (Jon Mason)
Date: Mon, 3 Dec 2007 16:45:50 -0600
Subject: [ofa-general] uDAPL EVD queue length issue
Message-ID: <20071203224550.GF11990@opengridcomputing.com>

While working on the OMPI udapl btl, I have noticed some "interesting"
behavior.  OFA udapl wants the evd queues to be a power of 2 and
then subtracts 1 for bookkeeping (i.e., so that the internal head and
tail pointers never touch except when the ring is empty).  When
queried, OFA udapl reports the queue length as this number (and not
the size originally requested).  This becomes interesting when a power
of 2 is passed in and then queried.  For example, a requested queue
of length 256 will report a length of 255 when queried.  I cannot tell
from the udapl documentation whether it is acceptable to get a size
smaller than the one you requested.

Now, during the setup of the ompi connection, it will try to make the
local parameters sufficient to run the programs.  If we try to
run a small number of procs, then the defaults will be reset across
all nodes.  Since the defaults may not exactly match, the udapl btl will
try to resize the queue (in this example, because 256 > 255).  When the
call finally makes it up to the ofa udapl code, it will bail because it
checks to see if the new size is less than or equal to the current
size + 1.
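
The behavior described above can be modeled with a small ring sketch.
This is not the OFA uDAPL source: the struct, the function names, and
the rounding of non-power-of-2 requests are assumptions made only to
illustrate why a request for 256 reports 255 and why a later resize to
256 is rejected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an event queue ring; not the OFA uDAPL source. */
struct evd_ring {
	size_t slots;	/* allocated slots, always a power of two */
	size_t head;	/* next slot to write */
	size_t tail;	/* next slot to read */
};

/* Smallest power of two >= n (assumes n > 0 and no overflow). */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

static void ring_init(struct evd_ring *r, size_t requested)
{
	assert(r != NULL && requested > 0);
	r->slots = round_up_pow2(requested);
	r->head = r->tail = 0;
}

/*
 * Length reported on query.  One slot always stays empty so that
 * head == tail can only mean "ring is empty".
 */
static size_t ring_qlen(const struct evd_ring *r)
{
	return r->slots - 1;
}

/*
 * Resize request modeled on the check described in the mail: a new size
 * that is less than or equal to the reported length + 1 is rejected.
 * Returns 0 on success, -1 if the request is rejected.
 */
static int ring_resize(struct evd_ring *r, size_t new_qlen)
{
	if (new_qlen <= ring_qlen(r) + 1)
		return -1;
	r->slots = round_up_pow2(new_qlen);
	r->head = r->tail = 0;	/* contents are not preserved in this sketch */
	return 0;
}
```

With this model, ring_init(&r, 256) followed by ring_qlen(&r) yields
255, and ring_resize(&r, 256) returns -1, which is the sequence the btl
runs into.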

So if the ofa udapl code is working as designed, then the ompi udapl
btl code needs to have the proper boundary check for size + 1 (for
which I have a patch).  If not, then the ofa code needs to be changed
to either round up to the next power of 2 if given a power of 2, or
return the size + 1 when queried.

So, which one is correct?

Thanks,
Jon

BTW, if anyone is interested, I have cut down dapltest to a very basic
test that will show this behavior 100% of the time.  I can make the
source available to whoever wants it.


From sashak at voltaire.com  Mon Dec  3 14:59:40 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon,  3 Dec 2007 22:59:40 +0000
Subject: [ofa-general] [PATCH 4/5] infiniband-diags/saquery: allow empty src
	and/or dst with --src-to-dst option
In-Reply-To: <1196722781437-git-send-email-sashak@voltaire.com>
References: <1196722781437-git-send-email-sashak@voltaire.com>
Message-ID: <11967227813294-git-send-email-sashak@voltaire.com>

Allow empty source and/or destination fields with the --src-to-dst
option; in that case the empty field will not be set in comp_mask.
Examples: '--src-to-dst :48', '--src-to-dst 4:', '--src-to-dst :'.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 infiniband-diags/src/saquery.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c
index db6ba11..1442902 100644
--- a/infiniband-diags/src/saquery.c
+++ b/infiniband-diags/src/saquery.c
@@ -1239,17 +1239,17 @@ main(int argc, char **argv)
 		case 1:
 		{
 			char *opt  = strdup(optarg);
-			char *tok1 = strtok(opt, ":");
-			char *tok2 = strtok(NULL, "\0");
-
-			if (tok1 && tok2) {
-				src = strdup(tok1);
-				dst = strdup(tok2);
-			} else {
+			char *ch = strchr(opt, ':');
+			if (!ch) {
 				fprintf(stderr,
 					"ERROR: --src-to-dst <node>:<node>\n");
 				usage();
 			}
+			*ch++ = '\0';
+			if (*opt)
+				src = strdup(opt);
+			if (*ch)
+				dst = strdup(ch);
 			free(opt);
 			query_type = IB_MAD_ATTR_PATH_RECORD;
 			break;
-- 
1.5.3.4.206.g58ba4
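
For readers who want to try the parsing rule outside saquery, here is a
minimal standalone sketch of the same idea: split on the first ':' and
leave an empty side as NULL.  The names split_src_dst() and dup_range()
are made up for this illustration; unlike the patch, the sketch reports
a missing ':' to the caller instead of calling usage().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy n bytes of s into a fresh NUL-terminated string (NULL on failure). */
static char *dup_range(const char *s, size_t n)
{
	char *p = malloc(n + 1);

	if (p) {
		memcpy(p, s, n);
		p[n] = '\0';
	}
	return p;
}

/*
 * Split "src:dst" at the first ':'.  An empty side is left as NULL so the
 * caller can skip it when building comp_mask.  Returns 0 on success and
 * -1 if there is no ':' or memory runs out.  The caller frees *src, *dst.
 */
static int split_src_dst(const char *arg, char **src, char **dst)
{
	const char *colon;

	assert(arg != NULL && src != NULL && dst != NULL);
	*src = *dst = NULL;

	colon = strchr(arg, ':');
	if (!colon)
		return -1;

	if (colon != arg) {
		*src = dup_range(arg, (size_t)(colon - arg));
		if (!*src)
			return -1;
	}
	if (colon[1] != '\0') {
		*dst = dup_range(colon + 1, strlen(colon + 1));
		if (!*dst) {
			free(*src);
			*src = NULL;
			return -1;
		}
	}
	return 0;
}
```

Called with ":48" this leaves src NULL and dst "48"; with "4:" it is
the other way around; with ":" both stay NULL.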


From sashak at voltaire.com  Mon Dec  3 14:59:38 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon,  3 Dec 2007 22:59:38 +0000
Subject: [ofa-general] [PATCH 2/5] infiniband-diags/saquery: move lid
	resolving functions
In-Reply-To: <1196722781437-git-send-email-sashak@voltaire.com>
References: <1196722781437-git-send-email-sashak@voltaire.com>
Message-ID: <1196722781925-git-send-email-sashak@voltaire.com>

Move the lid resolving functions up, so that query processors can use them.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 infiniband-diags/src/saquery.c |   90 ++++++++++++++++++++--------------------
 1 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c
index 72fe10d..64a6b79 100644
--- a/infiniband-diags/src/saquery.c
+++ b/infiniband-diags/src/saquery.c
@@ -631,6 +631,51 @@ get_all_records(osm_bind_handle_t bind_handle,
 			       trusted ? OSM_DEFAULT_SM_KEY : 0);
 }
 
+/**
+ * return the lid from the node descriptor (name) supplied
+ */
+static ib_api_status_t
+get_lid_from_name(osm_bind_handle_t bind_handle, const char *name, ib_net16_t *lid)
+{
+	int               i = 0;
+	ib_node_record_t *node_record = NULL;
+	ib_node_info_t   *p_ni = NULL;
+	ib_net16_t        attr_offset = ib_get_attr_offset(sizeof(*node_record));
+	ib_api_status_t   status;
+
+	status = get_all_records(bind_handle, IB_MAD_ATTR_NODE_RECORD, attr_offset, 0);
+	if (status != IB_SUCCESS)
+		return (status);
+
+	for (i = 0; i < result.result_cnt; i++) {
+		node_record = osmv_get_query_node_rec(result.p_result_madw, i);
+		p_ni = &(node_record->node_info);
+		if (name && strncmp(name, (char *)node_record->node_desc.description,
+				    sizeof(node_record->node_desc.description)) == 0) {
+			*lid = cl_ntoh16(node_record->lid);
+			break;
+		}
+	}
+	return_mad();
+	return (status);
+}
+
+static ib_net16_t
+get_lid(osm_bind_handle_t bind_handle, const char * name)
+{
+	ib_net16_t rc_lid = 0;
+
+	if (!name)
+		return(0);
+	if (isalpha(name[0]))
+		assert(get_lid_from_name(bind_handle, name, &rc_lid) == IB_SUCCESS);
+	else
+		rc_lid = atoi(name);
+	if (rc_lid == 0)
+		fprintf(stderr, "Failed to find lid for \"%s\"\n", name);
+        return (rc_lid);
+}
+
 /*
  * Get the portinfo records available with IsSM or IsSMdisabled CapabilityMask bit on.
  */
@@ -696,35 +741,6 @@ print_node_records(osm_bind_handle_t bind_handle)
 	return (status);
 }
 
-/**
- * return the lid from the node descriptor (name) supplied
- */
-static ib_api_status_t
-get_lid_from_name(osm_bind_handle_t bind_handle, const char *name, ib_net16_t *lid)
-{
-	int               i = 0;
-	ib_node_record_t *node_record = NULL;
-	ib_node_info_t   *p_ni = NULL;
-	ib_net16_t        attr_offset = ib_get_attr_offset(sizeof(*node_record));
-	ib_api_status_t   status;
-
-	status = get_all_records(bind_handle, IB_MAD_ATTR_NODE_RECORD, attr_offset, 0);
-	if (status != IB_SUCCESS)
-		return (status);
-
-	for (i = 0; i < result.result_cnt; i++) {
-		node_record = osmv_get_query_node_rec(result.p_result_madw, i);
-		p_ni = &(node_record->node_info);
-		if (name && strncmp(name, (char *)node_record->node_desc.description,
-				    sizeof(node_record->node_desc.description)) == 0) {
-			*lid = cl_ntoh16(node_record->lid);
-			break;
-		}
-	}
-	return_mad();
-	return (status);
-}
-
 static ib_api_status_t
 get_print_path_rec_lid(osm_bind_handle_t bind_handle,
 		       ib_net16_t src_lid,
@@ -1051,22 +1067,6 @@ get_bind_handle(void)
 	return (bind_handle);
 }
 
-static ib_net16_t
-get_lid(osm_bind_handle_t bind_handle, const char * name)
-{
-	ib_net16_t rc_lid = 0;
-
-	if (!name)
-		return(0);
-	if (isalpha(name[0]))
-		assert(get_lid_from_name(bind_handle, name, &rc_lid) == IB_SUCCESS);
-	else
-		rc_lid = atoi(name);
-	if (rc_lid == 0)
-		fprintf(stderr, "Failed to find lid for \"%s\"\n", name);
-        return (rc_lid);
-}
-
 static void
 clean_up(void)
 {
-- 
1.5.3.4.206.g58ba4
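
A side note on get_lid() as moved above: the name lookup sits inside
assert(), so a build with NDEBUG defined would compile the call out.
The sketch below shows the same name-or-number dispatch without that
dependency.  It is an illustration only: resolve_lid(), name_lookup_fn
and demo_lookup() are hypothetical names, the lookup is a stand-in for
the SA NodeRecord query, and strtoul() with base 0 is used where the
original uses atoi().

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical lookup hook: returns 0 and fills *lid when name is known. */
typedef int (*name_lookup_fn)(const char *name, uint16_t *lid);

/* Stand-in for an SA NodeRecord lookup: knows exactly one node. */
static int demo_lookup(const char *name, uint16_t *lid)
{
	assert(name != NULL && lid != NULL);
	if (strcmp(name, "node-a") != 0)
		return -1;
	*lid = 7;
	return 0;
}

/*
 * Resolve a node name or a numeric string to a LID in host byte order.
 * Returns 0 when nothing could be resolved.
 */
static uint16_t resolve_lid(const char *name, name_lookup_fn lookup)
{
	uint16_t lid = 0;
	unsigned long v;

	if (!name || !*name)
		return 0;
	if (isalpha((unsigned char)name[0])) {
		/* The lookup runs unconditionally, even with NDEBUG defined. */
		if (!lookup || lookup(name, &lid) != 0)
			return 0;
		return lid;
	}
	v = strtoul(name, NULL, 0);	/* accepts decimal, octal and 0x hex */
	return v <= 0xFFFF ? (uint16_t)v : 0;
}
```

For example, resolve_lid("node-a", demo_lookup) gives 7 and
resolve_lid("0x10", demo_lookup) gives 16, while an unknown name or an
out-of-range number gives 0.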


From sashak at voltaire.com  Mon Dec  3 14:59:39 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon,  3 Dec 2007 22:59:39 +0000
Subject: [ofa-general] [PATCH 3/5] infiniband-diags/saquery: LinkRecord query
	support
In-Reply-To: <1196722781437-git-send-email-sashak@voltaire.com>
References: <1196722781437-git-send-email-sashak@voltaire.com>
Message-ID: <119672278173-git-send-email-sashak@voltaire.com>

Add support for SA LinkRecord query.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 infiniband-diags/src/saquery.c |   82 +++++++++++++++++++++++++++++++++++++++-
 1 files changed, 81 insertions(+), 1 deletions(-)

diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c
index 64a6b79..db6ba11 100644
--- a/infiniband-diags/src/saquery.c
+++ b/infiniband-diags/src/saquery.c
@@ -560,6 +560,17 @@ print_inform_info_record(ib_inform_info_record_t *p_iir)
 	}
 }
 
+static void dump_one_link_record(ib_link_record_t *lr)
+{
+	printf("LinkRecord dump:\n"
+	       "\t\tFromLID....................%u\n"
+	       "\t\tFromPort...................%u\n"
+	       "\t\tToPort.....................%u\n"
+	       "\t\tToLID......................%u\n",
+	       cl_ntoh16(lr->from_lid), lr->from_port_num,
+	       lr->to_port_num, cl_ntoh16(lr->to_lid));
+}
+
 static void
 return_mad(void)
 {
@@ -694,6 +705,41 @@ get_issm_records(osm_bind_handle_t bind_handle, ib_net32_t capability_mask)
 			       0);
 }
 
+/*
+ * Get the LinkRecord(s)
+ */
+static ib_api_status_t get_link_records(osm_bind_handle_t bind_handle,
+					int from_lid, int from_port,
+					int to_lid, int to_port)
+{
+	ib_link_record_t lr;
+	ib_net64_t comp_mask;
+
+	memset(&lr, 0, sizeof(lr));
+	comp_mask = 0;
+
+	if (from_lid > 0) {
+		lr.from_lid = cl_hton16(from_lid);
+		comp_mask |= IB_LR_COMPMASK_FROM_LID;
+	}
+	if (from_port >= 0) {
+		lr.from_port_num = from_port;
+		comp_mask |= IB_LR_COMPMASK_FROM_PORT;
+	}
+	if (to_lid > 0) {
+		lr.to_lid = cl_hton16(to_lid);
+		comp_mask |= IB_LR_COMPMASK_TO_LID;
+	}
+	if (to_port >= 0) {
+		lr.to_port_num = to_port;
+		comp_mask |= IB_LR_COMPMASK_TO_PORT;
+	}
+
+	return get_any_records(bind_handle, IB_MAD_ATTR_LINK_RECORD, 0,
+			       comp_mask, &lr,
+			       ib_get_attr_offset(sizeof(ib_link_record_t)), 0);
+}
+
 static ib_api_status_t
 print_node_records(osm_bind_handle_t bind_handle)
 {
@@ -1000,6 +1046,32 @@ print_inform_info_records(osm_bind_handle_t bind_handle)
 	return (status);
 }
 
+static ib_api_status_t
+print_link_records(osm_bind_handle_t bind_handle, char *from, char *to)
+{
+	int i;
+	ib_link_record_t *lr;
+	int from_lid, to_lid, from_port, to_port;
+	ib_api_status_t status;
+
+	from_lid = get_lid(bind_handle, from);
+	to_lid = get_lid(bind_handle, to);
+	from_port = -1;
+	to_port = -1;
+
+	status = get_link_records(bind_handle, from_lid, from_port,
+				  to_lid, to_port);
+	if (status != IB_SUCCESS)
+		return status;
+
+	for (i = 0; i < result.result_cnt; i++) {
+		lr = osmv_get_query_result(result.p_result_madw, i);
+		dump_one_link_record(lr);
+	}
+	return_mad();
+	return status;
+}
+
 static osm_bind_handle_t
 get_bind_handle(void)
 {
@@ -1101,6 +1173,7 @@ usage(void)
 	fprintf(stderr, "      (if multicast group specified, list member GIDs"
 				" only for group specified\n");
 	fprintf(stderr, "      specified, for example 'saquery -m 0xC000')\n");
+	fprintf(stderr, "   -x get LinkRecord info\n");
 	fprintf(stderr, "   --src-to-dst get a PathRecord for <src:dst>\n"
 			"                where src and dst are either node "
 				"names or LIDs\n");
@@ -1130,7 +1203,7 @@ main(int argc, char **argv)
 	ib_net16_t         dst_lid;
 	ib_api_status_t    status;
 
-	static char const str_opts[] = "pVNDLlGOUcSIsgmdhP:C:t:";
+	static char const str_opts[] = "pVNDLlGOUcSIsgmxdhP:C:t:";
 	static const struct option long_opts [] = {
 	   {"p", 0, 0, 'p'},
 	   {"Version", 0, 0, 'V'},
@@ -1143,6 +1216,7 @@ main(int argc, char **argv)
 	   {"s", 0, 0, 's'},
 	   {"g", 0, 0, 'g'},
 	   {"m", 0, 0, 'm'},
+	   {"x", 0, 0, 'x'},
 	   {"d", 0, 0, 'd'},
 	   {"c", 0, 0, 'c'},
 	   {"S", 0, 0, 'S'},
@@ -1247,6 +1321,9 @@ main(int argc, char **argv)
 			query_type = IB_MAD_ATTR_MCMEMBER_RECORD;
 			members = 1;
 			break;
+		case 'x':
+			query_type = IB_MAD_ATTR_LINK_RECORD;
+			break;
 		case 'd':
 			osm_debug = 1;
 			break;
@@ -1354,6 +1431,9 @@ main(int argc, char **argv)
 	case IB_MAD_ATTR_INFORM_INFO_RECORD:
 		status = print_inform_info_records(bind_handle);
 		break;
+	case IB_MAD_ATTR_LINK_RECORD:
+		status = print_link_records(bind_handle, src, dst);
+		break;
 	default:
 		fprintf(stderr, "Unknown query type %d\n", query_type);
 		status = IB_UNKNOWN_ERROR;
-- 
1.5.3.4.206.g58ba4


From sashak at voltaire.com  Mon Dec  3 14:59:41 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon,  3 Dec 2007 22:59:41 +0000
Subject: [ofa-general] [PATCH 5/5] infiniband-diags/man: add -x option to
	saquery man page
In-Reply-To: <1196722781437-git-send-email-sashak@voltaire.com>
References: <1196722781437-git-send-email-sashak@voltaire.com>
Message-ID: <11967227812860-git-send-email-sashak@voltaire.com>

Add -x option (LinkRecord info) to saquery man page.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 infiniband-diags/man/saquery.8 |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/infiniband-diags/man/saquery.8 b/infiniband-diags/man/saquery.8
index 6860062..23cfc8f 100644
--- a/infiniband-diags/man/saquery.8
+++ b/infiniband-diags/man/saquery.8
@@ -6,7 +6,8 @@ saquery \- query InfiniBand subnet administration attributes
 .SH SYNOPSIS
 .B saquery 
 [\-h] [\-d] [\-p] [\-N] [\-\-list | \-D] [\-S] [\-I] [\-L] [\-l] [\-G] [\-O]
-[\-U] [\-c] [\-s] [\-g] [\-m] [\-C ca_name] [\-P ca_port] [\-t(imeout) <msec>]
+[\-U] [\-c] [\-s] [\-g] [\-m] [\-x]
+[\-C ca_name] [\-P ca_port] [\-t(imeout) <msec>]
 [\-\-src\-to\-dst <src:dst>]
 [\-\-sgid\-to\-dgid <sgid\-dgid>]
 [\-\-node\-name\-map <node\-name\-map>]
@@ -64,6 +65,9 @@ get multicast member info.  If a group is specified, limit the output to the
 group specified and print one line containing only the GUID and node
 description for each entry. Example: saquery -m 0xc000
 .TP
+\fB\-x\fR
+get LinkRecord info
+.TP
 \fB\-\-src-to-dst\fR
 get a PathRecord for <src:dst>
 where src and dst are either node names or LIDs
-- 
1.5.3.4.206.g58ba4


From sashak at voltaire.com  Mon Dec  3 14:59:37 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon,  3 Dec 2007 22:59:37 +0000
Subject: [ofa-general] [PATCH 1/5] infiniband-diags/saquery: add
	get_any_records() function
Message-ID: <1196722781437-git-send-email-sashak@voltaire.com>

Add get_any_records() function - this gets attribute specific data (id,
comp_mask, offset, etc.) as parameters.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 infiniband-diags/src/saquery.c |   83 +++++++++++++++++-----------------------
 1 files changed, 35 insertions(+), 48 deletions(-)

diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c
index 4f4c6f2..72fe10d 100644
--- a/infiniband-diags/src/saquery.c
+++ b/infiniband-diags/src/saquery.c
@@ -573,23 +573,26 @@ return_mad(void)
 }
 
 /**
- * Get all the records available for requested query type.
+ * Get any record(s)
  */
 static ib_api_status_t
-get_all_records(osm_bind_handle_t bind_handle,
-		ib_net16_t query_id,
-		ib_net16_t attr_offset,
-		int trusted)
+get_any_records(osm_bind_handle_t bind_handle,
+		ib_net16_t attr_id, ib_net32_t attr_mod, ib_net64_t comp_mask,
+		void *attr, ib_net16_t attr_offset,
+		ib_net64_t sm_key)
 {
 	ib_api_status_t   status;
 	osmv_query_req_t  req;
 	osmv_user_query_t user;
 
-	memset( &req, 0, sizeof( req ) );
-	memset( &user, 0, sizeof( user ) );
+	memset(&req, 0, sizeof(req));
+	memset(&user, 0, sizeof(user));
 
-	user.attr_id = query_id;
+	user.attr_id = attr_id;
 	user.attr_offset = attr_offset;
+	user.attr_mod = attr_mod;
+	user.comp_mask = comp_mask;
+	user.p_attr = attr;
 
 	req.query_type = OSMV_QUERY_USER_DEFINED;
 	req.timeout_ms = sa_timeout_ms;
@@ -598,23 +601,34 @@ get_all_records(osm_bind_handle_t bind_handle,
 	req.query_context = NULL;
 	req.pfn_query_cb = query_res_cb;
 	req.p_query_input = &user;
-	if (trusted)
-		req.sm_key = OSM_DEFAULT_SM_KEY;
-	else
-		req.sm_key = 0;
+	req.sm_key = sm_key;
 
 	if ((status = osmv_query_sa(bind_handle, &req)) != IB_SUCCESS) {
 		fprintf(stderr, "Query SA failed: %s\n",
 			ib_get_err_str(status));
-		return (status);
+		return status;
 	}
 
 	if (result.status != IB_SUCCESS) {
 		fprintf(stderr, "Query result returned: %s\n",
 			ib_get_err_str(result.status));
-		return (result.status);
+		return result.status;
 	}
-	return (status);
+
+	return status;
+}
+
+/**
+ * Get all the records available for requested query type.
+ */
+static ib_api_status_t
+get_all_records(osm_bind_handle_t bind_handle,
+		ib_net16_t query_id,
+		ib_net16_t attr_offset,
+		int trusted)
+{
+	return get_any_records(bind_handle, query_id, 0, 0, NULL, attr_offset,
+			       trusted ? OSM_DEFAULT_SM_KEY : 0);
 }
 
 /*
@@ -623,43 +637,16 @@ get_all_records(osm_bind_handle_t bind_handle,
 static ib_api_status_t
 get_issm_records(osm_bind_handle_t bind_handle, ib_net32_t capability_mask)
 {
-	ib_api_status_t   status;
-	osmv_query_req_t  req;
-	osmv_user_query_t user;
 	ib_portinfo_record_t attr;
 
-	memset( &req, 0, sizeof( req ) );
-	memset( &user, 0, sizeof( user ) );
 	memset( &attr, 0, sizeof ( attr ) );
 	attr.port_info.capability_mask = capability_mask;
 
-	user.attr_id = IB_MAD_ATTR_PORTINFO_RECORD;
-	user.attr_offset = ib_get_attr_offset(sizeof(ib_portinfo_record_t));
-	user.attr_mod = cl_ntoh32(1 << 31);	/* enhanced query */
-	user.comp_mask = IB_PIR_COMPMASK_CAPMASK;
-	user.p_attr = &attr;
-
-	req.query_type = OSMV_QUERY_USER_DEFINED;
-	req.timeout_ms = sa_timeout_ms;
-	req.retry_cnt = 1;
-	req.flags = OSM_SA_FLAGS_SYNC;
-	req.query_context = NULL;
-	req.pfn_query_cb = query_res_cb;
-	req.p_query_input = &user;
-	req.sm_key = 0;
-
-	if ((status = osmv_query_sa(bind_handle, &req)) != IB_SUCCESS) {
-		fprintf(stderr, "Query SA failed: %s\n",
-			ib_get_err_str(status));
-		return (status);
-	}
-
-	if (result.status != IB_SUCCESS) {
-		fprintf(stderr, "Query result returned: %s\n",
-			ib_get_err_str(result.status));
-		return (result.status);
-	}
-	return (status);
+	return get_any_records(bind_handle, IB_MAD_ATTR_PORTINFO_RECORD,
+			       cl_hton32(1 << 31), IB_PIR_COMPMASK_CAPMASK,
+			       &attr,
+			       ib_get_attr_offset(sizeof(ib_portinfo_record_t)),
+			       0);
 }
 
 static ib_api_status_t
@@ -1218,7 +1205,7 @@ main(int argc, char **argv)
 			query_type = IB_MAD_ATTR_PATH_RECORD;
 			break;
 		case 'V':
-			fprintf(stderr, "%s %s\n", argv0, get_build_version() );
+			fprintf(stderr, "%s %s\n", argv0, get_build_version());
 			exit(-1);
 		case 'D':
 			node_print_desc = ALL_DESC;
-- 
1.5.3.4.206.g58ba4


From rdreier at cisco.com  Mon Dec  3 14:48:13 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 03 Dec 2007 14:48:13 -0800
Subject: [ofa-general] [PATCH 0/6] nes: Cosmetic changes;
	support virtual WQs and PPC
In-Reply-To: <5E701717F2B2ED4EA60F87C8AA57B7CC07A57572@venom2> (Glenn
	Grundstrom's message of "Tue, 27 Nov 2007 19:39:59 -0600")
References: <20071114221453.3ADD5E609F0@openfabrics.org>
	<adalk8jnsty.fsf@cisco.com>
	<5E701717F2B2ED4EA60F87C8AA57B7CC07A57572@venom2>
Message-ID: <ada4pezjsya.fsf@cisco.com>

 > Or you said I could submit patches to you.  One problem with
 > only using your tree is that I need to supply updates and backports
 > for OFED builds.  I've cloned Vlad's tree for that.

How you manage OFED is up to you.  However, once your driver is in the
upstream tree there is no alternative except to supply patches that
apply to my tree or the upstream tree, and you might as well get in
the habit now.  (And in fact, making a convincing case that you can
maintain a driver in the upstream kernel is part of getting your
driver merged)

 > How about this for the future: Let me know when you've made
 > changes and I'll pull and merge your changes into my code.
 > That way I can still provide everything for OFED and patches
 > I create should apply cleanly in your branch.  I assume
 > you could also pull from my tree as well, right?  If you have
 > a better way to satisfy all we can discuss it.

I can't pull from your current tree, because it has a bunch of OFED
crap in it.  If you want me to pull, then you need to provide a tree
that is based on the tree I'm pulling into and has no changes except
the ones you're trying to give to me.

Anyway.  I just threw away my old neteffect branch, and pushed out a
new branch with the nes driver from your OFED tree.  It seems as if
that tree already had all the patches you emailed out.  I also ran
scripts/cleanfile on the nes driver to fix the obvious whitespace
problems.  However there is still a lot of cosmetic work that needs to
get done -- for example, while I'm not a stickler about the 80-column
limit, having lines more than 150 characters long is clearly
excessive.

 - R.


From Jeffrey.C.Becker at nasa.gov  Mon Dec  3 15:31:02 2007
From: Jeffrey.C.Becker at nasa.gov (Jeff Becker)
Date: Mon, 03 Dec 2007 15:31:02 -0800
Subject: [ofa-general] test-plz ignore
Message-ID: <475491B6.6090201@nasa.gov>


From sashak at voltaire.com  Mon Dec  3 16:03:36 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 4 Dec 2007 00:03:36 +0000
Subject: [ofa-general] Re: [PATCHv2] libibmad/dump.c: Use bit mask approach
	to decoding LinkWidth/Speed Enabled/Supported
In-Reply-To: <1196599721.10845.218.camel@hrosenstock-ws.xsigo.com>
References: <1196599721.10845.218.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071204000336.GC15250@sashak.voltaire.com>

On 04:48 Sun 02 Dec     , Hal Rosenstock wrote:
> libibmad/dump.c: Use bit mask approach to decoding LinkWidth/Speed
> Enabled/Supported
> 
> Based on email from Jason Gunthorpe <jgunthorpe at obsidianresearch.com>
> 
> Signed-off-by: Hal Rosenstock <hal at xsigo.com>

Applied. Thanks.

Sasha


From sashak at voltaire.com  Mon Dec  3 16:06:45 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 4 Dec 2007 00:06:45 +0000
Subject: [ofa-general] Re: [PATCH 2/3 v2] opensm: adding missing comparison
	by to_lid/from_lid in LinkRecord processing
In-Reply-To: <47541DE7.4000106@dev.mellanox.co.il>
References: <47541DE7.4000106@dev.mellanox.co.il>
Message-ID: <20071204000645.GD15250@sashak.voltaire.com>

On 17:16 Mon 03 Dec     , Yevgeny Kliteynik wrote:
> Comparison for IB_LR_COMPMASK_FROM_LID/IB_LR_COMPMASK_TO_LID
> component mask bits was missing in LinkRecord processing.
> 
> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied. Thanks.

Sasha


From xma at us.ibm.com  Mon Dec  3 15:19:47 2007
From: xma at us.ibm.com (Shirley Ma)
Date: Mon, 3 Dec 2007 15:19:47 -0800
Subject: Re: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <1196603159.22671.11.camel@mtls03>
Message-ID: <OFA7A18491.37BC7D4D-ON872573A6.007CFD7C-882573A6.004EB91D@us.ibm.com>


Eli Cohen <eli at dev.mellanox.co.il> wrote on 12/02/2007 05:45:59 AM:

> On Fri, 2007-11-30 at 15:28 -0800, Shirley Ma wrote:
> > I just touch tested ofed-1.3 beta IPoIB. And found there was a kernel
> > parameter hw_csum being added in IPoIB. I have several questions here:
> > 1. Why not using ethtool to set up these HW_CSUM flags?
> There is no adequate interface in Ethtool for doing it so we use a
> module parameter. This is because we see this as a static configuration
> per host.

Ethtool does support rx csum and tx csum:
#define ETHTOOL_GRXCSUM         0x00000014 /* Get RX hw csum enable (ethtool_value) */
#define ETHTOOL_SRXCSUM         0x00000015 /* Set RX hw csum enable (ethtool_value) */
#define ETHTOOL_GTXCSUM         0x00000016 /* Get TX hw csum enable (ethtool_value) */
#define ETHTOOL_STXCSUM         0x00000017 /* Set TX hw csum enable (ethtool_value) */

We should use ethtool here.

> > 2. I haven't looked at the detailed code yet, is that possible with this
> > flag, TCP/IP will not do CSUM for HCA which has no TCP/IP offload support?
> Yes, the HCA need not have checksum offload support. the idea is the IB
> ICRC provides the insurance that the packets are not corrupt.

That's something we discussed a long time ago, when we wanted GSO to avoid
an extra copy by using ICRC to enable the SG feature. I remember Roland
rejected this idea since there could be potential data corruption. And even
if we do prove that ICRC is 100% accurate, we should still have some code
here to limit the IP destination to the IB subnet when using ICRC.
Otherwise, if packets are routed out to an Ethernet IP subnet, they will be
dropped.

thanks
Shirley

From wei.fang at hermes-microvision.com  Mon Dec  3 16:50:52 2007
From: wei.fang at hermes-microvision.com (Wei Fang)
Date: Mon, 03 Dec 2007 16:50:52 -0800
Subject: [ofa-general] Question:  Verbs API Error code recover
In-Reply-To: <47527C85.8050801@dev.mellanox.co.il>
References: <4750A321.5080406@hermes-microvision.com>
	<47527C85.8050801@dev.mellanox.co.il>
Message-ID: <4754A46C.1030808@hermes-microvision.com>

Hi, Dotan:

When I got that error, I quit my program and used the ib_rdma_bw program to
test the InfiniBand link. It still fails like this:

ib_rdma_bw 10.8.6.3
19068: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | iters=1000 | duplex=0 | cma=0 |
19068: Local address:  LID 0x01, QPN 0x2e0404, PSN 0xc39344 RKey 0x4c003101 VAddr 0x00002a958bc000
19068: Remote address: LID 0x3d9, QPN 0x140404, PSN 0x77012a, RKey 0x74003100 VAddr 0x00002a958bc000

19068:main: Completion with error at client:
19068:main: Failed status 12: wr_id 3
19068:main: scnt=100, ccnt=0


Dotan Barak wrote:
> Hi.
>
> Wei Fang wrote:
>> Hi, All:
>>
>> I'm new here just some days ago. Right now I'm facing a problem to 
>> using OFED 1.2.5's verb api.   In my programming, I use RDMA Write 
>> function to transfer data ( ibv_post_send ). Then I use ibv_poll_cq 
>> to get this CQ's finish.  Sometimes, ibv_poll_cq's return error is 
>> IBV_WC_RETRY_EXC_ERR (error code is 12).  When this error code 
>> happen,  any next transfer will always fail.  In this case, I have to 
>> restart computer.  Anyone can tell me how to recover this error 
>> without quit program or restart PC?
>>
>
> If you have a completion with status IBV_WC_RETRY_EXC_ERR your QP 
> state will be moved to error, so all of the WR that you will post 
> after this will fail too.
> If you have this failure you need to reconnect the QPs (i don't know 
> why you need to restart the computer in order to fix this ....).
>
>
> I think that you need to check why you got this completion status from 
> the first place (did the remote side close the QP?)
>
> Dotan
>
>

-- 
Best Regards

Wei Fang

Hermes Microvision Inc.

(Tel)       (408)597-8600
(Fax)       (408)597-8601
(Direct Tel)(408)597-8646



From sashak at voltaire.com  Mon Dec  3 17:34:42 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 4 Dec 2007 01:34:42 +0000
Subject: [ofa-general] Re: [PATCH 2/3 v2] opensm: adding missing comparison
	by to_lid/from_lid in LinkRecord processing
In-Reply-To: <47541DE7.4000106@dev.mellanox.co.il>
References: <47541DE7.4000106@dev.mellanox.co.il>
Message-ID: <20071204013442.GE15250@sashak.voltaire.com>

Hi Yevgeny,

On 17:16 Mon 03 Dec     , Yevgeny Kliteynik wrote:
> Comparison for IB_LR_COMPMASK_FROM_LID/IB_LR_COMPMASK_TO_LID
> component mask bits was missing in LinkRecord processing.
> 
> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> ---
>  opensm/opensm/osm_sa_link_record.c |   16 ++++++++++++++--
>  1 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
> index ba52aea..1230b91 100644
> --- a/opensm/opensm/osm_sa_link_record.c
> +++ b/opensm/opensm/osm_sa_link_record.c
> @@ -177,6 +177,7 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>  	uint8_t dest_port_num;
>  	ib_net16_t from_base_lid;
>  	ib_net16_t to_base_lid;
> +	uint16_t lmc_mask;
> 
>  	OSM_LOG_ENTER(p_rcv->p_log, __osm_lr_rcv_get_physp_link);
> 
> @@ -256,6 +257,19 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>  		if (dest_port_num != p_lr->to_port_num)
>  			goto Exit;
> 
> +	__get_base_lid(p_src_physp, &from_base_lid);
> +	__get_base_lid(p_dest_physp, &to_base_lid);
> +
> +	lmc_mask = ~((1 << p_rcv->p_subn->opt.lmc) - 1);
> +
> +	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
> +		if (from_base_lid != (p_lr->from_lid & lmc_mask))
> +			goto Exit;
> +
> +	if (comp_mask & IB_LR_COMPMASK_TO_LID)
> +		if (to_base_lid != (p_lr->to_lid & lmc_mask))
> +			goto Exit;

Actually this is broken too. Since all LIDs in the comparison are in network
byte order, lmc_mask should be converted too. I will patch it separately.

Sasha

> +
>  	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
>  		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
>  			"__osm_lr_rcv_get_physp_link: "
> @@ -267,8 +281,6 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>  			cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
>  			dest_port_num);
> 
> -	__get_base_lid(p_src_physp, &from_base_lid);
> -	__get_base_lid(p_dest_physp, &to_base_lid);
> 
>  	__osm_lr_rcv_build_physp_link(p_rcv, from_base_lid,
>  				      to_base_lid, src_port_num,
> -- 
> 1.5.1.4
> 


From sashak at voltaire.com  Mon Dec  3 17:45:35 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 4 Dec 2007 01:45:35 +0000
Subject: [ofa-general] Re: [PATCH 2/3 v2] opensm: adding missing comparison
	by to_lid/from_lid in LinkRecord processing
In-Reply-To: <20071204013442.GE15250@sashak.voltaire.com>
References: <47541DE7.4000106@dev.mellanox.co.il>
	<20071204013442.GE15250@sashak.voltaire.com>
Message-ID: <20071204014535.GF15250@sashak.voltaire.com>

On 01:34 Tue 04 Dec     , Sasha Khapyorsky wrote:
> > @@ -256,6 +257,19 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
> >  		if (dest_port_num != p_lr->to_port_num)
> >  			goto Exit;
> > 
> > +	__get_base_lid(p_src_physp, &from_base_lid);
> > +	__get_base_lid(p_dest_physp, &to_base_lid);
> > +
> > +	lmc_mask = ~((1 << p_rcv->p_subn->opt.lmc) - 1);
> > +
> > +	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
> > +		if (from_base_lid != (p_lr->from_lid & lmc_mask))
> > +			goto Exit;
> > +
> > +	if (comp_mask & IB_LR_COMPMASK_TO_LID)
> > +		if (to_base_lid != (p_lr->to_lid & lmc_mask))
> > +			goto Exit;
> 
> Actually it is broken too. Since all LIDs in the comparison are in network
> byte order, lmc_mask should be converted too. I will patch separately.

Something like this.

Sasha

commit 15e3721e822a2d93b32a6f08915a3f84e65424a4
Author: Sasha Khapyorsky <sashak at voltaire.com>
Date:   Tue Dec 4 01:37:57 2007 +0000

    opensm: fix lmc_mask bit order in osm_sa_link_record.c
    
    All LIDs here are in network byte order, so lmc_mask should be converted
    too.
    
    Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>

diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
index 1230b91..0497bcd 100644
--- a/opensm/opensm/osm_sa_link_record.c
+++ b/opensm/opensm/osm_sa_link_record.c
@@ -177,7 +177,7 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 	uint8_t dest_port_num;
 	ib_net16_t from_base_lid;
 	ib_net16_t to_base_lid;
-	uint16_t lmc_mask;
+	ib_net16_t lmc_mask;
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_lr_rcv_get_physp_link);
 
@@ -261,6 +261,7 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
 	__get_base_lid(p_dest_physp, &to_base_lid);
 
 	lmc_mask = ~((1 << p_rcv->p_subn->opt.lmc) - 1);
+	lmc_mask = cl_hton16(lmc_mask);
 
 	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
 		if (from_base_lid != (p_lr->from_lid & lmc_mask))


From akepner at sgi.com  Mon Dec  3 17:40:21 2007
From: akepner at sgi.com (akepner at sgi.com)
Date: Mon, 3 Dec 2007 17:40:21 -0800
Subject: [ofa-general] IPoIB CQ overrun
In-Reply-To: <1195374443.2802.31.camel@mtls03>
References: <20071115202302.GK5448@sgi.com> <20071116232331.GK24803@sgi.com>
	<1195374443.2802.31.camel@mtls03>
Message-ID: <20071204014021.GF10669@sgi.com>

On Sun, Nov 18, 2007 at 10:27:23AM +0200, Eli Cohen wrote:

> Can you tell how IPOIB is configured - connected mode or datagram mode?
> Also can you send more context from /var/log/messages? Especially can
> you rerun with debug enabled and send the output?
> Enabling debug can be done by:
> echo 1 > /sys/module/ib_ipoib/parameters/debug_level

Yes, it's connected mode.

Here is another log of an overrun with "debug_level=1". I added code to dump 
the CQ context table (it just does a QUERY_CQ and logs the result). 


15:50:39 r3i1n2 kernel: ib0: Send unicast ARP to 0165
15:50:42 r3i1n2 kernel: ib0: neigh_destructor for 000404 fe80:0000:0000:0000:0008:f104:0398:8595
15:50:42 r3i1n2 kernel: ib0: Reap connection for gid fe80:0000:0000:0000:0008:f104:0398:8595
15:50:42 r3i1n2 kernel: ib0: Destroy active connection 0xf048d head 0x2 tail 0x2
15:50:53 r3i1n2 in.rshd[7056]: connect from 10.148.0.9 (10.148.0.9)
15:50:53 r3i1n2 kernel: ib_mthca 0000:06:00.0: CQ overrun on CQN 240082
15:50:53 r3i1n2 kernel: cq_context = 0xffff8101eee9c000
15:50:53 r3i1n2 kernel: flags = 0x90000900
15:50:53 r3i1n2 kernel: start_hi = 0x0
15:50:53 r3i1n2 kernel: start_lo = 0x0
15:50:53 r3i1n2 kernel: logsize_usrpage = 0x7000002
15:50:53 r3i1n2 kernel: comp_eqn = 0x1
15:50:53 r3i1n2 kernel: pd = 0x4
15:50:53 r3i1n2 kernel: lkey = 0xd0108900
15:50:53 r3i1n2 kernel: last_notified_index = 0x217
15:50:53 r3i1n2 kernel: solicit_producer_index = 0x9c18
15:50:53 r3i1n2 kernel: consumer_index = 0x0
15:50:53 r3i1n2 kernel: producer_index = 0x218
15:50:53 r3i1n2 kernel: cqn = 0x240082
15:50:53 r3i1n2 kernel: ci_db = 0x7ffd
15:50:53 r3i1n2 kernel: state_db = 0x1
15:50:58 r3i1n2 kernel: ib1: Send unicast ARP to 016d
15:50:58 r3i1n2 kernel: ib0: Send unicast ARP to 0165
15:51:11 r3i1n2 in.rshd[7057]: connect from 10.148.0.9 (10.148.0.9)
15:51:27 r3i1n2 kernel: ib0: REQ arrived
15:51:31 r3i1n2 kernel: ib0: Send unicast ARP to 0165
15:51:32 r3i1n2 kernel: ib1: Send unicast ARP to 016d
15:51:32 r3i1n2 kernel: ib0: Send unicast ARP to 00ac
15:51:42 r3i1n2 in.rshd[7058]: connect from 10.148.0.9 (10.148.0.9)
15:52:12 r3i1n2 in.rlogind[7059]: connect from 10.148.0.9 (10.148.0.9)
15:52:17 r3i1n2 kernel: ib1: Send unicast ARP to 016d
15:52:22 r3i1n2 kernel: ib0: Send unicast ARP to 0165
15:52:54 r3i1n2 in.rlogind[7060]: connect from 192.168.159.1 (192.168.159.1)
15:52:54 r3i1n2 rlogind[7060]: pam_rhosts_auth(rlogin:auth): allowed to root at r3lead as root
15:52:59 r3i1n2 kernel: ib1: Send unicast ARP to 016d
15:53:11 r3i1n2 kernel: ib0: Send unicast ARP to 0165
15:53:32 r3i1n2 kernel: ib0: Send unicast ARP to 00ac
15:54:14 r3i1n2 kernel: ib0: Send unicast ARP to 0165
15:54:19 r3i1n2 kernel: ib1: Send unicast ARP to 016d
15:54:26 r3i1n2 kernel: ib_mthca 0000:06:00.0: mthca_create_cq: cq = 0xffff81015a3ee7c0 cqn = 0x350090
15:54:26 r3i1n2 kernel: ib0: ipoib_cm_tx_init: ib_create_cq returns 0xffff81022523b1c0
15:54:26 r3i1n2 kernel: ib0: Request connection 0x13048f for gid fe80:0000:0000:0000:0008:f104:0398:8595 qpn 0x404
15:54:26 r3i1n2 kernel: ib0: REP received.
15:54:43 r3i1n2 in.rshd[7061]: connect from 192.168.159.1 (192.168.159.1)
15:54:43 r3i1n2 rshd[7061]: pam_rhosts_auth(rsh:auth): allowed to root at r3lead as root
15:54:48 r3i1n2 kernel: ib0: Send unicast ARP to 0165
15:55:03 r3i1n2 login[4750]: resmgr: unable to connect to resmgrd: No such file or directory
15:55:03 r3i1n2 login[4750]: resmgr login failed
15:55:23 r3i1n2 kernel: ib0: Send unicast ARP to 0165
15:55:28 r3i1n2 kernel: ib1: Send unicast ARP to 016d
15:55:30 r3i1n2 kernel: ib0: TX ring 0xf00405 full, stopping kernel net queue
15:55:32 r3i1n2 kernel: NETDEV WATCHDOG: ib0: transmit timed out
15:55:32 r3i1n2 kernel: ib0: transmit timeout: latency 1688 msecs
15:55:32 r3i1n2 kernel: ib0: queue stopped 1, tx_head 13657, tx_tail 13657
15:55:33 r3i1n2 kernel: NETDEV WATCHDOG: ib0: transmit timed out
 

Looking at the contents of the CQ context table (right after the 
overrun at 15:50:53), do the producer and consumer indices look 
reasonable? I expected to find that producer_index + 1 == consumer_index.
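
As a reference point for reading the dump, here is a generic sketch of
ring-index arithmetic with free-running 32-bit counters. It is not the mthca
or firmware logic; the function names are made up, and it assumes the ring
holds 2^log_size entries with log_size taken from the top byte of
logsize_usrpage (7 in the dump above), which is an assumption about the field
layout rather than something the log states.

```c
#include <assert.h>
#include <stdint.h>

/* Number of entries in a ring of 2^log_size CQEs. */
static uint32_t cq_size(unsigned log_size)
{
	assert(log_size < 32);
	return 1u << log_size;
}

/* Entries produced but not yet consumed; unsigned subtraction
 * gives the right answer even after the counters wrap. */
static uint32_t cq_outstanding(uint32_t producer, uint32_t consumer)
{
	return producer - consumer;
}

/* Non-zero if more entries are outstanding than the ring can hold. */
static int cq_overrun(uint32_t producer, uint32_t consumer, unsigned log_size)
{
	return cq_outstanding(producer, consumer) > cq_size(log_size);
}
```

With producer_index = 0x218 and consumer_index = 0x0 this model reports 536
outstanding entries against a 128-entry ring, which would count as an overrun
only if the consumer_index in the context is actually current.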

-- 
Arthur


From akepner at sgi.com  Mon Dec  3 18:01:04 2007
From: akepner at sgi.com (akepner at sgi.com)
Date: Mon, 3 Dec 2007 18:01:04 -0800
Subject: [ofa-general] IPoIB CQ overrun
In-Reply-To: <adazlx9wndb.fsf@cisco.com>
References: <20071115202302.GK5448@sgi.com> <adazlx9wndb.fsf@cisco.com>
Message-ID: <20071204020104.GG10669@sgi.com>

On Mon, Nov 19, 2007 at 08:29:36PM -0800, Roland Dreier wrote:
> 
> OFED 1.2 uses a separate CQ for send completions in connected mode.
> (I'm assuming you're using the OFED default of connected mode for
> IPoIB).  I guess it would be useful to know which CQ is overrunning,
> ie whether it is the main IPoIB CQ or one of the CM send CQs.  One way
> to check this would be to add a print to mthca to dump the CQN when a
> CQ is created, and also add prints to IPoIB just before each call to
> ib_create_cq() so that the CQNs can be correlated.
> 
> Another thing you could try would be a 2.6.24-rc kernel (or an OFED
> 1.3 prerelease I guess), which has a change that moves all completions
> into one CQ in IPoIB.  This may fix the bug by accident.
> 

Yes, we're using CM.

I dumped out the CQNs as they were created and generally the first 
non-reserved CQs get made by ipoib_transport_dev_init() when ipoib 
is brought up on each port. CQN 0x80 is used by port 0, 0x81 by 
port 1. 

The other CQs used by IPoIB are the ones made by ipoib_cm_tx_init(). 

We see overruns on both types of CQ. 

Here's an overrun on the main IPoIB CQ (CQN 0x80):

Dec  2 10:18:08 r6i1n8 kernel: ib0: Send unicast ARP to 0165
Dec  2 10:18:13 r6i1n8 kernel: ib1: Send unicast ARP to 016d
Dec  2 10:18:28 r6i1n8 kernel: ib0: Send unicast ARP to 0165
Dec  2 10:18:39 r6i1n8 kernel: ib0: Send unicast ARP to 010a
Dec  2 10:18:48 r6i1n8 kernel: ib0: Send unicast ARP to 0165
Dec  2 10:19:08 r6i1n8 kernel: ib0: Send unicast ARP to 0165
Dec  2 10:19:13 r6i1n8 kernel: ib1: Send unicast ARP to 016d
Dec  2 10:19:23 r6i1n8 kernel: ib0: Send unicast ARP to 016a
Dec  2 10:19:23 r6i1n8 kernel: ib_mthca 0000:06:00.0: CQ overrun on CQN 000080
Dec  2 10:19:23 r6i1n8 kernel: ib_mad: Fatal error (1) on MAD QP (1)
Dec  2 10:19:23 r6i1n8 kernel: cq_context = 0xffff8101b0ec1000
Dec  2 10:19:23 r6i1n8 kernel: flags = 0x90000900
Dec  2 10:19:23 r6i1n8 kernel: start_hi = 0x0
Dec  2 10:19:23 r6i1n8 kernel: start_lo = 0x0
Dec  2 10:19:23 r6i1n8 kernel: logsize_usrpage = 0xb000002
Dec  2 10:19:23 r6i1n8 kernel: comp_eqn = 0x1
Dec  2 10:19:23 r6i1n8 kernel: pd = 0x4
Dec  2 10:19:23 r6i1n8 kernel: lkey = 0x1300
Dec  2 10:19:23 r6i1n8 kernel: last_notified_index = 0x6972
Dec  2 10:19:23 r6i1n8 kernel: solicit_producer_index = 0x6173
Dec  2 10:19:23 r6i1n8 kernel: consumer_index = 0x0
Dec  2 10:19:23 r6i1n8 kernel: producer_index = 0x6973
Dec  2 10:19:23 r6i1n8 kernel: cqn = 0x80
Dec  2 10:19:23 r6i1n8 kernel: ci_db = 0x7fff
Dec  2 10:19:23 r6i1n8 kernel: state_db = 0x0
Dec  2 10:19:28 r6i1n8 kernel: ib0: Send unicast ARP to 0165
Dec  2 10:19:48 r6i1n8 kernel: ib0: Send unicast ARP to 0165
Dec  2 10:19:57 r6i1n8 kernel: ib_mad: Fatal error (1) on MAD QP (1)

(The CQ context table was dumped for debugging.)

And there was an example of a CM send CQ overrun in the mail I just 
sent to Eli (and ofa-general).

> Another thing you could try would be a 2.6.24-rc kernel (or an OFED
> 1.3 prerelease I guess), which has a change that moves all completions
> into one CQ in IPoIB.  This may fix the bug by accident.

The system was upgraded to OFED 1.3-alpha2, and now it's much more 
difficult to get the CQ overrun. (There are some overruns in the 
log files, but I can't seem to figure out how to reproduce them - 
it was much easier to get the CQ overruns with OFED 1.2 on the 
system.)

-- 
Arthur


From rdreier at cisco.com  Mon Dec  3 21:03:20 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 03 Dec 2007 21:03:20 -0800
Subject: [ofa-general] IPoIB CQ overrun
In-Reply-To: <20071204020104.GG10669@sgi.com> (akepner@sgi.com's message of
	"Mon, 3 Dec 2007 18:01:04 -0800")
References: <20071115202302.GK5448@sgi.com> <adazlx9wndb.fsf@cisco.com>
	<20071204020104.GG10669@sgi.com>
Message-ID: <aday7cbhx0n.fsf@cisco.com>

 > I dumped out the CQNs as they were created and generally the first 
 > non-reserved CQs get made by ipoib_transport_dev_init() when ipoib 
 > is brought up on each port. CQN 0x80 is used by port 0, 0x81 by 
 > port 1. 

Actually I think the first two CQs created are created by the MAD module:

 > Dec  2 10:19:23 r6i1n8 kernel: ib_mthca 0000:06:00.0: CQ overrun on CQN 000080
 > Dec  2 10:19:23 r6i1n8 kernel: ib_mad: Fatal error (1) on MAD QP (1)

It seems that there is a CQ error and then ib_mad gets a catastrophic
error on its QP.

Given that you are seeing CQ overruns on two completely different
types of QPs, I think it's more likely there is some problem with the
mthca driver's handling of updating the CQ consumer index than that
there are two independent bugs being triggered by your test.

What kind of hardware was this on again?  It's x86-64, right?  But is
there anything out of the ordinary about these systems?

 - R.


From mukilk at gmail.com  Mon Dec  3 21:08:28 2007
From: mukilk at gmail.com (Mukil Kesavan)
Date: Tue, 4 Dec 2007 00:08:28 -0500
Subject: [ofa-general] OFED QoS Support
Message-ID: <8b9320b60712032108x3b85e5ebuf2ff79aa8b8e7f12@mail.gmail.com>

Hi,

I am a grad student at Georgia Institute of Technology working on InfiniBand
for a project. My focus is on IB QoS support. I used opensm from the OFED 1.2
distribution to configure the SL-to-VL mappings and the VLArb tables with the
opensm.opts file in /var/cache/osm. Currently, I am configuring QoS as one
high-priority VL with weight 200 and disabling the other high-priority VLs.
When I run the tests, the average bandwidth does not change with the QoS
settings, and bandwidth appears to be shared evenly among the different
applications during the tests.

I currently use a Mellanox MT25208 InfiniHost III Ex HCA with firmware version
4.6.0. Could someone help me get QoS to work? There are not a lot of
resources on the web, and I've been following the email exchanges among the
developers on this list, but they aren't conclusive to me.

Thanks for your time,

Mukil

From kliteyn at mellanox.co.il  Mon Dec  3 21:17:29 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 4 Dec 2007 07:17:29 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-04:normal completion
Message-ID: <MTLEXCH01zwWJv3zOdn00009725@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-03
OpenSM git rev = Sun_Dec_2_14:13:47_2007 [0b4650d25a4ac93e615d100a9985abbb3ec17313]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=519  Fail=1
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
12 Multicast IS3-128.topo

Failures:
1 Multicast IS3-128.topo


From kliteyn at dev.mellanox.co.il  Tue Dec  4 00:13:02 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 04 Dec 2007 10:13:02 +0200
Subject: [ofa-general] Re: [PATCH 2/3] opensm: adding missing	comparison
	by	to_lid/from_lid in LinkRecord processing
In-Reply-To: <47542624.1090003@dev.mellanox.co.il>
References: <4752A1DB.5010103@dev.mellanox.co.il>		<20071202180949.GE708@sashak.voltaire.com>		<47541DAC.9000900@dev.mellanox.co.il>	<1196695616.10845.241.camel@hrosenstock-ws.xsigo.com>
	<47542624.1090003@dev.mellanox.co.il>
Message-ID: <47550C0E.1090908@dev.mellanox.co.il>

Yevgeny Kliteynik wrote:
> Hal Rosenstock wrote:
>> On Mon, 2007-12-03 at 17:15 +0200, Yevgeny Kliteynik wrote:
>>> Sasha Khapyorsky wrote:
>>>> Hi Yevgeny,
>>>>
>>>> On 14:15 Sun 02 Dec     , Yevgeny Kliteynik wrote:
>>>>> Comparison for IB_LR_COMPMASK_FROM_LID/IB_LR_COMPMASK_TO_LID
>>>>> component mask bits was missing in LinkRecord processing.
>>>>>
>>>>> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>>>>> ---
>>>>>  opensm/opensm/osm_sa_link_record.c |   13 +++++++++++--
>>>>>  1 files changed, 11 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
>>>>> index ba52aea..0970ad7 100644
>>>>> --- a/opensm/opensm/osm_sa_link_record.c
>>>>> +++ b/opensm/opensm/osm_sa_link_record.c
>>>>> @@ -256,6 +256,17 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>>>>>          if (dest_port_num != p_lr->to_port_num)
>>>>>              goto Exit;
>>>>>
>>>>> +    __get_base_lid(p_src_physp, &from_base_lid);
>>>>> +    __get_base_lid(p_dest_physp, &to_base_lid);
>>>>> +
>>>>> +    if (comp_mask & IB_LR_COMPMASK_FROM_LID)
>>>>> +        if (from_base_lid != p_lr->from_lid)
>>>>> +            goto Exit;
>>>>> +
>>>>> +    if (comp_mask & IB_LR_COMPMASK_TO_LID)
>>>>> +        if (to_base_lid != p_lr->to_lid)
>>>>> +            goto Exit;
>>>> Would this be correct with LMC > 0? As far as I understand, aliased (not
>>>> base) LIDs can be used in a query.
>>> Good catch, thanks.
>>
>> Note that:
>> In a query request, any LID of a port can be requested as the ToLID. In
>> a query response, only the base LID of a port is returned as the ToLID.
> 
> Right.
> So in the current implementation it does build response with only base
> lids, but it will include more than one LinkRecord with the base lids
> in the resulting list - one for each lid when LMC>0...

I take it back.
Looks OK (apart from the lmc_mask issue).
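
For reference, the matching rule under discussion can be sketched in host
byte order as below. This is an illustration only: base_lid() and
lid_matches_port() are hypothetical names, not OpenSM functions, and it
assumes a port with LMC = n owns the 2^n consecutive LIDs starting at its
base LID.

```c
#include <assert.h>
#include <stdint.h>

/* Base LID of the port that owns `lid` under the given LMC (host byte order). */
static uint16_t base_lid(uint16_t lid, uint8_t lmc)
{
	assert(lmc <= 7);	/* LMC is a 3-bit field */
	return lid & (uint16_t)~((1u << lmc) - 1);
}

/* Non-zero if a queried LID (base or alias) belongs to the port
 * whose base LID is `port_base_lid`. */
static int lid_matches_port(uint16_t queried_lid, uint16_t port_base_lid,
			    uint8_t lmc)
{
	return base_lid(queried_lid, lmc) == port_base_lid;
}
```

With LMC = 2 and base LID 0x10, queries for 0x10 through 0x13 all match the
port, while 0x14 belongs to the next one. With LMC = 0 only the base LID
itself matches.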

-- Yevgeny

> I'll repost the patch.
> 
> -- Yevgeny
> 
>> -- Hal
>>
>>> -- Yevgeny
>>>
>>>> Sasha
>>>>
>>>>> +
>>>>>      if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
>>>>>          osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
>>>>>              "__osm_lr_rcv_get_physp_link: "
>>>>> @@ -267,8 +278,6 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>>>>>              cl_ntoh64(osm_physp_get_port_guid(p_dest_physp)),
>>>>>              dest_port_num);
>>>>>
>>>>> -    __get_base_lid(p_src_physp, &from_base_lid);
>>>>> -    __get_base_lid(p_dest_physp, &to_base_lid);
>>>>>
>>>>>      __osm_lr_rcv_build_physp_link(p_rcv, from_base_lid,
>>>>>                        to_base_lid, src_port_num,
>>>>> -- 
>>>>> 1.5.1.4
>>>>>
>>> _______________________________________________
>>> general mailing list
>>> general at lists.openfabrics.org
>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>>>
>>> To unsubscribe, please visit 
>>> http://openib.org/mailman/listinfo/openib-general
>>
> 


From kliteyn at dev.mellanox.co.il  Tue Dec  4 00:13:31 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 04 Dec 2007 10:13:31 +0200
Subject: [ofa-general] Re: [PATCH 2/3 v2] opensm: adding missing comparison
 by	to_lid/from_lid in LinkRecord processing
In-Reply-To: <20071204014535.GF15250@sashak.voltaire.com>
References: <47541DE7.4000106@dev.mellanox.co.il>
	<20071204013442.GE15250@sashak.voltaire.com>
	<20071204014535.GF15250@sashak.voltaire.com>
Message-ID: <47550C2B.6090907@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> On 01:34 Tue 04 Dec     , Sasha Khapyorsky wrote:
>>> @@ -256,6 +257,19 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>>>  		if (dest_port_num != p_lr->to_port_num)
>>>  			goto Exit;
>>>
>>> +	__get_base_lid(p_src_physp, &from_base_lid);
>>> +	__get_base_lid(p_dest_physp, &to_base_lid);
>>> +
>>> +	lmc_mask = ~((1 << p_rcv->p_subn->opt.lmc) - 1);
>>> +
>>> +	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
>>> +		if (from_base_lid != (p_lr->from_lid & lmc_mask))
>>> +			goto Exit;
>>> +
>>> +	if (comp_mask & IB_LR_COMPMASK_TO_LID)
>>> +		if (to_base_lid != (p_lr->to_lid & lmc_mask))
>>> +			goto Exit;
>> Actually it is broken too. Since all LIDs in the comparison are in network
>> byte order, lmc_mask should be converted too. I will patch separately.
> 
> Something like this.

Right, it does fix the problem. Thanks.

-- Yevgeny

> Sasha
> 
> commit 15e3721e822a2d93b32a6f08915a3f84e65424a4
> Author: Sasha Khapyorsky <sashak at voltaire.com>
> Date:   Tue Dec 4 01:37:57 2007 +0000
> 
>     opensm: fix lmc_mask bit order in osm_sa_link_record.c
>     
>     All LIDs here are in network byte order, so lmc_mask should be converted
>     too.
>     
>     Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> 
> diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
> index 1230b91..0497bcd 100644
> --- a/opensm/opensm/osm_sa_link_record.c
> +++ b/opensm/opensm/osm_sa_link_record.c
> @@ -177,7 +177,7 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>  	uint8_t dest_port_num;
>  	ib_net16_t from_base_lid;
>  	ib_net16_t to_base_lid;
> -	uint16_t lmc_mask;
> +	ib_net16_t lmc_mask;
>  
>  	OSM_LOG_ENTER(p_rcv->p_log, __osm_lr_rcv_get_physp_link);
>  
> @@ -261,6 +261,7 @@ __osm_lr_rcv_get_physp_link(IN osm_lr_rcv_t * const p_rcv,
>  	__get_base_lid(p_dest_physp, &to_base_lid);
>  
>  	lmc_mask = ~((1 << p_rcv->p_subn->opt.lmc) - 1);
> +	lmc_mask = cl_hton16(lmc_mask);
>  
>  	if (comp_mask & IB_LR_COMPMASK_FROM_LID)
>  		if (from_base_lid != (p_lr->from_lid & lmc_mask))
> 


From jackm at dev.mellanox.co.il  Tue Dec  4 00:32:38 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Tue, 4 Dec 2007 10:32:38 +0200
Subject: [ofa-general] [PATCH v9] IB/mlx4: shrinking WQE
Message-ID: <200712041032.39020.jackm@dev.mellanox.co.il>

IB/mlx4: shrinking WQE

ConnectX supports shrinking wqe, such that a single WR can include
multiple units of wqe_shift.  This way, WRs can differ in size, and
do not have to be a power of 2 in size, saving memory and speeding up
send WR posting.  Unfortunately, if we do this, the wqe_index field in the
CQE can't be used to look up the WR ID anymore, so do this only if
selective signalling is off.

Further, on 32-bit platforms, we can't use vmap to make
the QP buffer virtually contiguous. Thus we have to use
constant-sized WRs to make sure a WR is always fully within
a single page-sized chunk.

Finally, we use WR with NOP opcode to avoid wrap-around
in the middle of WR. We set NoErrorCompletion bit to avoid getting
completions with error for NOP WRs. Since NEC is only supported
starting with firmware 2.2.232, we use constant-sized WRs
for older firmware. And, since MLX QPs only support SEND, we use
constant-sized WRs in this case.

When posting a NOP WQE, do the stamping only after setting the
NOP WQE's valid bit.

Signed-off-by: Michael S. Tsirkin <mst at dev.mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>

---

changes since v8:
	fixed bug in __mlx4_ib_modify_qp:  did not reset qp->sq_next_wqe
	in the TO_RESET transition.
	Found by Mellanox regression testing.
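
The sizing rule described above can be modelled in user space roughly as
follows. This is a sketch under assumptions, not the driver code: struct
sq_layout, sq_layout_for() and nop_pad_units() are hypothetical names,
wr_bytes stands for the largest WR size including overhead, and the
device-capability checks and the retry loop that grows wqe_shift are left out.

```c
#include <assert.h>
#include <stdint.h>

struct sq_layout {
	unsigned wqe_shift;		/* log2 of the basic WQE unit, in bytes */
	unsigned max_wqes_per_wr;	/* units the largest WR may occupy */
	unsigned spare_wqes;		/* headroom left for HW prefetch */
	unsigned wqe_cnt;		/* units in the send queue (power of 2) */
	unsigned max_post;		/* WRs that may be outstanding at once */
};

static unsigned roundup_pow2(unsigned v)
{
	unsigned r = 1;

	while (r < v)
		r <<= 1;
	return r;
}

/* Lay out a send queue for WRs of up to wr_bytes each. */
static struct sq_layout sq_layout_for(unsigned wr_bytes, unsigned max_send_wr,
				      unsigned wqe_shift)
{
	struct sq_layout l;
	unsigned unit = 1u << wqe_shift;

	assert(wr_bytes > 0 && wqe_shift >= 6 && wqe_shift < 16);
	l.wqe_shift = wqe_shift;
	l.max_wqes_per_wr = (wr_bytes + unit - 1) / unit;
	/* 2 KB + one maximum-sized WR of headroom, counted in units. */
	l.spare_wqes = (2048 >> wqe_shift) + l.max_wqes_per_wr;
	l.wqe_cnt = roundup_pow2(max_send_wr * l.max_wqes_per_wr + l.spare_wqes);
	l.max_post = (l.wqe_cnt - l.spare_wqes) / l.max_wqes_per_wr;
	return l;
}

/* Units to fill with a NOP so that the next WR does not straddle the
 * end of the ring; 0 if no padding is needed at index ind. */
static unsigned nop_pad_units(const struct sq_layout *l, unsigned ind)
{
	unsigned left = l->wqe_cnt - (ind & (l->wqe_cnt - 1));

	return left < l->max_wqes_per_wr ? left : 0;
}
```

For example, a 200-byte maximum WR with 64-byte units and max_send_wr = 100
occupies up to 4 units per WR, needs 36 spare units, and rounds up to a
512-unit queue. A WR starting at index 510 would first need a 2-unit NOP.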

Index: infiniband/drivers/infiniband/hw/mlx4/cq.c
===================================================================
--- infiniband.orig/drivers/infiniband/hw/mlx4/cq.c	2007-10-29 11:09:24.000000000 +0200
+++ infiniband/drivers/infiniband/hw/mlx4/cq.c	2007-12-04 09:29:39.391728000 +0200
@@ -331,6 +331,12 @@ static int mlx4_ib_poll_one(struct mlx4_
 	is_error = (cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
 		MLX4_CQE_OPCODE_ERROR;
 
+	if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) == MLX4_OPCODE_NOP &&
+		     is_send)) {
+		printk(KERN_WARNING "Completion for NOP opcode detected!\n");
+		return -EINVAL;
+	}
+
 	if (!*cur_qp ||
 	    (be32_to_cpu(cqe->my_qpn) & 0xffffff) != (*cur_qp)->mqp.qpn) {
 		/*
@@ -353,8 +359,10 @@ static int mlx4_ib_poll_one(struct mlx4_
 
 	if (is_send) {
 		wq = &(*cur_qp)->sq;
-		wqe_ctr = be16_to_cpu(cqe->wqe_index);
-		wq->tail += (u16) (wqe_ctr - (u16) wq->tail);
+		if (!(*cur_qp)->sq_signal_bits) {
+			wqe_ctr = be16_to_cpu(cqe->wqe_index);
+			wq->tail += (u16) (wqe_ctr - (u16) wq->tail);
+		}
 		wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
 		++wq->tail;
 	} else if ((*cur_qp)->ibqp.srq) {
Index: infiniband/drivers/infiniband/hw/mlx4/mlx4_ib.h
===================================================================
--- infiniband.orig/drivers/infiniband/hw/mlx4/mlx4_ib.h	2007-10-24 18:19:56.000000000 +0200
+++ infiniband/drivers/infiniband/hw/mlx4/mlx4_ib.h	2007-12-04 09:29:39.394732000 +0200
@@ -120,6 +120,8 @@ struct mlx4_ib_qp {
 
 	u32			doorbell_qpn;
 	__be32			sq_signal_bits;
+	unsigned		sq_next_wqe;
+	int			sq_max_wqes_per_wr;
 	int			sq_spare_wqes;
 	struct mlx4_ib_wq	sq;
 
Index: infiniband/drivers/infiniband/hw/mlx4/qp.c
===================================================================
--- infiniband.orig/drivers/infiniband/hw/mlx4/qp.c	2007-11-01 09:17:50.000000000 +0200
+++ infiniband/drivers/infiniband/hw/mlx4/qp.c	2007-12-04 09:29:39.404731000 +0200
@@ -30,6 +30,7 @@
  * SOFTWARE.
  */
 
+#include <linux/log2.h>
 #include <rdma/ib_cache.h>
 #include <rdma/ib_pack.h>
 
@@ -96,7 +97,7 @@ static int is_qp0(struct mlx4_ib_dev *de
 
 static void *get_wqe(struct mlx4_ib_qp *qp, int offset)
 {
-	if (qp->buf.nbufs == 1)
+	if (BITS_PER_LONG == 64 || qp->buf.nbufs == 1)
 		return qp->buf.u.direct.buf + offset;
 	else
 		return qp->buf.u.page_list[offset >> PAGE_SHIFT].buf +
@@ -115,16 +116,88 @@ static void *get_send_wqe(struct mlx4_ib
 
 /*
  * Stamp a SQ WQE so that it is invalid if prefetched by marking the
- * first four bytes of every 64 byte chunk with 0xffffffff, except for
- * the very first chunk of the WQE.
+ * first four bytes of every 64 byte chunk with
+ * 0x7FFFFFFF | (invalid_ownership_value << 31).
+ *
+ * When the max WR size is less than or equal to the WQE size,
+ * as an optimization, we can stamp WQE with 0xffffffff,
+ * and skip the very first chunk of the WQE.
  */
-static void stamp_send_wqe(struct mlx4_ib_qp *qp, int n)
+static void stamp_send_wqe(struct mlx4_ib_qp *qp, int n, int size)
 {
-	u32 *wqe = get_send_wqe(qp, n);
+	u32 *wqe;
 	int i;
+	int s;
+	int ind;
+	void *buf;
+	__be32 stamp;
+
+	s = roundup(size, 1 << qp->sq.wqe_shift);
+	if (qp->sq_max_wqes_per_wr > 1) {
+		for (i = 0; i < s; i += 64) {
+			ind = (i >> qp->sq.wqe_shift) + n;
+			stamp = ind & qp->sq.wqe_cnt ?  cpu_to_be32(0x7fffffff) :
+							cpu_to_be32(0xffffffff);
+			buf = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1));
+			wqe = buf + (i & ((1 << qp->sq.wqe_shift) - 1));
+			*wqe = stamp;
+		}
+	} else {
+		buf = get_send_wqe(qp, n & (qp->sq.wqe_cnt - 1));
+		for (i = 64; i < s; i += 64) {
+			wqe = buf + i;
+			*wqe = 0xffffffff;
+		}
+	}
+}
+
+static void post_nop_wqe(struct mlx4_ib_qp *qp, int n, int size)
+{
+	struct mlx4_wqe_ctrl_seg *ctrl;
+	struct mlx4_wqe_inline_seg *inl;
+	void *wqe;
+	int s;
+
+	ctrl = wqe = get_send_wqe(qp, n & (qp->sq.wqe_cnt - 1));
+	s = sizeof(struct mlx4_wqe_ctrl_seg);
+
+	if (qp->ibqp.qp_type == IB_QPT_UD) {
+		struct mlx4_wqe_datagram_seg *dgram = wqe + sizeof *ctrl;
+		struct mlx4_av *av = (struct mlx4_av *)dgram->av;
+		memset(dgram, 0, sizeof *dgram);
+		av->port_pd = cpu_to_be32((qp->port << 24) | to_mpd(qp->ibqp.pd)->pdn);
+		s += sizeof(struct mlx4_wqe_datagram_seg);
+	}
+
+	/* Pad the remainder of the WQE with an inline data segment. */
+	if (size > s) {
+		inl = wqe + s;
+		inl->byte_count = cpu_to_be32(1 << 31 | (size - s - sizeof *inl));
+	}
+	ctrl->srcrb_flags = 0;
+	ctrl->fence_size = size / 16;
+	/*
+	 * Make sure descriptor is fully written before
+	 * setting ownership bit (because HW can start
+	 * executing as soon as we do).
+	 */
+	wmb();
 
-	for (i = 16; i < 1 << (qp->sq.wqe_shift - 2); i += 16)
-		wqe[i] = 0xffffffff;
+	ctrl->owner_opcode = cpu_to_be32(MLX4_OPCODE_NOP | MLX4_WQE_CTRL_NEC) |
+		(n & qp->sq.wqe_cnt ? cpu_to_be32(1 << 31) : 0);
+
+	stamp_send_wqe(qp, n + qp->sq_spare_wqes, size);
+}
+
+/* Post NOP WQE to prevent wrap-around in the middle of WR */
+static inline unsigned pad_wraparound(struct mlx4_ib_qp *qp, int ind)
+{
+	unsigned s = qp->sq.wqe_cnt - (ind & (qp->sq.wqe_cnt - 1));
+	if (unlikely(s < qp->sq_max_wqes_per_wr)) {
+		post_nop_wqe(qp, ind, s << qp->sq.wqe_shift);
+		ind += s;
+	}
+	return ind;
 }
 
 static void mlx4_ib_qp_event(struct mlx4_qp *qp, enum mlx4_event type)
@@ -241,6 +314,8 @@ static int set_rq_size(struct mlx4_ib_de
 static int set_kernel_sq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
 			      enum ib_qp_type type, struct mlx4_ib_qp *qp)
 {
+	int s;
+
 	/* Sanity check SQ size before proceeding */
 	if (cap->max_send_wr	 > dev->dev->caps.max_wqes  ||
 	    cap->max_send_sge	 > dev->dev->caps.max_sq_sg ||
@@ -256,20 +331,69 @@ static int set_kernel_sq_size(struct mlx
 	    cap->max_send_sge + 2 > dev->dev->caps.max_sq_sg)
 		return -EINVAL;
 
-	qp->sq.wqe_shift = ilog2(roundup_pow_of_two(max(cap->max_send_sge *
-							sizeof (struct mlx4_wqe_data_seg),
-							cap->max_inline_data +
-							sizeof (struct mlx4_wqe_inline_seg)) +
-						    send_wqe_overhead(type)));
-	qp->sq.max_gs    = ((1 << qp->sq.wqe_shift) - send_wqe_overhead(type)) /
-		sizeof (struct mlx4_wqe_data_seg);
+	s = max(cap->max_send_sge * sizeof (struct mlx4_wqe_data_seg),
+		cap->max_inline_data + sizeof (struct mlx4_wqe_inline_seg)) +
+		send_wqe_overhead(type);
 
 	/*
-	 * We need to leave 2 KB + 1 WQE of headroom in the SQ to
-	 * allow HW to prefetch.
+	 * Hermon supports shrinking wqe, such that a single WR can include
+	 * multiple units of wqe_shift.  This way, WRs can differ in size, and
+	 * do not have to be a power of 2 in size, saving memory and speeding up
+	 * send WR posting.  Unfortunately, if we do this wqe_index field in CQE
+	 * can't be used to look up the WR ID anymore, so do this only if
+	 * selective signalling is off.
+	 *
+	 * Further, on 32-bit platforms, we can't use vmap to make
+	 * the QP buffer virtually contiguous. Thus we have to use
+	 * constant-sized WRs to make sure a WR is always fully within
+	 * a single page-sized chunk.
+	 *
+	 * Finally, we use NOP opcode to avoid wrap-around in the middle of WR.
+	 * We set NEC bit to avoid getting completions with error for NOP WRs.
+	 * Since NEC is only supported starting with firmware 2.2.232,
+	 * we use constant-sized WRs for older firmware.
+	 *
+	 * And, since MLX QPs only support SEND, we use constant-sized WRs in this
+	 * case.
+	 *
+	 * We look for the smallest value of wqe_shift such that the resulting
+	 * number of wqes does not exceed device capabilities.
+	 *
+	 * We set WQE size to at least 64 bytes, this way stamping invalidates each WQE.
 	 */
-	qp->sq_spare_wqes = (2048 >> qp->sq.wqe_shift) + 1;
-	qp->sq.wqe_cnt = roundup_pow_of_two(cap->max_send_wr + qp->sq_spare_wqes);
+	if (dev->dev->caps.fw_ver >= MLX4_FW_VER_WQE_CTRL_NEC &&
+	    qp->sq_signal_bits && BITS_PER_LONG == 64 &&
+	    type != IB_QPT_SMI && type != IB_QPT_GSI)
+		qp->sq.wqe_shift = ilog2(64);
+	else
+		qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));
+
+	for (;;) {
+		if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz)
+			return -EINVAL;
+
+		qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift);
+
+		/*
+		 * We need to leave 2 KB + 1 WR of headroom in the SQ to
+		 * allow HW to prefetch.
+		 */
+		qp->sq_spare_wqes = (2048 >> qp->sq.wqe_shift) + qp->sq_max_wqes_per_wr;
+		qp->sq.wqe_cnt = roundup_pow_of_two(cap->max_send_wr *
+						    qp->sq_max_wqes_per_wr +
+						    qp->sq_spare_wqes);
+
+		if (qp->sq.wqe_cnt <= dev->dev->caps.max_wqes)
+			break;
+
+		if (qp->sq_max_wqes_per_wr <= 1)
+			return -EINVAL;
+
+		++qp->sq.wqe_shift;
+	}
+
+	qp->sq.max_gs = ((qp->sq_max_wqes_per_wr << qp->sq.wqe_shift) -
+			 send_wqe_overhead(type)) / sizeof (struct mlx4_wqe_data_seg);
 
 	qp->buf_size = (qp->rq.wqe_cnt << qp->rq.wqe_shift) +
 		(qp->sq.wqe_cnt << qp->sq.wqe_shift);
@@ -281,7 +405,8 @@ static int set_kernel_sq_size(struct mlx
 		qp->sq.offset = 0;
 	}
 
-	cap->max_send_wr  = qp->sq.max_post = qp->sq.wqe_cnt - qp->sq_spare_wqes;
+	cap->max_send_wr  = qp->sq.max_post =
+		(qp->sq.wqe_cnt - qp->sq_spare_wqes) / qp->sq_max_wqes_per_wr;
 	cap->max_send_sge = qp->sq.max_gs;
 	/* We don't support inline sends for kernel QPs (yet) */
 	cap->max_inline_data = 0;
@@ -327,6 +452,12 @@ static int create_qp_common(struct mlx4_
 	qp->rq.tail	    = 0;
 	qp->sq.head	    = 0;
 	qp->sq.tail	    = 0;
+	qp->sq_next_wqe     = 0;
+
+	if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR)
+		qp->sq_signal_bits = cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE);
+	else
+		qp->sq_signal_bits = 0;
 
 	err = set_rq_size(dev, &init_attr->cap, !!pd->uobject, !!init_attr->srq, qp);
 	if (err)
@@ -417,11 +548,6 @@ static int create_qp_common(struct mlx4_
 	 */
 	qp->doorbell_qpn = swab32(qp->mqp.qpn << 8);
 
-	if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR)
-		qp->sq_signal_bits = cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE);
-	else
-		qp->sq_signal_bits = 0;
-
 	qp->mqp.event = mlx4_ib_qp_event;
 
 	return 0;
@@ -916,7 +1042,7 @@ static int __mlx4_ib_modify_qp(struct ib
 			ctrl = get_send_wqe(qp, i);
 			ctrl->owner_opcode = cpu_to_be32(1 << 31);
 
-			stamp_send_wqe(qp, i);
+			stamp_send_wqe(qp, i, 1 << qp->sq.wqe_shift);
 		}
 	}
 
@@ -969,6 +1095,7 @@ static int __mlx4_ib_modify_qp(struct ib
 		qp->rq.tail = 0;
 		qp->sq.head = 0;
 		qp->sq.tail = 0;
+		qp->sq_next_wqe = 0;
 		if (!ibqp->srq)
 			*qp->db.db  = 0;
 	}
@@ -1278,13 +1405,14 @@ int mlx4_ib_post_send(struct ib_qp *ibqp
 	unsigned long flags;
 	int nreq;
 	int err = 0;
-	int ind;
-	int size;
+	unsigned ind;
+	int uninitialized_var(stamp);
+	int uninitialized_var(size);
 	int i;
 
 	spin_lock_irqsave(&qp->sq.lock, flags);
 
-	ind = qp->sq.head;
+	ind = qp->sq_next_wqe;
 
 	for (nreq = 0; wr; ++nreq, wr = wr->next) {
 		if (mlx4_wq_overflow(&qp->sq, nreq, qp->ibqp.send_cq)) {
@@ -1300,7 +1428,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp
 		}
 
 		ctrl = wqe = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1));
-		qp->sq.wrid[ind & (qp->sq.wqe_cnt - 1)] = wr->wr_id;
+		qp->sq.wrid[(qp->sq.head + nreq) & (qp->sq.wqe_cnt - 1)] = wr->wr_id;
 
 		ctrl->srcrb_flags =
 			(wr->send_flags & IB_SEND_SIGNALED ?
@@ -1413,16 +1541,23 @@ int mlx4_ib_post_send(struct ib_qp *ibqp
 		ctrl->owner_opcode = mlx4_ib_opcode[wr->opcode] |
 			(ind & qp->sq.wqe_cnt ? cpu_to_be32(1 << 31) : 0);
 
+		stamp = ind + qp->sq_spare_wqes;
+		ind += DIV_ROUND_UP(size * 16, 1 << qp->sq.wqe_shift);
+
 		/*
 		 * We can improve latency by not stamping the last
 		 * send queue WQE until after ringing the doorbell, so
 		 * only stamp here if there are still more WQEs to post.
+		 *
+		 * Same optimization applies to padding with NOP wqe
+		 * in case of WQE shrinking (used to prevent wrap-around
+		 * in the middle of WR).
 		 */
-		if (wr->next)
-			stamp_send_wqe(qp, (ind + qp->sq_spare_wqes) &
-				       (qp->sq.wqe_cnt - 1));
+		if (wr->next) {
+			stamp_send_wqe(qp, stamp, size * 16);
+			ind = pad_wraparound(qp, ind);
+		}
 
-		++ind;
 	}
 
 out:
@@ -1444,8 +1579,10 @@ out:
 		 */
 		mmiowb();
 
-		stamp_send_wqe(qp, (ind + qp->sq_spare_wqes - 1) &
-			       (qp->sq.wqe_cnt - 1));
+		stamp_send_wqe(qp, stamp, size * 16);
+
+		ind = pad_wraparound(qp, ind);
+		qp->sq_next_wqe = ind;
 	}
 
 	spin_unlock_irqrestore(&qp->sq.lock, flags);
Index: infiniband/drivers/net/mlx4/alloc.c
===================================================================
--- infiniband.orig/drivers/net/mlx4/alloc.c	2007-11-08 10:01:19.000000000 +0200
+++ infiniband/drivers/net/mlx4/alloc.c	2007-12-04 09:29:39.410730000 +0200
@@ -151,6 +151,19 @@ int mlx4_buf_alloc(struct mlx4_dev *dev,
 
 			memset(buf->u.page_list[i].buf, 0, PAGE_SIZE);
 		}
+
+		if (BITS_PER_LONG == 64) {
+			struct page **pages;
+			pages = kmalloc(sizeof *pages * buf->nbufs, GFP_KERNEL);
+			if (!pages)
+				goto err_free;
+			for (i = 0; i < buf->nbufs; ++i)
+				pages[i] = virt_to_page(buf->u.page_list[i].buf);
+			buf->u.direct.buf = vmap(pages, buf->nbufs, VM_MAP, PAGE_KERNEL);
+			kfree(pages);
+			if (!buf->u.direct.buf)
+				goto err_free;
+		}
 	}
 
 	return 0;
@@ -170,6 +183,9 @@ void mlx4_buf_free(struct mlx4_dev *dev,
 		dma_free_coherent(&dev->pdev->dev, size, buf->u.direct.buf,
 				  buf->u.direct.map);
 	else {
+		if (BITS_PER_LONG == 64)
+			vunmap(buf->u.direct.buf);
+
 		for (i = 0; i < buf->nbufs; ++i)
 			if (buf->u.page_list[i].buf)
 				dma_free_coherent(&dev->pdev->dev, PAGE_SIZE,
Index: infiniband/include/linux/mlx4/device.h
===================================================================
--- infiniband.orig/include/linux/mlx4/device.h	2007-10-16 15:50:08.000000000 +0200
+++ infiniband/include/linux/mlx4/device.h	2007-12-04 09:29:39.417729000 +0200
@@ -133,6 +133,11 @@ enum {
 	MLX4_STAT_RATE_OFFSET	= 5
 };
 
+static inline u64 mlx4_fw_ver(u64 major, u64 minor, u64 subminor)
+{
+	return (major << 32) | (minor << 16) | subminor;
+}
+
 struct mlx4_caps {
 	u64			fw_ver;
 	int			num_ports;
@@ -189,7 +194,7 @@ struct mlx4_buf_list {
 };
 
 struct mlx4_buf {
-	union {
+	struct {
 		struct mlx4_buf_list	direct;
 		struct mlx4_buf_list   *page_list;
 	} u;
Index: infiniband/include/linux/mlx4/qp.h
===================================================================
--- infiniband.orig/include/linux/mlx4/qp.h	2007-10-16 15:50:08.000000000 +0200
+++ infiniband/include/linux/mlx4/qp.h	2007-12-04 09:29:39.420729000 +0200
@@ -154,7 +154,11 @@ struct mlx4_qp_context {
 	u32			reserved5[10];
 };
 
+/* Which firmware version adds support for NEC (NoErrorCompletion) bit */
+#define MLX4_FW_VER_WQE_CTRL_NEC mlx4_fw_ver(2, 2, 232)
+
 enum {
+	MLX4_WQE_CTRL_NEC	= 1 << 29,
 	MLX4_WQE_CTRL_FENCE	= 1 << 6,
 	MLX4_WQE_CTRL_CQ_UPDATE	= 3 << 2,
 	MLX4_WQE_CTRL_SOLICITED	= 1 << 1,
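
The send queue sizing loop in set_kernel_sq_size() above can be followed more easily in isolation. Below is a minimal user-space sketch of that search, not the driver code: every toy_* name is hypothetical, the may_shrink flag stands in for the remaining conditions in the patch (all WRs signalled, 64-bit build, not an SMI/GSI QP), and the device limits are passed in by hand instead of being read from dev->dev->caps.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the limits the patch reads from dev->dev->caps. */
struct toy_caps {
	uint64_t fw_ver;          /* packed as major << 32 | minor << 16 | subminor */
	unsigned max_sq_desc_sz;  /* largest allowed WQE size, in bytes */
	unsigned max_wqes;        /* largest allowed number of WQEs in a queue */
};

struct toy_sq {
	unsigned wqe_shift;       /* log2 of the WQE size in bytes */
	unsigned max_wqes_per_wr; /* WQEs consumed by the largest WR */
	unsigned spare_wqes;      /* headroom left for HW prefetch */
	unsigned wqe_cnt;         /* total WQEs, a power of two */
	unsigned max_post;        /* WRs that can be outstanding */
};

/* Same packing as mlx4_fw_ver() in the patch. */
static uint64_t toy_fw_ver(uint64_t major, uint64_t minor, uint64_t subminor)
{
	return (major << 32) | (minor << 16) | subminor;
}

static unsigned toy_roundup_pow2(unsigned v)
{
	unsigned r = 1;

	while (r < v)
		r <<= 1;
	return r;
}

/* floor(log2(v)) for v >= 1 */
static unsigned toy_ilog2(unsigned v)
{
	unsigned s = 0;

	while (v > 1) {
		v >>= 1;
		++s;
	}
	return s;
}

/*
 * wr_bytes is the size of the largest WR (the value "s" in the patch).
 * Returns 0 and fills *sq on success, -1 if no layout fits the limits.
 */
static int toy_size_sq(const struct toy_caps *caps, unsigned wr_bytes,
		       unsigned max_send_wr, int may_shrink, struct toy_sq *sq)
{
	unsigned shift;

	/* Start from 64-byte WQEs when shrinking is allowed, else one WQE per WR. */
	if (may_shrink && caps->fw_ver >= toy_fw_ver(2, 2, 232))
		shift = toy_ilog2(64);
	else
		shift = toy_ilog2(toy_roundup_pow2(wr_bytes));

	for (;;) {
		unsigned per_wr, spare, cnt;

		if ((1u << shift) > caps->max_sq_desc_sz)
			return -1;

		per_wr = (wr_bytes + (1u << shift) - 1) >> shift;
		/* 2 KB plus one full WR of headroom */
		spare = (2048u >> shift) + per_wr;
		cnt = toy_roundup_pow2(max_send_wr * per_wr + spare);

		if (cnt <= caps->max_wqes) {
			sq->wqe_shift = shift;
			sq->max_wqes_per_wr = per_wr;
			sq->spare_wqes = spare;
			sq->wqe_cnt = cnt;
			sq->max_post = (cnt - spare) / per_wr;
			return 0;
		}

		/* Already one WQE per WR: a larger WQE cannot reduce the count. */
		if (per_wr <= 1)
			return -1;

		++shift;
	}
}
```

For example, with a 200-byte WR, max_send_wr = 100 and a generous max_wqes, the sketch settles on 64-byte WQEs (shift 6), 4 WQEs per WR and 36 spare WQEs. Lowering max_wqes to 256 makes the loop move up to 128-byte WQEs.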


From tziporet at mellanox.co.il  Tue Dec  4 00:34:48 2007
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Tue, 4 Dec 2007 10:34:48 +0200
Subject: [ofa-general] OFED 1.2.5.4 is ready on the ofa server
Message-ID: <6C2C79E72C305246B504CBA17B5500C90282E46F@mtlexch01.mtl.com>

OFED-1.2.5.4 is ready: 
http://www.openfabrics.org/downloads/OFED/ofed-1.2.5/OFED-1.2.5.4.tgz

Changes since OFED 1.2.5 
======================== 
- RDS: 
  - Performance enhancements 
  - GA for Oracle 11 
- IPoIB: 
  - Use NAPI by default 
  - For small received packets, allocate a new, smaller SKB to relieve
    accounting on the socket.
- mlx4: 
  - Enable changing default max HCA resource limits using module options.
  - Support opening more resources than the default by increasing the
    command timeout for INIT_HCA to 10 seconds
- PPC64 support: 
  - Fixed compilation problems on SLES10 SP1 

Changes from OFED 1.2.5.3: 
========================== 
- Low level drivers update:
  - cxgb3: Pull in latest fixes.
  - ipath: Pull in latest fixes.
- OSes support:
  - Added support for SLES9 SP4 (no QA was done)
  - Added support for RHEL5 up1 (no QA was done)
- IPOIB:
  - Removed the usage of unsignalled QP in Tx due to deadlock.
- RDS:
  - Relax the header consistency check on fragment reassembly


Tziporet & Vlad 


Tziporet Koren
Software Director
Mellanox Technologies
mailto: tziporet at mellanox.co.il
Tel +972-4-9097200, ext 380


From tziporet at dev.mellanox.co.il  Tue Dec  4 00:55:23 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Tue, 04 Dec 2007 10:55:23 +0200
Subject: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <OF6AF07FF5.D914FC1B-ON872573A6.007CCC04-882573A6.004B8A56@us.ibm.com>
References: <OF6AF07FF5.D914FC1B-ON872573A6.007CCC04-882573A6.004B8A56@us.ibm.com>
Message-ID: <475515FB.10801@mellanox.co.il>

Shirley Ma wrote:
>
> Tziporet Koren <tziporet at dev.mellanox.co.il> wrote on 12/02/2007 
> 05:30:42 AM:
>
> > Shirley Ma wrote:
> > > I just touch tested ofed-1.3 beta IPoIB. And found there was a kernel
> > > parameter hw_csum being added in IPoIB. I have several questions here:
> > > 1. Why not using ethtool to set up these HW_CSUM flags?
> > >  
> > This is experimental code - Dror explained it in OFED devcon (Or sent
> > you the pointer)
>
> So this code won't be in mainline kernel.
>
Not until we start some SPEC definition to add this feature to IPoIB

Tziporet


From vlad at dev.mellanox.co.il  Tue Dec  4 01:08:11 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Tue, 04 Dec 2007 11:08:11 +0200
Subject: [ofa-general] [PATCH]  CMA: Enable conn_id remove
Message-ID: <475518FB.1080501@dev.mellanox.co.il>

Hi Sean,
I have the following issue: The IB driver can't be unloaded after running applications over RDS.
I saw that the 'dev_remove' counter does not reach 0 on the passive side (after connection establishment).

Please review the following patch:

CMA: Enable conn_id remove on the passive side after
connection establishment.

Signed-off-by: Vladimir Sokolovsky <vlad at mellanox.co.il>
---
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 0751697..656d6df 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1122,8 +1122,10 @@ static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
         cm_id->cm_handler = cma_ib_handler;

         ret = conn_id->id.event_handler(&conn_id->id, &event);
-       if (!ret)
+       if (!ret) {
+               cma_enable_remove(conn_id);
                 goto out;
+       }

         /* Destroy the CM ID by returning a non-zero value. */
         conn_id->cm_id.ib = NULL;
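
The effect described above can be illustrated with a toy model. This is only a sketch of the counting problem, not the CMA code: the toy_* names are hypothetical, the counter is a plain int rather than whatever the kernel uses, and it assumes the hold on conn_id is taken earlier in the request handler, as the patch description implies.

```c
/* Toy model of a connection ID whose dev_remove count blocks driver unload. */
struct toy_conn_id {
	int dev_remove; /* number of holds currently blocking device removal */
};

static void toy_disable_remove(struct toy_conn_id *id)
{
	id->dev_remove++;
}

static void toy_enable_remove(struct toy_conn_id *id)
{
	id->dev_remove--;
}

/*
 * handler_ret models the return value of the user's event handler:
 * 0 means the connection request was accepted.
 * patched selects the behaviour with or without the proposed fix.
 */
static int toy_req_handler(struct toy_conn_id *id, int handler_ret, int patched)
{
	toy_disable_remove(id); /* hold taken while the request is handled */

	if (handler_ret == 0) {
		/* Success path: without the fix the hold is never dropped. */
		if (patched)
			toy_enable_remove(id);
		return 0;
	}

	/* Failure path: the hold is dropped before the ID is destroyed. */
	toy_enable_remove(id);
	return handler_ret;
}

/* The driver can only be unloaded once no holds remain. */
static int toy_can_unload(const struct toy_conn_id *id)
{
	return id->dev_remove == 0;
}
```

In this model an accepted connection leaves dev_remove at 1 without the fix, so toy_can_unload() stays false, which matches the symptom reported for the passive side.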


From tziporet at mellanox.co.il  Tue Dec  4 01:09:15 2007
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Tue, 4 Dec 2007 11:09:15 +0200
Subject: [ofa-general] OFED QoS Support
In-Reply-To: <8b9320b60712032108x3b85e5ebuf2ff79aa8b8e7f12@mail.gmail.com>
References: <8b9320b60712032108x3b85e5ebuf2ff79aa8b8e7f12@mail.gmail.com>
Message-ID: <6C2C79E72C305246B504CBA17B5500C90282E475@mtlexch01.mtl.com>

QoS is supported only with ConnectX HCA, and not InfiniHost III Ex HCA.

Also, this feature will be enabled only from the coming FW release
(2.3.0, which should be published soon).
Note that it will not be enabled by default and a FW configuration
should be made to use it
 
Also a good explanation of QoS can be found in the QoS Support
<http://www.openfabrics.org/archives/nov2007sc/OFA_QoS_07.ppt>
presentation by Sean Hefty, Intel; Dror Goldenberg, Mellanox.

 
You must use OFED 1.3 to use QoS since it was not supported in OFED 1.2
Yevgeny will be able to help you regarding OSM setup to support QoS
 
Tziporet
 

________________________________

From: general-bounces at lists.openfabrics.org
[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Mukil
Kesavan
Sent: Tuesday, December 04, 2007 7:08 AM
To: general at lists.openfabrics.org
Subject: [ofa-general] OFED QoS Support


Hi,

I am a grad student at Georgia Institute of Technology working on
Infiniband for a project. My focus is on IB QoS support. I used the OFED
1.2 distribution - opensm to configure the SL-VL mappings and the VLArb
tables with the opensm.opts file in /var/cache/osm. Currently, I am
configuring QoS as 1 high priority VL with weight 200 and disabling
other high priority VLs. When I run the tests, there is no change in BW
Avgs as per the QoS setting and it appears that BW is shared evenly
among different applications during the tests.

I currently use a Mellanox MT25208 InfiniHost III Ex HCA - Firmware
version 4.6.0. Could someone help me getting the QoS to work? There are
not a lot of resources on the web and I've been following the email
exchanges amongst you guys involved in development but they aren't
conclusive to me.

Thanks for your time,

Mukil 


From tziporet at dev.mellanox.co.il  Tue Dec  4 01:34:55 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Tue, 04 Dec 2007 11:34:55 +0200
Subject: [ofa-general] OFED Dec 3 meeting summary on beta release status
Message-ID: <47551F3F.7050007@mellanox.co.il>

Meeting Summary:
1. We must get bugzilla fixed ASAP to track OFED 1.3 bugs - Jeff B.
2. RC1 is delayed to next week since there were no builds for almost a week
3. Beta release testing status:

    * Voltaire: See issues with:
          o TCP performance of ConnectX (30% worse than Arbel in UD mode)
            Note: I verified that interrupt moderation was not activated
            on ConnectX as it should be.  This should be fixed today in
            the daily build.
            Also LRO is not yet enabled - to be done by RC1.
          o Performance is harmed when working with IPoIB partitioning
            (different PKey)
          o iSCSI over IPoIB
    * Cisco:
          o Test so far only x86_64; RHEL4, RHEL5, SLES10
          o Focus on MPI testing: Intel MPI 3.1; HP MPI 2.5.1
          o Will test new compilers: Intel 10.1 and PGI 7.1
    * Intel:
          o Beta is working fine on 16 nodes cluster
          o Tests also small ia64 cluster - see problem with MVAPICH
            compilation
    * Qlogic:
          o See SDP issues
          o Mainly test basic verbs (libibverbs)
          o Will have a code update in 1-2 weeks
    * Mellanox:
          o Covers x86, x86_64, PPC, all OSes on the matrix
          o SDP is still broken - should be fixed by end of this week
          o Test Open MPI and MVAPICH
          o See issues with IPoIB performance - under work


Tziporet


From HNGUYEN at de.ibm.com  Tue Dec  4 02:04:54 2007
From: HNGUYEN at de.ibm.com (Hoang-Nam Nguyen)
Date: Tue, 4 Dec 2007 11:04:54 +0100
Subject: [ofa-general] Re: [ewg] OFED Dec 3 meeting summary on beta release
	status
In-Reply-To: <47551F3F.7050007@mellanox.co.il>
Message-ID: <OF0277F5F0.989C2C47-ONC12573A7.00364309-C12573A7.00375D22@de.ibm.com>

Hello Tziporet!
This is our test status:
* Tested ehca, ehca2 on SLES10 ppc64 and upstream kernel
* RHEL4.5, RHEL5.1 and other backport test will be next
* Build process works great (basic, custom, 32/64-bit libs)
Thanks
Nam

ewg-bounces at lists.openfabrics.org wrote on 04.12.2007 10:34:55:

> Meeting Summary:
> 1. We must get bugzilla fixed ASAP to track OFED 1.3 bugs - Jeff B.
> 2. RC1 is delayed to next week since there were no builds for almost a week
> 3. Beta release testing status:
>
>     * Voltaire: See issues with:
>           o TCP performance of ConnectX (30% worse than Arbel in UD mode)
>             Note: I verified that interrupt moderation was not activated
>             on ConnectX as it should be.  This should be fixed today in
>             the daily build.
>             Also LRO is not yet enabled - to be done by RC1.
>           o Performance is harmed when working with IPoIB partitioning
>             (different PKey)
>           o iSCSI over IPoIB
>     * Cisco:
>           o Test so far only x86_64; RHEL4, RHEL5, SLES10
>           o Focus on MPI testing: Intel MPI 3.1; HP MPI 2.5.1
>           o Will test new compilers: Intel 10.1 and PGI 7.1
>     * Intel:
>           o Beta is working fine on 16 nodes cluster
>           o Tests also small ia64 cluster - see problem with MVAPICH
>             compilation
>     * Qlogic:
>           o See SDP issues
>           o Mainly test basic verbs (libibverbs)
>           o Will have a code update in 1-2 weeks
>     * Mellanox:
>           o Covers x86, x86_64, PPC, all OSes on the matrix
>           o SDP is still broken - should be fixed by end of this week
>           o Test Open MPI and MVAPICH
>           o See issues with IPoIB performance - under work
>
>
> Tziporet
> _______________________________________________
> ewg mailing list
> ewg at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


From vlad at lists.openfabrics.org  Tue Dec  4 02:53:12 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Tue,  4 Dec 2007 02:53:12 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071204-0200 daily build status
Message-ID: <20071204105312.8686CE6000C@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.21.1
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.19
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.21.1
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-53.el5

Failed:


From hrosenstock at xsigo.com  Tue Dec  4 03:59:59 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Tue, 04 Dec 2007 03:59:59 -0800
Subject: [ofa-general] OFED QoS Support
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C90282E475@mtlexch01.mtl.com>
References: <8b9320b60712032108x3b85e5ebuf2ff79aa8b8e7f12@mail.gmail.com>
	<6C2C79E72C305246B504CBA17B5500C90282E475@mtlexch01.mtl.com>
Message-ID: <1196769599.30768.129.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-04 at 11:09 +0200, Tziporet Koren wrote:
> QoS is supported only with ConnectX HCA, and not InfiniHost III Ex HCA.
> 
> Also only from the coming FW release (2.3.0 that should be published
> soon) this feature will be enabled.

That's for Connect-X, right ? 

Does 5.2.0 support this for InfiniHost II Ex ? Is special FW config
needed there too to enable this ?

-- Hal

> Note that it will not be enabled by default and a FW configuration
> should be made to use it
>  
> Also a good explanation of QoS can be found in the QoS Support
> <http://www.openfabrics.org/archives/nov2007sc/OFA_QoS_07.ppt>
> presentation by Sean Hefty, Intel; Dror Goldenberg, Mellanox under:
> 
>  
> You must use OFED 1.3 to use QoS since it was not supported in OFED 1.2
> Yevgeny will be able to help you regarding OSM setup to support QoS
>  
> Tziporet
>  
> 
> ________________________________
> 
> From: general-bounces at lists.openfabrics.org
> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Mukil
> Kesavan
> Sent: Tuesday, December 04, 2007 7:08 AM
> To: general at lists.openfabrics.org
> Subject: [ofa-general] OFED QoS Support
> 
> 
> Hi,
> 
> I am a grad student at Georgia Institute of Technology working on
> Infiniband for a project. My focus is on IB QoS support. I used the OFED
> 1.2 distribution - opensm to configure the SL-VL mappings and the VLArb
> tables with the opensm.opts file in /var/cache/osm. Currently, I am
> configuring QoS as 1 high priority VL with weight 200 and disabling
> other high priority VLs. When I run the tests, there is no change in BW
> Avgs as per the QoS setting and it appears that BW is shared evenly
> among different applications during the tests.
> 
> I currently use a Mellanox MT25208 InfiniHost III Ex HCA - Firmware
> version 4.6.0. Could someone help me getting the QoS to work? There are
> not a lot of resources on the web and I've been following the email
> exchanges amongst you guys involved in development but they aren't
> conclusive to me.
> 
> Thanks for your time,
> 
> Mukil 
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


From ogerlitz at voltaire.com  Tue Dec  4 05:16:57 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Tue, 04 Dec 2007 15:16:57 +0200
Subject: [ofa-general] OFED QoS Support
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C90282E475@mtlexch01.mtl.com>
References: <8b9320b60712032108x3b85e5ebuf2ff79aa8b8e7f12@mail.gmail.com>
	<6C2C79E72C305246B504CBA17B5500C90282E475@mtlexch01.mtl.com>
Message-ID: <47555349.8050402@voltaire.com>

Tziporet Koren wrote:
> Also only from the coming FW release (2.3.0 that should be published
> soon) this feature will be enabled.
> Note that it will not be enabled by default and a FW configuration
> should be made to use it

What is the feature you are referring to? is it per VL 
queuing/arbitration at the port level or injection (rate) limitation per 
QP or something else?

Or.


From kliteyn at mellanox.co.il  Tue Dec  4 05:52:28 2007
From: kliteyn at mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 04 Dec 2007 15:52:28 +0200
Subject: [ofa-general] OFED QoS Support
In-Reply-To: <47555349.8050402@voltaire.com>
References: <8b9320b60712032108x3b85e5ebuf2ff79aa8b8e7f12@mail.gmail.com>	<6C2C79E72C305246B504CBA17B5500C90282E475@mtlexch01.mtl.com>
	<47555349.8050402@voltaire.com>
Message-ID: <47555B9C.3060403@mellanox.co.il>


Or Gerlitz wrote:
> Tziporet Koren wrote:
>> Also only from the coming FW release (2.3.0 that should be published
>> soon) this feature will be enabled.
>> Note that it will not be enabled by default and a FW configuration
>> should be made to use it
>
> What is the feature you are referring to? is it per VL 
> queuing/arbitration at the port level or injection (rate) limitation 
> per QP or something else?
>

It's VL arbitration - both arbitration itself and the VLArb tables 
configuration.

-- Yevgeny

> Or.
>
>
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit 
> http://openib.org/mailman/listinfo/openib-general
>


From kliteyn at dev.mellanox.co.il  Tue Dec  4 07:04:11 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 04 Dec 2007 17:04:11 +0200
Subject: [ofa-general] [PATCH 3/3 v2] opensm: Fixing broken logic in 'process
 world' part of LinkRecord processing
Message-ID: <47556C6B.4080900@dev.mellanox.co.il>

Fixing broken logic in 'process world' part of LinkRecord processing.
When both HCA's ports belong to the same subnet, OpenSM would scan
'half-world' for each port of this HCA, and then for each port it would
get the node and iterate again through all the ports of this node.
In addition to the time consumed by these unnecessary iterations, it
also caused some records to be found twice.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

---
 opensm/opensm/osm_sa_link_record.c |   37 ++++++++++++++++++++++-------------
 1 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
index 0497bcd..a6bdc8f 100644
--- a/opensm/opensm/osm_sa_link_record.c
+++ b/opensm/opensm/osm_sa_link_record.c
@@ -304,7 +304,8 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv,
 {
 	const osm_physp_t *p_src_physp;
 	const osm_physp_t *p_dest_physp;
-	const cl_qmap_t *p_port_tbl;
+	const cl_qmap_t *p_node_tbl;
+	osm_node_t * p_node;
 	uint8_t port_num;
 	uint8_t num_ports;
 	uint8_t dest_num_ports;
@@ -421,19 +422,27 @@ __osm_lr_rcv_get_port_links(IN osm_lr_rcv_t * const p_rcv,
 			/*
 			   Process the world (recurse once back into this function).
 			 */
-			p_port_tbl = &p_rcv->p_subn->port_guid_tbl;
-			p_src_port = (osm_port_t *) cl_qmap_head(p_port_tbl);
-
-			while (p_src_port !=
-			       (osm_port_t *) cl_qmap_end(p_port_tbl)) {
-				__osm_lr_rcv_get_port_links(p_rcv, p_lr,
-							    p_src_port, NULL,
-							    comp_mask, p_list,
-							    p_req_physp);
-
-				p_src_port =
-				    (osm_port_t *) cl_qmap_next(&p_src_port->
-								map_item);
+			p_node_tbl = &p_rcv->p_subn->node_guid_tbl;
+			p_node = (osm_node_t *)cl_qmap_head(p_node_tbl);
+
+			while (p_node != (osm_node_t *)cl_qmap_end(p_node_tbl)) {
+				/*
+				   Get only one port for each node.
+				   After the recursive call, this function will
+				   scan all the ports of this node anyway.
+				 */
+				p_src_physp = osm_node_get_any_physp_ptr(p_node);
+				if (osm_physp_is_valid(p_src_physp)) {
+					p_src_port = (osm_port_t *)
+					    cl_qmap_get(&p_rcv->p_subn->port_guid_tbl,
+					        osm_physp_get_port_guid(p_src_physp));
+					__osm_lr_rcv_get_port_links(p_rcv, p_lr,
+								    p_src_port, NULL,
+								    comp_mask, p_list,
+								    p_req_physp);
+				}
+				p_node = (osm_node_t *) cl_qmap_next(&p_node->
+								     map_item);
 			}
 		}
 	}
-- 
1.5.1.4
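
The duplicate records described in the patch comment can be reproduced with a small counting model. This is a sketch under simplifying assumptions, not OpenSM code: the toy_* names are hypothetical, every port is assumed to have its own entry in the port table, and each scanned port is assumed to yield exactly one record.

```c
/*
 * ports_per_node[i] is the number of ports node i has in the subnet.
 */

/*
 * Old walk: one recursive call per port-table entry, and each call
 * scans every port of the node that owns the entry.
 */
static int toy_records_port_walk(const int *ports_per_node, int num_nodes)
{
	int records = 0;
	int i, entry, p;

	for (i = 0; i < num_nodes; ++i)
		for (entry = 0; entry < ports_per_node[i]; ++entry)
			for (p = 0; p < ports_per_node[i]; ++p)
				++records;
	return records;
}

/*
 * New walk: one recursive call per node, started from any one valid
 * port, which then scans every port of that node exactly once.
 */
static int toy_records_node_walk(const int *ports_per_node, int num_nodes)
{
	int records = 0;
	int i, p;

	for (i = 0; i < num_nodes; ++i)
		for (p = 0; p < ports_per_node[i]; ++p)
			++records;
	return records;
}
```

With two dual-port HCAs and one single-port node, the per-port walk counts each dual-port node's links twice, while the per-node walk visits every port once.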


From eli at dev.mellanox.co.il  Tue Dec  4 07:15:41 2007
From: eli at dev.mellanox.co.il (Eli Cohen)
Date: Tue, 04 Dec 2007 17:15:41 +0200
Subject: [ofa-general] IPoIB CQ overrun
In-Reply-To: <20071204014021.GF10669@sgi.com>
References: <20071115202302.GK5448@sgi.com> <20071116232331.GK24803@sgi.com>
	<1195374443.2802.31.camel@mtls03>  <20071204014021.GF10669@sgi.com>
Message-ID: <1196781341.16214.3.camel@mtls03>


On Mon, 2007-12-03 at 17:40 -0800, akepner at sgi.com wrote:

> Looking at the contents of the CQ context table (right after the 
> overrun at 15:50:53), do the producer and consumer indices look 
> reasonable? I expected to find that producer_index + 1 == consumer_index.

I am not sure the value you get in the query for the consumer index
reflects exactly what the consumer actually polled. I am checking with
our FW team.
Were you able to reproduce this on a system with fewer nodes?


From eli at mellanox.co.il  Tue Dec  4 07:46:40 2007
From: eli at mellanox.co.il (Eli Cohen)
Date: Tue, 04 Dec 2007 17:46:40 +0200
Subject: [ofa-general] [PATCH] IPOIB: use LRO
Message-ID: <1196783200.16214.16.camel@mtls03>

IPOIB use LRO

Modify IPOIB to use LRO. Checksum offload is still required
to ensure reliability of the packets.

Signed-off-by: Eli Cohen <eli at mellanox.co.il>
---

TODO:
add checksum offload support to the core and hw devices.
add ethtool support to provide interface for statistics.


 drivers/infiniband/ulp/ipoib/Kconfig      |    1 +
 drivers/infiniband/ulp/ipoib/ipoib.h      |    8 +++++
 drivers/infiniband/ulp/ipoib/ipoib_ib.c   |    9 +++++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c |   47 +++++++++++++++++++++++++++++
 4 files changed, 64 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/Kconfig b/drivers/infiniband/ulp/ipoib/Kconfig
index 1f76bad..691525c 100644
--- a/drivers/infiniband/ulp/ipoib/Kconfig
+++ b/drivers/infiniband/ulp/ipoib/Kconfig
@@ -1,6 +1,7 @@
 config INFINIBAND_IPOIB
 	tristate "IP-over-InfiniBand"
 	depends on NETDEVICES && INET && (IPV6 || IPV6=n)
+	select INET_LRO
 	---help---
 	  Support for the IP-over-InfiniBand protocol (IPoIB). This
 	  transports IP packets over InfiniBand so you can use your IB
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index eb7edab..4621e93 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -52,6 +52,7 @@
 #include <rdma/ib_verbs.h>
 #include <rdma/ib_pack.h>
 #include <rdma/ib_sa.h>
+#include <linux/inet_lro.h>
 
 /* constants */
 
@@ -93,6 +94,9 @@ enum {
 	IPOIB_MCAST_FLAG_SENDONLY = 1,
 	IPOIB_MCAST_FLAG_BUSY 	  = 2,	/* joining or already joined */
 	IPOIB_MCAST_FLAG_ATTACHED = 3,
+
+	IPOIB_MAX_LRO_DESCRIPTORS = 8,
+	IPOIB_LRO_MAX_AGGR 	  = 64,
 };
 
 #define	IPOIB_OP_RECV   (1ul << 31)
@@ -313,6 +317,9 @@ struct ipoib_dev_priv {
 	struct dentry *mcg_dentry;
 	struct dentry *path_dentry;
 #endif
+
+	struct net_lro_mgr lro_mgr;
+	struct net_lro_desc lro_desc[IPOIB_MAX_LRO_DESCRIPTORS];
 };
 
 struct ipoib_ah {
@@ -622,6 +629,7 @@ extern struct ib_sa_client ipoib_sa_client;
 
 #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
 extern int ipoib_debug_level;
+extern int ipoib_use_lro;
 
 #define ipoib_dbg(priv, format, arg...)			\
 	do {					        \
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 5063dd5..07f30ad 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -231,7 +231,11 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
 	skb->dev = dev;
 	/* XXX get correct PACKET_ type here */
 	skb->pkt_type = PACKET_HOST;
-	netif_receive_skb(skb);
+
+	if (ipoib_use_lro)
+		lro_receive_skb(&priv->lro_mgr, skb, 0);
+	else
+		netif_receive_skb(skb);
 
 repost:
 	if (unlikely(ipoib_ib_post_receive(dev, wr_id)))
@@ -327,6 +331,9 @@ poll_more:
 			goto poll_more;
 	}
 
+	if (ipoib_use_lro)
+		lro_flush_all(&priv->lro_mgr);
+
 	return done;
 }
 
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index c9f6077..8623075 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -61,6 +61,11 @@ MODULE_PARM_DESC(send_queue_size, "Number of descriptors in send queue");
 module_param_named(recv_queue_size, ipoib_recvq_size, int, 0444);
 MODULE_PARM_DESC(recv_queue_size, "Number of descriptors in receive queue");
 
+
+int ipoib_use_lro __read_mostly = 0;
+module_param_named(ipoib_use_lro, ipoib_use_lro, int, 0644);
+MODULE_PARM_DESC(ipoib_use_lro, "Enable LRO if not equal 0");
+
 #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
 int ipoib_debug_level;
 
@@ -946,6 +951,46 @@ static const struct header_ops ipoib_header_ops = {
 	.create	= ipoib_hard_header,
 };
 
+static int get_skb_hdr(struct sk_buff *skb, void **iphdr,
+		       void **tcph, u64 *hdr_flags, void *priv)
+{
+	unsigned int ip_len;
+	struct iphdr *iph;
+
+	/* FIXME - verify CQE checksum ??? */
+
+	/* non tcp packet */
+	skb_reset_network_header(skb);
+	iph = ip_hdr(skb);
+	if (iph->protocol != IPPROTO_TCP)
+		return -1;
+
+	ip_len = ip_hdrlen(skb);
+	skb_set_transport_header(skb, ip_len);
+	*tcph = tcp_hdr(skb);
+
+	/* check if ip header and tcp header are complete */
+	if (ntohs(iph->tot_len) < ip_len + tcp_hdrlen(skb))
+		return -1;
+
+	*hdr_flags = LRO_IPV4 | LRO_TCP;
+	*iphdr = iph;
+
+	return 0;
+}
+
+static void ipoib_lro_setup(struct ipoib_dev_priv *priv)
+{
+	priv->lro_mgr.max_aggr = IPOIB_LRO_MAX_AGGR;
+	priv->lro_mgr.max_desc = IPOIB_MAX_LRO_DESCRIPTORS;
+	priv->lro_mgr.lro_arr = priv->lro_desc;
+	priv->lro_mgr.get_skb_header = get_skb_hdr;
+	priv->lro_mgr.features = LRO_F_NAPI;
+	priv->lro_mgr.dev = priv->dev;
+	priv->lro_mgr.ip_summed = CHECKSUM_UNNECESSARY;
+	priv->lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
+}
+
 static void ipoib_setup(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
@@ -985,6 +1030,8 @@ static void ipoib_setup(struct net_device *dev)
 
 	priv->dev = dev;
 
+	ipoib_lro_setup(priv);
+
 	spin_lock_init(&priv->lock);
 	spin_lock_init(&priv->tx_lock);
 
-- 
1.5.3.6
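
The header check that get_skb_hdr() performs before handing a packet to LRO can be tried out in user space. The following is only a sketch under stated assumptions: it works on a raw byte buffer instead of an sk_buff, and lro_candidate() is a hypothetical name that does not exist in the kernel. It accepts a packet only if it is IPv4, carries TCP, and its total length (big-endian on the wire) covers both headers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space analogue of get_skb_hdr(): returns 0 when buf
 * holds an IPv4/TCP packet whose IP and TCP headers are complete, else -1.
 */
static int lro_candidate(const uint8_t *buf, size_t len)
{
	size_t ip_len, tcp_len, tot_len;

	/* Need at least a minimal IPv4 header, and the version must be 4. */
	if (len < 20 || (buf[0] >> 4) != 4)
		return -1;

	/* IHL is counted in 32-bit words; protocol 6 is TCP. */
	ip_len = (size_t)(buf[0] & 0x0f) * 4;
	if (ip_len < 20 || buf[9] != 6)
		return -1;

	/* The fixed part of the TCP header must be inside the buffer. */
	if (len < ip_len + 20)
		return -1;

	/* The TCP data offset is the high nibble of byte 12, in 32-bit words. */
	tcp_len = (size_t)(buf[ip_len + 12] >> 4) * 4;

	/* tot_len is big-endian on the wire, so assemble it byte by byte. */
	tot_len = ((size_t)buf[2] << 8) | buf[3];

	if (tcp_len < 20 || tot_len < ip_len + tcp_len)
		return -1;

	return 0;
}
```

Reading the length byte by byte avoids any dependence on host byte order, which is the role ntohs() plays when the field is read through struct iphdr.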


From chrise at sgi.com  Tue Dec  4 07:50:07 2007
From: chrise at sgi.com (Chris Elmquist)
Date: Tue, 4 Dec 2007 09:50:07 -0600
Subject: [ofa-general] IPoIB CQ overrun
In-Reply-To: <1196781341.16214.3.camel@mtls03>
References: <20071115202302.GK5448@sgi.com> <20071116232331.GK24803@sgi.com>
	<1195374443.2802.31.camel@mtls03> <20071204014021.GF10669@sgi.com>
	<1196781341.16214.3.camel@mtls03>
Message-ID: <20071204155007.GA3083@sgi.com>

On Tuesday (12/04/2007 at 05:15PM +0200), Eli Cohen wrote:
> 
> On Mon, 2007-12-03 at 17:40 -0800, akepner at sgi.com wrote:
> 
> > Looking at the contents of the CQ context table (right after the 
> > overrun at 15:50:53), do the producer and consumer indices look 
> > reasonable? I expected to find that producer_index + 1 == consumer_index.
> 
> I am not sure the value you get in the query for the consumer index
> reflects exactly what the consumer actually polled. I am checking with
> our FW team.
> Were you able to reproduce this on a system with fewer nodes?

It seems to be very difficult to reproduce when the node count is
small.  I am not sure we have yet.  On the other hand, very large
node counts make it real obvious.

Chris
-- 
Chris Elmquist          mailto:chrise at sgi.com      (651)683-3093
                        Silicon Graphics, Inc.     Eagan, MN


From eli at dev.mellanox.co.il  Tue Dec  4 08:15:21 2007
From: eli at dev.mellanox.co.il (Eli Cohen)
Date: Tue, 04 Dec 2007 18:15:21 +0200
Subject: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <OFA7A18491.37BC7D4D-ON872573A6.007CFD7C-882573A6.004EB91D@us.ibm.com>
References: <OFA7A18491.37BC7D4D-ON872573A6.007CFD7C-882573A6.004EB91D@us.ibm.com>
Message-ID: <1196784921.16214.32.camel@mtls03>


On Mon, 2007-12-03 at 15:19 -0800, Shirley Ma wrote:
> Ethtool does support rx csum and tx csum:
> #define ETHTOOL_GRXCSUM         0x00000014 /* Get RX hw csum enable
> (ethtool_value) */
> #define ETHTOOL_SRXCSUM         0x00000015 /* Set RX hw csum enable
> (ethtool_value) */
> #define ETHTOOL_GTXCSUM         0x00000016 /* Get TX hw csum enable
> (ethtool_value) */
> #define ETHTOOL_STXCSUM         0x00000017 /* Set TX hw csum enable
> (ethtool_value) */
> 
I believe this configuration relates to checksum generation/validation
of the device and not to this specific feature.


> That's something we discussed a long time ago when we wanted GSO to
> avoid an extra copy by using ICRC to enable the SG feature. I remember
> Roland rejected this idea since there could be potential data
> corruption. And even if we do prove that ICRC is 100% accurate, then
> we should have some code here to limit the IP destination to the IB
> subnet when using ICRC. Otherwise, if the packets are routed out to an
> Ethernet IP subnet, these packets will be dropped.
> 

I agree that we should not let a node that is configured to route
packets work in this mode.
But what if we use this rule:

if a node is configured, by the root user, for IP forwarding, then that root
user should make sure not to enable this feature.

Does this make sense?


From akepner at sgi.com  Tue Dec  4 08:36:09 2007
From: akepner at sgi.com (akepner at sgi.com)
Date: Tue, 4 Dec 2007 08:36:09 -0800
Subject: [ofa-general] IPoIB CQ overrun
In-Reply-To: <aday7cbhx0n.fsf@cisco.com>
References: <20071115202302.GK5448@sgi.com> <adazlx9wndb.fsf@cisco.com>
	<20071204020104.GG10669@sgi.com> <aday7cbhx0n.fsf@cisco.com>
Message-ID: <20071204163609.GI10669@sgi.com>

On Mon, Dec 03, 2007 at 09:03:20PM -0800, Roland Dreier wrote:
> ....
> What kind of hardware was this on again?  It's x86-64, right?  But is
> there anything out of the ordinary about these systems?
> 

Yes, these are x86_64. 2 quad-core CPUs, 2 MT25204 (DDR). Nothing 
unusual.

-- 
Arthur


From dotanb at dev.mellanox.co.il  Tue Dec  4 09:08:34 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Tue, 04 Dec 2007 19:08:34 +0200
Subject: [ofa-general] Question:  Verbs API Error code recover
In-Reply-To: <4754A46C.1030808@hermes-microvision.com>
References: <4750A321.5080406@hermes-microvision.com>
	<47527C85.8050801@dev.mellanox.co.il>
	<4754A46C.1030808@hermes-microvision.com>
Message-ID: <47558992.5020906@dev.mellanox.co.il>

Wei Fang wrote:
> Hi, Dotan:
>
> When I got that error, I quit my program and used the ib_rdma_bw program to
> test the InfiniBand link. It still fails like this:
>
> ib_rdma_bw 10.8.6.3
> 19068: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | 
> iters=1000 | duplex=0 | cma=0 |
> 19068: Local address:  LID 0x01, QPN 0x2e0404, PSN 0xc39344 RKey 
> 0x4c003101 VAddr 0x00002a958bc000
> 19068: Remote address: LID 0x3d9, QPN 0x140404, PSN 0x77012a, RKey 
> 0x74003100 VAddr 0x00002a958bc000
>
> 19068:main: Completion with error at client:
> 19068:main: Failed status 12: wr_id 3
> 19068:main: scnt=100, ccnt=0
This means that the remote QP didn't respond (or didn't send the
response in time).
Can you try running ibv_rc_pingpong between the two sides and check the
result?
What is the output of ibv_devinfo on both sides?
(Maybe something bad happened to the link.)

thanks
Dotan
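
For reference, "Failed status 12" in the output quoted above is a raw work-completion status code. The sketch below maps such codes to names, assuming the numbering of enum ibv_wc_status in libibverbs' <infiniband/verbs.h>; wc_status_name() is a hypothetical helper written for this note, and a program linked against libibverbs can call ibv_wc_status_str() instead.

```c
#include <stddef.h>

/* Names in the order assumed for enum ibv_wc_status (0 = success). */
static const char *const wc_status_names[] = {
	"IBV_WC_SUCCESS",
	"IBV_WC_LOC_LEN_ERR",
	"IBV_WC_LOC_QP_OP_ERR",
	"IBV_WC_LOC_EEC_OP_ERR",
	"IBV_WC_LOC_PROT_ERR",
	"IBV_WC_WR_FLUSH_ERR",
	"IBV_WC_MW_BIND_ERR",
	"IBV_WC_BAD_RESP_ERR",
	"IBV_WC_LOC_ACCESS_ERR",
	"IBV_WC_REM_INV_REQ_ERR",
	"IBV_WC_REM_ACCESS_ERR",
	"IBV_WC_REM_OP_ERR",
	"IBV_WC_RETRY_EXC_ERR",
	"IBV_WC_RNR_RETRY_EXC_ERR",
};

/* Hypothetical helper: returns the name for a status code, or "unknown". */
static const char *wc_status_name(int status)
{
	size_t n = sizeof(wc_status_names) / sizeof(wc_status_names[0]);

	if (status < 0 || (size_t)status >= n)
		return "unknown";
	return wc_status_names[status];
}
```

Under that assumed numbering, status 12 is the transport retry counter being exceeded, which fits a remote side that never acknowledged the request.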


From sashak at voltaire.com  Tue Dec  4 09:39:00 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 4 Dec 2007 17:39:00 +0000
Subject: [ofa-general] Re: [PATCH 3/3 v2] opensm: Fixing broken logic in
	'process world' part of LinkRecord processing
In-Reply-To: <47556C6B.4080900@dev.mellanox.co.il>
References: <47556C6B.4080900@dev.mellanox.co.il>
Message-ID: <20071204173900.GB20470@sashak.voltaire.com>

On 17:04 Tue 04 Dec     , Yevgeny Kliteynik wrote:
> Fixing broken logic in 'process world' part of LinkRecord processing.
> When both HCA's ports belong to the same subnet, OpenSM would scan
> 'half-world' for each port of this HCA, and then for each port it would
> get the node and iterate again through all the ports of this node.
> In addition to the time consumed by these unnecessary iterations, it
> also caused some records to be found twice.
> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied. Thanks.

Sasha


From rdreier at cisco.com  Tue Dec  4 09:31:49 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 04 Dec 2007 09:31:49 -0800
Subject: [ofa-general] [PATCH v9] IB/mlx4: shrinking WQE
In-Reply-To: <200712041032.39020.jackm@dev.mellanox.co.il> (Jack Morgenstein's
	message of "Tue, 4 Dec 2007 10:32:38 +0200")
References: <200712041032.39020.jackm@dev.mellanox.co.il>
Message-ID: <ada8x4aicxm.fsf@cisco.com>

I've been meaning to say this for a while, so sorry for not commenting
sooner, but I think this would be much easier to review as two
patches: one that just uses vmap to make all queues virtually
contiguous, and one that actually enables using wqe bbs to post
smaller wqes.

 - R.


From sean.hefty at intel.com  Tue Dec  4 09:46:50 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Tue, 4 Dec 2007 09:46:50 -0800
Subject: [ofa-general] RE: [PATCH]  CMA: Enable conn_id remove
In-Reply-To: <475518FB.1080501@dev.mellanox.co.il>
References: <475518FB.1080501@dev.mellanox.co.il>
Message-ID: <000001c8369d$a60cef10$3c98070a@amr.corp.intel.com>

>I have the following issue: The IB driver can't be unloaded after running
>applications over RDS.

Do you mean the HCA driver?  Are there still active connections when the driver
is unloaded?

>I saw that the 'dev_remove' counter does not reach 0 value on the passive side
>(after connection establishment).
>
>Please review the following patch:
>
>CMA: Enable conn_id remove on the passive side after
>connection establishment.
>
>Signed-off-by: Vladimir Sokolovsky <vlad at mellanox.co.il>

This looks correct to me, but I'd like to understand why we don't see problems
more often.

- Sean

>---
>diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>index 0751697..656d6df 100644
>--- a/drivers/infiniband/core/cma.c
>+++ b/drivers/infiniband/core/cma.c
>@@ -1122,8 +1122,10 @@ static int cma_req_handler(struct ib_cm_id *cm_id,
>struct ib_cm_event *ib_event)
>         cm_id->cm_handler = cma_ib_handler;
>
>         ret = conn_id->id.event_handler(&conn_id->id, &event);
>-       if (!ret)
>+       if (!ret) {
>+               cma_enable_remove(conn_id);
>                 goto out;
>+       }
>
>         /* Destroy the CM ID by returning a non-zero value. */
>         conn_id->cm_id.ib = NULL;


From wei.fang at hermes-microvision.com  Tue Dec  4 10:05:03 2007
From: wei.fang at hermes-microvision.com (Wei Fang)
Date: Tue, 04 Dec 2007 10:05:03 -0800
Subject: [ofa-general] Question:  Verbs API Error code recover
In-Reply-To: <47558992.5020906@dev.mellanox.co.il>
References: <4750A321.5080406@hermes-microvision.com>
	<47527C85.8050801@dev.mellanox.co.il>
	<4754A46C.1030808@hermes-microvision.com>
	<47558992.5020906@dev.mellanox.co.il>
Message-ID: <475596CF.4010804@hermes-microvision.com>

Hi, Dotan:

I found that this issue happens with kernel 2.6.9-22 and is related to
opensm.  When it happens, every test fails. If I pull out the
InfiniBand cable and reconnect it, opensm does not respond.  When I stop
the opensm service and restart it, the InfiniBand link recovers.  I have
not seen this issue with kernel 2.6.20 or 2.6.22.

Dotan Barak wrote:
> Wei Fang wrote:
>> Hi, Dotan:
>>
>> When I got that error, I quit my program and used the ib_rdma_bw program
>> to test the InfiniBand link. It still fails like this:
>>
>> ib_rdma_bw 10.8.6.3
>> 19068: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | 
>> iters=1000 | duplex=0 | cma=0 |
>> 19068: Local address:  LID 0x01, QPN 0x2e0404, PSN 0xc39344 RKey 
>> 0x4c003101 VAddr 0x00002a958bc000
>> 19068: Remote address: LID 0x3d9, QPN 0x140404, PSN 0x77012a, RKey 
>> 0x74003100 VAddr 0x00002a958bc000
>>
>> 19068:main: Completion with error at client:
>> 19068:main: Failed status 12: wr_id 3
>> 19068:main: scnt=100, ccnt=0
> This means that the remote QP didn't respond (or didn't send the
> response in time).
> Can you try running ibv_rc_pingpong between the two sides and check
> the result?
> What is the output of ibv_devinfo on both sides?
> (Maybe something bad happened to the link.)
>
> thanks
> Dotan
>
>

-- 
Best Regards

Wei Fang

Hermes Microvision Inc.

(Tel)       (408)597-8600
(Fax)       (408)597-8601
(Direct Tel)(408)597-8646

============================================
The information contained in this document is confidential and may be
legally privileged. It is intended solely for the use of the addressee and
others authorized to receive it. If you are not the intended recipient you
are hereby notified that any disclosure, copying, distribution or any action
taken or omitted in reliance on it is strictly prohibited and may be
unlawful.
============================================


From dotanb at dev.mellanox.co.il  Tue Dec  4 10:45:21 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Tue, 04 Dec 2007 20:45:21 +0200
Subject: [ofa-general] Question:  Verbs API Error code recover
In-Reply-To: <475596CF.4010804@hermes-microvision.com>
References: <4750A321.5080406@hermes-microvision.com>
	<47527C85.8050801@dev.mellanox.co.il>
	<4754A46C.1030808@hermes-microvision.com>
	<47558992.5020906@dev.mellanox.co.il>
	<475596CF.4010804@hermes-microvision.com>
Message-ID: <4755A041.6090801@dev.mellanox.co.il>

I'm trying to gather some data in order to reproduce this in our lab
(we didn't encounter this behavior in our regression)

Which Linux distribution do you use?
Do you have error messages in the /var/log/messages?
Can you execute perfquery and check if there are errors on the link?


thanks
Dotan


Wei Fang wrote:
> Hi, Dotan:
>
> I found that this issue happens with kernel 2.6.9-22 and is related to
> opensm.  When it happens, every test fails. If I pull out the
> InfiniBand cable and reconnect it, opensm does not respond.  When I
> stop the opensm service and restart it, the InfiniBand link recovers.
> I have not seen this issue with kernel 2.6.20 or 2.6.22.
>
> Dotan Barak wrote:
>> Wei Fang wrote:
>>> Hi, Dotan:
>>>
>>> When I got that error, I quit my program and used the ib_rdma_bw program
>>> to test the InfiniBand link. It still fails like this:
>>>
>>> ib_rdma_bw 10.8.6.3
>>> 19068: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | 
>>> iters=1000 | duplex=0 | cma=0 |
>>> 19068: Local address:  LID 0x01, QPN 0x2e0404, PSN 0xc39344 RKey 
>>> 0x4c003101 VAddr 0x00002a958bc000
>>> 19068: Remote address: LID 0x3d9, QPN 0x140404, PSN 0x77012a, RKey 
>>> 0x74003100 VAddr 0x00002a958bc000
>>>
>>> 19068:main: Completion with error at client:
>>> 19068:main: Failed status 12: wr_id 3
>>> 19068:main: scnt=100, ccnt=0
>> This means that the remote QP didn't respond (or didn't send the
>> response in time).
>> Can you try running ibv_rc_pingpong between the two sides and check
>> the result?
>> What is the output of ibv_devinfo on both sides?
>> (Maybe something bad happened to the link.)
>>
>> thanks
>> Dotan
>>
>>
>


From rdreier at cisco.com  Tue Dec  4 11:17:30 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 04 Dec 2007 11:17:30 -0800
Subject: [ofa-general] [PATCH 0/6] nes: Cosmetic changes;
	support virtual WQs and PPC
In-Reply-To: <ada4pezjsya.fsf@cisco.com> (Roland Dreier's message of "Mon,
	03 Dec 2007 14:48:13 -0800")
References: <20071114221453.3ADD5E609F0@openfabrics.org>
	<adalk8jnsty.fsf@cisco.com>
	<5E701717F2B2ED4EA60F87C8AA57B7CC07A57572@venom2>
	<ada4pezjsya.fsf@cisco.com>
Message-ID: <adair3eclrp.fsf@cisco.com>

OK, I just pushed out a few more small cleanups (running unifdef,
fixing signedness warnings, and fixing a locking bug on an error
path).  One question: what is the point of the monkeying with
SPIN_BUG_ON in nes.c?

 - R.


From rdreier at cisco.com  Tue Dec  4 11:21:06 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 04 Dec 2007 11:21:06 -0800
Subject: [ofa-general] [PATCH 5/5] nes: napi interface fix
In-Reply-To: <000301c83381$dd3fc1c0$ff0da8c0@amr.corp.intel.com> (Sean Hefty's
	message of "Fri, 30 Nov 2007 10:50:23 -0800")
References: <20071130183705.8A819E601CE@openfabrics.org>
	<000301c83381$dd3fc1c0$ff0da8c0@amr.corp.intel.com>
Message-ID: <adaeje2cllp.fsf@cisco.com>

 > > #ifdef NES_NAPI
 > 
 > Is #ifdef napi sprinkled throughout the code common for most drivers?  Is there
 > a better way to handle this?  (Is this OFED only for backports, or for
 > upstream?)

Is there any reason why we want the upstream kernel to have both NAPI
and non-NAPI support?  If so, then this should probably be settable
through Kconfig rather than having to edit the Makefile to change the
NES_NAPI define.  However, what almost always seems to happen is that
no one uses the non-default code and it ends up bitrotting to the
point of not compiling.  So I would strongly suggest just having the
NAPI code and getting rid of the NES_NAPI tests entirely.  Is there
any reason not to do that?

 - R.


From Jeffrey.C.Becker at nasa.gov  Tue Dec  4 11:41:24 2007
From: Jeffrey.C.Becker at nasa.gov (Jeff Becker)
Date: Tue, 04 Dec 2007 11:41:24 -0800
Subject: [ofa-general] Bugzilla is back!
Message-ID: <4755AD64.50407@nasa.gov>

Hi all. I created a self-signed certificate, so after you accept it (it
claims to be from me), you should be able to access OFA bugzilla without
further "expired certificate" hassles.

Off to fix the wiki now....

-jeff


From ardavis at ichips.intel.com  Tue Dec  4 11:40:17 2007
From: ardavis at ichips.intel.com (Arlin Davis)
Date: Tue, 04 Dec 2007 11:40:17 -0800
Subject: [ofa-general] uDAPL EVD queue length issue
In-Reply-To: <20071203224550.GF11990@opengridcomputing.com>
References: <20071203224550.GF11990@opengridcomputing.com>
Message-ID: <4755AD21.4080001@ichips.intel.com>

Jon Mason wrote:
> While working on OMPI udapl btl, I have noticed some "interesting"
> behavior.  OFA udapl wants the evd queues to be a power of 2 and
> then will subtract 1 for book keeping (ie, so that internal head and
> tail pointers never touch except when the ring is empty).  OFA udapl
> will report the queue length as this number (and not the original
> size requested) when queried.  This becomes interesting when a power
> of 2 is passed in and then queried.  For example, a requested queue
> of length 256 will report a length of 255 when queried.  

Something is not right. You should ALWAYS get at least what you request. 
On my system with an mthca, a request of 256 gets you 511. It is the 
verbs provider that is rounding up, not uDAPL.

Here is my uDAPL debug output (DAPL_DBG_TYPE=0xffff) using dtest:

  cq_object_create: (0x519bb0,0x519d00)
dapls_ib_cq_alloc: evd 0x519bb0 cqlen=256
dapls_ib_cq_alloc: new_cq 0x519d60 cqlen=511

This is before and after the ibv_create_cq call. uDAPL builds its EVD 
resources based on what is returned from this call.

I modified dtest to double check the dat_evd_query and I get the same:

8962 dto_rcv_evd created 0x519e80
8962 dto_req_evd QLEN - requested 256 and actual 511

What OFED release and device are you using?

-arlin
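
The two behaviours discussed in this thread (a ring that keeps one slot unused, and a provider that rounds the request up) can be put side by side in a short sketch. It is an illustration only: ring_usable_entries(), struct ring and the helpers are hypothetical names, and the rounding rule is an assumption chosen to match the 256 -> 511 result shown in the debug output above, not code taken from uDAPL or a verbs provider.

```c
/*
 * Assumed rounding rule: take the smallest power of two strictly greater
 * than the request, then keep one slot unused so that head == tail can
 * only mean "empty".
 */
static unsigned int ring_usable_entries(unsigned int requested)
{
	unsigned int nent = 1;

	/* Guard against overflow of the shift below. */
	if (requested >= 0x80000000u)
		return 0;

	while (nent <= requested)
		nent <<= 1;

	return nent - 1;
}

/* Hypothetical ring whose size is mask + 1, a power of two. */
struct ring {
	unsigned int head;	/* next slot to write */
	unsigned int tail;	/* next slot to read */
	unsigned int mask;	/* size - 1 */
};

static int ring_empty(const struct ring *r)
{
	return r->head == r->tail;
}

/* Full when advancing head would make it collide with tail. */
static int ring_full(const struct ring *r)
{
	return ((r->head + 1) & r->mask) == r->tail;
}
```

With this rule a request is never shortchanged: asking for 256 yields 511 usable entries, and asking for 255 yields exactly 255.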


From ggrundstrom at NetEffect.com  Tue Dec  4 11:56:24 2007
From: ggrundstrom at NetEffect.com (Glenn Grundstrom)
Date: Tue, 4 Dec 2007 13:56:24 -0600
Subject: [ofa-general] [PATCH 0/6] nes: Cosmetic changes;
	support virtual WQs and PPC
In-Reply-To: <adair3eclrp.fsf@cisco.com>
References: <20071114221453.3ADD5E609F0@openfabrics.org><adalk8jnsty.fsf@cisco.com><5E701717F2B2ED4EA60F87C8AA57B7CC07A57572@venom2><ada4pezjsya.fsf@cisco.com>
	<adair3eclrp.fsf@cisco.com>
Message-ID: <5E701717F2B2ED4EA60F87C8AA57B7CC07AF661D@venom2>

> 
> OK, I just pushed out a few more small cleanups (running unifdef,
> fixing signedness warnings, and fixing a locking bug on an error
> path).  One question: what is the point of the monkeying with
> SPIN_BUG_ON in nes.c?
>

Probably just some leftover debugging.  I can remove it.

Thanks for letting me know that you've pushed content
to your branch.  I'll pick it up.  Btw, I'll be preparing
another set of patches that should be ready very soon.

Glenn.
 
>  - R.
> 


From jim at mellanox.com  Tue Dec  4 12:16:47 2007
From: jim at mellanox.com (Jim Mott)
Date: Tue, 4 Dec 2007 12:16:47 -0800
Subject: [ofa-general] [PATCH 1/1 V2] SDP - Various bzcopy fixes
Message-ID: <F57121538EA0C94F86018DDD40ADA1D19C1D5B@mtiexch01.mti.com>

The Mellanox regression tests posted a number of failures when
multiple threads were accessing the same sockets concurrently.  In
addition to test failures, there were log messages of the form:
  sdp_sock(54386:19002): Could not reap -5 in-flight sends

This fix handles all these failures and errors.

The V2 is a fix to handle 2.6.22+ kernels where sk_buffs have
changed.

Signed-off-by: Jim Mott <jim at mellanox.com>
---

Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp.h
===================================================================
--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp.h	2007-12-04 12:08:38.000000000 -0600
+++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp.h	2007-12-04 12:08:59.000000000 -0600
@@ -173,7 +173,6 @@
 
 	/* BZCOPY data */
 	int   zcopy_thresh;
-	void *zcopy_context;
 
 	struct ib_sge ibsge[SDP_MAX_SEND_SKB_FRAGS + 1];
 	struct ib_wc  ibwc[SDP_NUM_WC];
Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c
===================================================================
--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp_bcopy.c	2007-12-04 12:08:38.000000000 -0600
+++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c	2007-12-04 12:55:45.000000000 -0600
@@ -218,6 +218,7 @@
 	struct ib_device *dev;
 	struct sdp_buf *tx_req;
 	struct sk_buff *skb;
+	struct bzcopy_state *bz;
 	int i, frags;
 
 	if (unlikely(mseq != ssk->tx_tail)) {
@@ -242,16 +243,9 @@
 	++ssk->tx_tail;
 
 	/* TODO: AIO and real zcopy cdoe; add their context support here */
-	if (ssk->zcopy_context && skb->data_len) {
-		struct bzcopy_state *bz;
-		struct sdp_bsdh *h;
-
-		h = (struct sdp_bsdh *)skb->data;
-		if (h->mid == SDP_MID_DATA) {
-			bz = (struct bzcopy_state *)ssk->zcopy_context;
-			bz->busy--;
-		}
-	}
+	bz = *(struct bzcopy_state **)skb->cb;
+	if (bz)
+		bz->busy--;
 
 	return skb;
 }
@@ -751,12 +745,8 @@
 		sdp_post_recvs(ssk);
 		sdp_post_sends(ssk, 0);
 
-		if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) {
-			if (ssk->zcopy_context)
-				sdp_bzcopy_write_space(ssk);
-			else
-				sk_stream_write_space(&ssk->isk.sk);
-		}
+		if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
+			sk_stream_write_space(&ssk->isk.sk);
 	}
 
 	return ret;
Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_main.c
===================================================================
--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp_main.c	2007-12-04 12:08:38.000000000 -0600
+++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_main.c	2007-12-04 12:54:34.000000000 -0600
@@ -1203,10 +1203,24 @@
 
 static inline struct bzcopy_state *sdp_bz_cleanup(struct bzcopy_state *bz)
 {
-	int i;
+	int i, max_retry;
 	struct sdp_sock *ssk = (struct sdp_sock *)bz->ssk;
 
-	ssk->zcopy_context = NULL;
+	/* Wait for in-flight sends; should be quick */
+	if (bz->busy) {
+		struct sock *sk = &ssk->isk.sk;
+
+		for (max_retry = 0; max_retry < 10000; max_retry++) {
+			poll_send_cq(sk);
+
+			if (!bz->busy)
+				break;
+		}
+
+		if (bz->busy)
+			sdp_warn(sk, "Could not reap %d in-flight sends\n",
+				 bz->busy);
+	}
 
 	if (bz->pages) {
 		for (i = bz->cur_page; i < bz->page_cnt; i++)
@@ -1280,14 +1294,14 @@
 	}
 
 	up_write(&current->mm->mmap_sem);
-	ssk->zcopy_context = bz;
 
 	return bz;
 
 out_2:
 	up_write(&current->mm->mmap_sem);
+	kfree(bz->pages);
 out_1:
-	sdp_bz_cleanup(bz);
+	kfree(bz);
 
 	return NULL;
 }
@@ -1461,19 +1475,17 @@
 };
 
 /* like sk_stream_memory_free - except measures remote credits */
-static inline int sdp_bzcopy_slots_avail(struct sdp_sock *ssk)
+static inline int sdp_bzcopy_slots_avail(struct sdp_sock *ssk,
+					 struct bzcopy_state *bz)
 {
-	struct bzcopy_state *bz = (struct bzcopy_state *)ssk->zcopy_context;
-
-	BUG_ON(!bz);
 	return slots_free(ssk) > bz->busy;
 }
 
 /* like sk_stream_wait_memory - except waits on remote credits */
-static int sdp_bzcopy_wait_memory(struct sdp_sock *ssk, long *timeo_p)
+static int sdp_bzcopy_wait_memory(struct sdp_sock *ssk, long *timeo_p,
+				  struct bzcopy_state *bz)
 {
 	struct sock *sk = &ssk->isk.sk;
-	struct bzcopy_state *bz = (struct bzcopy_state *)ssk->zcopy_context;
 	int err = 0;
 	long vm_wait = 0;
 	long current_timeo = *timeo_p;
@@ -1481,7 +1493,7 @@
 
 	BUG_ON(!bz);
 
-	if (sdp_bzcopy_slots_avail(ssk))
+	if (sdp_bzcopy_slots_avail(ssk, bz))
 		current_timeo = vm_wait = (net_random() % (HZ / 5)) + 2;
 
 	while (1) {
@@ -1506,13 +1518,13 @@
 
 		clear_bit(SOCK_ASYNC_NOSPACE, &sk->sk_socket->flags);
 
-		if (sdp_bzcopy_slots_avail(ssk))
+		if (sdp_bzcopy_slots_avail(ssk, bz))
 			break;
 
 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
 		sk->sk_write_pending++;
 		sk_wait_event(sk, &current_timeo,
-			sdp_bzcopy_slots_avail(ssk) && vm_wait);
+			sdp_bzcopy_slots_avail(ssk, bz) && vm_wait);
 		sk->sk_write_pending--;
 
 		if (vm_wait) {
@@ -1603,7 +1615,8 @@
 			skb = sk->sk_write_queue.prev;
 
 			if (!sk->sk_send_head ||
-			    (copy = size_goal - skb->len) <= 0) {
+			    (copy = size_goal - skb->len) <= 0 ||
+			    bz != *(struct bzcopy_state **)skb->cb) {
 
 new_segment:
 				/*
@@ -1614,7 +1627,7 @@
 				 * receive credits.
 				 */
 				if (bz) {
-					if (!sdp_bzcopy_slots_avail(ssk))
+					if (!sdp_bzcopy_slots_avail(ssk, bz))
 						goto wait_for_sndbuf;
 				} else {
 					if (!sk_stream_memory_free(sk))
@@ -1626,6 +1639,8 @@
 				if (!skb)
 					goto wait_for_memory;
 
+				*((struct bzcopy_state **)skb->cb) = bz;
+
 				/*
 				 * Check whether we can use HW checksum.
 				 */
@@ -1691,7 +1706,7 @@
 			if (copied)
 				sdp_push(sk, ssk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
 
-			err = (bz) ? sdp_bzcopy_wait_memory(ssk, &timeo) :
+			err = (bz) ? sdp_bzcopy_wait_memory(ssk, &timeo, bz) :
 				     sk_stream_wait_memory(sk, &timeo);
 			if (err)
 				goto do_error;
@@ -1704,24 +1719,10 @@
 out:
 	if (copied) {
 		sdp_push(sk, ssk, flags, mss_now, ssk->nonagle);
-		if (bz) {
-			int max_retry;
-
-			/* Wait for in-flight sends; should be quick */
-			for (max_retry = 0; max_retry < 10000; max_retry++) {
-				if (!bz->busy)
-					break;
-
-				poll_send_cq(sk);
-			}
-
-			if (bz->busy)
-				sdp_warn(sk,
-					 "Could not reap %d in-flight sends\n",
-					 bz->busy);
 
+		if (bz)
 			bz = sdp_bz_cleanup(bz);
-		} else
+		else
 			if (size > send_poll_thresh)
 				poll_send_cq(sk);
 	}
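
The core idea of the patch above, recording the owning bzcopy context in each buffer instead of in one per-socket pointer, can be shown outside the kernel. This is a sketch under assumptions: struct tx_buf stands in for sk_buff with its cb[] scratch area, struct bz_state for bzcopy_state, and all function names are hypothetical. In particular, incrementing busy at post time is a simplification for the example and is not taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for struct bzcopy_state: counts sends still in flight. */
struct bz_state {
	int busy;
};

/* Stand-in for struct sk_buff: only the control block is modelled. */
struct tx_buf {
	char cb[48];
};

/* Store the owning context pointer at the start of the control block. */
static void tx_buf_set_owner(struct tx_buf *b, struct bz_state *bz)
{
	memcpy(b->cb, &bz, sizeof(bz));
}

/* Read the owning context pointer back; NULL means "no bzcopy owner". */
static struct bz_state *tx_buf_owner(const struct tx_buf *b)
{
	struct bz_state *bz;

	memcpy(&bz, b->cb, sizeof(bz));
	return bz;
}

/* Tag the buffer with its owner and account for one in-flight send. */
static void tx_buf_post(struct tx_buf *b, struct bz_state *bz)
{
	tx_buf_set_owner(b, bz);
	if (bz)
		bz->busy++;
}

/* On completion, credit exactly the context that posted this buffer. */
static void tx_buf_complete(struct tx_buf *b)
{
	struct bz_state *bz = tx_buf_owner(b);

	if (bz)
		bz->busy--;
}
```

Because each completion finds its owner through the buffer itself, two threads sending on the same socket with different contexts cannot decrement each other's counters, which is the kind of failure a negative in-flight count such as "-5" points to.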


From weiny2 at llnl.gov  Tue Dec  4 13:30:26 2007
From: weiny2 at llnl.gov (Ira Weiny)
Date: Tue, 4 Dec 2007 13:30:26 -0800
Subject: [ofa-general] OFED 1.2.5.4 is ready on the ofa server
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C90282E46F@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C90282E46F@mtlexch01.mtl.com>
Message-ID: <20071204133026.48cfece1.weiny2@llnl.gov>

It looks like there is something corrupt with the tarball.

13:29:50 > tar xzf OFED-1.2.5.4.tgz 

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now


Ira


On Tue, 4 Dec 2007 10:34:48 +0200
"Tziporet Koren" <tziporet at mellanox.co.il> wrote:

> OFED-1.2.5.4 is ready: 
> http://www.openfabrics.org/downloads/OFED/ofed-1.2.5/OFED-1.2.5.4.tgz
> 
> Changes since OFED 1.2.5 
> ======================== 
> - RDS: 
>   - Performance enhancements 
>   - GA for Oracle 11 
> - IPoIB: 
>   - Use NAPI by default 
>   - For small received packets, allocate a new, smaller SKB to relieve
>     accounting on the socket. 
> - mlx4: 
>   - Enable changing default max HCA resource limits using module
>     options. 
>   - Support opening more resources than the default by increasing the
>     command timeout for INIT_HCA to 10 seconds 
> - PPC64 support: 
>   - Fixed compilation problems on SLES10 SP1 
> 
> Changes from OFED 1.2.5.3: 
> ========================== 
> - Low level drivers update:
>   - cxgb3: Pull in latest fixes.
>   - ipath: Pull in latest fixes.
> - OSes support:
>   - Added support for SLES9 SP4 (no QA was done)
>   - Added support for RHEL5 up1 (no QA was done)
> - IPOIB:
>   - Removed the usage of unsignalled QP in Tx due to deadlock.
> - RDS:
>   - Relax the header consistency check on fragment reassembly
> 
> 
> Tziporet & Vlad 
> 
> 
> 
> 
> Tziporet Koren
> Software Director
> Mellanox Technologies
> mailto: tziporet at mellanox.co.il
> Tel +972-4-9097200, ext 380
> 
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> 


From or.gerlitz at gmail.com  Tue Dec  4 13:41:47 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Tue, 4 Dec 2007 23:41:47 +0200
Subject: [ofa-general] RE: [PATCH] CMA: Enable conn_id remove
In-Reply-To: <000001c8369d$a60cef10$3c98070a@amr.corp.intel.com>
References: <475518FB.1080501@dev.mellanox.co.il>
	<000001c8369d$a60cef10$3c98070a@amr.corp.intel.com>
Message-ID: <15ddcffd0712041341x738769c9i74de69d38b14c8a3@mail.gmail.com>

On 12/4/07, Sean Hefty <sean.hefty at intel.com> wrote:
> >I saw that the 'dev_remove' counter does not reach 0 value on the passive side
> >(after connection establishment).

> >CMA: Enable conn_id remove on the passive side after
> >connection establishment.

> This looks correct to me, but I'd like to understand why we don't see problems
> more often.

Sean,

Can you elaborate more on the problem here? It sounds like a general
one, so I don't see the relation to RDS. Why doesn't it happen all the
time with all the ULPs?

Or.


From Jeffrey.C.Becker at nasa.gov  Tue Dec  4 13:58:49 2007
From: Jeffrey.C.Becker at nasa.gov (Jeff Becker)
Date: Tue, 04 Dec 2007 13:58:49 -0800
Subject: [ofa-general] Wiki is back!
Message-ID: <4755CD99.10801@nasa.gov>

-jeff


From changquing.tang at hp.com  Tue Dec  4 15:36:34 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Tue, 4 Dec 2007 23:36:34 +0000
Subject: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
Message-ID: <D89C2C212795564B837FA1665CAE02990FDE00969C@G5W0278.americas.hpqcorp.net>

Here is an issue we have:

struct ibv_context {
        struct ibv_device      *device;
        struct ibv_context_ops  ops;
        int                     cmd_fd;
        int                     async_fd;
        int                     num_comp_vectors;
        pthread_mutex_t         mutex;
        void                   *abi_compat;
};

The binary is compiled with the OFED 1.2 header files. When it tries to set async_fd to non-blocking, I get the error
"Bad file descriptor". If I compile the binary with the OFED-1.3-beta header files (with the XRC changes), it works fine.
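
For reference, putting a descriptor such as async_fd into non-blocking
mode is normally done with fcntl(). The sketch below is a minimal
illustration; the helper name set_nonblocking is hypothetical and not
part of libibverbs. If the value passed in is not an open descriptor,
fcntl() fails with EBADF, which strerror() reports as "Bad file
descriptor".

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch a file descriptor to non-blocking mode.
 * Returns 0 on success, or -1 with errno set by fcntl(). */
static int set_nonblocking(int fd)
{
        int flags = fcntl(fd, F_GETFL);

        if (flags < 0)
                return -1;      /* e.g. EBADF if fd is not an open descriptor */

        return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}
```

With a real verbs context this would be called as
set_nonblocking(ctx->async_fd).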


Is this the expected behavior, or will there be a fix?


Thanks.
--CQ Tang


________________________________
From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Tziporet Koren
Sent: Thursday, November 22, 2007 9:46 AM
To: ewg at lists.openfabrics.org
Cc: general at lists.openfabrics.org
Subject: [ofa-general] OFED 1.3 Beta release is available


Hi,

OFED 1.3 Beta release is available on
http://www.openfabrics.org/downloads/OFED/ofed-1.3/OFED-1.3-beta2.tgz
To get BUILD_ID run ofed_info

Please report any issues in bugzilla https://bugs.openfabrics.org/

The RC1 release is expected on December 5

Tziporet & Vlad

========================================================================

Release information:
--------------------
OS support:
Novell:
    - SLES10
    - SLES10 SP1 and up1
Redhat:
    - Redhat EL4 up4 and up5
    - Redhat EL5 and up1
kernel.org:
    - 2.6.23 and 2.6.24-rc2

Systems:
    * x86_64
    * x86
    * ia64
    * ppc64*

Main Changes from OFED 1.3-alpha
================================

 *   Kernel code based on 2.6.24-rc2
 *   New packages:
    *   SRP target
    *   qperf test from Qlogic
    *   ibsim package
    *   uDAPL 2.0 library (1.0 & 2.0 coexist)
 *   New OSes Support:
    *   RHEL 5 up1
    *   SLES10 SP1 up1
 *   Compilation issues resolved:
    *   Open MPI compilation on SLES10 SP1
    *   ibutils compiles on SLES10 PPC64 (64 bits)
    *   Apply patches that fix warning of backport patches
    *   Prefix is now supported properly
 *   RDS implementation for API version 2 was updated from the 1.2.5 branch
 *   Fix binary compatibility of libibverbs caused by XRC implementation
 *   Uninstall is now working properly
 *   ib-bonding update to release 19
 *   MPI packages update:
    *   mvapich-1.0.0-1625.src.rpm
    *   mvapich2-1.0.1-1.src.rpm
    *   openmpi-1.2.4-1.src.rpm

Mlx4 driver specific changes:

 *   Enable changing the default of HCA resource limits with module parameters
 *   Default number of maximum QPs is now 128K (was 64K)
 *   Fixing max_cqe's (not adding an extra cqe)
 *   Fix state check in mlx4_qp_modify
 *   Sanity check userspace send queue sizes
 *   Several bug fixes in XRC


Tasks that should be completed for the beta release:
====================================================
1. 32-bit libraries to be supported on SLES10 SP1 Update1.
2. Fix SDP stability issues
3. IPoIB performance improvements for small messages
4. Fix bugs

From wei.fang at hermes-microvision.com  Tue Dec  4 15:51:42 2007
From: wei.fang at hermes-microvision.com (Wei Fang)
Date: Tue, 04 Dec 2007 15:51:42 -0800
Subject: [ofa-general] Question:  Verbs API Error code recover
In-Reply-To: <4755A041.6090801@dev.mellanox.co.il>
References: <4750A321.5080406@hermes-microvision.com>
	<47527C85.8050801@dev.mellanox.co.il>
	<4754A46C.1030808@hermes-microvision.com>
	<47558992.5020906@dev.mellanox.co.il>
	<475596CF.4010804@hermes-microvision.com>
	<4755A041.6090801@dev.mellanox.co.il>
Message-ID: <4755E80E.9070609@hermes-microvision.com>


Dotan Barak wrote:
> I'm trying to gather some data in order to reproduce this in our lab
> (we didn't encounter this behavior in our regression)
>
> Which Linux distribution do you use?
We use the RedHat 5.0 distribution; the kernel version is 2.6.9-22.
> Do you have error messages in the /var/log/messages?
I did not find any error messages there.
> Can you execute perfquery and check if there are errors on the link?
>
The next time the error happens, I will use perfquery to check. In our
application, there are 8 computers linked to an Infiniband switch.
When this issue happens, all the Infiniband links are lost. Previously
we ran the opensm service on every computer. We have now turned off
the opensm service on 7 of the computers and left it running on just
one. Today we will continue to test this issue.
>
> thanks
> Dotan
>
>
> Wei Fang wrote:
>> Hi, Dotan:
>>
>> I found that this issue happens with kernel 2.6.9-22 and is related to
>> opensm.  When it happens, every test fails. If I pull out the
>> Infiniband cable and reconnect it, opensm does not respond.  When I
>> stop the opensm service and restart it, the Infiniband link recovers.
>> I have not seen this issue with kernel 2.6.20 or 2.6.22.
>>
>> Dotan Barak wrote:
>>> Wei Fang wrote:
>>>> Hi, Dotan:
>>>>
>>>> When I got that error, I quit my program and used the ib_rdma_bw
>>>> program to test the Infiniband link. It still fails like this:
>>>>
>>>> ib_rdma_bw 10.8.6.3
>>>> 19068: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | 
>>>> iters=1000 | duplex=0 | cma=0 |
>>>> 19068: Local address:  LID 0x01, QPN 0x2e0404, PSN 0xc39344 RKey 
>>>> 0x4c003101 VAddr 0x00002a958bc000
>>>> 19068: Remote address: LID 0x3d9, QPN 0x140404, PSN 0x77012a, RKey 
>>>> 0x74003100 VAddr 0x00002a958bc000
>>>>
>>>> 19068:main: Completion with error at client:
>>>> 19068:main: Failed status 12: wr_id 3
>>>> 19068:main: scnt=100, ccnt=0
>>> This means that the remote QP didn't respond (or didn't send the
>>> response in time).
>>> can you try to execute ibv_rc_pingpong between the sides and check 
>>> what is the status?
>>> what is the output of ibv_devinfo in both sides?
>>> (maybe something bad happened to the link)
>>>
>>> thanks
>>> Dotan
>>>
>>>
>>
>
>

-- 
Best Regards

Wei Fang

Hermes Microvision Inc.

(Tel)       (408)597-8600
(Fax)       (408)597-8601
(Direct Tel)(408)597-8646



From rdreier at cisco.com  Tue Dec  4 16:18:18 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 04 Dec 2007 16:18:18 -0800
Subject: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FDE00969C@G5W0278.americas.hpqcorp.net>
	(Changqing Tang's message of "Tue, 4 Dec 2007 23:36:34 +0000")
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE00969C@G5W0278.americas.hpqcorp.net>
Message-ID: <adahciydmet.fsf@cisco.com>

 > Here is an issue we have:
 > 
 > struct ibv_context {
 >         struct ibv_device      *device;
 >         struct ibv_context_ops  ops;
 >         int                     cmd_fd;
 >         int                     async_fd;
 >         int                     num_comp_vectors;
 >         pthread_mutex_t         mutex;
 >         void                   *abi_compat;
 > };
 > 
 > The binary is compiled with OFED 1.2 header files,  it tries to set async_fd to non-blocking, I get error:
 > Bad file descriptor.   If I compile the binary with OFED-1.3-beta header files (with XRC changes), it works fine.
 > 
 > Is this the expected behavior, or there will be a fix ?

Unfortunately the XRC patches were put into OFED 1.3 before they went
into the upstream libibverbs tree, so I have not reviewed them in
detail.  If XRC support requires an ABI change, then we'll have to
create a new ABI and provide versioned symbols for backwards compatibility.

However your problem seems quite strange: I don't see any change 
to struct ibv_context caused by the XRC patches.  So I don't
understand exactly what is causing the problem you see.  Can you debug
further to see which structure layout change is the real issue?

 - R.


From rdreier at cisco.com  Tue Dec  4 16:40:03 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 04 Dec 2007 16:40:03 -0800
Subject: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <adahciydmet.fsf@cisco.com> (Roland Dreier's message of "Tue,
	04 Dec 2007 16:18:18 -0800")
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE00969C@G5W0278.americas.hpqcorp.net>
	<adahciydmet.fsf@cisco.com>
Message-ID: <adad4tmdlek.fsf@cisco.com>

BTW, sifting through the OFED 1.3 libibverbs tree, I do see that the
commit to add max_xrc_domains to struct ibv_device_attr did break
things by adding the member in the middle of the structure (so that an
app compiled against the old header will see bogus values for
local_ca_ack_delay and phys_port_count).

Actually looking at the commit again, it's worse than that... anything
compiled against the old header that calls ibv_query_device() may get
memory corrupted, because the new ibv_query_device() writes to a
bigger structure than the app passes in.

The perils of not reviewing properly I guess...

 - R.


From rdreier at cisco.com  Tue Dec  4 16:41:17 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 04 Dec 2007 16:41:17 -0800
Subject: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <adad4tmdlek.fsf@cisco.com> (Roland Dreier's message of "Tue,
	04 Dec 2007 16:40:03 -0800")
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE00969C@G5W0278.americas.hpqcorp.net>
	<adahciydmet.fsf@cisco.com> <adad4tmdlek.fsf@cisco.com>
Message-ID: <ada8x4adlci.fsf@cisco.com>

oops, sorry... I see that the very next OFED 1.3 commit reverted that
change, so things aren't as bad as I thought.

Never mind.

 - R.


From jon at opengridcomputing.com  Tue Dec  4 16:57:15 2007
From: jon at opengridcomputing.com (Jon Mason)
Date: Tue, 4 Dec 2007 18:57:15 -0600
Subject: [ofa-general] uDAPL EVD queue length issue
In-Reply-To: <4755AD21.4080001@ichips.intel.com>
References: <20071203224550.GF11990@opengridcomputing.com>
	<4755AD21.4080001@ichips.intel.com>
Message-ID: <20071205005711.GE17358@opengridcomputing.com>

On Tue, Dec 04, 2007 at 11:40:17AM -0800, Arlin Davis wrote:
> Jon Mason wrote:
>> While working on OMPI udapl btl, I have noticed some "interesting"
>> behavior.  OFA udapl wants the evd queues to be a power of 2 and
>> then will subtract 1 for book keeping (ie, so that internal head and
>> tail pointers never touch except when the ring is empty).  OFA udapl
>> will report the queue length as this number (and not the original
>> size requested) when queried.  This becomes interesting when a power
>> of 2 is passed in and then queried.  For example, a requested queue
>> of length 256 will report a length of 255 when queried.  
>
> Something is not right. You should ALWAYS get at least what you request. On 
> my system with an mthca, a request of 256 gets you 511. It is the verbs 
> provider that is rounding up, not uDAPL.
>
> Here is my uDAPL debug output (DAPL_DBG_TYPE=0xffff) using dtest:
>
>  cq_object_create: (0x519bb0,0x519d00)
> dapls_ib_cq_alloc: evd 0x519bb0 cqlen=256
> dapls_ib_cq_alloc: new_cq 0x519d60 cqlen=511
>
> This is before and after the ibv_create_cq call. uDAPL builds its EVD
> resources based on what is returned from this call.
>
> I modified dtest to double check the dat_evd_query and I get the same:
>
> 8962 dto_rcv_evd created 0x519e80
> 8962 dto_req_evd QLEN - requested 256 and actual 511
>
> What OFED release and device are you using?

I'm running OFED 1.2.5 and using Chelsio.

The behavior of the iwch_create_cq in
drivers/infiniband/hw/cxgb3/iwch_provider.c is to allocate the amount
given (rounded to the power of 2).  So this function will give 256 if
256 is requested, but uDAPL will consume one of those for book keeping
and thus only have 255.

To clarify: should the provider take uDAPL's bookkeeping into account
and round up to the next power of 2 when given a power-of-2 size?  I'm
probably being thick, but why doesn't uDAPL increase the requested
size by one before passing the request to the provider (or is this the
documented behavior of the function, to which the provider should
conform)?
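
The arithmetic behind the two reported lengths can be sketched as
follows. This is only an illustration of the two possible orderings
(round, then reserve a slot; or reserve a slot, then round); the
function names are hypothetical and this is not the actual uDAPL or
provider code.

```c
#include <assert.h>

/* Round n up to the next power of two (1 <= n <= 2^31). */
static unsigned int round_up_pow2(unsigned int n)
{
        unsigned int p = 1;

        assert(n >= 1 && n <= (1u << 31));
        while (p < n)
                p <<= 1;
        return p;
}

/* Round the request first, then give up one slot so that head == tail
 * only when the ring is empty. */
static unsigned int usable_round_then_reserve(unsigned int requested)
{
        return round_up_pow2(requested) - 1;
}

/* Account for the bookkeeping slot first, then round. */
static unsigned int usable_reserve_then_round(unsigned int requested)
{
        return round_up_pow2(requested + 1) - 1;
}
```

For a request of 256, the first ordering yields 255 usable entries and
the second yields 511, which matches the two figures reported in this
thread.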

Thanks,
Jon

>
> -arlin
>
>
>
>


From kononov at dls.net  Tue Dec  4 18:06:34 2007
From: kononov at dls.net (Roman Kononov)
Date: Tue, 04 Dec 2007 20:06:34 -0600
Subject: [ofa-general] Bogus Receive Completions
Message-ID: <475607AA.301@dls.net>

Hello all,

I am seeing weird behavior from libibverbs + libmthca, which makes me
suspicious about either libmthca or the HCA firmware.

The source of the test code is attached.

I have a pair of processes talking to each other. They are very similar to
ibv_rc_pingpong. The difference is that my processes issue
ibv_post_send(IBV_WR_RDMA_WRITE_WITH_IMM), and they try to keep several
(namely 4) outstanding Receive and Send Work Requests.

All Send Work Requests are sequentially numbered. The number is placed into
the wr_id and imm_data fields. When the process receives a Send Work
Completion, wr_id is checked for consistency with the sent numbers, and the
next Send Work Request is posted (RDMA Write with IMM, 32 bytes of
out-of-line data). So far so good.

All Receive Work Requests are sequentially numbered as well. The number is
placed into the wr_id field. When the process gets a Receive Work
Completion, it checks both the wr_id and imm_data for consistency with the
expected numbers, and posts the next Receive Work Request. The consistency
test eventually fails (after a few hundred thousand iterations - a few
seconds). The Completion status is "success", wr_id is out of order,
imm_data is in order. Despite inconsistency, the process still tries to post
the next Receive Work Request, which fails as if the Receive Queue were full
(I modified libmthca's mthca_tavor_post_recv() to return distinct error
codes). All subsequent Receive Work Completions fail the consistency test
and ibv_post_recv() fail in the same manner. Then everything stops waiting
for Work Completions inside ibv_get_cq_event().

I believe that, in this test, wr_id from Receive Work Completions must
arrive in order, but they do not.

I am sure that "queue overflow" failures of ibv_post_recv() are illegal
because I keep the queue no more than half-full.

The test fails with libibverbs-1.0.5 (and older), libmthca-1.0.4 (and older).

~>uname -a
Linux node02 2.6.23.9 #1 SMP PREEMPT Fri Nov 30 21:23:11 CST 2007 x86_64
x86_64 x86_64 GNU/Linux

~>grep 'model name' /proc/cpuinfo
model name      : Dual Core AMD Opteron(tm) Processor 285
model name      : Dual Core AMD Opteron(tm) Processor 285

~>ibv_devinfo
hca_id: mthca0
          fw_ver:                         4.8.200
          node_guid:                      0002:c902:0024:42f4
          sys_image_guid:                 0002:c902:0024:42f7
          vendor_id:                      0x02c9
          vendor_part_id:                 25208
          hw_ver:                         0xA0
          board_id:                       MT_0330140002
          phys_port_cnt:                  2

The attached source code evolved from a huge application in the attempt to
reduce the code to a reasonable size, so it looks weird.  Run "gcc -O2
flaw.c -o flaw -lpthread -libverbs" to compile it. On one end run "sudo
flaw", on the other end start "sudo flaw <hostname>", where hostname is the
name of the first end. It fails much sooner if you start another pair of
processes.

This is a typical output of the program:

~>./flaw
device mthca0
completion thread runs
QP connected
55871
147789
241006
330033
421304
509184
595437
682410
779035
872444
964561
1051060
1138062
1224279
1311327
wr_id=00152d78, recv_wr_id_rsp=00152d76, imm.seqn=00002d76
wr_id=00152d79, recv_wr_id_rsp=00152d77, imm.seqn=00002d77
ibv_post_recv() failed, code=-2
wr_id=00152d79, recv_wr_id_rsp=00152d78, imm.seqn=00002d78
ibv_post_recv() failed, code=-2
wr_id=00152d79, recv_wr_id_rsp=00152d79, imm.seqn=00002d79
ibv_post_recv() failed, code=-2
wr_id=00152d7a, recv_wr_id_rsp=00152d7a, imm.seqn=00002d7a
ibv_post_recv() failed, code=-2

The several rows of numbers are the iteration counter printed once per
second. In this case it made at least 1311327 successful iterations.

The iteration #1387894 (0x152d76) failed. In a Receive Work Completion,
wr_id was 00152d78, while 00152d76 was expected. The imm_data received was
good (it must be the 16 LSBs of recv_wr_id_rsp). The Completion caused the
next Receive Work Request to be successfully posted (since no error is printed).
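
The consistency test described above can be sketched roughly as
follows. The attached flaw.c is not reproduced here, so the function
and enum names are hypothetical; the only facts taken from the report
are that wr_id must equal the expected sequence number and that the
immediate data must equal its 16 least significant bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result flags for the check below. */
enum check_result {
        CHECK_OK        = 0,
        CHECK_BAD_WR_ID = 1,
        CHECK_BAD_IMM   = 2
};

/* Compare one Receive Work Completion against the expected sequence
 * number.  Returns CHECK_OK or a bitwise OR of the failure flags. */
static int check_recv_completion(uint64_t wr_id, uint32_t imm_seqn,
                                 uint64_t expected)
{
        int result = CHECK_OK;

        if (wr_id != expected)
                result |= CHECK_BAD_WR_ID;
        if (imm_seqn != (uint32_t)(expected & 0xffff))
                result |= CHECK_BAD_IMM;
        return result;
}
```

Applied to the first failing line of the output, wr_id 0x152d78 with
expected 0x152d76 and imm 0x2d76 flags only the wr_id.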

The next iteration #1387895 (0x152d77) failed in a similar fashion. The
Completion, in an attempt to post the next Receive Work Request, got "work
queue overflow" error, which is impossible because the queue size is 8.

All subsequent iterations failed similarly.

The peer process displays no errors.

I am new to libibverbs and it is possible that I am misusing it.

Thank you,

Roman Kononov

-------------- next part --------------
A non-text attachment was scrubbed...
Name: flaw.c
Type: text/x-csrc
Size: 16687 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071204/cff19ca6/attachment.c>

From changquing.tang at hp.com  Tue Dec  4 20:39:47 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Wed, 5 Dec 2007 04:39:47 +0000
Subject: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <adahciydmet.fsf@cisco.com>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE00969C@G5W0278.americas.hpqcorp.net>
	<adahciydmet.fsf@cisco.com>
Message-ID: <D89C2C212795564B837FA1665CAE02990FDE048AEF@G5W0278.americas.hpqcorp.net>


I think the problem is that sizeof "struct ibv_context_ops" has changed, so the new driver returns a
bigger "struct ibv_context", while an app compiled with the older header file has a smaller
"struct ibv_context" and uses the old offsets to find the fields after "ops".
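
The effect can be shown with toy structures. These are stand-ins, not
the real libibverbs definitions: the member names a, b and xrc and the
struct names are hypothetical, and the only point is that growing an
embedded struct moves every field that follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy ops tables: the "new" one gained a third function pointer. */
struct ops_old { void (*a)(void); void (*b)(void); };
struct ops_new { void (*a)(void); void (*b)(void); void (*xrc)(void); };

/* Toy contexts embedding the ops table by value. */
struct ctx_old { void *device; struct ops_old ops; int cmd_fd; int async_fd; };
struct ctx_new { void *device; struct ops_new ops; int cmd_fd; int async_fd; };

/* What an application built against the old layout reads as async_fd
 * when it is handed a context laid out according to the new one. */
static int async_fd_seen_by_old_app(const struct ctx_new *ctx)
{
        const char *base = (const char *)ctx;
        int value;

        assert(ctx != NULL);
        memcpy(&value, base + offsetof(struct ctx_old, async_fd),
               sizeof(value));
        return value;
}
```

The old offset lands inside the enlarged ops table or on cmd_fd,
depending on the platform, so the value passed to fcntl() is not the
real async_fd.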

--CQ


> -----Original Message-----
> From: Roland Dreier [mailto:rdreier at cisco.com]
> Sent: Tuesday, December 04, 2007 6:18 PM
> To: Tang, Changqing
> Cc: Tziporet Koren; ewg at lists.openfabrics.org;
> general at lists.openfabrics.org
> Subject: Re: [ofa-general] OFED 1.3 Beta release is available
>
>  > Here is an issue we have:
>  >
>  > struct ibv_context {
>  >         struct ibv_device      *device;
>  >         struct ibv_context_ops  ops;
>  >         int                     cmd_fd;
>  >         int                     async_fd;
>  >         int                     num_comp_vectors;
>  >         pthread_mutex_t         mutex;
>  >         void                   *abi_compat;
>  > };
>  >
>  > The binary is compiled with OFED 1.2 header files,  it
> tries to set async_fd to non-blocking, I get error:
>  > Bad file descriptor.   If I compile the binary with
> OFED-1.3-beta header files (with XRC changes), it works fine.
>  >
>  > Is this the expected behavior, or there will be a fix ?
>
> Unfortunately the XRC patches were put into OFED 1.3 before
> they went into the upstream libibverbs tree, so I have not
> reviewed them in detail.  If XRC support requires an ABI
> change, then we'll have to create a new ABI and provide
> versioned symbols for backwards compatibility.
>
> However your problem seems quite strange: I don't see any
> change to struct ibv_context caused by the XRC patches.  So I
> don't understand exactly what is causing the problem you see.
>  Can you debug further to see which structure layout change
> is the real issue?
>
>  - R.
>


From kliteyn at mellanox.co.il  Tue Dec  4 21:16:09 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 5 Dec 2007 07:16:09 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-05:normal completion
Message-ID: <MTLEXCH01CpU6q5ijb100009943@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-04
OpenSM git rev = Tue_Dec_4_17:04:11_2007 [39ef735bb44d14463e34d23e05091ddabcd2e0a3]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=520  Fail=0
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From rdreier at cisco.com  Tue Dec  4 21:24:33 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 04 Dec 2007 21:24:33 -0800
Subject: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FDE048AEF@G5W0278.americas.hpqcorp.net>
	(Changqing Tang's message of "Wed, 5 Dec 2007 04:39:47 +0000")
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE00969C@G5W0278.americas.hpqcorp.net>
	<adahciydmet.fsf@cisco.com>
	<D89C2C212795564B837FA1665CAE02990FDE048AEF@G5W0278.americas.hpqcorp.net>
Message-ID: <ada4pexemsu.fsf@cisco.com>

 > I think the problem is that sizeof "struct ibv_context_ops" has
 > changed, so the new driver returns a big "struct ibv_context", app
 > compiled with older header file has a smaller "struct ibv_context"
 > and use the old offset to find fields after "ops".

Oh crud, you're obviously right.  For some reason I kept missing that
when I looked over the code.

I think the only alternative we have to preserve backwards
compatibility is to leave struct ibv_context_ops alone and change the
structure to:

struct ibv_context {
        struct ibv_device      *device;
        struct ibv_context_ops  ops;
        int                     cmd_fd;
        int                     async_fd;
        int                     num_comp_vectors;
        pthread_mutex_t         mutex;
        void                   *abi_compat;
        struct ibv_xrc_op      *xrc_ops;
};

with xrc_ops added at the end.  It's my fault for not making the ops
member a pointer I guess.
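
A toy check of why appending at the end is safe, again with stand-in
structures rather than the real libibverbs ones (the names ctx_v1,
ctx_v2 and open_domain are hypothetical). It assumes, as with the real
library, that the context is always allocated by the library and that
old applications never touch the new member.

```c
#include <assert.h>
#include <stddef.h>

struct ops     { void (*a)(void); void (*b)(void); };  /* left unchanged */
struct xrc_ops { void (*open_domain)(void); };         /* new, separate table */

struct ctx_v1 {
        void       *device;
        struct ops  ops;
        int         cmd_fd;
        int         async_fd;
        void       *abi_compat;
};

struct ctx_v2 {
        void           *device;
        struct ops      ops;
        int             cmd_fd;
        int             async_fd;
        void           *abi_compat;
        struct xrc_ops *xrc_ops;        /* appended last */
};

/* Returns 1 if every member known to v1 sits at the same offset in v2. */
static int old_offsets_preserved(void)
{
        return offsetof(struct ctx_v1, device)     == offsetof(struct ctx_v2, device) &&
               offsetof(struct ctx_v1, ops)        == offsetof(struct ctx_v2, ops) &&
               offsetof(struct ctx_v1, cmd_fd)     == offsetof(struct ctx_v2, cmd_fd) &&
               offsetof(struct ctx_v1, async_fd)   == offsetof(struct ctx_v2, async_fd) &&
               offsetof(struct ctx_v1, abi_compat) == offsetof(struct ctx_v2, abi_compat);
}
```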

Tziporet/Jack/whoever -- please fix up the libibverbs you ship for
OFED 1.3 to resolve this.

We can clean this up for libibverbs 1.2 when the ABI can change,
if/when we have something worth breaking the ABI for.

 - R.


From rdreier at cisco.com  Tue Dec  4 22:20:19 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 04 Dec 2007 22:20:19 -0800
Subject: [ofa-general] Bogus Receive Completions
In-Reply-To: <475607AA.301@dls.net> (Roman Kononov's message of "Tue,
	04 Dec 2007 20:06:34 -0600")
References: <475607AA.301@dls.net>
Message-ID: <adazlwpd5ng.fsf@cisco.com>

Thanks for the excellent bug report!  With the test case to reproduce
this, resolving the issue should be pretty quick.

 > I have weird behavior of libibverbs + libmthca, which makes me suspicious
 > about either libmthca or the HCA firmware.

I was able to reproduce this here with libibverbs 1.1.1 and libmthca
1.0.4, but only on HCAs running in non-mem-free mode, which makes me
think it must be a firmware issue.  In fact, I get the failure you see
almost instantly on a system with HCA running FW 4.8.917, but if I
take the exact same system and just update the HCA to FW 5.2.917 (so
the HCA runs in mem-free mode) then the test runs for a long time with
no problem (up to iteration 150327477 so far).

I added a little debugging patch to src/cq.c in libmthca, and I found
that when the failure happened, the CQE had a WQE address that was out
of sequence -- the RQ has size 0x200 with 0x20 byte WQEs, and the CQEs
had WQE address 0x100 then WQE address 0x0; or address 0x0 then 0x140;
or even 0x80 twice in a row.
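
The in-sequence test implied here can be written down as a small
sketch. It assumes that 0x200 is the receive queue size in bytes and
that receive completions on this queue arrive in posting order; the
macro and function names are hypothetical and this is not the
debugging patch itself.

```c
#include <assert.h>
#include <stdint.h>

#define RQ_BYTES  0x200u        /* receive queue size (assumed to be bytes) */
#define WQE_BYTES 0x20u         /* size of one receive WQE */

/* Address of the WQE expected to complete after the one at prev. */
static uint32_t next_wqe_addr(uint32_t prev)
{
        assert(prev < RQ_BYTES && prev % WQE_BYTES == 0);
        return (prev + WQE_BYTES) % RQ_BYTES;
}

/* Returns 1 if cur is the WQE that should follow prev, 0 otherwise. */
static int wqe_in_sequence(uint32_t prev, uint32_t cur)
{
        return cur == next_wqe_addr(prev);
}
```

All three observed pairs (0x100 then 0x0, 0x0 then 0x140, 0x80 then
0x80) fail this test.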

Mellanox: can you take this test case and see if it is indeed a
firmware issue?  I could believe that there is a bug in libmthca's
mthca_tavor_post_recv() function too...

BTW, here are a few comments about things I had to fix to run the test
case:

 > 	memset(&attr1,0,sizeof(attr1));

I needed to add "#include <string.h>" to get a prototype for memset()...

 > 	hints.ai_family=AF_UNSPEC;
 ...
 > 	struct sockaddr remote_saddr;
 > 	socklen_t remote_saddrlen=sizeof(remote_saddr);
 > 	int hsock=accept(ssock,&remote_saddr,&remote_saddrlen);
 > 	close(ssock);
 > 	assert(hsock>=0);
 > 	assert(remote_saddrlen==sizeof(remote_saddr));

On my system at least, using AF_UNSPEC led to accept() returning an
IPv6 socket address, and actually sizeof (struct sockaddr_in6) is 28,
which is bigger than sizeof (struct sockaddr), so this last assert
failed for me.  I fixed this by setting hints.ai_family to AF_INET.

 > 	printf("accepted connection from %i.%i.%i.%i\n",remote_saddr.sa_data[2],remote_saddr.sa_data[3],remote_saddr.sa_data[4],remote_saddr.sa_data[5]);

It didn't cause anything but a cosmetic issue, but on my system at
least, sa_data is an array of signed char, so if any of the octets in
the remote address are >= 128, they printed out as negative numbers.  I
fixed this by adding casts to uint8_t here.
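
A sketch of that cast, mirroring the test program's use of sa_data
(where, for an AF_INET address on Linux, bytes 2 to 5 hold the IPv4
address). The helper name format_ipv4 is hypothetical; portable code
would more likely cast to struct sockaddr_in and call inet_ntop().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>

/* Format the IPv4 address held in sa_data[2..5] as dotted decimal.
 * The uint8_t casts keep octets of 128 and above from being
 * sign-extended to negative ints when char is signed. */
static int format_ipv4(const struct sockaddr *sa, char *buf, size_t len)
{
        assert(sa != NULL && buf != NULL);
        return snprintf(buf, len, "%u.%u.%u.%u",
                        (unsigned int)(uint8_t)sa->sa_data[2],
                        (unsigned int)(uint8_t)sa->sa_data[3],
                        (unsigned int)(uint8_t)sa->sa_data[4],
                        (unsigned int)(uint8_t)sa->sa_data[5]);
}
```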

 - R.


From vlad at dev.mellanox.co.il  Tue Dec  4 22:27:14 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Wed, 05 Dec 2007 08:27:14 +0200
Subject: [ofa-general] RE: [PATCH]  CMA: Enable conn_id remove
In-Reply-To: <000001c8369d$a60cef10$3c98070a@amr.corp.intel.com>
References: <475518FB.1080501@dev.mellanox.co.il>
	<000001c8369d$a60cef10$3c98070a@amr.corp.intel.com>
Message-ID: <475644C2.3070302@dev.mellanox.co.il>

Sean Hefty wrote:
>> I have the following issue: The IB driver can't be unloaded after running
>> applications over RDS.
> 
> Do you mean the HCA driver?  Are there still active connections when the driver
> is unloaded?
> 

Yes, HCA driver and there are active connections when the driver is unloaded.

- Vladimir

>> I saw that the 'dev_remove' counter does not reach 0 value on the passive side
>> (after connection establishment).
>>
>> Please review the following patch:
>>
>> CMA: Enable conn_id remove on the passive side after
>> connection establishment.
>>
>> Signed-off-by: Vladimir Sokolovsky <vlad at mellanox.co.il>
> 
> This looks correct to me, but I'd like to understand why we don't see problems
> more often.
> 
> - Sean
> 
>> ---
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index 0751697..656d6df 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -1122,8 +1122,10 @@ static int cma_req_handler(struct ib_cm_id *cm_id,
>> struct ib_cm_event *ib_event)
>>         cm_id->cm_handler = cma_ib_handler;
>>
>>         ret = conn_id->id.event_handler(&conn_id->id, &event);
>> -       if (!ret)
>> +       if (!ret) {
>> +               cma_enable_remove(conn_id);
>>                 goto out;
>> +       }
>>
>>         /* Destroy the CM ID by returning a non-zero value. */
>>         conn_id->cm_id.ib = NULL;


From npiggin at suse.de  Tue Dec  4 23:15:58 2007
From: npiggin at suse.de (npiggin at suse.de)
Date: Wed, 05 Dec 2007 18:15:58 +1100
Subject: [ofa-general] [patch 11/18] ib: nopage
References: <20071205071547.701344000@nick.local0.net>
Message-ID: <20071205071627.900265000@nick.local0.net>

An embedded and charset-unspecified text was scrubbed...
Name: ib-nopage.patch
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071205/765ff2c1/attachment.ksh>

From sweitzen at cisco.com  Wed Dec  5 00:25:20 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Wed, 5 Dec 2007 00:25:20 -0800
Subject: [ofa-general] change in diags in OFED 1.3? (2 ports;
	only 1 supported currently)
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>

This seems new in OFED 1.3:
 
[root at svbu-qa-pcie-1 ~]# ibcheckerrors
perfquery: iberror: failed: smp query nodeinfo: 2 ports; only 1
supported currently
Error check on lid 8 (svbu-qa-pcie-2 HCA-1) port all: FAILED
perfquery: iberror: failed: perfquery
Error check on lid 8 (svbu-qa-pcie-2 HCA-1) port 1: FAILED
# Checked Ca: nodeguid 0x0005ad0000200860 with failure
...
 
I see these errors with all two-port HCAs.
Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems



From keshetti85-student at yahoo.co.in  Wed Dec  5 01:24:16 2007
From: keshetti85-student at yahoo.co.in (Keshetti Mahesh)
Date: Wed, 5 Dec 2007 14:54:16 +0530
Subject: [ofa-general] [openSM+lash] *** glibc detected *** free(): invalid
	next size (fast)
Message-ID: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>

When I run openSM (from OFED-1.2) with the LASH routing algorithm
enabled, I get the following error:

" *** glibc detected *** free(): invalid next size (fast):
0x000000000088cff0 *** "

The gdb backtrace is:

*** glibc detected *** free(): invalid next size (fast): 0x000000000088cff0 ***

Program received signal SIGABRT, Aborted.
[Switching to Thread 1126189408 (LWP 19983)]
0x000000342ec2e2ed in raise () from /lib64/tls/libc.so.6
(gdb)
(gdb) bt
#0  0x000000342ec2e2ed in raise () from /lib64/tls/libc.so.6
#1  0x000000342ec2fa3e in abort () from /lib64/tls/libc.so.6
#2  0x000000342ec62d41 in __libc_message () from /lib64/tls/libc.so.6
#3  0x000000342ec6881e in _int_free () from /lib64/tls/libc.so.6
#4  0x000000342ec68b66 in free () from /lib64/tls/libc.so.6
#5  0x000000000045625e in switch_delete ()
#6  0x00000000004579bf in lash_cleanup ()
#7  0x0000000000457cfa in lash_process ()
#8  0x0000000000451b78 in osm_ucast_mgr_process ()
#9  0x00000000004468af in osm_state_mgr_process ()
#10 0x00000000004472cc in __osm_state_mgr_ctrl_disp_callback ()
#11 0x0000002a9589df8f in __cl_disp_worker (context=0x7fbffff270) at
cl_dispatcher.c:102
#12 0x0000002a958a5121 in __cl_thread_pool_routine
(context=0x7fbffff2e8) at cl_threadpool.c:74
#13 0x0000002a958a4f6a in __cl_thread_wrapper (arg=0x59fa00) at cl_thread.c:58
#14 0x000000342f10610a in start_thread () from /lib64/tls/libpthread.so.0
#15 0x000000342ecc5ee3 in clone () from /lib64/tls/libc.so.6


Can anyone help me with this?

-Mahesh


From vlad at lists.openfabrics.org  Wed Dec  5 02:55:32 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Wed,  5 Dec 2007 02:55:32 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071205-0200 daily build status
Message-ID: <20071205105532.7FEE0E60050@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.15
Passed on x86_64 with linux-2.6.12
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.13
Passed on x86_64 with linux-2.6.22
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.13
Passed on ppc64 with linux-2.6.12
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-1.2798.fc6

Failed:


From bart.vanassche at gmail.com  Wed Dec  5 03:58:28 2007
From: bart.vanassche at gmail.com (Bart Van Assche)
Date: Wed, 5 Dec 2007 12:58:28 +0100
Subject: [ofa-general] [openSM+lash] *** glibc detected *** free():
	invalid next size (fast)
In-Reply-To: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
References: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
Message-ID: <e2e108260712050358s18c19d77k9e86f2e20c2bcff9@mail.gmail.com>

On Dec 5, 2007 10:24 AM, Keshetti Mahesh <keshetti85-student at yahoo.co.in>
wrote:

> when I ran openSM (of OFED-1.2) with LASH routing algorithm
> enable I am getting an error like,
>
> " *** glibc detected *** free(): invalid next size (fast):
> 0x000000000088cff0 *** "
>
>
Rerun your application under Valgrind if you want more info (
http://www.valgrind.org/).

Regards,

Bart Van Assche.

From dotanb at dev.mellanox.co.il  Wed Dec  5 05:50:51 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Wed, 05 Dec 2007 15:50:51 +0200
Subject: [ofa-general] who can give me info on the utility vendstat?
Message-ID: <4756ACBB.6050604@dev.mellanox.co.il>

Hi.

The utility vendstat exists in management/diags, and when I execute 
it I get the following output:


vendstat -N -w
hw_dev_rev:  0x0000
hw_dev_id:   0x0000
hw_uptime:   0x00000000
fw_version:  23.32.07
fw_build_id: 0x1824
fw_date:     00/00/0000
fw_psid:     ''
fw_ini_ver:  0
sw_version:  00.00.00
vendstat: iberror: failed: Unsupported device ID 0x0


Who can give me info on this utility and tell me if this is the expected 
output?


thanks
Dotanb


From hrosenstock at xsigo.com  Wed Dec  5 06:01:58 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 05 Dec 2007 06:01:58 -0800
Subject: [ofa-general] who can give me info on the utility vendstat?
In-Reply-To: <4756ACBB.6050604@dev.mellanox.co.il>
References: <4756ACBB.6050604@dev.mellanox.co.il>
Message-ID: <1196863318.30768.288.camel@hrosenstock-ws.xsigo.com>

Hi Dotan,

On Wed, 2007-12-05 at 15:50 +0200, Dotan Barak wrote:
> Hi.
> 
> The utility vendstat exists in the managment/diags and when i execute 
> it, i get the following output:
> 
> 
> 
> vendstat -N -w
> hw_dev_rev:  0x0000
> hw_dev_id:   0x0000
> hw_uptime:   0x00000000
> fw_version:  23.32.07
> fw_build_id: 0x1824
> fw_date:     00/00/0000
> fw_psid:     ''
> fw_ini_ver:  0
> sw_version:  00.00.00
> vendstat: iberror: failed: Unsupported device ID 0x0
> 
> 
> Who can give me info on this utility and tell me if this is the expected 
> output?

What device is this from ? Also, what version of vendstat are you
using ?

-- Hal

> thanks
> Dotanb
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


From sashak at voltaire.com  Wed Dec  5 06:21:39 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 5 Dec 2007 14:21:39 +0000
Subject: [ofa-general] [openSM+lash] *** glibc detected *** free():
	invalid next size (fast)
In-Reply-To: <e2e108260712050358s18c19d77k9e86f2e20c2bcff9@mail.gmail.com>
References: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
	<e2e108260712050358s18c19d77k9e86f2e20c2bcff9@mail.gmail.com>
Message-ID: <20071205142139.GF20470@sashak.voltaire.com>

On 12:58 Wed 05 Dec     , Bart Van Assche wrote:
> On Dec 5, 2007 10:24 AM, Keshetti Mahesh <keshetti85-student at yahoo.co.in>
> wrote:
> 
> > when I ran openSM (of OFED-1.2) with LASH routing algorithm
> > enable I am getting an error like,
> >
> > " *** glibc detected *** free(): invalid next size (fast):
> > 0x000000000088cff0 *** "
> >
> >
> Rerun your application under Valgrind if you want more info (
> http://www.valgrind.org/).

Or rerun OpenSM under gdb, then look at backtrace (with 'bt' command).

Sasha


From hrosenstock at xsigo.com  Wed Dec  5 06:09:40 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 05 Dec 2007 06:09:40 -0800
Subject: [ofa-general] Re: [ewg] change in diags in OFED 1.3? (2 ports;
	only 1 supported currently)
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
Message-ID: <1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-05 at 00:25 -0800, Scott Weitzenkamp (sweitzen) wrote:
> This seems new in OFED 1.3:
>  
> [root at svbu-qa-pcie-1 ~]# ibcheckerrors
> perfquery: iberror: failed: smp query nodeinfo: 2 ports; only 1
> supported currently

There was a thread on this starting on Oct 12 titled
"ibcheckerrrors/perfquery failure"
http://lists.openfabrics.org/pipermail/general/2007-October/041901.html

There was some code which went in to support PMAs which don't support
the AllPortSelect option. Patches were sent on the list.

> Error check on lid 8 (svbu-qa-pcie-2 HCA-1) port all: FAILED
> perfquery: iberror: failed: perfquery
> Error check on lid 8 (svbu-qa-pcie-2 HCA-1) port 1: FAILED
> # Checked Ca: nodeguid 0x0005ad0000200860 with failure
> ...
>  
> I see these errors with all two-port HCAs.

All two port HCAs ? Are all of them the same 2 port PCIe model or are
there others ?

Can you provide:

smpquery nodedesc
smpquery nodeinfo
and most importantly
perfquery -de

for a failed node/port ?

-- Hal

> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
> 
> 
> 
> _______________________________________________
> ewg mailing list
> ewg at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


From dotanb at dev.mellanox.co.il  Wed Dec  5 06:18:23 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Wed, 05 Dec 2007 16:18:23 +0200
Subject: [ofa-general] who can give me info on the utility vendstat?
In-Reply-To: <1196863318.30768.288.camel@hrosenstock-ws.xsigo.com>
References: <4756ACBB.6050604@dev.mellanox.co.il>
	<1196863318.30768.288.camel@hrosenstock-ws.xsigo.com>
Message-ID: <4756B32F.5010801@dev.mellanox.co.il>

Hal Rosenstock wrote:
> Hi Dotan,
>
> On Wed, 2007-12-05 at 15:50 +0200, Dotan Barak wrote:
>   
>> Hi.
>>
>> The utility vendstat exists in the managment/diags and when i execute 
>> it, i get the following output:
>>
>>
>>
>> vendstat -N -w
>> hw_dev_rev:  0x0000
>> hw_dev_id:   0x0000
>> hw_uptime:   0x00000000
>> fw_version:  23.32.07
>> fw_build_id: 0x1824
>> fw_date:     00/00/0000
>> fw_psid:     ''
>> fw_ini_ver:  0
>> sw_version:  00.00.00
>> vendstat: iberror: failed: Unsupported device ID 0x0
>>
>>
>> Who can give me info on this utility and tell me if this is the expected 
>> output?
>>     
>
> What device is this from ? Also, what version of vendstat are you
> using ?
>
> -- Hal
>   
I got this output when running the utility on ConnectX and on 
InfiniHost III in Tavor mode.
I don't know how to get the vendstat version, but this is the 
utility that comes with OFED 1.3.

thanks
Dotan


From hrosenstock at xsigo.com  Wed Dec  5 06:25:53 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 05 Dec 2007 06:25:53 -0800
Subject: [ofa-general] who can give me info on the utility vendstat?
In-Reply-To: <4756B32F.5010801@dev.mellanox.co.il>
References: <4756ACBB.6050604@dev.mellanox.co.il>
	<1196863318.30768.288.camel@hrosenstock-ws.xsigo.com>
	<4756B32F.5010801@dev.mellanox.co.il>
Message-ID: <1196864753.30768.296.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-05 at 16:18 +0200, Dotan Barak wrote:
> Hal Rosenstock wrote:
> > Hi Dotan,
> >
> > On Wed, 2007-12-05 at 15:50 +0200, Dotan Barak wrote:
> >   
> >> Hi.
> >>
> >> The utility vendstat exists in the managment/diags and when i execute 
> >> it, i get the following output:
> >>
> >>
> >>
> >> vendstat -N -w
> >> hw_dev_rev:  0x0000
> >> hw_dev_id:   0x0000
> >> hw_uptime:   0x00000000
> >> fw_version:  23.32.07
> >> fw_build_id: 0x1824
> >> fw_date:     00/00/0000
> >> fw_psid:     ''
> >> fw_ini_ver:  0
> >> sw_version:  00.00.00
> >> vendstat: iberror: failed: Unsupported device ID 0x0
> >>
> >>
> >> Who can give me info on this utility and tell me if this is the expected 
> >> output?
> >>     
> >
> > What device is this from ? Also, what version of vendstat are you
> > using ?
> >
> > -- Hal
> >   
> I got this output when executing this utility over ConnectX or over 
> InfiniHost III tavor mode.

vendstat uses a Mellanox-specific vendor MAD for this and just
reformats what it gets back. It works for a number of Mellanox
devices. Do ConnectX and InfiniHost III in Tavor mode support this ?

> I don't know how to get the version of the vendstat what this is the 
> utility that comes with OFED 1.3.

That's what I was looking for.

-- Hal

> thanks
> Dotan


From dotanb at dev.mellanox.co.il  Wed Dec  5 06:55:50 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Wed, 05 Dec 2007 16:55:50 +0200
Subject: [ofa-general] who can give me info on the utility vendstat?
In-Reply-To: <1196864753.30768.296.camel@hrosenstock-ws.xsigo.com>
References: <4756ACBB.6050604@dev.mellanox.co.il>	
	<1196863318.30768.288.camel@hrosenstock-ws.xsigo.com>	
	<4756B32F.5010801@dev.mellanox.co.il>
	<1196864753.30768.296.camel@hrosenstock-ws.xsigo.com>
Message-ID: <4756BBF6.8040008@dev.mellanox.co.il>


> vendstat is using a Mellanox specific vendor MAD for this and just
> reformatting what it is getting back. Works for a number of Mellanox
> devices. Does Connect X/InfiniHost III tavor mode support this ?
>
>   
Where can I find the format of this MAD?
Do you know of a specific device (and FW version) that supported 
this MAD?
(If support for this MAD was broken in the past, I would like to know 
when ....)


thanks
Dotan


From hrosenstock at xsigo.com  Wed Dec  5 07:05:01 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 05 Dec 2007 07:05:01 -0800
Subject: [ofa-general] who can give me info on the utility vendstat?
In-Reply-To: <4756BBF6.8040008@dev.mellanox.co.il>
References: <4756ACBB.6050604@dev.mellanox.co.il>
	<1196863318.30768.288.camel@hrosenstock-ws.xsigo.com>
	<4756B32F.5010801@dev.mellanox.co.il>
	<1196864753.30768.296.camel@hrosenstock-ws.xsigo.com>
	<4756BBF6.8040008@dev.mellanox.co.il>
Message-ID: <1196867101.30768.306.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-05 at 16:55 +0200, Dotan Barak wrote:
> > vendstat is using a Mellanox specific vendor MAD for this and just
> > reformatting what it is getting back. Works for a number of Mellanox
> > devices. Does Connect X/InfiniHost III tavor mode support this ?
> >
> >   
> Where can i find the format of this MAD?

In the Mellanox PRMs.

> Do you know about a specific device (and FW version) which supported 
> this MAD?

Yes, IS3 definitely works for this. I think Anafa works too, as does
the Tavor HCA, although I can't be 100% sure in my current test
environment.

-- Hal

> (if the support of this MAD was broken in the past i would like to know 
> when ....)
> 
> 
> thanks
> Dotan


From kliteyn at dev.mellanox.co.il  Wed Dec  5 07:28:40 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Wed, 05 Dec 2007 17:28:40 +0200
Subject: [ofa-general] [PATCH] opensm: printing to stderr note about error in
 QoS policy file
Message-ID: <4756C3A8.8000504@dev.mellanox.co.il>

An error in QoS policy file validation has the same consequence as a
syntax error in the policy file: in both cases OpenSM ignores the
whole QoS policy file.
Print an error notice to stderr (similar to a syntax error).

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_qos_parser.y |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_qos_parser.y b/opensm/opensm/osm_qos_parser.y
index c3d04df..f87102c 100644
--- a/opensm/opensm/osm_qos_parser.y
+++ b/opensm/opensm/osm_qos_parser.y
@@ -2341,6 +2341,9 @@ int osm_qos_parse_policy_file(IN osm_subn_t * const p_subn)
                 "osm_qos_parse_policy_file: ERR AC04: "
                 "Error(s) in QoS policy file (%s)\n",
                 p_subn->opt.qos_policy_file);
+        fprintf(stderr,
+                "Error(s) in QoS policy file (%s)\n",
+                p_subn->opt.qos_policy_file);
         osm_qos_policy_destroy(p_subn->p_qos_policy);
         p_subn->p_qos_policy = NULL;
         res = 1;
-- 
1.5.1.4


From sashak at voltaire.com  Wed Dec  5 07:45:18 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 5 Dec 2007 15:45:18 +0000
Subject: [ofa-general] Re: [PATCH] opensm: printing to stderr note about
	error in QoS policy file
In-Reply-To: <4756C3A8.8000504@dev.mellanox.co.il>
References: <4756C3A8.8000504@dev.mellanox.co.il>
Message-ID: <20071205154518.GI20470@sashak.voltaire.com>

On 17:28 Wed 05 Dec     , Yevgeny Kliteynik wrote:
> Error in QoS policy file validation is similar to syntax error
> in policy file in terms of consequences - in both cases the
> whole QoS policy file will be ignored by OpenSM.
> Printing error notice to stderr (similar to syntax error).
> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied. Thanks.

Sasha


From dotanb at dev.mellanox.co.il  Wed Dec  5 07:40:21 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Wed, 05 Dec 2007 17:40:21 +0200
Subject: [ofa-general] Re: [PATCH] ib/mad: fix incorrect access to items on
	local_list
In-Reply-To: <000001c8337a$cdc18e60$ff0da8c0@amr.corp.intel.com>
References: <474BE237.8050602@dev.mellanox.co.il> <aday7cjntc9.fsf@cisco.com>
	<000001c8337a$cdc18e60$ff0da8c0@amr.corp.intel.com>
Message-ID: <4756C665.5060404@dev.mellanox.co.il>

Hi.


Sean Hefty wrote:
> In cancel_mads(), MADs are moved from the wait_list and local_list
> to a cancel_list for processing.  However, the structures on these two
> lists are not the same.  The wait_list references struct
> ib_mad_send_wr_private, but local_list references struct
> ib_mad_local_private.  Cancel_mads() treats all items moved to the
> cancel_list as struct ib_mad_send_wr_private.  This leads to a system
> crash when requests are moved from the local_list to the cancel_list.
>
> Fix this by leaving local_list alone.  All requests on the local_list
> have completed and are just awaiting processing by a queued worker thread.
>
> Bug (crash) reported by Dotan Barak <dotanb at dev.mellanox.co.il>.
> Problem with local_list access reported by Robert Reynolds
> <rreynolds at opengridcomputing.com>.
>
> Signed-off-by: Sean Hefty <sean.hefty at intel.com>
> ---
> This patch is untested.  Dotan, can you see if this fixes the crash that
> you were seeing?
>   
Just want to let you know that I didn't forget about this issue.

I tried to reproduce the failure before applying the patch, but this one 
is not easy to reproduce.

I will give you feedback as soon as I have some.


Dotan
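
The type mix-up described in the quoted patch can be illustrated with a small sketch. It assumes a simplified singly linked intrusive list; the names `list_node`, `send_wr_demo`, `local_demo` and `cancel_all` are hypothetical stand-ins and are not the kernel's `list_head`, `ib_mad_send_wr_private` or `ib_mad_local_private`. The point is only that `container_of` is valid when every node on the list is embedded in the structure type the walker expects, which is why the fix leaves the second list alone.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified intrusive list link (hypothetical, not the kernel's list_head). */
struct list_node { struct list_node *next; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Two unrelated structures that both embed a link, at different offsets. */
struct send_wr_demo { int status; struct list_node node; };
struct local_demo   { struct list_node node; int done; };

/*
 * Mark every entry on the list as cancelled and return how many were seen.
 * Correct only if every node on the list lives inside a send_wr_demo;
 * a local_demo node passed here would be reinterpreted at the wrong offset.
 */
static int cancel_all(struct list_node *head, int status)
{
    int count = 0;

    for (struct list_node *p = head; p != NULL; p = p->next) {
        struct send_wr_demo *wr = container_of(p, struct send_wr_demo, node);

        wr->status = status;
        count++;
    }
    return count;
}
```

In this sketch a caller builds the list to cancel from `send_wr_demo` entries only and never splices `local_demo` entries onto it.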


From changquing.tang at hp.com  Wed Dec  5 07:45:17 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Wed, 5 Dec 2007 15:45:17 +0000
Subject: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <ada4pexemsu.fsf@cisco.com>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE00969C@G5W0278.americas.hpqcorp.net>
	<adahciydmet.fsf@cisco.com>
	<D89C2C212795564B837FA1665CAE02990FDE048AEF@G5W0278.americas.hpqcorp.net>
	<ada4pexemsu.fsf@cisco.com>
Message-ID: <D89C2C212795564B837FA1665CAE02990FDE048E18@G5W0278.americas.hpqcorp.net>


Roland:
        I think we will have more such changes in the future. Why don't we take
the pain now, make ops a pointer, and mark it as verbs 1.2 ?


--CQ


> -----Original Message-----
> From: Roland Dreier [mailto:rdreier at cisco.com]
> Sent: Tuesday, December 04, 2007 11:25 PM
> To: Tang, Changqing
> Cc: Tziporet Koren; ewg at lists.openfabrics.org;
> general at lists.openfabrics.org
> Subject: Re: [ofa-general] OFED 1.3 Beta release is available
>
>  > I think the problem is that sizeof "struct ibv_context_ops" has
>  > changed, so the new driver returns a big "struct ibv_context", app
>  > compiled with older header file has a smaller "struct ibv_context"
>  > and use the old offset to find fields after "ops".
>
> Oh crud, you're obviously right.  For some reason I kept
> missing that when I looked over the code.
>
> I think the only alternative we have to preserve backwards
> compatibility is to leave struct ibv_context_ops alone and
> change the structure to:
>
> struct ibv_context {
>         struct ibv_device      *device;
>         struct ibv_context_ops  ops;
>         int                     cmd_fd;
>         int                     async_fd;
>         int                     num_comp_vectors;
>         pthread_mutex_t         mutex;
>         void                   *abi_compat;
>         struct ibv_xrc_op      *xrc_ops;
> };
>
> with xrc_ops added at the end.  It's my fault for not making
> the ops member a pointer I guess.
>
> Tziporet/Jack/whoever -- please fix up the libibverbs you
> ship for OFED 1.3 to resolve this.
>
> We can clean this up for libibverbs 1.2 when the ABI can
> change, if/when we have something worth breaking the ABI for.
>
>  - R.
>


From tziporet at dev.mellanox.co.il  Wed Dec  5 07:55:15 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Wed, 05 Dec 2007 17:55:15 +0200
Subject: [ofa-general] Bogus Receive Completions
In-Reply-To: <adazlwpd5ng.fsf@cisco.com>
References: <475607AA.301@dls.net> <adazlwpd5ng.fsf@cisco.com>
Message-ID: <4756C9E3.7050900@mellanox.co.il>

Roland Dreier wrote:
>
> Mellanox: can you take this test case and see if it is indeed a
> firmware issue?  I could believe that there is a bug in libmthca's
> mthca_tavor_post_recv() function too...
>
>   
Yes - our FW people will look into it

Tziporet


From dotanb at dev.mellanox.co.il  Wed Dec  5 07:59:52 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Wed, 05 Dec 2007 17:59:52 +0200
Subject: [ofa-general] who can give me info on the utility vendstat?
In-Reply-To: <1196867101.30768.306.camel@hrosenstock-ws.xsigo.com>
References: <4756ACBB.6050604@dev.mellanox.co.il>	
	<1196863318.30768.288.camel@hrosenstock-ws.xsigo.com>	
	<4756B32F.5010801@dev.mellanox.co.il>	
	<1196864753.30768.296.camel@hrosenstock-ws.xsigo.com>	
	<4756BBF6.8040008@dev.mellanox.co.il>
	<1196867101.30768.306.camel@hrosenstock-ws.xsigo.com>
Message-ID: <4756CAF8.2060605@dev.mellanox.co.il>


> Yes, IS3 definitely works for this. I think Anafa also works too as well
> as Tavor HCA although I can't be 100% sure in my current test
> environment.
>
> -- Hal
>   

After some digging I noticed that only the switches support this MAD
(I didn't see it in the HCAs' PRMs).

thanks for the quick response
Dotan


From hrosenstock at xsigo.com  Wed Dec  5 08:07:11 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 05 Dec 2007 08:07:11 -0800
Subject: [ofa-general] who can give me info on the utility vendstat?
In-Reply-To: <4756CAF8.2060605@dev.mellanox.co.il>
References: <4756ACBB.6050604@dev.mellanox.co.il>
	<1196863318.30768.288.camel@hrosenstock-ws.xsigo.com>
	<4756B32F.5010801@dev.mellanox.co.il>
	<1196864753.30768.296.camel@hrosenstock-ws.xsigo.com>
	<4756BBF6.8040008@dev.mellanox.co.il>
	<1196867101.30768.306.camel@hrosenstock-ws.xsigo.com>
	<4756CAF8.2060605@dev.mellanox.co.il>
Message-ID: <1196870831.30768.318.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-05 at 17:59 +0200, Dotan Barak wrote:
> > Yes, IS3 definitely works for this. I think Anafa also works too as well
> > as Tavor HCA although I can't be 100% sure in my current test
> > environment.
> >
> > -- Hal
> >   
> 
> After some digging i notices that only the switches supports this MAD
> (i didn't see this in the HCA's PRMs).

It seems they do respond, though maybe not with the proper info, so
should they respond with some error instead ?

Should vendstat query the node type first and make sure it is a switch
for this option ?

-- Hal

> thanks for the quick response
> Dotan
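
The node-type check suggested above could look roughly like the following sketch. It assumes the caller already holds the raw NodeInfo attribute payload; the helper names are hypothetical and this is not vendstat's actual code. The NodeType values (1 = CA, 2 = switch, 3 = router) and the position of NodeType and NumPorts as the third and fourth bytes of NodeInfo follow the InfiniBand specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* NodeInfo.NodeType values as defined by the InfiniBand specification. */
enum { NODE_TYPE_CA = 1, NODE_TYPE_SWITCH = 2, NODE_TYPE_ROUTER = 3 };

/* NodeInfo starts with BaseVersion, ClassVersion, NodeType, NumPorts,
 * one byte each. */
static int nodeinfo_node_type(const uint8_t *nodeinfo)
{
    assert(nodeinfo != NULL);
    return nodeinfo[2];
}

static int nodeinfo_num_ports(const uint8_t *nodeinfo)
{
    assert(nodeinfo != NULL);
    return nodeinfo[3];
}

/* Hypothetical gate: only send the switch-only vendor query to switches. */
static int vendor_query_allowed(const uint8_t *nodeinfo)
{
    return nodeinfo_node_type(nodeinfo) == NODE_TYPE_SWITCH;
}
```

With a gate like this, a CA would be rejected with a clear message before any vendor MAD is sent, instead of printing zeroed fields.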


From swise at opengridcomputing.com  Wed Dec  5 08:08:29 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Wed, 05 Dec 2007 10:08:29 -0600
Subject: [ofa-general] uDAPL EVD queue length issue
In-Reply-To: <20071205005711.GE17358@opengridcomputing.com>
References: <20071203224550.GF11990@opengridcomputing.com>
	<4755AD21.4080001@ichips.intel.com>
	<20071205005711.GE17358@opengridcomputing.com>
Message-ID: <4756CCFD.1040806@opengridcomputing.com>

Jon Mason wrote:
> On Tue, Dec 04, 2007 at 11:40:17AM -0800, Arlin Davis wrote:
>> Jon Mason wrote:
>>> While working on OMPI udapl btl, I have noticed some "interesting"
>>> behavior.  OFA udapl wants the evd queues to be a power of 2 and
>>> then will subtract 1 for book keeping (ie, so that internal head and
>>> tail pointers never touch except when the ring is empty).  OFA udapl
>>> will report the queue length as this number (and not the original
>>> size requested) when queried.  This becomes interesting when a power
>>> of 2 is passed in and then queried.  For example, a requested queue
>>> of length 256 will report a length of 255 when queried.  
>> Something is not right. You should ALWAYS get at least what you request. On 
>> my system with an mthca, a request of 256 gets you 511. It is the verbs 
>> provider that is rounding up, not uDAPL.
>>
>> Here is my uDAPL debug output (DAPL_DBG_TYPE=0xffff) using dtest:
>>
>>  cq_object_create: (0x519bb0,0x519d00)
>> dapls_ib_cq_alloc: evd 0x519bb0 cqlen=256
>> dapls_ib_cq_alloc: new_cq 0x519d60 cqlen=511
>>
>> This is before and after the ibv_create_cq call. uDAPL builds its EVD 
>> resources based on what is returned from this call.
>>
>> I modified dtest to double check the dat_evd_query and I get the same:
>>
>> 8962 dto_rcv_evd created 0x519e80
>> 8962 dto_req_evd QLEN - requested 256 and actual 511
>>
>> What OFED release and device are you using?
> 
> I'm running OFED 1.2.5 and using Chelsio.
> 
> The behavior of the iwch_create_cq in
> drivers/infiniband/hw/cxgb3/iwch_provider.c is to allocate the amount
> given (rounded to the power of 2).  So this function will give 256 if
> 256 is requested, but uDAPL will consume one of those for book keeping
> and thus only have 255.
> 
> For my clarification, the provider should take into account the
> bookkeeping of uDAPL and roundup to the next power of 2 when given a
> power of 2 size?  I'm probably being thick, but why doesn't uDAPL
> increase the size requested by one before passing the request to the
> provider (or is this the documented behavior of the function and the
> provider should conform)?
> 
> Thanks,
> Jon

 From the linux rdma verbs perspective, ibv_create_cq() will create a cq 
that is >= the requested depth.  The fact that mthca always bumps the 
size up to the next power of 2 isn't something udapl can rely on.

Here's the crux:  If creating a udapl evd of 256 results in a cq of 256 
and udapl returns an evd with size 255, then udapl is broken...

My 2 cents...

Stevo.
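
The arithmetic behind the 255 versus 256 discussion can be sketched as follows. This is a minimal illustration, assuming a ring that keeps one slot empty so that head equals tail only when the ring is empty; the function names are hypothetical and this is not the uDAPL source. It shows why passing the request through unchanged loses an entry, and how adding one before rounding up keeps the "at least what you requested" guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next power of two. Assumes 1 <= n and that the
 * result fits in size_t. */
static size_t next_pow2(size_t n)
{
    size_t p = 1;

    assert(n >= 1);
    while (p < n)
        p <<= 1;
    return p;
}

/* A ring with `slots` slots that keeps one slot empty, so that
 * head == tail only when the ring is empty, can hold slots - 1 entries. */
static size_t ring_usable(size_t slots)
{
    assert(slots >= 1);
    return slots - 1;
}

/* Slots to allocate so that at least `requested` entries are usable. */
static size_t ring_slots_for(size_t requested)
{
    return next_pow2(requested + 1);
}
```

For a request of 256, `ring_usable(next_pow2(256))` is 255, while `ring_usable(ring_slots_for(256))` is 511.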


From sweitzen at cisco.com  Wed Dec  5 08:18:40 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Wed, 5 Dec 2007 08:18:40 -0800
Subject: [ofa-general] RE: [ewg] change in diags in OFED 1.3? (2 ports;
	only 1 supported currently)
In-Reply-To: <1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>

I'll open a bug.

> All two port HCAs ? Are all of them the same 2 port PCIe model or are
> there others ?

All types of 2-port PCIe HCAs (LionCub, LionMini, and Eagle).

> Can you provide:
> 
> smpquery nodedesc
> smpquery nodeinfo
> and most importantly
> perfquery -de
> 
> for a failed node/port ?


[root at svbu-qa-pcie-1 ~]# perfquery -de
ibwarn: [18050] smp_query: attr 0x11 mod 0x0 route DR path 0
ibwarn: [18050] mad_rpc: data offs 64 sz 64
mad data
0101 0102 0005 ad00 0100 d050 0005 ad00
0020 0848 0005 ad00 0020 0849 0040 6282
0000 00a0 0100 05ad 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
ibwarn: [18050] smp_query: attr 0x15 mod 0x0 route DR path 0
ibwarn: [18050] mad_rpc: data offs 64 sz 64
mad data
0000 0000 0000 0000 fe80 0000 0000 0000
0005 0002 0251 0a68 0000 000f 0103 0302
1452 0011 4040 0008 0804 f240 0000 0000
0000 2008 10f0 0000 0000 0000 0000 0000
ibwarn: [18050] pma_query: lid 5 port 1
ibwarn: [18050] mad_rpc: data offs 64 sz 192
mad data
0101 0000 0000 0014 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
ibwarn: [18050] main: PerfMgt ClassPortInfo 0x0 extended counters not
indicated

ibwarn: [18050] pma_query: lid 5 port 1
ibwarn: [18050] mad_rpc: MAD completed with error status 0xc
perfquery: iberror: [pid 18050] main: failed: perfextquery
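
The error status in the last lines of this output can be decoded with a short sketch. It assumes the printed value is the 16-bit MAD common status field in host byte order; the helper names are hypothetical and are not part of libibmad. The bit layout (bit 0 busy, bit 1 redirect required, bits 2 to 4 an invalid-field code) follows the InfiniBand specification.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 4:2 of the MAD common status field carry the invalid-field code. */
static unsigned mad_invalid_field_code(uint16_t status)
{
    return (status >> 2) & 0x7;
}

/* Map the common status bits to a short description. */
static const char *mad_status_str(uint16_t status)
{
    if (status & 0x1)
        return "busy";
    if (status & 0x2)
        return "redirect required";

    switch (mad_invalid_field_code(status)) {
    case 0:
        return "no invalid fields";
    case 1:
        return "bad base or class version";
    case 2:
        return "method not supported";
    case 3:
        return "method/attribute combination not supported";
    case 7:
        return "invalid attribute or attribute modifier value";
    default:
        return "reserved";
    }
}
```

Under that assumption, status 0xc yields code 3, meaning the method/attribute combination is not supported. That would be consistent with a PMA that does not implement the extended counters attribute requested by perfextquery.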


From rdreier at cisco.com  Wed Dec  5 08:32:48 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 05 Dec 2007 08:32:48 -0800
Subject: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FDE048E18@G5W0278.americas.hpqcorp.net>
	(Changqing Tang's message of "Wed, 5 Dec 2007 15:45:17 +0000")
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE00969C@G5W0278.americas.hpqcorp.net>
	<adahciydmet.fsf@cisco.com>
	<D89C2C212795564B837FA1665CAE02990FDE048AEF@G5W0278.americas.hpqcorp.net>
	<ada4pexemsu.fsf@cisco.com>
	<D89C2C212795564B837FA1665CAE02990FDE048E18@G5W0278.americas.hpqcorp.net>
Message-ID: <adaprxlcdan.fsf@cisco.com>

 >         I think in future we will have more such changes, why don't we take the
 > pain now to make ops as a pointer and mark it as verbs 1.2 ?

The problem is that undoubtedly the changes that require changing the
ABI will require something more than just additional ops, so we'll end
up needing yet another new ABI anyway.  So I don't see any real
benefit to breaking the ABI now except to make the change a little
cleaner, and that doesn't seem worth it to me.

 - R.
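
The layout problem discussed in this thread can be sketched with `offsetof`. The structures below are hypothetical stand-ins with made-up members, not the real libibverbs definitions. They only show that growing a structure embedded by value moves every field after it, while appending a pointer at the end leaves the existing offsets where old binaries expect them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real libibverbs definitions. */
struct demo_ops     { void (*op_a)(void); void (*op_b)(void); };
struct demo_ops_big { void (*op_a)(void); void (*op_b)(void);
                      void (*op_xrc)(void); };
struct demo_xrc_ops { void (*op_xrc)(void); };

/* Layout that existing applications were compiled against. */
struct ctx_old {
    void *device;
    struct demo_ops ops;
    int cmd_fd;
    int async_fd;
};

/* Growing the embedded ops table shifts every field after it. */
struct ctx_grown {
    void *device;
    struct demo_ops_big ops;
    int cmd_fd;
    int async_fd;
};

/* Appending a pointer at the end keeps the old offsets intact. */
struct ctx_append {
    void *device;
    struct demo_ops ops;
    int cmd_fd;
    int async_fd;
    struct demo_xrc_ops *xrc_ops;
};

/* Returns 1 if an old binary would still find cmd_fd and async_fd. */
static int old_offsets_preserved(size_t new_cmd_fd, size_t new_async_fd)
{
    return new_cmd_fd == offsetof(struct ctx_old, cmd_fd) &&
           new_async_fd == offsetof(struct ctx_old, async_fd);
}
```

In this sketch `ctx_append` passes the check and `ctx_grown` does not, which mirrors the choice between adding a pointer at the end of the context and enlarging the embedded ops table.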


From hrosenstock at xsigo.com  Wed Dec  5 08:52:59 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 05 Dec 2007 08:52:59 -0800
Subject: [ofa-general]
	RE: [ewg] change in diags in OFED 1.3? (2 ports; only 1
	supported currently)
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
Message-ID: <1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-05 at 08:18 -0800, Scott Weitzenkamp (sweitzen) wrote:
> I'll open a bug.
> 
> > All two port HCAs ? Are all of them the same 2 port PCIe model or are
> > there others ?
> 
> All type of 2 port PCIe HCAs (LionCub, LionMini, and Eagle).
> 
> > Can you provide:
> > 
> > smpquery nodedesc
> > smpquery nodeinfo
> > and most importantly
> > perfquery -de
> > 
> > for a failed node/port ?
> 
> 
> [root at svbu-qa-pcie-1 ~]# perfquery -de
> ibwarn: [18050] smp_query: attr 0x11 mod 0x0 route DR path 0
> ibwarn: [18050] mad_rpc: data offs 64 sz 64
> mad data
> 0101 0102 0005 ad00 0100 d050 0005 ad00
> 0020 0848 0005 ad00 0020 0849 0040 6282
> 0000 00a0 0100 05ad 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> ibwarn: [18050] smp_query: attr 0x15 mod 0x0 route DR path 0
> ibwarn: [18050] mad_rpc: data offs 64 sz 64
> mad data
> 0000 0000 0000 0000 fe80 0000 0000 0000
> 0005 0002 0251 0a68 0000 000f 0103 0302
> 1452 0011 4040 0008 0804 f240 0000 0000
> 0000 2008 10f0 0000 0000 0000 0000 0000
> ibwarn: [18050] pma_query: lid 5 port 1
> ibwarn: [18050] mad_rpc: data offs 64 sz 192
> mad data
> 0101 0000 0000 0014 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> ibwarn: [18050] main: PerfMgt ClassPortInfo 0x0 extended counters not
> indicated

That's what I was afraid of and asked about on the list a while ago
(about whether there was such a use case and there is) :-(

Not sure exactly why this wasn't a problem prior to this change.
When was the last time you tried this on a subnet which included one or
more of these HCAs ?

So...

How important is fixing this (for OFED 1.3) ?

-- Hal

> ibwarn: [18050] pma_query: lid 5 port 1
> ibwarn: [18050] mad_rpc: MAD completed with error status 0xc
> perfquery: iberror: [pid 18050] main: failed: perfextquery


From sweitzen at cisco.com  Wed Dec  5 08:58:43 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Wed, 5 Dec 2007 08:58:43 -0800
Subject: [ofa-general] RE: [ewg] change in diags in OFED 1.3? (2 ports;
	only 1supportedcurrently)
In-Reply-To: <1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>

> That's what I was afraid of and asked about on the list a while ago
> (about whether there was such a use case and there is) :-(
> 
> Not sure exactly why this wasn't a problem prior to this change.
> When was the last time you tried this on a subnet which 
> included one or
> more of these HCAs ?

Did not see this problem with OFED 1.2 or 1.2.5.

> So...
> 
> How important is fixing this (for OFED 1.3) ?

It's a regression, so I think it must be fixed.

Scott


From swise at opengridcomputing.com  Wed Dec  5 09:12:31 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Wed, 05 Dec 2007 11:12:31 -0600
Subject: [ofa-general] [GIT PULL ofed-1.3] - RDMA/cxgb3 - fixes and 5.0
	firmware support
Message-ID: <4756DBFF.7030607@opengridcomputing.com>

Vlad, please pull cxgb3 fixes for ofed-1.3 from:

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

These are cxgb3 bug fixes and PPC64 additions that we need for ofed-1.3.

The patches are all accepted upstream and were posted here:

http://www.spinics.net/lists/netdev/msg47492.html

and here:

http://www.spinics.net/lists/netdev/msg48240.html

Also, please pull version 1.1.0 of libcxgb3 from:

git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3

The library and drivers need to be included together as they are both 
needed to support the chelsio 5.0 firmware.


Thanks,

Steve.


From hrosenstock at xsigo.com  Wed Dec  5 09:14:35 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 05 Dec 2007 09:14:35 -0800
Subject: [ofa-general] RE: [ewg] change in diags in OFED 1.3? (2 ports; only
	1supportedcurrently)
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
Message-ID: <1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-05 at 08:58 -0800, Scott Weitzenkamp (sweitzen) wrote:
> > That's what I was afraid of and asked about on the list a while ago
> > (about whether there was such a use case and there is) :-(
> > 
> > Not sure exactly why this wasn't a problem prior to this change.
> > When was the last time you tried this on a subnet which 
> > included one or
> > more of these HCAs ?
> 
> Did not see this problem with OFED 1.2 or 1.2.5.

Are you 100% sure ? Same HCAs and firmware ?

> > So...
> > 
> > How important is fixing this (for OFED 1.3) ?
> 
> It's a regression, so I think it must be fixed.

It's a regression in a sense. It may have mistakenly reported no errors
before but now is more pedantic about its checks due to fixing some
other issue which was reported. I'm not 100% sure about this yet but not
sure how much splitting these hairs really matters.

Apparently, it is not just single port HCAs which don't support PerfMgt
AllPortSelect. Not sure why that is (even though it is a perfectly legal
option; it makes life more difficult).

I know what it takes to fix this; just not sure when I have the time to
implement it and I also don't have a machine with one of these two port
HCAs so I would only be able to partially test a patch to fix this.
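
A minimal sketch of the fallback being described, not the actual perfquery
patch: when the PerfMgt ClassPortInfo CapabilityMask does not advertise
AllPortSelect, each port is queried individually instead of using port
select 255. The bit value 0x100 for AllPortSelect is an assumption that
should be checked against the IBA spec and the libibmad headers, and the
function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed CapabilityMask bit for AllPortSelect; verify against the IBA spec. */
#define PM_CAP_ALL_PORT_SELECT 0x0100u
#define PM_PORT_SELECT_ALL     255

/*
 * Fill `ports` with the PortSelect values to query and return how many
 * were written. With AllPortSelect the single value 255 covers every port;
 * without it, each port 1..num_ports is listed and the caller has to sum
 * the counters itself.
 */
int pm_ports_to_query(uint16_t cap_mask, int num_ports,
                      int *ports, int max_ports)
{
    int n = 0;

    if (cap_mask & PM_CAP_ALL_PORT_SELECT) {
        if (max_ports > 0)
            ports[n++] = PM_PORT_SELECT_ALL;
        return n;
    }
    for (int p = 1; p <= num_ports && n < max_ports; p++)
        ports[n++] = p;
    return n;
}
```

For a two port HCA without AllPortSelect this yields ports 1 and 2, so the
tool would issue two PMA queries rather than one.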

-- Hal

> Scott


From sweitzen at cisco.com  Wed Dec  5 09:15:56 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Wed, 5 Dec 2007 09:15:56 -0800
Subject: [ofa-general] RE: [ewg] change in diags in OFED 1.3? (2 ports;
	only1supportedcurrently)
In-Reply-To: <1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>


> > Did not see this problem with OFED 1.2 or 1.2.5.
> 
> Are you 100% sure ? Same HCAs and firmware ?

100% sure.


From ardavis at ichips.intel.com  Wed Dec  5 09:57:53 2007
From: ardavis at ichips.intel.com (Arlin Davis)
Date: Wed, 05 Dec 2007 09:57:53 -0800
Subject: [ofa-general] uDAPL EVD queue length issue
In-Reply-To: <4756CCFD.1040806@opengridcomputing.com>
References: <20071203224550.GF11990@opengridcomputing.com>
	<4755AD21.4080001@ichips.intel.com>
	<20071205005711.GE17358@opengridcomputing.com>
	<4756CCFD.1040806@opengridcomputing.com>
Message-ID: <4756E6A1.2010005@ichips.intel.com>

>>
>> I'm running OFED 1.2.5 and using Chelsio.
>>
>  From the linux rdma verbs perspective, ibv_create_cq() will create a cq 
> that is >= the requested depth.  The fact that mthca always bumps the 
> size up to the next power of 2 isn't something udapl can rely on.

It doesn't.

uDAPL passes the user's requested qlen directly to the verbs 
ibv_create_cq call (dapl_openib_cq.c) and then uses the returned qlen to 
allocate EVDs (dapl_evd_util.c) and a ring buffer 
(dapl_ring_buffer_util.c) for managing the free and pending events.

The EVDs are allocated based on the qlen returned from verbs, and the 
ring_buffer is allocated based on qlen, next power of 2 minus 1. Unless 
I am missing something, I don't see how we can get less than what is 
requested.
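
As a rough sketch of the ring buffer sizing (reconstructed from the
"size 256 rsize 511" lines in the dtest output further down, not copied from
the uDAPL source; the function name is hypothetical), the size can be taken
as the smallest value of the form 2^k - 1 that is at least the requested
qlen:

```c
#include <assert.h>

/*
 * Smallest value of the form 2^k - 1 that is >= size.
 * Matches the "size 256 rsize 511" pairing in the dtest log; this is a
 * reconstruction from that log, not the uDAPL implementation.
 */
unsigned int rbuf_rsize(unsigned int size)
{
    unsigned int rsize = 1;

    while (rsize - 1 < size)
        rsize <<= 1;
    return rsize - 1;
}
```

Under this sizing the ring buffer is never smaller than the qlen returned by
verbs, which is consistent with the point that the EVD should not come back
smaller than requested.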

> 
> Here's the crux:  If creating a udapl evd of 256 results in a cq of 256 
> and the udapl returns a evd with size 255, then udapl is broken...

Yes, I agree. But I don't see how this is happening.

Here is output from my dtest when requesting 256 and verbs returning the 
same qlen. You can see before and after verbs we get 256, the rbufs are 
511, and the query EVD call returns 256 to the user.

  cq_object_create: (0x519bb0,0x519d00)
dapls_ib_cq_alloc: evd 0x519bb0 cqlen=256
dapls_ib_cq_alloc: new_cq 0x519d60 cqlen=256
  setup_async_cb: ia 0x518270 type 2 hdl 0x519bb0 cb 0x2a957c8e70 ctx 
0x519bb0
  >>> rbuf_alloc: size 256 rsize 511
  >>> rbuf_alloc: size 256 rsize 511
dapl_evd_create () returns 0x0

9920 dto_req_evd QLEN - requested 256 and actual 256
9920 Create events done

Can you turn up DAPL debug(DAPL_DBG_TYPE=0xffff) so I can see what is 
happening?

Thanks,

-arlin


From vuhuong at mellanox.com  Wed Dec  5 10:00:35 2007
From: vuhuong at mellanox.com (Vu Pham)
Date: Wed, 05 Dec 2007 10:00:35 -0800
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
Message-ID: <4756E743.3040900@mellanox.com>

Hi Moni,

My systems are RHEL 5.1 x86-64, 2 Sinai hcas, fw 1.2.0

I set up bonding as follows:
IPOIBBOND_ENABLE=yes
IPOIB_BONDS=bond0
bond0_IP=11.1.1.1
bond0_SLAVEs=ib0,ib1
in /etc/infiniband/openib.conf in order to start ib-bond automatically

Testing with OFED-1.3-beta2, I got the following crash while system is 
booting up

Stack: ffffffff883429d0 fff810428519d30 ................
Call Trace:
[<ffffffff883429d0>] :bonding:bond_get_stats+0x4a/0x131
[<        8020e9cd>] rtnetlink_fill_ifinfo+0x4ba/0x5c4
              ee19>] rtmsg_ifinfo+0x44/0x8d
              eea2>] rtnetlink_event+0x40/0x44
          8006492a>] notifier_call_chain+0x20/0x32
          80208b5e>] dev_open+0x68/0x6e
              72e8>] dev_change_flags+0x5a/0x119
          80239762>] devinet_ioctl+0x235/0x59c
          801ffcf6>] sock_ioctl+0x1c1/0x1e5
          8003fc3f>] do_ioctl+0x21/0x6b
          8002fa45>] vfs_ioctl+0x248/0x261
          8004a24b>] sys_ioctl+0x59/0x78
          8005b14e>] system_call+0x7e/0x83

Code: Bad RIP value
RIP [0000000000000000000000] _stext+0x7ffff000/0x1000
 RSP <ffff10428519cc0>
CR2: 000000000000000000000
 <0>Kernel panic - not syncing: Fatal exception

I opened bug #812 for this issue.

I moved our systems back to ofed-1.2.5.4 and tested ib-bond again. We 
tested it with ib0 and ib1 (connected to different switch/fabric) being 
on the same subnet (10.2.1.x, 255.255.255.0) and on different subnets 
(10.2.1.x and 10.3.1.x, 255.255.255.0). In both cases there is the issue 
of losing communication between the servers if the nodes are not on 
the same primary ib interface.

Example:
1. original state: ib0's are the primary on both servers - pinging bond0 
between the servers is fine
2. fail ib0 on one of the servers (ib1 becomes primary on this server) - 
pinging bond0 between the servers fails
3. fail ib0 on the second server (ib1 becomes primary) - pinging bond0 
between the servers is fine again

Is this behavior expected?

thanks,
-vu


From sweitzen at cisco.com  Wed Dec  5 10:03:43 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Wed, 5 Dec 2007 10:03:43 -0800
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <4756E743.3040900@mellanox.com>
References: <4756E743.3040900@mellanox.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A9695A@xmb-sjc-216.amer.cisco.com>

This is https://bugs.openfabrics.org/show_bug.cgi?id=795, it appears to
be fixed in the 1128 build.

 


From jackm at dev.mellanox.co.il  Wed Dec  5 10:09:50 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Wed, 5 Dec 2007 20:09:50 +0200
Subject: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <adad4tmdlek.fsf@cisco.com>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<adahciydmet.fsf@cisco.com> <adad4tmdlek.fsf@cisco.com>
Message-ID: <200712052009.51124.jackm@dev.mellanox.co.il>

On Wednesday 05 December 2007 02:40, Roland Dreier wrote:
> BTW, sifting through the OFED 1.3 libibverbs tree, I do see that the
> commit to add max_xrc_domains to struct ibv_device_attr did break
> things by adding the member in the middle of the structure (so that an
> app compiled against the old header will see bogus values for
> local_ca_ack_delay and phys_port_count.
> 
> Actually looking at the commit again, it's worse than that... anything
> compiled against the old header that calls ibv_query_device() may get
> memory corrupted, because the new ibv_query_device() writes to a
> bigger structure than the app passes in.
> 
> The perils of not reviewing properly I guess...

This commit was subsequently reversed for exactly that reason.
Unfortunately, though, when I reviewed things regarding backwards binary
compatibility at the time I reversed the above commit, I also missed the
problem of the ibv_context_ops structure.
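
To make the layout problem concrete, here is a small sketch using
hypothetical structures (they are stand-ins, not the real
struct ibv_device_attr). It compares member offsets when a field is added
in the middle versus at the end:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout an old application was compiled against. */
struct attr_v1 {
    int max_srq;
    int local_ack_delay;
    int port_count;
};

/* New member inserted in the middle: every later member moves. */
struct attr_mid {
    int max_srq;
    int max_domains;
    int local_ack_delay;
    int port_count;
};

/* New member appended: old offsets are kept, but the struct grows. */
struct attr_end {
    int max_srq;
    int local_ack_delay;
    int port_count;
    int max_domains;
};

/* Nonzero when member m sits at the same offset in both layouts. */
#define SAME_OFFSET(old_t, new_t, m) \
    (offsetof(old_t, m) == offsetof(new_t, m))
```

Even the appended variant is larger than the old one, so a library that
fills a caller-allocated structure would still write past the end of an old
caller's buffer. Appending is only safe when the library side owns the
allocation.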

- Jack


From vuhuong at mellanox.com  Wed Dec  5 10:15:25 2007
From: vuhuong at mellanox.com (Vu Pham)
Date: Wed, 05 Dec 2007 10:15:25 -0800
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A9695A@xmb-sjc-216.amer.cisco.com>
References: <4756E743.3040900@mellanox.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A9695A@xmb-sjc-216.amer.cisco.com>
Message-ID: <4756EABD.5070908@mellanox.com>

Scott Weitzenkamp (sweitzen) wrote:
> This is https://bugs.openfabrics.org/show_bug.cgi?id=795, it appears to
> be fixed in the 1128 build.
>
>   

Thanks. I'll close bug #812
Could you or Moni answer/explain the behavior of ib-bond in 1.2.5.4?



From anton.bodner at qlogic.com  Wed Dec  5 10:15:52 2007
From: anton.bodner at qlogic.com (Anton Bodner)
Date: Wed, 5 Dec 2007 12:15:52 -0600
Subject: [ofa-general] Application layer support for RMPP using OFED stack.
Message-ID: <99863D2ED484D449811D97A4C44C9CBD5DE043@EPEXCH2.qlogic.org>

 
Hello -

 
I'm from QLogic Corp, and I'm trying to port some older applications to
OFED, using the OFED stack and the OFED umad api (specifically - opening
the /dev/infiniband/umadX, and using read / write).  I am using OFED
1.2.5.

 
My old application interrogates the SA, and also implements the RMPP in
the application layer.  In porting this application to OFED, I realize
that the OFED stack has the capability to do the RMPP on my behalf, but
that has an adverse effect on my application since my infiniband
interface in my app transfers 256 byte packets ONLY.  

 
Hence - I want my app to do the RMPP, not OFED.

 
So - I disabled RMPP support in OFED stack by registering with RMPP
version of 0.  This seems effective in stopping OFED stack from doing
the RMPP for me.   However - when I try to send the RMPP ack, it fails
at the write ().

 
I investigated the OFED stack code, and it's failing in
ib_create_send_mad (drivers/infiniband/core/mad.c).  Investigation shows
that if one registered with RMPP version 0, then the 'send' never allows
rmpp to be active...  (code snippet below)

        if ((!mad_agent->rmpp_version &&
             (rmpp_active || message_size > sizeof(struct ib_mad))) ||
            (!rmpp_active && message_size > sizeof(struct ib_mad)))
        {
                return ERR_PTR(-EINVAL);
        }
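
The same condition can be isolated into a small user-space sketch to see
which combinations are rejected. The function name is hypothetical and
MAD_SIZE (256) stands in for sizeof(struct ib_mad); this only mirrors the
snippet above and is not kernel code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAD_SIZE 256 /* stands in for sizeof(struct ib_mad) */

/* Returns 0 if the send would pass the check, -EINVAL otherwise. */
int check_send_args(int rmpp_version, int rmpp_active, size_t message_size)
{
    if ((!rmpp_version &&
         (rmpp_active || message_size > MAD_SIZE)) ||
        (!rmpp_active && message_size > MAD_SIZE))
        return -EINVAL;
    return 0;
}
```

With rmpp_version 0, any send marked rmpp_active is rejected even when it
fits in a single 256 byte MAD, which matches the failing write() described
above.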

 
My questions are these:

1. Has anyone tried an application layer supported RMPP ? If so - how
did you get past this logic (possible OFED bug??)

 
2. Is the OFED implementation intent to disable RMPP support COMPLETELY
when registering with a RMPP version of  0? If so, then how does one
implement a user level RMPP using the OFED stack?  Perhaps no one is
doing this at all since the stack already does it for you?

 
Like I said - the only reason I'm doing this is to port an old/ existing
app to OFED, and there is a considerable level of difficulty in ripping out
our app level RMPP support.  But perhaps this is the alternative I must
take...

 
Thanks

 
Anton Bodner

 
 Anton Bodner Jr.

QLogic Corporation

(610)233-4856

anton.bodner at qlogic.com

http://www.qlogic.com

 

From jackm at dev.mellanox.co.il  Wed Dec  5 10:33:48 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Wed, 5 Dec 2007 20:33:48 +0200
Subject: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <ada4pexemsu.fsf@cisco.com>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE048AEF@G5W0278.americas.hpqcorp.net>
	<ada4pexemsu.fsf@cisco.com>
Message-ID: <200712052033.49107.jackm@dev.mellanox.co.il>

On Wednesday 05 December 2007 07:24, Roland Dreier wrote:
> 
> I think the only alternative we have to preserve backwards
> compatibility is to leave struct ibv_context_ops alone and change the
> structure to:
> 
> struct ibv_context {
>         struct ibv_device      *device;
>         struct ibv_context_ops  ops;
>         int                     cmd_fd;
>         int                     async_fd;
>         int                     num_comp_vectors;
>         pthread_mutex_t         mutex;
>         void                   *abi_compat;
>         struct ibv_xrc_op      *xrc_ops;
> };
> 
> with xrc_ops added at the end.  It's my fault for not making the ops
> member a pointer I guess.
> 

We don't need to have this as a pointer, really (I'd like to save the
extra malloc and associated bookkeeping). If we have the ibv_xrc_op struct
at the end of ibv_context, this is sufficient for backwards binary
compatibility (libmlx4 itself allocates the ibv_context structure for
libibverbs.  If the actual structure is a bit bigger, who cares --
we just need to preserve the current offsets of the structure
fields for binary compatibility).

If you want to be a bit more generic, we could do this as an "extra_ops"
structure and add new ops as needed.
(If future changes are messier than just adding a new op, we can then
increment the API version):

struct ibv_context_extra_ops {
	struct ibv_srq *	(*create_xrc_srq)(struct ibv_pd *pd,
						  struct ibv_xrc_domain *xrc_domain,
						  struct ibv_cq *xrc_cq,
						  struct ibv_srq_init_attr *srq_init_attr);
	struct ibv_xrc_domain *	(*open_xrc_domain)(struct ibv_context *context,
						   int fd, int oflag);
	int			(*close_xrc_domain)(struct ibv_xrc_domain *d);
};

 struct ibv_context {
         struct ibv_device      *device;
         struct ibv_context_ops  ops;
         int                     cmd_fd;
         int                     async_fd;
         int                     num_comp_vectors;
         pthread_mutex_t         mutex;
         void                   *abi_compat;
         struct ibv_context_extra_ops  extra_ops;
 };
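
A sketch of why appending is enough when the provider library does the
allocation. The structures are trimmed, hypothetical stand-ins for the old
and extended ibv_context, and the function names are made up for
illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Layout that already-built applications know about. */
struct ctx_old {
    void *device;
    int   cmd_fd;
    void *abi_compat;
};

struct extra_ops {
    int (*close_domain)(void *domain);
};

/* Extended layout: new ops appended, so earlier offsets are unchanged. */
struct ctx_new {
    void *device;
    int   cmd_fd;
    void *abi_compat;
    struct extra_ops extra_ops;
};

/* The provider library, not the application, allocates the context. */
struct ctx_new *provider_alloc_context(int cmd_fd)
{
    struct ctx_new *ctx = calloc(1, sizeof(*ctx));

    if (ctx)
        ctx->cmd_fd = cmd_fd;
    return ctx;
}

/* Old code only knows the old offsets; read cmd_fd through one of them. */
int old_app_read_fd(const void *ctx)
{
    const char *base = ctx;

    return *(const int *)(base + offsetof(struct ctx_old, cmd_fd));
}
```

Because the old members keep their offsets and the allocation is sized by
the new library, an application built against the old header still reads
the right fields and never sees the extra tail.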
 

From sashak at voltaire.com  Wed Dec  5 11:03:03 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 5 Dec 2007 19:03:03 +0000
Subject: [ofa-general] [PATCH] opensm: don't break name_map using when
	routing_engine was not found.
Message-ID: <20071205190303.GB25758@sashak.voltaire.com>


Don't skip opening the node_name_map when the routing_engine was not
found by the name provided.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/opensm/osm_opensm.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
index a08b0f9..26b9969 100644
--- a/opensm/opensm/osm_opensm.c
+++ b/opensm/opensm/osm_opensm.c
@@ -305,13 +305,11 @@ osm_opensm_init(IN osm_opensm_t * const p_osm,
 #endif				/* ENABLE_OSM_PERF_MGR */
 
 	if (p_opt->routing_engine_name &&
-	    setup_routing_engine(p_osm, p_opt->routing_engine_name)) {
+	    setup_routing_engine(p_osm, p_opt->routing_engine_name))
 		osm_log(&p_osm->log, OSM_LOG_VERBOSE,
 			"osm_opensm_init: cannot find or setup routing engine"
 			" \'%s\'. Default will be used instead\n",
 			p_opt->routing_engine_name);
-		goto Exit;
-	}
 
 	p_osm->node_name_map = open_node_name_map(p_opt->node_name_map_name);
 
-- 
1.5.3.4.206.g58ba4


From Jeffrey.C.Becker at nasa.gov  Wed Dec  5 11:22:12 2007
From: Jeffrey.C.Becker at nasa.gov (Jeff Becker)
Date: Wed, 05 Dec 2007 11:22:12 -0800
Subject: [ofa-general] test -plz ignore
Message-ID: <4756FA64.1070901@nasa.gov>

-jeff


From changquing.tang at hp.com  Wed Dec  5 11:59:17 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Wed, 5 Dec 2007 19:59:17 +0000
Subject: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <200712052033.49107.jackm@dev.mellanox.co.il>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE048AEF@G5W0278.americas.hpqcorp.net>
	<ada4pexemsu.fsf@cisco.com>
	<200712052033.49107.jackm@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FDE0493C3@G5W0278.americas.hpqcorp.net>


There are some other input structure changes, such as in ibv_qp_init_attr. If the qp_type is not IBV_QPT_XRC,
the field xrc_domain is not touched, right ?

Similar thing for the xrc_remote_srq_num field of "struct ibv_send_wr".


--CQ Tang




From rdreier at cisco.com  Wed Dec  5 12:29:11 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 05 Dec 2007 12:29:11 -0800
Subject: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <200712052033.49107.jackm@dev.mellanox.co.il> (Jack Morgenstein's
	message of "Wed, 5 Dec 2007 20:33:48 +0200")
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE048AEF@G5W0278.americas.hpqcorp.net>
	<ada4pexemsu.fsf@cisco.com>
	<200712052033.49107.jackm@dev.mellanox.co.il>
Message-ID: <adak5nsdgx4.fsf@cisco.com>

 > > struct ibv_context {
 > >         struct ibv_device      *device;
 > >         struct ibv_context_ops  ops;
 > >         int                     cmd_fd;
 > >         int                     async_fd;
 > >         int                     num_comp_vectors;
 > >         pthread_mutex_t         mutex;
 > >         void                   *abi_compat;
 > >         struct ibv_xrc_op      *xrc_ops;
 > > };
 > > 
 > 
 > We don't need to have this as a pointer, really (I'd like to save the
 > extra malloc and associated bookkeeping).

I think we could actually have libmlx4 have one copy of xrc_ops and
set the pointer to point at its copy.  And then the tests in each of
the xrc operations become just "if (!context->xrc_ops) return ENOSYS;"
But it's not a big deal really.
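
Roughly, the pointer variant could look like the sketch below. The
structures and function names are hypothetical, trimmed-down stand-ins for
ibv_context and the XRC ops, and the provider keeps a single static ops
table that every context points at:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down versions of the structures under discussion. */
struct xrc_ops {
    int (*close_domain)(int domain_id);
};

struct context {
    int cmd_fd;
    const struct xrc_ops *xrc_ops; /* NULL when the provider lacks XRC */
};

/* A provider with XRC support keeps one shared, static ops table. */
static int demo_close_domain(int domain_id)
{
    return domain_id >= 0 ? 0 : EINVAL;
}

const struct xrc_ops demo_xrc_ops = { .close_domain = demo_close_domain };

/* Library-side wrapper: one NULL test covers providers without XRC. */
int close_xrc_domain(const struct context *ctx, int domain_id)
{
    if (!ctx->xrc_ops)
        return ENOSYS;
    return ctx->xrc_ops->close_domain(domain_id);
}
```

No per-context malloc is needed in this arrangement, since each context
only stores a pointer to the provider's shared table.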

 > If we have the ibv_xrc_op struct at the end of ibv_context, this is
 > sufficient for backwards binary compatibility(libmlx4 itself
 > allocates the ibv_context structure for libibverbs.  If the actual
 > structure is a bit bigger, who cares -- we just need to preserve
 > the current offsets of the structure fields for binary
 > compatibility).

Yes, that's true.  I don't have much objection to adding a struct
ibv_xrc_ops inside the structure (rather than the pointer as I
suggested).

 > If you want to be a bit more generic, we could do this as an "extra_ops"
 > structure and add new ops as needed.

Actually I'd prefer to add xrc_ops and then if we need to extend
further with more new ops, add another structure after it.  That way
we avoid having to put any define in libibverbs to tell drivers like
libmlx4 that xrc support is present; libmlx4 et al can just use
AC_CHECK_MEMBER(struct ibv_context.xrc_ops) to test with autoconf.

 - R.


From mshefty at ichips.intel.com  Wed Dec  5 13:42:00 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Wed, 05 Dec 2007 13:42:00 -0800
Subject: [ofa-general] RE: [PATCH]  CMA: Enable conn_id remove
In-Reply-To: <475644C2.3070302@dev.mellanox.co.il>
References: <475518FB.1080501@dev.mellanox.co.il>	<000001c8369d$a60cef10$3c98070a@amr.corp.intel.com>
	<475644C2.3070302@dev.mellanox.co.il>
Message-ID: <47571B28.20606@ichips.intel.com>

>>> I have the following issue: The IB driver can't be unloaded after 
>>> running
>>> applications over RDS.
>>
> Yes, HCA driver and there are active connections when the driver is 
> unloaded.

At least from looking at the code, it looks like unloading the device 
driver with connections on the passive side should hang.  I'm updating 
my test code to verify that this is the case, but can you confirm that 
this problem is easily reproducible using RDS?

I'm guessing that most of my device removal testing was always done from 
the active side, and none of the mainline kernel ULPs call 
rdma_listen().  I'm hoping this would explain why this bug has gone 
undetected for this long.

- Sean


From vlad at dev.mellanox.co.il  Wed Dec  5 13:58:42 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Wed, 05 Dec 2007 23:58:42 +0200
Subject: [ofa-general] RE: [PATCH]  CMA: Enable conn_id remove
In-Reply-To: <47571B28.20606@ichips.intel.com>
References: <475518FB.1080501@dev.mellanox.co.il>	<000001c8369d$a60cef10$3c98070a@amr.corp.intel.com>
	<475644C2.3070302@dev.mellanox.co.il>
	<47571B28.20606@ichips.intel.com>
Message-ID: <47571F12.3050107@dev.mellanox.co.il>

Sean Hefty wrote:
>>>> I have the following issue: The IB driver can't be unloaded after 
>>>> running
>>>> applications over RDS.
>>>
>> Yes, HCA driver and there are active connections when the driver is 
>> unloaded.
> 
> At least from looking at the code, it looks like unloading the device 
> driver with connections on the passive side should hang.  I'm updating 
> my test code to verify that this is the case, but can you confirm that 
> this problem is easily reproducible using RDS?
> 

Yes, it is easily reproducible:

I have 2 servers with MT25418 HCAs (mlx4) connected to the IB switch.
Driver removal hangs on one of them every time if an RDS connection was active
(run rds-stress test, for example, to create the connection).

- Vladimir

> I'm guessing that most of my device removal testing was always done from 
> the active side, and none of the mainline kernel ULPs call 
> rdma_listen().  I'm hoping this would explain why this bug has gone 
> undetected for this long.
> 
> - Sean
> 


From keshetti85-student at yahoo.co.in  Wed Dec  5 21:02:52 2007
From: keshetti85-student at yahoo.co.in (Keshetti Mahesh)
Date: Thu, 6 Dec 2007 10:32:52 +0530
Subject: [ofa-general] Re: [openSM+lash] *** glibc detected *** free():
	invalid next size (fast)
In-Reply-To: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
References: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
Message-ID: <829ded920712052102i3e6a414amd1f9300dc246b24b@mail.gmail.com>

>> " *** glibc detected *** free(): invalid next size (fast):
>> 0x000000000088cff0 *** "
> >
> >gdb backtrace is,
>>
>> *** glibc detected *** free(): invalid next size (fast): 0x000000000088cff0 ***
>>
>> Program received signal SIGABRT, Aborted.
>> [Switching to Thread 1126189408 (LWP 19983)]
>> 0x000000342ec2e2ed in raise () from /lib64/tls/libc.so.6
>> (gdb)
>> (gdb) bt
>> #0  0x000000342ec2e2ed in raise () from /lib64/tls/libc.so.6
>> #1  0x000000342ec2fa3e in abort () from /lib64/tls/libc.so.6
>> #2  0x000000342ec62d41 in __libc_message () from /lib64/tls/libc.so.6
>> #3  0x000000342ec6881e in _int_free () from /lib64/tls/libc.so.6
>> #4  0x000000342ec68b66 in free () from /lib64/tls/libc.so.6
>> #5  0x000000000045625e in switch_delete ()
>> #6  0x00000000004579bf in lash_cleanup ()
>> #7  0x0000000000457cfa in lash_process ()
>> #8  0x0000000000451b78 in osm_ucast_mgr_process ()
>> #9  0x00000000004468af in osm_state_mgr_process ()
>> #10 0x00000000004472cc in __osm_state_mgr_ctrl_disp_callback ()
>> #11 0x0000002a9589df8f in __cl_disp_worker (context=0x7fbffff270) at cl_dispatcher.c:102
>> #12 0x0000002a958a5121 in __cl_thread_pool_routine (context=0x7fbffff2e8) at cl_threadpool.c:74
>> #13 0x0000002a958a4f6a in __cl_thread_wrapper (arg=0x59fa00) at cl_thread.c:58
>> #14 0x000000342f10610a in start_thread () from /lib64/tls/libpthread.so.0
>> #15 0x000000342ecc5ee3 in clone () from /lib64/tls/libc.so.6

> Or rerun OpenSM under gdb, then look at backtrace (with 'bt' command).

I have already attached the gdb backtrace in my first mail.

-Mahesh.

> Sasha


From kliteyn at mellanox.co.il  Wed Dec  5 21:21:02 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 6 Dec 2007 07:21:02 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-06:normal completion
Message-ID: <MTLEXCH01jbDt3HvoFO00009bd9@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-05
OpenSM git rev = Wed_Dec_5_17:28:40_2007 [631151b0e47868ddb7cd8fadd6dca2812d7f0b28]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=480  Pass=479  Fail=1
 
 
Pass:
36 Stability IS1-16.topo
36 Pkey IS1-16.topo
36 OsmTest IS1-16.topo
36 OsmStress IS1-16.topo
36 Multicast IS1-16.topo
36 LidMgr IS1-16.topo
12 Stability IS3-loop.topo
12 Stability IS3-128.topo
12 Pkey IS3-128.topo
12 OsmTest IS3-loop.topo
12 OsmTest IS3-128.topo
12 OsmStress IS3-128.topo
12 Multicast IS3-loop.topo
12 Multicast IS3-128.topo
12 FatTree merge-roots-4-ary-2-tree.topo
12 FatTree merge-root-4-ary-3-tree.topo
12 FatTree gnu-stallion-64.topo
12 FatTree blend-4-ary-2-tree.topo
12 FatTree RhinoDDR.topo
12 FatTree FullGnu.topo
12 FatTree 4-ary-2-tree.topo
12 FatTree 2-ary-4-tree.topo
12 FatTree 12-node-spaced.topo
12 FTreeFail 4-ary-2-tree-missing-sw-link.topo
12 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
12 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
12 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
11 LidMgr IS3-128.topo

Failures:
1 LidMgr IS3-128.topo


From jackm at dev.mellanox.co.il  Wed Dec  5 22:21:45 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Thu, 6 Dec 2007 08:21:45 +0200
Subject: [ewg] Re: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FDE0493C3@G5W0278.americas.hpqcorp.net>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<200712052033.49107.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FDE0493C3@G5W0278.americas.hpqcorp.net>
Message-ID: <200712060821.45979.jackm@dev.mellanox.co.il>

On Wednesday 05 December 2007 21:59, Tang, Changqing wrote:
> There are some other input structure changes such as ibv_qp_init_attr, if the qp_type is not IBV_QPT_XRC,
> the field xrc_domain is not touched, right ?
> 
Right.

> Similar thing for "struct ibv_send_wr" xrc_remote_srq_num field.
> 
Same thing -- the fields are not touched unless the qp type is IBV_QPT_XRC.
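
A rough illustration of that rule, using hypothetical demo_* types
rather than the real verbs structures: the XRC-specific field is read
only after the QP type has been checked, so callers that never set it
are unaffected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real ibv_* definitions. */
enum demo_qp_type { DEMO_QPT_RC, DEMO_QPT_UD, DEMO_QPT_XRC };

struct demo_xrc_domain {
	int handle;
};

struct demo_qp_init_attr {
	enum demo_qp_type       qp_type;
	struct demo_xrc_domain *xrc_domain;	/* meaningful for XRC QPs only */
};

/*
 * Return the XRC domain to use, or NULL for non-XRC QPs.  The
 * xrc_domain field is consulted only when qp_type says it is valid.
 */
static struct demo_xrc_domain *
demo_effective_xrc_domain(const struct demo_qp_init_attr *attr)
{
	assert(attr != NULL);
	if (attr->qp_type != DEMO_QPT_XRC)
		return NULL;
	return attr->xrc_domain;
}
```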

- Jack


From ogerlitz at voltaire.com  Wed Dec  5 23:49:20 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 06 Dec 2007 09:49:20 +0200
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <4756E743.3040900@mellanox.com>
References: <4756E743.3040900@mellanox.com>
Message-ID: <4757A980.2030403@voltaire.com>

Vu Pham wrote:
> My systems are RHEL 5.1 x86-64, 2 Sinai hcas, fw 1.2.0
> I setup bonding as follow:
> IPOIBBOND_ENABLE=yes
> IPOIB_BONDS=bond0
> bond0_IP=11.1.1.1
> bond0_SLAVEs=ib0,ib1
> in /etc/infiniband/openib.conf in order to start ib-bond automatically

Hi Vu,

Please note that in RH5 there's a native support for bonding 
configuration through the initscripts tools (network scripts, etc), see 
section 3.1.2 at the ib-bonding.txt document provided with the bonding 
package.

The persistency mechanism which you have used (e.g. through
/etc/init.d/openibd and /etc/openib.conf) is there only for somewhat old
distributions for which there's no native (*) support for bonding
configuration. Actually, I was thinking we wanted to remove it
altogether, Moni?

(*) Under RH4 the native support is broken for ipoib/bonding, and hence
we patched some of the initscripts scripts.

> I moved our systems back to ofed-1.2.5.4 and tested ib-bond again. We 
> tested it with ib0 and ib1 (connected to different switch/fabric) been 
> on the same subnet (10.2.1.x, 255.255.255.0) and on different subnets 
> (10.2.1.x and 10.3.1.x, 255.255.255.0). In both cases there is the issue 
> of losing communication between the servers if nodes have not been on 
> the same primary ib interface.

Generally speaking, I don't see the point in using bonding for
--high-availability-- where each slave is connected to a different fabric.
This is because when there's a fail-over in one system you also need the
second system to fail over. You would also not be able to count on local
link detection mechanisms, since the remote node must now fail over
even with its local link being perfectly fine. This is true
regardless of the interconnect type.

Am I missing something here regarding your setup?

The question of the use case for bonding over separate fabrics has been
brought to me several times and I gave this answer; no one ever tried to
educate me why it's interesting, maybe you will do so...

Also, what do you mean by "ib0 and ib1 been on the same/different
subnets"? It's only the master device (e.g. bond0, bond1, etc.) which has
an association/configuration with an IP subnet, correct?

> 1. original state: ib0's are the primary on both servers - pinging bond0 
> between the servers is fine

> 2. fail ib0 on one of the servers (ib1 become primary on this server) - 
> pinging bond0 between the servers fails
sure, b/c there's no reason for the remote bonding to issue fail-over

> 3. fail ib0 on the second server (ib1 become primary) - pinging bond0 
> between the servers is fine again
indeed.
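
The three steps can be modelled in a few lines. This is a toy sketch
under one stated assumption: slave 0 of every node sits on one fabric,
slave 1 on another, and the two fabrics are not interconnected. All
names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an active-backup bond with two slaves (0 = ib0, 1 = ib1). */
struct demo_bond {
	int active_slave;
};

/* Two nodes can talk only if their active slaves share a fabric. */
static bool demo_can_ping(const struct demo_bond *a, const struct demo_bond *b)
{
	assert(a != NULL && b != NULL);
	return a->active_slave == b->active_slave;
}

/* Local link failure: switch to the other slave. */
static void demo_fail_over(struct demo_bond *bond)
{
	bond->active_slave ^= 1;
}
```

Starting with both nodes on slave 0, demo_can_ping() is true; after
demo_fail_over() on one node it is false; after demo_fail_over() on
the second node it is true again, matching steps 1 to 3 above.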

Or.


From sashak at voltaire.com  Thu Dec  6 02:45:30 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 6 Dec 2007 10:45:30 +0000
Subject: [ofa-general] Re: [openSM+lash] *** glibc detected *** free():
	invalid next size (fast)
In-Reply-To: <829ded920712052102i3e6a414amd1f9300dc246b24b@mail.gmail.com>
References: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
	<829ded920712052102i3e6a414amd1f9300dc246b24b@mail.gmail.com>
Message-ID: <20071206104530.GE25758@sashak.voltaire.com>

On 10:32 Thu 06 Dec     , Keshetti Mahesh wrote:
> >> " *** glibc detected *** free(): invalid next size (fast):
> >> 0x000000000088cff0 *** "
> > >
> > >gdb backtrace is,
> >>
> >> *** glibc detected *** free(): invalid next size (fast): 0x000000000088cff0 ***
> >>
> >> Program received signal SIGABRT, Aborted.
> >> [Switching to Thread 1126189408 (LWP 19983)]
> >> 0x000000342ec2e2ed in raise () from /lib64/tls/libc.so.6
> >> (gdb)
> >> (gdb) bt
> >> #0  0x000000342ec2e2ed in raise () from /lib64/tls/libc.so.6
> >> #1  0x000000342ec2fa3e in abort () from /lib64/tls/libc.so.6
> >> #2  0x000000342ec62d41 in __libc_message () from /lib64/tls/libc.so.6
> >> #3  0x000000342ec6881e in _int_free () from /lib64/tls/libc.so.6
> >> #4  0x000000342ec68b66 in free () from /lib64/tls/libc.so.6
> >> #5  0x000000000045625e in switch_delete ()
> >> #6  0x00000000004579bf in lash_cleanup ()
> >> #7  0x0000000000457cfa in lash_process ()
> >> #8  0x0000000000451b78 in osm_ucast_mgr_process ()
> >> #9  0x00000000004468af in osm_state_mgr_process ()
> >> #10 0x00000000004472cc in __osm_state_mgr_ctrl_disp_callback ()
> >> #11 0x0000002a9589df8f in __cl_disp_worker (context=0x7fbffff270) at cl_dispatcher.c:102
> >> #12 0x0000002a958a5121 in __cl_thread_pool_routine (context=0x7fbffff2e8) at cl_threadpool.c:74
> >> #13 0x0000002a958a4f6a in __cl_thread_wrapper (arg=0x59fa00) at cl_thread.c:58
> >> #14 0x000000342f10610a in start_thread () from /lib64/tls/libpthread.so.0
> >> #15 0x000000342ecc5ee3 in clone () from /lib64/tls/libc.so.6
> 
> > Or rerun OpenSM under gdb, then look at backtrace (with 'bt' command).
> 
> I have already attached the gdb backtrace in my first mail.

Right, sorry...

Are you able to reproduce the problem?

Sasha


From keshetti85-student at yahoo.co.in  Thu Dec  6 02:40:46 2007
From: keshetti85-student at yahoo.co.in (Keshetti Mahesh)
Date: Thu, 6 Dec 2007 16:10:46 +0530
Subject: [ofa-general] Re: [openSM+lash] *** glibc detected *** free():
	invalid next size (fast)
In-Reply-To: <20071206104530.GE25758@sashak.voltaire.com>
References: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
	<829ded920712052102i3e6a414amd1f9300dc246b24b@mail.gmail.com>
	<20071206104530.GE25758@sashak.voltaire.com>
Message-ID: <829ded920712060240w3c3955b6t6ad1854d32e11986@mail.gmail.com>

> Right, sorry...
>
> Are you able to reproduce the problem?

Yes, most of the time. Sometimes openSM runs successfully with LASH.
As can be seen in the gdb backtrace, there seems to be some problem in the
switch_delete function; maybe a double free is happening.  But I am not
able to find where exactly it is happening in the LASH code
(osm_ucast_lash.c), as I am not well versed in the openSM code.

-Mahesh

>
> Sasha


From vlad at lists.openfabrics.org  Thu Dec  6 02:55:17 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Thu,  6 Dec 2007 02:55:17 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071206-0200 daily build status
Message-ID: <20071206105517.DCA9BE6003A@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.18
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.22
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.21.1
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on ia64 with linux-2.6.12
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ppc64 with linux-2.6.18-8.el5

Failed:


From hrosenstock at xsigo.com  Thu Dec  6 04:41:35 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 06 Dec 2007 04:41:35 -0800
Subject: [ofa-general] Application layer support for RMPP using OFED stack.
In-Reply-To: <99863D2ED484D449811D97A4C44C9CBD5DE043@EPEXCH2.qlogic.org>
References: <99863D2ED484D449811D97A4C44C9CBD5DE043@EPEXCH2.qlogic.org>
Message-ID: <1196944895.30768.430.camel@hrosenstock-ws.xsigo.com>

Hi Anton,

On Wed, 2007-12-05 at 12:15 -0600, Anton Bodner wrote:
>  
> 
> Hello –
> 
>  
> 
> I’m from QLogic Corp, and I’m trying to port some older applications
> to OFED, using the OFED stack and the OFED umad api (specifically –
> opening the /dev/infiniband/umadX, and using read / write).  I am
> using OFED 1.2.5.
> 
>  
> 
> My old application interrogates the SA, and also implements the RMPP
> in the application layer.  In porting this application to OFED, I
> realize that the OFED stack has the capability to do the RMPP on my
> behalf, but that has an adverse effect on my application since my
> infiniband interface in my app transfers 256 byte packets ONLY.  
> 
>  
> 
> Hence – I want my app to do the RMPP, not OFED.

I understand why you want to do this but that was not a design goal. The
design goal was that the kernel always does the RMPP. There were long
discussions of this on the list way back when the approach for this was
being discussed. I think that was over 3 years ago and I've forgotten
all the details and haven't gone back to try to resurrect that
discussion. The bottom line was we wanted one RMPP implementation that
would be more likely to interoperate with other RMPPs. The downside is
the "feature" you are asking about: the ability for userspace to
implement its own RMPP.
 
> So – I disabled RMPP support in OFED stack by registering with RMPP
> version of 0.  This seems effective in stopping OFED stack from doing
> the RMPP for me.   

Yes, but that doesn't mean that userspace can do RMPP, just that your
client is not using RMPP (although in this case it is). You want this to
mean "kernel, don't do RMPP", but that wasn't the design point.

> However – when I try to send the RMPP ack, it fails at the write ().

Yes, that is one of the issues. I think a similar problem also exists on
the receive side.

> I investigated the OFED stack code, and its failing in
> ib_create_send_mad (drivers/infiniband/core/mad.c).  Investigation
> shows that if one registered with RMPP version 0, then the ‘send’
> never allows rmpp to be active…  (code snippet below)

Because that is correct with the current meaning of rmpp_version.

>     if ((!mad_agent->rmpp_version &&
>          (rmpp_active || message_size > sizeof(struct ib_mad))) ||
>         (!rmpp_active && message_size > sizeof(struct ib_mad)))
>     {
>             return ERR_PTR(-EINVAL);
>     }
> 
>  
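
The check quoted above can be restated as a standalone predicate to
see which combinations are rejected. This is a user-space sketch, not
the kernel code: demo_check_send and DEMO_MAD_SIZE are hypothetical
names, and the 256-byte MAD size is taken from the packet size
mentioned earlier in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAD_SIZE 256	/* assumed size of a single MAD, in bytes */

/*
 * Mirror of the quoted condition: reject the send when
 *  - the agent registered with rmpp_version 0 but the MAD has RMPP
 *    active or does not fit in one MAD, or
 *  - RMPP is not active and the message does not fit in one MAD.
 */
static int demo_check_send(unsigned int rmpp_version, bool rmpp_active,
			   size_t message_size)
{
	bool too_big = message_size > DEMO_MAD_SIZE;

	if ((!rmpp_version && (rmpp_active || too_big)) ||
	    (!rmpp_active && too_big))
		return -EINVAL;
	return 0;
}

/* Usage: the combinations discussed in this thread. */
static void demo_usage(void)
{
	/* Registered with version 0, sending an RMPP ACK: rejected. */
	assert(demo_check_send(0, true, DEMO_MAD_SIZE) == -EINVAL);
	/* Registered with version 0, plain single MAD: accepted. */
	assert(demo_check_send(0, false, DEMO_MAD_SIZE) == 0);
	/* Registered with a non-zero version, multi-MAD RMPP send: accepted. */
	assert(demo_check_send(1, true, 4 * DEMO_MAD_SIZE) == 0);
}
```

The first case is the one that makes the write() of an
application-generated RMPP ACK fail.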
> 
> My questions are these:
> 
> 1. Has anyone tried an application layer supported RMPP ?

I suspect not.

>  If so – how did you get past this logic (possible OFED bug??)
> 
>  
> 
> 2. Is the OFED implementation intent to disable RMPP support
> COMPLETELY when registering with a RMPP version of  0?

No.

>  If so, then how does one implement a user level RMPP using the OFED
> stack?  Perhaps no one is doing this at all since the stack already
> does it for you?

That's how it's intended to be used.

> Like I said – the only reason I’m doing this is to port an old/
> existing app to OFED, and there is a considerable level of difficulty
> to rip our app level RMPP support.  

Understood.

> But perhaps this is the alternative I must do…

Not sure if that is the case but that certainly is an alternative and is
likely the most expeditious.

-- Hal

> Thanks
> 
>  
> 
> Anton Bodner
> 
>  
> 
>  Anton Bodner Jr.
> 
> 
> QLogic Corporation
> 
> (610)233-4856
> 
> anton.bodner at qlogic.com
> 
> http://www.qlogic.com


From swise at opengridcomputing.com  Thu Dec  6 04:52:08 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 06 Dec 2007 06:52:08 -0600
Subject: [ofa-general] uDAPL EVD queue length issue
In-Reply-To: <4756E6A1.2010005@ichips.intel.com>
References: <20071203224550.GF11990@opengridcomputing.com>
	<4755AD21.4080001@ichips.intel.com>
	<20071205005711.GE17358@opengridcomputing.com>
	<4756CCFD.1040806@opengridcomputing.com>
	<4756E6A1.2010005@ichips.intel.com>
Message-ID: <4757F078.7000304@opengridcomputing.com>

Arlin Davis wrote:
>>>
>>> I'm running OFED 1.2.5 and using Chelsio.
>>>
>>  From the linux rdma verbs perspective, ibv_create_cq() will create a 
>> cq that is >= the requested depth.  The fact that mthca always bumps 
>> the size up to the next power of 2 isn't something udapl can rely on.
> 
> It doesn't.
> 
> uDAPL passes the users requested qlen directly to the verbs 
> ibv_create_cq call (dapl_openib_cq.c) and then uses the returned qlen to 
> allocate EVD's (dapl_evd_util.c) and a ring buffer 
> (dapl_ring_buffer_util.c) for managing the free and pending events.
> 
> The EVD's are allocated based on returned qlen from verbs and the 
> ring_buffer is allocated based on qlen, next power of 2 minus 1. Unless 
> I am missing something, I don't see how we can get less than what is 
> requested.
> 
>>
>> Here's the crux:  If creating a udapl evd of 256 results in a cq of 
>> 256 and the udapl returns a evd with size 255, then udapl is broken...
> 
> Yes, I agree. But I don't see how this is happening.
> 
> Here is output from my dtest when requesting 256 and verbs returning the 
> same qlen. You can see before and after verbs we get 256, the rbufs are 
> 511, and the query EVD call returns 256 to the user.
> 
>  cq_object_create: (0x519bb0,0x519d00)
> dapls_ib_cq_alloc: evd 0x519bb0 cqlen=256
> dapls_ib_cq_alloc: new_cq 0x519d60 cqlen=256
>  setup_async_cb: ia 0x518270 type 2 hdl 0x519bb0 cb 0x2a957c8e70 ctx 
> 0x519bb0
>  >>> rbuf_alloc: size 256 rsize 511
>  >>> rbuf_alloc: size 256 rsize 511
> dapl_evd_create () returns 0x0
> 
> 9920 dto_req_evd QLEN - requested 256 and actual 256
> 9920 Create events done
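
The "next power of 2 minus 1" sizing that produces rsize 511 for size
256 in the log above can be sketched as follows. This is a
reconstruction from the description and the log output, not a copy of
the uDAPL source; demo_rbuf_rsize is a hypothetical name.

```c
#include <assert.h>

/*
 * Smallest power of two strictly greater than size, minus one.
 * The result is never smaller than the requested size.
 */
static unsigned int demo_rbuf_rsize(unsigned int size)
{
	unsigned int rsize = 1;

	assert(size < (1u << 31));	/* keep the shift below from overflowing */
	while (size > 0) {
		rsize <<= 1;
		size >>= 1;
	}
	return rsize - 1;
}
```

With this rule a request for 256 entries yields 511 slots and a
request for 255 yields 255, so the ring buffer by itself cannot end up
smaller than the requested queue length.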
> 
> Can you turn up DAPL debug(DAPL_DBG_TYPE=0xffff) so I can see what is 
> happening?
> 
> Thanks,
> 
> -arlin
> 
> 
> 
> 

Sorry for the noise guys.  Looks like this is a cxgb3 bug.


Steve.


From fenkes at de.ibm.com  Thu Dec  6 07:07:19 2007
From: fenkes at de.ibm.com (Joachim Fenkes)
Date: Thu, 6 Dec 2007 16:07:19 +0100
Subject: [ofa-general] [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
Message-ID: <200712061607.20004.fenkes@de.ibm.com>

All firmware versions on POWER5 systems have a locking issue in the
HCA-related hCalls that can cause loss of Infiniband connectivity if
allocate and free calls happen in parallel. This may for example be caused
if two processes are using OpenMPI in parallel.
Circumvent this by serializing all HCA-related hCalls on POWER5.

Signed-off-by: Joachim Fenkes <fenkes at de.ibm.com>
---

We tested this patch, especially the autodetection, and it works okay.
Please review and apply for 2.6.24-rc5 - thanks!

 drivers/infiniband/hw/ehca/ehca_main.c |   16 ++++++++++++++++
 drivers/infiniband/hw/ehca/hcp_if.c    |   28 +++++++++++-----------------
 2 files changed, 27 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 90d4334..8f33d06 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -43,6 +43,9 @@
 #ifdef CONFIG_PPC_64K_PAGES
 #include <linux/slab.h>
 #endif
+
+#include <asm/cputable.h>
+
 #include "ehca_classes.h"
 #include "ehca_iverbs.h"
 #include "ehca_mrmw.h"
@@ -66,6 +69,7 @@ int ehca_poll_all_eqs  = 1;
 int ehca_static_rate   = -1;
 int ehca_scaling_code  = 0;
 int ehca_mr_largepage  = 1;
+int ehca_lock_hcalls   = -1;
 
 module_param_named(open_aqp1,     ehca_open_aqp1,     int, S_IRUGO);
 module_param_named(debug_level,   ehca_debug_level,   int, S_IRUGO);
@@ -77,6 +81,7 @@ module_param_named(poll_all_eqs,  ehca_poll_all_eqs,  int, S_IRUGO);
 module_param_named(static_rate,   ehca_static_rate,   int, S_IRUGO);
 module_param_named(scaling_code,  ehca_scaling_code,  int, S_IRUGO);
 module_param_named(mr_largepage,  ehca_mr_largepage,  int, S_IRUGO);
+module_param_named(lock_hcalls,   ehca_lock_hcalls,   bool, S_IRUGO);
 
 MODULE_PARM_DESC(open_aqp1,
 		 "AQP1 on startup (0: no (default), 1: yes)");
@@ -102,6 +107,9 @@ MODULE_PARM_DESC(scaling_code,
 MODULE_PARM_DESC(mr_largepage,
 		 "use large page for MR (0: use PAGE_SIZE (default), "
 		 "1: use large page depending on MR size");
+MODULE_PARM_DESC(lock_hcalls,
+		 "serialize all hCalls made by the driver "
+		 "(default: autodetect)");
 
 DEFINE_RWLOCK(ehca_qp_idr_lock);
 DEFINE_RWLOCK(ehca_cq_idr_lock);
@@ -924,6 +932,14 @@ int __init ehca_module_init(void)
 	printk(KERN_INFO "eHCA Infiniband Device Driver "
 	       "(Version " HCAD_VERSION ")\n");
 
+	/* Autodetect hCall locking -- we can't read the firmware version
+	 * directly, but we know that starting with POWER6, all firmware
+	 * versions are good.
+	 */
+	if (ehca_lock_hcalls == -1)
+		ehca_lock_hcalls = !(cur_cpu_spec->cpu_user_features
+				     & PPC_FEATURE_ARCH_2_05);
+
 	ret = ehca_create_comp_pool();
 	if (ret) {
 		ehca_gen_err("Cannot create comp pool.");
diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c
index c16a213..331b5e8 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -89,6 +89,7 @@
 #define HCALL9_REGS_FORMAT HCALL7_REGS_FORMAT " r11=%lx r12=%lx"
 
 static DEFINE_SPINLOCK(hcall_lock);
+extern int ehca_lock_hcalls;
 
 static u32 get_longbusy_msecs(int longbusy_rc)
 {
@@ -120,26 +121,21 @@ static long ehca_plpar_hcall_norets(unsigned long opcode,
 				    unsigned long arg7)
 {
 	long ret;
-	int i, sleep_msecs, do_lock;
-	unsigned long flags;
+	int i, sleep_msecs;
+	unsigned long flags = 0;
 
 	ehca_gen_dbg("opcode=%lx " HCALL7_REGS_FORMAT,
 		     opcode, arg1, arg2, arg3, arg4, arg5, arg6, arg7);
 
-	/* lock H_FREE_RESOURCE(MR) against itself and H_ALLOC_RESOURCE(MR) */
-	if ((opcode == H_FREE_RESOURCE) && (arg7 == 5)) {
-		arg7 = 0; /* better not upset firmware */
-		do_lock = 1;
-	}
-
 	for (i = 0; i < 5; i++) {
-		if (do_lock)
+		/* serialize hCalls to work around firmware issue */
+		if (ehca_lock_hcalls)
 			spin_lock_irqsave(&hcall_lock, flags);
 
 		ret = plpar_hcall_norets(opcode, arg1, arg2, arg3, arg4,
 					 arg5, arg6, arg7);
 
-		if (do_lock)
+		if (ehca_lock_hcalls)
 			spin_unlock_irqrestore(&hcall_lock, flags);
 
 		if (H_IS_LONG_BUSY(ret)) {
@@ -174,24 +170,22 @@ static long ehca_plpar_hcall9(unsigned long opcode,
 			      unsigned long arg9)
 {
 	long ret;
-	int i, sleep_msecs, do_lock;
+	int i, sleep_msecs;
 	unsigned long flags = 0;
 
 	ehca_gen_dbg("INPUT -- opcode=%lx " HCALL9_REGS_FORMAT, opcode,
 		     arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8, arg9);
 
-	/* lock H_ALLOC_RESOURCE(MR) against itself and H_FREE_RESOURCE(MR) */
-	do_lock = ((opcode == H_ALLOC_RESOURCE) && (arg2 == 5));
-
 	for (i = 0; i < 5; i++) {
-		if (do_lock)
+		/* serialize hCalls to work around firmware issue */
+		if (ehca_lock_hcalls)
 			spin_lock_irqsave(&hcall_lock, flags);
 
 		ret = plpar_hcall9(opcode, outs,
 				   arg1, arg2, arg3, arg4, arg5,
 				   arg6, arg7, arg8, arg9);
 
-		if (do_lock)
+		if (ehca_lock_hcalls)
 			spin_unlock_irqrestore(&hcall_lock, flags);
 
 		if (H_IS_LONG_BUSY(ret)) {
@@ -821,7 +815,7 @@ u64 hipz_h_free_resource_mr(const struct ipz_adapter_handle adapter_handle,
 	return ehca_plpar_hcall_norets(H_FREE_RESOURCE,
 				       adapter_handle.handle,    /* r4 */
 				       mr->ipz_mr_handle.handle, /* r5 */
-				       0, 0, 0, 0, 5);
+				       0, 0, 0, 0, 0);
 }
 
 u64 hipz_h_reregister_pmr(const struct ipz_adapter_handle adapter_handle,
-- 
1.5.2
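
The control flow of the patch can be tried out in user space. The
sketch below is an illustration under stated assumptions, not the
driver code: a C11 atomic_flag stands in for the kernel spinlock, the
firmware_known_good argument stands in for the CPU feature test, and
all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_flag demo_hcall_lock = ATOMIC_FLAG_INIT;
static int demo_lock_hcalls = -1;	/* -1: autodetect, 0: off, 1: on */

/* Resolve the tri-state setting once, at init time. */
static void demo_init(int firmware_known_good)
{
	if (demo_lock_hcalls == -1)
		demo_lock_hcalls = !firmware_known_good;
}

/* Run one call, holding the global lock only when serialization is on. */
static long demo_serialized_call(long (*call)(long), long arg)
{
	long ret;

	assert(demo_lock_hcalls != -1);	/* demo_init() must run first */

	if (demo_lock_hcalls)
		while (atomic_flag_test_and_set(&demo_hcall_lock))
			;	/* spin until the lock is free */

	ret = call(arg);

	if (demo_lock_hcalls)
		atomic_flag_clear(&demo_hcall_lock);

	return ret;
}

/* Trivial stand-in for an hCall. */
static long demo_double(long x)
{
	return 2 * x;
}
```

An explicit 0 or 1 set before demo_init() is left untouched; only the
-1 default is replaced by the autodetected value.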


From arnd at arndb.de  Thu Dec  6 07:48:23 2007
From: arnd at arndb.de (Arnd Bergmann)
Date: Thu, 6 Dec 2007 16:48:23 +0100
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
In-Reply-To: <200712061607.20004.fenkes@de.ibm.com>
References: <200712061607.20004.fenkes@de.ibm.com>
Message-ID: <200712061648.24806.arnd@arndb.de>

On Thursday 06 December 2007, Joachim Fenkes wrote:
>         printk(KERN_INFO "eHCA Infiniband Device Driver "
>                "(Version " HCAD_VERSION ")\n");
>  
> +       /* Autodetect hCall locking -- we can't read the firmware version
> +        * directly, but we know that starting with POWER6, all firmware
> +        * versions are good.
> +        */
> +       if (ehca_lock_hcalls == -1)
> +               ehca_lock_hcalls = !(cur_cpu_spec->cpu_user_features
> +                                    & PPC_FEATURE_ARCH_2_05);
> +
>         ret = ehca_create_comp_pool();
>         if (ret) {
>                 ehca_gen_err("Cannot create comp pool.");

We already talked about this yesterday, but I still feel that checking the
instruction set of the CPU should not be used to determine whether a
specific device driver implementation is used in the hypervisor.

At the very least, I think you should change this to read the hypervisor
version number from the device tree, though the ideal solution would be
to have the absence of this bug encoded in the device node for the ehca
device itself.

Regarding the performance problem, have you checked whether converting all
your spin_lock_irqsave to spin_lock/spin_lock_irq improves your performance
on the older machines? Maybe it's already fast enough that way.

	Arnd <><
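
A minimal userspace sketch of the tri-state autodetect quoted above, for
readers following the thread: -1 means "decide automatically", while 0 and 1
are explicit overrides. The function name resolve_lock_hcalls and the LOCK_*
constants are hypothetical, and FEATURE_ARCH_2_05 is only a stand-in for the
kernel's PPC_FEATURE_ARCH_2_05 bit; this is not the driver's code.

```c
#include <assert.h>

/* Stand-in for PPC_FEATURE_ARCH_2_05 (assumed value, for illustration). */
#define FEATURE_ARCH_2_05 0x00001000UL

/* Hypothetical names for the three states of the module parameter. */
enum { LOCK_AUTO = -1, LOCK_OFF = 0, LOCK_ON = 1 };

/*
 * Resolve the parameter to 0 (no locking) or 1 (serialize hCalls).
 * An explicit 0 or 1 always wins; only LOCK_AUTO consults the CPU
 * feature bits, mirroring the check in the quoted hunk.
 */
static int resolve_lock_hcalls(int param, unsigned long cpu_user_features)
{
	assert(param >= LOCK_AUTO && param <= LOCK_ON);

	if (param != LOCK_AUTO)
		return param;

	/* Feature bit absent -> pre-POWER6 CPU -> assume affected firmware. */
	return !(cpu_user_features & FEATURE_ARCH_2_05);
}
```

With the heuristic isolated in one function like this, replacing the CPU
feature test by a device tree lookup would only change the last line.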


From sashak at voltaire.com  Thu Dec  6 08:06:59 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 6 Dec 2007 16:06:59 +0000
Subject: [ofa-general] Re: [openSM+lash] *** glibc detected *** free():
	invalid next size (fast)
In-Reply-To: <829ded920712060240w3c3955b6t6ad1854d32e11986@mail.gmail.com>
References: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
	<829ded920712052102i3e6a414amd1f9300dc246b24b@mail.gmail.com>
	<20071206104530.GE25758@sashak.voltaire.com>
	<829ded920712060240w3c3955b6t6ad1854d32e11986@mail.gmail.com>
Message-ID: <20071206160659.GG708@sashak.voltaire.com>

On 16:10 Thu 06 Dec     , Keshetti Mahesh wrote:
> > Right, sorry...
> >
> > Are you able to reproduce the problem?
> 
> Yes, most of the time. Sometimes openSM runs successfully with LASH.

Could you send me the output of ibnetdiscover, so that I can rerun
with LASH in the simulator?

> As can be seen in the gdb backtrace, there seems to be some problem in
> the switch_delete function; maybe a double free is happening.

I haven't seen any problem in this area before. Maybe the exact topology
will help to trigger the bug.

Sasha

> But I am not
> able to find where exactly it is happening in the LASH code
> (osm_ucast_lash.c) as
> I am not well versed in the openSM code.
> 
> -Mahesh
> 
> >
> > Sasha


From rdreier at cisco.com  Thu Dec  6 10:27:09 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 06 Dec 2007 10:27:09 -0800
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
In-Reply-To: <200712061648.24806.arnd@arndb.de> (Arnd Bergmann's message of
	"Thu, 6 Dec 2007 16:48:23 +0100")
References: <200712061607.20004.fenkes@de.ibm.com>
	<200712061648.24806.arnd@arndb.de>
Message-ID: <ada7ijrd6gy.fsf@cisco.com>

 > > +               ehca_lock_hcalls = !(cur_cpu_spec->cpu_user_features
 > > +                                    & PPC_FEATURE_ARCH_2_05);

 > We already talked about this yesterday, but I still feel that checking the
 > instruction set of the CPU should not be used to determine whether a
 > specific device driver implementation is used in the hypervisor.

I had the same reaction... is testing cpu_user_features really the
best way to detect this issue?

I'll hold off applying this for a few days so you guys can decide the
best thing to do.  We'll definitely get some fix into 2.6.24 but we
have time to make a good decision.

 > Regarding the performance problem, have you checked whether converting all
 > your spin_lock_irqsave to spin_lock/spin_lock_irq improves your performance
 > on the older machines? Maybe it's already fast enough that way.

It does seem that the only places that the hcall_lock is taken also
use msleep, so they must always be in process context.  So you can
safely just use spin_lock(), right?

 - R.
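
A userspace sketch of the locking pattern under discussion, under stated
assumptions: the lock covers only the firmware call and is dropped before
the caller backs off, which is why the back-off may sleep. A C11 atomic_flag
stands in for the kernel spinlock, back_off() stands in for msleep(), and
fake_hcall() stands in for plpar_hcall9(). The H_* values are placeholders
and every name here is hypothetical rather than taken from the ehca driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define H_SUCCESS   0L
#define H_LONG_BUSY 1L  /* placeholder, not the real return code */
#define MAX_TRIES   5   /* same bound as the for loop in the patch */

typedef long (*hcall_fn)(void *ctx);

static atomic_flag hcall_lock = ATOMIC_FLAG_INIT;
static bool lock_hcalls = true;  /* models the ehca_lock_hcalls switch */
static int backoffs;             /* counts how often we would have slept */

/* Stand-in for msleep(); always called with the lock released. */
static void back_off(void)
{
	backoffs++;
}

/* Call fn under the lock, retrying up to MAX_TRIES while it reports busy. */
static long call_with_retry(hcall_fn fn, void *ctx)
{
	long ret = H_LONG_BUSY;
	int i;

	assert(fn != NULL);

	for (i = 0; i < MAX_TRIES; i++) {
		if (lock_hcalls)
			while (atomic_flag_test_and_set_explicit(
					&hcall_lock, memory_order_acquire))
				; /* spin until the lock is free */

		ret = fn(ctx);

		if (lock_hcalls)
			atomic_flag_clear_explicit(&hcall_lock,
						   memory_order_release);

		if (ret != H_LONG_BUSY)
			return ret;

		back_off(); /* lock is not held here, so sleeping is fine */
	}
	return ret; /* still busy after MAX_TRIES attempts */
}

/* Fake firmware: reports busy busy_left times, then succeeds. */
struct fake_fw {
	int busy_left;
	int calls;
};

static long fake_hcall(void *ctx)
{
	struct fake_fw *fw = ctx;

	fw->calls++;
	if (fw->busy_left > 0) {
		fw->busy_left--;
		return H_LONG_BUSY;
	}
	return H_SUCCESS;
}
```

Userspace cannot model interrupt masking, so the sketch says nothing about
spin_lock() versus spin_lock_irqsave(); it only shows that the critical
section excludes the back-off.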


From sweitzen at cisco.com  Thu Dec  6 10:56:04 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Thu, 6 Dec 2007 10:56:04 -0800
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <4757A980.2030403@voltaire.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A96E64@xmb-sjc-216.amer.cisco.com>

> The persistency mechanism which you have used (eg through 
> /etc/init.d/openibd and /etc/openib.conf) is there only for 
> somehow OLD 
> distributions for which there's no native (*) support for bonding 
> configuration, actually I was thinking we wanted to remove it 
> altogether, Moni?

What is the "it" you want to remove?

Scott


From or.gerlitz at gmail.com  Thu Dec  6 12:26:44 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Thu, 6 Dec 2007 22:26:44 +0200
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A96E64@xmb-sjc-216.amer.cisco.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96E64@xmb-sjc-216.amer.cisco.com>
Message-ID: <15ddcffd0712061226l42cd5ed5ne3b36a3198ba9eaa@mail.gmail.com>

On 12/6/07, Scott Weitzenkamp (sweitzen) <sweitzen at cisco.com> wrote:
> > The persistency mechanism which you have used (eg through
> > /etc/init.d/openibd and /etc/openib.conf) is there only for
> > somehow OLD
> > distributions for which there's no native (*) support for bonding
> > configuration, actually I was thinking we wanted to remove it
> > altogether, Moni?
>
> What is the "it" you want to remove?

Hi Scott,

The mechanism to specify bonding configuration params in /etc/openib.conf
to be input to /etc/init.d/openibd, namely:

IPOIBBOND_ENABLE=yes
IPOIB_BONDS=bond0
bond0_IP=11.1.1.1
bond0_SLAVEs=ib0,ib1

I have talked to Moni; it's only there to enable using bonding in a
persistent fashion on old distributions etc. where the solution provided
by OFED 1.3 is not possible.

So you are using network scripts for bonding configuration?


Or.


From sweitzen at cisco.com  Thu Dec  6 12:54:58 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Thu, 6 Dec 2007 12:54:58 -0800
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <15ddcffd0712061226l42cd5ed5ne3b36a3198ba9eaa@mail.gmail.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96E64@xmb-sjc-216.amer.cisco.com>
	<15ddcffd0712061226l42cd5ed5ne3b36a3198ba9eaa@mail.gmail.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A96F7A@xmb-sjc-216.amer.cisco.com>

I would prefer to leave openib.conf alone.  The old way and new way are
not mutually exclusive, are they?

Scott

 
> -----Original Message-----
> From: Or Gerlitz [mailto:or.gerlitz at gmail.com] 
> Sent: Thursday, December 06, 2007 12:27 PM
> To: Scott Weitzenkamp (sweitzen)
> Cc: Or Gerlitz; Vu Pham; OpenFabrics General
> Subject: Re: [ofa-general] ipoib bonding problems in 
> 1.3-beta2 and 1.2.5.4,
> 
> On 12/6/07, Scott Weitzenkamp (sweitzen) <sweitzen at cisco.com> wrote:
> > > The persistency mechanism which you have used (eg through
> > > /etc/init.d/openibd and /etc/openib.conf) is there only for
> > > somehow OLD
> > > distributions for which there's no native (*) support for bonding
> > > configuration, actually I was thinking we wanted to remove it
> > > altogether, Moni?
> >
> > What is the "it" you want to remove?
> 
> Hi Scott,
> 
> The mechanism to specify bonding configuration params in 
> /etc/openib.conf
> to be input to /etc/init.d/openibd, namely:
> 
> IPOIBBOND_ENABLE=yes
> IPOIB_BONDS=bond0
> bond0_IP=11.1.1.1
> bond0_SLAVEs=ib0,ib1
> 
> I have talked to Moni, its only there to enable using bonding 
> in a persistent
> fashion on old distributions etc where the solution provided by OFED
> 1.3 is not possible.
> 
> So you are using network scripts for bonding configuration?
> 
> 
> 
> Or.
> 


From vuhuong at mellanox.com  Thu Dec  6 13:55:33 2007
From: vuhuong at mellanox.com (Vu Pham)
Date: Thu, 06 Dec 2007 13:55:33 -0800
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <4757A980.2030403@voltaire.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
Message-ID: <47586FD5.1040602@mellanox.com>


>
>
> Please note that in RH5 there's a native support for bonding 
> configuration through the initscripts tools (network scripts, etc), 
> see section 3.1.2 at the ib-bonding.txt document provided with the 
> bonding package.
>

Hi Or,

Thanks for the pointer

>> I moved our systems back to ofed-1.2.5.4 and tested ib-bond again. We 
>> tested it with ib0 and ib1 (connected to different switches/fabrics) 
>> being on the same subnet (10.2.1.x, 255.255.255.0) and on different 
>> subnets (10.2.1.x and 10.3.1.x, 255.255.255.0). In both cases there 
>> is the issue of losing communication between the servers if the nodes 
>> are not on the same primary ib interface.
>
> Generally speaking, I don't see the point in using bonding for 
> --high-availability-- where each slave is connected to a different 
> fabric. This is because when there's a fail-over in one system, the 
> second system also needs to fail over. You would also not be able to 
> count on local link detection mechanisms, since the remote node must 
> now fail over even though its local link is perfectly fine. This is 
> true regardless of the interconnect type.
>
> Am I missing something here regarding your setup?
>
> The question of the use case for bonding over separate fabrics has been 
> brought to me several times and I gave this answer; no one ever tried 
> to educate me on why it's interesting, maybe you will do so...
>
>

I don't have a good reason. I used a configuration with two separate 
fabrics because of my limited understanding of ethernet/ib bonding and 
the old methodology of redundancy in ethernet & FC using two separate fabrics.

> Also, what do you mean by "ib0 and ib1 being on the same/different 
> subnets"? It's only the master device (e.g. bond0, bond1, etc.) which 
> has an association/configuration with an IP subnet, correct?

I talked about ib0 and ib1 subnets because I set up bonding using 
openib.conf and openibd. I understand now that we don't need to set up 
ib0 and ib1 using the distribution initscripts in order to set up bonding.

thanks for your explanation

-vu


From swise at opengridcomputing.com  Thu Dec  6 15:15:08 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 06 Dec 2007 17:15:08 -0600
Subject: [ofa-general] [GIT PULL ofed-1.3] cxgb3: backports remove 'ethtool
	-S' support.
Message-ID: <4758827C.8010100@opengridcomputing.com>

Vlad,

The patch below fixes broken chelsio backports for ofed-1.3.

Please pull from:

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

Thanks,

Steve.
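
For readers unfamiliar with the two ethtool ops involved, here is a minimal
userspace sketch of the signature difference the backport has to bridge.
Newer kernels ask the driver for a count per string set (get_sset_count),
while older kernels only have a stats-specific op (get_stats_count). The
function shapes follow the patch below; the struct net_device forward
declaration, the three sample stat names and the local ETH_* definitions are
stand-ins for the kernel headers, not the driver's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local stand-ins for definitions that live in the kernel headers. */
#define ETH_GSTRING_LEN 32
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
enum { ETH_SS_TEST = 0, ETH_SS_STATS = 1 };

struct net_device; /* opaque here; only passed through */

/* Sample names only; the real driver table is much longer. */
static const char stats_strings[][ETH_GSTRING_LEN] = {
	"TxOctetsOK",
	"TxFramesOK",
	"RxOctetsOK",
};

/* Newer interface: one op answers for any string set. */
static int get_sset_count(struct net_device *dev, int sset)
{
	(void)dev;

	switch (sset) {
	case ETH_SS_STATS:
		return ARRAY_SIZE(stats_strings);
	default:
		return -EOPNOTSUPP;
	}
}

/* Older interface: a dedicated op that can only mean "stats". */
static int get_stats_count(struct net_device *dev)
{
	(void)dev;
	return ARRAY_SIZE(stats_strings);
}
```

Both functions report the same number for the statistics set, so a backport
that swaps one op for the other keeps 'ethtool -S' working, whereas dropping
the op leaves the old kernel with no way to size the stats buffer.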


---------

From: Steve Wise <swise at opengridcomputing.com>

cxgb3: backports remove 'ethtool -S' support.

I mistakenly removed the get_stats_count ethtool op for
cxgb3.  The real backport is to change its signature...

Signed-off-by: Steve Wise <swise at opengridcomputing.com>
---

  .../backport/2.6.12/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.13/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.14/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.15/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../2.6.15_ubuntu606/cxgb3_0200_sset.patch         |   17 +++++++++--------
  .../backport/2.6.16/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.16_sles10/cxgb3_0200_sset.patch   |   17 +++++++++--------
  .../2.6.16_sles10_sp1/cxgb3_0200_sset.patch        |   17 +++++++++--------
  .../backport/2.6.17/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.18-EL5.1/cxgb3_0200_sset.patch    |   17 +++++++++--------
  .../backport/2.6.18/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.18_FC6/cxgb3_0200_sset.patch      |   17 +++++++++--------
  .../backport/2.6.19/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.20/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.21/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.22/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.23/cxgb3_0200_sset.patch          |   17 +++++++++--------
  .../backport/2.6.9_U4/cxgb3_0200_sset.patch        |   17 +++++++++--------
  .../backport/2.6.9_U5/cxgb3_0200_sset.patch        |   17 +++++++++--------
  19 files changed, 171 insertions(+), 152 deletions(-)

diff --git a/kernel_patches/backport/2.6.12/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.12/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.12/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.12/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.13/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.13/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.13/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.13/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.14/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.14/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.14/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.14/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.15/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.15/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.15/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.15/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.15_ubuntu606/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.15_ubuntu606/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.15_ubuntu606/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.15_ubuntu606/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.16/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.16/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.16/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.16/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.16_sles10/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.16_sles10/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.16_sles10/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.16_sles10/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.16_sles10_sp1/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.16_sles10_sp1/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.16_sles10_sp1/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.16_sles10_sp1/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.17/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.17/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.17/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.17/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.18-EL5.1/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.18-EL5.1/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.18-EL5.1/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.18-EL5.1/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.18/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.18/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.18/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.18/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.18_FC6/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.18_FC6/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.18_FC6/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.18_FC6/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.19/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.19/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.19/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.19/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.20/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.20/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.20/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.20/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.21/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.21/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.21/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.21/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.22/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.22/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.22/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.22/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.23/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.23/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.23/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.23/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.9_U4/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.9_U4/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.9_U4/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.9_U4/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
diff --git a/kernel_patches/backport/2.6.9_U5/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.9_U5/cxgb3_0200_sset.patch
index e331411..dde776e 100644
--- a/kernel_patches/backport/2.6.9_U5/cxgb3_0200_sset.patch
+++ b/kernel_patches/backport/2.6.9_U5/cxgb3_0200_sset.patch
@@ -1,29 +1,30 @@
  diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
-index 61ffc92..676df2f 100644
+index 61ffc92..57ffa8e 100644
  --- a/drivers/net/cxgb3/cxgb3_main.c
  +++ b/drivers/net/cxgb3/cxgb3_main.c
-@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
+@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = {

   };

  -static int get_sset_count(struct net_device *dev, int sset)
--{
++static int get_stats_count(struct net_device *dev)
+ {
  -	switch (sset) {
  -	case ETH_SS_STATS:
  -		return ARRAY_SIZE(stats_strings);
  -	default:
  -		return -EOPNOTSUPP;
  -	}
--}
--
- #define T3_REGMAP_SIZE (3 * 1024)
++	return ARRAY_SIZE(stats_strings);
+ }

- static int get_regs_len(struct net_device *dev)
-@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
+ #define T3_REGMAP_SIZE (3 * 1024)
+@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
   	.get_strings = get_strings,
   	.phys_id = cxgb3_phys_id,
   	.nway_reset = restart_autoneg,
  -	.get_sset_count = get_sset_count,
++	.get_stats_count = get_stats_count,
   	.get_ethtool_stats = get_stats,
   	.get_regs_len = get_regs_len,
   	.get_regs = get_regs,
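
For readers comparing the two callback shapes this backport switches between, here is a minimal userspace sketch. It is an illustration only: every `sketch_`/`SKETCH_` name is hypothetical, the string table is placeholder data, and the `struct net_device *` argument of the real callbacks is left out so the block compiles outside the kernel.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_GSTRING_LEN 32            /* stand-in for ETH_GSTRING_LEN */
#define SKETCH_SS_STATS    1             /* stand-in for ETH_SS_STATS */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder statistics names; the real driver table is much longer. */
static const char sketch_stats_strings[][SKETCH_GSTRING_LEN] = {
	"stat_a", "stat_b", "stat_c",
};

/* Newer interface shape: one callback answers for any string set. */
static int sketch_get_sset_count(int sset)
{
	switch (sset) {
	case SKETCH_SS_STATS:
		return (int)SKETCH_ARRAY_SIZE(sketch_stats_strings);
	default:
		return -EOPNOTSUPP;
	}
}

/* Older interface shape: a dedicated callback for the statistics count. */
static int sketch_get_stats_count(void)
{
	return (int)SKETCH_ARRAY_SIZE(sketch_stats_strings);
}
```

Both functions report the same count for the statistics set; the older shape simply has no way to be asked about any other set.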


From rdreier at cisco.com  Thu Dec  6 15:27:06 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 06 Dec 2007 15:27:06 -0800
Subject: [ofa-general] Re: [PATCH] IPOIB: use LRO
In-Reply-To: <1196783200.16214.16.camel@mtls03> (Eli Cohen's message of "Tue,
	04 Dec 2007 17:46:40 +0200")
References: <1196783200.16214.16.camel@mtls03>
Message-ID: <adazlwnbe0l.fsf@cisco.com>

 > TODO:
 > add checksum offload support to the core and hw devices.

Given this I assume this is just an RFC and you don't expect this to
be merged as-is, right?

 > +static int get_skb_hdr(struct sk_buff *skb, void **iphdr,
 > +		       void **tcph, u64 *hdr_flags, void *priv)
 > +{
 > +	unsigned int ip_len;
 > +	struct iphdr *iph;
 > +
 > +	/* FIXME - verify CQE checksum ??? */
 > +
 > +	/* non tcp packet */

I don't understand this comment here.

 > +	skb_reset_network_header(skb);
 > +	iph = ip_hdr(skb);
 > +	if (iph->protocol != IPPROTO_TCP)
 > +		return -1;
 > +
 > +	ip_len = ip_hdrlen(skb);
 > +	skb_set_transport_header(skb, ip_len);
 > +	*tcph = tcp_hdr(skb);
 > +
 > +	/* check if ip header and tcp header are complete */
 > +	if (iph->tot_len < ip_len + tcp_hdrlen(skb))
 > +		return -1;
 > +
 > +	*hdr_flags = LRO_IPV4 | LRO_TCP;

I don't see anywhere that you test the ethertype for IPv4 vs. IPv6.
So how do you know you have an IPv4 packet here?  I guess you need the
check before you use ip_hdr() above.
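
A minimal userspace sketch of the ordering suggested above, under stated assumptions: it works on a raw byte buffer rather than an skb, the function and macro names are hypothetical, and the ethertype is assumed to have been read from the link-layer header already, in host byte order. It also reads the IPv4 total-length field as big-endian, since that is how the field appears on the wire.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_P_IP    0x0800  /* EtherType value for IPv4 */
#define SKETCH_IPPROTO_TCP 6       /* IP protocol number for TCP */

/*
 * Hypothetical helper: returns 0 and fills in both header lengths if buf
 * starts with a complete IPv4 header followed by a complete TCP header,
 * and -1 otherwise.
 */
static int sketch_check_ipv4_tcp(uint16_t ethertype, const uint8_t *buf,
				 size_t len, size_t *ip_hdr_len,
				 size_t *tcp_hdr_len)
{
	size_t ihl, doff, tot_len;

	/* Check the link-layer type before treating buf as an IPv4 header. */
	if (ethertype != SKETCH_ETH_P_IP)
		return -1;
	if (len < 20 || (buf[0] >> 4) != 4)
		return -1;

	ihl = (size_t)(buf[0] & 0x0f) * 4;      /* IPv4 header length in bytes */
	if (ihl < 20 || len < ihl + 20)
		return -1;
	if (buf[9] != SKETCH_IPPROTO_TCP)       /* protocol field */
		return -1;

	/* Total length is stored big-endian; convert before comparing. */
	tot_len = ((size_t)buf[2] << 8) | buf[3];
	doff = (size_t)(buf[ihl + 12] >> 4) * 4; /* TCP data offset in bytes */
	if (doff < 20 || tot_len < ihl + doff || len < ihl + doff)
		return -1;

	*ip_hdr_len = ihl;
	*tcp_hdr_len = doff;
	return 0;
}
```

A caller would pass the ethertype and the start of the network header, and treat -1 as "do not aggregate this packet".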

 > +	*iphdr = iph;
 > +
 > +	return 0;


From rick.jones2 at hp.com  Thu Dec  6 15:43:30 2007
From: rick.jones2 at hp.com (Rick Jones)
Date: Thu, 06 Dec 2007 15:43:30 -0800
Subject: [ofa-general] Re: [PATCH] IPOIB: use LRO
In-Reply-To: <adazlwnbe0l.fsf@cisco.com>
References: <1196783200.16214.16.camel@mtls03> <adazlwnbe0l.fsf@cisco.com>
Message-ID: <47588922.1070503@hp.com>

A question from the peanut gallery...

Not that it shouldn't be enabled, but is there much bang for the buck 
from LRO for a large MTU interface?

rick jones


From rdreier at cisco.com  Thu Dec  6 15:47:46 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 06 Dec 2007 15:47:46 -0800
Subject: [ofa-general] Re: [PATCH] IPOIB: use LRO
In-Reply-To: <47588922.1070503@hp.com> (Rick Jones's message of "Thu,
	06 Dec 2007 15:43:30 -0800")
References: <1196783200.16214.16.camel@mtls03> <adazlwnbe0l.fsf@cisco.com>
	<47588922.1070503@hp.com>
Message-ID: <adave7bbd25.fsf@cisco.com>

 > Not that it shouldn't be enabled, but is there much bang for the buck
 > from LRO for a large MTU interface?

I think there's not much point to LRO except when not using connected
mode.  Especially since checksum offload doesn't work for connected mode.
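
As a rough back-of-the-envelope illustration of why the MTU matters here, the sketch below counts how many full-sized TCP segments it takes to carry a given payload. The helper name is hypothetical, and it assumes 20-byte IPv4 and 20-byte TCP headers with no options; the MTUs in the note that follows (2044 for datagram mode, 65520 for connected mode) are the commonly quoted IPoIB values, not numbers taken from this thread.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of TCP segments needed to carry
 * payload_bytes over a link with the given MTU, assuming 40 bytes of
 * IPv4 + TCP headers per segment. Returns 0 if the MTU leaves no room
 * for payload.
 */
static size_t sketch_segments_needed(size_t payload_bytes, size_t mtu)
{
	const size_t hdrs = 40;
	size_t mss;

	if (mtu <= hdrs)
		return 0;
	mss = mtu - hdrs;
	/* Round up so a partial final segment is counted. */
	return (payload_bytes + mss - 1) / mss;
}
```

For a 1 MiB transfer this gives 524 segments at an MTU of 2044 and 17 segments at an MTU of 65520, so there are far fewer packets for LRO to coalesce in the large-MTU case.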


From rick.jones2 at hp.com  Thu Dec  6 16:17:38 2007
From: rick.jones2 at hp.com (Rick Jones)
Date: Thu, 06 Dec 2007 16:17:38 -0800
Subject: [ofa-general] Re: [PATCH] IPOIB: use LRO
In-Reply-To: <adave7bbd25.fsf@cisco.com>
References: <1196783200.16214.16.camel@mtls03>
	<adazlwnbe0l.fsf@cisco.com>	<47588922.1070503@hp.com>
	<adave7bbd25.fsf@cisco.com>
Message-ID: <47589122.1070109@hp.com>

Roland Dreier wrote:
>  > Not that it shouldn't be enabled, but is there much bang for the buck
>  > from LRO for a large MTU interface?
> 
> I think there's not much point to LRO except when not using connected
> mode.  Especially since checksum offload doesn't work for connected mode.

IIRC "the linux stack" does have integrated copy and checksum on 
receive, at least in some cases.  I suppose that if LRO would coalesce 
several sub-MSS segments it might still provide value but I wasn't sure 
if it would and if so how much that buys one.  My IPoIB setup is getting 
long-in-the-tooth and isn't even up at the moment :(

rick jones


From kliteyn at mellanox.co.il  Thu Dec  6 21:16:46 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 7 Dec 2007 07:16:46 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-07:normal completion
Message-ID: <MTLEXCH0187Im6TlLeE00009f11@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-06
OpenSM git rev = Wed_Dec_5_21:24:42_2007 [91248a71754006c032536a1578a311137a8ab240]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=520  Fail=0
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From keshetti85-student at yahoo.co.in  Thu Dec  6 21:52:36 2007
From: keshetti85-student at yahoo.co.in (Keshetti Mahesh)
Date: Fri, 7 Dec 2007 11:22:36 +0530
Subject: [ofa-general] Re: [openSM+lash] *** glibc detected *** free():
	invalid next size (fast)
In-Reply-To: <20071206160659.GG708@sashak.voltaire.com>
References: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
	<829ded920712052102i3e6a414amd1f9300dc246b24b@mail.gmail.com>
	<20071206104530.GE25758@sashak.voltaire.com>
	<829ded920712060240w3c3955b6t6ad1854d32e11986@mail.gmail.com>
	<20071206160659.GG708@sashak.voltaire.com>
Message-ID: <829ded920712062152q20bf3e14ka61deb2106195529@mail.gmail.com>

> Could you send me output of ibnetdiscover, so I will be able to rerun
> with lash in simulator.

Please find attached the topology file I used while testing openSM with LASH.
The same problem occurs with the simulator too. I have also observed one more
thing: the first time you run openSM with LASH, it aborts because of a glibc
invalid free error. On the next run, openSM with LASH successfully reaches the
SUBNET UP stage, but the same error (glibc "free(): invalid next size")
appears while openSM is shutting down.

-Mahesh

> I didn't see any problem in this area before. Maybe exact topology will
> help to trigger the bug.
>
> Sasha


From arnd at arndb.de  Fri Dec  7 01:58:37 2007
From: arnd at arndb.de (Arnd Bergmann)
Date: Fri, 7 Dec 2007 10:58:37 +0100
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
In-Reply-To: <ada7ijrd6gy.fsf@cisco.com>
References: <200712061607.20004.fenkes@de.ibm.com>
	<200712061648.24806.arnd@arndb.de> <ada7ijrd6gy.fsf@cisco.com>
Message-ID: <200712071058.38416.arnd@arndb.de>

On Thursday 06 December 2007, Roland Dreier wrote:
>  > Regarding the performance problem, have you checked whether converting all
>  > your spin_lock_irqsave to spin_lock/spin_lock_irq improves your performance
>  > on the older machines? Maybe it's already fast enough that way.
> 
> It does seem that the only places that the hcall_lock is taken also
> use msleep, so they must always be in process context.  So you can
> safely just use spin_lock(), right?

I think it needs some more inspection. The msleep in there is only called
for hcalls that return H_IS_LONG_BUSY(). In theory, you can call
ehca_plpar_hcall_norets() from inside an interrupt handler if the
hcall in question never returns long busy.

	Arnd <><


From vlad at lists.openfabrics.org  Fri Dec  7 02:53:33 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Fri,  7 Dec 2007 02:53:33 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071207-0200 daily build status
Message-ID: <20071207105333.36C77E6004C@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.19
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.13
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.22
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.15
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.15
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.14
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:


From bart.vanassche at gmail.com  Fri Dec  7 05:43:21 2007
From: bart.vanassche at gmail.com (Bart Van Assche)
Date: Fri, 7 Dec 2007 14:43:21 +0100
Subject: [ofa-general] Very high CPU load during RDMA communication
Message-ID: <e2e108260712070543s79450802ue0deda6446c3120e@mail.gmail.com>

A newbie question about InfiniBand: while running an RDMA bandwidth
test I noticed that the RDMA communication occupied 99% CPU time,
which is much more than I expected for InfiniBand. Is this normal or
should I check my setup?

How the test was run:
(host 1) ib_rdma_bw -s 104857600
(host 2) time ib_rdma_bw -s 104857600 192.168.64.7
Transfer speed: 934 MB/s.
Elapsed time: 107.613s real, 107.150s user, 0.418s system.

Hardware: eight Intel Xeon CPUs (1.6GHz); 16 GB RAM; Mellanox MT25204
HCA (MemFree).
Software: 2.6.22.12 Linux kernel in 32-bit mode + OFED-1.2.5.4 userspace tools.

Regards,

Bart Van Assche.


From diego.guella at sircomtech.com  Fri Dec  7 06:05:08 2007
From: diego.guella at sircomtech.com (Diego Guella)
Date: Fri, 7 Dec 2007 15:05:08 +0100
Subject: [ofa-general] Very high CPU load during RDMA communication
References: <e2e108260712070543s79450802ue0deda6446c3120e@mail.gmail.com>
Message-ID: <002d01c838da$2e3aa380$05c8a8c0@DIEGO>


----- Original Message ----- 
>A newbie question about  InfiniBand: while running an RDMA bandwidth
> test I noticed that the RDMA communication occupied 99% CPU time,
> which is much more than I expected for InfiniBand. Is this normal or
> should I check my setup ?
>
> How the test was run:
> (host 1) ib_rdma_bw -s 104857600
> (host 2) time ib_rdma_bw -s 104857600 192.168.64.7

I think it's normal; I see the same behavior here.
If you look at the ib_rdma_bw code, you can see that the program waits in
a while() loop, polling the CQ for the WC to appear.
I think it's coded that way to get the WC as soon as possible and report a
more precise time (which is then used to compute the bandwidth).

If you don't want that behavior in your application, you could use event
notification for the CQ.
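
To illustrate the difference between the two waiting strategies, here is
a minimal sketch in plain POSIX C. It does not use libibverbs and is not
taken from ib_rdma_bw: completion_ready() is a hypothetical stand-in for
polling the CQ, and nanosleep() stands in for blocking until a CQ event
arrives. It only shows why a polling loop is charged nearly all of the
elapsed time as CPU time, while a blocking wait is charged almost none.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current value of the given clock, in nanoseconds. */
int64_t now_ns(clockid_t clk)
{
	struct timespec ts;
	int rc = clock_gettime(clk, &ts);

	assert(rc == 0);
	(void)rc;	/* silence unused warning when NDEBUG is set */
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical stand-in for "a work completion has appeared on the CQ". */
int completion_ready(int64_t deadline_ns)
{
	return now_ns(CLOCK_MONOTONIC) >= deadline_ns;
}

/* Spin until the completion shows up; returns CPU time consumed (ns). */
int64_t cpu_ns_busy_poll(int64_t delay_ns)
{
	int64_t deadline = now_ns(CLOCK_MONOTONIC) + delay_ns;
	int64_t cpu_start = now_ns(CLOCK_PROCESS_CPUTIME_ID);

	while (!completion_ready(deadline))
		;	/* burn CPU, like polling in a while() loop */

	return now_ns(CLOCK_PROCESS_CPUTIME_ID) - cpu_start;
}

/* Sleep until the completion shows up; returns CPU time consumed (ns). */
int64_t cpu_ns_blocking_wait(int64_t delay_ns)
{
	int64_t deadline = now_ns(CLOCK_MONOTONIC) + delay_ns;
	int64_t cpu_start = now_ns(CLOCK_PROCESS_CPUTIME_ID);
	struct timespec nap = {
		.tv_sec = delay_ns / 1000000000LL,
		.tv_nsec = delay_ns % 1000000000LL,
	};

	while (!completion_ready(deadline))
		nanosleep(&nap, NULL);	/* stand-in for blocking on a CQ event */

	return now_ns(CLOCK_PROCESS_CPUTIME_ID) - cpu_start;
}
```

Calling both functions with the same delay (say 50 ms) from main() and
printing the results should show the polling variant using roughly the
whole delay as CPU time. That matches the "time" output quoted above,
where user time is almost equal to real time. In a verbs application the
blocking path would typically be built on ibv_req_notify_cq() and
ibv_get_cq_event() with a completion channel, at the cost of some extra
latency per completion.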


Regards,

Diego


From arthur.jones at qlogic.com  Fri Dec  7 08:00:33 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Fri, 07 Dec 2007 08:00:33 -0800
Subject: [ofa-general] [PATCH] IB/ipath -- cleanups for 2.6.25
Message-ID: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>

hi roland,  here are some cleanup patches and
a bugfix for 2.6.25.  the bugfix might be 2.6.24
material, though we don't see it in anything but
synthetic tests -- i'll let you decide.

the patch:

[PATCH 2/6] IB/ipath - fix sendctrl locking

should fix the issue that andrew morton found:

Re: bitops are only defined on unsigned longs...

as it deletes that line...

these changes can be pulled from:

git://git.qlogic.com/ipath-linux-2.6 for-roland

arthur


From arthur.jones at qlogic.com  Fri Dec  7 08:00:38 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Fri, 07 Dec 2007 08:00:38 -0800
Subject: [ofa-general] [PATCH 1/6] IB/ipath - remove dead code for user
	process waiting for send buffer
In-Reply-To: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
References: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071207160038.13957.10132.stgit@eng-46.internal.keyresearch.com>

From: Ralph Campbell <ralph.campbell at qlogic.com>

At one point in time there was code to allow a user process to
wait for a send buffer if none were available. This feature was
never used and most of the code was removed. This removes
some missed unused code.

Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_intr.c   |   26 --------------------------
 drivers/infiniband/hw/ipath/ipath_kernel.h |    4 ----
 2 files changed, 0 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c
index 8f3718c..ad41ccc 100644
--- a/drivers/infiniband/hw/ipath/ipath_intr.c
+++ b/drivers/infiniband/hw/ipath/ipath_intr.c
@@ -920,29 +920,6 @@ static noinline void ipath_bad_regread(struct ipath_devdata *dd)
 	}
 }
 
-static void handle_port_pioavail(struct ipath_devdata *dd)
-{
-	u32 i;
-	/*
-	 * start from port 1, since for now port 0  is never using
-	 * wait_event for PIO
-	 */
-	for (i = 1; dd->ipath_portpiowait && i < dd->ipath_cfgports; i++) {
-		struct ipath_portdata *pd = dd->ipath_pd[i];
-
-		if (pd && pd->port_cnt &&
-		    dd->ipath_portpiowait & (1U << i)) {
-			clear_bit(i, &dd->ipath_portpiowait);
-			if (test_bit(IPATH_PORT_WAITING_PIO,
-				     &pd->port_flag)) {
-				clear_bit(IPATH_PORT_WAITING_PIO,
-					  &pd->port_flag);
-				wake_up_interruptible(&pd->port_wait);
-			}
-		}
-	}
-}
-
 static void handle_layer_pioavail(struct ipath_devdata *dd)
 {
 	int ret;
@@ -1195,9 +1172,6 @@ irqreturn_t ipath_intr(int irq, void *data)
 		ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
 				 dd->ipath_sendctrl);
 
-		if (dd->ipath_portpiowait)
-			handle_port_pioavail(dd);
-
 		handle_layer_pioavail(dd);
 	}
 
diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h
index a6e7a60..728eb39 100644
--- a/drivers/infiniband/hw/ipath/ipath_kernel.h
+++ b/drivers/infiniband/hw/ipath/ipath_kernel.h
@@ -457,8 +457,6 @@ struct ipath_devdata {
 	unsigned long ipath_rcvctrl;
 	/* shadow kr_sendctrl */
 	unsigned long ipath_sendctrl;
-	/* ports waiting for PIOavail intr */
-	unsigned long ipath_portpiowait;
 	unsigned long ipath_lastcancel; /* to not count armlaunch after cancel */
 
 	/* value we put in kr_rcvhdrcnt */
@@ -759,8 +757,6 @@ int ipath_set_rx_pol_inv(struct ipath_devdata *dd, u8 new_pol_inv);
 /* portdata flag bit offsets */
 		/* waiting for a packet to arrive */
 #define IPATH_PORT_WAITING_RCV   2
-		/* waiting for a PIO buffer to be available */
-#define IPATH_PORT_WAITING_PIO   3
 		/* master has not finished initializing */
 #define IPATH_PORT_MASTER_UNINIT 4
 		/* waiting for an urgent packet to arrive */


From arthur.jones at qlogic.com  Fri Dec  7 08:00:43 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Fri, 07 Dec 2007 08:00:43 -0800
Subject: [ofa-general] [PATCH 2/6] IB/ipath - fix sendctrl locking
In-Reply-To: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
References: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071207160043.13957.47517.stgit@eng-46.internal.keyresearch.com>

From: John Gregor <john.gregor at qlogic.com>

An internal code review pointed out that the locking around uses of
ipath_sendctrl and kr_sendctrl were, in several places, incorrect
and/or inconsistent.

Signed-off-by: John Gregor <john.gregor at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_driver.c    |   40 ++++++++++++++++---------
 drivers/infiniband/hw/ipath/ipath_file_ops.c  |   10 ++++--
 drivers/infiniband/hw/ipath/ipath_init_chip.c |   24 ++++++++++-----
 drivers/infiniband/hw/ipath/ipath_intr.c      |   19 ++++++++++--
 drivers/infiniband/hw/ipath/ipath_kernel.h    |    1 +
 drivers/infiniband/hw/ipath/ipath_ruc.c       |    7 ++++
 6 files changed, 72 insertions(+), 29 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c
index 1f152de..5b84be1 100644
--- a/drivers/infiniband/hw/ipath/ipath_driver.c
+++ b/drivers/infiniband/hw/ipath/ipath_driver.c
@@ -800,31 +800,37 @@ void ipath_disarm_piobufs(struct ipath_devdata *dd, unsigned first,
 			  unsigned cnt)
 {
 	unsigned i, last = first + cnt;
-	u64 sendctrl, sendorig;
+	unsigned long flags;
 
 	ipath_cdbg(PKT, "disarm %u PIObufs first=%u\n", cnt, first);
-	sendorig = dd->ipath_sendctrl;
 	for (i = first; i < last; i++) {
-		sendctrl = sendorig  | INFINIPATH_S_DISARM |
-			(i << INFINIPATH_S_DISARMPIOBUF_SHIFT);
+		spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
+		/*
+		 * The disarm-related bits are write-only, so it
+		 * is ok to OR them in with our copy of sendctrl
+		 * while we hold the lock.
+		 */
 		ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
-				 sendctrl);
+			dd->ipath_sendctrl | INFINIPATH_S_DISARM |
+			(i << INFINIPATH_S_DISARMPIOBUF_SHIFT));
+		/* can't disarm bufs back-to-back per iba7220 spec */
+		ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+		spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
 	}
 
 	/*
-	 * Write it again with current value, in case ipath_sendctrl changed
-	 * while we were looping; no critical bits that would require
-	 * locking.
-	 *
-	 * disable PIOAVAILUPD, then re-enable, reading scratch in
+	 * Disable PIOAVAILUPD, then re-enable, reading scratch in
 	 * between.  This seems to avoid a chip timing race that causes
-	 * pioavail updates to memory to stop.
+	 * pioavail updates to memory to stop.  We xor as we don't
+	 * know the state of the bit when we're called.
 	 */
+	spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
-			 sendorig & ~INFINIPATH_S_PIOBUFAVAILUPD);
-	sendorig = ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+		dd->ipath_sendctrl ^ INFINIPATH_S_PIOBUFAVAILUPD);
+	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
 			 dd->ipath_sendctrl);
+	spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
 }
 
 /**
@@ -2053,6 +2059,8 @@ void ipath_set_led_override(struct ipath_devdata *dd, unsigned int val)
  */
 void ipath_shutdown_device(struct ipath_devdata *dd)
 {
+	unsigned long flags;
+
 	ipath_dbg("Shutting down the device\n");
 
 	dd->ipath_flags |= IPATH_LINKUNK;
@@ -2073,9 +2081,13 @@ void ipath_shutdown_device(struct ipath_devdata *dd)
 	 * gracefully stop all sends allowing any in progress to trickle out
 	 * first.
 	 */
-	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl, 0ULL);
+	spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
+	dd->ipath_sendctrl = 0;
+	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl, dd->ipath_sendctrl);
 	/* flush it */
 	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+	spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
+
 	/*
 	 * enough for anything that's going to trickle out to have actually
 	 * done so.
diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c
index 5de3243..92dae6f 100644
--- a/drivers/infiniband/hw/ipath/ipath_file_ops.c
+++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c
@@ -2149,11 +2149,15 @@ static int ipath_get_slave_info(struct ipath_portdata *pd,
 
 static int ipath_force_pio_avail_update(struct ipath_devdata *dd)
 {
-	u64 reg = dd->ipath_sendctrl;
+	unsigned long flags;
 
-	clear_bit(IPATH_S_PIOBUFAVAILUPD, &reg);
-	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl, reg);
+	spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
+	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
+		dd->ipath_sendctrl & ~INFINIPATH_S_PIOBUFAVAILUPD);
+	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl, dd->ipath_sendctrl);
+	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+	spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
 
 	return 0;
 }
diff --git a/drivers/infiniband/hw/ipath/ipath_init_chip.c b/drivers/infiniband/hw/ipath/ipath_init_chip.c
index 9e9d6fa..1c65ab9 100644
--- a/drivers/infiniband/hw/ipath/ipath_init_chip.c
+++ b/drivers/infiniband/hw/ipath/ipath_init_chip.c
@@ -345,7 +345,7 @@ static int init_chip_first(struct ipath_devdata *dd,
 		       dd->ipath_piobcnt2k, dd->ipath_pio2kbase);
 
 	spin_lock_init(&dd->ipath_tid_lock);
-
+	spin_lock_init(&dd->ipath_sendctrl_lock);
 	spin_lock_init(&dd->ipath_gpio_lock);
 	spin_lock_init(&dd->ipath_eep_st_lock);
 	mutex_init(&dd->ipath_eep_lock);
@@ -372,9 +372,9 @@ static int init_chip_reset(struct ipath_devdata *dd,
 	*pdp = dd->ipath_pd[0];
 	/* ensure chip does no sends or receives while we re-initialize */
 	dd->ipath_control = dd->ipath_sendctrl = dd->ipath_rcvctrl = 0U;
-	ipath_write_kreg(dd, dd->ipath_kregs->kr_rcvctrl, 0);
-	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl, 0);
-	ipath_write_kreg(dd, dd->ipath_kregs->kr_control, 0);
+	ipath_write_kreg(dd, dd->ipath_kregs->kr_rcvctrl, dd->ipath_rcvctrl);
+	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl, dd->ipath_sendctrl);
+	ipath_write_kreg(dd, dd->ipath_kregs->kr_control, dd->ipath_control);
 
 	rtmp = ipath_read_kreg32(dd, dd->ipath_kregs->kr_portcnt);
 	if (dd->ipath_portcnt != rtmp)
@@ -487,6 +487,7 @@ static void enable_chip(struct ipath_devdata *dd,
 			struct ipath_portdata *pd, int reinit)
 {
 	u32 val;
+	unsigned long flags;
 	int i;
 
 	if (!reinit)
@@ -495,11 +496,13 @@ static void enable_chip(struct ipath_devdata *dd,
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_rcvctrl,
 			 dd->ipath_rcvctrl);
 
+	spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
 	/* Enable PIO send, and update of PIOavail regs to memory. */
 	dd->ipath_sendctrl = INFINIPATH_S_PIOENABLE |
 		INFINIPATH_S_PIOBUFAVAILUPD;
-	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
-			 dd->ipath_sendctrl);
+	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl, dd->ipath_sendctrl);
+	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+	spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
 
 	/*
 	 * enable port 0 receive, and receive interrupt.  other ports
@@ -696,6 +699,7 @@ int ipath_init_chip(struct ipath_devdata *dd, int reinit)
 	u64 val;
 	struct ipath_portdata *pd = NULL; /* keep gcc4 happy */
 	gfp_t gfp_flags = GFP_USER | __GFP_COMP;
+	unsigned long flags;
 
 	ret = init_housekeeping(dd, &pd, reinit);
 	if (ret)
@@ -827,8 +831,12 @@ int ipath_init_chip(struct ipath_devdata *dd, int reinit)
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_hwerrclear,
 			 ~0ULL&~INFINIPATH_HWE_MEMBISTFAILED);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_control, 0ULL);
-	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
-			 INFINIPATH_S_PIOENABLE);
+
+	spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
+	dd->ipath_sendctrl = INFINIPATH_S_PIOENABLE;
+	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl, dd->ipath_sendctrl);
+	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+	spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
 
 	/*
 	 * before error clears, since we expect serdes pll errors during
diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c
index ad41ccc..eac2e9c 100644
--- a/drivers/infiniband/hw/ipath/ipath_intr.c
+++ b/drivers/infiniband/hw/ipath/ipath_intr.c
@@ -795,6 +795,7 @@ void ipath_clear_freeze(struct ipath_devdata *dd)
 {
 	int i, im;
 	__le64 val;
+	unsigned long flags;
 
 	/* disable error interrupts, to avoid confusion */
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_errormask, 0ULL);
@@ -813,11 +814,14 @@ void ipath_clear_freeze(struct ipath_devdata *dd)
 			 dd->ipath_control);
 
 	/* ensure pio avail updates continue */
+	spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
 		 dd->ipath_sendctrl & ~INFINIPATH_S_PIOBUFAVAILUPD);
 	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
-		 dd->ipath_sendctrl);
+			 dd->ipath_sendctrl);
+	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+	spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
 
 	/*
 	 * We just enabled pioavailupdate, so dma copy is almost certainly
@@ -922,6 +926,7 @@ static noinline void ipath_bad_regread(struct ipath_devdata *dd)
 
 static void handle_layer_pioavail(struct ipath_devdata *dd)
 {
+	unsigned long flags;
 	int ret;
 
 	ret = ipath_ib_piobufavail(dd->verbs_dev);
@@ -930,9 +935,12 @@ static void handle_layer_pioavail(struct ipath_devdata *dd)
 
 	return;
 set:
-	set_bit(IPATH_S_PIOINTBUFAVAIL, &dd->ipath_sendctrl);
+	spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
+	dd->ipath_sendctrl |= INFINIPATH_S_PIOINTBUFAVAIL;
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
 			 dd->ipath_sendctrl);
+	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+	spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
 }
 
 /*
@@ -1168,9 +1176,14 @@ irqreturn_t ipath_intr(int irq, void *data)
 		handle_urcv(dd, istat);
 
 	if (istat & INFINIPATH_I_SPIOBUFAVAIL) {
-		clear_bit(IPATH_S_PIOINTBUFAVAIL, &dd->ipath_sendctrl);
+		unsigned long flags;
+
+		spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
+		dd->ipath_sendctrl &= ~INFINIPATH_S_PIOINTBUFAVAIL;
 		ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
 				 dd->ipath_sendctrl);
+		ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+		spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
 
 		handle_layer_pioavail(dd);
 	}
diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h
index 728eb39..96553f4 100644
--- a/drivers/infiniband/hw/ipath/ipath_kernel.h
+++ b/drivers/infiniband/hw/ipath/ipath_kernel.h
@@ -376,6 +376,7 @@ struct ipath_devdata {
 	dma_addr_t *ipath_physshadow;
 	/* lock to workaround chip bug 9437 */
 	spinlock_t ipath_tid_lock;
+	spinlock_t ipath_sendctrl_lock;
 
 	/*
 	 * IPATH_STATUS_*,
diff --git a/drivers/infiniband/hw/ipath/ipath_ruc.c b/drivers/infiniband/hw/ipath/ipath_ruc.c
index 54c61a9..1b4f7e1 100644
--- a/drivers/infiniband/hw/ipath/ipath_ruc.c
+++ b/drivers/infiniband/hw/ipath/ipath_ruc.c
@@ -479,9 +479,14 @@ done:
 
 static void want_buffer(struct ipath_devdata *dd)
 {
-	set_bit(IPATH_S_PIOINTBUFAVAIL, &dd->ipath_sendctrl);
+	unsigned long flags;
+
+	spin_lock_irqsave(&dd->ipath_sendctrl_lock, flags);
+	dd->ipath_sendctrl |= INFINIPATH_S_PIOINTBUFAVAIL;
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_sendctrl,
 			 dd->ipath_sendctrl);
+	ipath_read_kreg64(dd, dd->ipath_kregs->kr_scratch);
+	spin_unlock_irqrestore(&dd->ipath_sendctrl_lock, flags);
 }
 
 /**
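
The pattern this patch applies throughout (take the lock, update the
shadow copy, write the register from the shadow, release the lock) can be
sketched in userspace C as follows. This is only an illustration under
stated assumptions: struct fake_dev, the bit values and the helper names
are hypothetical, a C11 atomic_flag spin lock stands in for
spin_lock_irqsave() (it does not disable interrupts), and a plain struct
member stands in for the kr_sendctrl hardware register.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define S_PIOINTBUFAVAIL (1ULL << 1)
#define S_PIOBUFAVAILUPD (1ULL << 2)

struct fake_dev {
	atomic_flag sendctrl_lock;	/* stand-in for a spinlock_t */
	uint64_t sendctrl_shadow;	/* software copy of the register */
	volatile uint64_t sendctrl_reg;	/* stand-in for the hardware register */
};

void dev_init(struct fake_dev *d)
{
	atomic_flag_clear(&d->sendctrl_lock);
	d->sendctrl_shadow = 0;
	d->sendctrl_reg = 0;
}

static void dev_lock(struct fake_dev *d)
{
	while (atomic_flag_test_and_set_explicit(&d->sendctrl_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */
}

static void dev_unlock(struct fake_dev *d)
{
	atomic_flag_clear_explicit(&d->sendctrl_lock, memory_order_release);
}

/* Set bits: shadow and register change together, under one lock. */
void sendctrl_set(struct fake_dev *d, uint64_t bits)
{
	dev_lock(d);
	d->sendctrl_shadow |= bits;
	d->sendctrl_reg = d->sendctrl_shadow;
	dev_unlock(d);
}

/* Clear bits: same rule, so no writer can publish a stale shadow. */
void sendctrl_clear(struct fake_dev *d, uint64_t bits)
{
	dev_lock(d);
	d->sendctrl_shadow &= ~bits;
	d->sendctrl_reg = d->sendctrl_shadow;
	dev_unlock(d);
}

/*
 * Pulse a bit off and back on in the register without touching the
 * shadow, as the PIOBUFAVAILUPD toggle in the patch does.
 */
void sendctrl_pulse_off(struct fake_dev *d, uint64_t bit)
{
	dev_lock(d);
	d->sendctrl_reg = d->sendctrl_shadow & ~bit;
	d->sendctrl_reg = d->sendctrl_shadow;
	dev_unlock(d);
}
```

The point of holding the lock across both the shadow update and the
register write is that two concurrent callers can never write the
register from shadow values that miss each other's change. The real
driver additionally reads kr_scratch after the write to flush it, which
this sketch does not model.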


From arthur.jones at qlogic.com  Fri Dec  7 08:00:48 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Fri, 07 Dec 2007 08:00:48 -0800
Subject: [ofa-general] [PATCH 3/6] IB/ipath - fix return error number for
	ib_resize_cq smaller than # entries
In-Reply-To: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
References: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071207160048.13957.58522.stgit@eng-46.internal.keyresearch.com>

From: Ralph Campbell <ralph.campbell at qlogic.com>

The gen2_basic tests check the errno value returned when a CQ is resized
to fewer entries than the number of completions currently queued on the CQ.
This patch changes ib_ipath to return EINVAL, which is what ib_mthca
returns and what gen2_basic expects.

Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_cq.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_cq.c b/drivers/infiniband/hw/ipath/ipath_cq.c
index d1380c7..a03bd28 100644
--- a/drivers/infiniband/hw/ipath/ipath_cq.c
+++ b/drivers/infiniband/hw/ipath/ipath_cq.c
@@ -421,7 +421,7 @@ int ipath_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata)
 	else
 		n = head - tail;
 	if (unlikely((u32)cqe < n)) {
-		ret = -EOVERFLOW;
+		ret = -EINVAL;
 		goto bail_unlock;
 	}
 	for (n = 0; tail != head; n++) {
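
The check touched by this patch can be sketched as a small standalone
function. This is a simplified illustration, not the driver code: the
function names and the "slots" parameter are hypothetical, and the ring
is modelled with plain head and tail indices.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Number of entries currently queued in a ring with `slots` slots. */
uint32_t ring_used(uint32_t head, uint32_t tail, uint32_t slots)
{
	if (head < tail)
		return slots + head - tail;	/* head has wrapped around */
	return head - tail;
}

/*
 * Return 0 if the ring may be resized to hold `new_cqe` entries, or
 * -EINVAL if more completions than that are still queued.
 */
int ring_check_resize(uint32_t head, uint32_t tail, uint32_t slots,
		      int new_cqe)
{
	uint32_t n = ring_used(head, tail, slots);

	/* Rejecting negative sizes is an assumption of this sketch. */
	if (new_cqe < 0 || (uint32_t)new_cqe < n)
		return -EINVAL;
	return 0;
}
```

For example, with head = 5, tail = 2 and 8 slots there are 3 queued
completions, so a resize to 2 entries is rejected with -EINVAL while a
resize to 3 entries is allowed.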


From arthur.jones at qlogic.com  Fri Dec  7 08:00:53 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Fri, 07 Dec 2007 08:00:53 -0800
Subject: [ofa-general] [PATCH 4/6] IB/ipath - fix comments for
	ipath_create_srq()
In-Reply-To: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
References: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071207160053.13957.17299.stgit@eng-46.internal.keyresearch.com>

From: Ralph Campbell <ralph.campbell at qlogic.com>

During a code review, someone noticed the comments didn't match the code.

Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_srq.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_srq.c b/drivers/infiniband/hw/ipath/ipath_srq.c
index 2fef36f..f772102 100644
--- a/drivers/infiniband/hw/ipath/ipath_srq.c
+++ b/drivers/infiniband/hw/ipath/ipath_srq.c
@@ -94,8 +94,8 @@ bail:
 /**
  * ipath_create_srq - create a shared receive queue
  * @ibpd: the protection domain of the SRQ to create
- * @attr: the attributes of the SRQ
- * @udata: not used by the InfiniPath verbs driver
+ * @srq_init_attr: the attributes of the SRQ
+ * @udata: data from libipathverbs when creating a user SRQ
  */
 struct ib_srq *ipath_create_srq(struct ib_pd *ibpd,
 				struct ib_srq_init_attr *srq_init_attr,


From arthur.jones at qlogic.com  Fri Dec  7 08:00:59 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Fri, 07 Dec 2007 08:00:59 -0800
Subject: [ofa-general] [PATCH 5/6] IB/ipath - better comment for rmb() in
	ipath_intr()
In-Reply-To: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
References: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071207160058.13957.92000.stgit@eng-46.internal.keyresearch.com>

An internal code review found the comment here
lacking -- update it with more specifics of how
and why the rmb() is there.

Signed-off-by: Arthur Jones <arthur.jones at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_intr.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c
index eac2e9c..4795cb8 100644
--- a/drivers/infiniband/hw/ipath/ipath_intr.c
+++ b/drivers/infiniband/hw/ipath/ipath_intr.c
@@ -954,7 +954,15 @@ static void handle_urcv(struct ipath_devdata *dd, u32 istat)
 	int i;
 	int rcvdint = 0;
 
-	/* test_bit below needs this... */
+	/*
+	 * test_and_clear_bit(IPATH_PORT_WAITING_RCV) and
+	 * test_and_clear_bit(IPATH_PORT_WAITING_URG) below
+	 * would both like timely updates of the bits so that
+	 * we don't pass them by unnecessarily.  the rmb()
+	 * here ensures that we see them promptly -- the
+	 * corresponding wmb()'s are in ipath_poll_urgent()
+	 * and ipath_poll_next()...
+	 */
 	rmb();
 	portr = ((istat >> INFINIPATH_I_RCVAVAIL_SHIFT) &
 		 dd->ipath_i_rcvavail_mask)
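
The set-flag / test-and-clear pairing described in the new comment can be
sketched in userspace with C11 atomics. This is only a rough analogue:
release and acquire ordering are used here in place of the kernel's wmb()
and rmb(), which are not the same primitives, and the function names and
bit numbers are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers chosen for illustration only. */
#define WAITING_RCV_BIT 2
#define WAITING_URG_BIT 5

/*
 * Waiter side: announce that we are waiting.  Release ordering makes
 * earlier writes visible to whoever observes the flag.
 */
void mark_waiting(atomic_ulong *flags, unsigned int bit)
{
	atomic_fetch_or_explicit(flags, 1UL << bit, memory_order_release);
}

/*
 * Handler side: atomically clear the bit and report whether it was
 * set.  Acquire ordering pairs with the release above.
 */
bool test_and_clear(atomic_ulong *flags, unsigned int bit)
{
	unsigned long mask = 1UL << bit;
	unsigned long old;

	old = atomic_fetch_and_explicit(flags, ~mask, memory_order_acquire);
	return (old & mask) != 0;
}
```

A waiter calls mark_waiting() before sleeping. The handler calls
test_and_clear() and wakes the waiter only when it returns true, so each
wait is answered by at most one wakeup.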


From arthur.jones at qlogic.com  Fri Dec  7 08:01:04 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Fri, 07 Dec 2007 08:01:04 -0800
Subject: [ofa-general] [PATCH 6/6] IB/ipath - Add the work completion error
	code to the QP error debug output
In-Reply-To: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
References: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071207160104.13957.31784.stgit@eng-46.internal.keyresearch.com>

From: Ralph Campbell <ralph.campbell at qlogic.com>

Add the work completion error code to the QP error debug output.
This makes it easier to determine the cause of the error.

Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_qp.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_qp.c b/drivers/infiniband/hw/ipath/ipath_qp.c
index b997ff8..b405906 100644
--- a/drivers/infiniband/hw/ipath/ipath_qp.c
+++ b/drivers/infiniband/hw/ipath/ipath_qp.c
@@ -387,8 +387,8 @@ int ipath_error_qp(struct ipath_qp *qp, enum ib_wc_status err)
 	struct ib_wc wc;
 	int ret = 0;
 
-	ipath_dbg("QP%d/%d in error state\n",
-		  qp->ibqp.qp_num, qp->remote_qpn);
+	ipath_dbg("QP%d/%d in error state (%d)\n",
+		  qp->ibqp.qp_num, qp->remote_qpn, err);
 
 	spin_lock(&dev->pending_lock);
 	/* XXX What if its already removed by the timeout code? */


From FENKES at de.ibm.com  Fri Dec  7 08:25:23 2007
From: FENKES at de.ibm.com (Joachim Fenkes)
Date: Fri, 7 Dec 2007 17:25:23 +0100
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
In-Reply-To: <ada7ijrd6gy.fsf@cisco.com>
Message-ID: <OF85E31FAA.DADA6039-ONC12573AA.005439C8-C12573AA.005A137A@de.ibm.com>

Roland Dreier <rdreier at cisco.com> wrote on 06.12.2007 19:27:09:

>  > > +               ehca_lock_hcalls = !(cur_cpu_spec->cpu_user_features
>  > > +                                    & PPC_FEATURE_ARCH_2_05);
> 
>  > We already talked about this yesterday, but I still feel that checking the
>  > instruction set of the CPU should not be used to determine whether a
>  > specific device driver implementation is used in the hypervisor.
> 
> I had the same reaction... is testing cpu_user_features really the
> best way to detect this issue?

I concur it's not nice, but it was the only feasible method we could find 
without adding a "bug fixed" feature flag to the partition<->firmware 
interface. The firmware version reported in the OFDT is not a reliable 
enough source, and even if it were, it would require a lot of string 
parsing and matching against tables.

We're taking this to the firmware architects at the moment, but they're 
not very fond of the idea of reporting the absence of bugs through 
capability flags, as this could quickly lead to the exhaustion of flag 
bits. We'll let the discussion stew for a bit, but if we don't get this 
flag, we'll have to resort to the CPU features.
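
As a rough sketch of the CPU-feature fallback under discussion (not the actual ehca code), the quoted expression boils down to a single bit test. `need_hcall_lock` is a hypothetical helper name, and `FEATURE_ARCH_2_05` is a placeholder for `PPC_FEATURE_ARCH_2_05`, whose real value comes from the kernel's powerpc headers and is not shown in this thread.

```c
#include <stdbool.h>

/* Placeholder bit; stands in for PPC_FEATURE_ARCH_2_05. */
#define FEATURE_ARCH_2_05 0x00001000UL

/*
 * Decide whether HCA-related hCalls must be serialized. Mirrors the
 * quoted expression: take the lock when the CPU does NOT advertise
 * the 2.05 architecture level.
 */
static bool need_hcall_lock(unsigned long cpu_user_features)
{
	return !(cpu_user_features & FEATURE_ARCH_2_05);
}
```

The test infers a firmware property from a CPU property, which is why a dedicated flag in the partition<->firmware interface would be the cleaner signal.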
 
> I'll hold off applying this for a few days so you guys can decide the
> best thing to do.  We'll definitely get some fix into 2.6.24 but we
> have time to make a good decision.

Right.
 
>  > Regarding the performance problem, have you checked whether converting all
>  > your spin_lock_irqsave to spin_lock/spin_lock_irq improves your performance
>  > on the older machines? Maybe it's already fast enough that way.
> 
> It does seem that the only places that the hcall_lock is taken also
> use msleep, so they must always be in process context.  So you can
> safely just use spin_lock(), right?

As Arnd said, there are hCalls that will never return H_LONG_BUSY_*, such 
as H_QUERY_PORT and chums, so they will never sleep. The surrounding 
functions, though, are not prepared to be called from interrupt context 
(GFP_KERNEL comes to mind), so I agree that a simple spin_lock() will 
suffice. Thanks, Arnd, for pointing this out.

We'll keep you guys posted on the feature flag discussion. Until then, 
have a nice weekend!

Joachim


From mshefty at ichips.intel.com  Fri Dec  7 10:46:09 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Fri, 07 Dec 2007 10:46:09 -0800
Subject: [ofa-general] [PATCH]  CMA: Enable conn_id remove
In-Reply-To: <475518FB.1080501@dev.mellanox.co.il>
References: <475518FB.1080501@dev.mellanox.co.il>
Message-ID: <475994F1.2020703@ichips.intel.com>

> I have the following issue: The IB driver can't be unloaded after 
> running applications over RDS.
> I saw that the 'dev_remove' counter does not reach 0 value on the 
> passive side (after connection establishment).
> 
> Please review the following patch:
> 
> CMA: Enable conn_id remove on the passive side after
> connection establishment.
> 
> Signed-off-by: Vladimir Sokolovsky <vlad at mellanox.co.il>

Reviewed-by: Sean Hefty <sean.hefty at intel.com>

> ---

Roland, please queue this fix for 2.6.25.  I don't think this fix is 
needed for 2.6.24.  The bug only occurs during device removal if there 
is an established connection on the passive side.  The only upstream 
call to rdma_listen() is to export the functionality to userspace, and 
device removal is restricted if the current libraries are being used.

Without this patch, the hang will occur every time, so our testing just 
hasn't hit this yet.
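
A simplified userspace model of the imbalance may help when reviewing the diff quoted below. It is a sketch under assumptions, not the CMA code: `struct conn_id`, `req_handler` and the plain integer counter are hypothetical (the message only tells us that a `dev_remove` counter must reach 0 before the driver can unload), and the failure path is assumed to release its hold already.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-ID state; only the counter matters. */
struct conn_id {
	int dev_remove;	/* > 0 means device removal must wait */
};

static void disable_remove(struct conn_id *id) { id->dev_remove++; }
static void enable_remove(struct conn_id *id)  { id->dev_remove--; }

/* Device removal may proceed only once nothing holds the ID. */
static bool can_remove(const struct conn_id *id)
{
	return id->dev_remove == 0;
}

/*
 * Model of a request handler. handler_ret is the return value of the
 * user's event handler; zero means the connection was accepted.
 * With 'patched' false the success path returns without re-enabling
 * removal, which is the imbalance being fixed.
 */
static int req_handler(struct conn_id *id, int handler_ret, bool patched)
{
	disable_remove(id);

	if (!handler_ret) {
		if (patched)
			enable_remove(id);
		return 0;
	}

	/* Failure path (assumed): the ID is torn down, release the hold. */
	enable_remove(id);
	return handler_ret;
}
```

In this model, one accepted connection through the unpatched path leaves the counter at 1 forever, which corresponds to the unload hang described in the report.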

> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 0751697..656d6df 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -1122,8 +1122,10 @@ static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
>         cm_id->cm_handler = cma_ib_handler;
> 
>         ret = conn_id->id.event_handler(&conn_id->id, &event);
> -       if (!ret)
> +       if (!ret) {
> +               cma_enable_remove(conn_id);
>                 goto out;
> +       }
> 
>         /* Destroy the CM ID by returning a non-zero value. */
>         conn_id->cm_id.ib = NULL;


From sean.hefty at intel.com  Fri Dec  7 10:52:24 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Fri, 7 Dec 2007 10:52:24 -0800
Subject: [ofa-general] RE: [PATCH] ib/mad: fix incorrect access to items on
	local_list
In-Reply-To: <4756C665.5060404@dev.mellanox.co.il>
References: <474BE237.8050602@dev.mellanox.co.il> <aday7cjntc9.fsf@cisco.com>
	<000001c8337a$cdc18e60$ff0da8c0@amr.corp.intel.com>
	<4756C665.5060404@dev.mellanox.co.il>
Message-ID: <000101c83902$4e485460$9b37170a@amr.corp.intel.com>

>Just want to let you know that I didn't forget about this issue.
>
>I tried to reproduce the failure before applying the fix, but this one
>is not easy to reproduce.
>
>I will give you feedback as soon as I have some.

To reproduce, you need to have MADs queued for processing on the local_list when
the sender unregisters (perhaps by killing the app).  You should be able to
widen the race window by adding a delay near the top of local_completions().
That will keep the MADs on the local_list for longer.  Alternatively, you would
need to have a lot of outstanding local MADs.

- Sean


From sean.hefty at intel.com  Fri Dec  7 12:42:42 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Fri, 7 Dec 2007 12:42:42 -0800
Subject: [ofa-general] RE: [PATCH] [RFC] librdmacm: add rdma_migrate_id
In-Reply-To: <1196354031.28600.0.camel@firewall.xsintricity.com>
References: <000101c8321e$5c6b4240$ff0da8c0@amr.corp.intel.com>
	<000201c8321f$543036c0$ff0da8c0@amr.corp.intel.com>
	<1196354031.28600.0.camel@firewall.xsintricity.com>
Message-ID: <000201c83911$b704c6f0$9c98070a@amr.corp.intel.com>

Does anyone else have any feedback regarding this?  If not, I will request this
change for 2.6.25.

- Sean


From mshefty at ichips.intel.com  Fri Dec  7 13:06:09 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Fri, 07 Dec 2007 13:06:09 -0800
Subject: [ofa-general] [PATCH]  CMA: Enable conn_id remove
In-Reply-To: <475994F1.2020703@ichips.intel.com>
References: <475518FB.1080501@dev.mellanox.co.il>
	<475994F1.2020703@ichips.intel.com>
Message-ID: <4759B5C1.2020207@ichips.intel.com>

> Roland, please queue this fix for 2.6.25.

This patch is also available from:

	git://git.openfabrics.org/~shefty/rdma-dev.git for-roland

commit 0782ac83d4affd61b4e7935d34e0855b7872eff1

- Sean


From sashak at voltaire.com  Fri Dec  7 14:00:12 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 7 Dec 2007 22:00:12 +0000
Subject: [ofa-general] Re: [openSM+lash] *** glibc detected *** free():
	invalid next size (fast)
In-Reply-To: <829ded920712062152q20bf3e14ka61deb2106195529@mail.gmail.com>
References: <829ded920712050124o2e057ddcgda51b766290f2bd0@mail.gmail.com>
	<829ded920712052102i3e6a414amd1f9300dc246b24b@mail.gmail.com>
	<20071206104530.GE25758@sashak.voltaire.com>
	<829ded920712060240w3c3955b6t6ad1854d32e11986@mail.gmail.com>
	<20071206160659.GG708@sashak.voltaire.com>
	<829ded920712062152q20bf3e14ka61deb2106195529@mail.gmail.com>
Message-ID: <20071207220012.GF25758@sashak.voltaire.com>

On 11:22 Fri 07 Dec     , Keshetti Mahesh wrote:
> > Could you send me output of ibnetdiscover, so I will be able to rerun
> > with lash in simulator.
> 
> Please find the topology file I used while testing openSM with LASH
> in the attachments. The same problem occurs with the simulator too. I have
> also observed one more thing: the first time you run openSM with LASH
> it is aborted by a glibc invalid free error, and on the next run
> openSM with LASH successfully reaches the SUBNET UP stage, but the same
> error (glibc free(): invalid next size) appears while closing openSM.

Yes, there is an obvious bug in the LASH code. Here is the fix.

Sasha

>From ef1cfcd47c05f8cf293201e7d8f47f6ab3527fbb Mon Sep 17 00:00:00 2001
From: Sasha Khapyorsky <sashak at voltaire.com>
Date: Sat, 8 Dec 2007 00:29:41 +0200
Subject: [PATCH] opensm/lash: fix wrong allocation size

LASH uses the virtual_physical_port_table and phys_connections arrays of
each switch to store a map of its local connections. The size of these
allocations should therefore be the number of ports on a switch, not the
number of switches in the fabric.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/opensm/osm_ucast_lash.c |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c
index 5e7716e..3c93457 100644
--- a/opensm/opensm/osm_ucast_lash.c
+++ b/opensm/opensm/osm_ucast_lash.c
@@ -786,6 +786,7 @@ static void balance_virtual_lanes(lash_t * p_lash, unsigned lanes_needed)
 static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw)
 {
 	unsigned num_switches = p_lash->num_switches;
+	unsigned num_ports = p_sw->num_ports;
 	switch_t *sw;
 	unsigned int i;
 
@@ -802,15 +803,14 @@ static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw
 		return NULL;
 	}
 
-	sw->virtual_physical_port_table = malloc(num_switches * sizeof(int));
+	sw->virtual_physical_port_table = malloc(num_ports * sizeof(int));
 	if (!sw->virtual_physical_port_table) {
 		free(sw->dij_channels);
 		free(sw);
 		return NULL;
 	}
 
-	sw->phys_connections = malloc(num_switches * sizeof(int));
-
+	sw->phys_connections = malloc(num_ports * sizeof(int));
 	if (!sw->phys_connections) {
 		free(sw->virtual_physical_port_table);
 		free(sw->dij_channels);
@@ -819,7 +819,6 @@ static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw
 	}
 
 	sw->routing_table = malloc(num_switches * sizeof(sw->routing_table[0]));
-
 	if (!sw->routing_table) {
 		free(sw->phys_connections);
 		free(sw->virtual_physical_port_table);
@@ -831,9 +830,11 @@ static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw
 	for (i = 0; i < num_switches; i++) {
 		sw->routing_table[i].out_link = NONE;
 		sw->routing_table[i].lane = NONE;
+	}
+
+	for (i = 0; i < num_ports; i++) {
 		sw->virtual_physical_port_table[i] = -1;
-		if (i < num_switches - 1)
-			sw->phys_connections[i] = NONE;
+		sw->phys_connections[i] = NONE;
 	}
 
 	sw->p_sw = p_sw;
-- 
1.5.3.4.206.g58ba4
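
The sizing rule in the patch above can be sketched in isolation. This is a trimmed-down illustration, not the OpenSM source: `struct sw_map`, `sw_map_create` and `sw_map_destroy` are hypothetical names, and `NONE` is given a placeholder value here because its real definition is in the OpenSM headers, which are not part of this message. The point is that arrays indexed by port number are sized and initialized by `num_ports`; sizing them by the switch count presumably lets port-indexed writes run past the end of the block on fabrics with fewer switches than ports per switch, which would fit the glibc heap error reported.

```c
#include <stdlib.h>

#define NONE (-1)	/* placeholder; the real value is defined by OpenSM */

/* Hypothetical, trimmed-down version of the per-switch state. */
struct sw_map {
	unsigned num_ports;
	int *virtual_physical_port_table;	/* one entry per local port */
	int *phys_connections;			/* one entry per local port */
};

/*
 * Allocate and initialize the per-port tables. Both arrays are indexed
 * by port number, so they are sized by num_ports, not by the number of
 * switches in the fabric. Returns NULL on allocation failure (a
 * zero-port request may also yield NULL, depending on malloc(0)).
 */
static struct sw_map *sw_map_create(unsigned num_ports)
{
	struct sw_map *sw;
	unsigned i;

	sw = malloc(sizeof(*sw));
	if (!sw)
		return NULL;

	sw->num_ports = num_ports;
	sw->virtual_physical_port_table = malloc(num_ports * sizeof(int));
	sw->phys_connections = malloc(num_ports * sizeof(int));
	if (!sw->virtual_physical_port_table || !sw->phys_connections) {
		free(sw->phys_connections);
		free(sw->virtual_physical_port_table);
		free(sw);
		return NULL;
	}

	/* Initialize exactly the entries that were allocated. */
	for (i = 0; i < num_ports; i++) {
		sw->virtual_physical_port_table[i] = -1;
		sw->phys_connections[i] = NONE;
	}
	return sw;
}

static void sw_map_destroy(struct sw_map *sw)
{
	if (!sw)
		return;
	free(sw->phys_connections);
	free(sw->virtual_physical_port_table);
	free(sw);
}
```

Keeping the loop bound identical to the allocation count is the design choice that matters: the original loop ran over `num_switches` while also touching the per-port arrays.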


From sashak at voltaire.com  Fri Dec  7 14:04:29 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 7 Dec 2007 22:04:29 +0000
Subject: [ofa-general] [PATCH] opensm/lash: fix wrong allocation size
Message-ID: <20071207220429.GG25758@sashak.voltaire.com>


LASH uses the virtual_physical_port_table and phys_connections arrays of
each switch to store a map of its local connections. The size of these
allocations should therefore be the number of ports on a switch, not the
number of switches in the fabric.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/opensm/osm_ucast_lash.c |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c
index 5e7716e..3c93457 100644
--- a/opensm/opensm/osm_ucast_lash.c
+++ b/opensm/opensm/osm_ucast_lash.c
@@ -786,6 +786,7 @@ static void balance_virtual_lanes(lash_t * p_lash, unsigned lanes_needed)
 static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw)
 {
 	unsigned num_switches = p_lash->num_switches;
+	unsigned num_ports = p_sw->num_ports;
 	switch_t *sw;
 	unsigned int i;
 
@@ -802,15 +803,14 @@ static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw
 		return NULL;
 	}
 
-	sw->virtual_physical_port_table = malloc(num_switches * sizeof(int));
+	sw->virtual_physical_port_table = malloc(num_ports * sizeof(int));
 	if (!sw->virtual_physical_port_table) {
 		free(sw->dij_channels);
 		free(sw);
 		return NULL;
 	}
 
-	sw->phys_connections = malloc(num_switches * sizeof(int));
-
+	sw->phys_connections = malloc(num_ports * sizeof(int));
 	if (!sw->phys_connections) {
 		free(sw->virtual_physical_port_table);
 		free(sw->dij_channels);
@@ -819,7 +819,6 @@ static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw
 	}
 
 	sw->routing_table = malloc(num_switches * sizeof(sw->routing_table[0]));
-
 	if (!sw->routing_table) {
 		free(sw->phys_connections);
 		free(sw->virtual_physical_port_table);
@@ -831,9 +830,11 @@ static switch_t *switch_create(lash_t * p_lash, unsigned id, osm_switch_t * p_sw
 	for (i = 0; i < num_switches; i++) {
 		sw->routing_table[i].out_link = NONE;
 		sw->routing_table[i].lane = NONE;
+	}
+
+	for (i = 0; i < num_ports; i++) {
 		sw->virtual_physical_port_table[i] = -1;
-		if (i < num_switches - 1)
-			sw->phys_connections[i] = NONE;
+		sw->phys_connections[i] = NONE;
 	}
 
 	sw->p_sw = p_sw;
-- 
1.5.3.4.206.g58ba4


From arlin.r.davis at intel.com  Fri Dec  7 16:02:52 2007
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Fri, 7 Dec 2007 16:02:52 -0800
Subject: [ofa-general] [PATCH 1/2] uDAT/uDAPL v2 - (master branch) changes to
	sync common code base with WinOF 1.01
Message-ID: <000001c8392d$ae801400$1dfd070a@amr.corp.intel.com>

James,

Please review this patch series, which brings the latest WinOF code base into the mainstream. I would like to
keep the common code base from diverging as much as possible. This is a pretty straightforward
change, but it touches a lot of files. It is on the master branch (now based on a v2 code base) and is
not targeted for OFED 1.3.

1/2 uDAT changes.
2/2 uDAPL changes.

  - add DAT_API to specify calling conventions (windows=__stdcall, linux= ) 
  - cleanup platform specific definitions for windows
  - c++ support
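
For readers unfamiliar with the pattern, a calling-convention macro of this kind is commonly set up as sketched here. This is an assumption-laden illustration, not the DAT headers themselves: the `_WIN32` test, `EXAMPLE_RETURN` and `example_query` are hypothetical, and only the `DAT_API` name and its two expansions (`__stdcall` on Windows, empty on Linux) come from the list above.

```c
/*
 * Sketch of a calling-convention macro in the spirit of DAT_API.
 * The real definition lives in the DAT platform headers, which are
 * not part of this message.
 */
#if defined(_WIN32)
#define DAT_API __stdcall
#else
#define DAT_API		/* expands to nothing: default convention */
#endif

/* C++ support: give the declarations C linkage when built as C++. */
#ifdef __cplusplus
extern "C" {
#endif

typedef int EXAMPLE_RETURN;	/* hypothetical stand-in for DAT_RETURN */

EXAMPLE_RETURN DAT_API example_query(int handle);

#ifdef __cplusplus
}
#endif

/* The definition repeats the macro between return type and name. */
EXAMPLE_RETURN DAT_API example_query(int handle)
{
	return handle < 0 ? -1 : 0;
}
```

Because the macro sits between the return type and the function name, adding it to every entry point is a mechanical one-line change per function.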

Signed-off-by: Arlin Davis <ardavis at ichips.intel.com>

diff --git a/dat/common/dat_api.c b/dat/common/dat_api.c
index 1415f73..a3d2274 100755
--- a/dat/common/dat_api.c
+++ b/dat/common/dat_api.c
@@ -272,7 +272,7 @@ dats_free_ia_handle (
 /**********************************************************************
  * API definitions for common API entry points
  **********************************************************************/
-DAT_RETURN dat_ia_query (
+DAT_RETURN DAT_API dat_ia_query (
 	IN      DAT_IA_HANDLE		ia_handle,
 	OUT     DAT_EVD_HANDLE		*async_evd_handle,
 	IN      DAT_IA_ATTR_MASK	ia_attr_mask,
@@ -298,7 +298,7 @@ DAT_RETURN dat_ia_query (
     return dat_status;
 }
 
-DAT_RETURN dat_set_consumer_context (
+DAT_RETURN DAT_API dat_set_consumer_context (
 	IN      DAT_HANDLE		dat_handle,
 	IN      DAT_CONTEXT		context)
 {
@@ -325,7 +325,7 @@ DAT_RETURN dat_set_consumer_context (
 }
 
 
-DAT_RETURN dat_get_consumer_context (
+DAT_RETURN DAT_API dat_get_consumer_context (
 	IN      DAT_HANDLE		dat_handle,
 	OUT     DAT_CONTEXT		*context)
 {
@@ -334,7 +334,7 @@ DAT_RETURN dat_get_consumer_context (
         DAT_IA_HANDLE   dapl_ia_handle;
         DAT_RETURN      dat_status;
 
-        dat_status = dats_get_ia_handle((unsigned long)dat_handle,
+        dat_status = dats_get_ia_handle((DAT_IA_HANDLE)dat_handle,
                                         &dapl_ia_handle);
 
         /* failure to map the handle is unlikely but possible */
@@ -352,7 +352,7 @@ DAT_RETURN dat_get_consumer_context (
 }
 
 
-DAT_RETURN dat_get_handle_type (
+DAT_RETURN DAT_API dat_get_handle_type (
 	IN      DAT_HANDLE		dat_handle,
 	OUT     DAT_HANDLE_TYPE		*type)
 {
@@ -374,12 +374,11 @@ DAT_RETURN dat_get_handle_type (
         dat_handle = dapl_ia_handle;
     }
 
-    return DAT_GET_HANDLE_TYPE (dat_handle,
-				type);
+    return DAT_GET_HANDLE_TYPE (dat_handle, type);
 }
 
 
-DAT_RETURN dat_cr_query (
+DAT_RETURN DAT_API dat_cr_query (
 	IN      DAT_CR_HANDLE		cr_handle,
 	IN      DAT_CR_PARAM_MASK	cr_param_mask,
 	OUT     DAT_CR_PARAM		*cr_param)
@@ -394,7 +393,7 @@ DAT_RETURN dat_cr_query (
 }
 
 
-DAT_RETURN dat_cr_accept (
+DAT_RETURN DAT_API dat_cr_accept (
 	IN      DAT_CR_HANDLE		cr_handle,
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		private_data_size,
@@ -411,7 +410,7 @@ DAT_RETURN dat_cr_accept (
 }
 
 
-DAT_RETURN dat_cr_reject (
+DAT_RETURN DAT_API dat_cr_reject (
 	IN      DAT_CR_HANDLE 		cr_handle,
 	IN	DAT_COUNT		private_data_size,
 	IN const DAT_PVOID		private_data)
@@ -424,7 +423,7 @@ DAT_RETURN dat_cr_reject (
 }
 
 
-DAT_RETURN dat_evd_resize (
+DAT_RETURN DAT_API dat_evd_resize (
 	IN      DAT_EVD_HANDLE		evd_handle,
 	IN      DAT_COUNT		evd_min_qlen)
 {
@@ -437,7 +436,7 @@ DAT_RETURN dat_evd_resize (
 }
 
 
-DAT_RETURN dat_evd_post_se (
+DAT_RETURN DAT_API dat_evd_post_se (
 	IN      DAT_EVD_HANDLE	       evd_handle,
 	IN      const DAT_EVENT		*event)
 {
@@ -450,7 +449,7 @@ DAT_RETURN dat_evd_post_se (
 }
 
 
-DAT_RETURN dat_evd_dequeue (
+DAT_RETURN DAT_API dat_evd_dequeue (
 	IN      DAT_EVD_HANDLE		evd_handle,
 	OUT     DAT_EVENT		*event)
 {
@@ -463,7 +462,7 @@ DAT_RETURN dat_evd_dequeue (
 }
 
 
-DAT_RETURN dat_evd_free (
+DAT_RETURN DAT_API dat_evd_free (
 	IN      DAT_EVD_HANDLE 		evd_handle)
 {
     if (evd_handle == NULL)
@@ -473,7 +472,7 @@ DAT_RETURN dat_evd_free (
     return DAT_EVD_FREE (evd_handle);
 }
 
-DAT_RETURN dat_evd_query (
+DAT_RETURN DAT_API dat_evd_query (
 	IN      DAT_EVD_HANDLE		evd_handle,
 	IN      DAT_EVD_PARAM_MASK	evd_param_mask,
 	OUT     DAT_EVD_PARAM		*evd_param)
@@ -488,7 +487,7 @@ DAT_RETURN dat_evd_query (
 }
 
 
-DAT_RETURN dat_ep_create (
+DAT_RETURN DAT_API dat_ep_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_PZ_HANDLE		pz_handle,
 	IN      DAT_EVD_HANDLE		recv_completion_evd_handle,
@@ -517,7 +516,7 @@ DAT_RETURN dat_ep_create (
 }
 
 
-DAT_RETURN dat_ep_query (
+DAT_RETURN DAT_API dat_ep_query (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_EP_PARAM_MASK	ep_param_mask,
 	OUT     DAT_EP_PARAM		*ep_param)
@@ -532,7 +531,7 @@ DAT_RETURN dat_ep_query (
 }
 
 
-DAT_RETURN dat_ep_modify (
+DAT_RETURN DAT_API dat_ep_modify (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_EP_PARAM_MASK	ep_param_mask,
 	IN      const DAT_EP_PARAM 	*ep_param)
@@ -546,7 +545,7 @@ DAT_RETURN dat_ep_modify (
 			  ep_param);
 }
 
-DAT_RETURN dat_ep_connect (
+DAT_RETURN DAT_API dat_ep_connect (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_IA_ADDRESS_PTR	remote_ia_address,
 	IN      DAT_CONN_QUAL		remote_conn_qual,
@@ -570,7 +569,7 @@ DAT_RETURN dat_ep_connect (
 			   connect_flags);
 }
 
-DAT_RETURN dat_ep_common_connect (
+DAT_RETURN DAT_API dat_ep_common_connect (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_IA_ADDRESS_PTR	remote_ia_address,
 	IN      DAT_TIMEOUT		timeout,
@@ -588,7 +587,7 @@ DAT_RETURN dat_ep_common_connect (
 			   private_data);
 }
 
-DAT_RETURN dat_ep_dup_connect (
+DAT_RETURN DAT_API dat_ep_dup_connect (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_EP_HANDLE		ep_dup_handle,
 	IN      DAT_TIMEOUT		timeout,
@@ -609,7 +608,7 @@ DAT_RETURN dat_ep_dup_connect (
 }
 
 
-DAT_RETURN dat_ep_disconnect (
+DAT_RETURN DAT_API dat_ep_disconnect (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_CLOSE_FLAGS		close_flags)
 {
@@ -621,7 +620,7 @@ DAT_RETURN dat_ep_disconnect (
 			      close_flags);
 }
 
-DAT_RETURN dat_ep_post_send (
+DAT_RETURN DAT_API dat_ep_post_send (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -639,7 +638,7 @@ DAT_RETURN dat_ep_post_send (
 			     completion_flags);
 }
 
-DAT_RETURN dat_ep_post_send_with_invalidate (
+DAT_RETURN DAT_API dat_ep_post_send_with_invalidate (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -661,7 +660,7 @@ DAT_RETURN dat_ep_post_send_with_invalidate (
 			     rmr_context);
 }
 
-DAT_RETURN dat_ep_post_recv (
+DAT_RETURN DAT_API dat_ep_post_recv (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -680,7 +679,7 @@ DAT_RETURN dat_ep_post_recv (
 }
 
 
-DAT_RETURN dat_ep_post_rdma_read (
+DAT_RETURN DAT_API dat_ep_post_rdma_read (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -701,7 +700,7 @@ DAT_RETURN dat_ep_post_rdma_read (
 }
 
 
-DAT_RETURN dat_ep_post_rdma_read_to_rmr (
+DAT_RETURN DAT_API dat_ep_post_rdma_read_to_rmr (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      const DAT_RMR_TRIPLET	*local_iov,
 	IN      DAT_DTO_COOKIE		user_cookie,
@@ -720,7 +719,7 @@ DAT_RETURN dat_ep_post_rdma_read_to_rmr (
 }
 
 
-DAT_RETURN dat_ep_post_rdma_write (
+DAT_RETURN DAT_API dat_ep_post_rdma_write (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -741,7 +740,7 @@ DAT_RETURN dat_ep_post_rdma_write (
 }
 
 
-DAT_RETURN dat_ep_get_status (
+DAT_RETURN DAT_API dat_ep_get_status (
 	IN      DAT_EP_HANDLE		ep_handle,
 	OUT     DAT_EP_STATE		*ep_state,
 	OUT     DAT_BOOLEAN 		*recv_idle,
@@ -758,7 +757,7 @@ DAT_RETURN dat_ep_get_status (
 }
 
 
-DAT_RETURN dat_ep_free (
+DAT_RETURN DAT_API dat_ep_free (
 	IN      DAT_EP_HANDLE		ep_handle)
 {
     if (ep_handle == NULL)
@@ -769,7 +768,7 @@ DAT_RETURN dat_ep_free (
 }
 
 
-DAT_RETURN dat_ep_reset (
+DAT_RETURN DAT_API dat_ep_reset (
 	IN      DAT_EP_HANDLE		ep_handle)
 {
     if (ep_handle == NULL)
@@ -780,7 +779,7 @@ DAT_RETURN dat_ep_reset (
 }
 
 
-DAT_RETURN dat_lmr_free (
+DAT_RETURN DAT_API dat_lmr_free (
 	IN      DAT_LMR_HANDLE		lmr_handle)
 {
     if (lmr_handle == NULL)
@@ -791,7 +790,7 @@ DAT_RETURN dat_lmr_free (
 }
 
 
-DAT_RETURN dat_rmr_create (
+DAT_RETURN DAT_API dat_rmr_create (
 	IN      DAT_PZ_HANDLE		pz_handle,
 	OUT     DAT_RMR_HANDLE		*rmr_handle)
 {
@@ -804,7 +803,7 @@ DAT_RETURN dat_rmr_create (
 }
 
 
-DAT_RETURN dat_rmr_create_for_ep (
+DAT_RETURN DAT_API dat_rmr_create_for_ep (
 	IN      DAT_PZ_HANDLE		pz_handle,
 	OUT     DAT_RMR_HANDLE		*rmr_handle)
 {
@@ -815,7 +814,7 @@ DAT_RETURN dat_rmr_create_for_ep (
     return DAT_RMR_CREATE_FOR_EP (pz_handle,
 			   rmr_handle);
 }
-DAT_RETURN dat_rmr_query (
+DAT_RETURN DAT_API dat_rmr_query (
 	IN      DAT_RMR_HANDLE		rmr_handle,
 	IN      DAT_RMR_PARAM_MASK	rmr_param_mask,
 	OUT     DAT_RMR_PARAM		*rmr_param)
@@ -830,7 +829,7 @@ DAT_RETURN dat_rmr_query (
 }
 
 
-DAT_RETURN dat_rmr_bind (
+DAT_RETURN DAT_API dat_rmr_bind (
 	IN      DAT_RMR_HANDLE		rmr_handle,
 	IN	DAT_LMR_HANDLE		lmr_handle,
 	IN      const DAT_LMR_TRIPLET	*lmr_triplet,
@@ -857,7 +856,7 @@ DAT_RETURN dat_rmr_bind (
 }
 
 
-DAT_RETURN dat_rmr_free (
+DAT_RETURN DAT_API dat_rmr_free (
 	IN      DAT_RMR_HANDLE		rmr_handle)
 {
     if (rmr_handle == NULL)
@@ -867,7 +866,7 @@ DAT_RETURN dat_rmr_free (
     return DAT_RMR_FREE (rmr_handle);
 }
 
-DAT_RETURN dat_lmr_sync_rdma_read(
+DAT_RETURN DAT_API dat_lmr_sync_rdma_read(
 	IN      DAT_IA_HANDLE           ia_handle,
 	IN      const DAT_LMR_TRIPLET   *local_segments,
 	IN      DAT_VLEN                num_segments)
@@ -875,8 +874,7 @@ DAT_RETURN dat_lmr_sync_rdma_read(
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_LMR_SYNC_RDMA_READ (dapl_ia_handle,
@@ -888,7 +886,7 @@ DAT_RETURN dat_lmr_sync_rdma_read(
     return dat_status;
 }
 
-DAT_RETURN dat_lmr_sync_rdma_write(
+DAT_RETURN DAT_API dat_lmr_sync_rdma_write(
 	IN      DAT_IA_HANDLE           ia_handle,
 	IN      const DAT_LMR_TRIPLET   *local_segments,
 	IN      DAT_VLEN                num_segments)
@@ -909,7 +907,7 @@ DAT_RETURN dat_lmr_sync_rdma_write(
 }
 
 
-DAT_RETURN dat_psp_create (
+DAT_RETURN DAT_API dat_psp_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_CONN_QUAL		conn_qual,
 	IN      DAT_EVD_HANDLE		evd_handle,
@@ -934,7 +932,7 @@ DAT_RETURN dat_psp_create (
 }
 
 
-DAT_RETURN dat_psp_create_any (
+DAT_RETURN DAT_API dat_psp_create_any (
 	IN      DAT_IA_HANDLE		ia_handle,
 	OUT     DAT_CONN_QUAL		*conn_qual,
 	IN      DAT_EVD_HANDLE		evd_handle,
@@ -959,7 +957,7 @@ DAT_RETURN dat_psp_create_any (
 }
 
 
-DAT_RETURN dat_psp_query (
+DAT_RETURN DAT_API dat_psp_query (
 	IN      DAT_PSP_HANDLE		psp_handle,
 	IN      DAT_PSP_PARAM_MASK	psp_param_mask,
 	OUT     DAT_PSP_PARAM 		*psp_param)
@@ -974,7 +972,7 @@ DAT_RETURN dat_psp_query (
 }
 
 
-DAT_RETURN dat_psp_free (
+DAT_RETURN DAT_API dat_psp_free (
 	IN      DAT_PSP_HANDLE	psp_handle)
 {
     if (psp_handle == NULL)
@@ -984,7 +982,7 @@ DAT_RETURN dat_psp_free (
     return DAT_PSP_FREE (psp_handle);
 }
 
-DAT_RETURN dat_csp_create (
+DAT_RETURN DAT_API dat_csp_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_COMM		*comm,
 	IN	DAT_IA_ADDRESS_PTR	address,
@@ -1007,7 +1005,7 @@ DAT_RETURN dat_csp_create (
     return dat_status;
 }
 
-DAT_RETURN dat_csp_query (
+DAT_RETURN DAT_API dat_csp_query (
 	IN      DAT_CSP_HANDLE		csp_handle,
 	IN      DAT_CSP_PARAM_MASK	csp_param_mask,
 	OUT     DAT_CSP_PARAM 		*csp_param)
@@ -1021,7 +1019,7 @@ DAT_RETURN dat_csp_query (
 			  csp_param);
 }
 
-DAT_RETURN dat_csp_free (
+DAT_RETURN DAT_API dat_csp_free (
 	IN      DAT_CSP_HANDLE	csp_handle)
 {
     if (csp_handle == NULL)
@@ -1032,7 +1030,7 @@ DAT_RETURN dat_csp_free (
 }
 
 
-DAT_RETURN dat_rsp_create (
+DAT_RETURN DAT_API dat_rsp_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_CONN_QUAL		conn_qual,
 	IN      DAT_EP_HANDLE		ep_handle,
@@ -1057,7 +1055,7 @@ DAT_RETURN dat_rsp_create (
 }
 
 
-DAT_RETURN dat_rsp_query (
+DAT_RETURN DAT_API dat_rsp_query (
 	IN      DAT_RSP_HANDLE		rsp_handle,
 	IN      DAT_RSP_PARAM_MASK	rsp_param_mask,
 	OUT     DAT_RSP_PARAM		*rsp_param)
@@ -1072,7 +1070,7 @@ DAT_RETURN dat_rsp_query (
 }
 
 
-DAT_RETURN dat_rsp_free (
+DAT_RETURN DAT_API dat_rsp_free (
 	IN      DAT_RSP_HANDLE		rsp_handle)
 {
     if (rsp_handle == NULL)
@@ -1083,15 +1081,14 @@ DAT_RETURN dat_rsp_free (
 }
 
 
-DAT_RETURN dat_pz_create (
+DAT_RETURN DAT_API dat_pz_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	OUT     DAT_PZ_HANDLE		*pz_handle)
 {
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_PZ_CREATE (dapl_ia_handle,
@@ -1102,7 +1099,7 @@ DAT_RETURN dat_pz_create (
 }
 
 
-DAT_RETURN dat_pz_query (
+DAT_RETURN DAT_API dat_pz_query (
 	IN      DAT_PZ_HANDLE		pz_handle,
 	IN      DAT_PZ_PARAM_MASK	pz_param_mask,
 	OUT     DAT_PZ_PARAM		*pz_param)
@@ -1117,7 +1114,7 @@ DAT_RETURN dat_pz_query (
 }
 
 
-DAT_RETURN dat_pz_free (
+DAT_RETURN DAT_API dat_pz_free (
 	IN      DAT_PZ_HANDLE		pz_handle)
 {
     if (pz_handle == NULL)
@@ -1127,7 +1124,7 @@ DAT_RETURN dat_pz_free (
     return DAT_PZ_FREE (pz_handle);
 }
 
-DAT_RETURN dat_ep_create_with_srq(
+DAT_RETURN DAT_API dat_ep_create_with_srq(
         IN      DAT_IA_HANDLE          ia_handle,
         IN      DAT_PZ_HANDLE          pz_handle,
         IN      DAT_EVD_HANDLE         recv_evd_handle,
@@ -1157,7 +1154,7 @@ DAT_RETURN dat_ep_create_with_srq(
     return dat_status;
 }
 
-DAT_RETURN dat_ep_recv_query(
+DAT_RETURN DAT_API dat_ep_recv_query(
         IN      DAT_EP_HANDLE         ep_handle,
         OUT     DAT_COUNT *           nbufs_allocated,
         OUT     DAT_COUNT *           bufs_alloc_span)
@@ -1171,7 +1168,7 @@ DAT_RETURN dat_ep_recv_query(
 			      bufs_alloc_span);
 }
 
-DAT_RETURN dat_ep_set_watermark(
+DAT_RETURN DAT_API dat_ep_set_watermark(
         IN      DAT_EP_HANDLE         ep_handle,
         IN      DAT_COUNT             soft_high_watermark,
         IN      DAT_COUNT             hard_high_watermark)
@@ -1187,7 +1184,7 @@ DAT_RETURN dat_ep_set_watermark(
 
 /* SRQ functions */
 
-DAT_RETURN dat_srq_create(
+DAT_RETURN DAT_API dat_srq_create(
         IN      DAT_IA_HANDLE           ia_handle,
         IN      DAT_PZ_HANDLE           pz_handle,
         IN      DAT_SRQ_ATTR            *srq_attr,
@@ -1209,13 +1206,13 @@ DAT_RETURN dat_srq_create(
     return dat_status;
 }
 
-DAT_RETURN dat_srq_free(
+DAT_RETURN DAT_API dat_srq_free(
 	IN      DAT_SRQ_HANDLE        srq_handle)
 {
     return DAT_SRQ_FREE (srq_handle);
 }
 
-DAT_RETURN dat_srq_post_recv(
+DAT_RETURN DAT_API dat_srq_post_recv(
 	IN      DAT_SRQ_HANDLE         srq_handle,
 	IN      DAT_COUNT              num_segments,
 	IN      DAT_LMR_TRIPLET *      local_iov,
@@ -1231,7 +1228,7 @@ DAT_RETURN dat_srq_post_recv(
 			      user_cookie);
 }
 
-DAT_RETURN dat_srq_query(
+DAT_RETURN DAT_API dat_srq_query(
 	IN      DAT_SRQ_HANDLE         srq_handle,
 	IN      DAT_SRQ_PARAM_MASK     srq_param_mask,
 	OUT     DAT_SRQ_PARAM *        srq_param)
@@ -1245,7 +1242,7 @@ DAT_RETURN dat_srq_query(
 			  srq_param);
 }
 
-DAT_RETURN dat_srq_resize(
+DAT_RETURN DAT_API dat_srq_resize(
 	IN      DAT_SRQ_HANDLE         srq_handle,
 	IN      DAT_COUNT              srq_max_recv_dto)
 {
@@ -1257,7 +1254,7 @@ DAT_RETURN dat_srq_resize(
 			   srq_max_recv_dto);
 }
 
-DAT_RETURN dat_srq_set_lw(
+DAT_RETURN DAT_API dat_srq_set_lw(
 	IN      DAT_SRQ_HANDLE         srq_handle,
 	IN      DAT_COUNT              low_watermark)
 {
diff --git a/dat/common/dat_strerror.c b/dat/common/dat_strerror.c
index 5f88336..885a261 100644
--- a/dat/common/dat_strerror.c
+++ b/dat/common/dat_strerror.c
@@ -49,12 +49,12 @@
  *                                                                   *
  *********************************************************************/
 
-DAT_RETURN
+static DAT_RETURN
 dat_strerror_major (
     IN  DAT_RETURN 		value,
     OUT const char 		**message );
 
-DAT_RETURN
+static DAT_RETURN
 dat_strerror_minor (
     IN  DAT_RETURN 		value,
     OUT const char 		**message );
@@ -66,7 +66,7 @@ dat_strerror_minor (
  *                                                                   *
  *********************************************************************/
 
-DAT_RETURN
+static DAT_RETURN
 dat_strerror_major (
     IN  DAT_RETURN 		value,
     OUT const char 		**message )
@@ -187,7 +187,7 @@ dat_strerror_major (
 }
 
 
-DAT_RETURN
+static DAT_RETURN
 dat_strerror_minor (
     IN  DAT_RETURN 		value,
     OUT const char 		**message )
@@ -600,7 +600,7 @@ dat_strerror_minor (
  *                                                                   *
  *********************************************************************/
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_strerror (
     IN  DAT_RETURN 		value,
     OUT const char 		**major_message,
diff --git a/dat/include/dat/dat.h b/dat/include/dat/dat.h
index 7fa543b..9c1632c 100755
--- a/dat/include/dat/dat.h
+++ b/dat/include/dat/dat.h
@@ -58,6 +58,11 @@
 
 #include <dat/dat_error.h>
 
+#ifdef __cplusplus
+extern "C"
+{
+#endif
+
 /* Generic DAT types */
 
 typedef char *  DAT_NAME_PTR;	/* Format for ia_name and attributes */
@@ -972,7 +977,7 @@ typedef struct dat_provider_info
  * unload the library after the last close.
  */
 
-extern DAT_RETURN dat_ia_openv (
+extern DAT_RETURN DAT_API dat_ia_openv (
 	IN      const DAT_NAME_PTR,	/* provider             */
 	IN      DAT_COUNT,		/* asynch_evd_min_qlen  */
 	INOUT   DAT_EVD_HANDLE *,	/* asynch_evd_handle    */
@@ -986,7 +991,7 @@ extern DAT_RETURN dat_ia_openv (
 		DAT_VERSION_MAJOR, DAT_VERSION_MINOR, \
 		DAT_THREADSAFE)
 
-extern DAT_RETURN dat_ia_query (
+extern DAT_RETURN DAT_API dat_ia_query (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	OUT     DAT_EVD_HANDLE *,	/* async_evd_handle     */
 	IN      DAT_IA_ATTR_MASK,	/* ia_attr_mask         */
@@ -994,38 +999,38 @@ extern DAT_RETURN dat_ia_query (
 	IN      DAT_PROVIDER_ATTR_MASK,	/* provider_attr_mask   */
 	OUT     DAT_PROVIDER_ATTR * );	/* provider_attr        */
 
-extern DAT_RETURN dat_ia_close (
+extern DAT_RETURN DAT_API dat_ia_close (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_CLOSE_FLAGS );	/* close_flags          */
 
 /* helper functions */
 
-extern DAT_RETURN dat_set_consumer_context (
+extern DAT_RETURN DAT_API dat_set_consumer_context (
 	IN      DAT_HANDLE,		/* dat_handle           */
 	IN      DAT_CONTEXT);		/* context              */
 
-extern DAT_RETURN dat_get_consumer_context (
+extern DAT_RETURN DAT_API dat_get_consumer_context (
 	IN      DAT_HANDLE,		/* dat_handle           */
 	OUT     DAT_CONTEXT * );	/* context              */
 
-extern DAT_RETURN dat_get_handle_type (
+extern DAT_RETURN DAT_API dat_get_handle_type (
 	IN      DAT_HANDLE,		/* dat_handle           */
 	OUT     DAT_HANDLE_TYPE * );	/* handle_type          */
 
 /* CR functions */
 
-extern DAT_RETURN dat_cr_query (
+extern DAT_RETURN DAT_API dat_cr_query (
 	IN      DAT_CR_HANDLE,		/* cr_handle            */
 	IN      DAT_CR_PARAM_MASK,	/* cr_param_mask        */
 	OUT     DAT_CR_PARAM * );	/* cr_param             */
 
-extern DAT_RETURN dat_cr_accept (
+extern DAT_RETURN DAT_API dat_cr_accept (
 	IN      DAT_CR_HANDLE,		/* cr_handle            */
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* private_data_size    */
 	IN      const DAT_PVOID );	/* private_data         */
 
-extern DAT_RETURN dat_cr_reject (
+extern DAT_RETURN DAT_API dat_cr_reject (
 	IN      DAT_CR_HANDLE, 		/* cr_handle            */
 	IN	DAT_COUNT,		/* private_data_size	*/
 	IN const DAT_PVOID );		/* private_data		*/
@@ -1033,35 +1038,35 @@ extern DAT_RETURN dat_cr_reject (
 /* For DAT-1.1 and above, this function is defined for both uDAPL and
  * kDAPL. For DAT-1.0, it is only defined for uDAPL.
  */
-extern DAT_RETURN dat_cr_handoff (
+extern DAT_RETURN DAT_API dat_cr_handoff (
 	IN      DAT_CR_HANDLE,		/* cr_handle            */
 	IN      DAT_CONN_QUAL);		/* handoff              */
 
 /* EVD functions */
 
-extern DAT_RETURN dat_evd_resize (
+extern DAT_RETURN DAT_API dat_evd_resize (
 	IN      DAT_EVD_HANDLE,	        /* evd_handle           */
 	IN      DAT_COUNT );	        /* evd_min_qlen         */
 
-extern DAT_RETURN dat_evd_post_se (
+extern DAT_RETURN DAT_API dat_evd_post_se (
 	IN      DAT_EVD_HANDLE,	        /* evd_handle           */
 	IN      const DAT_EVENT * );    /* event                */
 
-extern DAT_RETURN dat_evd_dequeue (
+extern DAT_RETURN DAT_API dat_evd_dequeue (
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	OUT     DAT_EVENT * );		/* event                */
 
-extern DAT_RETURN dat_evd_query (
+extern DAT_RETURN DAT_API dat_evd_query (
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	IN      DAT_EVD_PARAM_MASK,	/* evd_param_mask       */
 	OUT     DAT_EVD_PARAM * );	/* evd_param            */
 
-extern DAT_RETURN dat_evd_free (
+extern DAT_RETURN DAT_API dat_evd_free (
 	IN      DAT_EVD_HANDLE );	/* evd_handle           */
 
 /* EP functions */
 
-extern DAT_RETURN dat_ep_create (
+extern DAT_RETURN DAT_API dat_ep_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_PZ_HANDLE,		/* pz_handle            */
 	IN      DAT_EVD_HANDLE,		/* recv_completion_evd_handle */
@@ -1070,17 +1075,17 @@ extern DAT_RETURN dat_ep_create (
 	IN      const DAT_EP_ATTR *,	/* ep_attributes        */
 	OUT     DAT_EP_HANDLE * );	/* ep_handle            */
 
-extern DAT_RETURN dat_ep_query (
+extern DAT_RETURN DAT_API dat_ep_query (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_EP_PARAM_MASK,	/* ep_param_mask        */
 	OUT     DAT_EP_PARAM * );	/* ep_param             */
 
-extern DAT_RETURN dat_ep_modify (
+extern DAT_RETURN DAT_API dat_ep_modify (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_EP_PARAM_MASK,	/* ep_param_mask        */
 	IN      const DAT_EP_PARAM * ); /* ep_param             */
 
-extern DAT_RETURN dat_ep_connect (
+extern DAT_RETURN DAT_API dat_ep_connect (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_IA_ADDRESS_PTR,	/* remote_ia_address    */
 	IN      DAT_CONN_QUAL,		/* remote_conn_qual     */
@@ -1090,7 +1095,7 @@ extern DAT_RETURN dat_ep_connect (
 	IN      DAT_QOS,		/* quality_of_service   */
 	IN      DAT_CONNECT_FLAGS );	/* connect_flags        */
 
-extern DAT_RETURN dat_ep_dup_connect (
+extern DAT_RETURN DAT_API dat_ep_dup_connect (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_EP_HANDLE,		/* ep_dup_handle        */
 	IN      DAT_TIMEOUT,		/* timeout              */
@@ -1098,25 +1103,25 @@ extern DAT_RETURN dat_ep_dup_connect (
 	IN      const DAT_PVOID,	/* private_data         */
 	IN      DAT_QOS);		/* quality_of_service   */
 
-extern DAT_RETURN dat_ep_common_connect (
+extern DAT_RETURN DAT_API dat_ep_common_connect (
 	IN      DAT_EP_HANDLE,          /* ep_handle            */
 	IN      DAT_IA_ADDRESS_PTR,     /* remote_ia_address    */
 	IN      DAT_TIMEOUT,            /* timeout              */
 	IN      DAT_COUNT,              /* private_data_size    */
 	IN      const DAT_PVOID );      /* private_data         */
 
-extern DAT_RETURN dat_ep_disconnect (
+extern DAT_RETURN DAT_API dat_ep_disconnect (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_CLOSE_FLAGS );	/* close_flags          */
 
-extern DAT_RETURN dat_ep_post_send (
+extern DAT_RETURN DAT_API dat_ep_post_send (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* num_segments         */
 	IN      DAT_LMR_TRIPLET *,	/* local_iov            */
 	IN      DAT_DTO_COOKIE,		/* user_cookie          */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_post_send_with_invalidate (
+extern DAT_RETURN DAT_API dat_ep_post_send_with_invalidate (
 	IN      DAT_EP_HANDLE,          /* ep_handle            */
 	IN      DAT_COUNT,              /* num_segments         */
 	IN      DAT_LMR_TRIPLET *,      /* local_iov            */
@@ -1125,14 +1130,14 @@ extern DAT_RETURN dat_ep_post_send_with_invalidate (
 	IN	DAT_BOOLEAN,		/* invalidate_flag 	*/
 	IN	DAT_RMR_CONTEXT );	/* RMR to invalidate	*/
 
-extern DAT_RETURN dat_ep_post_recv (
+extern DAT_RETURN DAT_API dat_ep_post_recv (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* num_segments         */
 	IN      DAT_LMR_TRIPLET *,	/* local_iov            */
 	IN      DAT_DTO_COOKIE,		/* user_cookie          */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_post_rdma_read (
+extern DAT_RETURN DAT_API dat_ep_post_rdma_read (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* num_segments         */
 	IN      DAT_LMR_TRIPLET *,	/* local_iov            */
@@ -1140,14 +1145,14 @@ extern DAT_RETURN dat_ep_post_rdma_read (
 	IN      const DAT_RMR_TRIPLET *,/* remote_iov           */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_post_rdma_read_to_rmr (
+extern DAT_RETURN DAT_API dat_ep_post_rdma_read_to_rmr (
 	IN      DAT_EP_HANDLE,          /* ep_handle            */
 	IN const DAT_RMR_TRIPLET *,      /* local_iov            */
 	IN      DAT_DTO_COOKIE,         /* user_cookie          */
 	IN      const DAT_RMR_TRIPLET *,/* remote_iov           */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_post_rdma_write (
+extern DAT_RETURN DAT_API dat_ep_post_rdma_write (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* num_segments         */
 	IN      DAT_LMR_TRIPLET *,	/* local_iov            */
@@ -1155,19 +1160,19 @@ extern DAT_RETURN dat_ep_post_rdma_write (
 	IN      const DAT_RMR_TRIPLET *,/* remote_iov           */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_get_status (
+extern DAT_RETURN DAT_API dat_ep_get_status (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	OUT     DAT_EP_STATE *,		/* ep_state             */
 	OUT     DAT_BOOLEAN *,		/* recv_idle            */
 	OUT     DAT_BOOLEAN * );	/* request_idle         */
 
-extern DAT_RETURN dat_ep_free (
+extern DAT_RETURN DAT_API dat_ep_free (
 	IN      DAT_EP_HANDLE);		/* ep_handle            */
 
-extern DAT_RETURN dat_ep_reset (
+extern DAT_RETURN DAT_API dat_ep_reset (
 	IN      DAT_EP_HANDLE);		/* ep_handle            */
 
-extern DAT_RETURN dat_ep_create_with_srq (
+extern DAT_RETURN DAT_API dat_ep_create_with_srq (
         IN      DAT_IA_HANDLE,          /* ia_handle            */
         IN      DAT_PZ_HANDLE,          /* pz_handle            */
         IN      DAT_EVD_HANDLE,         /* recv_evd_handle      */
@@ -1177,49 +1182,49 @@ extern DAT_RETURN dat_ep_create_with_srq (
         IN      const DAT_EP_ATTR *,    /* ep_attributes        */
         OUT     DAT_EP_HANDLE *);       /* ep_handle            */
 
-extern DAT_RETURN dat_ep_recv_query (
+extern DAT_RETURN DAT_API dat_ep_recv_query (
         IN      DAT_EP_HANDLE,          /* ep_handle            */
         OUT     DAT_COUNT *,            /* nbufs_allocated      */
         OUT     DAT_COUNT *);           /* bufs_alloc_span      */
 
-extern DAT_RETURN dat_ep_set_watermark (
+extern DAT_RETURN DAT_API dat_ep_set_watermark (
         IN      DAT_EP_HANDLE,          /* ep_handle            */
         IN      DAT_COUNT,              /* soft_high_watermark  */
         IN      DAT_COUNT);             /* hard_high_watermark  */
 
 /* LMR functions */
 
-extern DAT_RETURN dat_lmr_free (
+extern DAT_RETURN DAT_API dat_lmr_free (
 	IN      DAT_LMR_HANDLE);	/* lmr_handle           */
 
 /* Non-coherent memory functions */
 
-extern DAT_RETURN dat_lmr_sync_rdma_read (
+extern DAT_RETURN DAT_API dat_lmr_sync_rdma_read (
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      const DAT_LMR_TRIPLET *, /* local_segments      */
 	IN      DAT_VLEN);              /* num_segments         */
 
-extern DAT_RETURN dat_lmr_sync_rdma_write (
+extern DAT_RETURN DAT_API dat_lmr_sync_rdma_write (
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      const DAT_LMR_TRIPLET *, /* local_segments      */
 	IN      DAT_VLEN);              /* num_segments         */
 
 /* RMR functions */
 
-extern DAT_RETURN dat_rmr_create (
+extern DAT_RETURN DAT_API dat_rmr_create (
 	IN      DAT_PZ_HANDLE,		/* pz_handle            */
 	OUT     DAT_RMR_HANDLE *);	/* rmr_handle           */
 
-extern DAT_RETURN dat_rmr_create_for_ep (
+extern DAT_RETURN DAT_API dat_rmr_create_for_ep (
 	IN      DAT_PZ_HANDLE,          /* pz_handle            */
 	OUT     DAT_RMR_HANDLE *);      /* rmr_handle           */
 
-extern DAT_RETURN dat_rmr_query (
+extern DAT_RETURN DAT_API dat_rmr_query (
 	IN      DAT_RMR_HANDLE,		/* rmr_handle           */
 	IN      DAT_RMR_PARAM_MASK,	/* rmr_param_mask       */
 	OUT     DAT_RMR_PARAM *);	/* rmr_param            */
 
-extern DAT_RETURN dat_rmr_bind (
+extern DAT_RETURN DAT_API dat_rmr_bind (
 	IN      DAT_RMR_HANDLE,		/* rmr_handle           */
 	IN	DAT_LMR_HANDLE,		/* lmr_handle		*/
 	IN      const DAT_LMR_TRIPLET *,/* lmr_triplet          */
@@ -1230,114 +1235,114 @@ extern DAT_RETURN dat_rmr_bind (
 	IN      DAT_COMPLETION_FLAGS,	/* completion_flags     */
 	OUT     DAT_RMR_CONTEXT * );	/* context              */
 
-extern DAT_RETURN dat_rmr_free (
+extern DAT_RETURN DAT_API dat_rmr_free (
 	IN      DAT_RMR_HANDLE);	/* rmr_handle           */
 
 /* PSP functions */
 
-extern DAT_RETURN dat_psp_create (
+extern DAT_RETURN DAT_API dat_psp_create (
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      DAT_CONN_QUAL,          /* conn_qual            */
 	IN      DAT_EVD_HANDLE,         /* evd_handle           */
 	IN      DAT_PSP_FLAGS,          /* psp_flags            */
 	OUT     DAT_PSP_HANDLE * );     /* psp_handle           */
 
-extern DAT_RETURN dat_psp_create_any (
+extern DAT_RETURN DAT_API dat_psp_create_any (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	OUT     DAT_CONN_QUAL *,	/* conn_qual            */
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	IN      DAT_PSP_FLAGS,		/* psp_flags            */
 	OUT     DAT_PSP_HANDLE * );	/* psp_handle           */
 
-extern DAT_RETURN dat_psp_query (
+extern DAT_RETURN DAT_API dat_psp_query (
 	IN      DAT_PSP_HANDLE,		/* psp_handle           */
 	IN      DAT_PSP_PARAM_MASK,	/* psp_param_mask       */
 	OUT     DAT_PSP_PARAM * );	/* psp_param            */
 
-extern DAT_RETURN dat_psp_free (
+extern DAT_RETURN DAT_API dat_psp_free (
 	IN      DAT_PSP_HANDLE );	/* psp_handle           */
 
 /* RSP functions */
 
-extern DAT_RETURN dat_rsp_create (
+extern DAT_RETURN DAT_API dat_rsp_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_CONN_QUAL,		/* conn_qual            */
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	OUT     DAT_RSP_HANDLE * );	/* rsp_handle           */
 
-extern DAT_RETURN dat_rsp_query (
+extern DAT_RETURN DAT_API dat_rsp_query (
 	IN      DAT_RSP_HANDLE,		/* rsp_handle           */
 	IN      DAT_RSP_PARAM_MASK,	/* rsp_param_mask       */
 	OUT     DAT_RSP_PARAM * );	/* rsp_param            */
 
-extern DAT_RETURN dat_rsp_free (
+extern DAT_RETURN DAT_API dat_rsp_free (
 	IN      DAT_RSP_HANDLE );	/* rsp_handle           */
 
 /* CSP functions */
 
-extern DAT_RETURN dat_csp_create (
+extern DAT_RETURN DAT_API dat_csp_create (
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      DAT_COMM *,          	/* communicator		*/
 	IN      DAT_IA_ADDRESS_PTR,     /* address		*/
 	IN      DAT_EVD_HANDLE,         /* evd_handle           */
 	OUT     DAT_CSP_HANDLE * );     /* csp_handle           */
 
-extern DAT_RETURN dat_csp_query (
+extern DAT_RETURN DAT_API dat_csp_query (
 	IN      DAT_CSP_HANDLE,         /* csp_handle           */
 	IN      DAT_CSP_PARAM_MASK,     /* csp_param_mask       */
 	OUT     DAT_CSP_PARAM * );      /* csp_param            */
 
-extern DAT_RETURN dat_csp_free (
+extern DAT_RETURN DAT_API dat_csp_free (
 	IN      DAT_CSP_HANDLE );       /* csp_handle           */
 
 /* PZ functions */
 
-extern DAT_RETURN dat_pz_create (
+extern DAT_RETURN DAT_API dat_pz_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	OUT     DAT_PZ_HANDLE * );	/* pz_handle            */
 
-extern DAT_RETURN dat_pz_query (
+extern DAT_RETURN DAT_API dat_pz_query (
 	IN      DAT_PZ_HANDLE,		/* pz_handle            */
 	IN      DAT_PZ_PARAM_MASK,	/* pz_param_mask        */
 	OUT     DAT_PZ_PARAM *);	/* pz_param             */
 
-extern DAT_RETURN dat_pz_free (
+extern DAT_RETURN DAT_API dat_pz_free (
 	IN      DAT_PZ_HANDLE );	/* pz_handle            */
 
 /* SRQ functions */
 
-extern DAT_RETURN dat_srq_create (
+extern DAT_RETURN DAT_API dat_srq_create (
         IN      DAT_IA_HANDLE,          /* ia_handle            */
         IN      DAT_PZ_HANDLE,          /* pz_handle            */
         IN      DAT_SRQ_ATTR *,         /* srq_attr             */
         OUT     DAT_SRQ_HANDLE *);      /* srq_handle           */
 
-extern DAT_RETURN dat_srq_free (
+extern DAT_RETURN DAT_API dat_srq_free (
 	IN      DAT_SRQ_HANDLE);        /* srq_handle           */
 
-extern DAT_RETURN dat_srq_post_recv (
+extern DAT_RETURN DAT_API dat_srq_post_recv (
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_COUNT,              /* num_segments         */
 	IN      DAT_LMR_TRIPLET *,      /* local_iov            */
 	IN      DAT_DTO_COOKIE);        /* user_cookie          */
 
-extern DAT_RETURN dat_srq_query (
+extern DAT_RETURN DAT_API dat_srq_query (
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_SRQ_PARAM_MASK,     /* srq_param_mask       */
 	OUT     DAT_SRQ_PARAM *);       /* srq_param            */
 
-extern DAT_RETURN dat_srq_resize (
+extern DAT_RETURN DAT_API dat_srq_resize (
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_COUNT);             /* srq_max_recv_dto     */
 
-extern DAT_RETURN dat_srq_set_lw (
+extern DAT_RETURN DAT_API dat_srq_set_lw (
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_COUNT);             /* low_watermark        */
 
 #ifdef DAT_EXTENSIONS
 typedef int	DAT_EXTENDED_OP;
-extern DAT_RETURN dat_extension_op(
+extern DAT_RETURN DAT_API dat_extension_op(
 	IN	DAT_HANDLE,		/* handle */
 	IN      DAT_EXTENDED_OP,	/* operation */
 	IN	... );			/* args */
@@ -1349,7 +1354,7 @@ extern DAT_RETURN dat_extension_op(
  * Note the dat_ia_open and dat_ia_close functions are linked to
  * registration code which "redirects" to the appropriate provider.
  */
-extern DAT_RETURN dat_registry_list_providers (
+extern DAT_RETURN DAT_API dat_registry_list_providers (
 	IN      DAT_COUNT,		/* max_to_return        */
 	OUT     DAT_COUNT *,		/* entries_returned     */
 	OUT     DAT_PROVIDER_INFO *(dat_provider_list[]) ); /* dat_provider_list */
@@ -1357,10 +1362,14 @@ extern DAT_RETURN dat_registry_list_providers (
 /*
  * DAT error functions.
  */
-extern DAT_RETURN dat_strerror (
+extern DAT_RETURN DAT_API dat_strerror (
 	IN      DAT_RETURN,		/* dat function return */
 	OUT     const char ** ,		/* major message string */
 	OUT     const char ** );	/* minor message string */
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _DAT_H_ */
 
diff --git a/dat/include/dat/dat_platform_specific.h b/dat/include/dat/dat_platform_specific.h
index 0314035..8a839d1 100644
--- a/dat/include/dat/dat_platform_specific.h
+++ b/dat/include/dat/dat_platform_specific.h
@@ -133,6 +133,9 @@ typedef struct sockaddr_in6    DAT_SOCK_ADDR6; /* Socket address header native t
 
 typedef DAT_UINT64		DAT_PADDR;
 
+#define DAT_API
+#define DAT_EXPORT		extern
+
 /* Solaris ends */
 
 
@@ -174,37 +177,74 @@ typedef int DAT_FD;		/* DAT File Descriptor */
 
 typedef struct sockaddr         DAT_SOCK_ADDR; /* Socket address header native to OS */
 typedef struct sockaddr_in6     DAT_SOCK_ADDR6; /* Socket address header native to OS */
-#define DAT_AF_INET AF_INET
-#define DAT_AF_INET6 AF_INET6
-/* Linux ends */
+#define DAT_AF_INET		AF_INET
+#define DAT_AF_INET6		AF_INET6
+
+#define DAT_API
+#define DAT_EXPORT		extern
 
+/* Linux ends */
 
-/* Win32 begins */
-#elif defined(_MSC_VER) || defined(_WIN32) /* NT. MSC compiler, Win32 platform */
+/* Win32/64 begins */
+#elif defined(_MSC_VER) || defined(_WIN32) || defined(_WIN64)
+/* NT. MSC compiler, Win32/64 platform */
 
 typedef unsigned __int32        DAT_UINT32;	/* Unsigned host order, 32 bits */
 typedef unsigned __int64        DAT_UINT64;	/* unsigned host order, 64 bits */
-typedef unsigned  long		DAT_UVERYLONG;	/* unsigned longest native to compiler */
+typedef unsigned  long	        DAT_UVERYLONG;	/* unsigned longest native to compiler */
+
+#if defined(_WIN64)
+#define DAT_IA_HANDLE_TO_UL(a) (unsigned long)((DAT_UINT64)(a))
+#define UL_TO_DAT_IA_HANDLE(a) (DAT_IA_HANDLE)((DAT_UINT64)(a))
+#else /* _WIN32 */
+#define DAT_IA_HANDLE_TO_UL(a) (unsigned long)(a)
+#define UL_TO_DAT_IA_HANDLE(a) (DAT_IA_HANDLE)(a)
+#endif
 
 typedef void *                  DAT_PVOID;
 typedef long                    DAT_COUNT;
+typedef DAT_UINT64              DAT_PADDR;
 
-typedef struct sockaddr         DAT_SOCK_ADDR;	/* Socket address header native to OS */
-typedef struct sockaddr_in6     DAT_SOCK_ADDR6; /* Socket address header native to OS */
+typedef struct dat_comm {
+	int	domain;
+	int	type;
+	int	protocol;
+} DAT_COMM;
+
+typedef int DAT_FD;		/* DAT File Descriptor */
+
+typedef struct sockaddr     DAT_SOCK_ADDR; /* Sock addr header native to OS */
+typedef struct sockaddr_in6 DAT_SOCK_ADDR6;/* Sock addr header native to OS */
 
 #ifndef UINT64_C
 #define UINT64_C(c) c ## i64
 #endif /* UINT64_C */
 
-#define DAT_AF_INET             AF_INET
-#define DAT_AF_INET6            AF_INET6
+#define DAT_AF_INET        AF_INET
+#define DAT_AF_INET6       AF_INET6
+
+#if defined(EXPORT_DAT_SYMBOLS)
+#define DAT_EXPORT	__declspec(dllexport)
+#else
+#define DAT_EXPORT	__declspec(dllimport)
+#endif
+
+#define DAT_API		__stdcall
+
+#ifndef __inline__
+#define __inline__	__inline
+#endif
+
+#ifndef INLINE
+#define INLINE		__inline
+#endif
 
 #if defined(__KDAPL__)
 /* must have the DDK for this definition */
 typedef PHYSICAL_ADDRESS	DAT_PADDR;
 #endif /* __KDAPL__ */
 
-/* Win32 ends */
+/* Win32/64 ends */
 
 
 #else
diff --git a/dat/include/dat/dat_registry.h b/dat/include/dat/dat_registry.h
index fe9db4b..80c3801 100644
--- a/dat/include/dat/dat_registry.h
+++ b/dat/include/dat/dat_registry.h
@@ -59,6 +59,11 @@
 #ifndef _DAT_REGISTRY_H_
 #define _DAT_REGISTRY_H_
 
+#ifdef __cplusplus
+extern "C"
+{
+#endif
+
 #if defined(_UDAT_H_)
 #include <dat/udat_redirection.h>
 #elif defined(_KDAT_H_)
@@ -78,11 +83,11 @@
  *
  */
 
-extern DAT_RETURN dat_registry_add_provider (
+extern DAT_RETURN DAT_API dat_registry_add_provider (
 	IN  const DAT_PROVIDER *,               /* provider          */
 	IN  const DAT_PROVIDER_INFO* );         /* provider info     */
 
-extern DAT_RETURN dat_registry_remove_provider (
+extern DAT_RETURN DAT_API dat_registry_remove_provider (
 	IN  const DAT_PROVIDER *,               /* provider          */
 	IN  const DAT_PROVIDER_INFO* );         /* provider info     */
 
@@ -99,11 +104,11 @@ extern DAT_RETURN dat_registry_remove_provider (
 #define DAT_PROVIDER_INIT_FUNC_STR   "dat_provider_init"
 #define DAT_PROVIDER_FINI_FUNC_STR   "dat_provider_fini"
 
-typedef void ( *DAT_PROVIDER_INIT_FUNC) (
+typedef void ( DAT_API *DAT_PROVIDER_INIT_FUNC) (
 	IN const DAT_PROVIDER_INFO *,           /* provider info     */
 	IN const char *);                       /* instance data     */
 
-typedef void ( *DAT_PROVIDER_FINI_FUNC) (
+typedef void ( DAT_API *DAT_PROVIDER_FINI_FUNC) (
 	IN const DAT_PROVIDER_INFO *);          /* provider info     */
 
 typedef enum dat_ha_relationship
@@ -115,9 +120,13 @@ typedef enum dat_ha_relationship
 	DAT_HA_EXTENSION_BASE
 } DAT_HA_RELATIONSHIP;
 
-extern DAT_RETURN dat_registry_providers_related (
+extern DAT_RETURN DAT_API dat_registry_providers_related (
 	IN      const DAT_NAME_PTR,
 	IN      const DAT_NAME_PTR,
 	OUT     DAT_HA_RELATIONSHIP * );
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _DAT_REGISTRY_H_ */
diff --git a/dat/include/dat/udat.h b/dat/include/dat/udat.h
index 7a89241..a9bb2ac 100755
--- a/dat/include/dat/udat.h
+++ b/dat/include/dat/udat.h
@@ -60,6 +60,11 @@
 
 #include <dat/dat_platform_specific.h>
 
+#ifdef __cplusplus
+extern "C"
+{
+#endif
+
 typedef enum dat_mem_type
 {
         /* Shared between udat and kdat */
@@ -152,7 +157,7 @@ typedef char (* DAT_LMR_COOKIE)[DAT_LMR_COOKIE_SIZE];
 
 /* Format for OS wait proxy agent function */
 
-typedef void (*DAT_AGENT_FUNC)
+typedef void (DAT_API *DAT_AGENT_FUNC)
 (
     DAT_PVOID,                 /* instance data */
     DAT_EVD_HANDLE             /* Event Dispatcher*/
@@ -410,7 +415,7 @@ struct dat_provider_attr
  * User DAT function call definitions,
  */
 
-extern DAT_RETURN dat_lmr_create (
+extern DAT_RETURN DAT_API dat_lmr_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_MEM_TYPE,		/* mem_type             */
 	IN      DAT_REGION_DESCRIPTION, /* region_description   */
@@ -424,72 +429,76 @@ extern DAT_RETURN dat_lmr_create (
 	OUT     DAT_VLEN *,		/* registered_length    */
 	OUT     DAT_VADDR * );		/* registered_address   */
 
-extern DAT_RETURN dat_lmr_query (
+extern DAT_RETURN DAT_API dat_lmr_query (
 	IN      DAT_LMR_HANDLE,		/* lmr_handle           */
 	IN      DAT_LMR_PARAM_MASK,	/* lmr_param_mask       */
 	OUT     DAT_LMR_PARAM * );	/* lmr_param            */
 
 /* Event Functions */
 
-extern DAT_RETURN dat_evd_create (
+extern DAT_RETURN DAT_API dat_evd_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_COUNT,		/* evd_min_qlen         */
 	IN      DAT_CNO_HANDLE,		/* cno_handle           */
 	IN      DAT_EVD_FLAGS,		/* evd_flags            */
 	OUT     DAT_EVD_HANDLE * );	/* evd_handle           */
 
-extern DAT_RETURN dat_evd_modify_cno (
+extern DAT_RETURN DAT_API dat_evd_modify_cno (
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	IN      DAT_CNO_HANDLE);	/* cno_handle           */
 
-extern DAT_RETURN dat_cno_create (
+extern DAT_RETURN DAT_API dat_cno_create (
 	IN 	DAT_IA_HANDLE,		/* ia_handle            */
 	IN 	DAT_OS_WAIT_PROXY_AGENT,/* agent                */
 	OUT 	DAT_CNO_HANDLE *);	/* cno_handle           */
 
-extern DAT_RETURN dat_cno_modify_agent (
+extern DAT_RETURN DAT_API dat_cno_modify_agent (
 	IN 	DAT_CNO_HANDLE,		 /* cno_handle           */
 	IN 	DAT_OS_WAIT_PROXY_AGENT);/* agent                */
 
-extern DAT_RETURN dat_cno_query (
+extern DAT_RETURN DAT_API dat_cno_query (
 	IN      DAT_CNO_HANDLE,		/* cno_handle            */
 	IN      DAT_CNO_PARAM_MASK,	/* cno_param_mask        */
 	OUT     DAT_CNO_PARAM * );	/* cno_param             */
 
-extern DAT_RETURN dat_cno_free (
+extern DAT_RETURN DAT_API dat_cno_free (
 	IN DAT_CNO_HANDLE);		/* cno_handle            */
 
-extern DAT_RETURN dat_cno_wait (
+extern DAT_RETURN DAT_API dat_cno_wait (
 	IN  	DAT_CNO_HANDLE,		/* cno_handle            */
 	IN  	DAT_TIMEOUT,		/* timeout               */
 	OUT 	DAT_EVD_HANDLE *);	/* evd_handle            */
 
-extern DAT_RETURN dat_evd_enable (
+extern DAT_RETURN DAT_API dat_evd_enable (
 	IN      DAT_EVD_HANDLE);	/* evd_handle            */
 
-extern DAT_RETURN dat_evd_wait (
+extern DAT_RETURN DAT_API dat_evd_wait (
 	IN  	DAT_EVD_HANDLE,		/* evd_handle            */
 	IN  	DAT_TIMEOUT,		/* timeout               */
 	IN  	DAT_COUNT,		/* threshold             */
 	OUT 	DAT_EVENT *,		/* event                 */
 	OUT 	DAT_COUNT * );		/* n_more_events         */
 
-extern DAT_RETURN dat_evd_disable (
+extern DAT_RETURN DAT_API dat_evd_disable (
 	IN      DAT_EVD_HANDLE);	/* evd_handle            */
 
-extern DAT_RETURN dat_evd_set_unwaitable (
+extern DAT_RETURN DAT_API dat_evd_set_unwaitable (
 	IN DAT_EVD_HANDLE);		/* evd_handle            */
 
-extern DAT_RETURN dat_evd_clear_unwaitable (
+extern DAT_RETURN DAT_API dat_evd_clear_unwaitable (
 	IN DAT_EVD_HANDLE);		/* evd_handle            */
 
-extern DAT_RETURN dat_cno_fd_create (
+extern DAT_RETURN DAT_API dat_cno_fd_create (
 	IN	DAT_IA_HANDLE,		/* ia_handle		*/
 	OUT	DAT_FD *,		/* file descriptor	*/
 	OUT	DAT_CNO_HANDLE * );	/* cno_handle		*/
 
-extern DAT_RETURN dat_cno_trigger (
+extern DAT_RETURN DAT_API dat_cno_trigger (
 	IN	DAT_CNO_HANDLE,		/* cno_handle		*/
 	OUT	DAT_EVD_HANDLE * );	/* evd_handle		*/
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _UDAT_H_ */
diff --git a/dat/udat/udat.c b/dat/udat/udat.c
index 8356482..9137e65 100755
--- a/dat/udat/udat.c
+++ b/dat/udat/udat.c
@@ -82,15 +82,15 @@ int g_dat_extensions = 0;
  * Function: dat_registry_add_provider
  ***********************************************************************/
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_registry_add_provider (
 	IN  const DAT_PROVIDER	 	*provider,
 	IN  const DAT_PROVIDER_INFO	*provider_info )
 {
-    DAT_DR_ENTRY 		entry;
+    DAT_DR_ENTRY 	entry;
 
     dat_os_dbg_print (DAT_OS_DBG_TYPE_PROVIDER_API,
-		      "DAT Registry: dat_registry_add_provider (%s,%x:%x,%x)\n",
+	      "DAT Registry: %s (%s,%x:%x,%x)\n", __FUNCTION__,
 		      provider_info->ia_name,
 		      provider_info->dapl_version_major,
 		      provider_info->dapl_version_minor,
@@ -123,7 +123,7 @@ dat_registry_add_provider (
 // Function: dat_registry_remove_provider
 //***********************************************************************
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_registry_remove_provider (
 	IN  const DAT_PROVIDER 		*provider,
 	IN  const DAT_PROVIDER_INFO	*provider_info )
@@ -155,7 +155,7 @@ dat_registry_remove_provider (
  * Function: dat_ia_open
  ***********************************************************************/
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_ia_openv (
     IN	   const DAT_NAME_PTR	name,
     IN	   DAT_COUNT		async_event_qlen,
@@ -271,7 +271,7 @@ dat_ia_openv (
  * Function: dat_ia_close
  ***********************************************************************/
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_ia_close (
     IN DAT_IA_HANDLE	ia_handle,
     IN DAT_CLOSE_FLAGS	ia_flags)
@@ -366,7 +366,7 @@ dat_ia_close (
 // Function: dat_registry_list_providers
 //***********************************************************************
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_registry_list_providers (
     IN  DAT_COUNT   		max_to_return,
     OUT DAT_COUNT   		*entries_returned,
diff --git a/dat/udat/udat_api.c b/dat/udat/udat_api.c
index 58813fe..8dea05c 100644
--- a/dat/udat/udat_api.c
+++ b/dat/udat/udat_api.c
@@ -52,7 +52,7 @@
 
 #define UDAT_IS_BAD_HANDLE(h) ( NULL == (p) )
 
-DAT_RETURN dat_lmr_create (
+DAT_RETURN DAT_API dat_lmr_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_MEM_TYPE		mem_type,
 	IN      DAT_REGION_DESCRIPTION	region_description,
@@ -91,7 +91,7 @@ DAT_RETURN dat_lmr_create (
 }
 
 
-DAT_RETURN dat_evd_create (
+DAT_RETURN DAT_API dat_evd_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_COUNT		evd_min_qlen,
 	IN      DAT_CNO_HANDLE		cno_handle,
@@ -116,7 +116,7 @@ DAT_RETURN dat_evd_create (
 }
 
 
-DAT_RETURN dat_evd_modify_cno (
+DAT_RETURN DAT_API dat_evd_modify_cno (
 	IN      DAT_EVD_HANDLE		evd_handle,
 	IN      DAT_CNO_HANDLE		cno_handle)
 {
@@ -129,7 +129,7 @@ DAT_RETURN dat_evd_modify_cno (
 }
 
 
-DAT_RETURN dat_cno_create (
+DAT_RETURN DAT_API dat_cno_create (
 	IN 	DAT_IA_HANDLE		ia_handle,
 	IN 	DAT_OS_WAIT_PROXY_AGENT agent,
 	OUT 	DAT_CNO_HANDLE		*cno_handle)
@@ -149,7 +149,7 @@ DAT_RETURN dat_cno_create (
     return dat_status;
 }
 
-DAT_RETURN dat_cno_fd_create (
+DAT_RETURN DAT_API dat_cno_fd_create (
 	IN 	DAT_IA_HANDLE		ia_handle,
 	OUT	DAT_FD			*fd,
 	OUT 	DAT_CNO_HANDLE		*cno_handle)
@@ -169,7 +169,7 @@ DAT_RETURN dat_cno_fd_create (
     return dat_status;
 }
 
-DAT_RETURN dat_cno_modify_agent (
+DAT_RETURN DAT_API dat_cno_modify_agent (
 	IN 	DAT_CNO_HANDLE		 cno_handle,
 	IN 	DAT_OS_WAIT_PROXY_AGENT	 agent)
 {
@@ -182,7 +182,7 @@ DAT_RETURN dat_cno_modify_agent (
 }
 
 
-DAT_RETURN dat_cno_query (
+DAT_RETURN DAT_API dat_cno_query (
 	IN      DAT_CNO_HANDLE		cno_handle,
 	IN      DAT_CNO_PARAM_MASK	cno_param_mask,
 	OUT     DAT_CNO_PARAM		*cno_param)
@@ -193,7 +193,7 @@ DAT_RETURN dat_cno_query (
 }
 
 
-DAT_RETURN dat_cno_free (
+DAT_RETURN DAT_API dat_cno_free (
 	IN DAT_CNO_HANDLE		cno_handle)
 {
     if (cno_handle == NULL)
@@ -204,7 +204,7 @@ DAT_RETURN dat_cno_free (
 }
 
 
-DAT_RETURN dat_cno_wait (
+DAT_RETURN DAT_API dat_cno_wait (
 	IN  	DAT_CNO_HANDLE		cno_handle,
 	IN  	DAT_TIMEOUT		timeout,
 	OUT 	DAT_EVD_HANDLE		*evd_handle)
@@ -219,7 +219,7 @@ DAT_RETURN dat_cno_wait (
 }
 
 
-DAT_RETURN dat_evd_enable (
+DAT_RETURN DAT_API dat_evd_enable (
 	IN      DAT_EVD_HANDLE		evd_handle)
 {
     if (evd_handle == NULL)
@@ -230,7 +230,7 @@ DAT_RETURN dat_evd_enable (
 }
 
 
-DAT_RETURN dat_evd_wait (
+DAT_RETURN DAT_API dat_evd_wait (
 	IN  	DAT_EVD_HANDLE		evd_handle,
 	IN  	DAT_TIMEOUT		Timeout,
 	IN  	DAT_COUNT		Threshold,
@@ -249,7 +249,7 @@ DAT_RETURN dat_evd_wait (
 }
 
 
-DAT_RETURN dat_evd_disable (
+DAT_RETURN DAT_API dat_evd_disable (
 	IN      DAT_EVD_HANDLE		evd_handle)
 {
     if (evd_handle == NULL)
@@ -260,7 +260,7 @@ DAT_RETURN dat_evd_disable (
 }
 
 
-DAT_RETURN dat_evd_set_unwaitable (
+DAT_RETURN DAT_API dat_evd_set_unwaitable (
 	IN 	DAT_EVD_HANDLE		 evd_handle)
 {
     if (evd_handle == NULL)
@@ -271,7 +271,7 @@ DAT_RETURN dat_evd_set_unwaitable (
 }
 
 
-DAT_RETURN dat_evd_clear_unwaitable (
+DAT_RETURN DAT_API dat_evd_clear_unwaitable (
 	IN 	DAT_EVD_HANDLE		 evd_handle)
 {
     if (evd_handle == NULL)
@@ -283,7 +283,7 @@ DAT_RETURN dat_evd_clear_unwaitable (
 
 
-DAT_RETURN dat_cr_handoff (
+DAT_RETURN DAT_API dat_cr_handoff (
     IN          DAT_CR_HANDLE		cr_handle,
     IN          DAT_CONN_QUAL		handoff)
 {
@@ -296,7 +296,7 @@ DAT_RETURN dat_cr_handoff (
 }
 
 
-DAT_RETURN dat_lmr_query (
+DAT_RETURN DAT_API dat_lmr_query (
 	IN      DAT_LMR_HANDLE		lmr_handle,
 	IN      DAT_LMR_PARAM_MASK	lmr_param_mask,
 	OUT     DAT_LMR_PARAM		*lmr_param)


From arlin.r.davis at intel.com  Fri Dec  7 16:08:56 2007
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Fri, 7 Dec 2007 16:08:56 -0800
Subject: [ofa-general] [PATCH 2/2] uDAT/uDAPL v2 - (master branch) changes to
	sync common code base with WinOF 1.01
Message-ID: <000101c8392e$86f2ff50$1dfd070a@amr.corp.intel.com>


2/2 uDAPL changes.

  - add DAT_API to specify the calling convention (Windows: __stdcall, Linux: empty)
  - clean up CR+LF line endings
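
For readers unfamiliar with calling-convention macros, here is a minimal
sketch of the pattern the first item describes. It is not the actual DAT
header: EXAMPLE_API, EXAMPLE_RETURN and the example_* functions are
hypothetical names made up for illustration, and the real DAT_API
definition is not part of this patch. The macro is assumed to expand to
__stdcall on Windows and to nothing elsewhere, and it has to appear on the
declaration, the definition and any function-pointer type so that all
three agree.

```c
#include <stddef.h>

/* Hypothetical stand-in for DAT_API: __stdcall on Windows, empty elsewhere. */
#if defined(_WIN32)
#define EXAMPLE_API __stdcall
#else
#define EXAMPLE_API
#endif

/* Hypothetical stand-in for DAT_RETURN. */
typedef int EXAMPLE_RETURN;

/* Declaration: the macro sits between the return type and the name. */
extern EXAMPLE_RETURN EXAMPLE_API example_evd_enable(void *evd_handle);

/* A function-pointer type must carry the same convention as its target. */
typedef EXAMPLE_RETURN (EXAMPLE_API *example_evd_enable_func)(void *);

/* Definition: repeats the macro so it matches the declaration. */
EXAMPLE_RETURN EXAMPLE_API example_evd_enable(void *evd_handle)
{
    if (evd_handle == NULL)
        return -1;  /* illustrative error value, not a DAT status code */
    return 0;
}

/* Calls through the pointer type, as a provider dispatch table would. */
int example_call_through_table(void *handle)
{
    example_evd_enable_func fn = example_evd_enable;
    return fn(handle);
}
```

On Linux the macro expands to nothing, so the sketch compiles to the same
code as before; only Windows builds see a different convention.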

Signed-off-by: Arlin Davis <ardavis at ichips.intel.com>

diff --git a/dapl/common/dapl_cno_util.c b/dapl/common/dapl_cno_util.c
index 936224a..9de7add 100755
--- a/dapl/common/dapl_cno_util.c
+++ b/dapl/common/dapl_cno_util.c
@@ -266,11 +266,11 @@ dapl_internal_cno_trigger (
  *
  * DAPL Requirements Version 2.0, 6.3.2.x
  *
- * creates a CNO instance. Upon creation, there are no
- * Event Dispatchers feeding it. os_fd is a File Descriptor in Unix, 
- * i.e. struct pollfd or an equivalent object in other OSes that is 
- * always associates with the created CNO. Consumer can multiplex event 
- * waiting using UNIX poll or select functions. Upon creation, the CNO 
+ * creates a CNO instance. Upon creation, there are no
+ * Event Dispatchers feeding it. os_fd is a File Descriptor in Unix, 
+ * i.e. struct pollfd or an equivalent object in other OSes that is 
+ * always associates with the created CNO. Consumer can multiplex event 
+ * waiting using UNIX poll or select functions. Upon creation, the CNO 
  * is not associated with any EVDs, has no waiters and has the os_fd 
  * associated with it.
  *
@@ -292,7 +292,7 @@ dapl_internal_cno_trigger (
  *	DAT_PRIVILEGES_VIOLATION
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN 
+DAT_RETURN DAT_API
 dapl_cno_fd_create (
 	IN 	DAT_IA_HANDLE ia_handle,	/* ia_handle            */
 	OUT	DAT_FD *fd,			/* file_descriptor	*/
@@ -319,7 +319,7 @@ dapl_cno_fd_create (
  *	DAT_INVALID_HANDLE
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_cno_trigger(
     IN  DAT_CNO_HANDLE	cno_handle,
     OUT DAT_EVD_HANDLE	*evd_handle)
diff --git a/dapl/common/dapl_cr_accept.c b/dapl/common/dapl_cr_accept.c
index c30c29e..a0ec14b 100644
--- a/dapl/common/dapl_cr_accept.c
+++ b/dapl/common/dapl_cr_accept.c
@@ -62,7 +62,7 @@
  *	DAT_INVALID_PARAMETER
  *	DAT_INVALID_ATTRIBUTE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_cr_accept (
 	IN	DAT_CR_HANDLE		cr_handle,
 	IN	DAT_EP_HANDLE		ep_handle,
diff --git a/dapl/common/dapl_cr_handoff.c b/dapl/common/dapl_cr_handoff.c
index b34508f..1899360 100644
--- a/dapl/common/dapl_cr_handoff.c
+++ b/dapl/common/dapl_cr_handoff.c
@@ -58,7 +58,7 @@
  *	DAT_INVALID_HANDLE
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_cr_handoff (
 	IN	DAT_CR_HANDLE	   cr_handle,
 	IN 	DAT_CONN_QUAL	   cr_handoff )		/* handoff */
diff --git a/dapl/common/dapl_cr_query.c b/dapl/common/dapl_cr_query.c
index 1b1101b..d458782 100644
--- a/dapl/common/dapl_cr_query.c
+++ b/dapl/common/dapl_cr_query.c
@@ -58,7 +58,7 @@
  *	DAT_INVALID_PARAMETER
  *	DAT_INVALID_HANDLE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_cr_query (
 	IN	DAT_CR_HANDLE		cr_handle,
 	IN	DAT_CR_PARAM_MASK	cr_param_mask,
diff --git a/dapl/common/dapl_cr_reject.c b/dapl/common/dapl_cr_reject.c
index 31f7ed0..d6842b3 100755
--- a/dapl/common/dapl_cr_reject.c
+++ b/dapl/common/dapl_cr_reject.c
@@ -59,7 +59,7 @@
  *	DAT_SUCCESS
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_cr_reject (
 	IN      DAT_CR_HANDLE cr_handle,/* cr_handle            */
 	IN	DAT_COUNT pdata_size,	/* private_data_size	*/
diff --git a/dapl/common/dapl_csp.c b/dapl/common/dapl_csp.c
index 6c0aaf2..678ca79 100755
--- a/dapl/common/dapl_csp.c
+++ b/dapl/common/dapl_csp.c
@@ -46,12 +46,12 @@
  *
  * uDAPL: User Direct Access Program Library Version 2.0, 6.4.4.2
  *
- * The Common Service Point is transport-independent analog of the Public
- * Service Point. It allows the Consumer to listen on socket-equivalent for
- * requests for connections arriving on a specified IP port instead of
+ * The Common Service Point is transport-independent analog of the Public
+ * Service Point. It allows the Consumer to listen on socket-equivalent for
+ * requests for connections arriving on a specified IP port instead of
  * transport-dependent Connection Qualifier. An IA Address follows the
- * platform conventions and provides among others the IP port to listen on.
- * An IP port of the Common Service Point advertisement is supported by
+ * platform conventions and provides among others the IP port to listen on.
+ * An IP port of the Common Service Point advertisement is supported by
  * existing Ethernet infrastructure or DAT Name Service.
  *
  * Input:
@@ -71,7 +71,7 @@
  * 	DAT_CONN_QUAL_IN_USE
  * 	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN 
+DAT_RETURN DAT_API
 dapl_csp_create (
 	IN      DAT_IA_HANDLE ia_handle,          /* ia_handle      */
 	IN      DAT_COMM *comm,                   /* communicator   */
@@ -82,7 +82,7 @@ dapl_csp_create (
 	return DAT_MODEL_NOT_SUPPORTED;
 }
 
-DAT_RETURN 
+DAT_RETURN DAT_API
 dapl_csp_query (
 	IN      DAT_CSP_HANDLE csp_handle,         /* csp_handle     */
 	IN      DAT_CSP_PARAM_MASK param_mask,     /* csp_param_mask */
@@ -91,7 +91,7 @@ dapl_csp_query (
 	return DAT_MODEL_NOT_SUPPORTED;
 }
 
-DAT_RETURN 
+DAT_RETURN DAT_API
 dapl_csp_free (
 	IN      DAT_CSP_HANDLE csp_handle )        /* csp_handle     */
 {
diff --git a/dapl/common/dapl_ep_connect.c b/dapl/common/dapl_ep_connect.c
index cf5b2c5..0998edc 100755
--- a/dapl/common/dapl_ep_connect.c
+++ b/dapl/common/dapl_ep_connect.c
@@ -66,7 +66,7 @@
  *	DAT_INVALID_PARAMETER
  *	DAT_MODLE_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_connect (
 	IN	DAT_EP_HANDLE		ep_handle,
 	IN	DAT_IA_ADDRESS_PTR	remote_ia_address,
@@ -403,7 +403,7 @@ bail:
  *	DAT_INVALID_STATE
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN 
+DAT_RETURN DAT_API
 dapl_ep_common_connect (
 	IN      DAT_EP_HANDLE ep,		/* ep_handle            */
 	IN      DAT_IA_ADDRESS_PTR remote_addr,	/* remote_ia_address    */
diff --git a/dapl/common/dapl_ep_create_with_srq.c b/dapl/common/dapl_ep_create_with_srq.c
index dd47b51..b62f53b 100644
--- a/dapl/common/dapl_ep_create_with_srq.c
+++ b/dapl/common/dapl_ep_create_with_srq.c
@@ -70,7 +70,7 @@
  *	DAT_INVALID_ATTRIBUTE
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_create_with_srq (
 	IN	DAT_IA_HANDLE	   	ia_handle,
 	IN	DAT_PZ_HANDLE	   	pz_handle,
diff --git a/dapl/common/dapl_ep_disconnect.c b/dapl/common/dapl_ep_disconnect.c
index 37fbb41..4101ac2 100644
--- a/dapl/common/dapl_ep_disconnect.c
+++ b/dapl/common/dapl_ep_disconnect.c
@@ -62,7 +62,7 @@
  *	DAT_INSUFFICIENT_RESOURCES
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_disconnect (
 	IN	DAT_EP_HANDLE	   	ep_handle,
 	IN	DAT_CLOSE_FLAGS		disconnect_flags)
diff --git a/dapl/common/dapl_ep_dup_connect.c b/dapl/common/dapl_ep_dup_connect.c
index 4423c4f..d853ddd 100644
--- a/dapl/common/dapl_ep_dup_connect.c
+++ b/dapl/common/dapl_ep_dup_connect.c
@@ -69,7 +69,7 @@
  *	DAT_INVALID_STATE
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_dup_connect (
 	IN	DAT_EP_HANDLE		ep_handle,
 	IN	DAT_EP_HANDLE		ep_dup_handle,
diff --git a/dapl/common/dapl_ep_free.c b/dapl/common/dapl_ep_free.c
index 3bf41ab..d0b25f1 100644
--- a/dapl/common/dapl_ep_free.c
+++ b/dapl/common/dapl_ep_free.c
@@ -61,7 +61,7 @@
  *	DAT_INVALID_PARAMETER
  *	DAT_INVALID_STATE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_free (
 	IN	DAT_EP_HANDLE	   ep_handle)
 {
diff --git a/dapl/common/dapl_ep_get_status.c b/dapl/common/dapl_ep_get_status.c
index 49b4fef..d55b512 100644
--- a/dapl/common/dapl_ep_get_status.c
+++ b/dapl/common/dapl_ep_get_status.c
@@ -59,7 +59,7 @@
  *	DAT_SUCCESS
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_get_status (
 	IN	DAT_EP_HANDLE	   ep_handle,
 	OUT	DAT_EP_STATE	   *ep_state,
diff --git a/dapl/common/dapl_ep_modify.c b/dapl/common/dapl_ep_modify.c
index f2628af..e006b1d 100644
--- a/dapl/common/dapl_ep_modify.c
+++ b/dapl/common/dapl_ep_modify.c
@@ -76,7 +76,7 @@ dapli_ep_modify_validate_parameters (
  *	DAT_INVALID_ATTRIBUTE
  *	DAT_INVALID_STATE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_modify (
 	IN	DAT_EP_HANDLE	   	ep_handle,
 	IN	DAT_EP_PARAM_MASK	ep_param_mask,
diff --git a/dapl/common/dapl_ep_post_rdma_read.c b/dapl/common/dapl_ep_post_rdma_read.c
index 7f2b6df..95a7e05 100644
--- a/dapl/common/dapl_ep_post_rdma_read.c
+++ b/dapl/common/dapl_ep_post_rdma_read.c
@@ -66,7 +66,7 @@
  * 	DAT_PROTECTION_VIOLATION
  * 	DAT_PRIVILEGES_VIOLATION
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_post_rdma_read (
 	IN	DAT_EP_HANDLE		ep_handle,
 	IN	DAT_COUNT		num_segments,
diff --git a/dapl/common/dapl_ep_post_rdma_read_to_rmr.c b/dapl/common/dapl_ep_post_rdma_read_to_rmr.c
index 929186f..2b1210d 100755
--- a/dapl/common/dapl_ep_post_rdma_read_to_rmr.c
+++ b/dapl/common/dapl_ep_post_rdma_read_to_rmr.c
@@ -46,7 +46,7 @@
  *
  * DAPL Requirements Version xxx, 6.6.24
  *
- * Requests the transfer of all the data specified by the remote_buffer 
+ * Requests the transfer of all the data specified by the remote_buffer 
  * over the connection of the ep_handle Endpoint into the local_iov 
  * specified by the RMR segments.
  *
@@ -72,7 +72,7 @@
  *	DAT_PRIVILEGES_VIOLATION
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_post_rdma_read_to_rmr (
 	IN      DAT_EP_HANDLE ep_handle,	/* ep_handle            */
 	IN      const DAT_RMR_TRIPLET *local,	/* local_iov            */
diff --git a/dapl/common/dapl_ep_post_rdma_write.c b/dapl/common/dapl_ep_post_rdma_write.c
index 95a107a..6b9ae94 100644
--- a/dapl/common/dapl_ep_post_rdma_write.c
+++ b/dapl/common/dapl_ep_post_rdma_write.c
@@ -66,7 +66,7 @@
  *	DAT_PROTECTION_VIOLATION
  *	DAT_PRIVILEGES_VIOLATION
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_post_rdma_write (
 	IN	DAT_EP_HANDLE		ep_handle,
 	IN	DAT_COUNT		num_segments,
diff --git a/dapl/common/dapl_ep_post_recv.c b/dapl/common/dapl_ep_post_recv.c
index c9be9ec..2e52411 100644
--- a/dapl/common/dapl_ep_post_recv.c
+++ b/dapl/common/dapl_ep_post_recv.c
@@ -66,7 +66,7 @@
  * 	DAT_PROTECTION_VIOLATION
  * 	DAT_PROVILEGES_VIOLATION
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_post_recv (
 	IN	DAT_EP_HANDLE	   	ep_handle,
 	IN	DAT_COUNT	   	num_segments,
diff --git a/dapl/common/dapl_ep_post_send.c b/dapl/common/dapl_ep_post_send.c
index 8fcdfa9..c13a095 100644
--- a/dapl/common/dapl_ep_post_send.c
+++ b/dapl/common/dapl_ep_post_send.c
@@ -66,7 +66,7 @@
  *	DAT_PROTECTION_VIOLATION
  *	DAT_PRIVILEGES_VIOLATION
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_post_send (
 	IN	DAT_EP_HANDLE	   	ep_handle,
 	IN	DAT_COUNT	   	num_segments,
diff --git a/dapl/common/dapl_ep_post_send_invalidate.c b/dapl/common/dapl_ep_post_send_invalidate.c
index 5ce0808..68a3a51 100755
--- a/dapl/common/dapl_ep_post_send_invalidate.c
+++ b/dapl/common/dapl_ep_post_send_invalidate.c
@@ -46,7 +46,7 @@
  *
  * DAPL Requirements Version xxx, 6.6.21
  *
- * Requests a transfer of all the data from the local_iov over the connection
+ * Requests a transfer of all the data from the local_iov over the connection
  * of the ep_handle Endpoint to the remote side and invalidates the Remote 
  * Memory Region context.
  *
@@ -71,7 +71,7 @@
  *	DAT_PRIVILEGES_VIOLATION
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN 
+DAT_RETURN DAT_API
 dapl_ep_post_send_with_invalidate (
 	IN      DAT_EP_HANDLE ep_handle,      /* ep_handle            */
 	IN      DAT_COUNT num_segments,       /* num_segments         */
diff --git a/dapl/common/dapl_ep_query.c b/dapl/common/dapl_ep_query.c
index 7162c70..bfdef3c 100644
--- a/dapl/common/dapl_ep_query.c
+++ b/dapl/common/dapl_ep_query.c
@@ -58,7 +58,7 @@
  *	DAT_SUCCESS
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_query (
 	IN	DAT_EP_HANDLE		ep_handle,
 	IN	DAT_EP_PARAM_MASK	ep_param_mask,
diff --git a/dapl/common/dapl_ep_recv_query.c b/dapl/common/dapl_ep_recv_query.c
index 9b0d9be..cec7917 100644
--- a/dapl/common/dapl_ep_recv_query.c
+++ b/dapl/common/dapl_ep_recv_query.c
@@ -58,7 +58,7 @@
  *	DAT_INVALID_HANDLE
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_recv_query (
 	IN	DAT_EP_HANDLE	   ep_handle,
 	OUT	DAT_COUNT	   *nbufs_allocate,
diff --git a/dapl/common/dapl_ep_reset.c b/dapl/common/dapl_ep_reset.c
index e98b115..1c307a6 100644
--- a/dapl/common/dapl_ep_reset.c
+++ b/dapl/common/dapl_ep_reset.c
@@ -61,7 +61,7 @@
  *	DAT_INVALID_PARAMETER
  *	DAT_INVALID_STATE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_reset (
 	IN	DAT_EP_HANDLE	   ep_handle)
 {
diff --git a/dapl/common/dapl_ep_set_watermark.c b/dapl/common/dapl_ep_set_watermark.c
index 1ea93d5..c4b75f8 100644
--- a/dapl/common/dapl_ep_set_watermark.c
+++ b/dapl/common/dapl_ep_set_watermark.c
@@ -59,7 +59,7 @@
  *	DAT_INVALID_HANDLE
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ep_set_watermark (
 	IN	DAT_EP_HANDLE	   ep_handle,
 	IN	DAT_COUNT	   soft_high_watermark,
diff --git a/dapl/common/dapl_evd_dequeue.c b/dapl/common/dapl_evd_dequeue.c
index 1b75267..f6faeeb 100644
--- a/dapl/common/dapl_evd_dequeue.c
+++ b/dapl/common/dapl_evd_dequeue.c
@@ -62,7 +62,7 @@
  * 	DAT_QUEUE_EMPTY
  */
 
-DAT_RETURN dapl_evd_dequeue (
+DAT_RETURN DAT_API dapl_evd_dequeue (
     IN    DAT_EVD_HANDLE	evd_handle,
     OUT   DAT_EVENT		*event)
 
diff --git a/dapl/common/dapl_evd_free.c b/dapl/common/dapl_evd_free.c
index d737d49..407dbc8 100755
--- a/dapl/common/dapl_evd_free.c
+++ b/dapl/common/dapl_evd_free.c
@@ -59,7 +59,7 @@
  *     DAT_INVALID_HANDLE
  *     DAT_INVALID_STATE
  */
-DAT_RETURN dapl_evd_free (
+DAT_RETURN DAT_API dapl_evd_free (
     IN    DAT_EVD_HANDLE    evd_handle)
 
 {
diff --git a/dapl/common/dapl_evd_post_se.c b/dapl/common/dapl_evd_post_se.c
index d2449c3..d196a49 100644
--- a/dapl/common/dapl_evd_post_se.c
+++ b/dapl/common/dapl_evd_post_se.c
@@ -60,7 +60,7 @@
  */
 
 
-DAT_RETURN dapl_evd_post_se (
+DAT_RETURN DAT_API dapl_evd_post_se (
 	DAT_EVD_HANDLE		evd_handle,	
 	const DAT_EVENT		*event)
 
diff --git a/dapl/common/dapl_evd_resize.c b/dapl/common/dapl_evd_resize.c
index 1c17fa6..b906244 100644
--- a/dapl/common/dapl_evd_resize.c
+++ b/dapl/common/dapl_evd_resize.c
@@ -63,7 +63,7 @@
  * 	DAT_INVALID_STATE
  */
 
-DAT_RETURN dapl_evd_resize (
+DAT_RETURN DAT_API dapl_evd_resize (
 	IN	DAT_EVD_HANDLE	   evd_handle,
 	IN	DAT_COUNT	   evd_qlen )
 {
diff --git a/dapl/common/dapl_get_consumer_context.c b/dapl/common/dapl_get_consumer_context.c
index e937c27..142b57b 100644
--- a/dapl/common/dapl_get_consumer_context.c
+++ b/dapl/common/dapl_get_consumer_context.c
@@ -55,7 +55,7 @@
  * 	DAT_SUCCESS
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_get_consumer_context (
 	IN	DAT_HANDLE  dat_handle,
 	OUT	DAT_CONTEXT *context )
diff --git a/dapl/common/dapl_get_handle_type.c b/dapl/common/dapl_get_handle_type.c
index 063f081..156d758 100644
--- a/dapl/common/dapl_get_handle_type.c
+++ b/dapl/common/dapl_get_handle_type.c
@@ -56,7 +56,7 @@
  * 	DAT_INVALID_PARAMETER
  */
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_get_handle_type (
 	IN	DAT_HANDLE	   dat_handle,
 	OUT	DAT_HANDLE_TYPE	   *handle_type )
diff --git a/dapl/common/dapl_ia_close.c b/dapl/common/dapl_ia_close.c
index 3e1b44c..533fae8 100644
--- a/dapl/common/dapl_ia_close.c
+++ b/dapl/common/dapl_ia_close.c
@@ -57,7 +57,7 @@
  * 	DAT_INSUFFICIENT_RESOURCES
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ia_close (
 	IN	DAT_IA_HANDLE	ia_handle,
 	IN	DAT_CLOSE_FLAGS ia_flags)
diff --git a/dapl/common/dapl_ia_ha.c b/dapl/common/dapl_ia_ha.c
index b2aa140..05bb6d6 100755
--- a/dapl/common/dapl_ia_ha.c
+++ b/dapl/common/dapl_ia_ha.c
@@ -47,7 +47,7 @@
  *
  * DAPL Requirements Version xxx, 5.9
  *
- * Queries for provider HA support
+ * Queries for provider HA support
  *
  * Input:
  *	ia_handle
@@ -61,7 +61,7 @@
  *	DAT_MODEL_NOT_SUPPORTED
  */
 
-DAT_RETURN 
+DAT_RETURN DAT_API
 dapl_ia_ha (
 	IN	 DAT_IA_HANDLE ia_handle, /* ia_handle */
 	IN const DAT_NAME_PTR  provider,  /* provider  */
diff --git a/dapl/common/dapl_ia_open.c b/dapl/common/dapl_ia_open.c
index d3d0ed0..7ca5dba 100644
--- a/dapl/common/dapl_ia_open.c
+++ b/dapl/common/dapl_ia_open.c
@@ -86,7 +86,7 @@ void dapli_hca_cleanup (
  * 	DAT_INVALID_HANDLE
  * 	DAT_PROVIDER_NOT_FOUND	(returned by dat registry if necessary)
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ia_open (
 	IN	const DAT_NAME_PTR name,
 	IN	DAT_COUNT	   async_evd_qlen,
diff --git a/dapl/common/dapl_ia_query.c b/dapl/common/dapl_ia_query.c
index 484844c..7596daf 100755
--- a/dapl/common/dapl_ia_query.c
+++ b/dapl/common/dapl_ia_query.c
@@ -61,7 +61,7 @@
  * 	DAT_SUCCESS
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_ia_query (
 	IN	DAT_IA_HANDLE	   		ia_handle,
 	OUT	DAT_EVD_HANDLE     		*async_evd_handle,
diff --git a/dapl/common/dapl_init.h b/dapl/common/dapl_init.h
index d97f0f2..0c98541 100644
--- a/dapl/common/dapl_init.h
+++ b/dapl/common/dapl_init.h
@@ -39,12 +39,12 @@
 #ifndef _DAPL_INIT_H_
 #define _DAPL_INIT_H_
 
-extern void 
+extern void DAT_API
 DAT_PROVIDER_INIT_FUNC_NAME (
     IN const DAT_PROVIDER_INFO *,
     IN const char * );                      /* instance data */
 
-extern void 
+extern void DAT_API
 DAT_PROVIDER_FINI_FUNC_NAME (
     IN const DAT_PROVIDER_INFO * );
 
diff --git a/dapl/common/dapl_lmr_free.c b/dapl/common/dapl_lmr_free.c
index 369ad85..f4106b4 100644
--- a/dapl/common/dapl_lmr_free.c
+++ b/dapl/common/dapl_lmr_free.c
@@ -57,7 +57,7 @@
  * 	DAT_INVALID_STATE 
  */
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_lmr_free (
     IN	DAT_LMR_HANDLE  lmr_handle )
 {
diff --git a/dapl/common/dapl_lmr_query.c b/dapl/common/dapl_lmr_query.c
index 1b8a0bd..7fdc75c 100644
--- a/dapl/common/dapl_lmr_query.c
+++ b/dapl/common/dapl_lmr_query.c
@@ -54,7 +54,7 @@
  *      DAT_INVALID_HANDLE
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_lmr_query (
     IN	DAT_LMR_HANDLE		lmr_handle,
     IN	DAT_LMR_PARAM_MASK	lmr_param_mask,
diff --git a/dapl/common/dapl_lmr_sync_rdma_read.c b/dapl/common/dapl_lmr_sync_rdma_read.c
index 29489e1..3815afc 100644
--- a/dapl/common/dapl_lmr_sync_rdma_read.c
+++ b/dapl/common/dapl_lmr_sync_rdma_read.c
@@ -56,7 +56,7 @@
  * 	DAT_INVALID_HANDLE
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_lmr_sync_rdma_read (
     IN  DAT_IA_HANDLE		ia_handle,
     IN  const DAT_LMR_TRIPLET	*local_segments,
diff --git a/dapl/common/dapl_lmr_sync_rdma_write.c b/dapl/common/dapl_lmr_sync_rdma_write.c
index 7c2676b..b13b02a 100644
--- a/dapl/common/dapl_lmr_sync_rdma_write.c
+++ b/dapl/common/dapl_lmr_sync_rdma_write.c
@@ -56,7 +56,7 @@
  * 	DAT_INVALID_HANDLE
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_lmr_sync_rdma_write (
     IN  DAT_IA_HANDLE		ia_handle,
     IN  const DAT_LMR_TRIPLET	*local_segments,
diff --git a/dapl/common/dapl_provider.c b/dapl/common/dapl_provider.c
index 2fbc415..bb78b74 100755
--- a/dapl/common/dapl_provider.c
+++ b/dapl/common/dapl_provider.c
@@ -368,6 +368,7 @@ dapl_provider_list_insert (
     cur_node->name[len] = '\0';
     cur_node->data = g_dapl_provider_template;
     cur_node->data.device_name = cur_node->name;
+
     cur_node->next = next_node;
     cur_node->prev = prev_node;
 
diff --git a/dapl/common/dapl_psp_create.c b/dapl/common/dapl_psp_create.c
index 93f4108..9d2e945 100644
--- a/dapl/common/dapl_psp_create.c
+++ b/dapl/common/dapl_psp_create.c
@@ -67,7 +67,7 @@
  * 	DAT_CONN_QUAL_IN_USE
  * 	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_psp_create (
 	IN	DAT_IA_HANDLE	   ia_handle,
 	IN	DAT_CONN_QUAL	   conn_qual,
diff --git a/dapl/common/dapl_psp_create_any.c b/dapl/common/dapl_psp_create_any.c
index a2057f5..a2768fb 100644
--- a/dapl/common/dapl_psp_create_any.c
+++ b/dapl/common/dapl_psp_create_any.c
@@ -70,7 +70,7 @@
  * 	DAT_CONN_QUAL_IN_USE
  * 	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_psp_create_any (
 	IN	DAT_IA_HANDLE	   ia_handle,
 	OUT	DAT_CONN_QUAL	   *conn_qual,
diff --git a/dapl/common/dapl_psp_free.c b/dapl/common/dapl_psp_free.c
index 3b1042b..94256a8 100644
--- a/dapl/common/dapl_psp_free.c
+++ b/dapl/common/dapl_psp_free.c
@@ -58,7 +58,7 @@
  * 	DAT_SUCCESS
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_psp_free (
 	IN	DAT_PSP_HANDLE	   psp_handle )
 {
diff --git a/dapl/common/dapl_psp_query.c b/dapl/common/dapl_psp_query.c
index 0f4652d..4a55c8b 100644
--- a/dapl/common/dapl_psp_query.c
+++ b/dapl/common/dapl_psp_query.c
@@ -56,7 +56,7 @@
  *	DAT_SUCCESS
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_psp_query (
 	IN	DAT_PSP_HANDLE	   	psp_handle,
 	IN	DAT_PSP_PARAM_MASK  	psp_args_mask,
diff --git a/dapl/common/dapl_pz_create.c b/dapl/common/dapl_pz_create.c
index 918f99a..a7105aa 100644
--- a/dapl/common/dapl_pz_create.c
+++ b/dapl/common/dapl_pz_create.c
@@ -55,7 +55,7 @@
  *      DAT_INVALID_PARAMETER
  *      DAT_INVLAID_HANDLE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_pz_create (
     IN	DAT_IA_HANDLE   ia_handle,
     OUT	DAT_PZ_HANDLE   *pz_handle)
diff --git a/dapl/common/dapl_pz_free.c b/dapl/common/dapl_pz_free.c
index 2d50fd3..b2510f7 100644
--- a/dapl/common/dapl_pz_free.c
+++ b/dapl/common/dapl_pz_free.c
@@ -54,7 +54,7 @@
  * 	DAT_INVALID_STATE
  * 	DAT_INVALID_HANDLE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_pz_free (
     IN	DAT_PZ_HANDLE   pz_handle)
 {
diff --git a/dapl/common/dapl_pz_query.c b/dapl/common/dapl_pz_query.c
index f0b8cbb..c00555b 100644
--- a/dapl/common/dapl_pz_query.c
+++ b/dapl/common/dapl_pz_query.c
@@ -53,7 +53,7 @@
  *      DAT_INVALID_HANDLE
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_pz_query (
     IN	DAT_PZ_HANDLE		pz_handle,
     IN	DAT_PZ_PARAM_MASK	pz_param_mask,
diff --git a/dapl/common/dapl_rmr_bind.c b/dapl/common/dapl_rmr_bind.c
index ff8fb5d..01d06c3 100755
--- a/dapl/common/dapl_rmr_bind.c
+++ b/dapl/common/dapl_rmr_bind.c
@@ -301,7 +301,7 @@ bail1:
  * Input:
  * Output:
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_rmr_bind (
 	IN	DAT_RMR_HANDLE		rmr_handle,
 	IN	DAT_LMR_HANDLE		lmr_handle,
diff --git a/dapl/common/dapl_rmr_create.c b/dapl/common/dapl_rmr_create.c
index 2c6e179..9a7dbd9 100755
--- a/dapl/common/dapl_rmr_create.c
+++ b/dapl/common/dapl_rmr_create.c
@@ -53,7 +53,7 @@
  * 	DAT_INSUFFICIENT_RESOURCES
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_rmr_create (
 	IN	DAT_PZ_HANDLE	   pz_handle,
 	OUT	DAT_RMR_HANDLE	   *rmr_handle)
@@ -126,7 +126,7 @@ dapl_rmr_create (
  *	DAT_INVALID_HANDLE
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_rmr_create_for_ep (
 	IN      DAT_PZ_HANDLE pz_handle,	/* pz_handle            */
 	OUT     DAT_RMR_HANDLE *rmr_handle)	/* rmr_handle           */
diff --git a/dapl/common/dapl_rmr_free.c b/dapl/common/dapl_rmr_free.c
index 1277178..a4edb13 100644
--- a/dapl/common/dapl_rmr_free.c
+++ b/dapl/common/dapl_rmr_free.c
@@ -53,7 +53,7 @@
  * 	DAT_SUCCESS
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_rmr_free (
     IN	DAT_RMR_HANDLE  rmr_handle )
 {
diff --git a/dapl/common/dapl_rmr_query.c b/dapl/common/dapl_rmr_query.c
index 9652a24..6e6be07 100644
--- a/dapl/common/dapl_rmr_query.c
+++ b/dapl/common/dapl_rmr_query.c
@@ -57,7 +57,7 @@
  *      DAT_INVALID_HANDLE
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_rmr_query (
 	IN	DAT_RMR_HANDLE		rmr_handle,
 	IN	DAT_RMR_PARAM_MASK   	rmr_param_mask,
diff --git a/dapl/common/dapl_rsp_create.c b/dapl/common/dapl_rsp_create.c
index 12107f6..a9c9b5f 100644
--- a/dapl/common/dapl_rsp_create.c
+++ b/dapl/common/dapl_rsp_create.c
@@ -68,7 +68,7 @@
  *	DAT_INVALID_STATE
  *	DAT_CONN_QUAL_IN_USE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_rsp_create (
 	IN	DAT_IA_HANDLE	   ia_handle,
 	IN	DAT_CONN_QUAL	   conn_qual,
diff --git a/dapl/common/dapl_rsp_free.c b/dapl/common/dapl_rsp_free.c
index 5ce0682..b7bc40c 100644
--- a/dapl/common/dapl_rsp_free.c
+++ b/dapl/common/dapl_rsp_free.c
@@ -58,7 +58,7 @@
  *	DAT_SUCCESS
  *	DAT_INVALID_HANDLE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_rsp_free (
 	IN	DAT_RSP_HANDLE	   rsp_handle )
 {
diff --git a/dapl/common/dapl_rsp_query.c b/dapl/common/dapl_rsp_query.c
index 737bbe5..3514299 100644
--- a/dapl/common/dapl_rsp_query.c
+++ b/dapl/common/dapl_rsp_query.c
@@ -56,7 +56,7 @@
  *	DAT_SUCCESS
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_rsp_query (
 	IN	DAT_RSP_HANDLE	   rsp_handle,
 	IN	DAT_RSP_PARAM_MASK  rsp_mask,
diff --git a/dapl/common/dapl_set_consumer_context.c b/dapl/common/dapl_set_consumer_context.c
index 883de1f..e43be33 100644
--- a/dapl/common/dapl_set_consumer_context.c
+++ b/dapl/common/dapl_set_consumer_context.c
@@ -56,7 +56,7 @@
  * 	DAT_SUCCESS
  * 	DAT_INVALID_HANDLE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_set_consumer_context (
 	IN	DAT_HANDLE  dat_handle,
 	IN	DAT_CONTEXT context )
diff --git a/dapl/common/dapl_srq_create.c b/dapl/common/dapl_srq_create.c
index 6ae260e..66e9d0e 100644
--- a/dapl/common/dapl_srq_create.c
+++ b/dapl/common/dapl_srq_create.c
@@ -66,7 +66,7 @@
  *	?DAT_INVALID_ATTRIBUTE??
  *	DAT_MODEL_NOT_SUPPORTED
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_srq_create (
 	IN	DAT_IA_HANDLE	   	ia_handle,
 	IN	DAT_PZ_HANDLE	   	pz_handle,
diff --git a/dapl/common/dapl_srq_free.c b/dapl/common/dapl_srq_free.c
index 8eaff02..55d74fc 100644
--- a/dapl/common/dapl_srq_free.c
+++ b/dapl/common/dapl_srq_free.c
@@ -59,7 +59,7 @@
  *	DAT_INVALID_PARAMETER
  *	DAT_INVALID_STATE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_srq_free (
 	IN	DAT_SRQ_HANDLE	   srq_handle)
 {
diff --git a/dapl/common/dapl_srq_post_recv.c b/dapl/common/dapl_srq_post_recv.c
index 8751620..e37f7ed 100644
--- a/dapl/common/dapl_srq_post_recv.c
+++ b/dapl/common/dapl_srq_post_recv.c
@@ -67,7 +67,7 @@
  * 	DAT_PROTECTION_VIOLATION
  * 	DAT_PROVILEGES_VIOLATION
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_srq_post_recv (
 	IN	DAT_SRQ_HANDLE	   	srq_handle,
 	IN	DAT_COUNT	   	num_segments,
diff --git a/dapl/common/dapl_srq_query.c b/dapl/common/dapl_srq_query.c
index 8265f92..779a196 100644
--- a/dapl/common/dapl_srq_query.c
+++ b/dapl/common/dapl_srq_query.c
@@ -58,7 +58,7 @@
  *      DAT_INVALID_HANDLE
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_srq_query (
 	IN	DAT_SRQ_HANDLE		srq_handle,
 	IN	DAT_SRQ_PARAM_MASK	srq_param_mask,
diff --git a/dapl/common/dapl_srq_resize.c b/dapl/common/dapl_srq_resize.c
index 8f7adf5..54697ce 100644
--- a/dapl/common/dapl_srq_resize.c
+++ b/dapl/common/dapl_srq_resize.c
@@ -63,7 +63,7 @@
  * 	DAT_INVALID_STATE
  */
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_srq_resize (
 	IN	DAT_SRQ_HANDLE	   srq_handle,
 	IN	DAT_COUNT	   srq_max_recv_dto )
diff --git a/dapl/common/dapl_srq_set_lw.c b/dapl/common/dapl_srq_set_lw.c
index d8a3acc..bd52e21 100644
--- a/dapl/common/dapl_srq_set_lw.c
+++ b/dapl/common/dapl_srq_set_lw.c
@@ -63,7 +63,7 @@
  * 	DAT_MODEL_NOT_SUPPORTED
  */
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_srq_set_lw (
 	IN	DAT_SRQ_HANDLE	   srq_handle,
 	IN	DAT_COUNT	   low_watermark )
diff --git a/dapl/include/dapl.h b/dapl/include/dapl.h
index 81cbcf9..ade101b 100755
--- a/dapl/include/dapl.h
+++ b/dapl/include/dapl.h
@@ -688,18 +688,18 @@ void dapls_io_trc_dump (
  * DAT Mandated functions
  */
 
-extern DAT_RETURN dapl_ia_open (
+extern DAT_RETURN DAT_API dapl_ia_open (
 	IN	const DAT_NAME_PTR,	/* name */
 	IN	DAT_COUNT,		/* asynch_evd_qlen */
 	INOUT	DAT_EVD_HANDLE *,	/* asynch_evd_handle */
 	OUT	DAT_IA_HANDLE *);	/* ia_handle */
 
-extern DAT_RETURN dapl_ia_close (
+extern DAT_RETURN DAT_API dapl_ia_close (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	IN	DAT_CLOSE_FLAGS );	/* ia_flags */
 
 
-extern DAT_RETURN dapl_ia_query (
+extern DAT_RETURN DAT_API dapl_ia_query (
 	IN	DAT_IA_HANDLE,		/* ia handle */
 	OUT	DAT_EVD_HANDLE *,	/* async_evd_handle */
 	IN	DAT_IA_ATTR_MASK,	/* ia_params_mask */
@@ -710,52 +710,52 @@ extern DAT_RETURN dapl_ia_query (
 
 /* helper functions */
 
-extern DAT_RETURN dapl_set_consumer_context (
+extern DAT_RETURN DAT_API dapl_set_consumer_context (
 	IN	DAT_HANDLE,			/* dat handle */
 	IN	DAT_CONTEXT);			/* context */
 
-extern DAT_RETURN dapl_get_consumer_context (
+extern DAT_RETURN DAT_API dapl_get_consumer_context (
 	IN	DAT_HANDLE,			/* dat handle */
 	OUT	DAT_CONTEXT * );		/* context */
 
-extern DAT_RETURN dapl_get_handle_type (
+extern DAT_RETURN DAT_API dapl_get_handle_type (
 	IN	DAT_HANDLE,
 	OUT	DAT_HANDLE_TYPE * );
 
 /* CNO functions */
 
 #if !defined(__KERNEL__)
-extern DAT_RETURN dapl_cno_create (
+extern DAT_RETURN DAT_API dapl_cno_create (
 	IN	DAT_IA_HANDLE,			/* ia_handle */
 	IN	DAT_OS_WAIT_PROXY_AGENT,	/* agent */
 	OUT	DAT_CNO_HANDLE *);		/* cno_handle */
 
-extern DAT_RETURN dapl_cno_modify_agent (
+extern DAT_RETURN DAT_API dapl_cno_modify_agent (
 	IN	DAT_CNO_HANDLE,			/* cno_handle */
 	IN	DAT_OS_WAIT_PROXY_AGENT);	/* agent */
 
-extern DAT_RETURN dapl_cno_query (
+extern DAT_RETURN DAT_API dapl_cno_query (
 	IN	DAT_CNO_HANDLE,		/* cno_handle */
 	IN	DAT_CNO_PARAM_MASK,	/* cno_param_mask */
 	OUT	DAT_CNO_PARAM * );	/* cno_param */
 
-extern DAT_RETURN dapl_cno_free (
+extern DAT_RETURN DAT_API dapl_cno_free (
 	IN	DAT_CNO_HANDLE);	/* cno_handle */
 
-extern DAT_RETURN dapl_cno_wait (
+extern DAT_RETURN DAT_API dapl_cno_wait (
 	IN	DAT_CNO_HANDLE,		/* cno_handle */
 	IN	DAT_TIMEOUT,		/* timeout */
 	OUT	DAT_EVD_HANDLE *);	/* evd_handle */
 
-extern DAT_RETURN dapl_cno_free (
+extern DAT_RETURN DAT_API dapl_cno_free (
 	IN	DAT_CNO_HANDLE);	/* cno_handle */
 
-extern DAT_RETURN dapl_cno_fd_create (
+extern DAT_RETURN DAT_API dapl_cno_fd_create (
 	IN 	DAT_IA_HANDLE,		/* ia_handle            */
 	OUT	DAT_FD *,		/* file_descriptor	*/
 	OUT 	DAT_CNO_HANDLE *);	/* cno_handle           */
 
-extern DAT_RETURN dapl_cno_trigger(
+extern DAT_RETURN DAT_API dapl_cno_trigger(
 	IN	DAT_CNO_HANDLE,		/* cno_handle */
 	OUT	DAT_EVD_HANDLE *);	/* evd_handle */
 
@@ -763,30 +763,30 @@ extern DAT_RETURN dapl_cno_trigger(
 
 /* CR Functions */
 
-extern DAT_RETURN dapl_cr_query (
+extern DAT_RETURN DAT_API dapl_cr_query (
 	IN	DAT_CR_HANDLE,		/* cr_handle */
 	IN	DAT_CR_PARAM_MASK,	/* cr_args_mask */
 	OUT	DAT_CR_PARAM * );	/* cwr_args */
 
-extern DAT_RETURN dapl_cr_accept (
+extern DAT_RETURN DAT_API dapl_cr_accept (
 	IN	DAT_CR_HANDLE,		/* cr_handle */
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	IN	DAT_COUNT,		/* private_data_size */
 	IN	const DAT_PVOID );	/* private_data */
 
-extern DAT_RETURN dapl_cr_reject (
+extern DAT_RETURN DAT_API dapl_cr_reject (
 	IN      DAT_CR_HANDLE, 		/* cr_handle            */
 	IN	DAT_COUNT,		/* private_data_size	*/
 	IN	const DAT_PVOID );      /* private_data         */
 
-extern DAT_RETURN dapl_cr_handoff (
+extern DAT_RETURN DAT_API dapl_cr_handoff (
 	IN DAT_CR_HANDLE,		/* cr_handle */
 	IN DAT_CONN_QUAL);		/* handoff */
 
 /* EVD Functions */
 
 #if defined(__KERNEL__)
-extern DAT_RETURN dapl_ia_memtype_hint (
+extern DAT_RETURN DAT_API dapl_ia_memtype_hint (
 	IN    DAT_IA_HANDLE,		/* ia_handle */
 	IN    DAT_MEM_TYPE,		/* mem_type */
 	IN    DAT_VLEN,			/* length */
@@ -794,7 +794,7 @@ extern DAT_RETURN dapl_ia_memtype_hint (
 	OUT   DAT_VLEN *,		/* suggested_length */
 	OUT   DAT_VADDR	*);		/* suggested_alignment */
 
-extern DAT_RETURN dapl_evd_kcreate (
+extern DAT_RETURN DAT_API dapl_evd_kcreate (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	IN	DAT_COUNT,		/* evd_min_qlen */
 	IN	DAT_UPCALL_POLICY,	/* upcall_policy */
@@ -802,46 +802,46 @@ extern DAT_RETURN dapl_evd_kcreate (
 	IN	DAT_EVD_FLAGS,		/* evd_flags */
 	OUT	DAT_EVD_HANDLE * );	/* evd_handle */
 
-extern DAT_RETURN dapl_evd_kquery (
+extern DAT_RETURN DAT_API dapl_evd_kquery (
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	IN	DAT_EVD_PARAM_MASK,	/* evd_args_mask */
 	OUT	DAT_EVD_PARAM * );	/* evd_args */
 
 #else
-extern DAT_RETURN dapl_evd_create (
+extern DAT_RETURN DAT_API dapl_evd_create (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	IN	DAT_COUNT,		/* evd_min_qlen */
 	IN	DAT_CNO_HANDLE,		/* cno_handle */
 	IN	DAT_EVD_FLAGS,		/* evd_flags */
 	OUT	DAT_EVD_HANDLE * );	/* evd_handle */
 
-extern DAT_RETURN dapl_evd_query (
+extern DAT_RETURN DAT_API dapl_evd_query (
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	IN	DAT_EVD_PARAM_MASK,	/* evd_args_mask */
 	OUT	DAT_EVD_PARAM * );	/* evd_args */
 #endif	/* defined(__KERNEL__) */
 
 #if defined(__KERNEL__)
-extern DAT_RETURN dapl_evd_modify_upcall (
+extern DAT_RETURN DAT_API dapl_evd_modify_upcall (
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	IN	DAT_UPCALL_POLICY,	/* upcall_policy */
 	IN	const DAT_UPCALL_OBJECT * ); /* upcall */
 
 #else
 
-extern DAT_RETURN dapl_evd_modify_cno (
+extern DAT_RETURN DAT_API dapl_evd_modify_cno (
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	IN	DAT_CNO_HANDLE);	/* cno_handle */
 #endif
 
-extern DAT_RETURN dapl_evd_enable (
+extern DAT_RETURN DAT_API dapl_evd_enable (
 	IN	DAT_EVD_HANDLE);	/* evd_handle */
 
-extern DAT_RETURN dapl_evd_disable (
+extern DAT_RETURN DAT_API dapl_evd_disable (
 	IN	DAT_EVD_HANDLE);	/* evd_handle */
 
 #if !defined(__KERNEL__)
-extern DAT_RETURN dapl_evd_wait (
+extern DAT_RETURN DAT_API dapl_evd_wait (
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	IN	DAT_TIMEOUT,		/* timeout */
 	IN	DAT_COUNT,		/* threshold */
@@ -849,32 +849,32 @@ extern DAT_RETURN dapl_evd_wait (
 	OUT	DAT_COUNT *);		/* nmore */
 #endif	/* !defined(__KERNEL__) */
 
-extern DAT_RETURN dapl_evd_resize (
+extern DAT_RETURN DAT_API dapl_evd_resize (
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	IN	DAT_COUNT );		/* evd_qlen */
 
-extern DAT_RETURN dapl_evd_post_se (
+extern DAT_RETURN DAT_API dapl_evd_post_se (
 	DAT_EVD_HANDLE,			/* evd_handle */
 	const DAT_EVENT * );		/* event */
 
-extern DAT_RETURN dapl_evd_dequeue (
+extern DAT_RETURN DAT_API dapl_evd_dequeue (
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	OUT	DAT_EVENT * );		/* event */
 
-extern DAT_RETURN dapl_evd_free (
+extern DAT_RETURN DAT_API dapl_evd_free (
 	IN	DAT_EVD_HANDLE );
 
-extern DAT_RETURN
+extern DAT_RETURN DAT_API
 dapl_evd_set_unwaitable (
 	IN	DAT_EVD_HANDLE	evd_handle );
 
-extern DAT_RETURN
+extern DAT_RETURN DAT_API
 dapl_evd_clear_unwaitable (
 	IN	DAT_EVD_HANDLE	evd_handle );
 
 /* EP functions */
 
-extern DAT_RETURN dapl_ep_create (
+extern DAT_RETURN DAT_API dapl_ep_create (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	IN	DAT_PZ_HANDLE,		/* pz_handle */
 	IN	DAT_EVD_HANDLE,		/* in_dto_completion_evd_handle */
@@ -883,17 +883,17 @@ extern DAT_RETURN dapl_ep_create (
 	IN	const DAT_EP_ATTR *,	/* ep_parameters */
 	OUT	DAT_EP_HANDLE * );	/* ep_handle */
 
-extern DAT_RETURN dapl_ep_query (
+extern DAT_RETURN DAT_API dapl_ep_query (
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	IN	DAT_EP_PARAM_MASK,	/* ep_args_mask */
 	OUT	DAT_EP_PARAM * );	/* ep_args */
 
-extern DAT_RETURN dapl_ep_modify (
+extern DAT_RETURN DAT_API dapl_ep_modify (
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	IN	DAT_EP_PARAM_MASK,	/* ep_args_mask */
 	IN	const DAT_EP_PARAM * ); /* ep_args */
 
-extern DAT_RETURN dapl_ep_connect (
+extern DAT_RETURN DAT_API dapl_ep_connect (
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	IN	DAT_IA_ADDRESS_PTR,	/* remote_ia_address */
 	IN	DAT_CONN_QUAL,		/* remote_conn_qual */
@@ -903,14 +903,14 @@ extern DAT_RETURN dapl_ep_connect (
 	IN	DAT_QOS,		/* quality_of_service */
 	IN	DAT_CONNECT_FLAGS );	/* connect_flags */
 
-extern DAT_RETURN dapl_ep_common_connect (
+extern DAT_RETURN DAT_API dapl_ep_common_connect (
 	IN      DAT_EP_HANDLE ep,		/* ep_handle            */
 	IN      DAT_IA_ADDRESS_PTR remote_addr,	/* remote_ia_address    */
 	IN      DAT_TIMEOUT timeout,		/* timeout              */
 	IN      DAT_COUNT pdata_size,		/* private_data_size    */
 	IN      const DAT_PVOID pdata	);	/* private_data         */
 
-extern DAT_RETURN dapl_ep_dup_connect (
+extern DAT_RETURN DAT_API dapl_ep_dup_connect (
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	IN	DAT_EP_HANDLE,		/* ep_dup_handle */
 	IN	DAT_TIMEOUT,		/* timeout*/
@@ -918,25 +918,25 @@ extern DAT_RETURN dapl_ep_dup_connect (
 	IN	const DAT_PVOID,	/* private_data */
 	IN	DAT_QOS);		/* quality_of_service */
 
-extern DAT_RETURN dapl_ep_disconnect (
+extern DAT_RETURN DAT_API dapl_ep_disconnect (
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	IN	DAT_CLOSE_FLAGS );	/* close_flags */
 
-extern DAT_RETURN dapl_ep_post_send (
+extern DAT_RETURN DAT_API dapl_ep_post_send (
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	IN	DAT_COUNT,		/* num_segments */
 	IN	DAT_LMR_TRIPLET *,	/* local_iov */
 	IN	DAT_DTO_COOKIE,		/* user_cookie */
 	IN	DAT_COMPLETION_FLAGS ); /* completion_flags */
 
-extern DAT_RETURN dapl_ep_post_recv (
+extern DAT_RETURN DAT_API dapl_ep_post_recv (
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	IN	DAT_COUNT,		/* num_segments */
 	IN	DAT_LMR_TRIPLET *,	/* local_iov */
 	IN	DAT_DTO_COOKIE,		/* user_cookie */
 	IN	DAT_COMPLETION_FLAGS ); /* completion_flags */
 
-extern DAT_RETURN dapl_ep_post_rdma_read (
+extern DAT_RETURN DAT_API dapl_ep_post_rdma_read (
 	IN	DAT_EP_HANDLE,		 /* ep_handle */
 	IN	DAT_COUNT,		 /* num_segments */
 	IN	DAT_LMR_TRIPLET *,	 /* local_iov */
@@ -944,14 +944,14 @@ extern DAT_RETURN dapl_ep_post_rdma_read (
 	IN	const DAT_RMR_TRIPLET *, /* remote_iov */
 	IN	DAT_COMPLETION_FLAGS );	 /* completion_flags */
 
-extern DAT_RETURN dapl_ep_post_rdma_read_to_rmr (
+extern DAT_RETURN DAT_API dapl_ep_post_rdma_read_to_rmr (
 	IN      DAT_EP_HANDLE,	        /* ep_handle            */
 	IN      const DAT_RMR_TRIPLET *,/* local_iov            */
 	IN      DAT_DTO_COOKIE,		/* user_cookie          */
 	IN      const DAT_RMR_TRIPLET *,/* remote_iov           */
 	IN      DAT_COMPLETION_FLAGS);	/* completion_flags     */
 
-extern DAT_RETURN dapl_ep_post_rdma_write (
+extern DAT_RETURN DAT_API dapl_ep_post_rdma_write (
 	IN	DAT_EP_HANDLE,		 /* ep_handle */
 	IN	DAT_COUNT,		 /* num_segments */
 	IN	DAT_LMR_TRIPLET *,	 /* local_iov */
@@ -959,7 +959,7 @@ extern DAT_RETURN dapl_ep_post_rdma_write (
 	IN	const DAT_RMR_TRIPLET *, /* remote_iov */
 	IN	DAT_COMPLETION_FLAGS );	 /* completion_flags */
 
-extern DAT_RETURN dapl_ep_post_send_with_invalidate (
+extern DAT_RETURN DAT_API dapl_ep_post_send_with_invalidate (
 	IN      DAT_EP_HANDLE,          /* ep_handle            */
 	IN      DAT_COUNT,              /* num_segments         */
 	IN      DAT_LMR_TRIPLET *,      /* local_iov            */
@@ -968,19 +968,19 @@ extern DAT_RETURN dapl_ep_post_send_with_invalidate (
 	IN      DAT_BOOLEAN,            /* invalidate_flag      */
 	IN      DAT_RMR_CONTEXT);      /* RMR context          */
 
-extern DAT_RETURN dapl_ep_get_status (
+extern DAT_RETURN DAT_API dapl_ep_get_status (
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	OUT	DAT_EP_STATE *,		/* ep_state */
 	OUT	DAT_BOOLEAN *,		/* in_dto_idle */
 	OUT	DAT_BOOLEAN * );	/* out_dto_idle */
 
-extern DAT_RETURN dapl_ep_free (
+extern DAT_RETURN DAT_API dapl_ep_free (
 	IN	DAT_EP_HANDLE);		/* ep_handle */
 
-extern DAT_RETURN dapl_ep_reset (
+extern DAT_RETURN DAT_API dapl_ep_reset (
 	IN	DAT_EP_HANDLE);		/* ep_handle */
 
-extern DAT_RETURN dapl_ep_create_with_srq (
+extern DAT_RETURN DAT_API dapl_ep_create_with_srq (
         IN      DAT_IA_HANDLE,          /* ia_handle            */
         IN      DAT_PZ_HANDLE,          /* pz_handle            */
         IN      DAT_EVD_HANDLE,         /* recv_evd_handle      */
@@ -990,12 +990,12 @@ extern DAT_RETURN dapl_ep_create_with_srq (
         IN      const DAT_EP_ATTR *,    /* ep_attributes        */
         OUT     DAT_EP_HANDLE *);       /* ep_handle            */
 
-extern DAT_RETURN dapl_ep_recv_query (
+extern DAT_RETURN DAT_API dapl_ep_recv_query (
         IN      DAT_EP_HANDLE,          /* ep_handle            */
         OUT     DAT_COUNT *,            /* nbufs_allocated      */
         OUT     DAT_COUNT *);           /* bufs_alloc_span      */
 
-extern DAT_RETURN dapl_ep_set_watermark (
+extern DAT_RETURN DAT_API dapl_ep_set_watermark (
         IN      DAT_EP_HANDLE,          /* ep_handle            */
         IN      DAT_COUNT,              /* soft_high_watermark  */
         IN      DAT_COUNT);             /* hard_high_watermark  */
@@ -1003,7 +1003,7 @@ extern DAT_RETURN dapl_ep_set_watermark (
 /* LMR functions */
 
 #if defined(__KERNEL__)
-extern DAT_RETURN dapl_lmr_kcreate (
+extern DAT_RETURN DAT_API dapl_lmr_kcreate (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	IN	DAT_MEM_TYPE,		/* mem_type */
 	IN	DAT_REGION_DESCRIPTION, /* region_description */
@@ -1018,7 +1018,7 @@ extern DAT_RETURN dapl_lmr_kcreate (
 	OUT	DAT_VLEN *,		/* registered_length */
 	OUT	DAT_VADDR * );		/* registered_address */
 #else
-extern DAT_RETURN dapl_lmr_create (
+extern DAT_RETURN DAT_API dapl_lmr_create (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	IN	DAT_MEM_TYPE,		/* mem_type */
 	IN	DAT_REGION_DESCRIPTION, /* region_description */
@@ -1033,40 +1033,40 @@ extern DAT_RETURN dapl_lmr_create (
 	OUT	DAT_VADDR * );		/* registered_address */
 #endif	/* defined(__KERNEL__) */
 
-extern DAT_RETURN dapl_lmr_query (
+extern DAT_RETURN DAT_API dapl_lmr_query (
 	IN	DAT_LMR_HANDLE,
 	IN	DAT_LMR_PARAM_MASK,
 	OUT	DAT_LMR_PARAM *);
 
-extern DAT_RETURN dapl_lmr_free (
+extern DAT_RETURN DAT_API dapl_lmr_free (
 	IN	DAT_LMR_HANDLE);
 
-extern DAT_RETURN dapl_lmr_sync_rdma_read(
+extern DAT_RETURN DAT_API dapl_lmr_sync_rdma_read(
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      const DAT_LMR_TRIPLET *, /* local_segments      */
 	IN      DAT_VLEN);              /* num_segments         */
 
-extern DAT_RETURN dapl_lmr_sync_rdma_write(
+extern DAT_RETURN DAT_API dapl_lmr_sync_rdma_write(
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      const DAT_LMR_TRIPLET *, /* local_segments      */
 	IN      DAT_VLEN);              /* num_segments         */
 
 /* RMR Functions */
 
-extern DAT_RETURN dapl_rmr_create (
+extern DAT_RETURN DAT_API dapl_rmr_create (
 	IN	DAT_PZ_HANDLE,		/* pz_handle */
 	OUT	DAT_RMR_HANDLE *);	/* rmr_handle */
 
-extern DAT_RETURN dapl_rmr_create_for_ep (
+extern DAT_RETURN DAT_API dapl_rmr_create_for_ep (
 	IN      DAT_PZ_HANDLE pz_handle,	/* pz_handle    */
 	OUT     DAT_RMR_HANDLE *rmr_handle);	/* rmr_handle   */
 
-extern DAT_RETURN dapl_rmr_query (
+extern DAT_RETURN DAT_API dapl_rmr_query (
 	IN	DAT_RMR_HANDLE,		/* rmr_handle */
 	IN	DAT_RMR_PARAM_MASK,	/* rmr_args_mask */
 	OUT	DAT_RMR_PARAM *);	/* rmr_args */
 
-extern DAT_RETURN dapl_rmr_bind (
+extern DAT_RETURN DAT_API dapl_rmr_bind (
 	IN	DAT_RMR_HANDLE,		 /* rmr_handle */
 	IN	DAT_LMR_HANDLE,		 /* lmr_handle */
 	IN	const DAT_LMR_TRIPLET *, /* lmr_triplet */
@@ -1077,119 +1077,119 @@ extern DAT_RETURN dapl_rmr_bind (
 	IN	DAT_COMPLETION_FLAGS,	 /* completion_flags */
 	INOUT	DAT_RMR_CONTEXT * );	 /* context */
 
-extern DAT_RETURN dapl_rmr_free (
+extern DAT_RETURN DAT_API dapl_rmr_free (
 	IN	DAT_RMR_HANDLE);
 
 /* PSP Functions */
 
-extern DAT_RETURN dapl_psp_create (
+extern DAT_RETURN DAT_API dapl_psp_create (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	IN	DAT_CONN_QUAL,		/* conn_qual */
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	IN	DAT_PSP_FLAGS,		/* psp_flags */
 	OUT	DAT_PSP_HANDLE * );	/* psp_handle */
 
-extern DAT_RETURN dapl_psp_create_any (
+extern DAT_RETURN DAT_API dapl_psp_create_any (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	OUT	DAT_CONN_QUAL *,	/* conn_qual */
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	IN	DAT_PSP_FLAGS,		/* psp_flags */
 	OUT	DAT_PSP_HANDLE *);	/* psp_handle */
 
-extern DAT_RETURN dapl_psp_query (
+extern DAT_RETURN DAT_API dapl_psp_query (
 	IN	DAT_PSP_HANDLE,
 	IN	DAT_PSP_PARAM_MASK,
 	OUT	DAT_PSP_PARAM * );
 
-extern DAT_RETURN dapl_psp_free (
+extern DAT_RETURN DAT_API dapl_psp_free (
 	IN	DAT_PSP_HANDLE );	/* psp_handle */
 
 /* RSP Functions */
 
-extern DAT_RETURN dapl_rsp_create (
+extern DAT_RETURN DAT_API dapl_rsp_create (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	IN	DAT_CONN_QUAL,		/* conn_qual */
 	IN	DAT_EP_HANDLE,		/* ep_handle */
 	IN	DAT_EVD_HANDLE,		/* evd_handle */
 	OUT	DAT_RSP_HANDLE * );	/* rsp_handle */
 
-extern DAT_RETURN dapl_rsp_query (
+extern DAT_RETURN DAT_API dapl_rsp_query (
 	IN	DAT_RSP_HANDLE,
 	IN	DAT_RSP_PARAM_MASK,
 	OUT	DAT_RSP_PARAM * );
 
-extern DAT_RETURN dapl_rsp_free (
+extern DAT_RETURN DAT_API dapl_rsp_free (
 	IN	DAT_RSP_HANDLE );	/* rsp_handle */
 
 /* PZ Functions */
 
-extern DAT_RETURN dapl_pz_create (
+extern DAT_RETURN DAT_API dapl_pz_create (
 	IN	DAT_IA_HANDLE,		/* ia_handle */
 	OUT	DAT_PZ_HANDLE * );	/* pz_handle */
 
-extern DAT_RETURN dapl_pz_query (
+extern DAT_RETURN DAT_API dapl_pz_query (
 	IN	DAT_PZ_HANDLE,		/* pz_handle */
 	IN	DAT_PZ_PARAM_MASK,	/* pz_args_mask */
 	OUT	DAT_PZ_PARAM * );	/* pz_args */
 
-extern DAT_RETURN dapl_pz_free (
+extern DAT_RETURN DAT_API dapl_pz_free (
 	IN	DAT_PZ_HANDLE );	/* pz_handle */
 
 /* SRQ functions */
 
-extern DAT_RETURN dapl_srq_create(
+extern DAT_RETURN DAT_API dapl_srq_create(
         IN      DAT_IA_HANDLE,          /* ia_handle            */
         IN      DAT_PZ_HANDLE,          /* pz_handle            */
         IN      DAT_SRQ_ATTR *,         /* srq_attr             */
         OUT     DAT_SRQ_HANDLE *);      /* srq_handle           */
 
-extern DAT_RETURN dapl_srq_free(
+extern DAT_RETURN DAT_API dapl_srq_free(
 	IN      DAT_SRQ_HANDLE);        /* srq_handle           */
 
-extern DAT_RETURN dapl_srq_post_recv(
+extern DAT_RETURN DAT_API dapl_srq_post_recv(
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_COUNT,              /* num_segments         */
 	IN      DAT_LMR_TRIPLET *,      /* local_iov            */
 	IN      DAT_DTO_COOKIE);        /* user_cookie          */
 
-extern DAT_RETURN dapl_srq_query(
+extern DAT_RETURN DAT_API dapl_srq_query(
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_SRQ_PARAM_MASK,     /* srq_param_mask       */
 	OUT     DAT_SRQ_PARAM *);       /* srq_param            */
 
-extern DAT_RETURN dapl_srq_resize(
+extern DAT_RETURN DAT_API dapl_srq_resize(
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_COUNT);             /* srq_max_recv_dto     */
 
-extern DAT_RETURN dapl_srq_set_lw(
+extern DAT_RETURN DAT_API dapl_srq_set_lw(
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_COUNT);             /* low_watermark        */
 
 /* CSP functions */
-extern DAT_RETURN dapl_csp_create(
+extern DAT_RETURN DAT_API dapl_csp_create(
 	IN      DAT_IA_HANDLE,          /* ia_handle      */
 	IN      DAT_COMM *,             /* communicator   */
 	IN      DAT_IA_ADDRESS_PTR,     /* address        */
 	IN      DAT_EVD_HANDLE,         /* evd_handle     */
 	OUT     DAT_CSP_HANDLE *);      /* csp_handle     */
 
-extern DAT_RETURN dapl_csp_query(
+extern DAT_RETURN DAT_API dapl_csp_query(
 	IN      DAT_CSP_HANDLE,         /* csp_handle     */
 	IN      DAT_CSP_PARAM_MASK,     /* csp_param_mask */
 	OUT     DAT_CSP_PARAM *);       /* csp_param      */
 
-extern DAT_RETURN dapl_csp_free( 
+extern DAT_RETURN DAT_API dapl_csp_free( 
 	IN      DAT_CSP_HANDLE);        /* csp_handle     */
 
 /* HA functions */
-DAT_RETURN dapl_ia_ha(
+DAT_RETURN DAT_API dapl_ia_ha(
 	IN	 DAT_IA_HANDLE,         /* ia_handle */
 	IN const DAT_NAME_PTR,          /* provider  */
 	OUT	 DAT_BOOLEAN *);        /* answer    */
 
 #ifdef DAT_EXTENSIONS
 #include <stdarg.h>
-extern DAT_RETURN dapl_extensions(
+extern DAT_RETURN DAT_API dapl_extensions(
 	IN	DAT_HANDLE,		/* handle */
 	IN	DAT_EXTENDED_OP,	/* extended op */
 	IN	va_list);		/* argument list */
diff --git a/dapl/udapl/dapl_cno_create.c b/dapl/udapl/dapl_cno_create.c
index 8ddedc6..088992a 100644
--- a/dapl/udapl/dapl_cno_create.c
+++ b/dapl/udapl/dapl_cno_create.c
@@ -61,7 +61,7 @@
  *	DAT_INVALID_HANDLE
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN dapl_cno_create(
+DAT_RETURN DAT_API dapl_cno_create(
 	IN	DAT_IA_HANDLE			ia_handle,	/* ia_handle */
 	IN	DAT_OS_WAIT_PROXY_AGENT		wait_agent,	/* agent */
 	OUT	DAT_CNO_HANDLE			*cno_handle)	/* cno_handle */
diff --git a/dapl/udapl/dapl_cno_free.c b/dapl/udapl/dapl_cno_free.c
index cf9d7ef..053bde6 100644
--- a/dapl/udapl/dapl_cno_free.c
+++ b/dapl/udapl/dapl_cno_free.c
@@ -58,7 +58,7 @@
  *	DAT_INVALID_HANDLE
  *	DAT_INVALID_STATE
  */
-DAT_RETURN dapl_cno_free(
+DAT_RETURN DAT_API dapl_cno_free(
 	IN	DAT_CNO_HANDLE			cno_handle)	/* cno_handle */
 
 {
diff --git a/dapl/udapl/dapl_cno_modify_agent.c b/dapl/udapl/dapl_cno_modify_agent.c
index 74992d3..c4847ff 100644
--- a/dapl/udapl/dapl_cno_modify_agent.c
+++ b/dapl/udapl/dapl_cno_modify_agent.c
@@ -57,7 +57,7 @@
  *	DAT_INVALID_HANDLE
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN dapl_cno_modify_agent(
+DAT_RETURN DAT_API dapl_cno_modify_agent(
 	IN	DAT_CNO_HANDLE		cno_handle,	/* cno_handle */
 	IN	DAT_OS_WAIT_PROXY_AGENT	prx_agent )	/* agent */
 
diff --git a/dapl/udapl/dapl_cno_query.c b/dapl/udapl/dapl_cno_query.c
index 2901e2e..b7275be 100644
--- a/dapl/udapl/dapl_cno_query.c
+++ b/dapl/udapl/dapl_cno_query.c
@@ -61,7 +61,7 @@
  *	DAT_INVALID_HANDLE
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN dapl_cno_query(
+DAT_RETURN DAT_API dapl_cno_query(
 	IN	DAT_CNO_HANDLE		cno_handle,	/* cno_handle */
 	IN	DAT_CNO_PARAM_MASK	cno_param_mask,	/* cno_param_mask */
 	OUT	DAT_CNO_PARAM 		*cno_param )	/* cno_param */
diff --git a/dapl/udapl/dapl_cno_wait.c b/dapl/udapl/dapl_cno_wait.c
index 3026b01..fce9f88 100644
--- a/dapl/udapl/dapl_cno_wait.c
+++ b/dapl/udapl/dapl_cno_wait.c
@@ -59,7 +59,7 @@
  *	DAT_QUEUE_EMPTY
  *	DAT_INVALID_PARAMETER
  */
-DAT_RETURN dapl_cno_wait(
+DAT_RETURN DAT_API dapl_cno_wait(
 	IN	DAT_CNO_HANDLE		cno_handle,	/* cno_handle */
 	IN	DAT_TIMEOUT		timeout,	/* agent */
 	OUT	DAT_EVD_HANDLE		*evd_handle)	/* ia_handle */
diff --git a/dapl/udapl/dapl_evd_clear_unwaitable.c b/dapl/udapl/dapl_evd_clear_unwaitable.c
index 40614e8..35dd3ae 100644
--- a/dapl/udapl/dapl_evd_clear_unwaitable.c
+++ b/dapl/udapl/dapl_evd_clear_unwaitable.c
@@ -55,7 +55,7 @@
  *	DAT_SUCCESS
  *	DAT_INVALID_HANDLE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_evd_clear_unwaitable (
 	IN    DAT_EVD_HANDLE	evd_handle )
 {
diff --git a/dapl/udapl/dapl_evd_create.c b/dapl/udapl/dapl_evd_create.c
index f472b39..7687a81 100644
--- a/dapl/udapl/dapl_evd_create.c
+++ b/dapl/udapl/dapl_evd_create.c
@@ -72,7 +72,7 @@
  * even if it is not required. However, it will not be armed.
  */
 
-DAT_RETURN dapl_evd_create (
+DAT_RETURN DAT_API dapl_evd_create (
     IN    DAT_IA_HANDLE		ia_handle,
     IN    DAT_COUNT		evd_min_qlen,
     IN    DAT_CNO_HANDLE	cno_handle,
diff --git a/dapl/udapl/dapl_evd_disable.c b/dapl/udapl/dapl_evd_disable.c
index 64494cf..0187d7f 100644
--- a/dapl/udapl/dapl_evd_disable.c
+++ b/dapl/udapl/dapl_evd_disable.c
@@ -57,7 +57,7 @@
  * 	DAT_INVALID_HANDLE
  */
 
-DAT_RETURN dapl_evd_disable (
+DAT_RETURN DAT_API dapl_evd_disable (
 	IN	DAT_EVD_HANDLE	   evd_handle)
 {
     DAPL_EVD		*evd_ptr;
diff --git a/dapl/udapl/dapl_evd_enable.c b/dapl/udapl/dapl_evd_enable.c
index b9ae03e..6f41eeb 100644
--- a/dapl/udapl/dapl_evd_enable.c
+++ b/dapl/udapl/dapl_evd_enable.c
@@ -58,7 +58,7 @@
  * 	DAT_INVALID_HANDLE
  */
 
-DAT_RETURN dapl_evd_enable (
+DAT_RETURN DAT_API dapl_evd_enable (
 	IN	DAT_EVD_HANDLE	   evd_handle)
 {
     DAPL_EVD		*evd_ptr;
diff --git a/dapl/udapl/dapl_evd_modify_cno.c b/dapl/udapl/dapl_evd_modify_cno.c
index 03cef04..1ee2231 100644
--- a/dapl/udapl/dapl_evd_modify_cno.c
+++ b/dapl/udapl/dapl_evd_modify_cno.c
@@ -59,7 +59,7 @@
  * 	DAT_INVALID_HANDLE
  */
 
-DAT_RETURN dapl_evd_modify_cno (
+DAT_RETURN DAT_API dapl_evd_modify_cno (
 	IN	DAT_EVD_HANDLE		evd_handle,
 	IN	DAT_CNO_HANDLE		cno_handle)
 
diff --git a/dapl/udapl/dapl_evd_query.c b/dapl/udapl/dapl_evd_query.c
index 3aa3b24..6983d23 100644
--- a/dapl/udapl/dapl_evd_query.c
+++ b/dapl/udapl/dapl_evd_query.c
@@ -56,7 +56,7 @@
  * 	DAT_SUCCESS
  * 	DAT_INVALID_PARAMETER
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_evd_query (
 	IN	DAT_EVD_HANDLE		evd_handle,
 	IN	DAT_EVD_PARAM_MASK	evd_param_mask,
diff --git a/dapl/udapl/dapl_evd_set_unwaitable.c b/dapl/udapl/dapl_evd_set_unwaitable.c
index e11b9d8..35250ff 100644
--- a/dapl/udapl/dapl_evd_set_unwaitable.c
+++ b/dapl/udapl/dapl_evd_set_unwaitable.c
@@ -56,7 +56,7 @@
  *	DAT_SUCCESS
  *	DAT_INVALID_HANDLE
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_evd_set_unwaitable (
 	IN    DAT_EVD_HANDLE	evd_handle )
 {
diff --git a/dapl/udapl/dapl_evd_wait.c b/dapl/udapl/dapl_evd_wait.c
index 966cef0..e4e5b37 100644
--- a/dapl/udapl/dapl_evd_wait.c
+++ b/dapl/udapl/dapl_evd_wait.c
@@ -63,7 +63,7 @@
  * 	DAT_INVALID_STATE
  */
 
-DAT_RETURN dapl_evd_wait (
+DAT_RETURN DAT_API dapl_evd_wait (
     IN  DAT_EVD_HANDLE	evd_handle,
     IN  DAT_TIMEOUT	time_out,
     IN  DAT_COUNT       threshold,
diff --git a/dapl/udapl/dapl_init.c b/dapl/udapl/dapl_init.c
index 787261b..cdd90d8 100644
--- a/dapl/udapl/dapl_init.c
+++ b/dapl/udapl/dapl_init.c
@@ -166,7 +166,7 @@ void dapl_fini ( void )
  * <hca name> <port number>
  *
  */
-void 
+void DAT_API
 DAT_PROVIDER_INIT_FUNC_NAME (
     IN const DAT_PROVIDER_INFO 	*provider_info,
     IN const char 		*instance_data )
@@ -261,7 +261,7 @@ DAT_PROVIDER_INIT_FUNC_NAME (
  * This function is called by the registry to de-initialize a provider
  *
  */
-void 
+void DAT_API
 DAT_PROVIDER_FINI_FUNC_NAME (
     IN const DAT_PROVIDER_INFO 	*provider_info )
 {
diff --git a/dapl/udapl/dapl_lmr_create.c b/dapl/udapl/dapl_lmr_create.c
index 2963847..350abe0 100644
--- a/dapl/udapl/dapl_lmr_create.c
+++ b/dapl/udapl/dapl_lmr_create.c
@@ -450,7 +450,7 @@ dapli_lmr_create_shared (
  * 	DAT_MODEL_NOT_SUPPORTED
  *
  */
-DAT_RETURN
+DAT_RETURN DAT_API
 dapl_lmr_create (
 	IN	DAT_IA_HANDLE		ia_handle,
 	IN	DAT_MEM_TYPE		mem_type,


From chu11 at llnl.gov  Fri Dec  7 16:54:25 2007
From: chu11 at llnl.gov (Al Chu)
Date: Fri, 07 Dec 2007 16:54:25 -0800
Subject: [ofa-general] [PATCH 1/3] OpenSM: Add null dereference checks
Message-ID: <1197075265.29314.131.camel@cardanus.llnl.gov>

Hey Sasha,

Nothing fancy.  Just noticed the check is done in the ftree equivalent
destroy function so I figured it should be in the others.
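
For readers without the attachment, the kind of check being described can
be sketched roughly as follows. This is only an illustration: the struct
and function names (example_ctx, example_ctx_create, example_ctx_destroy)
are hypothetical and are not taken from the OpenSM sources or from the patch.

```c
#include <stdlib.h>

/* Hypothetical routing-engine context; not a real OpenSM type. */
struct example_ctx {
	int *table;
};

/* Allocate a context with room for n table entries, or return NULL. */
struct example_ctx *example_ctx_create(size_t n)
{
	struct example_ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;
	ctx->table = calloc(n, sizeof(*ctx->table));
	if (!ctx->table) {
		free(ctx);
		return NULL;
	}
	return ctx;
}

/* Destroy callback taking an opaque pointer.
 * Returns 0 if there was nothing to free, 1 otherwise. */
int example_ctx_destroy(void *context)
{
	struct example_ctx *ctx = context;

	/* The context may be NULL if setup failed or never ran. */
	if (!ctx)
		return 0;

	free(ctx->table);
	free(ctx);
	return 1;
}
```

With the check in place, calling example_ctx_destroy(NULL) is a harmless
no-op instead of a NULL dereference.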

Al

-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-add-null-check-in-context-destroy-functions.patch
Type: text/x-patch
Size: 1340 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071207/41139556/attachment.bin>

From chu11 at llnl.gov  Fri Dec  7 16:55:42 2007
From: chu11 at llnl.gov (Al Chu)
Date: Fri, 07 Dec 2007 16:55:42 -0800
Subject: [ofa-general] [PATCH 2/3] OpenSM: Fix incorrect reporting of routing
	engine/algorithm used
Message-ID: <1197075342.29314.133.camel@cardanus.llnl.gov>

Hey Sasha,

I noticed that when a routing algorithm failed and defaulted back to
'minhop', the logs and the console did not report this change.  This is
because most of that code outputs the routing algorithm name that was
stored during configuration/setup.  The name isn't adjusted depending on
the routing algorithm's success/failure.

There are several ways this could be fixed.  I decided the easiest was to
stick a new routed_name field + lock into struct osm_routing_engine, and
to set/use this new field accordingly.

Note that within osm_ucast_mgr_process(), there is a slight logic change
from what was there before.  If the routing engine's call to
build_lid_matrices() failed, I've changed the logic to not call the
routing engine's ucast_build_fwd_tables() function.  This felt like the
correct logic and seems to be fine given all the routing algorithms in
OpenSM.  PLMK if there is some behavior subtlety I missed.
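
The control flow described above might look roughly like the sketch below.
It is a simplified illustration under several assumptions: the struct is a
hypothetical stand-in for struct osm_routing_engine, the lock is omitted,
the stub callbacks exist only to exercise the logic, and the actual
fallback to minhop routing is reduced to a comment.

```c
/* Hypothetical stand-in for struct osm_routing_engine: only the fields
 * this sketch needs. The lock mentioned in the mail is left out. */
struct example_routing_engine {
	const char *name;		/* algorithm chosen at configuration */
	const char *routed_name;	/* algorithm that actually routed */
	int (*build_lid_matrices)(void *context);
	int (*ucast_build_fwd_tables)(void *context);
	void *context;
};

/* Stub callbacks for trying the sketch out; 0 means success. */
int example_fwd_calls;
int example_ok(void *context) { (void)context; return 0; }
int example_fail(void *context) { (void)context; return -1; }
int example_fwd(void *context)
{
	(void)context;
	example_fwd_calls++;
	return 0;
}

/* Returns 0 if the configured engine routed the subnet, nonzero if the
 * default algorithm had to be used instead. */
int example_ucast_process(struct example_routing_engine *re)
{
	int failed = 1;

	if (re->build_lid_matrices && re->ucast_build_fwd_tables) {
		failed = re->build_lid_matrices(re->context);
		/* Only build forwarding tables if the matrices succeeded. */
		if (!failed)
			failed = re->ucast_build_fwd_tables(re->context);
	}

	/* On failure the real code would run the default minhop routing here. */
	re->routed_name = failed ? "minhop" : re->name;
	return failed;
}
```

The point of the sketch is that routed_name is set from the outcome of the
run, so logging and console code can print it instead of the configured
name.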

Thanks,
Al
-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-fix-incorrect-reporting-of-routing-engine.patch
Type: text/x-patch
Size: 5519 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071207/e1547bbc/attachment.bin>

From chu11 at llnl.gov  Fri Dec  7 16:56:19 2007
From: chu11 at llnl.gov (Al Chu)
Date: Fri, 07 Dec 2007 16:56:19 -0800
Subject: [ofa-general] [PATCH 3/3] OpenSM: Fix incorrect identification of
	routing algorithm used
Message-ID: <1197075380.29314.135.camel@cardanus.llnl.gov>

Hey Sasha,

As a follow-up to patch 2/3, there were several locations in the code
where the determination of what routing algorithm was used to route the
subnet may not be done correctly.  Similar to patch 2/3, it's due to the
possibility that the routing algorithm fails and defaults back to 'minhop',
but the original routing engine variables are not modified to indicate an
alternate/default algorithm was used.

This patch tries to correct this by looking at the new 'routed_name'
field in struct osm_routing_engine instead of the other variables.

Thanks,
Al

-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-fix-incorrect-identification-of-routing-algorithm-us.patch
Type: text/x-patch
Size: 4247 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071207/6432c72c/attachment.bin>

From kliteyn at mellanox.co.il  Fri Dec  7 21:18:44 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 8 Dec 2007 07:18:44 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-08:normal completion
Message-ID: <MTLEXCH01cYTjr8cU6x0000a055@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-07
OpenSM git rev = Wed_Dec_5_21:24:42_2007 [91248a71754006c032536a1578a311137a8ab240]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=520  Fail=0
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From vlad at lists.openfabrics.org  Sat Dec  8 02:53:35 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sat,  8 Dec 2007 02:53:35 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071208-0200 daily build status
Message-ID: <20071208105335.EF05AE601B1@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.18
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.15
Passed on ppc64 with linux-2.6.18
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on ppc64 with linux-2.6.18-8.el5

Failed:


From hrosenstock at xsigo.com  Sat Dec  8 08:41:20 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sat, 08 Dec 2007 08:41:20 -0800
Subject: [ofa-general] kernel CM question
Message-ID: <1197132080.8114.333.camel@hrosenstock-ws.xsigo.com>

Hi Sean,

In cm.c, there is:
static inline int cm_convert_to_ms(int iba_time)
{
        /* approximate conversion to ms from 4.096us x 2^iba_time */
        return 1 << max(iba_time - 8, 0);
}

Seems to me that as iba_time gets larger, the approximation is off by
more and forces using the next lower time.

For example, if 22 is used:
ib_cm: req timeout_ms 16896 > 8192, decreasing

Is it too much computation to make this more accurate ?
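
For reference, the arithmetic behind the helper can be checked in
userspace. 4.096us x 2^8 is 1.048576ms, so for iba_time >= 8 the shift
returns a value about 4.6% below the exact figure: the relative error is
constant and only the absolute error grows with iba_time. A small sketch
(approx_ms and exact_ms are illustrative names, not kernel functions, and
the 0..31 range is an assumption based on a 5-bit timeout field):

```c
#include <assert.h>
#include <stdint.h>

/* Same shift as cm_convert_to_ms(): 2^(iba_time - 8) ms, at least 1 ms. */
static int approx_ms(int iba_time)
{
	int shift = iba_time - 8;

	return 1 << (shift > 0 ? shift : 0);
}

/* Exact value of 4.096us x 2^iba_time, truncated to whole milliseconds. */
static uint64_t exact_ms(int iba_time)
{
	assert(iba_time >= 0 && iba_time <= 31);

	/* 4.096us is 4096ns; shift first, then convert ns to ms. */
	return ((uint64_t)4096 << iba_time) / 1000000;
}
```

For iba_time = 22, approx_ms() gives 16384 and exact_ms() gives 17179
(truncated from 17179.87).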

Thanks.

-- Hal


From hrosenstock at xsigo.com  Sat Dec  8 12:52:34 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sat, 08 Dec 2007 12:52:34 -0800
Subject: [ofa-general] kernel CM question
In-Reply-To: <1197132080.8114.333.camel@hrosenstock-ws.xsigo.com>
References: <1197132080.8114.333.camel@hrosenstock-ws.xsigo.com>
Message-ID: <1197147154.8114.349.camel@hrosenstock-ws.xsigo.com>

Hi again Sean,

On Sat, 2007-12-08 at 08:41 -0800, Hal Rosenstock wrote:
> Hi Sean,
> 
> In cm.c, there is:
> static inline int cm_convert_to_ms(int iba_time)
> {
>         /* approximate conversion to ms from 4.096us x 2^iba_time */
>         return 1 << max(iba_time - 8, 0);
> }
> 
> Seems to me that as iba_time gets larger, the approximation is off by
> more and forces using the next lower time.
> 
> For example, if 22 is used:
> ib_cm: req timeout_ms 16896 > 8192, decreasing
> 
> Is it too much computation to make this more accurate ?

My bad; this computation is fine and the question/problem is in the
OFED patch which produces this message.

-- Hal

> Thanks.
> 
> -- Hal


From or.gerlitz at gmail.com  Sat Dec  8 14:04:24 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Sun, 9 Dec 2007 00:04:24 +0200
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A96F7A@xmb-sjc-216.amer.cisco.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96E64@xmb-sjc-216.amer.cisco.com>
	<15ddcffd0712061226l42cd5ed5ne3b36a3198ba9eaa@mail.gmail.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96F7A@xmb-sjc-216.amer.cisco.com>
Message-ID: <15ddcffd0712081404l110c2e74q3f15a3f01a6200b9@mail.gmail.com>

On 12/6/07, Scott Weitzenkamp (sweitzen) <sweitzen at cisco.com> wrote:
> I would prefer to leave openib.conf alone.  The old way and new way are
> not mutually exclusive, are they?

Indeed, the old and the new ways are not mutually exclusive. However,
as I see it, the name of the game here is: why add noise? Over Ethernet,
no one would ever consider adding a weird script that does not
use the standard tools to configure networking/bonding/etc. For some
reason, some people who run this thing called OFED neglect that simple
fact of everyday life, so they (we, I should say, as at least both you
and I are signed up for this) invent this and that mechanism.

In other words, the long answer is that OFED should go home, the
openibd service should go home, etc., along with this business of
configuring bonding through non-standard means, and the short answer is
that there is no point in configuring bonding through non-standard means.

Or.


From or.gerlitz at gmail.com  Sat Dec  8 14:09:21 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Sun, 9 Dec 2007 00:09:21 +0200
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <47586FD5.1040602@mellanox.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
	<47586FD5.1040602@mellanox.com>
Message-ID: <15ddcffd0712081409v17f239d6sd6b6952b8a269db3@mail.gmail.com>

On 12/6/07, Vu Pham <vuhuong at mellanox.com> wrote:
> > The question on usage case of bonding over separate fabrics have been
> > brought to me several times and I gave this answer, no-one ever tried
> > to educate me why its interesting, maybe you will do so...
> >
>
> I don't have good reason. I used two separated fabrics configuration
> because my lacking understanding on ethernet/ib bonding and the old
> methodology way of redundancy in ethernet  & FC using two separated fabrics.

Yes, that was my guess, but my hope was that you could provide some
reasoning for this methodology of redundancy, which I understand you
were also using for SRP HA. Can you say anything in favor of the
way you have been working until now? As I said, it has the problem that a
failure on one side forces a failure on the other side, and worse, when
there are more than two players, e.g. one target and N initiators, a
fail-over in one initiator forces the target to fail over --> forces the
other N-1 initiators to fail over!?

Or.


From swise at opengridcomputing.com  Sat Dec  8 14:45:17 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Sat, 08 Dec 2007 16:45:17 -0600
Subject: [ofa-general] Re: [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes and 5.0
	firmware support
In-Reply-To: <47501F09.4060800@opengridcomputing.com>
References: <47501F09.4060800@opengridcomputing.com>
Message-ID: <475B1E7D.5080607@opengridcomputing.com>

Vlad, it looks like you didn't pull in version 1.1.0 of libcxgb3 for 
ofed-1.2.5?

Right now the ofed-1.2.5.4 is broken from chelsio's perspective because 
the kernel drivers require 5.0 firmware, but the library doesn't have 
5.0 firmware support.

Can you please pull in 1.1.0 of libcxgb3 and crank a new ofed-1.2.5.4 
release?

Pull from:

git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5


Thanks,

Steve.


Steve Wise wrote:
> Vlad, please pull cxgb3 fixes for ofed-1.2.5 from:
> 
> git://git.openfabrics.org/~swise/ofed-1.2.5 stevo
> 
> These are cxgb3 bug fixes and PPC64 additions that we need for 
> ofed-1.2.5  (stay tuned for ofed-1.3 patches soon).
> 
> The patches are all accepted upstream and were posted here:
> 
> http://www.spinics.net/lists/netdev/msg47492.html
> 
> and here:
> 
> http://www.spinics.net/lists/netdev/msg48240.html
> 
> 
> Also, please pull version 1.1.0 of libcxgb3 from:
> 
> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
> 
> The library and drivers need to be included together as they are both 
> needed to support the chelsio 5.0 firmware.
> 
> Alsoalso: After you integrate these, can you crank a daily OFED-1.2.5.3 
> build including all this?
> 
> 
> Thanks,
> 
> Steve.
> 


From vlad at dev.mellanox.co.il  Sun Dec  9 00:18:06 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Sun, 09 Dec 2007 10:18:06 +0200
Subject: [ofa-general] Re: [GIT PULL ofed-1.3] - RDMA/cxgb3 - fixes and 5.0
	firmware support
In-Reply-To: <4756DBFF.7030607@opengridcomputing.com>
References: <4756DBFF.7030607@opengridcomputing.com>
Message-ID: <475BA4BE.7050004@dev.mellanox.co.il>

Steve Wise wrote:
> Vlad, please pull cxgb3 fixes for ofed-1.3 from:
> 
> git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel
> 
> These are cxgb3 bug fixes and PPC64 additions that we need for ofed-1.3.
> 
> The patches are all accepted upstream and were posted here:
> 
> http://www.spinics.net/lists/netdev/msg47492.html
> 
> and here:
> 
> http://www.spinics.net/lists/netdev/msg48240.html
> 
> Also, please pull version 1.1.0 of libcxgb3 from:
> 
> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3
> 
> The library and drivers need to be included together as they are both 
> needed to support the chelsio 5.0 firmware.
> 
> 
> Thanks,
> 
> Steve.
> 

Done,

Regards,
Vladimir


From eli at mellanox.co.il  Sun Dec  9 00:37:10 2007
From: eli at mellanox.co.il (Eli Cohen)
Date: Sun, 09 Dec 2007 10:37:10 +0200
Subject: [ofa-general] Re: [PATCH] IPOIB: use LRO
In-Reply-To: <adazlwnbe0l.fsf@cisco.com>
References: <1196783200.16214.16.camel@mtls03>  <adazlwnbe0l.fsf@cisco.com>
Message-ID: <1197189430.14149.15.camel@mtls03>


On Thu, 2007-12-06 at 15:27 -0800, Roland Dreier wrote:
> > TODO:
>  > add checksum offload support to the core and hw devices.
> 
> Given this I assume this is just an RFC and you don't expect this to
> be merged as-is, right?
Adding the authors of ehea, which is what I used as an example.
Yes, please send any comments. I believe we need the checksum patches to
be regenerated against the latest kernel, right?

> 
>  > +static int get_skb_hdr(struct sk_buff *skb, void **iphdr,
>  > +		       void **tcph, u64 *hdr_flags, void *priv)
>  > +{
>  > +	unsigned int ip_len;
>  > +	struct iphdr *iph;
>  > +
>  > +	/* FIXME - verify CQE checksum ??? */
>  > +
>  > +	/* non tcp packet */
> 
> I don't understand this comment here.
I meant that I think this is where I have to check with the HW
whether the checksum of the packet is correct. I looked at IBM ehea and
couldn't see any such explicit check. Do you think such a test is
required?

> 
>  > +	skb_reset_network_header(skb);
>  > +	iph = ip_hdr(skb);
>  > +	if (iph->protocol != IPPROTO_TCP)
>  > +		return -1;
>  > +
>  > +	ip_len = ip_hdrlen(skb);
>  > +	skb_set_transport_header(skb, ip_len);
>  > +	*tcph = tcp_hdr(skb);
>  > +
>  > +	/* check if ip header and tcp header are complete */
>  > +	if (iph->tot_len < ip_len + tcp_hdrlen(skb))
>  > +		return -1;
>  > +
>  > +	*hdr_flags = LRO_IPV4 | LRO_TCP;
> 
> I don't see anywhere that you test the ethertype for IPv4 vs. IPv6.
> So how do you know you have an IPv4 packet here?  I guess you need the
> check before you use ip_hdr() above.
Yes, it looks like I should add this check, though I did not see ehea do it.

> 
>  > +	*iphdr = iph;
>  > +
>  > +	return 0;
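
The checks discussed above can be tried out in userspace. The sketch below
is not the driver code: check_ipv4_tcp is a hypothetical helper that takes
the ethertype and a raw buffer instead of an skb, and it parses the header
bytes by hand. It tests the ethertype before touching the IP header and
reads the total length in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ETHERTYPE_IPV4	0x0800
#define TOY_IPPROTO_TCP		6

/* Returns 0 if buf holds a complete IPv4 + TCP header, -1 otherwise. */
static int check_ipv4_tcp(uint16_t ethertype, const uint8_t *buf, size_t len)
{
	size_t ip_len, tcp_len, tot_len;

	assert(buf != NULL);

	/* Reject non-IPv4 traffic before interpreting any header byte. */
	if (ethertype != TOY_ETHERTYPE_IPV4)
		return -1;
	if (len < 20 || (buf[0] >> 4) != 4)
		return -1;

	/* IHL is in 32-bit words; the fixed TCP header needs 20 more bytes. */
	ip_len = (size_t)(buf[0] & 0x0f) * 4;
	if (ip_len < 20 || len < ip_len + 20)
		return -1;
	if (buf[9] != TOY_IPPROTO_TCP)
		return -1;

	/* Total length is big-endian on the wire. */
	tot_len = ((size_t)buf[2] << 8) | buf[3];
	/* TCP data offset: upper nibble of byte 12, in 32-bit words. */
	tcp_len = (size_t)(buf[ip_len + 12] >> 4) * 4;
	if (tcp_len < 20 || tot_len < ip_len + tcp_len)
		return -1;

	return 0;
}
```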


From dotanb at dev.mellanox.co.il  Sun Dec  9 00:56:16 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Sun, 09 Dec 2007 10:56:16 +0200
Subject: [ofa-general] Re: [PATCH] ib/mad: fix incorrect access to items on
	local_list
In-Reply-To: <000101c83902$4e485460$9b37170a@amr.corp.intel.com>
References: <474BE237.8050602@dev.mellanox.co.il> <aday7cjntc9.fsf@cisco.com>
	<000001c8337a$cdc18e60$ff0da8c0@amr.corp.intel.com>
	<4756C665.5060404@dev.mellanox.co.il>
	<000101c83902$4e485460$9b37170a@amr.corp.intel.com>
Message-ID: <475BADB0.3040407@dev.mellanox.co.il>

Sean Hefty wrote:
>> Just want to let me know that i didn't forget about this issue.
>>
>> I tried to reproduce the failure before applying the bug, but this one
>> is not easy to reproduce.
>>
>> I will give you a feedback as soon as I'll have one ..
>>     
>
> To reproduce, you need to have MADs queued for processing on the local_list when
> the sender unregisters (perhaps by killing the app).  You should be able to
> widen the race window by adding a delay near the top of local_completions().
> That will keep the MADs on the local_list for longer.  Alternatively, you would
> need to have a lot of outstanding local MADs.
>
>   
The problem is that I don't call the MAD module directly; I execute SDP
tests that cause this scenario to happen.


thanks
Dotan


From vlad at lists.openfabrics.org  Sun Dec  9 02:54:03 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sun,  9 Dec 2007 02:54:03 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071209-0200 daily build status
Message-ID: <20071209105403.AAD90E601BA@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.18
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.15
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.12
Passed on ia64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.19
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.14
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-53.el5

Failed:


From sashak at voltaire.com  Sun Dec  9 06:18:22 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 9 Dec 2007 14:18:22 +0000
Subject: [ofa-general] [RFC] opensm: cl_qlock_pool benchmark
In-Reply-To: <20071127003157.GA26160@sashak.voltaire.com>
References: <20071127003157.GA26160@sashak.voltaire.com>
Message-ID: <20071209141822.GJ6213@sashak.voltaire.com>

Hi,

I looked at the possibility of optimizing and simplifying SA request
processing in OpenSM and found that a very common practice there is to use
cl_qlock_pool* as a record allocator (it must be locked because requests
of the same type share the pool). It is also used as a MAD allocator (via
osm_mad_pool).
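
For readers unfamiliar with the pattern, a locked pool of this kind can be
sketched as a free list guarded by a lock. This is only an illustration
and not the complib implementation: toy_pool, toy_item and the functions
below are hypothetical, and a C11 atomic_flag spinlock stands in for
whatever lock the real pool takes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_item {
	struct toy_item *next;	/* free-list link */
	char data[64];
};

struct toy_pool {
	atomic_flag lock;		/* stand-in for the pool lock */
	struct toy_item *free_list;
};

static void toy_pool_init(struct toy_pool *p)
{
	atomic_flag_clear(&p->lock);
	p->free_list = NULL;
}

static struct toy_item *toy_pool_get(struct toy_pool *p)
{
	struct toy_item *it;

	while (atomic_flag_test_and_set(&p->lock))
		;	/* spin until the lock is free */
	it = p->free_list;
	if (it)
		p->free_list = it->next;
	atomic_flag_clear(&p->lock);

	/* Free list empty: take memory from the heap, like a pool growing. */
	return it ? it : malloc(sizeof(*it));
}

static void toy_pool_put(struct toy_pool *p, struct toy_item *it)
{
	assert(it != NULL);

	while (atomic_flag_test_and_set(&p->lock))
		;
	it->next = p->free_list;
	p->free_list = it;
	atomic_flag_clear(&p->lock);
}

static void toy_pool_destroy(struct toy_pool *p)
{
	/* Not thread-safe: call only when no other users remain. */
	while (p->free_list) {
		struct toy_item *it = p->free_list;

		p->free_list = it->next;
		free(it);
	}
}
```

Every get and put pays for one lock/unlock pair, which is the cost a plain
per-request malloc/free avoids at this level.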

Looking at the implementation of q[lock_]pool, I thought that it would be
interesting to compare its performance with standard malloc, which by
itself should be reasonably fast. So I wrote a simple program,
test_pool.c (do_nothing() is there to prevent a smart optimizer from
dropping some cycles):


#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <complib/cl_qlockpool.h>
#include <complib/cl_qpool.h>

#define USE_MALLOC 1
#define USE_QPOOL 1

#ifdef USE_MALLOC
#define cl_qlock_pool_get(p) malloc(sizeof(*item))
#define cl_qlock_pool_put(p, mem) free(mem)
#else
#ifdef USE_QPOOL
#define cl_qlock_pool_t cl_qpool_t
#define cl_qlock_pool_construct(p) cl_qpool_construct(p)
#define cl_qlock_pool_init(p, a, b, c, d, e, f, g) \
	cl_qpool_init(p, a, b, c, d, e, f, g)
#define cl_qlock_pool_destroy(p) cl_qpool_destroy(p)
#define cl_qlock_pool_get(p) cl_qpool_get(p)
#define cl_qlock_pool_put(p, mem) cl_qpool_put(p, mem)
#endif
#endif

typedef struct item {
	cl_pool_item_t pool_item;
	char data[64];
} item_t;

#define POOL_MIN_SIZE      32
#define POOL_GROW_SIZE     32

#define N_TESTS 1000000000

static void do_nothing(struct item *items[], unsigned n)
{
	int i;
	for (i = 0 ; i < n ; i++) {
		if (!strcmp(items[i]->data, "12345678"))
			printf("Yes!!!\n");
	}
}

static int pool_get_and_put_items(cl_qlock_pool_t *p, unsigned n)
{
	struct item *items[n];
	struct item *item;
	int i;

	for (i = 0 ; i < n ; i++) {
		item = (struct item *)cl_qlock_pool_get(p);
		if (!item)
			return -1;
		memset(item->data, 0, sizeof(item->data));
		items[i] = item;
	}

	do_nothing(items, n);

	for (i = 0 ; i < n ; i++)
		cl_qlock_pool_put(p, &items[i]->pool_item);

	return 0;
}

static int test_pool(void)
{
	cl_qlock_pool_t pool;
	int i, j, status;

	cl_qlock_pool_construct(&pool);

	status = cl_qlock_pool_init(&pool, POOL_MIN_SIZE, 0, POOL_GROW_SIZE,
				    sizeof(struct item), NULL, NULL, NULL);
	/*
	 * pool_get_and_put_items() returns 0 on success, so each check
	 * below is true after a successful pass and test_pool() returns
	 * -i at that point (0 on the very first pass).
	 */
	for (i = 0 ; i < N_TESTS; i++)
		if (!pool_get_and_put_items(&pool, 1000000))
			return -i;

	for (i = 0 ; i < N_TESTS; i++) {
		if (!pool_get_and_put_items(&pool, 1000000))
			return -i;
		for (j = 0; j < N_TESTS; j++)
			if (!pool_get_and_put_items(&pool, 1000000))
				return -i;
	}

	cl_qlock_pool_destroy(&pool);

	return 0;
}

int main()
{
	int ret = test_pool();

	return ret;
}


And got such typical numbers:

* with cl_qlock_pool:

real    0m0.541s
user    0m0.488s
sys     0m0.056s

* with cl_qpool:

real    0m0.350s
user    0m0.288s
sys     0m0.060s

cl_qpool is much faster, which is expected since the locking cycle is
skipped there.

* with regular malloc/free:

real    0m0.292s
user    0m0.216s
sys     0m0.072s

And this one is *fastest*.

In this test I used various numbers of test cycles and different
optimization flags; the ratios between the numbers stayed similar.

This shows that regular malloc/free is the fastest allocator. When it is
used, no locking is required (all allocations are per individual request),
and it is roughly twice as fast as the current cl_qlock_pool (0.292s vs.
0.541s real, 0.216s vs. 0.488s user).

The obvious question is: why not convert from cl_qlock_pool? Are there
perhaps some holes in the test? Any thoughts?

Sasha


From sashak at voltaire.com  Sun Dec  9 06:23:26 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 9 Dec 2007 14:23:26 +0000
Subject: [ofa-general] [PATCH RFC] opensm: use malloc instead of
	cl_qlock_pool in SA processors
In-Reply-To: <20071209141822.GJ6213@sashak.voltaire.com>
References: <20071127003157.GA26160@sashak.voltaire.com>
	<20071209141822.GJ6213@sashak.voltaire.com>
Message-ID: <20071209142326.GK6213@sashak.voltaire.com>


Use regular malloc/free instead of cl_qlock_pool for record allocation
in SA processors. A simple benchmark shows that regular malloc/free is
roughly twice as fast as the cl_qlock_pool allocator and doesn't
require additional locking (it is actually faster than the non-locking
cl_qpool allocator too).

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/include/opensm/osm_sa_class_port_info.h  |    4 +-
 opensm/include/opensm/osm_sa_guidinfo_record.h  |    8 +---
 opensm/include/opensm/osm_sa_informinfo.h       |   10 +---
 opensm/include/opensm/osm_sa_lft_record.h       |    8 +---
 opensm/include/opensm/osm_sa_link_record.h      |    7 +--
 opensm/include/opensm/osm_sa_mcmember_record.h  |    4 +-
 opensm/include/opensm/osm_sa_mft_record.h       |    8 +---
 opensm/include/opensm/osm_sa_multipath_record.h |    8 +---
 opensm/include/opensm/osm_sa_node_record.h      |    8 +---
 opensm/include/opensm/osm_sa_path_record.h      |    8 +---
 opensm/include/opensm/osm_sa_pkey_record.h      |    8 +---
 opensm/include/opensm/osm_sa_portinfo_record.h  |    8 +---
 opensm/include/opensm/osm_sa_service_record.h   |    8 +---
 opensm/include/opensm/osm_sa_slvl_record.h      |    8 +---
 opensm/include/opensm/osm_sa_sminfo_record.h    |    4 +-
 opensm/include/opensm/osm_sa_sw_info_record.h   |    3 +-
 opensm/include/opensm/osm_sa_vlarb_record.h     |    8 +---
 opensm/opensm/osm_sa_guidinfo_record.c          |   38 ++++----------
 opensm/opensm/osm_sa_informinfo.c               |   33 +++---------
 opensm/opensm/osm_sa_lft_record.c               |   39 ++++----------
 opensm/opensm/osm_sa_link_record.c              |   33 +++---------
 opensm/opensm/osm_sa_mcmember_record.c          |   38 +++-----------
 opensm/opensm/osm_sa_mft_record.c               |   36 ++++----------
 opensm/opensm/osm_sa_multipath_record.c         |   62 +++++++----------------
 opensm/opensm/osm_sa_node_record.c              |   40 ++++----------
 opensm/opensm/osm_sa_path_record.c              |   49 +++++-------------
 opensm/opensm/osm_sa_pkey_record.c              |   37 ++++----------
 opensm/opensm/osm_sa_portinfo_record.c          |   36 ++++----------
 opensm/opensm/osm_sa_service_record.c           |   49 ++++--------------
 opensm/opensm/osm_sa_slvl_record.c              |   34 +++---------
 opensm/opensm/osm_sa_sminfo_record.c            |   39 ++++----------
 opensm/opensm/osm_sa_sw_info_record.c           |   35 ++++---------
 opensm/opensm/osm_sa_vlarb_record.c             |   41 ++++-----------
 33 files changed, 193 insertions(+), 566 deletions(-)

diff --git a/opensm/include/opensm/osm_sa_class_port_info.h b/opensm/include/opensm/osm_sa_class_port_info.h
index b477cd5..6e4c069 100644
--- a/opensm/include/opensm/osm_sa_class_port_info.h
+++ b/opensm/include/opensm/osm_sa_class_port_info.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -49,8 +49,6 @@
 #define _OSM_CPI_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
-#include <complib/cl_qlockpool.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
diff --git a/opensm/include/opensm/osm_sa_guidinfo_record.h b/opensm/include/opensm/osm_sa_guidinfo_record.h
index b3035c7..c074b7b 100644
--- a/opensm/include/opensm/osm_sa_guidinfo_record.h
+++ b/opensm/include/opensm/osm_sa_guidinfo_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -48,7 +48,6 @@
 #define _OSM_GIR_RCV_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -100,7 +99,6 @@ typedef struct _osm_gir_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_gir_rcv_t;
 /*
 * FIELDS
@@ -119,10 +117,6 @@ typedef struct _osm_gir_rcv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pool
-*		Pool of linkable GUIDInfo Record objects used to generate
-*		the query response.
-*
 * SEE ALSO
 *
 *********/
diff --git a/opensm/include/opensm/osm_sa_informinfo.h b/opensm/include/opensm/osm_sa_informinfo.h
index 5d00dd6..2a4b4ba 100644
--- a/opensm/include/opensm/osm_sa_informinfo.h
+++ b/opensm/include/opensm/osm_sa_informinfo.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,9 +50,6 @@
 #define _OSM_SA_INFR_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
-#include <complib/cl_timer.h>
-#include <complib/cl_qlockpool.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -104,7 +101,6 @@ typedef struct _osm_infr_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_infr_rcv_t;
 /*
 * FIELDS
@@ -120,10 +116,6 @@ typedef struct _osm_infr_rcv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pool
-*		Pool of linkable InformInfo Record objects used to
-*		generate the query response.
-*
 * SEE ALSO
 *	InformInfo Receiver object
 *********/
diff --git a/opensm/include/opensm/osm_sa_lft_record.h b/opensm/include/opensm/osm_sa_lft_record.h
index 18a43f4..8470490 100644
--- a/opensm/include/opensm/osm_sa_lft_record.h
+++ b/opensm/include/opensm/osm_sa_lft_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,7 +50,6 @@
 #define _OSM_LFTR_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -103,7 +102,6 @@ typedef struct _osm_lft {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_lftr_rcv_t;
 /*
 * FIELDS
@@ -125,10 +123,6 @@ typedef struct _osm_lft {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pool
-*		Pool of linkable Linear Forwarding Table Record objects used to
-*               generate the query response.
-*
 * SEE ALSO
 *	Linear Forwarding Table Receiver object
 *********/
diff --git a/opensm/include/opensm/osm_sa_link_record.h b/opensm/include/opensm/osm_sa_link_record.h
index 3104704..d09eb69 100644
--- a/opensm/include/opensm/osm_sa_link_record.h
+++ b/opensm/include/opensm/osm_sa_link_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -49,7 +49,6 @@
 #ifndef _OSM_LR_RCV_H_
 #define _OSM_LR_RCV_H_
 
-#include <complib/cl_qlockpool.h>
 #include <complib/cl_passivelock.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
@@ -102,7 +101,6 @@ typedef struct _osm_lr_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t lr_pool;
 } osm_lr_rcv_t;
 /*
 * FIELDS
@@ -121,9 +119,6 @@ typedef struct _osm_lr_rcv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	lr_pool
-*		Pool of link record objects used to generate the query response.
-*
 * SEE ALSO
 *********/
 
diff --git a/opensm/include/opensm/osm_sa_mcmember_record.h b/opensm/include/opensm/osm_sa_mcmember_record.h
index f13bc98..8540a89 100644
--- a/opensm/include/opensm/osm_sa_mcmember_record.h
+++ b/opensm/include/opensm/osm_sa_mcmember_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,7 +50,6 @@
 #define _OSM_MCMR_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -105,7 +104,6 @@ typedef struct _osm_mcmr {
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
 	uint16_t mlid_ho;
-	cl_qlock_pool_t pool;
 } osm_mcmr_recv_t;
 
 /*
diff --git a/opensm/include/opensm/osm_sa_mft_record.h b/opensm/include/opensm/osm_sa_mft_record.h
index dd14257..09b922d 100644
--- a/opensm/include/opensm/osm_sa_mft_record.h
+++ b/opensm/include/opensm/osm_sa_mft_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -49,7 +49,6 @@
 #define _OSM_MFTR_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -102,7 +101,6 @@ typedef struct _osm_mft {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_mftr_rcv_t;
 /*
 * FIELDS
@@ -124,10 +122,6 @@ typedef struct _osm_mft {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pool
-*		Pool of linkable Multicast Forwarding Table Record objects used to
-*               generate the query response.
-*
 * SEE ALSO
 *	Multicast Forwarding Table Receiver object
 *********/
diff --git a/opensm/include/opensm/osm_sa_multipath_record.h b/opensm/include/opensm/osm_sa_multipath_record.h
index 8fa1046..afd407d 100644
--- a/opensm/include/opensm/osm_sa_multipath_record.h
+++ b/opensm/include/opensm/osm_sa_multipath_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -49,8 +49,6 @@
 #define _OSM_MPR_RCV_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
-#include <complib/cl_qlockpool.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -103,7 +101,6 @@ typedef struct _osm_mpr_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pr_pool;
 } osm_mpr_rcv_t;
 /*
 * FIELDS
@@ -119,9 +116,6 @@ typedef struct _osm_mpr_rcv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pr_pool
-*		Pool of multipath record objects used to generate query responses.
-*
 * SEE ALSO
 *	MultiPath Record Receiver object
 *********/
diff --git a/opensm/include/opensm/osm_sa_node_record.h b/opensm/include/opensm/osm_sa_node_record.h
index 36eea27..8f385f8 100644
--- a/opensm/include/opensm/osm_sa_node_record.h
+++ b/opensm/include/opensm/osm_sa_node_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -49,7 +49,6 @@
 #ifndef _OSM_NR_H_
 #define _OSM_NR_H_
 
-#include <complib/cl_qlockpool.h>
 #include <complib/cl_passivelock.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
@@ -101,7 +100,6 @@ typedef struct _osm_nr_recv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_nr_rcv_t;
 /*
 * FIELDS
@@ -120,10 +118,6 @@ typedef struct _osm_nr_recv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pool
-*		Pool of linkable node record objects used to generate
-*		the query response.
-*
 * SEE ALSO
 *
 *********/
diff --git a/opensm/include/opensm/osm_sa_path_record.h b/opensm/include/opensm/osm_sa_path_record.h
index 88eb6c3..76d24fc 100644
--- a/opensm/include/opensm/osm_sa_path_record.h
+++ b/opensm/include/opensm/osm_sa_path_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,8 +50,6 @@
 #define _OSM_PR_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
-#include <complib/cl_qlockpool.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -104,7 +102,6 @@ typedef struct _osm_pr_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pr_pool;
 } osm_pr_rcv_t;
 /*
 * FIELDS
@@ -120,9 +117,6 @@ typedef struct _osm_pr_rcv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pr_pool
-*		Pool of path record objects used to generate query responses.
-*
 * SEE ALSO
 *	Path Record Receiver object
 *********/
diff --git a/opensm/include/opensm/osm_sa_pkey_record.h b/opensm/include/opensm/osm_sa_pkey_record.h
index 4242a2f..b2f43f0 100644
--- a/opensm/include/opensm/osm_sa_pkey_record.h
+++ b/opensm/include/opensm/osm_sa_pkey_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -37,7 +37,6 @@
 #define _OSM_PKEY_REC_RCV_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -89,7 +88,6 @@ typedef struct _osm_pkey_rec_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_pkey_rec_rcv_t;
 /*
 * FIELDS
@@ -108,10 +106,6 @@ typedef struct _osm_pkey_rec_rcv {
 *  p_lock
 *     Pointer to the serializing lock.
 *
-*  pool
-*     Pool of linkable P_Key Record objects used to generate
-*     the query response.
-*
 * SEE ALSO
 *
 *********/
diff --git a/opensm/include/opensm/osm_sa_portinfo_record.h b/opensm/include/opensm/osm_sa_portinfo_record.h
index 38eabdb..a818f25 100644
--- a/opensm/include/opensm/osm_sa_portinfo_record.h
+++ b/opensm/include/opensm/osm_sa_portinfo_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,7 +50,6 @@
 #define _OSM_PIR_RCV_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -102,7 +101,6 @@ typedef struct _osm_pir_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_pir_rcv_t;
 /*
 * FIELDS
@@ -121,10 +119,6 @@ typedef struct _osm_pir_rcv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pool
-*		Pool of linkable PortInfo Record objects used to generate
-*		the query response.
-*
 * SEE ALSO
 *
 *********/
diff --git a/opensm/include/opensm/osm_sa_service_record.h b/opensm/include/opensm/osm_sa_service_record.h
index 8884944..43859e0 100644
--- a/opensm/include/opensm/osm_sa_service_record.h
+++ b/opensm/include/opensm/osm_sa_service_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,9 +50,7 @@
 #define _OSM_SR_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <complib/cl_timer.h>
-#include <complib/cl_qlockpool.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -104,7 +102,6 @@ typedef struct _osm_sr_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t sr_pool;
 	cl_timer_t sr_timer;
 } osm_sr_rcv_t;
 /*
@@ -121,9 +118,6 @@ typedef struct _osm_sr_rcv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	sr_pool
-*		Pool of Service Record objects used to generate query responses.
-*
 * SEE ALSO
 *	Service Record Receiver object
 *********/
diff --git a/opensm/include/opensm/osm_sa_slvl_record.h b/opensm/include/opensm/osm_sa_slvl_record.h
index c72d5d4..518a0f1 100644
--- a/opensm/include/opensm/osm_sa_slvl_record.h
+++ b/opensm/include/opensm/osm_sa_slvl_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,7 +50,6 @@
 #define _OSM_SLVL_REC_RCV_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -102,7 +101,6 @@ typedef struct _osm_slvl_rec_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_slvl_rec_rcv_t;
 /*
 * FIELDS
@@ -121,10 +119,6 @@ typedef struct _osm_slvl_rec_rcv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pool
-*		Pool of linkable SLtoVL Mapping Record objects used to generate
-*		the query response.
-*
 * SEE ALSO
 *
 *********/
diff --git a/opensm/include/opensm/osm_sa_sminfo_record.h b/opensm/include/opensm/osm_sa_sminfo_record.h
index ce57925..f4fd1ff 100644
--- a/opensm/include/opensm/osm_sa_sminfo_record.h
+++ b/opensm/include/opensm/osm_sa_sminfo_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,7 +50,6 @@
 #define _OSM_SMIR_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -103,7 +102,6 @@ typedef struct _osm_smir {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_smir_rcv_t;
 /*
 * FIELDS
diff --git a/opensm/include/opensm/osm_sa_sw_info_record.h b/opensm/include/opensm/osm_sa_sw_info_record.h
index ad1f773..df6f842 100644
--- a/opensm/include/opensm/osm_sa_sw_info_record.h
+++ b/opensm/include/opensm/osm_sa_sw_info_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -103,7 +103,6 @@ typedef struct _osm_sir_rcv {
 	osm_req_t *p_req;
 	osm_state_mgr_t *p_state_mgr;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_sir_rcv_t;
 /*
 * FIELDS
diff --git a/opensm/include/opensm/osm_sa_vlarb_record.h b/opensm/include/opensm/osm_sa_vlarb_record.h
index e823880..1ed8554 100644
--- a/opensm/include/opensm/osm_sa_vlarb_record.h
+++ b/opensm/include/opensm/osm_sa_vlarb_record.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,7 +50,6 @@
 #define _OSM_VLARB_REC_RCV_H_
 
 #include <complib/cl_passivelock.h>
-#include <complib/cl_qlist.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_sa_response.h>
@@ -102,7 +101,6 @@ typedef struct _osm_vlarb_rec_rcv {
 	osm_mad_pool_t *p_mad_pool;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-	cl_qlock_pool_t pool;
 } osm_vlarb_rec_rcv_t;
 /*
 * FIELDS
@@ -121,10 +119,6 @@ typedef struct _osm_vlarb_rec_rcv {
 *	p_lock
 *		Pointer to the serializing lock.
 *
-*	pool
-*		Pool of linkable VLArbitration Record objects used to generate
-*		the query response.
-*
 * SEE ALSO
 *
 *********/
diff --git a/opensm/opensm/osm_sa_guidinfo_record.c b/opensm/opensm/osm_sa_guidinfo_record.c
index d955e93..a758888 100644
--- a/opensm/opensm/osm_sa_guidinfo_record.c
+++ b/opensm/opensm/osm_sa_guidinfo_record.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2006-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -62,11 +62,8 @@
 #include <opensm/osm_pkey.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_GIR_RCV_POOL_MIN_SIZE      32
-#define OSM_GIR_RCV_POOL_GROW_SIZE     32
-
 typedef struct _osm_gir_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_guidinfo_record_t rec;
 } osm_gir_item_t;
 
@@ -83,7 +80,6 @@ typedef struct _osm_gir_search_ctxt {
 void osm_gir_rcv_construct(IN osm_gir_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -91,7 +87,6 @@ void osm_gir_rcv_construct(IN osm_gir_rcv_t * const p_rcv)
 void osm_gir_rcv_destroy(IN osm_gir_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_gir_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -104,8 +99,6 @@ osm_gir_rcv_init(IN osm_gir_rcv_t * const p_rcv,
 		 IN osm_subn_t * const p_subn,
 		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_gir_rcv_init);
 
 	osm_gir_rcv_construct(p_rcv);
@@ -116,14 +109,8 @@ osm_gir_rcv_init(IN osm_gir_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_GIR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_GIR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_gir_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -142,11 +129,11 @@ __osm_gir_rcv_new_gir(IN osm_gir_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_gir_rcv_new_gir);
 
-	p_rec_item = (osm_gir_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_gir_rcv_new_gir: ERR 5102: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
@@ -158,7 +145,7 @@ __osm_gir_rcv_new_gir(IN osm_gir_rcv_t * const p_rcv,
 			cl_ntoh16(match_lid), block_num);
 	}
 
-	memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = match_lid;
 	p_rec_item->rec.block_num = block_num;
@@ -166,8 +153,7 @@ __osm_gir_rcv_new_gir(IN osm_gir_rcv_t * const p_rcv,
 		p_rec_item->rec.guid_info.guid[0] =
 		    osm_physp_get_port_guid(p_req_physp);
 
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -465,8 +451,7 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data)
 			    (osm_gir_item_t *) cl_qlist_remove_head(&rec_list);
 			while (p_rec_item !=
 			       (osm_gir_item_t *) cl_qlist_end(&rec_list)) {
-				cl_qlock_pool_put(&p_rcv->pool,
-						  &p_rec_item->pool_item);
+				free(p_rec_item);
 				p_rec_item = (osm_gir_item_t *)
 				    cl_qlist_remove_head(&rec_list);
 			}
@@ -514,7 +499,7 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_gir_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -559,10 +544,9 @@ void osm_gir_rcv_process(IN void *ctx, IN void *data)
 	for (i = 0; i < pre_trim_num_rec; i++) {
 		p_rec_item = (osm_gir_item_t *) cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_informinfo.c b/opensm/opensm/osm_sa_informinfo.c
index 71332dc..db58bc0 100644
--- a/opensm/opensm/osm_sa_informinfo.c
+++ b/opensm/opensm/osm_sa_informinfo.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -66,11 +66,8 @@
 #include <opensm/osm_inform.h>
 #include <opensm/osm_pkey.h>
 
-#define OSM_IIR_RCV_POOL_MIN_SIZE      32
-#define OSM_IIR_RCV_POOL_GROW_SIZE     32
-
 typedef struct _osm_iir_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_inform_info_record_t rec;
 } osm_iir_item_t;
 
@@ -89,7 +86,6 @@ typedef struct _osm_iir_search_ctxt {
 void osm_infr_rcv_construct(IN osm_infr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -99,7 +95,6 @@ void osm_infr_rcv_destroy(IN osm_infr_rcv_t * const p_rcv)
 	CL_ASSERT(p_rcv);
 
 	OSM_LOG_ENTER(p_rcv->p_log, osm_infr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -112,8 +107,6 @@ osm_infr_rcv_init(IN osm_infr_rcv_t * const p_rcv,
 		  IN osm_subn_t * const p_subn,
 		  IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status = IB_ERROR;
-
 	OSM_LOG_ENTER(p_log, osm_infr_rcv_init);
 
 	osm_infr_rcv_construct(p_rcv);
@@ -124,14 +117,8 @@ osm_infr_rcv_init(IN osm_infr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_IIR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_IIR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_iir_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -392,18 +379,17 @@ __osm_sa_inform_info_rec_by_comp_mask(IN osm_infr_rcv_t * const p_rcv,
 		goto Exit;
 	}
 
-	p_rec_item = (osm_iir_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_sa_inform_info_rec_by_comp_mask: ERR 430E: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		goto Exit;
 	}
 
 	memcpy((void *)&p_rec_item->rec, (void *)&p_infr->inform_record,
 	       sizeof(ib_inform_info_record_t));
-	cl_qlist_insert_tail(p_ctxt->p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -519,8 +505,7 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 			    (osm_iir_item_t *) cl_qlist_remove_head(&rec_list);
 			while (p_rec_item !=
 			       (osm_iir_item_t *) cl_qlist_end(&rec_list)) {
-				cl_qlock_pool_put(&p_rcv->pool,
-						  &p_rec_item->pool_item);
+				free(p_rec_item);
 				p_rec_item = (osm_iir_item_t *)
 				    cl_qlist_remove_head(&rec_list);
 			}
@@ -565,7 +550,7 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_iir_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -619,7 +604,7 @@ osm_infr_rcv_process_get_method(IN osm_infr_rcv_t * const p_rcv,
 			for (j = 0; j < 4; j++)
 				p_resp_rec->pad[j] = 0;
 		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_lft_record.c b/opensm/opensm/osm_sa_lft_record.c
index e645569..5f3f208 100644
--- a/opensm/opensm/osm_sa_lft_record.c
+++ b/opensm/opensm/osm_sa_lft_record.c
@@ -60,11 +60,8 @@
 #include <opensm/osm_pkey.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_LFTR_RCV_POOL_MIN_SIZE      32
-#define OSM_LFTR_RCV_POOL_GROW_SIZE     32
-
 typedef struct _osm_lftr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_lft_record_t rec;
 } osm_lftr_item_t;
 
@@ -81,7 +78,6 @@ typedef struct _osm_lftr_search_ctxt {
 void osm_lftr_rcv_construct(IN osm_lftr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -89,7 +85,6 @@ void osm_lftr_rcv_construct(IN osm_lftr_rcv_t * const p_rcv)
 void osm_lftr_rcv_destroy(IN osm_lftr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_lftr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -102,8 +97,6 @@ osm_lftr_rcv_init(IN osm_lftr_rcv_t * const p_rcv,
 		  IN osm_subn_t * const p_subn,
 		  IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_lftr_rcv_init);
 
 	osm_lftr_rcv_construct(p_rcv);
@@ -114,14 +107,8 @@ osm_lftr_rcv_init(IN osm_lftr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_LFTR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_LFTR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_lftr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -137,16 +124,16 @@ __osm_lftr_rcv_new_lftr(IN osm_lftr_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_lftr_rcv_new_lftr);
 
-	p_rec_item = (osm_lftr_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_lftr_rcv_new_lftr: ERR 4402: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
+	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_lftr_rcv_new_lftr: "
 			"New LinearForwardingTable: sw 0x%016" PRIx64
@@ -154,9 +141,8 @@ __osm_lftr_rcv_new_lftr(IN osm_lftr_rcv_t * const p_rcv,
 			cl_ntoh64(osm_node_get_node_guid(p_sw->p_node)),
 			cl_ntoh16(block), cl_ntoh16(lid)
 		    );
-	}
 
-	memset(&p_rec_item->rec, 0, sizeof(ib_lft_record_t));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = lid;
 	p_rec_item->rec.block_num = block;
@@ -165,8 +151,7 @@ __osm_lftr_rcv_new_lftr(IN osm_lftr_rcv_t * const p_rcv,
 	osm_switch_get_fwd_tbl_block(p_sw, cl_ntoh16(block),
 				     p_rec_item->rec.lft);
 
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -369,8 +354,7 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 			    (osm_lftr_item_t *) cl_qlist_remove_head(&rec_list);
 			while (p_rec_item !=
 			       (osm_lftr_item_t *) cl_qlist_end(&rec_list)) {
-				cl_qlock_pool_put(&p_rcv->pool,
-						  &p_rec_item->pool_item);
+				free(p_rec_item);
 				p_rec_item = (osm_lftr_item_t *)
 				    cl_qlist_remove_head(&rec_list);
 			}
@@ -418,7 +402,7 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_lftr_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -465,10 +449,9 @@ void osm_lftr_rcv_process(IN void *ctx, IN void *data)
 		p_rec_item =
 		    (osm_lftr_item_t *) cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_link_record.c b/opensm/opensm/osm_sa_link_record.c
index a6bdc8f..ba239be 100644
--- a/opensm/opensm/osm_sa_link_record.c
+++ b/opensm/opensm/osm_sa_link_record.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -61,11 +61,8 @@
 #include <opensm/osm_pkey.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_LR_RCV_POOL_MIN_SIZE    64
-#define OSM_LR_RCV_POOL_GROW_SIZE   64
-
 typedef struct _osm_lr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_link_record_t link_rec;
 } osm_lr_item_t;
 
@@ -74,7 +71,6 @@ typedef struct _osm_lr_item {
 void osm_lr_rcv_construct(IN osm_lr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->lr_pool);
 }
 
 /**********************************************************************
@@ -82,7 +78,6 @@ void osm_lr_rcv_construct(IN osm_lr_rcv_t * const p_rcv)
 void osm_lr_rcv_destroy(IN osm_lr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_lr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->lr_pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -95,8 +90,6 @@ osm_lr_rcv_init(IN osm_lr_rcv_t * const p_rcv,
 		IN osm_subn_t * const p_subn,
 		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status = IB_SUCCESS;
-
 	OSM_LOG_ENTER(p_log, osm_lr_rcv_init);
 
 	osm_lr_rcv_construct(p_rcv);
@@ -107,14 +100,8 @@ osm_lr_rcv_init(IN osm_lr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->lr_pool,
-				    OSM_LR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_LR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_lr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -128,7 +115,7 @@ __osm_lr_rcv_build_physp_link(IN osm_lr_rcv_t * const p_rcv,
 {
 	osm_lr_item_t *p_lr_item;
 
-	p_lr_item = (osm_lr_item_t *) cl_qlock_pool_get(&p_rcv->lr_pool);
+	p_lr_item = malloc(sizeof(*p_lr_item));
 	if (p_lr_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_lr_rcv_build_physp_link: ERR 1801: "
@@ -141,13 +128,14 @@ __osm_lr_rcv_build_physp_link(IN osm_lr_rcv_t * const p_rcv,
 			cl_ntoh16(from_lid), cl_ntoh16(to_lid));
 		return;
 	}
+	memset(p_lr_item, 0, sizeof(*p_lr_item));
 
 	p_lr_item->link_rec.from_port_num = from_port;
 	p_lr_item->link_rec.to_port_num = to_port;
 	p_lr_item->link_rec.to_lid = to_lid;
 	p_lr_item->link_rec.from_lid = from_lid;
 
-	cl_qlist_insert_tail(p_list, (cl_list_item_t *) & p_lr_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_lr_item->list_item);
 }
 
 /**********************************************************************
@@ -560,8 +548,7 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv,
 		/* need to set the mem free ... */
 		p_lr_item = (osm_lr_item_t *) cl_qlist_remove_head(p_list);
 		while (p_lr_item != (osm_lr_item_t *) cl_qlist_end(p_list)) {
-			cl_qlock_pool_put(&p_rcv->lr_pool,
-					  &p_lr_item->pool_item);
+			free(p_lr_item);
 			p_lr_item =
 			    (osm_lr_item_t *) cl_qlist_remove_head(p_list);
 		}
@@ -600,8 +587,7 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv,
 		/* Release the quick pool items */
 		p_lr_item = (osm_lr_item_t *) cl_qlist_remove_head(p_list);
 		while (p_lr_item != (osm_lr_item_t *) cl_qlist_end(p_list)) {
-			cl_qlock_pool_put(&p_rcv->lr_pool,
-					  &p_lr_item->pool_item);
+			free(p_lr_item);
 			p_lr_item =
 			    (osm_lr_item_t *) cl_qlist_remove_head(p_list);
 		}
@@ -654,8 +640,7 @@ __osm_lr_rcv_respond(IN osm_lr_rcv_t * const p_rcv,
 				*p_resp_lr = p_lr_item->link_rec;
 				num_copied++;
 			}
-			cl_qlock_pool_put(&p_rcv->lr_pool,
-					  &p_lr_item->pool_item);
+			free(p_lr_item);
 			p_resp_lr++;
 			p_lr_item =
 			    (osm_lr_item_t *) cl_qlist_remove_head(p_list);
diff --git a/opensm/opensm/osm_sa_mcmember_record.c b/opensm/opensm/osm_sa_mcmember_record.c
index 5d5fb8d..ddb1ca5 100644
--- a/opensm/opensm/osm_sa_mcmember_record.c
+++ b/opensm/opensm/osm_sa_mcmember_record.c
@@ -70,11 +70,8 @@
 #include <opensm/osm_inform.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_MCMR_RCV_POOL_MIN_SIZE     32
-#define OSM_MCMR_RCV_POOL_GROW_SIZE    32
-
 typedef struct _osm_mcmr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_member_rec_t rec;
 } osm_mcmr_item_t;
 
@@ -93,7 +90,6 @@ typedef struct osm_sa_mcmr_search_ctxt {
 void osm_mcmr_rcv_construct(IN osm_mcmr_recv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -103,9 +99,6 @@ void osm_mcmr_rcv_destroy(IN osm_mcmr_recv_t * const p_rcv)
 	CL_ASSERT(p_rcv);
 
 	OSM_LOG_ENTER(p_rcv->p_log, osm_mcmr_rcv_destroy);
-
-	cl_qlock_pool_destroy(&p_rcv->pool);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -119,8 +112,6 @@ osm_mcmr_rcv_init(IN osm_sm_t * const p_sm,
 		  IN osm_subn_t * const p_subn,
 		  IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status = IB_SUCCESS;
-
 	OSM_LOG_ENTER(p_log, osm_mcmr_rcv_init);
 
 	osm_mcmr_rcv_construct(p_rcv);
@@ -133,18 +124,8 @@ osm_mcmr_rcv_init(IN osm_sm_t * const p_sm,
 	p_rcv->p_mad_pool = p_mad_pool;
 	p_rcv->mlid_ho = 0xC000;
 
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_MCMR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_MCMR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_mcmr_item_t), NULL, NULL, NULL);
-	if (status != CL_SUCCESS) {
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
-			"osm_mcmr_rcv_init: ERR 1B02: "
-			"qlock pool init failed (%d)\n", status);
-	}
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -1727,22 +1708,21 @@ __osm_mcmr_rcv_new_mcmr(IN osm_mcmr_recv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_mcmr_rcv_new_mcmr);
 
-	p_rec_item = (osm_mcmr_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_mcmr_rcv_new_mcmr: ERR 1B15: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
 
-	memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	/* HACK: Untrusted requesters should result with 0 Join
 	   State, Port Guid, and Proxy */
 	p_rec_item->rec = *p_rcvd_rec;
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -2023,7 +2003,7 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv,
 		    (osm_mcmr_item_t *) cl_qlist_remove_head(&rec_list);
 		while (p_rec_item !=
 		       (osm_mcmr_item_t *) cl_qlist_end(&rec_list)) {
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 			p_rec_item =
 			    (osm_mcmr_item_t *) cl_qlist_remove_head(&rec_list);
 		}
@@ -2070,7 +2050,7 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv,
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_mcmr_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -2134,7 +2114,7 @@ __osm_mcmr_query_mgrp(IN osm_mcmr_recv_t * const p_rcv,
 				p_resp_rec->proxy_join = 0;
 			}
 		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_mft_record.c b/opensm/opensm/osm_sa_mft_record.c
index 3968304..f9ac527 100644
--- a/opensm/opensm/osm_sa_mft_record.c
+++ b/opensm/opensm/osm_sa_mft_record.c
@@ -59,11 +59,8 @@
 #include <opensm/osm_pkey.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_MFTR_RCV_POOL_MIN_SIZE      32
-#define OSM_MFTR_RCV_POOL_GROW_SIZE     32
-
 typedef struct _osm_mftr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_mft_record_t rec;
 } osm_mftr_item_t;
 
@@ -80,7 +77,6 @@ typedef struct _osm_mftr_search_ctxt {
 void osm_mftr_rcv_construct(IN osm_mftr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -88,7 +84,6 @@ void osm_mftr_rcv_construct(IN osm_mftr_rcv_t * const p_rcv)
 void osm_mftr_rcv_destroy(IN osm_mftr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_mftr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -101,8 +96,6 @@ osm_mftr_rcv_init(IN osm_mftr_rcv_t * const p_rcv,
 		  IN osm_subn_t * const p_subn,
 		  IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_mftr_rcv_init);
 
 	osm_mftr_rcv_construct(p_rcv);
@@ -113,14 +106,8 @@ osm_mftr_rcv_init(IN osm_mftr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_MFTR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_MFTR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_mftr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -138,11 +125,11 @@ __osm_mftr_rcv_new_mftr(IN osm_mftr_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_mftr_rcv_new_mftr);
 
-	p_rec_item = (osm_mftr_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_mftr_rcv_new_mftr: ERR 4A02: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
@@ -160,7 +147,7 @@ __osm_mftr_rcv_new_mftr(IN osm_mftr_rcv_t * const p_rcv,
 	position_block_num = ((uint16_t) position << 12) |
 	    (block & IB_MCAST_BLOCK_ID_MASK_HO);
 
-	memset(&p_rec_item->rec, 0, sizeof(ib_mft_record_t));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = lid;
 	p_rec_item->rec.position_block_num = cl_hton16(position_block_num);
@@ -168,8 +155,7 @@ __osm_mftr_rcv_new_mftr(IN osm_mftr_rcv_t * const p_rcv,
 	/* copy the mft block */
 	osm_switch_get_mft_block(p_sw, block, position, p_rec_item->rec.mft);
 
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -399,8 +385,7 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 			    (osm_mftr_item_t *) cl_qlist_remove_head(&rec_list);
 			while (p_rec_item !=
 			       (osm_mftr_item_t *) cl_qlist_end(&rec_list)) {
-				cl_qlock_pool_put(&p_rcv->pool,
-						  &p_rec_item->pool_item);
+				free(p_rec_item);
 				p_rec_item = (osm_mftr_item_t *)
 				    cl_qlist_remove_head(&rec_list);
 			}
@@ -448,7 +433,7 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_mftr_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -495,10 +480,9 @@ void osm_mftr_rcv_process(IN void *ctx, IN void *data)
 		p_rec_item =
 		    (osm_mftr_item_t *) cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_multipath_record.c b/opensm/opensm/osm_sa_multipath_record.c
index efc6a07..4fd30c2 100644
--- a/opensm/opensm/osm_sa_multipath_record.c
+++ b/opensm/opensm/osm_sa_multipath_record.c
@@ -67,13 +67,10 @@
 #include <opensm/osm_qos_policy.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_MPR_RCV_POOL_MIN_SIZE	64
-#define OSM_MPR_RCV_POOL_GROW_SIZE	64
-
 #define OSM_SA_MPR_MAX_NUM_PATH        127
 
 typedef struct _osm_mpr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	const osm_port_t *p_src_port;
 	const osm_port_t *p_dest_port;
 	int hops;
@@ -95,7 +92,6 @@ typedef struct _osm_path_parms {
 void osm_mpr_rcv_construct(IN osm_mpr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pr_pool);
 }
 
 /**********************************************************************
@@ -103,7 +99,6 @@ void osm_mpr_rcv_construct(IN osm_mpr_rcv_t * const p_rcv)
 void osm_mpr_rcv_destroy(IN osm_mpr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_mpr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pr_pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -116,8 +111,6 @@ osm_mpr_rcv_init(IN osm_mpr_rcv_t * const p_rcv,
 		 IN osm_subn_t * const p_subn,
 		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_mpr_rcv_init);
 
 	osm_mpr_rcv_construct(p_rcv);
@@ -128,14 +121,8 @@ osm_mpr_rcv_init(IN osm_mpr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pr_pool,
-				    OSM_MPR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_MPR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_mpr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -905,20 +892,21 @@ __osm_mpr_rcv_get_lid_pair_path(IN osm_mpr_rcv_t * const p_rcv,
 			"Src LID 0x%X, Dest LID 0x%X\n",
 			src_lid_ho, dest_lid_ho);
 
-	p_pr_item = (osm_mpr_item_t *) cl_qlock_pool_get(&p_rcv->pr_pool);
+	p_pr_item = malloc(sizeof(*p_pr_item));
 	if (p_pr_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_mpr_rcv_get_lid_pair_path: ERR 4501: "
 			"Unable to allocate path record\n");
 		goto Exit;
 	}
+	memset(p_pr_item, 0, sizeof(*p_pr_item));
 
 	status = __osm_mpr_rcv_get_path_parms(p_rcv, p_mpr, p_src_port,
 					      p_dest_port, dest_lid_ho,
 					      comp_mask, &path_parms);
 
 	if (status != IB_SUCCESS) {
-		cl_qlock_pool_put(&p_rcv->pr_pool, &p_pr_item->pool_item);
+		free(p_pr_item);
 		p_pr_item = NULL;
 		goto Exit;
 	}
@@ -942,8 +930,7 @@ __osm_mpr_rcv_get_lid_pair_path(IN osm_mpr_rcv_t * const p_rcv,
 				"__osm_mpr_rcv_get_lid_pair_path: "
 				"Requested reversible path but failed to get one\n");
 
-			cl_qlock_pool_put(&p_rcv->pr_pool,
-					  &p_pr_item->pool_item);
+			free(p_pr_item);
 			p_pr_item = NULL;
 			goto Exit;
 		}
@@ -1084,9 +1071,7 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv,
 							    preference);
 
 		if (p_pr_item) {
-			cl_qlist_insert_tail(p_list,
-					     (cl_list_item_t *) & p_pr_item->
-					     pool_item);
+			cl_qlist_insert_tail(p_list, &p_pr_item->list_item);
 			++path_num;
 		}
 
@@ -1152,9 +1137,7 @@ __osm_mpr_rcv_get_port_pair_paths(IN osm_mpr_rcv_t * const p_rcv,
 							    preference);
 
 		if (p_pr_item) {
-			cl_qlist_insert_tail(p_list,
-					     (cl_list_item_t *) & p_pr_item->
-					     pool_item);
+			cl_qlist_insert_tail(p_list, &p_pr_item->list_item);
 			++path_num;
 		}
 	}
@@ -1471,14 +1454,10 @@ __osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
 			matrix[0][0]->path_rec.dlid, matrix[0][0]->hops,
 			matrix[1][1]->path_rec.slid,
 			matrix[1][1]->path_rec.dlid, matrix[1][1]->hops);
-		cl_qlist_insert_tail(p_list,
-				     (cl_list_item_t *) & matrix[0][0]->
-				     pool_item);
-		cl_qlist_insert_tail(p_list,
-				     (cl_list_item_t *) & matrix[1][1]->
-				     pool_item);
-		cl_qlock_pool_put(&p_rcv->pr_pool, &matrix[0][1]->pool_item);
-		cl_qlock_pool_put(&p_rcv->pr_pool, &matrix[1][0]->pool_item);
+		cl_qlist_insert_tail(p_list, &matrix[0][0]->list_item);
+		cl_qlist_insert_tail(p_list, &matrix[1][1]->list_item);
+		free(matrix[0][1]);
+		free(matrix[1][0]);
 	} else {
 		/* Diag B */
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
@@ -1489,14 +1468,10 @@ __osm_mpr_rcv_get_apm_paths(IN osm_mpr_rcv_t * const p_rcv,
 			matrix[0][1]->path_rec.dlid, matrix[0][1]->hops,
 			matrix[1][0]->path_rec.slid,
 			matrix[1][0]->path_rec.dlid, matrix[1][0]->hops);
-		cl_qlist_insert_tail(p_list,
-				     (cl_list_item_t *) & matrix[0][1]->
-				     pool_item);
-		cl_qlist_insert_tail(p_list,
-				     (cl_list_item_t *) & matrix[1][0]->
-				     pool_item);
-		cl_qlock_pool_put(&p_rcv->pr_pool, &matrix[0][0]->pool_item);
-		cl_qlock_pool_put(&p_rcv->pr_pool, &matrix[1][1]->pool_item);
+		cl_qlist_insert_tail(p_list, &matrix[0][1]->list_item);
+		cl_qlist_insert_tail(p_list, &matrix[1][0]->list_item);
+		free(matrix[0][0]);
+		free(matrix[1][1]);
 	}
 
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -1598,8 +1573,7 @@ __osm_mpr_rcv_respond(IN osm_mpr_rcv_t * const p_rcv,
 		for (i = 0; i < num_rec; i++) {
 			p_mpr_item =
 			    (osm_mpr_item_t *) cl_qlist_remove_head(p_list);
-			cl_qlock_pool_put(&p_rcv->pr_pool,
-					  &p_mpr_item->pool_item);
+			free(p_mpr_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -1634,7 +1608,7 @@ __osm_mpr_rcv_respond(IN osm_mpr_rcv_t * const p_rcv,
 		/* Copy the Path Records from the list into the MAD */
 		*p_resp_pr = p_mpr_item->path_rec;
 
-		cl_qlock_pool_put(&p_rcv->pr_pool, &p_mpr_item->pool_item);
+		free(p_mpr_item);
 		p_resp_pr++;
 	}
 
diff --git a/opensm/opensm/osm_sa_node_record.c b/opensm/opensm/osm_sa_node_record.c
index b94d005..e78e827 100644
--- a/opensm/opensm/osm_sa_node_record.c
+++ b/opensm/opensm/osm_sa_node_record.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -60,11 +60,8 @@
 #include <opensm/osm_pkey.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_NR_RCV_POOL_MIN_SIZE    32
-#define OSM_NR_RCV_POOL_GROW_SIZE   32
-
 typedef struct _osm_nr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_node_record_t rec;
 } osm_nr_item_t;
 
@@ -81,7 +78,6 @@ typedef struct _osm_nr_search_ctxt {
 void osm_nr_rcv_construct(IN osm_nr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -89,7 +85,6 @@ void osm_nr_rcv_construct(IN osm_nr_rcv_t * const p_rcv)
 void osm_nr_rcv_destroy(IN osm_nr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_nr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -102,8 +97,6 @@ osm_nr_rcv_init(IN osm_nr_rcv_t * const p_rcv,
 		IN osm_subn_t * const p_subn,
 		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_nr_rcv_init);
 
 	osm_nr_rcv_construct(p_rcv);
@@ -114,14 +107,8 @@ osm_nr_rcv_init(IN osm_nr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_NR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_NR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_nr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -137,16 +124,16 @@ __osm_nr_rcv_new_nr(IN osm_nr_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_nr_rcv_new_nr);
 
-	p_rec_item = (osm_nr_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_nr_rcv_new_nr: ERR 1D02: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
+	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_nr_rcv_new_nr: "
 			"New NodeRecord: node 0x%016" PRIx64
@@ -154,9 +141,8 @@ __osm_nr_rcv_new_nr(IN osm_nr_rcv_t * const p_rcv,
 			cl_ntoh64(osm_node_get_node_guid(p_node)),
 			cl_ntoh64(port_guid), cl_ntoh16(lid)
 		    );
-	}
 
-	memset(&p_rec_item->rec, 0, sizeof(ib_node_record_t));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = lid;
 
@@ -164,8 +150,7 @@ __osm_nr_rcv_new_nr(IN osm_nr_rcv_t * const p_rcv,
 	p_rec_item->rec.node_info.port_guid = port_guid;
 	memcpy(&(p_rec_item->rec.node_desc), &(p_node->node_desc),
 	       IB_NODE_DESCRIPTION_SIZE);
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -459,7 +444,7 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data)
 		/* need to set the mem free ... */
 		p_rec_item = (osm_nr_item_t *) cl_qlist_remove_head(&rec_list);
 		while (p_rec_item != (osm_nr_item_t *) cl_qlist_end(&rec_list)) {
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 			p_rec_item =
 			    (osm_nr_item_t *) cl_qlist_remove_head(&rec_list);
 		}
@@ -506,7 +491,7 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_nr_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -551,10 +536,9 @@ void osm_nr_rcv_process(IN void *ctx, IN void *data)
 	for (i = 0; i < pre_trim_num_rec; i++) {
 		p_rec_item = (osm_nr_item_t *) cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
index f46a3be..aae87d4 100644
--- a/opensm/opensm/osm_sa_path_record.c
+++ b/opensm/opensm/osm_sa_path_record.c
@@ -73,15 +73,12 @@
 #include <opensm/osm_sa_mcmember_record.h>
 #include <opensm/osm_prefix_route.h>
 
-#define OSM_PR_RCV_POOL_MIN_SIZE    64
-#define OSM_PR_RCV_POOL_GROW_SIZE   64
-
 extern uint8_t osm_get_lash_sl(osm_opensm_t * p_osm,
 			       const osm_port_t * p_src_port,
 			       const osm_port_t * p_dst_port);
 
 typedef struct _osm_pr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_path_rec_t path_rec;
 } osm_pr_item_t;
 
@@ -111,7 +108,6 @@ static const ib_gid_t zero_gid = { {0x00, 0x00, 0x00, 0x00,
 void osm_pr_rcv_construct(IN osm_pr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pr_pool);
 }
 
 /**********************************************************************
@@ -119,7 +115,6 @@ void osm_pr_rcv_construct(IN osm_pr_rcv_t * const p_rcv)
 void osm_pr_rcv_destroy(IN osm_pr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_pr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pr_pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -132,8 +127,6 @@ osm_pr_rcv_init(IN osm_pr_rcv_t * const p_rcv,
 		IN osm_subn_t * const p_subn,
 		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_pr_rcv_init);
 
 	osm_pr_rcv_construct(p_rcv);
@@ -144,14 +137,8 @@ osm_pr_rcv_init(IN osm_pr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pr_pool,
-				    OSM_PR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_PR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_pr_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -939,20 +926,21 @@ __osm_pr_rcv_get_lid_pair_path(IN osm_pr_rcv_t * const p_rcv,
 			"Src LID 0x%X, Dest LID 0x%X\n",
 			src_lid_ho, dest_lid_ho);
 
-	p_pr_item = (osm_pr_item_t *) cl_qlock_pool_get(&p_rcv->pr_pool);
+	p_pr_item = malloc(sizeof(*p_pr_item));
 	if (p_pr_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_pr_rcv_get_lid_pair_path: ERR 1F01: "
 			"Unable to allocate path record\n");
 		goto Exit;
 	}
+	memset(p_pr_item, 0, sizeof(*p_pr_item));
 
 	status = __osm_pr_rcv_get_path_parms(p_rcv, p_pr, p_src_port,
 					     p_dest_port, dest_lid_ho,
 					     comp_mask, &path_parms);
 
 	if (status != IB_SUCCESS) {
-		cl_qlock_pool_put(&p_rcv->pr_pool, &p_pr_item->pool_item);
+		free(p_pr_item);
 		p_pr_item = NULL;
 		goto Exit;
 	}
@@ -976,8 +964,7 @@ __osm_pr_rcv_get_lid_pair_path(IN osm_pr_rcv_t * const p_rcv,
 				"__osm_pr_rcv_get_lid_pair_path: "
 				"Requested reversible path but failed to get one\n");
 
-			cl_qlock_pool_put(&p_rcv->pr_pool,
-					  &p_pr_item->pool_item);
+			free(p_pr_item);
 			p_pr_item = NULL;
 			goto Exit;
 		}
@@ -1158,9 +1145,7 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 							   preference);
 
 		if (p_pr_item) {
-			cl_qlist_insert_tail(p_list,
-					     (cl_list_item_t *) & p_pr_item->
-					     pool_item);
+			cl_qlist_insert_tail(p_list, &p_pr_item->list_item);
 			++path_num;
 		}
 
@@ -1226,9 +1211,7 @@ __osm_pr_rcv_get_port_pair_paths(IN osm_pr_rcv_t * const p_rcv,
 							   preference);
 
 		if (p_pr_item) {
-			cl_qlist_insert_tail(p_list,
-					     (cl_list_item_t *) & p_pr_item->
-					     pool_item);
+			cl_qlist_insert_tail(p_list, &p_pr_item->list_item);
 			++path_num;
 		}
 	}
@@ -1861,8 +1844,7 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 			    (osm_pr_item_t *) cl_qlist_remove_head(p_list);
 			while (p_pr_item !=
 			       (osm_pr_item_t *) cl_qlist_end(p_list)) {
-				cl_qlock_pool_put(&p_rcv->pr_pool,
-						  &p_pr_item->pool_item);
+				free(p_pr_item);
 				p_pr_item = (osm_pr_item_t *)
 				    cl_qlist_remove_head(p_list);
 			}
@@ -1907,8 +1889,7 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 		for (i = 0; i < num_rec; i++) {
 			p_pr_item =
 			    (osm_pr_item_t *) cl_qlist_remove_head(p_list);
-			cl_qlock_pool_put(&p_rcv->pr_pool,
-					  &p_pr_item->pool_item);
+			free(p_pr_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -1949,7 +1930,7 @@ __osm_pr_rcv_respond(IN osm_pr_rcv_t * const p_rcv,
 		if (i < num_rec)
 			*p_resp_pr = p_pr_item->path_rec;
 
-		cl_qlock_pool_put(&p_rcv->pr_pool, &p_pr_item->pool_item);
+		free(p_pr_item);
 		p_resp_pr++;
 	}
 
@@ -2113,14 +2094,14 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 			goto Unlock;
 		}
 
-		p_pr_item =
-		    (osm_pr_item_t *) cl_qlock_pool_get(&p_rcv->pr_pool);
+		p_pr_item = malloc(sizeof(*p_pr_item));
 		if (p_pr_item == NULL) {
 			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 				"osm_pr_rcv_process: ERR 1F18: "
 				"Unable to allocate path record for MC group\n");
 			goto Unlock;
 		}
+		memset(p_pr_item, 0, sizeof(*p_pr_item));
 
 		/* Copy PathRecord request into response */
 		p_sa_mad = osm_madw_get_sa_mad_ptr(p_madw);
@@ -2157,9 +2138,7 @@ void osm_pr_rcv_process(IN void *context, IN void *data)
 		p_pr_item->path_rec.hop_flow_raw =
 			cl_hton32(hop_limit) | (flow_label << 8);
 
-		cl_qlist_insert_tail(&pr_list, (cl_list_item_t *)
-				     & p_pr_item->pool_item);
-
+		cl_qlist_insert_tail(&pr_list, &p_pr_item->list_item);
 	}
 
       Unlock:
diff --git a/opensm/opensm/osm_sa_pkey_record.c b/opensm/opensm/osm_sa_pkey_record.c
index 4402b94..1e9f50f 100644
--- a/opensm/opensm/osm_sa_pkey_record.c
+++ b/opensm/opensm/osm_sa_pkey_record.c
@@ -51,11 +51,8 @@
 #include <opensm/osm_pkey.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_PKEY_REC_RCV_POOL_MIN_SIZE      32
-#define OSM_PKEY_REC_RCV_POOL_GROW_SIZE     32
-
 typedef struct _osm_pkey_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_pkey_table_record_t rec;
 } osm_pkey_item_t;
 
@@ -73,7 +70,6 @@ typedef struct _osm_pkey_search_ctxt {
 void osm_pkey_rec_rcv_construct(IN osm_pkey_rec_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -81,7 +77,6 @@ void osm_pkey_rec_rcv_construct(IN osm_pkey_rec_rcv_t * const p_rcv)
 void osm_pkey_rec_rcv_destroy(IN osm_pkey_rec_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_pkey_rec_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -94,8 +89,6 @@ osm_pkey_rec_rcv_init(IN osm_pkey_rec_rcv_t * const p_rcv,
 		      IN osm_subn_t * const p_subn,
 		      IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_pkey_rec_rcv_init);
 
 	osm_pkey_rec_rcv_construct(p_rcv);
@@ -106,15 +99,8 @@ osm_pkey_rec_rcv_init(IN osm_pkey_rec_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	/* used for matching records collection */
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_PKEY_REC_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_PKEY_REC_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_pkey_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -131,11 +117,11 @@ __osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_pkey_create);
 
-	p_rec_item = (osm_pkey_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_sa_pkey_create: ERR 4602: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
@@ -154,7 +140,7 @@ __osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv,
 			cl_ntoh16(lid), osm_physp_get_port_num(p_physp), block);
 	}
 
-	memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = lid;
 	p_rec_item->rec.block_num = block;
@@ -162,8 +148,7 @@ __osm_sa_pkey_create(IN osm_pkey_rec_rcv_t * const p_rcv,
 	p_rec_item->rec.pkey_tbl =
 	    *(osm_pkey_tbl_block_get(osm_physp_get_pkey_tbl(p_physp), block));
 
-	cl_qlist_insert_tail(p_ctxt->p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -444,8 +429,7 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data)
 			    (osm_pkey_item_t *) cl_qlist_remove_head(&rec_list);
 			while (p_rec_item !=
 			       (osm_pkey_item_t *) cl_qlist_end(&rec_list)) {
-				cl_qlock_pool_put(&p_rcv->pool,
-						  &p_rec_item->pool_item);
+				free(p_rec_item);
 				p_rec_item = (osm_pkey_item_t *)
 				    cl_qlist_remove_head(&rec_list);
 			}
@@ -494,7 +478,7 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_pkey_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -541,10 +525,9 @@ void osm_pkey_rec_rcv_process(IN void *ctx, IN void *data)
 		p_rec_item =
 		    (osm_pkey_item_t *) cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_portinfo_record.c b/opensm/opensm/osm_sa_portinfo_record.c
index 22869e6..ed3684c 100644
--- a/opensm/opensm/osm_sa_portinfo_record.c
+++ b/opensm/opensm/osm_sa_portinfo_record.c
@@ -64,11 +64,8 @@
 #include <opensm/osm_pkey.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_PIR_RCV_POOL_MIN_SIZE      32
-#define OSM_PIR_RCV_POOL_GROW_SIZE     32
-
 typedef struct _osm_pir_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_portinfo_record_t rec;
 } osm_pir_item_t;
 
@@ -86,7 +83,6 @@ typedef struct _osm_pir_search_ctxt {
 void osm_pir_rcv_construct(IN osm_pir_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -94,7 +90,6 @@ void osm_pir_rcv_construct(IN osm_pir_rcv_t * const p_rcv)
 void osm_pir_rcv_destroy(IN osm_pir_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_pir_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -107,8 +102,6 @@ osm_pir_rcv_init(IN osm_pir_rcv_t * const p_rcv,
 		 IN osm_subn_t * const p_subn,
 		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_pir_rcv_init);
 
 	osm_pir_rcv_construct(p_rcv);
@@ -119,14 +112,8 @@ osm_pir_rcv_init(IN osm_pir_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_PIR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_PIR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_pir_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -141,32 +128,30 @@ __osm_pir_rcv_new_pir(IN osm_pir_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_pir_rcv_new_pir);
 
-	p_rec_item = (osm_pir_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_pir_rcv_new_pir: ERR 2102: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
+	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_pir_rcv_new_pir: "
 			"New PortInfoRecord: port 0x%016" PRIx64
 			", lid 0x%X, port 0x%X\n",
 			cl_ntoh64(osm_physp_get_port_guid(p_physp)),
 			cl_ntoh16(lid), osm_physp_get_port_num(p_physp));
-	}
 
-	memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = lid;
 	p_rec_item->rec.port_info = p_physp->port_info;
 	p_rec_item->rec.port_num = osm_physp_get_port_num(p_physp);
 
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -676,8 +661,7 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data)
 			    (osm_pir_item_t *) cl_qlist_remove_head(&rec_list);
 			while (p_rec_item !=
 			       (osm_pir_item_t *) cl_qlist_end(&rec_list)) {
-				cl_qlock_pool_put(&p_rcv->pool,
-						  &p_rec_item->pool_item);
+				free(p_rec_item);
 				p_rec_item = (osm_pir_item_t *)
 				    cl_qlist_remove_head(&rec_list);
 			}
@@ -725,7 +709,7 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_pir_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -786,7 +770,7 @@ void osm_pir_rcv_process(IN void *ctx, IN void *data)
 			if (trusted_req == FALSE)
 				p_resp_rec->port_info.m_key = 0;
 		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_service_record.c b/opensm/opensm/osm_sa_service_record.c
index abad29f..fb0193e 100644
--- a/opensm/opensm/osm_sa_service_record.c
+++ b/opensm/opensm/osm_sa_service_record.c
@@ -66,11 +66,8 @@
 #include <opensm/osm_service.h>
 #include <opensm/osm_pkey.h>
 
-#define OSM_SR_RCV_POOL_MIN_SIZE    64
-#define OSM_SR_RCV_POOL_GROW_SIZE   64
-
 typedef struct _osm_sr_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_service_record_t service_rec;
 } osm_sr_item_t;
 
@@ -79,7 +76,6 @@ typedef struct osm_sr_match_item {
 	ib_service_record_t *p_service_rec;
 	ib_net64_t comp_mask;
 	osm_sr_rcv_t *p_rcv;
-
 } osm_sr_match_item_t;
 
 typedef struct _osm_sr_search_ctxt {
@@ -92,7 +88,6 @@ typedef struct _osm_sr_search_ctxt {
 void osm_sr_rcv_construct(IN osm_sr_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->sr_pool);
 	cl_timer_construct(&p_rcv->sr_timer);
 }
 
@@ -101,7 +96,6 @@ void osm_sr_rcv_construct(IN osm_sr_rcv_t * const p_rcv)
 void osm_sr_rcv_destroy(IN osm_sr_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_sr_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->sr_pool);
 	cl_timer_trim(&p_rcv->sr_timer, 1);
 	cl_timer_destroy(&p_rcv->sr_timer);
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -116,8 +110,7 @@ osm_sr_rcv_init(IN osm_sr_rcv_t * const p_rcv,
 		IN osm_subn_t * const p_subn,
 		IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status = IB_ERROR;
-	cl_status_t cl_status;
+	ib_api_status_t status;
 
 	OSM_LOG_ENTER(p_log, osm_sr_rcv_init);
 
@@ -129,20 +122,8 @@ osm_sr_rcv_init(IN osm_sr_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	cl_status = cl_qlock_pool_init(&p_rcv->sr_pool,
-				       OSM_SR_RCV_POOL_MIN_SIZE,
-				       0,
-				       OSM_SR_RCV_POOL_GROW_SIZE,
-				       sizeof(osm_sr_item_t), NULL, NULL, NULL);
-	if (cl_status != CL_SUCCESS)
-		goto Exit;
-
 	status = cl_timer_init(&p_rcv->sr_timer, osm_sr_rcv_lease_cb, p_rcv);
-	if (cl_status != CL_SUCCESS)
-		goto Exit;
 
-	status = IB_SUCCESS;
-      Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
 	return (status);
 }
@@ -315,8 +296,7 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv,
 		/* need to set the mem free ... */
 		p_sr_item = (osm_sr_item_t *) cl_qlist_remove_head(p_list);
 		while (p_sr_item != (osm_sr_item_t *) cl_qlist_end(p_list)) {
-			cl_qlock_pool_put(&p_rcv->sr_pool,
-					  &p_sr_item->pool_item);
+			free(p_sr_item);
 			p_sr_item =
 			    (osm_sr_item_t *) cl_qlist_remove_head(p_list);
 		}
@@ -355,8 +335,7 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv,
 		/* Release the quick pool items */
 		p_sr_item = (osm_sr_item_t *) cl_qlist_remove_head(p_list);
 		while (p_sr_item != (osm_sr_item_t *) cl_qlist_end(p_list)) {
-			cl_qlock_pool_put(&p_rcv->sr_pool,
-					  &p_sr_item->pool_item);
+			free(p_sr_item);
 			p_sr_item =
 			    (osm_sr_item_t *) cl_qlist_remove_head(p_list);
 		}
@@ -430,8 +409,7 @@ __osm_sr_rcv_respond(IN osm_sr_rcv_t * const p_rcv,
 
 				num_copied++;
 			}
-			cl_qlock_pool_put(&p_rcv->sr_pool,
-					  &p_sr_item->pool_item);
+			free(p_sr_item);
 			p_resp_sr++;
 			p_sr_item =
 			    (osm_sr_item_t *) cl_qlist_remove_head(p_list);
@@ -668,9 +646,7 @@ __get_matching_sr(IN cl_list_item_t * const p_list_item, IN void *context)
 		}
 	}
 
-	p_sr_pool_item =
-	    (osm_sr_item_t *) cl_qlock_pool_get(&p_sr_item->p_rcv->sr_pool);
-
+	p_sr_pool_item = malloc(sizeof(*p_sr_pool_item));
 	if (p_sr_pool_item == NULL) {
 		osm_log(p_sr_item->p_rcv->p_log, OSM_LOG_ERROR,
 			"__get_matching_sr: ERR 2408: "
@@ -680,8 +656,7 @@ __get_matching_sr(IN cl_list_item_t * const p_list_item, IN void *context)
 
 	p_sr_pool_item->service_rec = p_svcr->service_record;
 
-	cl_qlist_insert_tail(&p_sr_item->sr_list,
-			     (cl_list_item_t *) & p_sr_pool_item->pool_item);
+	cl_qlist_insert_tail(&p_sr_item->sr_list, &p_sr_pool_item->list_item);
 
       Exit:
 	return;
@@ -848,7 +823,7 @@ osm_sr_rcv_process_set_method(IN osm_sr_rcv_t * const p_rcv,
 		p_svcr->modified_time = cl_get_time_stamp_sec();
 	}
 
-	p_sr_item = (osm_sr_item_t *) cl_qlock_pool_get(&p_rcv->sr_pool);
+	p_sr_item = malloc(sizeof(*p_sr_item));
 	if (p_sr_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"osm_sr_rcv_process_set_method: ERR 2412: "
@@ -866,8 +841,7 @@ osm_sr_rcv_process_set_method(IN osm_sr_rcv_t * const p_rcv,
 	p_sr_item->service_rec = *p_recvd_service_rec;
 	cl_qlist_init(&sr_list);
 
-	cl_qlist_insert_tail(&sr_list,
-			     (cl_list_item_t *) & p_sr_item->pool_item);
+	cl_qlist_insert_tail(&sr_list, &p_sr_item->list_item);
 
 	__osm_sr_rcv_respond(p_rcv, p_madw, &sr_list);
 
@@ -925,7 +899,7 @@ osm_sr_rcv_process_delete_method(IN osm_sr_rcv_t * const p_rcv,
 
 	cl_plock_release(p_rcv->p_lock);
 
-	p_sr_item = (osm_sr_item_t *) cl_qlock_pool_get(&p_rcv->sr_pool);
+	p_sr_item = malloc(sizeof(*p_sr_item));
 	if (p_sr_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"osm_sr_rcv_process_delete_method: ERR 2413: "
@@ -939,8 +913,7 @@ osm_sr_rcv_process_delete_method(IN osm_sr_rcv_t * const p_rcv,
 	p_sr_item->service_rec = p_svcr->service_record;
 	cl_qlist_init(&sr_list);
 
-	cl_qlist_insert_tail(&sr_list,
-			     (cl_list_item_t *) & p_sr_item->pool_item);
+	cl_qlist_insert_tail(&sr_list, &p_sr_item->list_item);
 
 	if (p_svcr)
 		osm_svcr_delete(p_svcr);
diff --git a/opensm/opensm/osm_sa_slvl_record.c b/opensm/opensm/osm_sa_slvl_record.c
index 8d8e4dc..fd48296 100644
--- a/opensm/opensm/osm_sa_slvl_record.c
+++ b/opensm/opensm/osm_sa_slvl_record.c
@@ -63,11 +63,8 @@
 #include <opensm/osm_pkey.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_SLVL_REC_RCV_POOL_MIN_SIZE    32
-#define OSM_SLVL_REC_RCV_POOL_GROW_SIZE   32
-
 typedef struct _osm_slvl_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_slvl_table_record_t rec;
 } osm_slvl_item_t;
 
@@ -85,7 +82,6 @@ typedef struct _osm_slvl_search_ctxt {
 void osm_slvl_rec_rcv_construct(IN osm_slvl_rec_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -93,7 +89,6 @@ void osm_slvl_rec_rcv_construct(IN osm_slvl_rec_rcv_t * const p_rcv)
 void osm_slvl_rec_rcv_destroy(IN osm_slvl_rec_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_slvl_rec_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -106,8 +101,6 @@ osm_slvl_rec_rcv_init(IN osm_slvl_rec_rcv_t * const p_rcv,
 		      IN osm_subn_t * const p_subn,
 		      IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_slvl_rec_rcv_init);
 
 	osm_slvl_rec_rcv_construct(p_rcv);
@@ -118,15 +111,8 @@ osm_slvl_rec_rcv_init(IN osm_slvl_rec_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	/* used for matching records collection */
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_SLVL_REC_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_SLVL_REC_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_slvl_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -143,11 +129,11 @@ __osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_slvl_create);
 
-	p_rec_item = (osm_slvl_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_sa_slvl_create: ERR 2602: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
@@ -166,7 +152,7 @@ __osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv,
 			cl_ntoh16(lid), osm_physp_get_port_num(p_physp),
 			in_port_idx);
 
-	memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = lid;
 	p_rec_item->rec.out_port_num = osm_physp_get_port_num(p_physp);
@@ -174,8 +160,7 @@ __osm_sa_slvl_create(IN osm_slvl_rec_rcv_t * const p_rcv,
 	p_rec_item->rec.slvl_tbl =
 	    *(osm_physp_get_slvl_tbl(p_physp, in_port_idx));
 
-	cl_qlist_insert_tail(p_ctxt->p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -419,8 +404,7 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data)
 			    (osm_slvl_item_t *) cl_qlist_remove_head(&rec_list);
 			while (p_rec_item !=
 			       (osm_slvl_item_t *) cl_qlist_end(&rec_list)) {
-				cl_qlock_pool_put(&p_rcv->pool,
-						  &p_rec_item->pool_item);
+				free(p_rec_item);
 				p_rec_item = (osm_slvl_item_t *)
 				    cl_qlist_remove_head(&rec_list);
 			}
@@ -469,7 +453,7 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_slvl_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -518,7 +502,7 @@ void osm_slvl_rec_rcv_process(IN void *ctx, IN void *data)
 		/* copy only if not trimmed */
 		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_sminfo_record.c b/opensm/opensm/osm_sa_sminfo_record.c
index 2aa136e..6f84ac7 100644
--- a/opensm/opensm/osm_sa_sminfo_record.c
+++ b/opensm/opensm/osm_sa_sminfo_record.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -70,11 +70,8 @@
 #include <opensm/osm_remote_sm.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_SMIR_RCV_POOL_MIN_SIZE     32
-#define OSM_SMIR_RCV_POOL_GROW_SIZE    32
-
 typedef struct _osm_smir_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_sminfo_record_t rec;
 } osm_smir_item_t;
 
@@ -91,7 +88,6 @@ typedef struct _osm_smir_search_ctxt {
 void osm_smir_rcv_construct(IN osm_smir_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -99,9 +95,7 @@ void osm_smir_rcv_construct(IN osm_smir_rcv_t * const p_rcv)
 void osm_smir_rcv_destroy(IN osm_smir_rcv_t * const p_rcv)
 {
 	CL_ASSERT(p_rcv);
-
 	OSM_LOG_ENTER(p_rcv->p_log, osm_smir_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -115,8 +109,6 @@ osm_smir_rcv_init(IN osm_smir_rcv_t * const p_rcv,
 		  IN osm_stats_t * const p_stats,
 		  IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status = IB_SUCCESS;
-
 	OSM_LOG_ENTER(p_log, osm_smir_rcv_init);
 
 	osm_smir_rcv_construct(p_rcv);
@@ -128,14 +120,8 @@ osm_smir_rcv_init(IN osm_smir_rcv_t * const p_rcv,
 	p_rcv->p_stats = p_stats;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_SMIR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_SMIR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_smir_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_rcv->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 static ib_api_status_t
@@ -152,31 +138,29 @@ __osm_smir_rcv_new_smir(IN osm_smir_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_smir_rcv_new_smir);
 
-	p_rec_item = (osm_smir_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_smir_rcv_new_smir: ERR 2801: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
+	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_smir_rcv_new_smir: "
 			"New SMInfo: GUID 0x%016" PRIx64 "\n", cl_ntoh64(guid)
 		    );
-	}
 
-	memset(&p_rec_item->rec, 0, sizeof(ib_sminfo_record_t));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = osm_port_get_base_lid(p_port);
 	p_rec_item->rec.sm_info.guid = guid;
 	p_rec_item->rec.sm_info.act_count = act_count;
 	p_rec_item->rec.sm_info.pri_state = pri_state;
 
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -445,8 +429,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data)
 			    (osm_smir_item_t *) cl_qlist_remove_head(&rec_list);
 			while (p_rec_item !=
 			       (osm_smir_item_t *) cl_qlist_end(&rec_list)) {
-				cl_qlock_pool_put(&p_rcv->pool,
-						  &p_rec_item->pool_item);
+				free(p_rec_item);
 				p_rec_item = (osm_smir_item_t *)
 				    cl_qlist_remove_head(&rec_list);
 			}
@@ -493,7 +476,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_smir_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -544,7 +527,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data)
 			*p_resp_rec = p_rec_item->rec;
 			p_resp_rec->sm_info.sm_key = 0;
 		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_sw_info_record.c b/opensm/opensm/osm_sa_sw_info_record.c
index edfe106..a9947e1 100644
--- a/opensm/opensm/osm_sa_sw_info_record.c
+++ b/opensm/opensm/osm_sa_sw_info_record.c
@@ -63,7 +63,7 @@
 #define OSM_SIR_RCV_POOL_GROW_SIZE   32
 
 typedef struct _osm_sir_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_switch_info_record_t rec;
 } osm_sir_item_t;
 
@@ -80,7 +80,6 @@ typedef struct _osm_sir_search_ctxt {
 void osm_sir_rcv_construct(IN osm_sir_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -88,7 +87,6 @@ void osm_sir_rcv_construct(IN osm_sir_rcv_t * const p_rcv)
 void osm_sir_rcv_destroy(IN osm_sir_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_sir_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -101,8 +99,6 @@ osm_sir_rcv_init(IN osm_sir_rcv_t * const p_rcv,
 		 IN osm_subn_t * const p_subn,
 		 IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_sir_rcv_init);
 
 	osm_sir_rcv_construct(p_rcv);
@@ -113,14 +109,8 @@ osm_sir_rcv_init(IN osm_sir_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_SIR_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_SIR_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_sir_item_t), NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -135,29 +125,27 @@ __osm_sir_rcv_new_sir(IN osm_sir_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_sir_rcv_new_sir);
 
-	p_rec_item = (osm_sir_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_sir_rcv_new_sir: ERR 5308: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
+	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_sir_rcv_new_sir: "
 			"New SwitchInfoRecord: lid 0x%X\n", cl_ntoh16(lid)
 		    );
-	}
 
-	memset(&p_rec_item->rec, 0, sizeof(ib_switch_info_record_t));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = lid;
 	p_rec_item->rec.switch_info = p_sw->switch_info;
 
-	cl_qlist_insert_tail(p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -393,7 +381,7 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data)
 		/* need to set the mem free ... */
 		p_rec_item = (osm_sir_item_t *) cl_qlist_remove_head(&rec_list);
 		while (p_rec_item != (osm_sir_item_t *) cl_qlist_end(&rec_list)) {
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 			p_rec_item =
 			    (osm_sir_item_t *) cl_qlist_remove_head(&rec_list);
 		}
@@ -442,7 +430,7 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item =
 			    (osm_sir_item_t *) cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -487,10 +475,9 @@ void osm_sir_rcv_process(IN void *ctx, IN void *data)
 	for (i = 0; i < pre_trim_num_rec; i++) {
 		p_rec_item = (osm_sir_item_t *) cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
diff --git a/opensm/opensm/osm_sa_vlarb_record.c b/opensm/opensm/osm_sa_vlarb_record.c
index 49d688a..a538a0b 100644
--- a/opensm/opensm/osm_sa_vlarb_record.c
+++ b/opensm/opensm/osm_sa_vlarb_record.c
@@ -63,11 +63,8 @@
 #include <opensm/osm_pkey.h>
 #include <opensm/osm_sa.h>
 
-#define OSM_VLARB_REC_RCV_POOL_MIN_SIZE      32
-#define OSM_VLARB_REC_RCV_POOL_GROW_SIZE     32
-
 typedef struct _osm_vl_arb_item {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	ib_vl_arb_table_record_t rec;
 } osm_vl_arb_item_t;
 
@@ -85,7 +82,6 @@ typedef struct _osm_vl_arb_search_ctxt {
 void osm_vlarb_rec_rcv_construct(IN osm_vlarb_rec_rcv_t * const p_rcv)
 {
 	memset(p_rcv, 0, sizeof(*p_rcv));
-	cl_qlock_pool_construct(&p_rcv->pool);
 }
 
 /**********************************************************************
@@ -93,7 +89,6 @@ void osm_vlarb_rec_rcv_construct(IN osm_vlarb_rec_rcv_t * const p_rcv)
 void osm_vlarb_rec_rcv_destroy(IN osm_vlarb_rec_rcv_t * const p_rcv)
 {
 	OSM_LOG_ENTER(p_rcv->p_log, osm_vlarb_rec_rcv_destroy);
-	cl_qlock_pool_destroy(&p_rcv->pool);
 	OSM_LOG_EXIT(p_rcv->p_log);
 }
 
@@ -106,8 +101,6 @@ osm_vlarb_rec_rcv_init(IN osm_vlarb_rec_rcv_t * const p_rcv,
 		       IN osm_subn_t * const p_subn,
 		       IN osm_log_t * const p_log, IN cl_plock_t * const p_lock)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_vlarb_rec_rcv_init);
 
 	osm_vlarb_rec_rcv_construct(p_rcv);
@@ -118,16 +111,8 @@ osm_vlarb_rec_rcv_init(IN osm_vlarb_rec_rcv_t * const p_rcv,
 	p_rcv->p_resp = p_resp;
 	p_rcv->p_mad_pool = p_mad_pool;
 
-	/* used for matching records collection */
-	status = cl_qlock_pool_init(&p_rcv->pool,
-				    OSM_VLARB_REC_RCV_POOL_MIN_SIZE,
-				    0,
-				    OSM_VLARB_REC_RCV_POOL_GROW_SIZE,
-				    sizeof(osm_vl_arb_item_t),
-				    NULL, NULL, NULL);
-
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -144,11 +129,11 @@ __osm_sa_vl_arb_create(IN osm_vlarb_rec_rcv_t * const p_rcv,
 
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_sa_vl_arb_create);
 
-	p_rec_item = (osm_vl_arb_item_t *) cl_qlock_pool_get(&p_rcv->pool);
+	p_rec_item = malloc(sizeof(*p_rec_item));
 	if (p_rec_item == NULL) {
 		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
 			"__osm_sa_vl_arb_create: ERR 2A02: "
-			"cl_qlock_pool_get failed\n");
+			"rec_item alloc failed\n");
 		status = IB_INSUFFICIENT_RESOURCES;
 		goto Exit;
 	}
@@ -158,24 +143,22 @@ __osm_sa_vl_arb_create(IN osm_vlarb_rec_rcv_t * const p_rcv,
 	else
 		lid = osm_node_get_base_lid(p_physp->p_node, 0);
 
-	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG)) {
+	if (osm_log_is_active(p_rcv->p_log, OSM_LOG_DEBUG))
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_sa_vl_arb_create: "
 			"New VLArbitration for: port 0x%016" PRIx64
 			", lid 0x%X, port 0x%X Block:%u\n",
 			cl_ntoh64(osm_physp_get_port_guid(p_physp)),
 			cl_ntoh16(lid), osm_physp_get_port_num(p_physp), block);
-	}
 
-	memset(&p_rec_item->rec, 0, sizeof(p_rec_item->rec));
+	memset(p_rec_item, 0, sizeof(*p_rec_item));
 
 	p_rec_item->rec.lid = lid;
 	p_rec_item->rec.port_num = osm_physp_get_port_num(p_physp);
 	p_rec_item->rec.block_num = block;
 	p_rec_item->rec.vl_arb_tbl = *(osm_physp_get_vla_tbl(p_physp, block));
 
-	cl_qlist_insert_tail(p_ctxt->p_list,
-			     (cl_list_item_t *) & p_rec_item->pool_item);
+	cl_qlist_insert_tail(p_ctxt->p_list, &p_rec_item->list_item);
 
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
@@ -436,8 +419,7 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data)
 			    cl_qlist_remove_head(&rec_list);
 			while (p_rec_item !=
 			       (osm_vl_arb_item_t *) cl_qlist_end(&rec_list)) {
-				cl_qlock_pool_put(&p_rcv->pool,
-						  &p_rec_item->pool_item);
+				free(p_rec_item);
 				p_rec_item = (osm_vl_arb_item_t *)
 				    cl_qlist_remove_head(&rec_list);
 			}
@@ -487,7 +469,7 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data)
 		for (i = 0; i < num_rec; i++) {
 			p_rec_item = (osm_vl_arb_item_t *)
 			    cl_qlist_remove_head(&rec_list);
-			cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+			free(p_rec_item);
 		}
 
 		osm_sa_send_error(p_rcv->p_resp, p_madw,
@@ -534,10 +516,9 @@ void osm_vlarb_rec_rcv_process(IN void *ctx, IN void *data)
 		p_rec_item =
 		    (osm_vl_arb_item_t *) cl_qlist_remove_head(&rec_list);
 		/* copy only if not trimmed */
-		if (i < num_rec) {
+		if (i < num_rec)
 			*p_resp_rec = p_rec_item->rec;
-		}
-		cl_qlock_pool_put(&p_rcv->pool, &p_rec_item->pool_item);
+		free(p_rec_item);
 		p_resp_rec++;
 	}
 
-- 
1.5.3.rc2.29.gc4640f


From sashak at voltaire.com  Sun Dec  9 06:24:40 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 9 Dec 2007 14:24:40 +0000
Subject: [ofa-general] [PATCH RFC] opensm: use malloc instead of
	cl_qlock_pool in osm_mad_pool.c
In-Reply-To: <20071209141822.GJ6213@sashak.voltaire.com>
References: <20071127003157.GA26160@sashak.voltaire.com>
	<20071209141822.GJ6213@sashak.voltaire.com>
Message-ID: <20071209142440.GL6213@sashak.voltaire.com>


Use regular malloc/free instead of the cl_qlock_pool allocator in
osm_mad_pool.c. malloc() is more than twice as fast as the cl_qlock_pool
equivalents, and using it doesn't require any locking.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/include/opensm/osm_mad_pool.h |    7 +---
 opensm/include/opensm/osm_madw.h     |   80 +++-------------------------------
 opensm/opensm/osm_mad_pool.c         |   56 ++++--------------------
 opensm/opensm/osm_vl15intf.c         |    6 +-
 4 files changed, 20 insertions(+), 129 deletions(-)
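
For readers skimming the hunks, the pattern this series applies could be
sketched roughly as below. This is only an illustration under stated
assumptions, not code from the patch: list_item_t, qlist_t and the
qlist_*() helpers are hypothetical stand-ins for complib's
cl_list_item_t and cl_qlist_*() calls, and rec_item_t, rec_list_add()
and rec_list_drain() are made-up names. Each item is allocated with
malloc(), zeroed as a whole, linked through its first member, and later
released with free() while the list is drained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for complib's cl_list_item_t and cl_qlist_t. */
typedef struct list_item {
	struct list_item *next;
	struct list_item *prev;
} list_item_t;

typedef struct qlist {
	list_item_t end;	/* sentinel; an empty list points at itself */
	size_t count;
} qlist_t;

static void qlist_init(qlist_t *l)
{
	l->end.next = l->end.prev = &l->end;
	l->count = 0;
}

static list_item_t *qlist_end(qlist_t *l)
{
	return &l->end;
}

static void qlist_insert_tail(qlist_t *l, list_item_t *item)
{
	assert(l != NULL && item != NULL);
	item->next = &l->end;
	item->prev = l->end.prev;
	l->end.prev->next = item;
	l->end.prev = item;
	l->count++;
}

/* Returns the sentinel (qlist_end) when the list is empty. */
static list_item_t *qlist_remove_head(qlist_t *l)
{
	list_item_t *item = l->end.next;

	if (item == &l->end)
		return item;
	l->end.next = item->next;
	item->next->prev = &l->end;
	l->count--;
	return item;
}

/* Illustrative record item; the linkage stays the first member so that
 * a list_item_t pointer can be cast back to the containing item. */
typedef struct rec_item {
	list_item_t list_item;
	unsigned lid;
} rec_item_t;

/* Allocate one item with malloc(), zero all of it, append it. */
static int rec_list_add(qlist_t *l, unsigned lid)
{
	rec_item_t *p = malloc(sizeof(*p));

	if (p == NULL)
		return -1;
	memset(p, 0, sizeof(*p));
	p->lid = lid;
	qlist_insert_tail(l, &p->list_item);
	return 0;
}

/* Drain the list, releasing every item with free(); returns the
 * number of items released. */
static size_t rec_list_drain(qlist_t *l)
{
	size_t freed = 0;
	rec_item_t *p = (rec_item_t *) qlist_remove_head(l);

	while (p != (rec_item_t *) qlist_end(l)) {
		free(p);
		freed++;
		p = (rec_item_t *) qlist_remove_head(l);
	}
	return freed;
}
```

Because nothing here depends on a shared pool object, the receiver's
construct/init/destroy paths have nothing left to set up or tear down
for these items, which is presumably why the hunks can drop those calls
outright. Zeroing the whole item rather than only the record is safe in
this sketch because the linkage is plain list pointers that get
rewritten on insert.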

diff --git a/opensm/include/opensm/osm_mad_pool.h b/opensm/include/opensm/osm_mad_pool.h
index 9ec0a7a..b8421b9 100644
--- a/opensm/include/opensm/osm_mad_pool.h
+++ b/opensm/include/opensm/osm_mad_pool.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -50,7 +50,6 @@
 
 #include <iba/ib_types.h>
 #include <complib/cl_atomic.h>
-#include <complib/cl_qlockpool.h>
 #include <opensm/osm_base.h>
 #include <vendor/osm_vendor.h>
 #include <opensm/osm_madw.h>
@@ -97,7 +96,6 @@ BEGIN_C_DECLS
 */
 typedef struct _osm_mad_pool {
 	osm_log_t *p_log;
-	cl_qlock_pool_t madw_pool;
 	atomic32_t mads_out;
 } osm_mad_pool_t;
 /*
@@ -105,9 +103,6 @@ typedef struct _osm_mad_pool {
 *	p_log
 *		Pointer to the log object.
 *
-*	lock
-*		Spinlock guarding the pool.
-*
 *	mads_out
 *		Running total of the number of MADs outstanding.
 *
diff --git a/opensm/include/opensm/osm_madw.h b/opensm/include/opensm/osm_madw.h
index d4bcbc1..31707ad 100644
--- a/opensm/include/opensm/osm_madw.h
+++ b/opensm/include/opensm/osm_madw.h
@@ -409,7 +409,7 @@ typedef struct _osm_mad_addr {
 * SYNOPSIS
 */
 typedef struct _osm_madw {
-	cl_pool_item_t pool_item;
+	cl_list_item_t list_item;
 	osm_bind_handle_t h_bind;
 	osm_vend_wrap_t vend_wrap;
 	osm_mad_addr_t mad_addr;
@@ -423,8 +423,8 @@ typedef struct _osm_madw {
 } osm_madw_t;
 /*
 * FIELDS
-*	pool_item
-*		List linkage for pools and lists.  MUST BE FIRST MEMBER!
+*	list_item
+*		List linkage for lists.  MUST BE FIRST MEMBER!
 *
 *	h_bind
 *		Bind handle for the port on which this MAD will be sent
@@ -467,72 +467,6 @@ typedef struct _osm_madw {
 * SEE ALSO
 *********/
 
-/****f* OpenSM: MAD Wrapper/osm_madw_construct
-* NAME
-*	osm_madw_construct
-*
-* DESCRIPTION
-*	This function constructs a MAD Wrapper object.
-*
-* SYNOPSIS
-*/
-static inline void osm_madw_construct(IN osm_madw_t * const p_madw)
-{
-	/*
-	   Don't touch the pool_item since that is an opaque object.
-	   Clear all other objects in the mad wrapper.
-	 */
-	memset(((uint8_t *) p_madw) + sizeof(cl_pool_item_t), 0,
-	       sizeof(*p_madw) - sizeof(cl_pool_item_t));
-}
-
-/*
-* PARAMETERS
-*	p_madw
-*		[in] Pointer to a MAD Wrapper object to construct.
-*
-* RETURN VALUE
-*	This function does not return a value.
-*
-* NOTES
-*	Allows calling osm_madw_init, osm_madw_destroy
-*
-*	Calling osm_madw_construct is a prerequisite to calling any other
-*	method except osm_madw_init.
-*
-* SEE ALSO
-*	MAD Wrapper object, osm_madw_init, osm_madw_destroy
-*********/
-
-/****f* OpenSM: MAD Wrapper/osm_madw_destroy
-* NAME
-*	osm_madw_destroy
-*
-* DESCRIPTION
-*	The osm_madw_destroy function destroys a node, releasing
-*	all resources.
-*
-* SYNOPSIS
-*/
-void osm_madw_destroy(IN osm_madw_t * const p_madw);
-/*
-* PARAMETERS
-*	p_madw
-*		[in] Pointer to a MAD Wrapper object to destroy.
-*
-* RETURN VALUE
-*	This function does not return a value.
-*
-* NOTES
-*	Performs any necessary cleanup of the specified MAD Wrapper object.
-*	Further operations should not be attempted on the destroyed object.
-*	This function should only be called after a call to osm_madw_construct or
-*	osm_madw_init.
-*
-* SEE ALSO
-*	MAD Wrapper object, osm_madw_construct, osm_madw_init
-*********/
-
 /****f* OpenSM: MAD Wrapper/osm_madw_init
 * NAME
 *	osm_madw_init
@@ -548,7 +482,7 @@ osm_madw_init(IN osm_madw_t * const p_madw,
 	      IN const uint32_t mad_size,
 	      IN const osm_mad_addr_t * const p_mad_addr)
 {
-	osm_madw_construct(p_madw);
+	memset(p_madw, 0, sizeof(*p_madw));
 	p_madw->h_bind = h_bind;
 	p_madw->fail_msg = CL_DISP_MSGID_NONE;
 	p_madw->mad_size = mad_size;
@@ -602,7 +536,7 @@ static inline ib_smp_t *osm_madw_get_smp_ptr(IN const osm_madw_t * const p_madw)
 * NOTES
 *
 * SEE ALSO
-*	MAD Wrapper object, osm_madw_construct, osm_madw_destroy
+*	MAD Wrapper object
 *********/
 
 /****f* OpenSM: MAD Wrapper/osm_madw_get_sa_mad_ptr
@@ -631,7 +565,7 @@ static inline ib_sa_mad_t *osm_madw_get_sa_mad_ptr(IN const osm_madw_t *
 * NOTES
 *
 * SEE ALSO
-*	MAD Wrapper object, osm_madw_construct, osm_madw_destroy
+*	MAD Wrapper object
 *********/
 
 /****f* OpenSM: MAD Wrapper/osm_madw_get_perfmgt_mad_ptr
@@ -657,7 +591,7 @@ static inline ib_perfmgt_mad_t *osm_madw_get_perfmgt_mad_ptr(IN const osm_madw_t
 * NOTES
 *
 * SEE ALSO
-*	MAD Wrapper object, osm_madw_construct, osm_madw_destroy
+*	MAD Wrapper object
 *********/
 
 /****f* OpenSM: MAD Wrapper/osm_madw_get_ni_context_ptr
diff --git a/opensm/opensm/osm_mad_pool.c b/opensm/opensm/osm_mad_pool.c
index c3f3f2a..f9ef54c 100644
--- a/opensm/opensm/osm_mad_pool.c
+++ b/opensm/opensm/osm_mad_pool.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -56,24 +56,6 @@
 #include <opensm/osm_log.h>
 #include <vendor/osm_vendor_api.h>
 
-#define OSM_MAD_POOL_MIN_SIZE 256
-#define OSM_MAD_POOL_GROW_SIZE 256
-
-/**********************************************************************
- **********************************************************************/
-cl_status_t
-__osm_mad_pool_ctor(IN void *const p_object,
-		    IN void *context, OUT cl_pool_item_t ** const pp_pool_item)
-{
-	osm_madw_t *p_madw = p_object;
-
-	UNUSED_PARAM(context);
-	osm_madw_construct(p_madw);
-	/* CHECK THIS.  DOCS DON'T DESCRIBE THIS OUT PARAM. */
-	*pp_pool_item = &p_madw->pool_item;
-	return (CL_SUCCESS);
-}
-
 /**********************************************************************
  **********************************************************************/
 void osm_mad_pool_construct(IN osm_mad_pool_t * const p_pool)
@@ -81,7 +63,6 @@ void osm_mad_pool_construct(IN osm_mad_pool_t * const p_pool)
 	CL_ASSERT(p_pool);
 
 	memset(p_pool, 0, sizeof(*p_pool));
-	cl_qlock_pool_construct(&p_pool->madw_pool);
 }
 
 /**********************************************************************
@@ -89,9 +70,6 @@ void osm_mad_pool_construct(IN osm_mad_pool_t * const p_pool)
 void osm_mad_pool_destroy(IN osm_mad_pool_t * const p_pool)
 {
 	CL_ASSERT(p_pool);
-
-	/* HACK: we still rarely see some mads leaking - so ignore this */
-	/* cl_qlock_pool_destroy( &p_pool->madw_pool ); */
 }
 
 /**********************************************************************
@@ -99,29 +77,12 @@ void osm_mad_pool_destroy(IN osm_mad_pool_t * const p_pool)
 ib_api_status_t
 osm_mad_pool_init(IN osm_mad_pool_t * const p_pool, IN osm_log_t * const p_log)
 {
-	ib_api_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_mad_pool_init);
 
 	p_pool->p_log = p_log;
 
-	status = cl_qlock_pool_init(&p_pool->madw_pool,
-				    OSM_MAD_POOL_MIN_SIZE,
-				    0,
-				    OSM_MAD_POOL_GROW_SIZE,
-				    sizeof(osm_madw_t),
-				    __osm_mad_pool_ctor, NULL, p_pool);
-	if (status != IB_SUCCESS) {
-		osm_log(p_log, OSM_LOG_ERROR,
-			"osm_mad_pool_init: ERR 0702: "
-			"Grow pool initialization failed (%s)\n",
-			ib_get_err_str(status));
-		goto Exit;
-	}
-
-      Exit:
 	OSM_LOG_EXIT(p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -142,7 +103,7 @@ osm_madw_t *osm_mad_pool_get(IN osm_mad_pool_t * const p_pool,
 	/*
 	   First, acquire a mad wrapper from the mad wrapper pool.
 	 */
-	p_madw = (osm_madw_t *) cl_qlock_pool_get(&p_pool->madw_pool);
+	p_madw = malloc(sizeof(*p_madw));
 	if (p_madw == NULL) {
 		osm_log(p_pool->p_log, OSM_LOG_ERROR,
 			"osm_mad_pool_get: ERR 0703: "
@@ -162,8 +123,7 @@ osm_madw_t *osm_mad_pool_get(IN osm_mad_pool_t * const p_pool,
 			"Unable to acquire wire MAD\n");
 
 		/* Don't leak wrappers! */
-		cl_qlock_pool_put(&p_pool->madw_pool,
-				  (cl_pool_item_t *) p_madw);
+		free(p_madw);
 		p_madw = NULL;
 		goto Exit;
 	}
@@ -202,7 +162,7 @@ osm_madw_t *osm_mad_pool_get_wrapper(IN osm_mad_pool_t * const p_pool,
 	/*
 	   First, acquire a mad wrapper from the mad wrapper pool.
 	 */
-	p_madw = (osm_madw_t *) cl_qlock_pool_get(&p_pool->madw_pool);
+	p_madw = malloc(sizeof(*p_madw));
 	if (p_madw == NULL) {
 		osm_log(p_pool->p_log, OSM_LOG_ERROR,
 			"osm_mad_pool_get_wrapper: ERR 0705: "
@@ -234,7 +194,9 @@ osm_madw_t *osm_mad_pool_get_wrapper_raw(IN osm_mad_pool_t * const p_pool)
 
 	OSM_LOG_ENTER(p_pool->p_log, osm_mad_pool_get_wrapper_raw);
 
-	p_madw = (osm_madw_t *) cl_qlock_pool_get(&p_pool->madw_pool);
+	p_madw = malloc(sizeof(*p_madw));
+	if (!p_madw)
+		return NULL;
 
 	osm_log(p_pool->p_log, OSM_LOG_DEBUG,
 		"osm_mad_pool_get_wrapper_raw: "
@@ -270,7 +232,7 @@ osm_mad_pool_put(IN osm_mad_pool_t * const p_pool, IN osm_madw_t * const p_madw)
 	/*
 	   Return the mad wrapper to the wrapper pool
 	 */
-	cl_qlock_pool_put(&p_pool->madw_pool, (cl_pool_item_t *) p_madw);
+	free(p_madw);
 	cl_atomic_dec(&p_pool->mads_out);
 
 	OSM_LOG_EXIT(p_pool->p_log);
diff --git a/opensm/opensm/osm_vl15intf.c b/opensm/opensm/osm_vl15intf.c
index 74e749f..5d10ed6 100644
--- a/opensm/opensm/osm_vl15intf.c
+++ b/opensm/opensm/osm_vl15intf.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2004-2007 Voltaire, Inc. All rights reserved.
  * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
  * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
  *
@@ -340,10 +340,10 @@ void osm_vl15_post(IN osm_vl15_t * const p_vl, IN osm_madw_t * const p_madw)
 	 */
 	cl_spinlock_acquire(&p_vl->lock);
 	if (p_madw->resp_expected == TRUE) {
-		cl_qlist_insert_tail(&p_vl->rfifo, (cl_list_item_t *) p_madw);
+		cl_qlist_insert_tail(&p_vl->rfifo, &p_madw->list_item);
 		cl_atomic_inc(&p_vl->p_stats->qp0_mads_outstanding);
 	} else
-		cl_qlist_insert_tail(&p_vl->ufifo, (cl_list_item_t *) p_madw);
+		cl_qlist_insert_tail(&p_vl->ufifo, &p_madw->list_item);
 	cl_spinlock_release(&p_vl->lock);
 
 	if (osm_log_is_active(p_vl->p_log, OSM_LOG_DEBUG))
-- 
1.5.3.rc2.29.gc4640f


From myopenib at gmail.com  Sun Dec  9 06:49:58 2007
From: myopenib at gmail.com (Moni Levy)
Date: Sun, 09 Dec 2007 16:49:58 +0200
Subject: [ofa-general] mthca lid_change event generation
Message-ID: <475C0096.9060804@gmail.com>

Roland,
	During some SM failover testing I found that the IPoIB interfaces receive an IB_EVENT_LID_CHANGE event even though their LID has not actually changed. I looked at the code that generates the event and it looks wrong to me: why, when we receive a "PortInfo Set" MAD with its client_reregister bit cleared, do we have to generate the lid_change event even if that MAD does not actually change the port LID? Here is the code; I guess I'm missing something.

                        if(pinfo->clientrereg_resv_subnetto & 0x80)
                                event.event    = IB_EVENT_CLIENT_REREGISTER;
                        else
                                event.event    = IB_EVENT_LID_CHANGE;
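
For illustration, the behaviour the question is asking for could look roughly like the sketch below. It assumes the driver can compare the LID the port had before the MAD with the LID after it. The enum, `pick_port_event`, and its parameters are hypothetical names; this is not the mthca code, and a real Set may change other port attributes that this sketch ignores.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's event codes. */
enum port_event {
	PORT_EVENT_NONE,
	PORT_EVENT_LID_CHANGE,
	PORT_EVENT_CLIENT_REREGISTER,
};

/* Same mask as in the snippet above. */
#define CLIENT_REREG_BIT 0x80

/*
 * Decide which event a "PortInfo Set" should raise.
 * Assumption: old_lid is the LID the port had before the MAD was
 * processed and new_lid is the LID afterwards.
 */
static enum port_event pick_port_event(uint8_t clientrereg_resv_subnetto,
				       uint16_t old_lid, uint16_t new_lid)
{
	if (clientrereg_resv_subnetto & CLIENT_REREG_BIT)
		return PORT_EVENT_CLIENT_REREGISTER;
	if (old_lid != new_lid)
		return PORT_EVENT_LID_CHANGE;
	return PORT_EVENT_NONE;	/* nothing the ULPs need to react to */
}
```

With this shape, a Set that leaves the LID untouched and has the reregister bit cleared raises no event at all.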

--Moni


From sashak at voltaire.com  Sun Dec  9 11:27:47 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 9 Dec 2007 19:27:47 +0000
Subject: [ofa-general] Re: [PATCH 1/3] OpenSM: Add null dereference checks
In-Reply-To: <1197075265.29314.131.camel@cardanus.llnl.gov>
References: <1197075265.29314.131.camel@cardanus.llnl.gov>
Message-ID: <20071209192747.GH708@sashak.voltaire.com>

Hi Al,

On 16:54 Fri 07 Dec     , Al Chu wrote:
> Hey Sasha,
> 
> Nothing fancy.  Just noticed the check is done in the ftree equivalent
> destroy function so I figured it should be in the others.

Is it possible/legal usage to have context = NULL in those destructors?
If not, I don't think we need such checks.

Sasha

> 
> Al
> 
> -- 
> Albert Chu
> chu11 at llnl.gov
> 925-422-5311
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory

> From c94361165ab2a3c6e8dadb6596d97d79e36bb1f6 Mon Sep 17 00:00:00 2001
> From: Albert L. Chu <chu11 at llnl.gov>
> Date: Fri, 7 Dec 2007 13:43:15 -0800
> Subject: [PATCH] add null check in context destroy functions
> 
> 
> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>
> ---
>  opensm/opensm/osm_ucast_lash.c |    2 ++
>  opensm/opensm/osm_ucast_updn.c |    3 +++
>  2 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c
> index 5e7716e..bdcd2d1 100644
> --- a/opensm/opensm/osm_ucast_lash.c
> +++ b/opensm/opensm/osm_ucast_lash.c
> @@ -1469,6 +1469,8 @@ static lash_t *lash_create(osm_opensm_t * p_osm)
>  static void lash_delete(void *context)
>  {
>  	lash_t *p_lash = context;
> +        if (!context)
> +                return;
>  	if (p_lash->switches) {
>  		unsigned id;
>  		for (id = 0; ((int)id) < p_lash->num_switches; id++)
> diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_ucast_updn.c
> index 0b7b1a9..24c9fe3 100644
> --- a/opensm/opensm/osm_ucast_updn.c
> +++ b/opensm/opensm/osm_ucast_updn.c
> @@ -234,6 +234,9 @@ static void updn_destroy(IN updn_t * const p_updn)
>  {
>  	uint64_t *p_guid_list_item;
>  
> +        if (!p_updn)
> +                return;
> +
>  	/* free the array of guids */
>  	if (p_updn->updn_ucast_reg_inputs.guid_list)
>  		free(p_updn->updn_ucast_reg_inputs.guid_list);
> -- 
> 1.5.1
> 


From sashak at voltaire.com  Sun Dec  9 12:48:22 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 9 Dec 2007 20:48:22 +0000
Subject: [ofa-general] Re: [PATCH 2/3] OpenSM: Fix incorrect reporting of
	routing engine/algorithm used
In-Reply-To: <1197075342.29314.133.camel@cardanus.llnl.gov>
References: <1197075342.29314.133.camel@cardanus.llnl.gov>
Message-ID: <20071209204822.GI708@sashak.voltaire.com>

Hi Al,

On 16:55 Fri 07 Dec     , Al Chu wrote:
> 
> I noticed that when a routing algorithm failed and defaulted back to
> 'minhop', the logs and the console did not report this change.  This is
> because most of that code outputs the routing algorithm name that was
> stored during configuration/setup.

Right, as well as routing engine methods. So in the next cycle (sweep) this
routing engine method will run again and could be more successful.
So we cannot say that the whole routing engine falls back; only a particular
callback in this particular cycle does.

So I'm not sure that just "renaming" helps a lot. I think a better solution
in general would be to define something like routing engine chains, so
that a user could specify which routing engines to use and in which order
they (whole engines) should fall back. Something like:

	-R ftree updn minhops

could mean: try ftree and, if it fails, switch to updn, etc.
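
The chain idea could be sketched roughly as follows. This is an illustration only: `struct engine`, `route_with_fallback`, and the two stub callbacks are hypothetical and do not correspond to existing OpenSM interfaces.

```c
#include <stddef.h>

/* Hypothetical routing engine: route() returns 0 on success. */
struct engine {
	const char *name;
	int (*route)(void *context);
	void *context;
};

/* Stub callbacks, only here so the sketch can be exercised. */
static int stub_route_ok(void *context)
{
	(void)context;
	return 0;
}

static int stub_route_fail(void *context)
{
	(void)context;
	return -1;
}

/*
 * Try each engine in the order given. Return the name of the first
 * engine that succeeds, or NULL if every engine in the chain failed.
 */
static const char *route_with_fallback(const struct engine *chain, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (chain[i].route && chain[i].route(chain[i].context) == 0)
			return chain[i].name;
	}
	return NULL;
}
```

Returning the name of the engine that actually routed also gives the caller something accurate to log. What to do when the whole chain fails (return NULL here) is a policy decision left open.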

Sasha

> The name isn't adjusted depending on
> the routing algorithm's success/failure.
> 
> There are several ways this could be fixed.  I decided the easiest was to
> add a new routed_name field + lock to struct osm_routing_engine, and
> set/use this new field accordingly.
> 
> Note that within osm_ucast_mgr_process(), there is a slight logic change
> from what was there before.  If the routing engine's call to
> build_lid_matrices() failed, I've changed the logic to not call the
> routing engine's ucast_build_fwd_tables() function.  This felt like the
> correct logic and seems to be fine given all the routing algorithms in
> OpenSM.  PLMK if there is some behavior subtlety I missed.
> 
> Thanks,
> Al
> -- 
> Albert Chu
> chu11 at llnl.gov
> 925-422-5311
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory

> From d83332b4c6cb1cdb0fc960808f9fa53615f61201 Mon Sep 17 00:00:00 2001
> From: Albert L. Chu <chu11 at llnl.gov>
> Date: Fri, 7 Dec 2007 13:44:04 -0800
> Subject: [PATCH] fix incorrect reporting of routing engine
> 
> 
> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>
> ---
>  opensm/include/opensm/osm_opensm.h |    9 +++++++
>  opensm/opensm/osm_console.c        |    6 +++-
>  opensm/opensm/osm_opensm.c         |    6 +++++
>  opensm/opensm/osm_ucast_mgr.c      |   43 ++++++++++++++++++++++++++----------
>  4 files changed, 50 insertions(+), 14 deletions(-)
> 
> diff --git a/opensm/include/opensm/osm_opensm.h b/opensm/include/opensm/osm_opensm.h
> index 1b5edb8..3fc70fd 100644
> --- a/opensm/include/opensm/osm_opensm.h
> +++ b/opensm/include/opensm/osm_opensm.h
> @@ -107,6 +107,8 @@ struct osm_routing_engine {
>  	int (*ucast_build_fwd_tables) (void *context);
>  	void (*ucast_dump_tables) (void *context);
>  	void (*delete) (void *context);
> +	const char *routed_name;
> +	cl_plock_t routed_name_lock;
>  };
>  /*
>  * FIELDS
> @@ -129,6 +131,13 @@ struct osm_routing_engine {
>  *	delete
>  *		The delete method, may be used for routing engine
>  *		internals cleanup.
> +*
> +*	routed_name
> +*		The routing engine name used for routing (for example,
> +*		the specified one failed and we used the default)
> +*
> +*	routed_name_lock
> +*		Shared lock guarding reads and writes to routed_name.
>  */
>  
>  typedef struct _osm_console_t {
> diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c
> index f669240..d8d16db 100644
> --- a/opensm/opensm/osm_console.c
> +++ b/opensm/opensm/osm_console.c
> @@ -380,9 +380,11 @@ static void print_status(osm_opensm_t * p_osm, FILE * out)
>  			sm_state_mgr_str(p_osm->sm.state_mgr.state));
>  		fprintf(out, "   SA State           : %s\n",
>  			sa_state_str(p_osm->sa.state));
> +                cl_plock_acquire(&p_osm->routing_engine.routed_name_lock);
>  		fprintf(out, "   Routing Engine     : %s\n",
> -			p_osm->routing_engine.name ? p_osm->routing_engine.
> -			name : "null (minhop)");
> +			p_osm->routing_engine.routed_name ? 
> +			p_osm->routing_engine.routed_name : "null (minhop)");
> +                cl_plock_release(&p_osm->routing_engine.routed_name_lock);
>  #ifdef ENABLE_OSM_PERF_MGR
>  		fprintf(out, "\n   PerfMgr state/sweep state : %s/%s\n",
>  			osm_perfmgr_get_state_str(&(p_osm->perfmgr)),
> diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
> index 26b9969..b55a638 100644
> --- a/opensm/opensm/osm_opensm.c
> +++ b/opensm/opensm/osm_opensm.c
> @@ -188,6 +188,8 @@ void osm_opensm_destroy(IN osm_opensm_t * const p_osm)
>  
>  	cl_plock_destroy(&p_osm->lock);
>  
> +	cl_plock_destroy(&p_osm->routing_engine.routed_name_lock);
> +
>  	osm_log_destroy(&p_osm->log);
>  }
>  
> @@ -224,6 +226,10 @@ osm_opensm_init(IN osm_opensm_t * const p_osm,
>  	if (status != IB_SUCCESS)
>  		goto Exit;
>  
> +	status = cl_plock_init(&p_osm->routing_engine.routed_name_lock);
> +	if (status != IB_SUCCESS)
> +		goto Exit;
> +
>  	if (p_opt->single_thread) {
>  		osm_log(&p_osm->log, OSM_LOG_INFO,
>  			"osm_opensm_init: Forcing single threaded dispatcher\n");
> diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
> index 8bb4739..fe15666 100644
> --- a/opensm/opensm/osm_ucast_mgr.c
> +++ b/opensm/opensm/osm_ucast_mgr.c
> @@ -769,6 +769,9 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
>  	struct osm_routing_engine *p_routing_eng;
>  	osm_signal_t signal = OSM_SIGNAL_DONE;
>  	cl_qmap_t *p_sw_guid_tbl;
> +        const char *routed_name = NULL;
> +        int blm = 0;
> +        int ubft = 0;
>  
>  	OSM_LOG_ENTER(p_mgr->p_log, osm_ucast_mgr_process);
>  
> @@ -789,23 +792,39 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
>  
>  	p_mgr->any_change = FALSE;
>  
> -	if (!p_routing_eng->build_lid_matrices ||
> -	    p_routing_eng->build_lid_matrices(p_routing_eng->context) != 0)
> -		osm_ucast_mgr_build_lid_matrices(p_mgr);
> -
> -	osm_log(p_mgr->p_log, OSM_LOG_INFO,
> -		"osm_ucast_mgr_process: "
> -		"%s tables configured on all switches\n",
> -		p_routing_eng->name ? p_routing_eng->name : "null (minhop)");
> +	if (p_routing_eng->build_lid_matrices) {
> +            blm = p_routing_eng->build_lid_matrices(p_routing_eng->context);
> +            if (blm)
> +                osm_ucast_mgr_build_lid_matrices(p_mgr);
> +        }
> +        else
> +            osm_ucast_mgr_build_lid_matrices(p_mgr);
>  
>  	/*
>  	   Now that the lid matrices have been built, we can
>  	   build and download the switch forwarding tables.
>  	 */
> -	if (!p_routing_eng->ucast_build_fwd_tables ||
> -	    p_routing_eng->ucast_build_fwd_tables(p_routing_eng->context))
> -		cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
> -				   p_mgr);
> +	if (!blm && p_routing_eng->ucast_build_fwd_tables) {
> +            ubft = p_routing_eng->ucast_build_fwd_tables(p_routing_eng->context);
> +            if (ubft)
> +                cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
> +                                   p_mgr);
> +        }
> +        else
> +            cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
> +                               p_mgr);
> +
> +        if (!blm && !ubft)
> +            routed_name = p_routing_eng->name;
> +
> +        CL_PLOCK_EXCL_ACQUIRE(&p_routing_eng->routed_name_lock);
> +        p_routing_eng->routed_name = routed_name;
> +        CL_PLOCK_RELEASE(&p_routing_eng->routed_name_lock);
> +
> +	osm_log(p_mgr->p_log, OSM_LOG_INFO,
> +		"osm_ucast_mgr_process: "
> +		"%s tables configured on all switches\n",
> +		routed_name ? routed_name : "null (minhop)");
>  
>  	if (p_mgr->any_change) {
>  		signal = OSM_SIGNAL_DONE_PENDING;
> -- 
> 1.5.1
> 


From rdreier at cisco.com  Sun Dec  9 15:22:39 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Sun, 09 Dec 2007 15:22:39 -0800
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
In-Reply-To: <200712071058.38416.arnd@arndb.de> (Arnd Bergmann's message of
	"Fri, 7 Dec 2007 10:58:37 +0100")
References: <200712061607.20004.fenkes@de.ibm.com>
	<200712061648.24806.arnd@arndb.de> <ada7ijrd6gy.fsf@cisco.com>
	<200712071058.38416.arnd@arndb.de>
Message-ID: <adaodcza1xc.fsf@cisco.com>

 > I think it needs some more inspection. The msleep in there is only called
 > for hcalls that return H_IS_LONG_BUSY(). In theory, you can call
 > ehca_plpar_hcall_norets() from inside an interrupt handler if the
 > hcall in question never returns long busy.

Fair enough... according to Documentation/infiniband/core_locking.txt,
the only driver methods that cannot sleep are:

    create_ah
    modify_ah
    query_ah
    destroy_ah
    bind_mw
    post_send
    post_recv
    poll_cq
    req_notify_cq
    map_phys_fmr

and I don't think ehca does an hcall from any of those.  Of course
there might be other driver-internal code paths that I don't know
about.  Maybe do a quick audit and then stick might_sleep() in the
hcall functions to catch any mistakes?
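
As a userspace analogue of that suggestion (the kernel's might_sleep() is a debugging annotation that warns, when the relevant debug option is enabled, if a function that may sleep is called from atomic context), a sketch might look like this. `in_atomic`, `might_sleep_check`, and `fake_hcall` are hypothetical names; none of this is ehca or kernel code.

```c
#include <stdbool.h>

/* Hypothetical flag standing in for the kernel's notion of atomic context. */
static bool in_atomic;

/* Counts violations; the kernel would print a warning instead. */
static unsigned int sleep_violations;

/* Userspace stand-in for might_sleep(): record calls made in atomic context. */
static void might_sleep_check(void)
{
	if (in_atomic)
		sleep_violations++;
}

/*
 * Stand-in for an hcall wrapper that may sleep on a "long busy" return.
 * The check sits at the top so that every caller is covered, including
 * callers that never actually hit the long-busy path.
 */
static int fake_hcall(void)
{
	might_sleep_check();
	/* ... issue the call, sleep and retry on "long busy" ... */
	return 0;
}
```

The point of putting the check at the top is that a bad caller is flagged on every call, not only on the rare occasions when the sleep actually happens.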

 - R.


From chu11 at llnl.gov  Sun Dec  9 16:31:54 2007
From: chu11 at llnl.gov (Albert Chu)
Date: Sun, 9 Dec 2007 16:31:54 -0800 (PST)
Subject: [ofa-general] Re: [PATCH 1/3] OpenSM: Add null dereference checks
In-Reply-To: <20071209192747.GH708@sashak.voltaire.com>
References: <1197075265.29314.131.camel@cardanus.llnl.gov>
	<20071209192747.GH708@sashak.voltaire.com>
Message-ID: <33238.128.15.244.71.1197246714.squirrel@127.0.0.1>


> Hi Al,
>
> On 16:54 Fri 07 Dec     , Al Chu wrote:
>> Hey Sasha,
>>
>> Nothing fancy.  Just noticed the check is done in the ftree equivalent
>> destroy function so I figured it should be in the others.
>
> Is it possible/legal usage to have context = NULL in those destructors?
> If not, I don't think we need such checks.

I don't think any current code path in OpenSM allows it, but I felt it
would be prudent to add the checks as a safeguard against future code
changes (one of my initial ideas for fixing the routing engine reporting
issue required it).  The ftree destroy function already has the same check.
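
The pattern in question, reduced to a minimal sketch: a destroy function that tolerates a NULL context in the same way free(NULL) is defined to be a no-op. `struct demo_ctx` and `demo_ctx_destroy` are hypothetical names for illustration, not the lash or updn code.

```c
#include <stdlib.h>

/* Hypothetical context type, for illustration only. */
struct demo_ctx {
	int *table;
};

/*
 * Destroy function that accepts NULL, so callers on error paths do not
 * need to know whether the context was ever created.
 */
static void demo_ctx_destroy(struct demo_ctx *ctx)
{
	if (!ctx)
		return;
	free(ctx->table);	/* free(NULL) is a no-op if table was never set */
	free(ctx);
}
```

The trade-off raised in the thread still applies: the check costs nothing at runtime, but it can also hide a caller that passes NULL by mistake.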

Al

> Sasha
>
>>
>> Al
>>
>> --
>> Albert Chu
>> chu11 at llnl.gov
>> 925-422-5311
>> Computer Scientist
>> High Performance Systems Division
>> Lawrence Livermore National Laboratory
>
>> From c94361165ab2a3c6e8dadb6596d97d79e36bb1f6 Mon Sep 17 00:00:00 2001
>> From: Albert L. Chu <chu11 at llnl.gov>
>> Date: Fri, 7 Dec 2007 13:43:15 -0800
>> Subject: [PATCH] add null check in context destroy functions
>>
>>
>> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>
>> ---
>>  opensm/opensm/osm_ucast_lash.c |    2 ++
>>  opensm/opensm/osm_ucast_updn.c |    3 +++
>>  2 files changed, 5 insertions(+), 0 deletions(-)
>>
>> diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c
>> index 5e7716e..bdcd2d1 100644
>> --- a/opensm/opensm/osm_ucast_lash.c
>> +++ b/opensm/opensm/osm_ucast_lash.c
>> @@ -1469,6 +1469,8 @@ static lash_t *lash_create(osm_opensm_t * p_osm)
>>  static void lash_delete(void *context)
>>  {
>>  	lash_t *p_lash = context;
>> +        if (!context)
>> +                return;
>>  	if (p_lash->switches) {
>>  		unsigned id;
>>  		for (id = 0; ((int)id) < p_lash->num_switches; id++)
>> diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_ucast_updn.c
>> index 0b7b1a9..24c9fe3 100644
>> --- a/opensm/opensm/osm_ucast_updn.c
>> +++ b/opensm/opensm/osm_ucast_updn.c
>> @@ -234,6 +234,9 @@ static void updn_destroy(IN updn_t * const p_updn)
>>  {
>>  	uint64_t *p_guid_list_item;
>>
>> +        if (!p_updn)
>> +                return;
>> +
>>  	/* free the array of guids */
>>  	if (p_updn->updn_ucast_reg_inputs.guid_list)
>>  		free(p_updn->updn_ucast_reg_inputs.guid_list);
>> --
>> 1.5.1
>>
>


-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory


From chu11 at llnl.gov  Sun Dec  9 17:12:23 2007
From: chu11 at llnl.gov (Albert Chu)
Date: Sun, 9 Dec 2007 17:12:23 -0800 (PST)
Subject: [ofa-general] Re: [PATCH 2/3] OpenSM: Fix incorrect reporting of
 routing engine/algorithm used
In-Reply-To: <20071209204822.GI708@sashak.voltaire.com>
References: <1197075342.29314.133.camel@cardanus.llnl.gov>
	<20071209204822.GI708@sashak.voltaire.com>
Message-ID: <33254.128.15.244.71.1197249143.squirrel@127.0.0.1>

Hey Sasha,

> Hi Al,
>
> On 16:55 Fri 07 Dec     , Al Chu wrote:
>>
>> I noticed that when a routing algorithm failed and defaulted back to
>> 'minhop', the logs and the console did not report this change.  This is
>> because most of that code outputs the routing algorithm name that was
>> stored during configuration/setup.
>
> Right, as well as routing engine methods. So in the next cycle (sweep) this
> routing engine method will run again and could be more successful.
> So we cannot say that the whole routing engine falls back; only a particular
> callback in this particular cycle does.
>
> So I'm not sure that just "renaming" helps a lot.

Maybe I'm misunderstanding your comments or maybe I didn't explain the
issue and patch well enough.  The issue I'm concerned with is that the
logs and the console will still report that a routing engine/algorithm
succeeded in routing the subnet when it failed.  The patch won't overwrite
any routing engine methods, so during a later attempt to route the
subnet the originally configured routing engine will try again.  I store
the name of whichever algorithm last succeeded in 'routed_name' for
logging/output/etc.
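
A simplified model of that bookkeeping, for illustration only: the configured name and callbacks are never overwritten, and a separate field records what actually routed the last sweep. `struct demo_engine`, `demo_route`, and the stub callbacks are hypothetical; the sketch leaves out the lock and the real default (minhop) routing work.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a routing engine. */
struct demo_engine {
	const char *name;		/* configured name, never overwritten */
	const char *routed_name;	/* what routed the last sweep; NULL = default */
	int (*build_lid_matrices)(void *context);
	int (*build_fwd_tables)(void *context);
	void *context;
};

/* Stub callbacks returning 0 on success, non-zero on failure. */
static int step_ok(void *context)
{
	(void)context;
	return 0;
}

static int step_fail(void *context)
{
	(void)context;
	return 1;
}

/*
 * Run one sweep. The engine's name is reported only if every callback it
 * provides succeeded; otherwise the default is reported. The callbacks
 * stay in place, so the next sweep tries the configured engine again.
 */
static const char *demo_route(struct demo_engine *e)
{
	int blm = 0, ubft = 0;

	if (e->build_lid_matrices)
		blm = e->build_lid_matrices(e->context);
	/* As in the patch: skip the second step if the first one failed. */
	if (!blm && e->build_fwd_tables)
		ubft = e->build_fwd_tables(e->context);

	e->routed_name = (!blm && !ubft) ? e->name : NULL;
	return e->routed_name ? e->routed_name : "null (minhop)";
}
```

Because `name` and `routed_name` are separate, a failure in one sweep changes only what is reported, not what is attempted next time.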

Naturally, I don't know OpenSM as well as others.  I could be confused about
how my patch affects other parts of OpenSM.

> I think a better solution
> in general would be to define something like routing engine chains, so
> that a user could specify which routing engines to use and in which order
> they (whole engines) should fall back. Something like:
>
> 	-R ftree updn minhops
>
> could mean: try ftree and, if it fails, switch to updn, etc.

I think this is a great idea.  But I don't think it would solve the
generic issue that other parts of OpenSM (such as the console) need to
know which routing algorithm actually routed the subnet for correct
output.

I like this idea as a whole.  Are you suggesting that, with this feature, if
the user does not specify a backup routing algorithm, OpenSM would
fail/exit when the first routing engine fails?  That is actually the behavior
we would prefer at LLNL, although I think the majority of the user
community may not like it, at least not as the default behavior.

Al

> Sasha
>
>> The name isn't adjusted depending on
>> the routing algorithm's success/failure.
>>
>> There are several ways this could be fixed.  I decided the easiest was to
>> add a new routed_name field + lock to struct osm_routing_engine, and
>> set/use this new field accordingly.
>>
>> Note that within osm_ucast_mgr_process(), there is a slight logic change
>> from what was there before.  If the routing engine's call to
>> build_lid_matrices() failed, I've changed the logic to not call the
>> routing engine's ucast_build_fwd_tables() function.  This felt like the
>> correct logic and seems to be fine given all the routing algorithms in
>> OpenSM.  PLMK if there is some behavior subtlety I missed.
>>
>> Thanks,
>> Al
>> --
>> Albert Chu
>> chu11 at llnl.gov
>> 925-422-5311
>> Computer Scientist
>> High Performance Systems Division
>> Lawrence Livermore National Laboratory
>
>> From d83332b4c6cb1cdb0fc960808f9fa53615f61201 Mon Sep 17 00:00:00 2001
>> From: Albert L. Chu <chu11 at llnl.gov>
>> Date: Fri, 7 Dec 2007 13:44:04 -0800
>> Subject: [PATCH] fix incorrect reporting of routing engine
>>
>>
>> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>
>> ---
>>  opensm/include/opensm/osm_opensm.h |    9 +++++++
>>  opensm/opensm/osm_console.c        |    6 +++-
>>  opensm/opensm/osm_opensm.c         |    6 +++++
>>  opensm/opensm/osm_ucast_mgr.c      |   43 ++++++++++++++++++++++++++----------
>>  4 files changed, 50 insertions(+), 14 deletions(-)
>>
>> diff --git a/opensm/include/opensm/osm_opensm.h b/opensm/include/opensm/osm_opensm.h
>> index 1b5edb8..3fc70fd 100644
>> --- a/opensm/include/opensm/osm_opensm.h
>> +++ b/opensm/include/opensm/osm_opensm.h
>> @@ -107,6 +107,8 @@ struct osm_routing_engine {
>>  	int (*ucast_build_fwd_tables) (void *context);
>>  	void (*ucast_dump_tables) (void *context);
>>  	void (*delete) (void *context);
>> +	const char *routed_name;
>> +	cl_plock_t routed_name_lock;
>>  };
>>  /*
>>  * FIELDS
>> @@ -129,6 +131,13 @@ struct osm_routing_engine {
>>  *	delete
>>  *		The delete method, may be used for routing engine
>>  *		internals cleanup.
>> +*
>> +*	routed_name
>> +*		The routing engine name used for routing (for example,
>> +*		the specified one failed and we used the default)
>> +*
>> +*	routed_name_lock
>> +*		Shared lock guarding reads and writes to routed_name.
>>  */
>>
>>  typedef struct _osm_console_t {
>> diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c
>> index f669240..d8d16db 100644
>> --- a/opensm/opensm/osm_console.c
>> +++ b/opensm/opensm/osm_console.c
>> @@ -380,9 +380,11 @@ static void print_status(osm_opensm_t * p_osm, FILE * out)
>>  			sm_state_mgr_str(p_osm->sm.state_mgr.state));
>>  		fprintf(out, "   SA State           : %s\n",
>>  			sa_state_str(p_osm->sa.state));
>> +		cl_plock_acquire(&p_osm->routing_engine.routed_name_lock);
>>  		fprintf(out, "   Routing Engine     : %s\n",
>> -			p_osm->routing_engine.name ? p_osm->routing_engine.
>> -			name : "null (minhop)");
>> +			p_osm->routing_engine.routed_name ?
>> +			p_osm->routing_engine.routed_name : "null (minhop)");
>> +		cl_plock_release(&p_osm->routing_engine.routed_name_lock);
>>  #ifdef ENABLE_OSM_PERF_MGR
>>  		fprintf(out, "\n   PerfMgr state/sweep state : %s/%s\n",
>>  			osm_perfmgr_get_state_str(&(p_osm->perfmgr)),
>> diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
>> index 26b9969..b55a638 100644
>> --- a/opensm/opensm/osm_opensm.c
>> +++ b/opensm/opensm/osm_opensm.c
>> @@ -188,6 +188,8 @@ void osm_opensm_destroy(IN osm_opensm_t * const p_osm)
>>
>>  	cl_plock_destroy(&p_osm->lock);
>>
>> +	cl_plock_destroy(&p_osm->routing_engine.routed_name_lock);
>> +
>>  	osm_log_destroy(&p_osm->log);
>>  }
>>
>> @@ -224,6 +226,10 @@ osm_opensm_init(IN osm_opensm_t * const p_osm,
>>  	if (status != IB_SUCCESS)
>>  		goto Exit;
>>
>> +	status = cl_plock_init(&p_osm->routing_engine.routed_name_lock);
>> +	if (status != IB_SUCCESS)
>> +		goto Exit;
>> +
>>  	if (p_opt->single_thread) {
>>  		osm_log(&p_osm->log, OSM_LOG_INFO,
>>  			"osm_opensm_init: Forcing single threaded dispatcher\n");
>> diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
>> index 8bb4739..fe15666 100644
>> --- a/opensm/opensm/osm_ucast_mgr.c
>> +++ b/opensm/opensm/osm_ucast_mgr.c
>> @@ -769,6 +769,9 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
>>  	struct osm_routing_engine *p_routing_eng;
>>  	osm_signal_t signal = OSM_SIGNAL_DONE;
>>  	cl_qmap_t *p_sw_guid_tbl;
>> +        const char *routed_name = NULL;
>> +        int blm = 0;
>> +        int ubft = 0;
>>
>>  	OSM_LOG_ENTER(p_mgr->p_log, osm_ucast_mgr_process);
>>
>> @@ -789,23 +792,39 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
>>
>>  	p_mgr->any_change = FALSE;
>>
>> -	if (!p_routing_eng->build_lid_matrices ||
>> -	    p_routing_eng->build_lid_matrices(p_routing_eng->context) != 0)
>> -		osm_ucast_mgr_build_lid_matrices(p_mgr);
>> -
>> -	osm_log(p_mgr->p_log, OSM_LOG_INFO,
>> -		"osm_ucast_mgr_process: "
>> -		"%s tables configured on all switches\n",
>> -		p_routing_eng->name ? p_routing_eng->name : "null (minhop)");
>> +	if (p_routing_eng->build_lid_matrices) {
>> +            blm = p_routing_eng->build_lid_matrices(p_routing_eng->context);
>> +            if (blm)
>> +                osm_ucast_mgr_build_lid_matrices(p_mgr);
>> +        }
>> +        else
>> +            osm_ucast_mgr_build_lid_matrices(p_mgr);
>>
>>  	/*
>>  	   Now that the lid matrices have been built, we can
>>  	   build and download the switch forwarding tables.
>>  	 */
>> -	if (!p_routing_eng->ucast_build_fwd_tables ||
>> -	    p_routing_eng->ucast_build_fwd_tables(p_routing_eng->context))
>> -		cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
>> -				   p_mgr);
>> +	if (!blm && p_routing_eng->ucast_build_fwd_tables) {
>> +            ubft = p_routing_eng->ucast_build_fwd_tables(p_routing_eng->context);
>> +            if (ubft)
>> +                cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
>> +                                   p_mgr);
>> +        }
>> +        else
>> +            cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
>> +                               p_mgr);
>> +
>> +        if (!blm && !ubft)
>> +            routed_name = p_routing_eng->name;
>> +
>> +        CL_PLOCK_EXCL_ACQUIRE(&p_routing_eng->routed_name_lock);
>> +        p_routing_eng->routed_name = routed_name;
>> +        CL_PLOCK_RELEASE(&p_routing_eng->routed_name_lock);
>> +
>> +	osm_log(p_mgr->p_log, OSM_LOG_INFO,
>> +		"osm_ucast_mgr_process: "
>> +		"%s tables configured on all switches\n",
>> +		routed_name ? routed_name : "null (minhop)");
>>
>>  	if (p_mgr->any_change) {
>>  		signal = OSM_SIGNAL_DONE_PENDING;
>> --
>> 1.5.1
>>
>


-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory


From kliteyn at mellanox.co.il  Sun Dec  9 21:08:34 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 10 Dec 2007 07:08:34 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-10:normal completion
Message-ID: <MTLEXCH01AmEaJswFCH00000197@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-09
OpenSM git rev = Sat_Dec_8_00:46:58_2007 [139abe25ada3bae14b1173fb2842b1fe1c7d171a]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=520  Fail=0
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From dotanb at dev.mellanox.co.il  Mon Dec 10 01:13:34 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Mon, 10 Dec 2007 11:13:34 +0200
Subject: [ofa-general] Very high CPU load during RDMA communication
In-Reply-To: <002d01c838da$2e3aa380$05c8a8c0@DIEGO>
References: <e2e108260712070543s79450802ue0deda6446c3120e@mail.gmail.com>
	<002d01c838da$2e3aa380$05c8a8c0@DIEGO>
Message-ID: <475D033E.8030507@dev.mellanox.co.il>

Diego Guella wrote:
>
> ----- Original Message -----
>> A newbie question about  InfiniBand: while running an RDMA bandwidth
>> test I noticed that the RDMA communication occupied 99% CPU time,
>> which is much more than I expected for InfiniBand. Is this normal or
>> should I check my setup ?
>>
>> How the test was run:
>> (host 1) ib_rdma_bw -s 104857600
>> (host 2) time ib_rdma_bw -s 104857600 192.168.64.7
>
> I think it's normal, I see the same behavior here.
> If you take a look at ib_rdma_bw code, you can see that the program 
> waits in a while() loop, polling the CQ for the WC to appear.
> I think it's coded that way to get the WC as soon as possible and 
> report a more precise time (that is used after to compute BW).
>
> If you don't want that behavior in your application you could use 
> event notification for the CQ.
If one wants to reduce CPU usage (at the expense of latency), one can
work with CQ events.


Dotan


From jackm at dev.mellanox.co.il  Sun Dec  9 19:25:23 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 10 Dec 2007 05:25:23 +0200
Subject: [ofa-general] [PATCH] mlx4_core: fix max_eqs masking in
	QUERY_DEV_CAP
Message-ID: <200712100525.23484.jackm@dev.mellanox.co.il>

mlx4_core: fix max_eq's read from FW in QUERY_DEV_CAP

log_max_eqs is a 4-bit field, not a 3-bit field in the FW
QUERY_DEV_CAP command (according to the PRM).

Found by Yossi Leybovitch of Mellanox
Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>

diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c
index 5064873..535a446 100644
--- a/drivers/net/mlx4/fw.c
+++ b/drivers/net/mlx4/fw.c
@@ -202,7 +202,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 	MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_EQ_OFFSET);
 	dev_cap->reserved_eqs = 1 << (field & 0xf);
 	MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_EQ_OFFSET);
-	dev_cap->max_eqs = 1 << (field & 0x7);
+	dev_cap->max_eqs = 1 << (field & 0xf);
 	MLX4_GET(field, outbox, QUERY_DEV_CAP_RSVD_MTT_OFFSET);
 	dev_cap->reserved_mtts = 1 << (field >> 4);
 	MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_MRW_SZ_OFFSET);
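
To illustrate why the mask width matters, here is a minimal standalone
sketch. The helper name `decode_log2` and the sample byte value 0x09 are
assumptions made up for this example; they are not taken from the driver or
from any real firmware output.

```c
/*
 * Decode a log2-encoded count held in the low bits of a capability byte.
 * 'mask' selects how many low bits belong to the field.
 */
unsigned int decode_log2(unsigned char field, unsigned char mask)
{
	return 1u << (field & mask);
}
```

For a hypothetical field value of 0x09, the 3-bit mask 0x7 keeps only the
low bits 001 and yields 1 << 1 = 2, while the 4-bit mask 0xf keeps 1001 and
yields 1 << 9 = 512. For values below 8 the two masks agree, which is why a
too-narrow mask can go unnoticed.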


From vlad at lists.openfabrics.org  Mon Dec 10 03:12:40 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Mon, 10 Dec 2007 03:12:40 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071210-0200 daily build status
Message-ID: <20071210111240.ACCBAE60A11@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on x86_64 with linux-2.6.18
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.17
Passed on powerpc with linux-2.6.13
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.19
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.15
Passed on ppc64 with linux-2.6.15
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.14
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.13
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.22
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.18-1.2798.fc6

Failed:


From fenkes at de.ibm.com  Mon Dec 10 03:20:57 2007
From: fenkes at de.ibm.com (Joachim Fenkes)
Date: Mon, 10 Dec 2007 12:20:57 +0100
Subject: [ofa-general] [PATCH] IB/ehca: Return correct #SGEs for SRQ
Message-ID: <200712101220.57876.fenkes@de.ibm.com>

Firmware would round up the number of SGEs to four, because the WQE
structure holds four SGEs. For SRQ, only three are supported, so return a
fixed value instead.

Signed-off-by: Joachim Fenkes <fenkes at de.ibm.com>
---

The patch will apply cleanly on top of Roland's git. Please review and apply
for 2.6.24 -- Thanks!

 drivers/infiniband/hw/ehca/ehca_qp.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index dd12668..eff5fb5 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -838,7 +838,7 @@ struct ib_srq *ehca_create_srq(struct ib_pd *pd,
 
 	/* copy back return values */
 	srq_init_attr->attr.max_wr = qp_init_attr.cap.max_recv_wr;
-	srq_init_attr->attr.max_sge = qp_init_attr.cap.max_recv_sge;
+	srq_init_attr->attr.max_sge = 3;
 
 	/* drive SRQ into RTR state */
 	mqpcb = ehca_alloc_fw_ctrlblock(GFP_KERNEL);
@@ -1750,7 +1750,7 @@ int ehca_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr)
 	}
 
 	srq_attr->max_wr = qpcb->max_nr_outst_recv_wr - 1;
-	srq_attr->max_sge = qpcb->actual_nr_sges_in_rq_wqe;
+	srq_attr->max_sge = 3;
 	srq_attr->srq_limit = EHCA_BMASK_GET(
 		MQPCB_CURR_SRQ_LIMIT, qpcb->curr_srq_limit);
 
-- 
1.5.2


From hrosenstock at xsigo.com  Mon Dec 10 04:23:40 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 10 Dec 2007 04:23:40 -0800
Subject: [ofa-general] opensm/libvendor/osm_vendor_ibumad_sa.c: In
	__osmv_sa_mad_rcv_cb, handle attribute offset of 0
Message-ID: <1197289420.8114.384.camel@hrosenstock-ws.xsigo.com>

opensm/libvendor/osm_vendor_ibumad_sa.c: In __osmv_sa_mad_rcv_cb, handle
attribute offset of 0 which is valid at IBA 1.2.1 when 0 attributes are
returned

Signed-off-by: Hal Rosenstock <hal at xsigo.com>

diff --git a/opensm/libvendor/osm_vendor_ibumad_sa.c b/opensm/libvendor/osm_vendor_ibumad_sa.c
index b06cc69..5ce8fad 100644
--- a/opensm/libvendor/osm_vendor_ibumad_sa.c
+++ b/opensm/libvendor/osm_vendor_ibumad_sa.c
@@ -137,20 +137,23 @@ __osmv_sa_mad_rcv_cb(IN osm_madw_t * p_madw,
 			else
 				query_res.result_cnt = 1;
 #else
-			/* we used the offset value to calculate the number of
-			   records in here */
-			query_res.result_cnt = (uintn_t)
-			    ((p_madw->mad_size - IB_SA_MAD_HDR_SIZE) /
-			     ib_get_attr_size(p_sa_mad->attr_offset));
-			osm_log(p_bind->p_log, OSM_LOG_DEBUG,
-				"__osmv_sa_mad_rcv_cb: Count = %u = %zu / %u (%zu)\n",
-				query_res.result_cnt,
-				p_madw->mad_size - IB_SA_MAD_HDR_SIZE,
-				ib_get_attr_size(p_sa_mad->attr_offset),
-				(p_madw->mad_size -
-				 IB_SA_MAD_HDR_SIZE) %
-				ib_get_attr_size(p_sa_mad->attr_offset)
-			    );
+			if (ib_get_attr_size(p_sa_mad->attr_offset)) {
+				/* we used the offset value to calculate the
+				   number of records in here */
+				query_res.result_cnt = (uintn_t)
+				    ((p_madw->mad_size - IB_SA_MAD_HDR_SIZE) /
+				     ib_get_attr_size(p_sa_mad->attr_offset));
+				osm_log(p_bind->p_log, OSM_LOG_DEBUG,
+					"__osmv_sa_mad_rcv_cb: Count = %u = %zu / %u (%zu)\n",
+					query_res.result_cnt,
+					p_madw->mad_size - IB_SA_MAD_HDR_SIZE,
+					ib_get_attr_size(p_sa_mad->attr_offset),
+					(p_madw->mad_size -
+					 IB_SA_MAD_HDR_SIZE) %
+					ib_get_attr_size(p_sa_mad->attr_offset)
+			    	);
+			} else
+				query_res.result_cnt = 0;
 #endif
 		}
 	}
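
The guard in this patch boils down to the following standalone sketch. The
function name `sa_record_count` and its parameters are hypothetical; in the
real code the payload length is `p_madw->mad_size - IB_SA_MAD_HDR_SIZE` and
the record size comes from `ib_get_attr_size(p_sa_mad->attr_offset)`.

```c
#include <stddef.h>

/*
 * Number of attribute records in an SA response payload.
 * payload_len: bytes following the SA MAD header.
 * attr_size:   size of one record in bytes; 0 is legal when the
 *              response carries no records.
 */
size_t sa_record_count(size_t payload_len, size_t attr_size)
{
	if (attr_size == 0)
		return 0;	/* nothing returned; avoid dividing by zero */
	return payload_len / attr_size;
}
```

With a record size of 0 the count is reported as 0 instead of triggering a
division by zero; otherwise the result is the same integer division as
before.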


From vlad at dev.mellanox.co.il  Mon Dec 10 06:20:57 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Mon, 10 Dec 2007 16:20:57 +0200
Subject: [ofa-general] Re: [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes and 5.0
	firmware support
In-Reply-To: <475B1E7D.5080607@opengridcomputing.com>
References: <47501F09.4060800@opengridcomputing.com>
	<475B1E7D.5080607@opengridcomputing.com>
Message-ID: <475D4B49.4090709@dev.mellanox.co.il>

Hi Steve,
Sorry, I missed your libcxgb3 updates for ofed-1.2.5.
It is updated now.
There is an OFED-1.2.5.4-20071210-0614 build that includes the updated libcxgb3 library.

In any case we are going to release OFED-1.2.5.5 in a few days.

Regards,
Vladimir

Steve Wise wrote:
> Vlad, it looks like you didn't pull in version 1.1.0 of libcxgb3 for 
> ofed-1.2.5?
> 
> Right now the ofed-1.2.5.4 is broken from chelsio's perspective because 
> the kernel drivers require 5.0 firmware, but the library doesn't have 
> 5.0 firmware support.
> 
> Can you please pull in 1.1.0 of libcxgb3 and crank a new ofed-1.2.5.4 
> release?
> 
> Pull from:
> 
> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
> 
> 
> Thanks,
> 
> Steve.
> 
> 
> Steve Wise wrote:
>> Vlad, please pull cxgb3 fixes for ofed-1.2.5 from:
>>
>> git://git.openfabrics.org/~swise/ofed-1.2.5 stevo
>>
>> These are cxgb3 bug fixes and PPC64 additions that we need for 
>> ofed-1.2.5  (stay tuned for ofed-1.3 patches soon).
>>
>> The patches are all accepted upstream and were posted here:
>>
>> http://www.spinics.net/lists/netdev/msg47492.html
>>
>> and here:
>>
>> http://www.spinics.net/lists/netdev/msg48240.html
>>
>>
>> Also, please pull version 1.1.0 of libcxgb3 from:
>>
>> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
>>
>> The library and drivers need to be included together as they are both 
>> needed to support the chelsio 5.0 firmware.
>>
>> Alsoalso: After you integrate these, can you crank a daily 
>> OFED-1.2.5.3 build including all this?
>>
>>
>> Thanks,
>>
>> Steve.
>>
> 


From xhejtman at ics.muni.cz  Mon Dec 10 06:34:40 2007
From: xhejtman at ics.muni.cz (Lukas Hejtmanek)
Date: Mon, 10 Dec 2007 15:34:40 +0100
Subject: [ofa-general] MTHCA driver from OFED 1.3a package
In-Reply-To: <adasl2nlovv.fsf@cisco.com>
References: <20071122140554.GB13609@ics.muni.cz> <adafxyvtixw.fsf@cisco.com>
	<20071124223117.GA4265@ics.muni.cz>
	<15ddcffd0711260322i6fd82fd6r40e4362184a5b9b7@mail.gmail.com>
	<20071126131637.GC4296@ics.muni.cz>
	<15ddcffd0711260537h633c2e6j9b374b2c9c06b439@mail.gmail.com>
	<20071129140227.GF4422@ics.muni.cz> <adaeje9m2c9.fsf@cisco.com>
	<20071130121928.GB4259@ics.muni.cz> <adasl2nlovv.fsf@cisco.com>
Message-ID: <20071210143440.GA4157@ics.muni.cz>

On Fri, Nov 30, 2007 at 07:44:04AM -0800, Roland Dreier wrote:
>  > Fatal DMA error! Please use 'swiotlb=force'
>  > ----------- cut here  --------- please bite here  ---------
>  > Kernel BUG at arch/x86_64/kernel/../../i386/kernel/pci-dma-xen.c:333
> 
> What is this bug being caused by?  That is, what is line 333 of
> pci-dma-xen.c in your source tree?

This is caused by incorrect use of the DMA API (or an internal Xen error):
basically, the bounce buffers do not contain the index of the page used by
map_page or sync_page.

Yes.

Btw, what does this error message mean?
[516793.417451] ib_mthca 0000:08:00.0: Catastrophic error detected: internal error
[516793.417465] ib_mthca 0000:08:00.0:   buf[00]: 0012f6f8
[516793.417469] ib_mthca 0000:08:00.0:   buf[01]: 00000000
[516793.417472] ib_mthca 0000:08:00.0:   buf[02]: 00000000
[516793.417475] ib_mthca 0000:08:00.0:   buf[03]: 00000000
[516793.417478] ib_mthca 0000:08:00.0:   buf[04]: 00000000
[516793.417481] ib_mthca 0000:08:00.0:   buf[05]: 0012f6dc
[516793.417483] ib_mthca 0000:08:00.0:   buf[06]: 001b3658
[516793.417486] ib_mthca 0000:08:00.0:   buf[07]: 00000000
[516793.417489] ib_mthca 0000:08:00.0:   buf[08]: 00000000
[516793.417492] ib_mthca 0000:08:00.0:   buf[09]: 00000000
[516793.417495] ib_mthca 0000:08:00.0:   buf[0a]: 00000000
[516793.417499] ib_mthca 0000:08:00.0:   buf[0b]: 00000000
[516793.417502] ib_mthca 0000:08:00.0:   buf[0c]: 00000000
[516793.417505] ib_mthca 0000:08:00.0:   buf[0d]: 00000000
[516793.417508] ib_mthca 0000:08:00.0:   buf[0e]: 00000000
[516793.417511] ib_mthca 0000:08:00.0:   buf[0f]: 00000000


-- 
Lukáš Hejtmánek


From swise at opengridcomputing.com  Mon Dec 10 06:58:27 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Mon, 10 Dec 2007 08:58:27 -0600
Subject: [ofa-general] Re: [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes and 5.0
	firmware support
In-Reply-To: <475D4B49.4090709@dev.mellanox.co.il>
References: <47501F09.4060800@opengridcomputing.com>
	<475B1E7D.5080607@opengridcomputing.com>
	<475D4B49.4090709@dev.mellanox.co.il>
Message-ID: <475D5413.8080205@opengridcomputing.com>

Great, thanks!


Steve.


Vladimir Sokolovsky wrote:
> Hi Steve,
> Sorry, I missed your libcxgb3 updates for ofed-1.2.5.
> It is updated now.
> There is OFED-1.2.5.4-20071210-0614 build which includes updated 
> libcxgb3 library.
> 
> In any case we are going to release OFED-1.2.5.5 in a few days.
> 
> Regards,
> Vladimir
> 
> Steve Wise wrote:
>> Vlad, it looks like you didn't pull in version 1.1.0 of libcxgb3 for 
>> ofed-1.2.5?
>>
>> Right now the ofed-1.2.5.4 is broken from chelsio's perspective 
>> because the kernel drivers require 5.0 firmware, but the library 
>> doesn't have 5.0 firmware support.
>>
>> Can you please pull in 1.1.0 of libcxgb3 and crank a new ofed-1.2.5.4 
>> release?
>>
>> Pull from:
>>
>> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
>>
>>
>> Thanks,
>>
>> Steve.
>>
>>
>> Steve Wise wrote:
>>> Vlad, please pull cxgb3 fixes for ofed-1.2.5 from:
>>>
>>> git://git.openfabrics.org/~swise/ofed-1.2.5 stevo
>>>
>>> These are cxgb3 bug fixes and PPC64 additions that we need for 
>>> ofed-1.2.5  (stay tuned for ofed-1.3 patches soon).
>>>
>>> The patches are all accepted upstream and were posted here:
>>>
>>> http://www.spinics.net/lists/netdev/msg47492.html
>>>
>>> and here:
>>>
>>> http://www.spinics.net/lists/netdev/msg48240.html
>>>
>>>
>>> Also, please pull version 1.1.0 of libcxgb3 from:
>>>
>>> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
>>>
>>> The library and drivers need to be included together as they are both 
>>> needed to support the chelsio 5.0 firmware.
>>>
>>> Alsoalso: After you integrate these, can you crank a daily 
>>> OFED-1.2.5.3 build including all this?
>>>
>>>
>>> Thanks,
>>>
>>> Steve.
>>>
>>


From Ashish.Batwara at lsi.com  Mon Dec 10 07:08:47 2007
From: Ashish.Batwara at lsi.com (Batwara, Ashish)
Date: Mon, 10 Dec 2007 08:08:47 -0700
Subject: [ofa-general] Single SRP request for 4MB using Indirect Buffer
	Descriptor
Message-ID: <01B9E81EECACE94DBBD0A556E768FB8A01E8FA73@NAMAIL2.ad.lsil.com>

Hi,

We are looking for help sending a single SRP request from the initiator
using an Indirect Buffer Memory descriptor. No matter how large an I/O we
issue, the largest SRP request we see is 2MB. We want a single SRP request
to carry 4MB.

We are using OFED-1.2 at initiator.

 
Thanks,
Ashish

Best Regards
=================
Ashish Batwara, PMP | Firmware Architect | Mobile: +1 316 253 9784 |
email: ashish.batwara at lsi.com

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071210/60438513/attachment.html>

From sweitzen at cisco.com  Mon Dec 10 08:34:03 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Mon, 10 Dec 2007 08:34:03 -0800
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <15ddcffd0712081404l110c2e74q3f15a3f01a6200b9@mail.gmail.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96E64@xmb-sjc-216.amer.cisco.com>
	<15ddcffd0712061226l42cd5ed5ne3b36a3198ba9eaa@mail.gmail.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96F7A@xmb-sjc-216.amer.cisco.com>
	<15ddcffd0712081404l110c2e74q3f15a3f01a6200b9@mail.gmail.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A977DA@xmb-sjc-216.amer.cisco.com>


> > I would prefer to leave openib.conf alone.  The old way and 
> new way are
> > not mutually exclusive, are they?
> 
> Indeed the old and the new ways are not mutually exclusive, however,
> as I see it, the name of the game here is why add noise? over Ethernet
> no one would ever never consider to add a weird  script that does not
> use the standard tools to configure networking/bonding/etc. For some
> reason some people that run this thing called OFED neglect that simple
> everyday life fact so they (we, I should say, as at least both me and
> you are signed on this) invent this and that mechanisms.

OK, how about if OFED continues to support the old way to configure
IPoIB bonding, but we mark it as obsolete?

> In other word, the long answer is that OFED should go home,  the
> openibd service should go home, etc, along with this thing of
> configuring bonding through non standard means and the short answer is
> that there no point in configuring bonding through non standard means.

For the foreseeable future OFED seems to me to be the fastest way to get
new IB hardware support (for example, ConnectX) into customers' hands.

Scott


From sweitzen at cisco.com  Mon Dec 10 08:39:41 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Mon, 10 Dec 2007 08:39:41 -0800
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <15ddcffd0712081409v17f239d6sd6b6952b8a269db3@mail.gmail.com>
References: <4756E743.3040900@mellanox.com>
	<4757A980.2030403@voltaire.com><47586FD5.1040602@mellanox.com>
	<15ddcffd0712081409v17f239d6sd6b6952b8a269db3@mail.gmail.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A977E3@xmb-sjc-216.amer.cisco.com>

> > > The question on usage case of bonding over separate 
> fabrics have been
> > > brought to me several times and I gave this answer, 
> no-one ever tried
> > > to educate me why its interesting, maybe you will do so...
> > >
> >
> > I don't have good reason. I used two separated fabrics configuration
> > because my lacking understanding on ethernet/ib bonding and the old
> > methodology way of redundancy in ethernet  & FC using two 
> separated fabrics.
> 
> Yes, that was my guess, but, my hope was that you can provide some
> reasoning for thismethodology way of redundancy which I understand you
> were using also for SRP HA, so can you say anything in favor of the
> way you were working till now? As I said, this problem of failure in
> one side enforcing a failure in the other side, and worse, when there
> are more than two players, eg one target and N initiators, fail-over
> in one initiator forces the target to fail-over --> forces the other
> N-1 initiators to fail-over!?

I think separate fabrics are a desirable, intuitive redundancy model.
With storage, each initiator fails over independently.

Scott 


From xma at us.ibm.com  Mon Dec 10 08:45:42 2007
From: xma at us.ibm.com (Shirley Ma)
Date: Mon, 10 Dec 2007 08:45:42 -0800
Subject: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <1196603159.22671.11.camel@mtls03>
Message-ID: <OF2FAC7AAC.A73B2654-ON872573AD.005AF39A-882573AD.002AA46A@us.ibm.com>


Hello Eli,

I just found that my email was somehow marked as SPAM. I have resent my
comments; hopefully they go through this time.

Eli Cohen <eli at dev.mellanox.co.il> wrote on 12/02/2007 05:45:59 AM:

> On Fri, 2007-11-30 at 15:28 -0800, Shirley Ma wrote:
> > I just touch tested ofed-1.3 beta IPoIB. And found there was a kernel
> > parameter hw_csum being added in IPoIB. I have several questions here:
> > 1. Why not using ethtool to set up these HW_CSUM flags?
> There is no adequate interface in Ethtool for doing it so we use a
> module parameter. This is because we see this as a static configuration
> per host.

Ethtool does support rx csum and tx csum:
#define ETHTOOL_GRXCSUM  0x00000014 /* Get RX hw csum enable (ethtool_value) */
#define ETHTOOL_SRXCSUM  0x00000015 /* Set RX hw csum enable (ethtool_value) */
#define ETHTOOL_GTXCSUM  0x00000016 /* Get TX hw csum enable (ethtool_value) */
#define ETHTOOL_STXCSUM  0x00000017 /* Set TX hw csum enable (ethtool_value) */

We should use ethtool here.

> > 2. I haven't looked at the detailed code yet, is that possible with this
> > flag, TCP/IP will not do CSUM for HCA which has no TCP/IP offload support?
> Yes, the HCA need not have checksum offload support. the idea is the IB
> ICRC provides the insurance that the packets are not corrupt.

That's something we discussed a long time ago, when we wanted GSO to avoid
an extra copy by relying on the ICRC to enable the SG feature. I remember
Roland rejected the idea because of the potential for data corruption. And
even if we could prove that the ICRC is 100% accurate, we would still need
code here to restrict this to IP destinations within the IB subnet.
Otherwise, packets routed out to an Ethernet IP subnet would carry no valid
TCP/IP checksum and will be dropped.
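
For reference, a sketch of the software checksum that would be skipped,
assuming the standard Internet checksum from RFC 1071 (16-bit one's
complement sum, complemented). The function name inet_checksum is
illustrative and is not taken from the IPoIB code. A receiver outside
the IB subnet verifies exactly this value, which is why a packet sent
without it cannot survive routing onto Ethernet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over a byte buffer.
 * Bytes are combined into big-endian 16-bit words; an odd trailing
 * byte is padded with a zero low byte. */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A buffer that already contains its own checksum sums to 0xffff, so the
function returns 0 for it; that is the check a receiver performs.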

Thanks
Shirley
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071210/e715b317/attachment.html>

From xma at us.ibm.com  Mon Dec 10 08:42:23 2007
From: xma at us.ibm.com (Shirley Ma)
Date: Mon, 10 Dec 2007 08:42:23 -0800
Subject: ***SPAM*** Re: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <475515FB.10801@mellanox.co.il>
Message-ID: <OF25833A43.3082C207-ON872573AD.005B096A-882573AD.002A56C7@us.ibm.com>


Tziporet Koren <tziporet at dev.mellanox.co.il> wrote on 12/04/2007 12:55:23
AM:
> Not until we start some SPEC definition to add this feature to IPoIB

So it's quite possible that mainline ipoib will differ from the OFED-1.3
release. I don't think an OFED release should create a kernel branch that
diverges from mainline. It would cause maintenance issues, and it is hard
for distros to pick up any feature or code that is not (or will not be) in
upstream mainline.

BTW I can't reproduce the MTU problem I saw before. I might have hit the
same issue as Eli.

Thanks
Shirley
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071210/988b3cfa/attachment.html>

From xma at us.ibm.com  Mon Dec 10 08:59:37 2007
From: xma at us.ibm.com (Shirley Ma)
Date: Mon, 10 Dec 2007 08:59:37 -0800
Subject: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <475515FB.10801@mellanox.co.il>
Message-ID: <OFBAF48EDF.4DF98B65-ON872573AD.005D51CA-882573AD.002BEA80@us.ibm.com>


Tziporet Koren <tziporet at dev.mellanox.co.il> wrote on 12/04/2007 12:55:23
AM:
> Not until we start some SPEC definition to add this feature to IPoIB

So it's quite possible that mainline ipoib will differ from the OFED-1.3
release. I don't think an OFED release should create a kernel branch that
diverges from mainline. It would cause maintenance issues, and it is hard
for distros to pick up any feature or code that is not (or will not be) in
upstream mainline.

BTW I can't reproduce the MTU problem I saw before. I might have hit the
same issue as Eli.

Thanks
Shirley
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071210/f9becf16/attachment.html>

From eli at dev.mellanox.co.il  Mon Dec 10 08:58:45 2007
From: eli at dev.mellanox.co.il (Eli Cohen)
Date: Mon, 10 Dec 2007 18:58:45 +0200
Subject: [ofa-general] OFED-1.3beta IPoIB testing questions
In-Reply-To: <OF2FAC7AAC.A73B2654-ON872573AD.005AF39A-882573AD.002AA46A@us.ibm.com>
References: <OF2FAC7AAC.A73B2654-ON872573AD.005AF39A-882573AD.002AA46A@us.ibm.com>
Message-ID: <1197305925.23785.71.camel@mtls03>

Hi Shirley,

I guess you missed my response to your comments too. They're attached.


On Mon, 2007-12-10 at 08:45 -0800, Shirley Ma wrote:
> Hello Eli,
> 
> I just found my email somehow became SPAM. I just resent my comments,
> hopefully it can go through.
> 
> Eli Cohen <eli at dev.mellanox.co.il> wrote on 12/02/2007 05:45:59 AM:
> 
> > On Fri, 2007-11-30 at 15:28 -0800, Shirley Ma wrote:
> > > I just touch tested ofed-1.3 beta IPoIB. And found there was a
> kernel 
> > > parameter hw_csum being added in IPoIB. I have several questions
> here:
> > > 1. Why not using ethtool to set up these HW_CSUM flags?
> > There is no adequate interface in Ethtool for doing it so we use a
> > module parameter. This is because we see this as a static
> configuration
> > per host.
> 
> Ethtool does support rx csum and tx csum:
> #define ETHTOOL_GRXCSUM  0x00000014 Get RX hw csum enable
> (ethtool_value)
> #define ETHTOOL_SRXCSUM  0x00000015 Set RX hw csum enable
> (ethtool_value)
> #define ETHTOOL_GTXCSUM  0x00000016 Get TX hw csum enable
> (ethtool_value)
> #define ETHTOOL_STXCSUM  0x00000017 /* Set TX hw csum enable
> (ethtool_value)
> 
> We should use ethtool here.
> 
> > > 2. I haven't looked at the detailed code yet, is that possible
> with this 
> > > flag, TCP/IP will not do CSUM for HCA which has no TCP/IP offload
> support? 
> > Yes, the HCA need not have checksum offload support. the idea is the
> IB
> > ICRC provides the insurance that the packets are not corrupt.
> 
> That's something we discussed long time ago when we wanted GSO to
> avoid extra copy by using ICRC to enable SG feature. I remembered
> Roland rejected this idea since there could be potenical data
> corruption. And even if we do prove that ICRC is 100% accurate, then
> we should have some codes here to limit the IP destination within IB
> subnet when using ICRC. Otherwise, if the packets routing out to
> ehthernet IP subnet, these packets will be dropped.
> 
> Thanks
> Shirley
> 
-------------- next part --------------
An embedded message was scrubbed...
From: Eli Cohen <eli at dev.mellanox.co.il>
Subject: Re: [ofa-general] OFED-1.3beta IPoIB testing questions
Date: Tue, 04 Dec 2007 18:15:25 +0200
Size: 2241
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071210/c16bc016/attachment.mht>

From fenkes at de.ibm.com  Mon Dec 10 09:41:29 2007
From: fenkes at de.ibm.com (Joachim Fenkes)
Date: Mon, 10 Dec 2007 18:41:29 +0100
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
In-Reply-To: <adaodcza1xc.fsf@cisco.com>
References: <200712061607.20004.fenkes@de.ibm.com>
	<200712071058.38416.arnd@arndb.de> <adaodcza1xc.fsf@cisco.com>
Message-ID: <200712101841.30010.fenkes@de.ibm.com>

On Monday 10 December 2007 00:22, Roland Dreier wrote:
> Fair enough... according to Documentation/infiniband/core_locking.txt,
> the only driver methods that cannot sleep are:
> 
>     [...]
>     map_phys_fmr

In fact, we do use hCalls there. Our hardware doesn't actually support FMRs,
so we translate a "map FMR" into a "reallocate PMR", which doesn't work
without hCalls. What's more, the hCalls involved (e.g. H_FREE_RESOURCE)
might well return H_LONG_BUSY, so the whole operation might sleep; no way
around it.

How should we deal with this?
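
A user-space sketch of the retry pattern that causes the problem,
assuming a call that may answer "long busy" with a hint for how long to
wait. All names and return codes here (demo_call_with_retry,
DEMO_LONG_BUSY_*, the fake callbacks) are stand-ins for illustration and
are not the hypervisor interface. The point is only that the retry path
has to sleep between attempts, which is what a method documented as
non-sleeping must not do.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative return codes, not the real hypervisor values. */
enum {
    DEMO_SUCCESS         = 0,
    DEMO_BUSY            = 1,
    DEMO_LONG_BUSY_1MS   = 9900,
    DEMO_LONG_BUSY_10MS  = 9901,
    DEMO_LONG_BUSY_100MS = 9902
};

typedef long (*demo_call_fn)(void *ctx);
typedef void (*demo_sleep_fn)(unsigned int msecs, void *ctx);

int demo_is_long_busy(long rc)
{
    return rc >= DEMO_LONG_BUSY_1MS && rc <= DEMO_LONG_BUSY_100MS;
}

/* Translate a long-busy code into the wait time it asks for. */
unsigned int demo_longbusy_msecs(long rc)
{
    switch (rc) {
    case DEMO_LONG_BUSY_1MS:   return 1;
    case DEMO_LONG_BUSY_10MS:  return 10;
    case DEMO_LONG_BUSY_100MS: return 100;
    default:                   return 1;
    }
}

/* Retry a call up to max_tries times, sleeping whenever it reports
 * long busy. Gives up with DEMO_BUSY once the attempts are used up. */
long demo_call_with_retry(demo_call_fn call, demo_sleep_fn sleep_ms,
                          void *ctx, int max_tries)
{
    int i;

    assert(call != NULL && sleep_ms != NULL && max_tries > 0);
    for (i = 0; i < max_tries; i++) {
        long rc = call(ctx);

        if (!demo_is_long_busy(rc))
            return rc;
        sleep_ms(demo_longbusy_msecs(rc), ctx); /* needs process context */
    }
    return DEMO_BUSY;
}

/* Fake call and sleep used to exercise the loop without waiting. */
struct demo_state { int busy_left; int calls; unsigned int slept_ms; };

long demo_fake_call(void *ctx)
{
    struct demo_state *s = ctx;

    s->calls++;
    if (s->busy_left > 0) {
        s->busy_left--;
        return DEMO_LONG_BUSY_10MS;
    }
    return DEMO_SUCCESS;
}

void demo_fake_sleep(unsigned int msecs, void *ctx)
{
    struct demo_state *s = ctx;

    s->slept_ms += msecs;
}
```

With a state that reports busy twice, the loop makes three calls and
accumulates 20 ms of requested sleep before succeeding.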

Thanks,
  Joachim


From jlentini at netapp.com  Mon Dec 10 09:51:17 2007
From: jlentini at netapp.com (James Lentini)
Date: Mon, 10 Dec 2007 12:51:17 -0500 (EST)
Subject: [ofa-general] [PATCH 1/2] uDAT/uDAPL v2 - (master branch) changes
	to sync common code base with WinOF 1.01
In-Reply-To: <000001c8392d$ae801400$1dfd070a@amr.corp.intel.com>
References: <000001c8392d$ae801400$1dfd070a@amr.corp.intel.com>
Message-ID: <Pine.LNX.4.64.0712101249000.12638@jlentini-linux.nane.netapp.com>


On Fri, 7 Dec 2007, Arlin Davis wrote:

> James,
> 
> Please review this patch series to bring the latest WinOF code base into the mainstream. I would like to
> keep the common code base from diverging as much as possible. This is a pretty straightforward
> change but it touches a lot of files. This is on master branch (now based on a v2 code base) and is
> not targeted for OFED 1.3.
> 
> 1/1 uDAT changes.
> 1/2 uDAPL changes.
> 
>   - add DAT_API to specify calling conventions (windows=__stdcall, linux= ) 
>   - cleanup platform specific definitions for windows
>   - c++ support
> 
> Signed-off by: Arlin Davis <ardavis at ichips.intel.com>

Looks good Arlin. One very minor question:

> diff --git a/dat/common/dat_api.c b/dat/common/dat_api.c
> index 1415f73..a3d2274 100755
> --- a/dat/common/dat_api.c
> +++ b/dat/common/dat_api.c
<snip>
> @@ -334,7 +334,7 @@ DAT_RETURN dat_get_consumer_context (
>          DAT_IA_HANDLE   dapl_ia_handle;
>          DAT_RETURN      dat_status;
>  
> -        dat_status = dats_get_ia_handle((unsigned long)dat_handle,
> +        dat_status = dats_get_ia_handle((DAT_IA_HANDLE)dat_handle,
>                                          &dapl_ia_handle);
>  
>          /* failure to map the handle is unlikely but possible */
<snip>
> @@ -875,8 +874,7 @@ DAT_RETURN dat_lmr_sync_rdma_read(
>      DAT_IA_HANDLE	dapl_ia_handle;
>      DAT_RETURN		dat_status;
>  
> -    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
> -				    &dapl_ia_handle);
> +    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);

For consistency with your change above, should the cast here
be changed to the following?

+    dat_status = dats_get_ia_handle((DAT_IA_HANDLE)ia_handle, &dapl_ia_handle);

>      if (dat_status == DAT_SUCCESS)
>      {
>  	dat_status = DAT_LMR_SYNC_RDMA_READ (dapl_ia_handle,
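
A sketch of the calling-convention macro the patch summary describes
(windows=__stdcall, linux=empty), together with the extern "C" guard
that the "c++ support" item suggests. DAT_API is the name used in the
summary; the function example_provider_version and the typedef
example_version_fn are hypothetical and exist only to show where the
macro goes in a declaration and in a function pointer type.

```c
#include <assert.h>

/* Calling convention: __stdcall on Windows, nothing elsewhere. */
#if defined(_WIN32)
#define DAT_API __stdcall
#else
#define DAT_API
#endif

#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical entry point declared with the macro. */
int DAT_API example_provider_version(int major, int minor);

/* In a function pointer type the macro sits inside the parentheses. */
typedef int (DAT_API *example_version_fn)(int major, int minor);

#ifdef __cplusplus
}
#endif

/* Pack a version as major * 100 + minor, e.g. 2.0 becomes 200. */
int DAT_API example_provider_version(int major, int minor)
{
    assert(major >= 0 && minor >= 0 && minor < 100);
    return major * 100 + minor;
}
```

Keeping the macro in one header means the same prototypes compile on
both platforms without per-file conditionals.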


From FENKES at de.ibm.com  Mon Dec 10 09:59:16 2007
From: FENKES at de.ibm.com (Joachim Fenkes)
Date: Mon, 10 Dec 2007 18:59:16 +0100
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
In-Reply-To: <OF85E31FAA.DADA6039-ONC12573AA.005439C8-C12573AA.005A132E@LocalDomain>
Message-ID: <OF9AA24809.F691A5CC-ONC12573AD.006155D6-C12573AD.0062ABE2@de.ibm.com>

Hi, guys,

> We're taking this to the firmware architects at the moment, but they're not
> very fond of the idea of reporting the absence of bugs through capability
> flags, as this could quickly lead to the exhaustion of flag bits. We'll let
> the discussion stew for a bit, but if we don't get this flag, we'll have to
> resort to the CPU features.

The architects have spoken, and we're getting a capability flag for this. 
I'll repost my patch with new autodetection code that doesn't involve 
checking the processor version.
 
> >  > Regarding the performance problem, have you checked whether converting all
> >  > your spin_lock_irqsave to spin_lock/spin_lock_irq improves your performance
> >  > on the older machines? Maybe it's already fast enough that way.
> > 
> > It does seem that the only places that the hcall_lock is taken also
> > use msleep, so they must always be in process context.  So you can
> > safely just use spin_lock(), right?
>
> As Arnd said, there are hCalls that will never return H_LONG_BUSY_*, such as
> H_QUERY_PORT and chums, so they will never sleep. The surrounding functions,
> though, are not prepared to be called from interrupt context (GFP_KERNEL comes
> to mind), so I agree that a simple spin_lock() will suffice. Thanks, Arnd, for
> pointing this out.

As I pointed out in my earlier mail, there's still an issue with 
map_phys_fmr possibly sleeping. Let's keep the irqsave for the time being 
and revisit this part once we find a solution to map_phys_fmr.

Regards,
  Joachim


From jlentini at netapp.com  Mon Dec 10 09:58:13 2007
From: jlentini at netapp.com (James Lentini)
Date: Mon, 10 Dec 2007 12:58:13 -0500 (EST)
Subject: [ofa-general] [PATCH 2/2] uDAT/uDAPL v2 - (master branch) changes
	to sync common code base with WinOF 1.01
In-Reply-To: <000101c8392e$86f2ff50$1dfd070a@amr.corp.intel.com>
References: <000101c8392e$86f2ff50$1dfd070a@amr.corp.intel.com>
Message-ID: <Pine.LNX.4.64.0712101253210.12638@jlentini-linux.nane.netapp.com>


On Fri, 7 Dec 2007, Arlin Davis wrote:

> 
> 2/2 uDAPL changes.
> 
>   - add DAT_API to specify calling conventions (windows=__stdcall, linux= ) 
>   - cleanup CR+LF's
> 
> Signed-off by: Arlin Davis <ardavis at ichips.intel.com>

Looks good. A few more minor questions:

> diff --git a/dapl/common/dapl_csp.c b/dapl/common/dapl_csp.c
> index 6c0aaf2..678ca79 100755
> --- a/dapl/common/dapl_csp.c
> +++ b/dapl/common/dapl_csp.c
> @@ -46,12 +46,12 @@
>   *
>   * uDAPL: User Direct Access Program Library Version 2.0, 6.4.4.2
>   *
> - * The Common Service Point is transport-independent analog of the Public
> - * Service Point. It allows the Consumer to listen on socket-equivalent for
> - * requests for connections arriving on a specified IP port instead of
> + * The Common Service Point is transport-independent analog of the Public
> + * Service Point. It allows the Consumer to listen on socket-equivalent for
> + * requests for connections arriving on a specified IP port instead of
>   * transport-dependent Connection Qualifier. An IA Address follows the
> - * platform conventions and provides among others the IP port to listen on.
> - * An IP port of the Common Service Point advertisement is supported by
> + * platform conventions and provides among others the IP port to listen on.
> + * An IP port of the Common Service Point advertisement is supported by
>   * existing Ethernet infrastructure or DAT Name Service.
>   *
>   * Input:

What are the changes above? I don't see any difference in the text
or the whitespace.

> diff --git a/dapl/common/dapl_ep_post_rdma_read_to_rmr.c
> b/dapl/common/dapl_ep_post_rdma_read_to_rmr.c
> index 929186f..2b1210d 100755
> --- a/dapl/common/dapl_ep_post_rdma_read_to_rmr.c
> +++ b/dapl/common/dapl_ep_post_rdma_read_to_rmr.c
> @@ -46,7 +46,7 @@
>   *
>   * DAPL Requirements Version xxx, 6.6.24
>   *
> - * Requests the transfer of all the data specified by the remote_buffer 
> + * Requests the transfer of all the data specified by the remote_buffer 
>   * over the connection of the ep_handle Endpoint into the local_iov 
>   * specified by the RMR segments.
>   *

ditto

> diff --git a/dapl/common/dapl_ep_post_send_invalidate.c b/dapl/common/dapl_ep_post_send_invalidate.c
> index 5ce0808..68a3a51 100755
> --- a/dapl/common/dapl_ep_post_send_invalidate.c
> +++ b/dapl/common/dapl_ep_post_send_invalidate.c
> @@ -46,7 +46,7 @@
>   *
>   * DAPL Requirements Version xxx, 6.6.21
>   *
> - * Requests a transfer of all the data from the local_iov over the connection
> + * Requests a transfer of all the data from the local_iov over the connection
>   * of the ep_handle Endpoint to the remote side and invalidates the Remote 
>   * Memory Region context.
>   *

ditto


From fenkes at de.ibm.com  Mon Dec 10 09:59:10 2007
From: fenkes at de.ibm.com (Joachim Fenkes)
Date: Mon, 10 Dec 2007 18:59:10 +0100
Subject: [ofa-general] [PATCH] IB/ehca: Serialize HCA-related hCalls if
	necessary
Message-ID: <200712101859.11218.fenkes@de.ibm.com>

Several pSeries firmware versions share a rare locking issue in the
HCA-related hCalls. Check for a feature flag that indicates the issue being
fixed and serialize all HCA hCalls if not.

Signed-off-by: Joachim Fenkes <fenkes at de.ibm.com>
---

This is the revised version of my previous patch, which does not rely on the
processor version any longer.

 drivers/infiniband/hw/ehca/ehca_main.c |   13 +++++++++++++
 drivers/infiniband/hw/ehca/hcp_if.c    |   28 +++++++++++-----------------
 drivers/infiniband/hw/ehca/hipz_hw.h   |    1 +
 3 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 90d4334..c7bff3e 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -43,6 +43,7 @@
 #ifdef CONFIG_PPC_64K_PAGES
 #include <linux/slab.h>
 #endif
+
 #include "ehca_classes.h"
 #include "ehca_iverbs.h"
 #include "ehca_mrmw.h"
@@ -66,6 +67,7 @@ int ehca_poll_all_eqs  = 1;
 int ehca_static_rate   = -1;
 int ehca_scaling_code  = 0;
 int ehca_mr_largepage  = 1;
+int ehca_lock_hcalls   = -1;
 
 module_param_named(open_aqp1,     ehca_open_aqp1,     int, S_IRUGO);
 module_param_named(debug_level,   ehca_debug_level,   int, S_IRUGO);
@@ -77,6 +79,7 @@ module_param_named(poll_all_eqs,  ehca_poll_all_eqs,  int, S_IRUGO);
 module_param_named(static_rate,   ehca_static_rate,   int, S_IRUGO);
 module_param_named(scaling_code,  ehca_scaling_code,  int, S_IRUGO);
 module_param_named(mr_largepage,  ehca_mr_largepage,  int, S_IRUGO);
+module_param_named(lock_hcalls,   ehca_lock_hcalls,   bool, S_IRUGO);
 
 MODULE_PARM_DESC(open_aqp1,
 		 "AQP1 on startup (0: no (default), 1: yes)");
@@ -102,6 +105,9 @@ MODULE_PARM_DESC(scaling_code,
 MODULE_PARM_DESC(mr_largepage,
 		 "use large page for MR (0: use PAGE_SIZE (default), "
 		 "1: use large page depending on MR size");
+MODULE_PARM_DESC(lock_hcalls,
+		 "serialize all hCalls made by the driver "
+		 "(default: autodetect)");
 
 DEFINE_RWLOCK(ehca_qp_idr_lock);
 DEFINE_RWLOCK(ehca_cq_idr_lock);
@@ -258,6 +264,7 @@ static struct cap_descr {
 	{ HCA_CAP_UD_LL_QP, "HCA_CAP_UD_LL_QP" },
 	{ HCA_CAP_RESIZE_MR, "HCA_CAP_RESIZE_MR" },
 	{ HCA_CAP_MINI_QP, "HCA_CAP_MINI_QP" },
+	{ HCA_CAP_H_ALLOC_RES_SYNC, "HCA_CAP_H_ALLOC_RES_SYNC" },
 };
 
 static int ehca_sense_attributes(struct ehca_shca *shca)
@@ -333,6 +340,12 @@ static int ehca_sense_attributes(struct ehca_shca *shca)
 		if (EHCA_BMASK_GET(hca_cap_descr[i].mask, shca->hca_cap))
 			ehca_gen_dbg("   %s", hca_cap_descr[i].descr);
 
+	/* Autodetect hCall locking -- the "H_ALLOC_RESOURCE synced" flag is
+	 * a firmware property, so it's valid across all adapters
+	 */
+	if (ehca_lock_hcalls == -1)
+		ehca_lock_hcalls = !(shca->hca_cap & HCA_CAP_H_ALLOC_RES_SYNC);
+
 	/* translate supported MR page sizes; always support 4K */
 	shca->hca_cap_mr_pgsize = EHCA_PAGESIZE;
 	if (ehca_mr_largepage) { /* support extra sizes only if enabled */
diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c
index c16a213..331b5e8 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -89,6 +89,7 @@
 #define HCALL9_REGS_FORMAT HCALL7_REGS_FORMAT " r11=%lx r12=%lx"
 
 static DEFINE_SPINLOCK(hcall_lock);
+extern int ehca_lock_hcalls;
 
 static u32 get_longbusy_msecs(int longbusy_rc)
 {
@@ -120,26 +121,21 @@ static long ehca_plpar_hcall_norets(unsigned long opcode,
 				    unsigned long arg7)
 {
 	long ret;
-	int i, sleep_msecs, do_lock;
-	unsigned long flags;
+	int i, sleep_msecs;
+	unsigned long flags = 0;
 
 	ehca_gen_dbg("opcode=%lx " HCALL7_REGS_FORMAT,
 		     opcode, arg1, arg2, arg3, arg4, arg5, arg6, arg7);
 
-	/* lock H_FREE_RESOURCE(MR) against itself and H_ALLOC_RESOURCE(MR) */
-	if ((opcode == H_FREE_RESOURCE) && (arg7 == 5)) {
-		arg7 = 0; /* better not upset firmware */
-		do_lock = 1;
-	}
-
 	for (i = 0; i < 5; i++) {
-		if (do_lock)
+		/* serialize hCalls to work around firmware issue */
+		if (ehca_lock_hcalls)
 			spin_lock_irqsave(&hcall_lock, flags);
 
 		ret = plpar_hcall_norets(opcode, arg1, arg2, arg3, arg4,
 					 arg5, arg6, arg7);
 
-		if (do_lock)
+		if (ehca_lock_hcalls)
 			spin_unlock_irqrestore(&hcall_lock, flags);
 
 		if (H_IS_LONG_BUSY(ret)) {
@@ -174,24 +170,22 @@ static long ehca_plpar_hcall9(unsigned long opcode,
 			      unsigned long arg9)
 {
 	long ret;
-	int i, sleep_msecs, do_lock;
+	int i, sleep_msecs;
 	unsigned long flags = 0;
 
 	ehca_gen_dbg("INPUT -- opcode=%lx " HCALL9_REGS_FORMAT, opcode,
 		     arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8, arg9);
 
-	/* lock H_ALLOC_RESOURCE(MR) against itself and H_FREE_RESOURCE(MR) */
-	do_lock = ((opcode == H_ALLOC_RESOURCE) && (arg2 == 5));
-
 	for (i = 0; i < 5; i++) {
-		if (do_lock)
+		/* serialize hCalls to work around firmware issue */
+		if (ehca_lock_hcalls)
 			spin_lock_irqsave(&hcall_lock, flags);
 
 		ret = plpar_hcall9(opcode, outs,
 				   arg1, arg2, arg3, arg4, arg5,
 				   arg6, arg7, arg8, arg9);
 
-		if (do_lock)
+		if (ehca_lock_hcalls)
 			spin_unlock_irqrestore(&hcall_lock, flags);
 
 		if (H_IS_LONG_BUSY(ret)) {
@@ -821,7 +815,7 @@ u64 hipz_h_free_resource_mr(const struct ipz_adapter_handle adapter_handle,
 	return ehca_plpar_hcall_norets(H_FREE_RESOURCE,
 				       adapter_handle.handle,    /* r4 */
 				       mr->ipz_mr_handle.handle, /* r5 */
-				       0, 0, 0, 0, 5);
+				       0, 0, 0, 0, 0);
 }
 
 u64 hipz_h_reregister_pmr(const struct ipz_adapter_handle adapter_handle,
diff --git a/drivers/infiniband/hw/ehca/hipz_hw.h b/drivers/infiniband/hw/ehca/hipz_hw.h
index 485b840..bf996c7 100644
--- a/drivers/infiniband/hw/ehca/hipz_hw.h
+++ b/drivers/infiniband/hw/ehca/hipz_hw.h
@@ -378,6 +378,7 @@ struct hipz_query_hca {
 #define HCA_CAP_UD_LL_QP              EHCA_BMASK_IBM(16, 16)
 #define HCA_CAP_RESIZE_MR             EHCA_BMASK_IBM(17, 17)
 #define HCA_CAP_MINI_QP               EHCA_BMASK_IBM(18, 18)
+#define HCA_CAP_H_ALLOC_RES_SYNC      EHCA_BMASK_IBM(19, 19)
 
 /* query port response block */
 struct hipz_query_port {
-- 
1.5.2
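
For readers following the autodetection logic in this patch, here is a
minimal userspace sketch of resolving a tri-state option (-1 = autodetect,
0 = off, 1 = on) from a single capability bit.  The names FIELD_IBM,
field_get, CAP_ALLOC_RES_SYNC and resolve_lock_hcalls are hypothetical
stand-ins, not the driver's own macros, and the descriptor encoding (shift
in the upper 16 bits, width in the lower 16, IBM bit numbering where bit 0
is the most significant bit of a 64-bit word) is an assumption.

```c
#include <stdint.h>

/* Assumed encoding: shift in the upper 16 bits, field width in the lower 16.
 * IBM bit numbering counts from the most significant bit of a 64-bit word. */
#define FIELD_IBM(from, to) \
	((uint32_t)(((63 - (to)) << 16) | ((to) - (from) + 1)))

/* Hypothetical capability descriptor, standing in for the new flag. */
#define CAP_ALLOC_RES_SYNC FIELD_IBM(19, 19)

/* Extract the field described by 'field' from a 64-bit capability word. */
static uint64_t field_get(uint32_t field, uint64_t value)
{
	unsigned int shift = field >> 16;
	unsigned int width = field & 0xffff;
	uint64_t mask = (width >= 64) ? ~0ULL : ((1ULL << width) - 1);

	return (value >> shift) & mask;
}

/* param: -1 = autodetect, 0 = never lock, 1 = always lock.
 * Autodetect locks only when the firmware does NOT report the sync flag. */
static int resolve_lock_hcalls(int param, uint64_t hca_cap)
{
	if (param != -1)
		return param;
	return !field_get(CAP_ALLOC_RES_SYNC, hca_cap);
}
```

Note that a descriptor in this encoding is not a plain bitmask, so testing
it with a bare '&' against the capability word would not isolate the flag;
the field has to be extracted with the accessor.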


From dotanb at dev.mellanox.co.il  Mon Dec 10 10:15:50 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Mon, 10 Dec 2007 20:15:50 +0200
Subject: [ofa-general] [PATCH] qperf: Removed the unneeded char '>' from the
	label of the man page
Message-ID: <200712102015.50771.dotanb@dev.mellanox.co.il>

Removed the unneeded char '>' from the label of the man page.

Signed-off-by: Dotan Barak <dotanb at dev.mellanox.co.il>

---

Index: ofa_1_3_dev_user/src/userspace/qperf/src/mkman
===================================================================
--- ofa_1_3_dev_user.orig/src/userspace/qperf/src/mkman	2007-12-10 11:42:16.000000000 +0200
+++ ofa_1_3_dev_user/src/userspace/qperf/src/mkman	2007-12-10 20:03:02.000000000 +0200
@@ -147,7 +147,7 @@ sub main() {
     printn '.\" Generated by mkman';
     printn '.TH QPERF 1 "October 2007" "qperf" "User Commands"';
     printn '.SH NAME';
-    printn 'qperf \- Measure RDMA and IP performance>';
+    printn 'qperf \- Measure RDMA and IP performance';
 
     do_synopsis ($main, 'Synopsis');
     do_general  ($main, 'Description');


From caitlin.bestler at neterion.com  Mon Dec 10 11:15:29 2007
From: caitlin.bestler at neterion.com (Caitlin Bestler)
Date: Mon, 10 Dec 2007 11:15:29 -0800
Subject: [ofa-general] rdma cm timeout option, was [iWARP issues]
In-Reply-To: <C98692FD98048C41885E0B0FACD9DFB80555A1DB@exnane01.hq.netapp.com>
References: <C98692FD98048C41885E0B0FACD9DFB805559DA6@exnane01.hq.netapp.com>
	<000701c81cd2$3d4178f0$9c98070a@amr.corp.intel.com>
	<EXNANE01hzL1f6Ykwn900000878@exnane01.hq.netapp.com>
	<472B4831.7030303@ichips.intel.com>
	<EXNANE017ddhlHs46tf0000087e@exnane01.hq.netapp.com>
	<001301c81d74$426c8840$9c98070a@amr.corp.intel.com>
	<C98692FD98048C41885E0B0FACD9DFB80555A19E@exnane01.hq.netapp.com>
	<472B71DF.2090408@ichips.intel.com>
	<C98692FD98048C41885E0B0FACD9DFB80555A1DB@exnane01.hq.netapp.com>
Message-ID: <469958e00712101115i53613b89iea3a9793a1d6d039@mail.gmail.com>

On Nov 2, 2007 11:17 AM, Kanevsky, Arkady <Arkady.Kanevsky at netapp.com> wrote:
> Yes on first.
> No on second yet.
> Maybe for iWARP the only acceptable value will be "default"?
> But this still feels that ULPs need to do something transport specific.
> except for "default" case.
> But implementing TCP style timeout for IB looks like overkill.
>

Keep in mind that the transport layer timeout only helps iWARP for establishing
the TCP connection. It does not deal with stalled MPA Request exchanges.

So there is a need for *some* mechanism to time out a stalled iWARP connection
process even after a valid TCP connection is established. A vendor-specific,
non-configurable method is probably more than adequate. But *something* has to be
there, you cannot just rely on the TCP mechanisms. Nor can you assume that the
TCP stack servicing RDMA connections has the same defaults as the host stack.
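
To make the point concrete, here is a minimal sketch of a per-phase
deadline that keeps running after the TCP connection is up.  Everything in
it is hypothetical: the phase names, the setup_state structure and the
5000 ms limit are placeholders chosen for illustration, not taken from any
iWARP implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical connection-setup phases for an iWARP-style connect. */
enum setup_phase {
	PHASE_TCP_CONNECTING,	/* covered by the transport (TCP) timeout */
	PHASE_MPA_EXCHANGE,	/* TCP is up, MPA request/reply still pending */
	PHASE_ESTABLISHED
};

struct setup_state {
	enum setup_phase phase;
	uint64_t phase_start_ms;	/* when the current phase began */
};

/* Placeholder limit for the MPA exchange; fixed, not configurable. */
#define MPA_EXCHANGE_TIMEOUT_MS 5000u

/* Move to a new phase and restart the per-phase clock. */
static void enter_phase(struct setup_state *s, enum setup_phase p,
			uint64_t now_ms)
{
	s->phase = p;
	s->phase_start_ms = now_ms;
}

/* True if the MPA exchange has been pending for longer than the limit. */
static bool mpa_exchange_timed_out(const struct setup_state *s,
				   uint64_t now_ms)
{
	if (s->phase != PHASE_MPA_EXCHANGE)
		return false;
	return now_ms - s->phase_start_ms >= MPA_EXCHANGE_TIMEOUT_MS;
}
```

The check is independent of whatever timeout the TCP stack applies to the
connect itself, which is the property argued for above.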


From or.gerlitz at gmail.com  Mon Dec 10 12:23:45 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Mon, 10 Dec 2007 22:23:45 +0200
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A977E3@xmb-sjc-216.amer.cisco.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
	<47586FD5.1040602@mellanox.com>
	<15ddcffd0712081409v17f239d6sd6b6952b8a269db3@mail.gmail.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A977E3@xmb-sjc-216.amer.cisco.com>
Message-ID: <15ddcffd0712101223qc52e5qcc188c6331ae1cf6@mail.gmail.com>

On 12/10/07, Scott Weitzenkamp (sweitzen) <sweitzen at cisco.com> wrote:
> > Yes, that was my guess, but, my hope was that you can provide some
> > reasoning for this methodology of redundancy, which I understand you
> > were using also for SRP HA, so can you say anything in favor of the
> > way you were working till now? As I said, this problem of failure in
> > one side enforcing a failure in the other side, and worse, when there
> > are more than two players, eg one target and N initiators, fail-over
> > in one initiator forces the target to fail-over --> forces the other
> > N-1 initiators to fail-over!?

> I think separate fabrics is a desirable, intuitive redundancy model.
> With storage each initiator fails over independently.

If the HA scheme is built on top of bonding, that is, it's an IB-native
file/block ULP which uses the RDMA-CM or one that runs on top of
IPoIB, then the active-backup mode of the bonding driver will not
allow each initiator to fail-over independently. For now, this is the
only mode supported for bonding/ipoib.

Or.


From or.gerlitz at gmail.com  Mon Dec 10 12:24:49 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Mon, 10 Dec 2007 22:24:49 +0200
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A977DA@xmb-sjc-216.amer.cisco.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96E64@xmb-sjc-216.amer.cisco.com>
	<15ddcffd0712061226l42cd5ed5ne3b36a3198ba9eaa@mail.gmail.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96F7A@xmb-sjc-216.amer.cisco.com>
	<15ddcffd0712081404l110c2e74q3f15a3f01a6200b9@mail.gmail.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A977DA@xmb-sjc-216.amer.cisco.com>
Message-ID: <15ddcffd0712101224w1738856bg261b9c8ef8016bdb@mail.gmail.com>

On 12/10/07, Scott Weitzenkamp (sweitzen) <sweitzen at cisco.com> wrote:
> OK, how about if OFED continues to support the old way to configure
> IPoIB bonding, but we mark it as obsolete?

Better than nothing. How do you suggest marking it as obsolete: in the
documentation?

Or.


From sweitzen at cisco.com  Mon Dec 10 12:28:29 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Mon, 10 Dec 2007 12:28:29 -0800
Subject: [ofa-general] ipoib bonding problems in 1.3-beta2 and 1.2.5.4,
In-Reply-To: <15ddcffd0712101224w1738856bg261b9c8ef8016bdb@mail.gmail.com>
References: <4756E743.3040900@mellanox.com> <4757A980.2030403@voltaire.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96E64@xmb-sjc-216.amer.cisco.com>
	<15ddcffd0712061226l42cd5ed5ne3b36a3198ba9eaa@mail.gmail.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96F7A@xmb-sjc-216.amer.cisco.com>
	<15ddcffd0712081404l110c2e74q3f15a3f01a6200b9@mail.gmail.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A977DA@xmb-sjc-216.amer.cisco.com>
	<15ddcffd0712101224w1738856bg261b9c8ef8016bdb@mail.gmail.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304B03DEC@xmb-sjc-216.amer.cisco.com>

Documentation is a good start; maybe after OFED 1.3 we can change openibd to
print a warning if bonding is configured in openib.conf.

Scott

 
> -----Original Message-----
> From: Or Gerlitz [mailto:or.gerlitz at gmail.com] 
> Sent: Monday, December 10, 2007 12:25 PM
> To: Scott Weitzenkamp (sweitzen)
> Cc: Or Gerlitz; Vu Pham; OpenFabrics General; Roland Dreier (rdreier)
> Subject: Re: [ofa-general] ipoib bonding problems in 
> 1.3-beta2 and 1.2.5.4,
> 
> On 12/10/07, Scott Weitzenkamp (sweitzen) <sweitzen at cisco.com> wrote:
> > OK, how about if OFED continues to support the old way to configure
> > IPoIB bonding, but we mark it as obsolete?
> 
> Better than nothing. How do you suggest marking it as obsolete: in the
> documentation?
> 
> Or.
> 


From rvm at obsidianresearch.com  Mon Dec 10 12:35:44 2007
From: rvm at obsidianresearch.com (Rolf Manderscheid)
Date: Mon, 10 Dec 2007 13:35:44 -0700
Subject: [ofa-general] [PATCH] ipoib: configurable broadcast scope
Message-ID: <20071210203544.GI30090@obsidianresearch.com>

Hi Roland,

In order to make ipoib work over routers, the scope of link-level
ipoib broadcast address must be non-local.  This requires a way to
configure the scope.  I posted a patch to address this on August 30:
http://lists.openfabrics.org/pipermail/general/2007-August/040197.html
It has since been used by scinet at SC07.

I have split that patch into two pieces: the first isolates the
required changes to the net code, the second adds a broadcast_scope
attribute to parent ipoib devices.  Having at least the first patch
upstream would be convenient; not having it means rebuilding the
whole kernel if you want to use an IB router.

One of the things that came out of SC07 was a request to enable child
devices to use the same pkey as the parent, but with different scopes.
The remaining patches are related to this (they are new and were not
used at the show).

    Rolf


From rvm at obsidianresearch.com  Mon Dec 10 12:38:41 2007
From: rvm at obsidianresearch.com (Rolf Manderscheid)
Date: Mon, 10 Dec 2007 13:38:41 -0700
Subject: [ofa-general] [PATCH 1/4] ipoib: improve IPv4/IPv6 to IB mcast
	mapping functions
In-Reply-To: <20071210203544.GI30090@obsidianresearch.com>
References: <20071210203544.GI30090@obsidianresearch.com>
Message-ID: <20071210203841.GJ30090@obsidianresearch.com>

An ipoib subnet on an IB fabric that spans multiple IB subnets can't
use link-local scope in multicast GIDs.  The existing routines that
map IP/IPv6 multicast addresses into IB link-level addresses hard-code
the scope to link-local; they also leave the partition key field
uninitialised.  This patch adds a parameter (the link-level broadcast
address) to the mapping routines allowing them to initialise both the
scope and the pkey appropriately, and fixes up the call sites.  The
next step will be to add a way to configure the scope for an ipoib device.

Signed-off-by: Rolf Manderscheid <rvm at obsidianresearch.com>

---
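
For illustration only, a userspace sketch of the IPv4 mapping as it looks
after this change: the scope nibble and the P_Key are copied from the
device's link-level broadcast address instead of being hard-coded.  The
function name ib_mc_map_sketch and the constant IB_HW_ADDR_LEN are
hypothetical, and unlike the kernel routine this version takes the IPv4
address in host byte order.

```c
#include <stdint.h>
#include <string.h>

#define IB_HW_ADDR_LEN 20	/* assumed length of an IPoIB link-level address */

/* Map an IPv4 multicast address (host byte order) to an IB link-level
 * address, taking scope and P_Key from the broadcast address. */
static void ib_mc_map_sketch(uint32_t addr, const unsigned char *broadcast,
			     unsigned char *buf)
{
	unsigned char scope = broadcast[5] & 0xF;

	memset(buf, 0, IB_HW_ADDR_LEN);	/* byte 0 reserved, 10..15 zero */
	buf[1]  = 0xff;			/* Multicast QPN */
	buf[2]  = 0xff;
	buf[3]  = 0xff;
	buf[4]  = 0xff;			/* start of the multicast GID */
	buf[5]  = 0x10 | scope;		/* scope from broadcast address */
	buf[6]  = 0x40;			/* IPv4 signature */
	buf[7]  = 0x1b;
	buf[8]  = broadcast[8];		/* P_Key from broadcast address */
	buf[9]  = broadcast[9];
	buf[16] = (addr >> 24) & 0x0f;	/* low 28 bits of the group address */
	buf[17] = (addr >> 16) & 0xff;
	buf[18] = (addr >> 8) & 0xff;
	buf[19] = addr & 0xff;
}
```

With a broadcast address whose byte 5 is 0x1e and whose P_Key bytes are
0x80 0x01, mapping 239.1.2.3 yields a GID that starts ff 1e 40 1b 80 01
and ends 0f 01 02 03.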

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 5a80e74..1b06626 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2608,11 +2608,9 @@ static void cma_set_mgid(struct rdma_id_private *id_priv,
 		/* IPv6 address is an SA assigned MGID. */
 		memcpy(mgid, &sin6->sin6_addr, sizeof *mgid);
 	} else {
-		ip_ib_mc_map(sin->sin_addr.s_addr, mc_map);
+		ip_ib_mc_map(sin->sin_addr.s_addr, dev_addr->broadcast, mc_map);
 		if (id_priv->id.ps == RDMA_PS_UDP)
 			mc_map[7] = 0x01;	/* Use RDMA CM signature */
-		mc_map[8] = ib_addr_get_pkey(dev_addr) >> 8;
-		mc_map[9] = (unsigned char) ib_addr_get_pkey(dev_addr);
 		*mgid = *(union ib_gid *) (mc_map + 4);
 	}
 }
diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 448eccb..b24508a 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -269,18 +269,21 @@ static inline void ipv6_arcnet_mc_map(const struct in6_addr *addr, char *buf)
 	buf[0] = 0x00;
 }
 
-static inline void ipv6_ib_mc_map(struct in6_addr *addr, char *buf)
+static inline void ipv6_ib_mc_map(const struct in6_addr *addr,
+				  const unsigned char *broadcast, char *buf)
 {
+	unsigned char scope = broadcast[5] & 0xF;
+
 	buf[0]  = 0;		/* Reserved */
 	buf[1]  = 0xff;		/* Multicast QPN */
 	buf[2]  = 0xff;
 	buf[3]  = 0xff;
 	buf[4]  = 0xff;
-	buf[5]  = 0x12;		/* link local scope */
+	buf[5]  = 0x10 | scope;	/* scope from broadcast address */
 	buf[6]  = 0x60;		/* IPv6 signature */
 	buf[7]  = 0x1b;
-	buf[8]  = 0;		/* P_Key */
-	buf[9]  = 0;
+	buf[8]  = broadcast[8];	/* P_Key */
+	buf[9]  = broadcast[9];
 	memcpy(buf + 10, addr->s6_addr + 6, 10);
 }
 #endif
diff --git a/include/net/ip.h b/include/net/ip.h
index 840dd91..50c8889 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -266,20 +266,22 @@ static inline void ip_eth_mc_map(__be32 naddr, char *buf)
  *	Leave P_Key as 0 to be filled in by driver.
  */
 
-static inline void ip_ib_mc_map(__be32 naddr, char *buf)
+static inline void ip_ib_mc_map(__be32 naddr, const unsigned char *broadcast, char *buf)
 {
 	__u32 addr;
+	unsigned char scope = broadcast[5] & 0xF;
+
 	buf[0]  = 0;		/* Reserved */
 	buf[1]  = 0xff;		/* Multicast QPN */
 	buf[2]  = 0xff;
 	buf[3]  = 0xff;
 	addr    = ntohl(naddr);
 	buf[4]  = 0xff;
-	buf[5]  = 0x12;		/* link local scope */
+	buf[5]  = 0x10 | scope;	/* scope from broadcast address */
 	buf[6]  = 0x40;		/* IPv4 signature */
 	buf[7]  = 0x1b;
-	buf[8]  = 0;		/* P_Key */
-	buf[9]  = 0;
+	buf[8]  = broadcast[8];		/* P_Key */
+	buf[9]  = broadcast[9];
 	buf[10] = 0;
 	buf[11] = 0;
 	buf[12] = 0;
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index b3f366a..0abfab3 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -211,7 +211,7 @@ int arp_mc_map(__be32 addr, u8 *haddr, struct net_device *dev, int dir)
 		ip_tr_mc_map(addr, haddr);
 		return 0;
 	case ARPHRD_INFINIBAND:
-		ip_ib_mc_map(addr, haddr);
+		ip_ib_mc_map(addr, dev->broadcast, haddr);
 		return 0;
 	default:
 		if (dir) {
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 67997a7..6d54f7e 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -337,7 +337,7 @@ int ndisc_mc_map(struct in6_addr *addr, char *buf, struct net_device *dev, int d
 		ipv6_arcnet_mc_map(addr, buf);
 		return 0;
 	case ARPHRD_INFINIBAND:
-		ipv6_ib_mc_map(addr, buf);
+		ipv6_ib_mc_map(addr, dev->broadcast, buf);
 		return 0;
 	default:
 		if (dir) {


From rvm at obsidianresearch.com  Mon Dec 10 12:39:49 2007
From: rvm at obsidianresearch.com (Rolf Manderscheid)
Date: Mon, 10 Dec 2007 13:39:49 -0700
Subject: [ofa-general] [PATCH 2/4] ipoib: make broadcast scope configurable
In-Reply-To: <20071210203544.GI30090@obsidianresearch.com>
References: <20071210203544.GI30090@obsidianresearch.com>
Message-ID: <20071210203949.GK30090@obsidianresearch.com>

Add a broadcast_scope attribute to parent ipoib devices to enable ipoib over routers.
The scope can be changed provided that the link is down.  Some mgids are mapped early
so the scope field in the mgid needs to be fixed up before use now that the scope is
configurable.  The pkey fixups have been removed because the mapping functions now
take care of that.

Signed-off-by: Rolf Manderscheid <rvm at obsidianresearch.com>

---
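
A minimal userspace sketch of what a write to broadcast_scope does to the
broadcast address, assuming the same four accepted scope values as the
patch.  The helper names scope_is_valid and set_scope_from_string are
hypothetical, and the link-state check (-EBUSY when the interface is up)
is left out.

```c
#include <errno.h>
#include <stdio.h>

/* Accept only the scope values the patch allows. */
static int scope_is_valid(int scope)
{
	switch (scope) {
	case 0x2: /* link-local */
	case 0x5: /* site-local */
	case 0x8: /* organization-local */
	case 0xE: /* global */
		return 1;
	default:
		return 0;
	}
}

/* Parse a scope written as decimal, octal or hex and store it in the low
 * nibble of byte 5 of the broadcast address; the high nibble is kept. */
static int set_scope_from_string(const char *buf, unsigned char *broadcast)
{
	int scope;

	if (sscanf(buf, "%i", &scope) != 1 || !scope_is_valid(scope))
		return -EINVAL;

	broadcast[5] = (unsigned char)((broadcast[5] & ~0xF) | scope);
	return 0;
}
```

Starting from a byte 5 of 0x12, writing "0xe" leaves 0x1e there, while
writing "3" is rejected and leaves the address untouched.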

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index fe63723..7b6627f 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1056,6 +1056,43 @@ int ipoib_add_umcast_attr(struct net_device *dev)
 	return device_create_file(&dev->dev, &dev_attr_umcast);
 }
 
+static ssize_t show_bcast_scope(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct ipoib_dev_priv *priv = netdev_priv(to_net_dev(dev));
+
+	return sprintf(buf, "0x%x\n", priv->dev->broadcast[5] & 0xF);
+}
+
+static ssize_t set_bcast_scope(struct device *dev,
+			       struct device_attribute *attr,
+			       const char *buf, size_t count)
+{
+	struct ipoib_dev_priv *priv = netdev_priv(to_net_dev(dev));
+	int scope;
+
+	if (priv->dev->flags & IFF_UP)
+		return -EBUSY;
+
+	if (sscanf(buf, "%i", &scope) != 1)
+		return -EINVAL;
+
+	switch (scope) {
+	case 0x2: /* link-local */
+	case 0x5: /* site-local */
+	case 0x8: /* organization-local */
+	case 0xE: /* global */
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	priv->dev->broadcast[5] &= ~0xF;
+	priv->dev->broadcast[5] |= scope;
+	return count;
+}
+static DEVICE_ATTR(broadcast_scope, S_IWUSR | S_IRUGO, show_bcast_scope, set_bcast_scope);
+
 static ssize_t create_child(struct device *dev,
 			    struct device_attribute *attr,
 			    const char *buf, size_t count)
@@ -1179,6 +1216,8 @@ static struct net_device *ipoib_add_port(const char *format,
 		goto sysfs_failed;
 	if (device_create_file(&priv->dev->dev, &dev_attr_delete_child))
 		goto sysfs_failed;
+	if (device_create_file(&priv->dev->dev, &dev_attr_broadcast_scope))
+		goto sysfs_failed;
 
 	return priv->dev;
 
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 858ada1..00e29b9 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -788,9 +788,11 @@ void ipoib_mcast_restart_task(struct work_struct *work)
 
 		memcpy(mgid.raw, mclist->dmi_addr + 4, sizeof mgid);
 
-		/* Add in the P_Key */
-		mgid.raw[4] = (priv->pkey >> 8) & 0xff;
-		mgid.raw[5] = priv->pkey & 0xff;
+		/* FIXME: ipv6 maps the all-nodes multicast group at device creation,
+		   so the mapping can change if the broadcast_scope is changed.  If
+		   the ipv6 core can delay joining the all-nodes group until after
+		   the link is brought up, then this can go away: */
+		mgid.raw[1] = (mgid.raw[1] & ~0xF) | (priv->dev->broadcast[5] & 0xF);
 
 		mcast = __ipoib_mcast_find(dev, &mgid);
 		if (!mcast || test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) {


From rvm at obsidianresearch.com  Mon Dec 10 12:41:16 2007
From: rvm at obsidianresearch.com (Rolf Manderscheid)
Date: Mon, 10 Dec 2007 13:41:16 -0700
Subject: [ofa-general] [PATCH 3/4] ipoib: support multiple devices on the
	same partition
In-Reply-To: <20071210203544.GI30090@obsidianresearch.com>
References: <20071210203544.GI30090@obsidianresearch.com>
Message-ID: <20071210204116.GL30090@obsidianresearch.com>

Multiple devices can coexist in a partition provided they have different scopes.
This patch adds an optional scope argument to the create_child / delete_child
attributes.  Created child devices are still named the same way as before, so
in order to create multiple child devices in the same partition, an interface
name change will be necessary to make room for subsequent children, eg:
	echo 0xffff 0xe > /sys/class/net/ib0/create_child
	ip link set ib0.ffff name ib0.global

Signed-off-by: Rolf Manderscheid <rvm at obsidianresearch.com>

---
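
A small userspace sketch of how the optional second argument is parsed,
assuming the same "%i %i" format as the patch: if only the pkey is given,
the scope keeps its default.  The names parse_child_spec and DEFAULT_SCOPE
are hypothetical, and scope validation is omitted for brevity.

```c
#include <errno.h>
#include <stdio.h>

#define DEFAULT_SCOPE 0x2	/* link-local, used when no scope is given */

/* Parse "<pkey> [<scope>]".  sscanf leaves *scope untouched when the
 * second field is missing, so the default survives. */
static int parse_child_spec(const char *buf, int *pkey, int *scope)
{
	*scope = DEFAULT_SCOPE;

	if (sscanf(buf, "%i %i", pkey, scope) < 1)
		return -EINVAL;

	if (*pkey < 0 || *pkey > 0xffff)
		return -EINVAL;

	return 0;
}
```

So "0xffff 0xe" selects global scope, a bare "0x8001" falls back to
link-local, and an empty string is rejected.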

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index d35025f..a1834d8 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -450,13 +450,29 @@ void ipoib_transport_dev_cleanup(struct net_device *dev);
 void ipoib_event(struct ib_event_handler *handler,
 		 struct ib_event *record);
 
-int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey);
-int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey);
+int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey,
+		   unsigned char scope);
+int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey,
+		      unsigned char scope);
 
 void ipoib_pkey_poll(struct work_struct *work);
 int ipoib_pkey_dev_delay_open(struct net_device *dev);
 void ipoib_drain_cq(struct net_device *dev);
 
+int ipoib_pkey_scope_in_use(struct net_device *dev, unsigned short pkey,
+			    unsigned char scope);
+
+static inline unsigned char ipoib_get_scope(struct net_device *dev)
+{
+	return dev->broadcast[5] & 0xF;
+}
+
+static inline void ipoib_set_scope(struct net_device *dev, unsigned char scope)
+{
+	dev->broadcast[5] &= ~0xF;
+	dev->broadcast[5] |= scope;
+}
+
 #ifdef CONFIG_INFINIBAND_IPOIB_CM
 
 #define IPOIB_FLAGS_RC		0x80
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 7b6627f..07cce7f 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -79,6 +79,8 @@ static const u8 ipv4_bcast_addr[] = {
 	0x00, 0x00, 0x00, 0x00,	0xff, 0xff, 0xff, 0xff
 };
 
+static const u8 default_scope = 2; /* link local */
+
 struct workqueue_struct *ipoib_workqueue;
 
 struct ib_sa_client ipoib_sa_client;
@@ -1059,9 +1061,20 @@ int ipoib_add_umcast_attr(struct net_device *dev)
 static ssize_t show_bcast_scope(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
-	struct ipoib_dev_priv *priv = netdev_priv(to_net_dev(dev));
+	return sprintf(buf, "0x%x\n", (int) ipoib_get_scope(to_net_dev(dev)));
+}
 
-	return sprintf(buf, "0x%x\n", priv->dev->broadcast[5] & 0xF);
+static int valid_scope(unsigned char scope)
+{
+	switch (scope) {
+	case 0x2: /* link-local */
+	case 0x5: /* site-local */
+	case 0x8: /* organization-local */
+	case 0xE: /* global */
+		return 1;
+	default:
+		return 0;
+	}
 }
 
 static ssize_t set_bcast_scope(struct device *dev,
@@ -1074,21 +1087,16 @@ static ssize_t set_bcast_scope(struct device *dev,
 	if (priv->dev->flags & IFF_UP)
 		return -EBUSY;
 
-	if (sscanf(buf, "%i", &scope) != 1)
+	if (sscanf(buf, "%i", &scope) != 1 || !valid_scope(scope))
 		return -EINVAL;
 
-	switch (scope) {
-	case 0x2: /* link-local */
-	case 0x5: /* site-local */
-	case 0x8: /* organization-local */
-	case 0xE: /* global */
-		break;
-	default:
-		return -EINVAL;
-	}
+	if (ipoib_get_scope(priv->dev) == scope) /* no change */
+		return count;
 
-	priv->dev->broadcast[5] &= ~0xF;
-	priv->dev->broadcast[5] |= scope;
+	if (ipoib_pkey_scope_in_use(priv->dev, priv->pkey, scope))
+		return -ENOTUNIQ;
+
+	ipoib_set_scope(priv->dev, scope);
 	return count;
 }
 static DEVICE_ATTR(broadcast_scope, S_IWUSR | S_IRUGO, show_bcast_scope, set_bcast_scope);
@@ -1098,21 +1106,25 @@ static ssize_t create_child(struct device *dev,
 			    const char *buf, size_t count)
 {
 	int pkey;
+	int scope = default_scope;
 	int ret;
 
-	if (sscanf(buf, "%i", &pkey) != 1)
+	if (sscanf(buf, "%i %i", &pkey, &scope) < 1)
 		return -EINVAL;
 
 	if (pkey < 0 || pkey > 0xffff)
 		return -EINVAL;
 
+	if (!valid_scope(scope))
+		return -EINVAL;
+
 	/*
 	 * Set the full membership bit, so that we join the right
 	 * broadcast group, etc.
 	 */
 	pkey |= 0x8000;
 
-	ret = ipoib_vlan_add(to_net_dev(dev), pkey);
+	ret = ipoib_vlan_add(to_net_dev(dev), pkey, scope);
 
 	return ret ? ret : count;
 }
@@ -1123,21 +1135,44 @@ static ssize_t delete_child(struct device *dev,
 			    const char *buf, size_t count)
 {
 	int pkey;
+	int scope = default_scope;
 	int ret;
 
-	if (sscanf(buf, "%i", &pkey) != 1)
+	if (sscanf(buf, "%i %i", &pkey, &scope) < 1)
 		return -EINVAL;
 
 	if (pkey < 0 || pkey > 0xffff)
 		return -EINVAL;
 
-	ret = ipoib_vlan_delete(to_net_dev(dev), pkey);
+	if (!valid_scope(scope))
+		return -EINVAL;
+
+	ret = ipoib_vlan_delete(to_net_dev(dev), pkey, scope);
 
 	return ret ? ret : count;
 
 }
 static DEVICE_ATTR(delete_child, S_IWUGO, NULL, delete_child);
 
+int ipoib_pkey_scope_in_use(struct net_device *dev, unsigned short pkey, unsigned char scope)
+{
+	struct ipoib_dev_priv *ppriv, *priv;
+
+	ppriv = netdev_priv(dev);
+	/*
+	 * We check the parent device and then all of the child interfaces to make sure
+	 * the Pkey and scope don't match.
+	 */
+	if (ppriv->pkey == pkey && ipoib_get_scope(dev) == scope)
+		return 1;
+
+	list_for_each_entry(priv, &ppriv->child_intfs, list)
+		if (priv->pkey == pkey && ipoib_get_scope(priv->dev) == scope)
+			return 1;
+
+	return 0;
+}
+
 int ipoib_add_pkey_attr(struct net_device *dev)
 {
 	return device_create_file(&dev->dev, &dev_attr_pkey);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
index 293f5b8..280556f 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
@@ -52,7 +52,8 @@ static ssize_t show_parent(struct device *d, struct device_attribute *attr,
 }
 static DEVICE_ATTR(parent, S_IRUGO, show_parent, NULL);
 
-int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
+int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey,
+		   unsigned char scope)
 {
 	struct ipoib_dev_priv *ppriv, *priv;
 	char intf_name[IFNAMSIZ];
@@ -65,22 +66,11 @@ int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
 
 	mutex_lock(&ppriv->vlan_mutex);
 
-	/*
-	 * First ensure this isn't a duplicate. We check the parent device and
-	 * then all of the child interfaces to make sure the Pkey doesn't match.
-	 */
-	if (ppriv->pkey == pkey) {
+	if (ipoib_pkey_scope_in_use(pdev, pkey, scope)) {
 		result = -ENOTUNIQ;
 		goto err;
 	}
 
-	list_for_each_entry(priv, &ppriv->child_intfs, list) {
-		if (priv->pkey == pkey) {
-			result = -ENOTUNIQ;
-			goto err;
-		}
-	}
-
 	snprintf(intf_name, sizeof intf_name, "%s.%04x",
 		 ppriv->dev->name, pkey);
 	priv = ipoib_intf_alloc(intf_name);
@@ -97,6 +87,8 @@ int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
 	priv->dev->broadcast[8] = pkey >> 8;
 	priv->dev->broadcast[9] = pkey & 0xff;
 
+	ipoib_set_scope(priv->dev, scope);
+
 	result = ipoib_dev_init(priv->dev, ppriv->ca, ppriv->port);
 	if (result < 0) {
 		ipoib_warn(ppriv, "failed to initialize subinterface: "
@@ -146,7 +138,8 @@ err:
 	return result;
 }
 
-int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey)
+int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey,
+		      unsigned char scope)
 {
 	struct ipoib_dev_priv *ppriv, *priv, *tpriv;
 	int ret = -ENOENT;
@@ -158,7 +151,7 @@ int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey)
 
 	mutex_lock(&ppriv->vlan_mutex);
 	list_for_each_entry_safe(priv, tpriv, &ppriv->child_intfs, list) {
-		if (priv->pkey == pkey) {
+		if (priv->pkey == pkey && ipoib_get_scope(priv->dev) == scope) {
 			unregister_netdev(priv->dev);
 			ipoib_dev_cleanup(priv->dev);
 			list_del(&priv->list);


From rvm at obsidianresearch.com  Mon Dec 10 12:42:12 2007
From: rvm at obsidianresearch.com (Rolf Manderscheid)
Date: Mon, 10 Dec 2007 13:42:12 -0700
Subject: [ofa-general] [PATCH 4/4] ipoib: add broadcast_scope attribute to
	child devices
In-Reply-To: <20071210203544.GI30090@obsidianresearch.com>
References: <20071210203544.GI30090@obsidianresearch.com>
Message-ID: <20071210204212.GM30090@obsidianresearch.com>

This patch just makes child devices slightly more consistent with parent devices
with respect to configuring the scope.  It allows the administrator to correct
the scope after the child has already been created.  The scope argument to
create_child is still needed to support a parent device with default scope
and a child device within the same partition.

Signed-off-by: Rolf Manderscheid <rvm at obsidianresearch.com>

---
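
A minimal sketch of the uniqueness check once children can change scope
too: the lookup first walks up to the parent, then compares the (pkey,
scope) pair against the parent and every child.  The struct intf type and
the function name pkey_scope_in_use are hypothetical simplifications; the
driver uses its own private structure and list helpers.

```c
#include <stddef.h>

/* Simplified stand-in for an ipoib interface. */
struct intf {
	struct intf *parent;	/* NULL for a parent device */
	struct intf *children;	/* head of the child list (parents only) */
	struct intf *next;	/* next sibling in the parent's child list */
	unsigned short pkey;
	unsigned char scope;
};

/* Return 1 if any interface in dev's family already uses (pkey, scope). */
static int pkey_scope_in_use(const struct intf *dev, unsigned short pkey,
			     unsigned char scope)
{
	/* A child has no child list of its own, so start from its parent. */
	const struct intf *p = dev->parent ? dev->parent : dev;
	const struct intf *c;

	if (p->pkey == pkey && p->scope == scope)
		return 1;

	for (c = p->children; c; c = c->next)
		if (c->pkey == pkey && c->scope == scope)
			return 1;

	return 0;
}
```

Called on a child, the check therefore sees its siblings and the parent,
not just the (empty) list hanging off the child itself.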

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index a1834d8..78d5b2d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -390,6 +390,7 @@ static inline void ipoib_put_ah(struct ipoib_ah *ah)
 
 int ipoib_open(struct net_device *dev);
 int ipoib_add_pkey_attr(struct net_device *dev);
+int ipoib_add_broadcast_scope_attr(struct net_device *dev);
 int ipoib_add_umcast_attr(struct net_device *dev);
 
 void ipoib_send(struct net_device *dev, struct sk_buff *skb,
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 07cce7f..f72266b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1101,6 +1101,11 @@ static ssize_t set_bcast_scope(struct device *dev,
 }
 static DEVICE_ATTR(broadcast_scope, S_IWUSR | S_IRUGO, show_bcast_scope, set_bcast_scope);
 
+int ipoib_add_broadcast_scope_attr(struct net_device *dev)
+{
+	return device_create_file(&dev->dev, &dev_attr_broadcast_scope);
+}
+
 static ssize_t create_child(struct device *dev,
 			    struct device_attribute *attr,
 			    const char *buf, size_t count)
@@ -1159,6 +1164,11 @@ int ipoib_pkey_scope_in_use(struct net_device *dev, unsigned short pkey, unsigne
 	struct ipoib_dev_priv *ppriv, *priv;
 
 	ppriv = netdev_priv(dev);
+	if (ppriv->parent) {
+		dev = ppriv->parent;
+		ppriv = netdev_priv(dev);
+	}
+
 	/*
 	 * We check the parent device and then all of the child interfaces to make sure
 	 * the Pkey and scope don't match.
@@ -1251,7 +1261,7 @@ static struct net_device *ipoib_add_port(const char *format,
 		goto sysfs_failed;
 	if (device_create_file(&priv->dev->dev, &dev_attr_delete_child))
 		goto sysfs_failed;
-	if (device_create_file(&priv->dev->dev, &dev_attr_broadcast_scope))
+	if (ipoib_add_broadcast_scope_attr(priv->dev))
 		goto sysfs_failed;
 
 	return priv->dev;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
index 280556f..317a79d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
@@ -116,6 +116,8 @@ int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey,
 
 	if (device_create_file(&priv->dev->dev, &dev_attr_parent))
 		goto sysfs_failed;
+	if (ipoib_add_broadcast_scope_attr(priv->dev))
+		goto sysfs_failed;
 
 	list_add_tail(&priv->list, &ppriv->child_intfs);
 

From YJia at tmriusa.com  Mon Dec 10 13:24:21 2007
From: YJia at tmriusa.com (Yicheng Jia)
Date: Mon, 10 Dec 2007 15:24:21 -0600
Subject: [ofa-general] install OFED-1.2.5 on cygwin
Message-ID: <OF7A0D6D72.C835334F-ON862573AD.00753EAB-862573AD.0075A523@medical.local>

Hi All,

I'm a newbie to OFED. Can I install the OFED-1.2.5 package in a cygwin
environment? I got the following error when I tried:

Checking dependencies. Please wait ...

ERROR: There are no sources for the 1.5.24(0.156/4/2) kernel installed. 
Please install 1.5.24(0.156/4/2) kernel sources to build RPMs on this 
system

Does anyone know how to solve it?

Thanks!
Yicheng Jia

From weiny2 at llnl.gov  Mon Dec 10 13:35:54 2007
From: weiny2 at llnl.gov (Ira Weiny)
Date: Mon, 10 Dec 2007 13:35:54 -0800
Subject: [ofa-general] [PATCH] mstflint: Convert project to autoconf tools.
Message-ID: <20071210133554.44bac886.weiny2@llnl.gov>

This patch removes the hand-written Makefile and converts the mstflint git
tree to the autoconf tools.  It works on x86_64 but has not been tested on
other architectures (although it is simple enough that I don't see how it
would not work).

Thanks,
Ira


>From efb3a07a1f333ea95204d2a2e9462e285e29a65f Mon Sep 17 00:00:00 2001
From: Ira K. Weiny <weiny2 at llnl.gov>
Date: Mon, 10 Dec 2007 13:30:22 -0800
Subject: [PATCH] Convert project to autoconf tools.


Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
---
 Makefile         |   47 -----------------------------------------------
 Makefile.am      |   21 +++++++++++++++++++++
 autogen.sh       |   11 +++++++++++
 configure.in     |   22 ++++++++++++++++++++++
 mstflint.spec.in |   45 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 99 insertions(+), 47 deletions(-)
 delete mode 100644 Makefile
 create mode 100644 Makefile.am
 create mode 100755 autogen.sh
 create mode 100644 configure.in
 create mode 100644 mstflint.spec.in

diff --git a/Makefile b/Makefile
deleted file mode 100644
index 889c97a..0000000
--- a/Makefile
+++ /dev/null
@@ -1,47 +0,0 @@
-#default options
-CFLAGS += -O2
-CFLAGS += -g
-CFLAGS += -Wall
-CXXFLAGS += -fno-exceptions
-CFLAGS += -I.
-LD=$(CXX)
-EXTRA_LOADLIBES=-lz
-LOADLIBES+=${EXTRA_LOADLIBES}
-
-all: default
-bin: mstflint mstmread mstmwrite mstregdump mstvpd
-
-default: bin
-static: bin
-shared: bin
-
-.PHONY: all bin clean static shared default
-.DELETE_ON_ERROR:
-
-default: EXTRA_LOADLIBES="$(shell $(CXX) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libz.a)" "$(shell $(CXX)  ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libstdc++.a)"
-default: LD=$(CC)
-static: CFLAGS+=-static
-
-mstflint: mstflint.o mflash.o
-	$(LD) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} mstflint.o mflash.o -o mstflint ${LOADLIBES}
-
-mstflint.o: flint.cpp mflash.h
-	$(CXX) ${CFLAGS} ${CXXFLAGS} -c flint.cpp -o mstflint.o
-
-mflash.o: mtcr.h mflash.c mflash.h
-	$(CC) ${CFLAGS} -c mflash.c -o mflash.o
-
-mstmwrite: mwrite.c mtcr.h
-	$(CC) ${CFLAGS} mwrite.c -o mstmwrite
-
-mstmread: mread.c mtcr.h
-	$(CC) ${CFLAGS} mread.c -o mstmread
-
-mstregdump: mstdump.c mtcr.h
-	$(CC) ${CFLAGS} mstdump.c -o mstregdump
-
-mstvpd: vpd.c
-	$(CC) ${CFLAGS} vpd.c -o mstvpd
-
-clean:
-	rm -f mstvpd mstregdump mstflint mstmread mstmwrite mstflint.o mflash.o
diff --git a/Makefile.am b/Makefile.am
new file mode 100644
index 0000000..f642d9d
--- /dev/null
+++ b/Makefile.am
@@ -0,0 +1,21 @@
+bin_PROGRAMS = mstmread \
+					mstmwrite \
+					mstflint \
+					mstregdump \
+					mstvpd
+
+mstmread_SOURCES = mread.c mtcr.h
+
+mstmwrite_SOURCES = mwrite.c mtcr.h
+
+mstflint_SOURCES = flint.cpp mtcr.h mflash.h mflash.c
+mstflint_LDADD = -lz
+
+mstregdump_SOURCES = mstdump.c mtcr.h
+
+mstvpd_SOURCES = vpd.c
+
+
+EXTRA_DIST = \
+	mstflint.spec
+
diff --git a/autogen.sh b/autogen.sh
new file mode 100755
index 0000000..4827884
--- /dev/null
+++ b/autogen.sh
@@ -0,0 +1,11 @@
+#! /bin/sh
+
+# create config dir if not exist
+test -d config || mkdir config
+
+set -x
+aclocal -I config
+libtoolize --force --copy
+autoheader
+automake --foreign --add-missing --copy
+autoconf
diff --git a/configure.in b/configure.in
new file mode 100644
index 0000000..0924d65
--- /dev/null
+++ b/configure.in
@@ -0,0 +1,22 @@
+dnl Process this file with autoconf to produce a configure script.
+
+AC_INIT(mstflint)
+
+AC_DEFINE_UNQUOTED([PROJECT], ["mstflint"], [Define the project name.])
+AC_SUBST([PROJECT])
+
+AC_DEFINE_UNQUOTED([VERSION], ["1.3"], [Define the project version.])
+AC_SUBST([VERSION])
+
+AC_CONFIG_AUX_DIR(config)
+AC_CONFIG_SRCDIR([README])
+AM_INIT_AUTOMAKE(mstflint, 1.3)
+
+dnl Checks for programs
+AC_PROG_CC
+AC_PROG_CXX
+AC_PROG_LIBTOOL
+AC_CONFIG_HEADERS
+
+AC_CONFIG_FILES([Makefile mstflint.spec])
+AC_OUTPUT
diff --git a/mstflint.spec.in b/mstflint.spec.in
new file mode 100644
index 0000000..b5937be
--- /dev/null
+++ b/mstflint.spec.in
@@ -0,0 +1,45 @@
+Summary: Mellanox firmware burning application
+Name: mstflint
+Version: @VERSION@
+Release: 1
+License: GPL/BSD
+Url: http://openib.org/
+Group: System Environment/Base
+BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}
+Source: mstflint-@VERSION@.tar.gz
+ExclusiveArch: i386 x86_64 ia64 ppc ppc64
+BuildRequires: zlib-devel
+Requires(post): chkconfig
+
+%description
+This package contains a tool for burning updated firmware on to
+Mellanox manufactured InfiniBand adapters.
+
+%prep
+%setup -q
+
+%build
+%configure
+make
+
+%install
+rm -rf $RPM_BUILD_ROOT
+make DESTDIR=${RPM_BUILD_ROOT} install
+# remove unpackaged files from the buildroot
+rm -f $RPM_BUILD_ROOT%{_libdir}/*.la
+
+%clean
+rm -rf $RPM_BUILD_ROOT
+
+%files
+%defattr(-,root,root)
+%{_bindir}/mstmread
+%{_bindir}/mstmwrite
+%{_bindir}/mstflint
+%{_bindir}/mstregdump
+%{_bindir}/mstvpd
+
+%changelog
+* Fri Dec 07 2007 Ira Weiny <weiny2 at llnl.gov> 1.0.0
+   initial creation
+
-- 
1.5.1

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Convert-project-to-autoconf-tools.patch
Type: application/octet-stream
Size: 4652 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071210/59ea7719/attachment.obj>

From rdreier at cisco.com  Mon Dec 10 13:47:37 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 10 Dec 2007 13:47:37 -0800
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
In-Reply-To: <200712101841.30010.fenkes@de.ibm.com> (Joachim Fenkes's message
	of "Mon, 10 Dec 2007 18:41:29 +0100")
References: <200712061607.20004.fenkes@de.ibm.com>
	<200712071058.38416.arnd@arndb.de> <adaodcza1xc.fsf@cisco.com>
	<200712101841.30010.fenkes@de.ibm.com>
Message-ID: <adahciq9q86.fsf@cisco.com>

 > >     map_phys_fmr
 > 
 > In fact, we do use hCalls there. Our hardware doesn't actually support FMRs,
 > so we translate a "map FMR" into a "reallocate PMR", which doesn't work
 > without hCalls. What's more, the hCalls involved (e.g. H_FREE_RESOURCE)
 > might well return H_LONG_BUSY, so the whole operation might sleep; no way
 > around it.

It's a big problem.  If you cannot implement FMRs in such a way that
you can handle having map_phys_fmr called in a context that
can't sleep, then I think the only option is to remove your FMR
support.  It's an optional device feature, so this should be OK
(although the iSER driver currently seems to depend on a device
supporting FMRs, which is probably going to be a problem with iWARP
support in the future anyway).

The fact that consumers can map FMRs from interrupt context, while
holding locks, etc, is pretty fundamental to the use of FMRs so I
don't see any way around the requirement that map_phys_fmr never
sleep.

 - R.
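
The constraint described above can be illustrated with a small user-space
sketch.  Everything in it is hypothetical (fake_fw, fw_call, map_may_sleep
and map_atomic are invented names, not ehca or kernel APIs), and it is not
a fix proposed in this thread.  It only shows why a "busy, try again later"
result from firmware is awkward in a context that cannot sleep: the caller
may wait in process context, but in atomic context the only safe reaction
is to fail immediately.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical status codes standing in for firmware (hCall) results. */
enum fw_status { FW_SUCCESS, FW_LONG_BUSY };

/* Fake firmware: reports busy for the next busy_left calls. */
struct fake_fw {
	int busy_left;
	int sleeps;	/* counts how often a caller had to wait */
};

static enum fw_status fw_call(struct fake_fw *fw)
{
	assert(fw != NULL);
	if (fw->busy_left > 0) {
		fw->busy_left--;
		return FW_LONG_BUSY;
	}
	return FW_SUCCESS;
}

/* Process-context variant: may wait and retry until the call succeeds. */
static int map_may_sleep(struct fake_fw *fw)
{
	while (fw_call(fw) == FW_LONG_BUSY)
		fw->sleeps++;	/* a real driver would sleep here */
	return 0;
}

/* Atomic-context variant: must not wait, so a busy result becomes an error. */
static int map_atomic(struct fake_fw *fw)
{
	return fw_call(fw) == FW_SUCCESS ? 0 : -EBUSY;
}
```

In this sketch a consumer calling map_atomic() while holding a lock gets
-EBUSY instead of a hidden sleep; whether such a failure mode would be
acceptable to FMR consumers is a separate question.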


From sean.hefty at intel.com  Mon Dec 10 15:53:25 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Mon, 10 Dec 2007 15:53:25 -0800
Subject: [ofa-general] [PATCH] IB/CM: add support for routed paths
In-Reply-To: <20071210203544.GI30090@obsidianresearch.com>
References: <20071210203544.GI30090@obsidianresearch.com>
Message-ID: <000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com>

Paths with hop_limit > 1 indicate that the connection will be routed
between IB subnets.  Update the subnet local field in the CM REQ
based on the hop_limit value.  In addition, if the path is routed, then
set the LIDs in the REQ to the permissive LIDs.  This is used to indicate
to the passive side that it should use the LIDs in the received
local route header (LRH) associated with the REQ when programming the
QP.

This is a temporary work-around to the IB CM to support IB router
development until the IB router specification is completed.  It is not
anticipated that this work-around will cause any interoperability
issues with existing stacks or future stacks that will properly support
IB routers when defined.

Signed-off-by: Sean Hefty <sean.hefty at intel.com>
---
This is a related patch to add support for IB routers in the IB CM.  This
patch differs from previous patches in that the changes are limited to
the ib_cm module only, and the ib_cm interface remains unchanged.  I believe
that changes to the interface and ULPs will eventually be needed, but
it would be better to defer those changes until the router spec is better
defined.

This patch and those posted by Rolf are also available at:

git://git.openfabrics.org/~shefty/rdma-dev.git ib_router


 drivers/infiniband/core/cm.c |   91 +++++++++++++++++++++++++++++++-----------
 1 files changed, 67 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 2e39236..c3212b9 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004-2006 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2004-2007 Intel Corporation.  All rights reserved.
  * Copyright (c) 2004 Topspin Corporation.  All rights reserved.
  * Copyright (c) 2004, 2005 Voltaire Corporation.  All rights reserved.
  * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
@@ -895,6 +895,9 @@ static void cm_format_req(struct cm_req_msg *req_msg,
 			  struct cm_id_private *cm_id_priv,
 			  struct ib_cm_req_param *param)
 {
+	struct ib_sa_path_rec *pri_path = param->primary_path;
+	struct ib_sa_path_rec *alt_path = param->alternate_path;
+
 	cm_format_mad_hdr(&req_msg->hdr, CM_REQ_ATTR_ID,
 			  cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_REQ));
 
@@ -918,35 +921,46 @@ static void cm_format_req(struct cm_req_msg *req_msg,
 	cm_req_set_max_cm_retries(req_msg, param->max_cm_retries);
 	cm_req_set_srq(req_msg, param->srq);
 
-	req_msg->primary_local_lid = param->primary_path->slid;
-	req_msg->primary_remote_lid = param->primary_path->dlid;
-	req_msg->primary_local_gid = param->primary_path->sgid;
-	req_msg->primary_remote_gid = param->primary_path->dgid;
-	cm_req_set_primary_flow_label(req_msg, param->primary_path->flow_label);
-	cm_req_set_primary_packet_rate(req_msg, param->primary_path->rate);
-	req_msg->primary_traffic_class = param->primary_path->traffic_class;
-	req_msg->primary_hop_limit = param->primary_path->hop_limit;
-	cm_req_set_primary_sl(req_msg, param->primary_path->sl);
-	cm_req_set_primary_subnet_local(req_msg, 1); /* local only... */
+	if (pri_path->hop_limit <= 1) {
+		req_msg->primary_local_lid = pri_path->slid;
+		req_msg->primary_remote_lid = pri_path->dlid;
+	} else {
+		/* Work-around until there's a way to obtain remote LID info */
+		req_msg->primary_local_lid = IB_LID_PERMISSIVE;
+		req_msg->primary_remote_lid = IB_LID_PERMISSIVE;
+	}
+	req_msg->primary_local_gid = pri_path->sgid;
+	req_msg->primary_remote_gid = pri_path->dgid;
+	cm_req_set_primary_flow_label(req_msg, pri_path->flow_label);
+	cm_req_set_primary_packet_rate(req_msg, pri_path->rate);
+	req_msg->primary_traffic_class = pri_path->traffic_class;
+	req_msg->primary_hop_limit = pri_path->hop_limit;
+	cm_req_set_primary_sl(req_msg, pri_path->sl);
+	cm_req_set_primary_subnet_local(req_msg, (pri_path->hop_limit <= 1));
 	cm_req_set_primary_local_ack_timeout(req_msg,
 		cm_ack_timeout(cm_id_priv->av.port->cm_dev->ack_delay,
-			       param->primary_path->packet_life_time));
+			       pri_path->packet_life_time));
 
-	if (param->alternate_path) {
-		req_msg->alt_local_lid = param->alternate_path->slid;
-		req_msg->alt_remote_lid = param->alternate_path->dlid;
-		req_msg->alt_local_gid = param->alternate_path->sgid;
-		req_msg->alt_remote_gid = param->alternate_path->dgid;
+	if (alt_path) {
+		if (alt_path->hop_limit <= 1) {
+			req_msg->alt_local_lid = alt_path->slid;
+			req_msg->alt_remote_lid = alt_path->dlid;
+		} else {
+			req_msg->alt_local_lid = IB_LID_PERMISSIVE;
+			req_msg->alt_remote_lid = IB_LID_PERMISSIVE;
+		}
+		req_msg->alt_local_gid = alt_path->sgid;
+		req_msg->alt_remote_gid = alt_path->dgid;
 		cm_req_set_alt_flow_label(req_msg,
-					  param->alternate_path->flow_label);
-		cm_req_set_alt_packet_rate(req_msg, param->alternate_path->rate);
-		req_msg->alt_traffic_class = param->alternate_path->traffic_class;
-		req_msg->alt_hop_limit = param->alternate_path->hop_limit;
-		cm_req_set_alt_sl(req_msg, param->alternate_path->sl);
-		cm_req_set_alt_subnet_local(req_msg, 1); /* local only... */
+					  alt_path->flow_label);
+		cm_req_set_alt_packet_rate(req_msg, alt_path->rate);
+		req_msg->alt_traffic_class = alt_path->traffic_class;
+		req_msg->alt_hop_limit = alt_path->hop_limit;
+		cm_req_set_alt_sl(req_msg, alt_path->sl);
+		cm_req_set_alt_subnet_local(req_msg, (alt_path->hop_limit <= 1));
 		cm_req_set_alt_local_ack_timeout(req_msg,
 			cm_ack_timeout(cm_id_priv->av.port->cm_dev->ack_delay,
-				       param->alternate_path->packet_life_time));
+				       alt_path->packet_life_time));
 	}
 
 	if (param->private_data && param->private_data_len)
@@ -1359,6 +1373,34 @@ out:
 	return listen_cm_id_priv;
 }
 
+/*
+ * Work-around for inter-subnet connections.  If the LIDs are permissive,
+ * we need to override the LID/SL data in the REQ with the LID information
+ * in the work completion.
+ */
+static void cm_process_routed_req(struct cm_req_msg *req_msg, struct ib_wc *wc)
+{
+	if (!cm_req_get_primary_subnet_local(req_msg)) {
+		if (req_msg->primary_local_lid == IB_LID_PERMISSIVE) {
+			req_msg->primary_local_lid = cpu_to_be16(wc->slid);
+			cm_req_set_primary_sl(req_msg, wc->sl);
+		}
+
+		if (req_msg->primary_remote_lid == IB_LID_PERMISSIVE)
+			req_msg->primary_remote_lid = cpu_to_be16(wc->dlid_path_bits);
+	}
+
+	if (!cm_req_get_alt_subnet_local(req_msg)) {
+		if (req_msg->alt_local_lid == IB_LID_PERMISSIVE) {
+			req_msg->alt_local_lid = cpu_to_be16(wc->slid);
+			cm_req_set_alt_sl(req_msg, wc->sl);
+		}
+
+		if (req_msg->alt_remote_lid == IB_LID_PERMISSIVE)
+			req_msg->alt_remote_lid = cpu_to_be16(wc->dlid_path_bits);
+	}
+}
+
 static int cm_req_handler(struct cm_work *work)
 {
 	struct ib_cm_id *cm_id;
@@ -1399,6 +1441,7 @@ static int cm_req_handler(struct cm_work *work)
 	cm_id_priv->id.service_id = req_msg->service_id;
 	cm_id_priv->id.service_mask = __constant_cpu_to_be64(~0ULL);
 
+	cm_process_routed_req(req_msg, work->mad_recv_wc->wc);
 	cm_format_paths_from_req(req_msg, &work->path[0], &work->path[1]);
 	ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av);
 	if (ret) {
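
The LID handling in the patch above can be summarized with a simplified,
stand-alone sketch.  The struct and function names below (path_info,
req_lids, recv_info, format_req_lids, process_routed_req) are hypothetical
stand-ins, not the kernel's types, and the sketch deliberately ignores byte
order, the SL override, and the alternate path.  It assumes the permissive
LID value 0xFFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LID_PERMISSIVE 0xFFFFu	/* permissive LID, host byte order here */

/* Hypothetical, simplified stand-ins for the path record and REQ fields. */
struct path_info { uint16_t slid, dlid; uint8_t hop_limit; };
struct req_lids  { uint16_t local_lid, remote_lid; bool subnet_local; };
/* LIDs taken from the header of the received REQ packet. */
struct recv_info { uint16_t slid, dlid; };

/* Active side: a hop_limit above 1 marks the path as routed. */
static void format_req_lids(struct req_lids *req, const struct path_info *path)
{
	assert(req != NULL && path != NULL);
	req->subnet_local = (path->hop_limit <= 1);
	if (req->subnet_local) {
		req->local_lid = path->slid;
		req->remote_lid = path->dlid;
	} else {
		/* Remote LIDs are unknown across subnets: send placeholders. */
		req->local_lid = LID_PERMISSIVE;
		req->remote_lid = LID_PERMISSIVE;
	}
}

/* Passive side: replace placeholders with the LIDs the packet arrived with. */
static void process_routed_req(struct req_lids *req, const struct recv_info *rx)
{
	assert(req != NULL && rx != NULL);
	if (req->subnet_local)
		return;
	if (req->local_lid == LID_PERMISSIVE)
		req->local_lid = rx->slid;
	if (req->remote_lid == LID_PERMISSIVE)
		req->remote_lid = rx->dlid;
}
```

A subnet-local REQ passes through process_routed_req() unchanged, so in
this sketch only routed connections are affected by the work-around.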


From chu11 at llnl.gov  Mon Dec 10 16:39:37 2007
From: chu11 at llnl.gov (Al Chu)
Date: Mon, 10 Dec 2007 16:39:37 -0800
Subject: [ofa-general] [PATCH] OpenSM: Fix error return corner case
Message-ID: <1197333577.29314.154.camel@cardanus.llnl.gov>

Hey Sasha,

I noticed that in osm_ucast_updn_setup(), the code has set the
context/callbacks in the routing engine struct, but does not revert the
changes if the later call to updn_init() fails.  The callbacks would be
left in place and erroneously executed at a later time.  Patch is
attached.

Thanks,
Al

-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-fix-error-return-corner-case.patch
Type: text/x-patch
Size: 1814 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071210/250c1faa/attachment.bin>
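
Since the attachment was scrubbed from the archive, here is a minimal
sketch of the kind of roll-back the message describes.  It is not the
attached patch: struct routing_engine, demo_init() and engine_setup() are
hypothetical, simplified names, and the real OpenSM structures differ.  The
point is only that callbacks installed before a failing init step are
cleared again on the error path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified routing-engine struct; not the OpenSM one. */
struct routing_engine {
	void *context;
	int (*build_tables)(void *context);
	void (*destroy)(void *context);
};

static int demo_build(void *context) { (void)context; return 0; }
static void demo_destroy(void *context) { (void)context; }

/* Stand-in for an init step that may fail. */
static int demo_init(int should_fail) { return should_fail ? -1 : 0; }

static int engine_setup(struct routing_engine *r, void *ctx, int should_fail)
{
	assert(r != NULL);

	/* Install the context and callbacks first, as the original code does. */
	r->context = ctx;
	r->build_tables = demo_build;
	r->destroy = demo_destroy;

	if (demo_init(should_fail) != 0) {
		/* Revert, so stale callbacks cannot be invoked later. */
		r->context = NULL;
		r->build_tables = NULL;
		r->destroy = NULL;
		return -1;
	}
	return 0;
}
```

An alternative with the same effect would be to call the init step first
and install the callbacks only after it succeeds.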

From kliteyn at mellanox.co.il  Mon Dec 10 21:18:45 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 11 Dec 2007 07:18:45 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-11:normal completion
Message-ID: <MTLEXCH01oZolmRBaYF0000035b@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-10
OpenSM git rev = Sat_Dec_8_00:46:58_2007 [139abe25ada3bae14b1173fb2842b1fe1c7d171a]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=520  Fail=0
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From FENKES at de.ibm.com  Tue Dec 11 00:38:58 2007
From: FENKES at de.ibm.com (Joachim Fenkes)
Date: Tue, 11 Dec 2007 09:38:58 +0100
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on
	POWER5
In-Reply-To: <adahciq9q86.fsf@cisco.com>
Message-ID: <OFD9564F75.44193623-ONC12573AE.002EA542-C12573AE.002F5FBC@de.ibm.com>

Roland Dreier <rdreier at cisco.com> wrote on 10.12.2007 22:47:37:

> It's a big problem.  If you cannot implement FMRs in such a way that
> you can handle having map_phys_fmr called in a context that
> can't sleep, then I think the only option is to remove your FMR
> support.

That's kind of what I feared you would say =)

> It's an optional device feature, so this should be OK
> (although the iSER driver currently seems to depend on a device
> supporting FMRs, which is probably going to be a problem with iWARP
> support in the future anyway).

I'm not comfortable removing code from the driver that iSER seems 
to depend on. Are there plans to fix this in iSER?

In reality, PHYP rarely ever returns H_LONG_BUSY, and we haven't had any 
problems with iSER in the field yet. I admit that our FMR code is 
dangerous, but I prefer "dangerous but working for the customer" over "not 
working for the customer at all".
 
Maybe we can agree on keeping the status quo until no more ULPs depend on 
FMR, then remove FMR from ehca? If so, we'd also let the _irqsave 
spinlocks around hCalls stay in place.

Regards,
  Joachim


From vlad at lists.openfabrics.org  Tue Dec 11 03:09:10 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Tue, 11 Dec 2007 03:09:10 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071211-0200 daily build status
Message-ID: <20071211110910.48252E60051@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.18
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.19
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.17
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.14
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.22
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.15
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.16.21-0.8-smp

Failed:


From hrosenstock at xsigo.com  Tue Dec 11 04:59:45 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Tue, 11 Dec 2007 04:59:45 -0800
Subject: [ofa-general] [PATCH] IB/CM: add support for routed paths
In-Reply-To: <000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com>
References: <20071210203544.GI30090@obsidianresearch.com>
	<000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com>
Message-ID: <1197377985.8114.637.camel@hrosenstock-ws.xsigo.com>

Sean,

On Mon, 2007-12-10 at 15:53 -0800, Sean Hefty wrote:
> Paths with hop_limit > 1 indicate that the connection will be routed
> between IB subnets.  Update the subnet local field in the CM REQ
> based on the hop_limit value.  In addition, if the path is routed, then
> set the LIDs in the REQ to the permissive LIDs.  This is used to indicate
> to the passive side that it should use the LIDs in the received
> local route header (LRH) associated with the REQ when programming the
> QP.

Just wondering:

Could the subnet local component(s) in the REQ be used on the passive
side instead of permissive LIDs ? That might be more "standard" than
using permissive LIDs.

-- Hal

> This is a temporary work-around to the IB CM to support IB router
> development until the IB router specification is completed.  It is not
> anticipated that this work-around will cause any interoperability
> issues with existing stacks or future stacks that will properly support
> IB routers when defined.
> 
> Signed-off-by: Sean Hefty <sean.hefty at intel.com>
> ---
> This is a related patch to add support for IB routers in the IB CM.  This
> patch differs from previous patches in that the changes are limited to
> the ib_cm module only, and the ib_cm interface remains unchanged.  I believe
> that changes to the interface and ULPs will eventually be needed, but
> it would be better to defer those changes once the router spec is more
> defined.
> 
> This patch and those posted by Rolf are also available at:
> 
> git://git.openfabrics.org/~shefty/rdma-dev.git ib_router
> 
> 
>  drivers/infiniband/core/cm.c |   91 +++++++++++++++++++++++++++++++-----------
>  1 files changed, 67 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index 2e39236..c3212b9 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2004-2006 Intel Corporation.  All rights reserved.
> + * Copyright (c) 2004-2007 Intel Corporation.  All rights reserved.
>   * Copyright (c) 2004 Topspin Corporation.  All rights reserved.
>   * Copyright (c) 2004, 2005 Voltaire Corporation.  All rights reserved.
>   * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
> @@ -895,6 +895,9 @@ static void cm_format_req(struct cm_req_msg *req_msg,
>  			  struct cm_id_private *cm_id_priv,
>  			  struct ib_cm_req_param *param)
>  {
> +	struct ib_sa_path_rec *pri_path = param->primary_path;
> +	struct ib_sa_path_rec *alt_path = param->alternate_path;
> +
>  	cm_format_mad_hdr(&req_msg->hdr, CM_REQ_ATTR_ID,
>  			  cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_REQ));
>  
> @@ -918,35 +921,46 @@ static void cm_format_req(struct cm_req_msg *req_msg,
>  	cm_req_set_max_cm_retries(req_msg, param->max_cm_retries);
>  	cm_req_set_srq(req_msg, param->srq);
>  
> -	req_msg->primary_local_lid = param->primary_path->slid;
> -	req_msg->primary_remote_lid = param->primary_path->dlid;
> -	req_msg->primary_local_gid = param->primary_path->sgid;
> -	req_msg->primary_remote_gid = param->primary_path->dgid;
> -	cm_req_set_primary_flow_label(req_msg, param->primary_path->flow_label);
> -	cm_req_set_primary_packet_rate(req_msg, param->primary_path->rate);
> -	req_msg->primary_traffic_class = param->primary_path->traffic_class;
> -	req_msg->primary_hop_limit = param->primary_path->hop_limit;
> -	cm_req_set_primary_sl(req_msg, param->primary_path->sl);
> -	cm_req_set_primary_subnet_local(req_msg, 1); /* local only... */
> +	if (pri_path->hop_limit <= 1) {
> +		req_msg->primary_local_lid = pri_path->slid;
> +		req_msg->primary_remote_lid = pri_path->dlid;
> +	} else {
> +		/* Work-around until there's a way to obtain remote LID info */
> +		req_msg->primary_local_lid = IB_LID_PERMISSIVE;
> +		req_msg->primary_remote_lid = IB_LID_PERMISSIVE;
> +	}
> +	req_msg->primary_local_gid = pri_path->sgid;
> +	req_msg->primary_remote_gid = pri_path->dgid;
> +	cm_req_set_primary_flow_label(req_msg, pri_path->flow_label);
> +	cm_req_set_primary_packet_rate(req_msg, pri_path->rate);
> +	req_msg->primary_traffic_class = pri_path->traffic_class;
> +	req_msg->primary_hop_limit = pri_path->hop_limit;
> +	cm_req_set_primary_sl(req_msg, pri_path->sl);
> +	cm_req_set_primary_subnet_local(req_msg, (pri_path->hop_limit <= 1));
>  	cm_req_set_primary_local_ack_timeout(req_msg,
>  		cm_ack_timeout(cm_id_priv->av.port->cm_dev->ack_delay,
> -			       param->primary_path->packet_life_time));
> +			       pri_path->packet_life_time));
>  
> -	if (param->alternate_path) {
> -		req_msg->alt_local_lid = param->alternate_path->slid;
> -		req_msg->alt_remote_lid = param->alternate_path->dlid;
> -		req_msg->alt_local_gid = param->alternate_path->sgid;
> -		req_msg->alt_remote_gid = param->alternate_path->dgid;
> +	if (alt_path) {
> +		if (alt_path->hop_limit <= 1) {
> +			req_msg->alt_local_lid = alt_path->slid;
> +			req_msg->alt_remote_lid = alt_path->dlid;
> +		} else {
> +			req_msg->alt_local_lid = IB_LID_PERMISSIVE;
> +			req_msg->alt_remote_lid = IB_LID_PERMISSIVE;
> +		}
> +		req_msg->alt_local_gid = alt_path->sgid;
> +		req_msg->alt_remote_gid = alt_path->dgid;
>  		cm_req_set_alt_flow_label(req_msg,
> -					  param->alternate_path->flow_label);
> -		cm_req_set_alt_packet_rate(req_msg, param->alternate_path->rate);
> -		req_msg->alt_traffic_class = param->alternate_path->traffic_class;
> -		req_msg->alt_hop_limit = param->alternate_path->hop_limit;
> -		cm_req_set_alt_sl(req_msg, param->alternate_path->sl);
> -		cm_req_set_alt_subnet_local(req_msg, 1); /* local only... */
> +					  alt_path->flow_label);
> +		cm_req_set_alt_packet_rate(req_msg, alt_path->rate);
> +		req_msg->alt_traffic_class = alt_path->traffic_class;
> +		req_msg->alt_hop_limit = alt_path->hop_limit;
> +		cm_req_set_alt_sl(req_msg, alt_path->sl);
> +		cm_req_set_alt_subnet_local(req_msg, (alt_path->hop_limit <= 1));
>  		cm_req_set_alt_local_ack_timeout(req_msg,
>  			cm_ack_timeout(cm_id_priv->av.port->cm_dev->ack_delay,
> -				       param->alternate_path->packet_life_time));
> +				       alt_path->packet_life_time));
>  	}
>  
>  	if (param->private_data && param->private_data_len)
> @@ -1359,6 +1373,34 @@ out:
>  	return listen_cm_id_priv;
>  }
>  
> +/*
> + * Work-around for inter-subnet connections.  If the LIDs are permissive,
> + * we need to override the LID/SL data in the REQ with the LID information
> + * in the work completion.
> + */
> +static void cm_process_routed_req(struct cm_req_msg *req_msg, struct ib_wc *wc)
> +{
> +	if (!cm_req_get_primary_subnet_local(req_msg)) {
> +		if (req_msg->primary_local_lid == IB_LID_PERMISSIVE) {
> +			req_msg->primary_local_lid = cpu_to_be16(wc->slid);
> +			cm_req_set_primary_sl(req_msg, wc->sl);
> +		}
> +
> +		if (req_msg->primary_remote_lid == IB_LID_PERMISSIVE)
> +			req_msg->primary_remote_lid = cpu_to_be16(wc->dlid_path_bits);
> +	}
> +
> +	if (!cm_req_get_alt_subnet_local(req_msg)) {
> +		if (req_msg->alt_local_lid == IB_LID_PERMISSIVE) {
> +			req_msg->alt_local_lid = cpu_to_be16(wc->slid);
> +			cm_req_set_alt_sl(req_msg, wc->sl);
> +		}
> +
> +		if (req_msg->alt_remote_lid == IB_LID_PERMISSIVE)
> +			req_msg->alt_remote_lid = cpu_to_be16(wc->dlid_path_bits);
> +	}
> +}
> +
>  static int cm_req_handler(struct cm_work *work)
>  {
>  	struct ib_cm_id *cm_id;
> @@ -1399,6 +1441,7 @@ static int cm_req_handler(struct cm_work *work)
>  	cm_id_priv->id.service_id = req_msg->service_id;
>  	cm_id_priv->id.service_mask = __constant_cpu_to_be64(~0ULL);
>  
> +	cm_process_routed_req(req_msg, work->mad_recv_wc->wc);
>  	cm_format_paths_from_req(req_msg, &work->path[0], &work->path[1]);
>  	ret = cm_init_av_by_path(&work->path[0], &cm_id_priv->av);
>  	if (ret) {
> 


From sashak at voltaire.com  Tue Dec 11 05:46:32 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 11 Dec 2007 13:46:32 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs query
	only single ports
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
Message-ID: <20071211134632.GC23319@sashak.voltaire.com>


For CAs query performance counters only for single ports by lid and port
number, and not whole node with 'all ports' option.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 infiniband-diags/scripts/ibcheckerrors.in |   32 ++++++++++------------------
 1 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/infiniband-diags/scripts/ibcheckerrors.in b/infiniband-diags/scripts/ibcheckerrors.in
index cac2475..5cfabc6 100644
--- a/infiniband-diags/scripts/ibcheckerrors.in
+++ b/infiniband-diags/scripts/ibcheckerrors.in
@@ -79,15 +79,15 @@ echo "$text" | awk '
 BEGIN {
 	ne=0
 }
-function check_node(lid)
+function check_node(lid, port)
 {
 	nodechecked=1
 	if (system("'$IBPATH'/ibchecknode '"$ca_info"' '$gflags' '$verbose' " lid)) {
 		ne++
-		badnode=1
+		print "\n# " ntype ": nodeguid 0x" nodeguid " failed"
 		return
 	}
-	if (system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " 255"))
+	if (system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port))
 		nodeerr=1;
 }
 
@@ -105,30 +105,22 @@ function check_node(lid)
 
 			lid = substr($0, index($0, "port 0 lid ") + 11)
 			lid = substr(lid, 1, index(lid, " ") - 1)
-			check_node(lid)
+			check_node(lid, 255)
 		}
 /^\[/	{
 		nports++
 		port = $1
-		if (!nodechecked) {
-			lid = substr($0, index($0, " lid ") + 5)
-			lid = substr(lid, 1, index(lid, " ") - 1)
-			check_node(lid)
-		}
-		if (badnode) {
-			print "\n# " ntype ": nodeguid 0x" nodeguid " failed"
-			next
-		}
 		sub("\\(.*\\)", "", port)
 		gsub("[\\[\\]]", "", port)
-		if (nodeerr)
-			if (system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port)) {
-				if (!'$v' && oldlid != lid) {
-					print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure"
-					oldlid = lid
-				}
+		if (ntype != "Switch") {
+			lid = substr($0, index($0, " lid ") + 5)
+			lid = substr(lid, 1, index(lid, " ") - 1)
+			check_node(lid, port)
+			if (nodeerr)
 				pcnterr++;
-			}
+		} else if (nodeerr &&
+			   system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port))
+			pcnterr++;
 }
 
 /^ib/	{print $0; next}
-- 
1.5.3.rc2.29.gc4640f
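The port selection this patch introduces can be sketched in C as follows. This is only an illustration of the rule "switches are checked with the 'all ports' value 255, other node types with the single port being visited"; the names port_to_query and ALL_PORTS are hypothetical and are not part of infiniband-diags.

```c
#include <string.h>

#define ALL_PORTS 255	/* 'all ports' selector, as used in the patch */

/*
 * Pick the port number to pass to the counter check.
 * node_type is the node type string from the topology dump
 * (for example "Switch" or "Ca"); port is the port being visited.
 */
int port_to_query(const char *node_type, int port)
{
	if (strcmp(node_type, "Switch") == 0)
		return ALL_PORTS;	/* whole switch in one query */
	return port;			/* CA or router: this port only */
}
```

With this rule a disconnected CA port, or one attached to another subnet, is simply never visited through the current LID, so its counters cannot pollute the report.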


From sashak at voltaire.com  Tue Dec 11 06:53:28 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 11 Dec 2007 14:53:28 +0000
Subject: [ofa-general] Re: [PATCH] OpenSM: Fix error return corner case
In-Reply-To: <1197333577.29314.154.camel@cardanus.llnl.gov>
References: <1197333577.29314.154.camel@cardanus.llnl.gov>
Message-ID: <20071211145328.GE23319@sashak.voltaire.com>

On 16:39 Mon 10 Dec     , Al Chu wrote:
> Hey Sasha,
> 
> I noticed that in osm_ucast_updn_setup(), the code has set the
> context/callbacks in the routing engine struct, but does not revert the
> changes if the later call to updn_init() fails.  The callbacks would be
> left in place and erroneously executed at a later time.  Patch is
> attached.
> 
> Thanks,
> Al
> 
> -- 
> Albert Chu
> chu11 at llnl.gov
> 925-422-5311
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory

> From caeb46ed713df285785d84930b1db1850e11d19b Mon Sep 17 00:00:00 2001
> From: Albert L. Chu <chu11 at llnl.gov>
> Date: Mon, 10 Dec 2007 16:27:07 -0800
> Subject: [PATCH] fix error return corner case
> 
> 
> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>

Applied. Thanks.

(BTW please use the tab character for indentation. You can format your code
with the opensm/osm_indent script if you wish.)

Sasha
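
For readers who have not seen the patch, the error path Al describes can be sketched roughly like this. It is a minimal illustration, not the OpenSM code: struct routing_engine, engine_setup and the example_* helpers are hypothetical stand-ins, and the should_fail flag exists only to make the failure case easy to exercise.

```c
#include <stddef.h>

/* Hypothetical stand-in for a routing engine; not the OpenSM struct. */
struct routing_engine {
	void *context;
	int (*build_tables)(void *context);
	void (*destroy)(void *context);
};

static int example_build(void *context) { (void)context; return 0; }
static void example_destroy(void *context) { (void)context; }

/* Hypothetical init step; fails on request, for demonstration only. */
static int example_init(void *context, int should_fail)
{
	(void)context;
	return should_fail ? -1 : 0;
}

/*
 * Install the callbacks, then undo the installation if the later
 * init step fails, so that no stale callback can run afterwards.
 */
int engine_setup(struct routing_engine *re, void *context, int should_fail)
{
	re->context = context;
	re->build_tables = example_build;
	re->destroy = example_destroy;

	if (example_init(context, should_fail)) {
		re->context = NULL;
		re->build_tables = NULL;
		re->destroy = NULL;
		return -1;
	}
	return 0;
}
```

An alternative with the same effect is to call the init step first and install the callbacks only after it succeeds.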


From hrosenstock at xsigo.com  Tue Dec 11 06:57:48 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Tue, 11 Dec 2007 06:57:48 -0800
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <20071211134632.GC23319@sashak.voltaire.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
Message-ID: <1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-11 at 13:46 +0000, Sasha Khapyorsky wrote:
> For CAs query performance counters only for single ports by lid and port
> number, and not whole node with 'all ports' option.

Should the description also reference the bug # ?

Will a similar thing be done to the other diag scripts which have this
same issue (but haven't been reported yet) ?

Would it be better to fix this in the underlying tool used (perfquery)
and in that way address it for all the diag scripts ?

-- Hal

> Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> ---
>  infiniband-diags/scripts/ibcheckerrors.in |   32 ++++++++++------------------
>  1 files changed, 12 insertions(+), 20 deletions(-)
> 
> diff --git a/infiniband-diags/scripts/ibcheckerrors.in b/infiniband-diags/scripts/ibcheckerrors.in
> index cac2475..5cfabc6 100644
> --- a/infiniband-diags/scripts/ibcheckerrors.in
> +++ b/infiniband-diags/scripts/ibcheckerrors.in
> @@ -79,15 +79,15 @@ echo "$text" | awk '
>  BEGIN {
>  	ne=0
>  }
> -function check_node(lid)
> +function check_node(lid, port)
>  {
>  	nodechecked=1
>  	if (system("'$IBPATH'/ibchecknode '"$ca_info"' '$gflags' '$verbose' " lid)) {
>  		ne++
> -		badnode=1
> +		print "\n# " ntype ": nodeguid 0x" nodeguid " failed"
>  		return
>  	}
> -	if (system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " 255"))
> +	if (system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port))
>  		nodeerr=1;
>  }
>  
> @@ -105,30 +105,22 @@ function check_node(lid)
>  
>  			lid = substr($0, index($0, "port 0 lid ") + 11)
>  			lid = substr(lid, 1, index(lid, " ") - 1)
> -			check_node(lid)
> +			check_node(lid, 255)
>  		}
>  /^\[/	{
>  		nports++
>  		port = $1
> -		if (!nodechecked) {
> -			lid = substr($0, index($0, " lid ") + 5)
> -			lid = substr(lid, 1, index(lid, " ") - 1)
> -			check_node(lid)
> -		}
> -		if (badnode) {
> -			print "\n# " ntype ": nodeguid 0x" nodeguid " failed"
> -			next
> -		}
>  		sub("\\(.*\\)", "", port)
>  		gsub("[\\[\\]]", "", port)
> -		if (nodeerr)
> -			if (system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port)) {
> -				if (!'$v' && oldlid != lid) {
> -					print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure"
> -					oldlid = lid
> -				}
> +		if (ntype != "Switch") {
> +			lid = substr($0, index($0, " lid ") + 5)
> +			lid = substr(lid, 1, index(lid, " ") - 1)
> +			check_node(lid, port)
> +			if (nodeerr)
>  				pcnterr++;
> -			}
> +		} else if (nodeerr &&
> +			   system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port))
> +			pcnterr++;
>  }
>  
>  /^ib/	{print $0; next}


From hartlch14 at gmail.com  Tue Dec 11 07:03:14 2007
From: hartlch14 at gmail.com (Chuck Hartley)
Date: Tue, 11 Dec 2007 10:03:14 -0500
Subject: [ofa-general] Cisco Topspin drivers available
Message-ID: <c177de4a0712110703p594ea7dalbf5f124689d41107@mail.gmail.com>

Does anyone here know where I can find up-to-date drivers for the Cisco
Topspin HCAs?  I tried the Cisco and IBM websites and the drivers there are
at least a year old and do not support any recent kernels.  We have an IBM
QS21 blade running Fedora 7 (or possibly RHEL 5.1 soon) and need a driver
for the HCA.  The Cisco release notes say to build the driver from the
source tarball if you have an unsupported OS, but there do not seem to be any
tarballs on the images available for download.  Even though the HCA
apparently uses a Mellanox chip, it is not supported by the mthca driver.
Here is what lspci says:

InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor
compatibility mode) (rev a0)

Hopefully someone here is using one of these with a modern kernel /
distribution...

Thanks,
Chuck

From sashak at voltaire.com  Tue Dec 11 07:15:23 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 11 Dec 2007 15:15:23 +0000
Subject: [ofa-general] Re: opensm/libvendor/osm_vendor_ibumad_sa.c: In
	__osmv_sa_mad_rcv_cb, handle attribute offset of 0
In-Reply-To: <1197289420.8114.384.camel@hrosenstock-ws.xsigo.com>
References: <1197289420.8114.384.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071211151523.GG23319@sashak.voltaire.com>

On 04:23 Mon 10 Dec     , Hal Rosenstock wrote:
> opensm/libvendor/osm_vendor_ibumad_sa.c: In __osmv_sa_mad_rcv_cb, handle
> attribute offset of 0 which is valid at IBA 1.2.1 when 0 attributes are
> returned
> 
> Signed-off-by: Hal Rosenstock <hal at xsigo.com>

Applied. Thanks.

Sasha
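
A rough sketch of the case Hal's patch handles, assuming the usual SA convention that AttributeOffset is expressed in 8-byte units. The function name sa_num_records and its parameters are hypothetical and are not taken from osm_vendor_ibumad_sa.c; the offset is assumed to be already converted to host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the records in an SA response payload.
 * attr_offset_words is the AttributeOffset field in 8-byte units.
 * An offset of 0 is treated as "no records returned" rather than
 * as an error, which also avoids a division by zero.
 */
size_t sa_num_records(size_t payload_len, uint16_t attr_offset_words)
{
	size_t attr_size = (size_t)attr_offset_words * 8;

	if (attr_size == 0)
		return 0;
	return payload_len / attr_size;
}
```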


From sashak at voltaire.com  Tue Dec 11 07:27:27 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 11 Dec 2007 15:27:27 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071211152727.GI23319@sashak.voltaire.com>

On 06:57 Tue 11 Dec     , Hal Rosenstock wrote:
> On Tue, 2007-12-11 at 13:46 +0000, Sasha Khapyorsky wrote:
> > For CAs query performance counters only for single ports by lid and port
> > number, and not whole node with 'all ports' option.
> 
> Should the description also reference the bug # ?

I will add.

> Will a similar thing be done to the other diag scripts which have this
> same issue (but haven't been reported yet) ?

It is reasonable. I will try to check other scripts too.

> Would it be better to fix this in the underlying tool used (perfquery)
> and in that way address it for all the diag scripts ?

I think perfquery could/should be improved as well, but it is not the
same issue. I think that in general, when the whole fabric is checked, it
is more accurate to query endports by port and not by node: a multiport
CA can have disconnected ports and/or ports which are connected to another
subnet, and in that case their counters are irrelevant to the check. Right?

Sasha


From hrosenstock at xsigo.com  Tue Dec 11 07:25:53 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Tue, 11 Dec 2007 07:25:53 -0800
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <20071211152727.GI23319@sashak.voltaire.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
Message-ID: <1197386753.8114.694.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-11 at 15:27 +0000, Sasha Khapyorsky wrote:
> On 06:57 Tue 11 Dec     , Hal Rosenstock wrote:
> > On Tue, 2007-12-11 at 13:46 +0000, Sasha Khapyorsky wrote:
> > > For CAs query performance counters only for single ports by lid and port
> > > number, and not whole node with 'all ports' option.
> > 
> > Should the description also reference the bug # ?
> 
> I will add.
> 
> > Will a similar thing be done to the other diag scripts which have this
> > same issue (but haven't been reported yet) ?
> 
> It is reasonable. I will try to check other scripts too.
> 
> > Would it be better to fix this in the underlying tool used (perfquery)
> > and in that way address it for all the diag scripts ?
> 
> I think perfquery could/should be improved as well, but it is not the
> same issue. 

Why not ?

If perfquery paved over the lack of support for all ports, then all the
scripts would be fine as is, right ?

> I think that in general it is more accurate when whole
> fabric is checked to query endport's by port and not by node - multiport
> CA can have disconnected ports and/or ports which connected to another
> subnet - in this way its counters are irrelevant to the check. Right?

Yes, but doing it on a node basis cuts down on the number of queries.
One can always go back and dive down to the port level after seeing
which nodes are of interest.

-- Hal

> Sasha


From kliteyn at dev.mellanox.co.il  Tue Dec 11 07:36:43 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 11 Dec 2007 17:36:43 +0200
Subject: [ofa-general] [PATCH] opensm: fixing coredump in QoS policy pkey
	validation
Message-ID: <475EAE8B.70206@dev.mellanox.co.il>

Fixing segmentation fault in validating pkeys in QoS policy

Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_qos_policy.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_qos_policy.c b/opensm/opensm/osm_qos_policy.c
index f4fb0d3..544fbb4 100644
--- a/opensm/opensm/osm_qos_policy.c
+++ b/opensm/opensm/osm_qos_policy.c
@@ -900,8 +900,8 @@ int osm_qos_policy_validate(osm_qos_policy_t * p_qos_policy,
 		 */

 		for (j = 0; j < p_qos_match_rule->pkey_range_len; j++) {
-			for ( pkey_64 = p_qos_match_rule->pkey_range_arr[i][0];
-			      pkey_64 <= p_qos_match_rule->pkey_range_arr[i][1];
+			for ( pkey_64 = p_qos_match_rule->pkey_range_arr[j][0];
+			      pkey_64 <= p_qos_match_rule->pkey_range_arr[j][1];
 			      pkey_64++) {
                                 pkey = cl_hton16((uint16_t)(pkey_64 & 0x7fff));
 				p_prtn = (osm_prtn_t *)cl_qmap_get(
-- 
1.5.1.4
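
The class of bug fixed here is an inner loop that indexes the range array with a variable belonging to an enclosing loop, so it can read past the end of the array. A minimal sketch of the corrected pattern, using a simplified fixed-width array instead of the uint64_t ** used in OpenSM (count_pkeys_in_ranges is a hypothetical name):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Visit every pkey value covered by an array of [low, high] ranges
 * and return how many values were visited. The range array is
 * indexed by j, the variable of the loop that is bounded by
 * range_len, and never by an index from an enclosing loop.
 */
size_t count_pkeys_in_ranges(uint64_t ranges[][2], size_t range_len)
{
	size_t count = 0;
	size_t j;
	uint64_t pkey_64;

	for (j = 0; j < range_len; j++)
		for (pkey_64 = ranges[j][0]; pkey_64 <= ranges[j][1]; pkey_64++)
			count++;
	return count;
}
```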


From kliteyn at dev.mellanox.co.il  Tue Dec 11 07:39:24 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 11 Dec 2007 17:39:24 +0200
Subject: [ofa-general] [PATCH] opensm: QoS policy - fixing pkey range
	implementation
Message-ID: <475EAF2C.2020507@dev.mellanox.co.il>

Fixing pkey range implementation in QoS policy.
Ignoring the "special" most significant bit of
the pkey was causing problems.

Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_qos_parser.y |   45 +++++++++++++++++++++++++++++++++++----
 opensm/opensm/osm_qos_policy.c |    2 +-
 2 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/opensm/opensm/osm_qos_parser.y b/opensm/opensm/osm_qos_parser.y
index f87102c..edfa25c 100644
--- a/opensm/opensm/osm_qos_parser.y
+++ b/opensm/opensm/osm_qos_parser.y
@@ -94,6 +94,11 @@ static int __parser_match_rule_end();
 static void __parser_ulp_match_rule_start();
 static int __parser_ulp_match_rule_end();

+static void __pkey_rangelist2rangearr(
+    cl_list_t    * p_list,
+    uint64_t  ** * p_arr,
+    unsigned     * p_arr_len);
+
 static void __rangelist2rangearr(
     cl_list_t    * p_list,
     uint64_t  ** * p_arr,
@@ -677,7 +682,7 @@ qos_ulp:            TK_ULP_DEFAULT single_number {
                         }

                         /* get all the pkey ranges */
-                        __rangelist2rangearr( &tmp_parser_struct.num_pair_list,
+                        __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list,
                                               &range_arr,
                                               &range_len );

@@ -927,7 +932,7 @@ qos_ulp:            TK_ULP_DEFAULT single_number {
                         }

                         /* get all the pkey ranges */
-                        __rangelist2rangearr( &tmp_parser_struct.num_pair_list,
+                        __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list,
                                               &range_arr,
                                               &range_len );

@@ -1182,7 +1187,7 @@ port_group_pkey:        port_group_pkey_start list_of_ranges {
                                 uint64_t ** range_arr;
                                 unsigned range_len;

-                                __rangelist2rangearr( &tmp_parser_struct.num_pair_list,
+                                __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list,
                                                       &range_arr,
                                                       &range_len );

@@ -1867,7 +1872,7 @@ qos_level_pkey:         qos_level_pkey_start list_of_ranges {
                                 uint64_t ** range_arr;
                                 unsigned range_len;

-                                __rangelist2rangearr( &tmp_parser_struct.num_pair_list,
+                                __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list,
                                                       &range_arr,
                                                       &range_len );

@@ -2094,7 +2099,7 @@ qos_match_rule_pkey:    qos_match_rule_pkey_start list_of_ranges {
                                 uint64_t ** range_arr;
                                 unsigned range_len;

-                                __rangelist2rangearr( &tmp_parser_struct.num_pair_list,
+                                __pkey_rangelist2rangearr( &tmp_parser_struct.num_pair_list,
                                                       &range_arr,
                                                       &range_len );

@@ -2778,6 +2783,36 @@ static void __sort_reduce_rangearr(
 /***************************************************
  ***************************************************/

+static void __pkey_rangelist2rangearr(
+    cl_list_t    * p_list,
+    uint64_t  ** * p_arr,
+    unsigned     * p_arr_len)
+{
+    uint64_t   tmp_pkey;
+    uint64_t * p_pkeys;
+    cl_list_iterator_t list_iterator;
+
+    list_iterator= cl_list_head(p_list);
+    while( list_iterator != cl_list_end(p_list) )
+    {
+       p_pkeys = (uint64_t *)cl_list_obj(list_iterator);
+       p_pkeys[0] &= 0x7fff;
+       p_pkeys[1] &= 0x7fff;
+       if (p_pkeys[0] > p_pkeys[1])
+       {
+           tmp_pkey = p_pkeys[1];
+           p_pkeys[1] = p_pkeys[0];
+           p_pkeys[0] = tmp_pkey;
+       }
+       list_iterator = cl_list_next(list_iterator);
+    }
+
+    __rangelist2rangearr(p_list, p_arr, p_arr_len);
+}
+
+/***************************************************
+ ***************************************************/
+
 static void __rangelist2rangearr(
     cl_list_t    * p_list,
     uint64_t  ** * p_arr,
diff --git a/opensm/opensm/osm_qos_policy.c b/opensm/opensm/osm_qos_policy.c
index 544fbb4..6140de0 100644
--- a/opensm/opensm/osm_qos_policy.c
+++ b/opensm/opensm/osm_qos_policy.c
@@ -691,7 +691,7 @@ static osm_qos_match_rule_t *__qos_policy_get_match_rule_by_params(
 			if (!__is_num_in_range_arr
 			    (p_qos_match_rule->pkey_range_arr,
 			     p_qos_match_rule->pkey_range_len,
-			     pkey)) {
+			     pkey & 0x7FFF)) {
 				list_iterator = cl_list_next(list_iterator);
 				continue;
 			}
-- 
1.5.1.4
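
The idea behind the patch can be sketched in isolation: strip the membership bit (the most significant bit of the 16-bit pkey) from both ends of a range, order the ends, and strip the same bit from the pkey being matched. The struct and function names below are hypothetical and only approximate what __pkey_rangelist2rangearr and the matching code do.

```c
#include <stdint.h>

#define PKEY_BASE_MASK 0x7fffu	/* low 15 bits; top bit is membership */

struct pkey_range {
	uint64_t lo;
	uint64_t hi;
};

/* Mask both ends of the range and make sure lo <= hi. */
struct pkey_range pkey_range_normalize(uint64_t a, uint64_t b)
{
	struct pkey_range r;

	a &= PKEY_BASE_MASK;
	b &= PKEY_BASE_MASK;
	r.lo = a <= b ? a : b;
	r.hi = a <= b ? b : a;
	return r;
}

/* Compare only the 15-bit base of the pkey against the range. */
int pkey_in_range(struct pkey_range r, uint16_t pkey)
{
	uint64_t base = pkey & PKEY_BASE_MASK;

	return base >= r.lo && base <= r.hi;
}
```

With both sides masked, a policy written as 0x8001-0x8005 matches a pkey of 0x0003 as well as 0x8003.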


From sashak at voltaire.com  Tue Dec 11 08:46:57 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 11 Dec 2007 16:46:57 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <1197386753.8114.694.camel@hrosenstock-ws.xsigo.com>
References: <1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
	<1197386753.8114.694.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071211164657.GJ23319@sashak.voltaire.com>

On 07:25 Tue 11 Dec     , Hal Rosenstock wrote:
> On Tue, 2007-12-11 at 15:27 +0000, Sasha Khapyorsky wrote:
> > On 06:57 Tue 11 Dec     , Hal Rosenstock wrote:
> > > On Tue, 2007-12-11 at 13:46 +0000, Sasha Khapyorsky wrote:
> > > > For CAs query performance counters only for single ports by lid and port
> > > > number, and not whole node with 'all ports' option.
> > > 
> > > Should the description also reference the bug # ?
> > 
> > I will add.
> > 
> > > Will a similar thing be done to the other diag scripts which have this
> > > same issue (but haven't been reported yet) ?
> > 
> > It is reasonable. I will try to check other scripts too.
> > 
> > > Would it be better to fix this in the underlying tool used (perfquery)
> > > and in that way address it for all the diag scripts ?
> > 
> > I think perfquery could/should be improved as well, but it is not the
> > same issue. 
> 
> Why not ?
> 
> If perfquery paved over the lack of support for all ports, then all the
> scripts would be fine as is, right ?

Yes, but I think that it is more accurate to query CA ports and not just
nodes (even if the 'all ports' option is supported).

> 
> > I think that in general it is more accurate when whole
> > fabric is checked to query endport's by port and not by node - multiport
> > CA can have disconnected ports and/or ports which connected to another
> > subnet - in this way its counters are irrelevant to the check. Right?
> 
> Yes, but doing it on a node basis cuts down on the number of queries.

True, but doing the right thing is more important here than the number of
queries IMO (BTW in practice the difference in the number of queries is not
so significant: it is a matter of percent, not of multiples).

> One can always go back and dive down to the port level after seeing
> which nodes are of interest.

The problem is that one can get an invalid error report with such a script,
for example when a CA has a "bad" port which is connected to another subnet.

Sasha


From sashak at voltaire.com  Tue Dec 11 08:56:17 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 11 Dec 2007 16:56:17 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <20071211164657.GJ23319@sashak.voltaire.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
	<1197386753.8114.694.camel@hrosenstock-ws.xsigo.com>
	<20071211164657.GJ23319@sashak.voltaire.com>
Message-ID: <20071211165617.GK23319@sashak.voltaire.com>

On 16:46 Tue 11 Dec     , Sasha Khapyorsky wrote:
> On 07:25 Tue 11 Dec     , Hal Rosenstock wrote:
> > On Tue, 2007-12-11 at 15:27 +0000, Sasha Khapyorsky wrote:
> > > On 06:57 Tue 11 Dec     , Hal Rosenstock wrote:
> > > > On Tue, 2007-12-11 at 13:46 +0000, Sasha Khapyorsky wrote:
> > > > > For CAs query performance counters only for single ports by lid and port
> > > > > number, and not whole node with 'all ports' option.
> > > > 
> > > > Should the description also reference the bug # ?
> > > 
> > > I will add.
> > > 
> > > > Will a similar thing be done to the other diag scripts which have this
> > > > same issue (but haven't been reported yet) ?
> > > 
> > > It is reasonable. I will try to check other scripts too.
> > > 
> > > > Would it be better to fix this in the underlying tool used (perfquery)
> > > > and in that way address it for all the diag scripts ?
> > > 
> > > I think perfquery could/should be improved as well, but it is not the
> > > same issue. 
> > 
> > Why not ?
> > 
> > If perfquery paved over the lack of support for all ports, then all the
> > scripts would be fine as is, right ?

Another aspect of this.

I'm not sure that 'all ports' simulation in perfquery is a great thing.
perfquery is a low-level tool and it should be able to indicate clearly
that the 'all ports' option is not supported by a port instead of hiding
this behind simulation. Maybe 'all ports' simulation should be optional...
I'm not sure yet.

Also I think that when perfquery targets a CA port just by LID and the
port number is not specified, 'all ports' should not be the default, but
instead the port number of this LID. Such behavior seems more "native"
to me.

Sasha
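
To make the "optional simulation" idea concrete, here is a rough sketch. It is not perfquery code: query_fn, query_all_ports and demo_query are hypothetical, and the only value taken from this thread is 255 as the 'all ports' selector. The caller decides whether simulation is allowed and is told when it was used, so the lack of native support is never hidden.

```c
#include <stddef.h>
#include <stdint.h>

#define ALL_PORTS 255

/*
 * Hypothetical query callback: returns 0 and fills *errors on
 * success, nonzero if the target rejects the port selector.
 */
typedef int (*query_fn)(int lid, int port, uint64_t *errors, void *ctx);

/*
 * Try a native 'all ports' query first. If it is rejected and
 * simulation is allowed, sum the counters of ports 1..num_ports.
 * Returns 0 on success, -1 otherwise; *simulated is set to 1 only
 * when the per-port fallback produced the result.
 */
int query_all_ports(query_fn query, void *ctx, int lid, int num_ports,
		    int allow_simulation, uint64_t *errors, int *simulated)
{
	uint64_t total = 0;
	int port;

	*simulated = 0;
	if (query(lid, ALL_PORTS, errors, ctx) == 0)
		return 0;
	if (!allow_simulation)
		return -1;	/* report the lack of support as is */

	for (port = 1; port <= num_ports; port++) {
		uint64_t e = 0;

		if (query(lid, port, &e, ctx))
			return -1;
		total += e;
	}
	*errors = total;
	*simulated = 1;
	return 0;
}

/* Demo target without 'all ports' support; port N reports N errors. */
int demo_query(int lid, int port, uint64_t *errors, void *ctx)
{
	(void)lid;
	(void)ctx;
	if (port == ALL_PORTS)
		return -1;
	*errors = (uint64_t)port;
	return 0;
}
```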


From rdreier at cisco.com  Tue Dec 11 08:45:51 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 11 Dec 2007 08:45:51 -0800
Subject: [ofa-general] Cisco Topspin drivers available
In-Reply-To: <c177de4a0712110703p594ea7dalbf5f124689d41107@mail.gmail.com>
	(Chuck Hartley's message of "Tue, 11 Dec 2007 10:03:14 -0500")
References: <c177de4a0712110703p594ea7dalbf5f124689d41107@mail.gmail.com>
Message-ID: <adazlwh89j4.fsf@cisco.com>

The questions about getting source obviously need to go to Cisco
support (and I don't work in support so I don't know).  But...

 > Even though the HCA apparently uses a Mellanox chip, it is not
 > supported by the mthca driver.

 > InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev a0)

What happens when you load the mthca driver?  This device should work
fine with mthca.

 - R.


From mshefty at ichips.intel.com  Tue Dec 11 09:12:31 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Tue, 11 Dec 2007 09:12:31 -0800
Subject: [ofa-general] [PATCH] IB/CM: add support for routed paths
In-Reply-To: <1197377985.8114.637.camel@hrosenstock-ws.xsigo.com>
References: <20071210203544.GI30090@obsidianresearch.com>	<000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com>
	<1197377985.8114.637.camel@hrosenstock-ws.xsigo.com>
Message-ID: <475EC4FF.9080702@ichips.intel.com>

> Could the subnet local component(s) in the REQ be used on the passive
> side instead of permissive LIDs ? That might be more "standard" than
> using permissive LIDs.

It could be used.  I wanted this to work on the passive side if the LIDs 
were set correctly in the REQ.

- Sean
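
The passive-side behavior discussed here can be sketched as follows: a LID carried in the REQ is kept when it is set to a real value, and is replaced with data from the received work completion only when the path is not subnet local and the LID is permissive. The structs and the function name are simplified, hypothetical stand-ins in host byte order; the kernel code works on big-endian fields in struct cm_req_msg and struct ib_wc.

```c
#include <stdint.h>

#define LID_PERMISSIVE 0xFFFFu	/* host-order value of the permissive LID */

struct req_path {	/* hypothetical subset of one path in a REQ */
	uint16_t local_lid;
	uint16_t remote_lid;
	uint8_t sl;
	int subnet_local;
};

struct recv_info {	/* hypothetical subset of a work completion */
	uint16_t slid;
	uint8_t sl;
	uint8_t dlid_path_bits;
};

/* Override only permissive LIDs, and only on routed paths. */
void fixup_routed_path(struct req_path *p, const struct recv_info *wc)
{
	if (p->subnet_local)
		return;

	if (p->local_lid == LID_PERMISSIVE) {
		p->local_lid = wc->slid;
		p->sl = wc->sl;
	}
	if (p->remote_lid == LID_PERMISSIVE)
		p->remote_lid = wc->dlid_path_bits;
}
```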


From akepner at sgi.com  Tue Dec 11 09:42:23 2007
From: akepner at sgi.com (akepner at sgi.com)
Date: Tue, 11 Dec 2007 09:42:23 -0800
Subject: [ofa-general] rc1 this week?
Message-ID: <20071211174223.GB19090@sgi.com>


Hi Tziporet; 

Is OFED-1.3 rc1 still expected to be out this week?

-- 
Arthur


From arlin.r.davis at intel.com  Tue Dec 11 09:54:36 2007
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Tue, 11 Dec 2007 09:54:36 -0800
Subject: [ofa-general] [PATCH 2/2] uDAT/uDAPL v2 - (master branch) changes
	to sync common code base with WinOF 1.01
In-Reply-To: <Pine.LNX.4.64.0712101253210.12638@jlentini-linux.nane.netapp.com>
References: <000101c8392e$86f2ff50$1dfd070a@amr.corp.intel.com>
	<Pine.LNX.4.64.0712101253210.12638@jlentini-linux.nane.netapp.com>
Message-ID: <000001c83c1e$e4e1fc40$4297070a@amr.corp.intel.com>

 
>
>Looks good. A few more minor questions:
>
>
>What are the changes above? I don't see any difference in the text 
>or the whitespace.
>

Extra linefeeds removed.


From arlin.r.davis at intel.com  Tue Dec 11 10:25:27 2007
From: arlin.r.davis at intel.com (Arlin Davis)
Date: Tue, 11 Dec 2007 10:25:27 -0800
Subject: [ofa-general] [PATCH 1/2 rev2] uDAT/uDAPL v2 - (master branch)
	changes to sync common code base with WinOF 1.01
In-Reply-To: <Pine.LNX.4.64.0712101249000.12638@jlentini-linux.nane.netapp.com>
References: <000001c8392d$ae801400$1dfd070a@amr.corp.intel.com>
	<Pine.LNX.4.64.0712101249000.12638@jlentini-linux.nane.netapp.com>
Message-ID: <000101c83c23$34399510$4297070a@amr.corp.intel.com>

 
>> -    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
>> -				    &dapl_ia_handle);
>> +    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
>
>For consistency with your change above, should the cast 
>be changed to 
>
>+    dat_status = dats_get_ia_handle((DAT_IA_HANDLE)ia_handle, 
>&dapl_ia_handle);
>

Good catch. I missed some dat_api.c changes from Stan. Here is rev2. 

  - add DAT_API to specify calling conventions (Windows: __stdcall, Linux: empty)
  - clean up platform-specific definitions for Windows
  - C++ support
  - add handle conversion macros DAT_IA_HANDLE_TO_UL and UL_TO_DAT_IA_HANDLE
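
For readers without the headers at hand, here is a rough sketch of how
the DAT_API and handle macros could fit together. The macro bodies, the
DAT_IA_HANDLE typedef, and every demo_* name below are assumptions made
for illustration only; the real definitions are in the DAT headers and
may well differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Calling convention, as summarized above: __stdcall on Windows,
 * empty elsewhere. The platform test is an assumption. */
#if defined(_MSC_VER) || defined(_WIN32) || defined(_WIN64)
#define DAT_API __stdcall
#else
#define DAT_API
#endif

/* Stand-in for the opaque handle type (assumption). */
typedef void *DAT_IA_HANDLE;

/* Hypothetical macro bodies: convert between a small integer index
 * and the pointer-sized handle without a direct int/pointer cast. */
#define DAT_IA_HANDLE_TO_UL(h) ((unsigned long)(uintptr_t)(h))
#define UL_TO_DAT_IA_HANDLE(i) ((DAT_IA_HANDLE)(uintptr_t)(i))

/* Fixed-size toy handle vector; the real one grows on demand. */
#define DEMO_HANDLE_MAX 8UL
static void *demo_handle_array[DEMO_HANDLE_MAX];

/* Store a provider pointer in the first free slot and return the slot
 * index disguised as a handle. Slot 0 is skipped so that a valid
 * handle is never NULL. Returns the converted value -1 when full. */
static DAT_IA_HANDLE demo_set_ia_handle(void *provider_handle)
{
    unsigned long i;

    for (i = 1; i < DEMO_HANDLE_MAX; i++) {
        if (demo_handle_array[i] == NULL) {
            demo_handle_array[i] = provider_handle;
            return UL_TO_DAT_IA_HANDLE(i);
        }
    }
    return UL_TO_DAT_IA_HANDLE((unsigned long)-1);
}

/* Map a handle back to the stored pointer. Returns 0 on success and
 * -1 if the index is out of range or the slot is empty. */
int DAT_API demo_get_ia_handle(DAT_IA_HANDLE handle, void **out)
{
    unsigned long idx = DAT_IA_HANDLE_TO_UL(handle);

    if (idx >= DEMO_HANDLE_MAX || demo_handle_array[idx] == NULL)
        return -1;
    *out = demo_handle_array[idx];
    return 0;
}
```

Going through uintptr_t keeps the conversion free of size-mismatch
warnings on platforms where unsigned long is narrower than a pointer,
which is presumably part of the motivation for hiding the casts behind
macros.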

Signed-off-by: Arlin Davis <ardavis at ichips.intel.com>
Signed-off-by: Stan Smith <stan.smith at intel.com>

diff --git a/dat/common/dat_api.c b/dat/common/dat_api.c
index 1415f73..c882319 100755
--- a/dat/common/dat_api.c
+++ b/dat/common/dat_api.c
@@ -57,7 +57,7 @@
 typedef struct 
 {
     DAT_OS_LOCK     handle_lock;
-    int	            handle_max;
+    unsigned long   handle_max;
     void            **handle_array;
 } DAT_HANDLE_VEC;
 
@@ -78,7 +78,7 @@ DAT_RETURN
 dats_handle_vector_init ( void )
 {
     DAT_RETURN		dat_status;
-    int			i;
+    unsigned long	i;
 
     dat_status = DAT_SUCCESS;
 
@@ -113,12 +113,12 @@ dats_handle_vector_init ( void )
  * Install an ia_handle into a handle vector and return a small
  * integer.
  ***********************************************************************/
-unsigned long
+DAT_IA_HANDLE
 dats_set_ia_handle (
 	IN  DAT_IA_HANDLE		ia_handle )
 {   
-    unsigned long   i;
-    void 	   **h;
+    unsigned long	i;
+    void		**h;
 
     dat_os_lock (&g_hv.handle_lock);
 
@@ -135,7 +135,7 @@ dats_set_ia_handle (
             
 	    dat_os_dbg_print (DAT_OS_DBG_TYPE_PROVIDER_API,
 			      "dat_set_handle %p to %d\n", ia_handle, i);
-	    return (unsigned long) i;
+	    return UL_TO_DAT_IA_HANDLE(i);
 	}
     }
 
@@ -149,7 +149,7 @@ dats_set_ia_handle (
     if (h == NULL)
     {
 	dat_os_unlock (&g_hv.handle_lock);
-	return -1;
+	return UL_TO_DAT_IA_HANDLE(-1);
     }
     /* copy old data to new area & free old memory*/
     memcpy((void *)h, (void *)g_hv.handle_array, sizeof(void *) * g_hv.handle_max);
@@ -169,8 +169,7 @@ dats_set_ia_handle (
     dat_os_dbg_print (DAT_OS_DBG_TYPE_PROVIDER_API,
 		      "dat_set_handle x %p to %d\n", ia_handle, i);
 
-    return (unsigned long) i;
-
+    return UL_TO_DAT_IA_HANDLE(i);
 }
 
 /***********************************************************************
@@ -180,17 +179,17 @@ dats_set_ia_handle (
  ***********************************************************************/
 DAT_RETURN
 dats_get_ia_handle(
-	IN  unsigned long		handle,
+	IN  DAT_IA_HANDLE 		handle,
 	OUT DAT_IA_HANDLE		*ia_handle_p )
 {
     DAT_RETURN		dat_status;
 
-    if (handle > g_hv.handle_max)
+    if (DAT_IA_HANDLE_TO_UL(handle) > g_hv.handle_max)
     {
 	dat_status = DAT_ERROR(DAT_INVALID_HANDLE, DAT_INVALID_HANDLE_IA);
 	goto bail;
     }
-    *ia_handle_p = g_hv.handle_array[handle];
+    *ia_handle_p = g_hv.handle_array[DAT_IA_HANDLE_TO_UL(handle)];
 
     if (*ia_handle_p == NULL)
     {
@@ -226,7 +225,7 @@ DAT_RETURN
 dats_is_ia_handle (
 	IN  DAT_HANDLE			dat_handle)
 {
-    unsigned long handle = (unsigned long) dat_handle;
+    unsigned long handle = DAT_IA_HANDLE_TO_UL((DAT_IA_HANDLE)dat_handle);
 
     if (g_hv.handle_max < handle )
     {
@@ -250,16 +249,16 @@ dats_is_ia_handle (
  ***********************************************************************/
 DAT_RETURN
 dats_free_ia_handle (
-	IN  unsigned long		handle)
+	IN  DAT_IA_HANDLE		handle)
 {
     DAT_RETURN		dat_status;
 
-    if (handle > g_hv.handle_max)
+    if (DAT_IA_HANDLE_TO_UL(handle) > g_hv.handle_max)
     {
 	dat_status = DAT_ERROR(DAT_INVALID_HANDLE, DAT_INVALID_HANDLE_IA);
 	goto bail;
     }
-    g_hv.handle_array[handle] = NULL;
+    g_hv.handle_array[DAT_IA_HANDLE_TO_UL(handle)] = NULL;
     dat_status = DAT_SUCCESS;
 
     dat_os_dbg_print (DAT_OS_DBG_TYPE_PROVIDER_API,
@@ -272,7 +271,7 @@ dats_free_ia_handle (
 /**********************************************************************
  * API definitions for common API entry points
  **********************************************************************/
-DAT_RETURN dat_ia_query (
+DAT_RETURN DAT_API dat_ia_query (
 	IN      DAT_IA_HANDLE		ia_handle,
 	OUT     DAT_EVD_HANDLE		*async_evd_handle,
 	IN      DAT_IA_ATTR_MASK	ia_attr_mask,
@@ -283,8 +282,7 @@ DAT_RETURN dat_ia_query (
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_IA_QUERY (dapl_ia_handle,
@@ -298,7 +296,7 @@ DAT_RETURN dat_ia_query (
     return dat_status;
 }
 
-DAT_RETURN dat_set_consumer_context (
+DAT_RETURN DAT_API dat_set_consumer_context (
 	IN      DAT_HANDLE		dat_handle,
 	IN      DAT_CONTEXT		context)
 {
@@ -307,8 +305,8 @@ DAT_RETURN dat_set_consumer_context (
         DAT_IA_HANDLE   dapl_ia_handle;
         DAT_RETURN      dat_status;
 
-        dat_status = dats_get_ia_handle((unsigned long)dat_handle,
-                                        &dapl_ia_handle);
+        dat_status = dats_get_ia_handle((DAT_IA_HANDLE)dat_handle,
+					&dapl_ia_handle);
         
         /* failure to map the handle is unlikely but possible */
         /* in a mult-threaded environment                     */
@@ -325,7 +323,7 @@ DAT_RETURN dat_set_consumer_context (
 }
 
 
-DAT_RETURN dat_get_consumer_context (
+DAT_RETURN DAT_API dat_get_consumer_context (
 	IN      DAT_HANDLE		dat_handle,
 	OUT     DAT_CONTEXT		*context)
 {
@@ -334,7 +332,7 @@ DAT_RETURN dat_get_consumer_context (
         DAT_IA_HANDLE   dapl_ia_handle;
         DAT_RETURN      dat_status;
 
-        dat_status = dats_get_ia_handle((unsigned long)dat_handle,
+        dat_status = dats_get_ia_handle((DAT_IA_HANDLE)dat_handle,
                                         &dapl_ia_handle);
 
         /* failure to map the handle is unlikely but possible */
@@ -352,7 +350,7 @@ DAT_RETURN dat_get_consumer_context (
 }
 
 
-DAT_RETURN dat_get_handle_type (
+DAT_RETURN DAT_API dat_get_handle_type (
 	IN      DAT_HANDLE		dat_handle,
 	OUT     DAT_HANDLE_TYPE		*type)
 {
@@ -361,7 +359,7 @@ DAT_RETURN dat_get_handle_type (
         DAT_IA_HANDLE   dapl_ia_handle;
         DAT_RETURN      dat_status;
 
-        dat_status = dats_get_ia_handle((unsigned long)dat_handle,
+        dat_status = dats_get_ia_handle((DAT_IA_HANDLE)dat_handle,
                                         &dapl_ia_handle);
 
         /* failure to map the handle is unlikely but possible */
@@ -374,12 +372,11 @@ DAT_RETURN dat_get_handle_type (
         dat_handle = dapl_ia_handle;
     }
 
-    return DAT_GET_HANDLE_TYPE (dat_handle,
-				type);
+    return DAT_GET_HANDLE_TYPE (dat_handle, type);
 }
 
 
-DAT_RETURN dat_cr_query (
+DAT_RETURN DAT_API dat_cr_query (
 	IN      DAT_CR_HANDLE		cr_handle,
 	IN      DAT_CR_PARAM_MASK	cr_param_mask,
 	OUT     DAT_CR_PARAM		*cr_param)
@@ -394,7 +391,7 @@ DAT_RETURN dat_cr_query (
 }
 
 
-DAT_RETURN dat_cr_accept (
+DAT_RETURN DAT_API dat_cr_accept (
 	IN      DAT_CR_HANDLE		cr_handle,
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		private_data_size,
@@ -411,7 +408,7 @@ DAT_RETURN dat_cr_accept (
 }
 
 
-DAT_RETURN dat_cr_reject (
+DAT_RETURN DAT_API dat_cr_reject (
 	IN      DAT_CR_HANDLE 		cr_handle,
 	IN	DAT_COUNT		private_data_size,
 	IN const DAT_PVOID		private_data)
@@ -424,7 +421,7 @@ DAT_RETURN dat_cr_reject (
 }
 
 
-DAT_RETURN dat_evd_resize (
+DAT_RETURN DAT_API dat_evd_resize (
 	IN      DAT_EVD_HANDLE		evd_handle,
 	IN      DAT_COUNT		evd_min_qlen)
 {
@@ -437,7 +434,7 @@ DAT_RETURN dat_evd_resize (
 }
 
 
-DAT_RETURN dat_evd_post_se (
+DAT_RETURN DAT_API dat_evd_post_se (
 	IN      DAT_EVD_HANDLE	       evd_handle,
 	IN      const DAT_EVENT		*event)
 {
@@ -450,7 +447,7 @@ DAT_RETURN dat_evd_post_se (
 }
 
 
-DAT_RETURN dat_evd_dequeue (
+DAT_RETURN DAT_API dat_evd_dequeue (
 	IN      DAT_EVD_HANDLE		evd_handle,
 	OUT     DAT_EVENT		*event)
 {
@@ -463,7 +460,7 @@ DAT_RETURN dat_evd_dequeue (
 }
 
 
-DAT_RETURN dat_evd_free (
+DAT_RETURN DAT_API dat_evd_free (
 	IN      DAT_EVD_HANDLE 		evd_handle)
 {
     if (evd_handle == NULL)
@@ -473,7 +470,7 @@ DAT_RETURN dat_evd_free (
     return DAT_EVD_FREE (evd_handle);
 }
 
-DAT_RETURN dat_evd_query (
+DAT_RETURN DAT_API dat_evd_query (
 	IN      DAT_EVD_HANDLE		evd_handle,
 	IN      DAT_EVD_PARAM_MASK	evd_param_mask,
 	OUT     DAT_EVD_PARAM		*evd_param)
@@ -488,7 +485,7 @@ DAT_RETURN dat_evd_query (
 }
 
 
-DAT_RETURN dat_ep_create (
+DAT_RETURN DAT_API dat_ep_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_PZ_HANDLE		pz_handle,
 	IN      DAT_EVD_HANDLE		recv_completion_evd_handle,
@@ -500,8 +497,7 @@ DAT_RETURN dat_ep_create (
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status =  DAT_EP_CREATE (dapl_ia_handle,
@@ -517,7 +513,7 @@ DAT_RETURN dat_ep_create (
 }
 
 
-DAT_RETURN dat_ep_query (
+DAT_RETURN DAT_API dat_ep_query (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_EP_PARAM_MASK	ep_param_mask,
 	OUT     DAT_EP_PARAM		*ep_param)
@@ -532,7 +528,7 @@ DAT_RETURN dat_ep_query (
 }
 
 
-DAT_RETURN dat_ep_modify (
+DAT_RETURN DAT_API dat_ep_modify (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_EP_PARAM_MASK	ep_param_mask,
 	IN      const DAT_EP_PARAM 	*ep_param)
@@ -546,7 +542,7 @@ DAT_RETURN dat_ep_modify (
 			  ep_param);
 }
 
-DAT_RETURN dat_ep_connect (
+DAT_RETURN DAT_API dat_ep_connect (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_IA_ADDRESS_PTR	remote_ia_address,
 	IN      DAT_CONN_QUAL		remote_conn_qual,
@@ -570,7 +566,7 @@ DAT_RETURN dat_ep_connect (
 			   connect_flags);
 }
 
-DAT_RETURN dat_ep_common_connect (
+DAT_RETURN DAT_API dat_ep_common_connect (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_IA_ADDRESS_PTR	remote_ia_address,
 	IN      DAT_TIMEOUT		timeout,
@@ -588,7 +584,7 @@ DAT_RETURN dat_ep_common_connect (
 			   private_data);
 }
 
-DAT_RETURN dat_ep_dup_connect (
+DAT_RETURN DAT_API dat_ep_dup_connect (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_EP_HANDLE		ep_dup_handle,
 	IN      DAT_TIMEOUT		timeout,
@@ -609,7 +605,7 @@ DAT_RETURN dat_ep_dup_connect (
 }
 
 
-DAT_RETURN dat_ep_disconnect (
+DAT_RETURN DAT_API dat_ep_disconnect (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_CLOSE_FLAGS		close_flags)
 {
@@ -621,7 +617,7 @@ DAT_RETURN dat_ep_disconnect (
 			      close_flags);
 }
 
-DAT_RETURN dat_ep_post_send (
+DAT_RETURN DAT_API dat_ep_post_send (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -639,7 +635,7 @@ DAT_RETURN dat_ep_post_send (
 			     completion_flags);
 }
 
-DAT_RETURN dat_ep_post_send_with_invalidate (
+DAT_RETURN DAT_API dat_ep_post_send_with_invalidate (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -661,7 +657,7 @@ DAT_RETURN dat_ep_post_send_with_invalidate (
 			     rmr_context);
 }
 
-DAT_RETURN dat_ep_post_recv (
+DAT_RETURN DAT_API dat_ep_post_recv (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -680,7 +676,7 @@ DAT_RETURN dat_ep_post_recv (
 }
 
 
-DAT_RETURN dat_ep_post_rdma_read (
+DAT_RETURN DAT_API dat_ep_post_rdma_read (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -701,7 +697,7 @@ DAT_RETURN dat_ep_post_rdma_read (
 }
 
 
-DAT_RETURN dat_ep_post_rdma_read_to_rmr (
+DAT_RETURN DAT_API dat_ep_post_rdma_read_to_rmr (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      const DAT_RMR_TRIPLET	*local_iov,
 	IN      DAT_DTO_COOKIE		user_cookie,
@@ -720,7 +716,7 @@ DAT_RETURN dat_ep_post_rdma_read_to_rmr (
 }
 
 
-DAT_RETURN dat_ep_post_rdma_write (
+DAT_RETURN DAT_API dat_ep_post_rdma_write (
 	IN      DAT_EP_HANDLE		ep_handle,
 	IN      DAT_COUNT		num_segments,
 	IN      DAT_LMR_TRIPLET		*local_iov,
@@ -741,7 +737,7 @@ DAT_RETURN dat_ep_post_rdma_write (
 }
 
 
-DAT_RETURN dat_ep_get_status (
+DAT_RETURN DAT_API dat_ep_get_status (
 	IN      DAT_EP_HANDLE		ep_handle,
 	OUT     DAT_EP_STATE		*ep_state,
 	OUT     DAT_BOOLEAN 		*recv_idle,
@@ -758,7 +754,7 @@ DAT_RETURN dat_ep_get_status (
 }
 
 
-DAT_RETURN dat_ep_free (
+DAT_RETURN DAT_API dat_ep_free (
 	IN      DAT_EP_HANDLE		ep_handle)
 {
     if (ep_handle == NULL)
@@ -769,7 +765,7 @@ DAT_RETURN dat_ep_free (
 }
 
 
-DAT_RETURN dat_ep_reset (
+DAT_RETURN DAT_API dat_ep_reset (
 	IN      DAT_EP_HANDLE		ep_handle)
 {
     if (ep_handle == NULL)
@@ -780,7 +776,7 @@ DAT_RETURN dat_ep_reset (
 }
 
 
-DAT_RETURN dat_lmr_free (
+DAT_RETURN DAT_API dat_lmr_free (
 	IN      DAT_LMR_HANDLE		lmr_handle)
 {
     if (lmr_handle == NULL)
@@ -791,7 +787,7 @@ DAT_RETURN dat_lmr_free (
 }
 
 
-DAT_RETURN dat_rmr_create (
+DAT_RETURN DAT_API dat_rmr_create (
 	IN      DAT_PZ_HANDLE		pz_handle,
 	OUT     DAT_RMR_HANDLE		*rmr_handle)
 {
@@ -804,7 +800,7 @@ DAT_RETURN dat_rmr_create (
 }
 
 
-DAT_RETURN dat_rmr_create_for_ep (
+DAT_RETURN DAT_API dat_rmr_create_for_ep (
 	IN      DAT_PZ_HANDLE		pz_handle,
 	OUT     DAT_RMR_HANDLE		*rmr_handle)
 {
@@ -815,7 +811,7 @@ DAT_RETURN dat_rmr_create_for_ep (
     return DAT_RMR_CREATE_FOR_EP (pz_handle,
 			   rmr_handle);
 }
-DAT_RETURN dat_rmr_query (
+DAT_RETURN DAT_API dat_rmr_query (
 	IN      DAT_RMR_HANDLE		rmr_handle,
 	IN      DAT_RMR_PARAM_MASK	rmr_param_mask,
 	OUT     DAT_RMR_PARAM		*rmr_param)
@@ -830,7 +826,7 @@ DAT_RETURN dat_rmr_query (
 }
 
 
-DAT_RETURN dat_rmr_bind (
+DAT_RETURN DAT_API dat_rmr_bind (
 	IN      DAT_RMR_HANDLE		rmr_handle,
 	IN	DAT_LMR_HANDLE		lmr_handle,
 	IN      const DAT_LMR_TRIPLET	*lmr_triplet,
@@ -857,7 +853,7 @@ DAT_RETURN dat_rmr_bind (
 }
 
 
-DAT_RETURN dat_rmr_free (
+DAT_RETURN DAT_API dat_rmr_free (
 	IN      DAT_RMR_HANDLE		rmr_handle)
 {
     if (rmr_handle == NULL)
@@ -867,7 +863,7 @@ DAT_RETURN dat_rmr_free (
     return DAT_RMR_FREE (rmr_handle);
 }
 
-DAT_RETURN dat_lmr_sync_rdma_read(
+DAT_RETURN DAT_API dat_lmr_sync_rdma_read(
 	IN      DAT_IA_HANDLE           ia_handle,
 	IN      const DAT_LMR_TRIPLET   *local_segments,
 	IN      DAT_VLEN                num_segments)
@@ -875,8 +871,7 @@ DAT_RETURN dat_lmr_sync_rdma_read(
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_LMR_SYNC_RDMA_READ (dapl_ia_handle,
@@ -888,7 +883,7 @@ DAT_RETURN dat_lmr_sync_rdma_read(
     return dat_status;
 }
 
-DAT_RETURN dat_lmr_sync_rdma_write(
+DAT_RETURN DAT_API dat_lmr_sync_rdma_write(
 	IN      DAT_IA_HANDLE           ia_handle,
 	IN      const DAT_LMR_TRIPLET   *local_segments,
 	IN      DAT_VLEN                num_segments)
@@ -896,8 +891,7 @@ DAT_RETURN dat_lmr_sync_rdma_write(
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_LMR_SYNC_RDMA_WRITE (dapl_ia_handle,
@@ -909,7 +903,7 @@ DAT_RETURN dat_lmr_sync_rdma_write(
 }
 
 
-DAT_RETURN dat_psp_create (
+DAT_RETURN DAT_API dat_psp_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_CONN_QUAL		conn_qual,
 	IN      DAT_EVD_HANDLE		evd_handle,
@@ -919,8 +913,7 @@ DAT_RETURN dat_psp_create (
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_PSP_CREATE (dapl_ia_handle,
@@ -934,7 +927,7 @@ DAT_RETURN dat_psp_create (
 }
 
 
-DAT_RETURN dat_psp_create_any (
+DAT_RETURN DAT_API dat_psp_create_any (
 	IN      DAT_IA_HANDLE		ia_handle,
 	OUT     DAT_CONN_QUAL		*conn_qual,
 	IN      DAT_EVD_HANDLE		evd_handle,
@@ -944,8 +937,7 @@ DAT_RETURN dat_psp_create_any (
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_PSP_CREATE_ANY (dapl_ia_handle,
@@ -959,7 +951,7 @@ DAT_RETURN dat_psp_create_any (
 }
 
 
-DAT_RETURN dat_psp_query (
+DAT_RETURN DAT_API dat_psp_query (
 	IN      DAT_PSP_HANDLE		psp_handle,
 	IN      DAT_PSP_PARAM_MASK	psp_param_mask,
 	OUT     DAT_PSP_PARAM 		*psp_param)
@@ -974,7 +966,7 @@ DAT_RETURN dat_psp_query (
 }
 
 
-DAT_RETURN dat_psp_free (
+DAT_RETURN DAT_API dat_psp_free (
 	IN      DAT_PSP_HANDLE	psp_handle)
 {
     if (psp_handle == NULL)
@@ -984,7 +976,7 @@ DAT_RETURN dat_psp_free (
     return DAT_PSP_FREE (psp_handle);
 }
 
-DAT_RETURN dat_csp_create (
+DAT_RETURN DAT_API dat_csp_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_COMM		*comm,
 	IN	DAT_IA_ADDRESS_PTR	address,
@@ -994,8 +986,7 @@ DAT_RETURN dat_csp_create (
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_CSP_CREATE (dapl_ia_handle,
@@ -1007,7 +998,7 @@ DAT_RETURN dat_csp_create (
     return dat_status;
 }
 
-DAT_RETURN dat_csp_query (
+DAT_RETURN DAT_API dat_csp_query (
 	IN      DAT_CSP_HANDLE		csp_handle,
 	IN      DAT_CSP_PARAM_MASK	csp_param_mask,
 	OUT     DAT_CSP_PARAM 		*csp_param)
@@ -1021,7 +1012,7 @@ DAT_RETURN dat_csp_query (
 			  csp_param);
 }
 
-DAT_RETURN dat_csp_free (
+DAT_RETURN DAT_API dat_csp_free (
 	IN      DAT_CSP_HANDLE	csp_handle)
 {
     if (csp_handle == NULL)
@@ -1032,7 +1023,7 @@ DAT_RETURN dat_csp_free (
 }
 
 
-DAT_RETURN dat_rsp_create (
+DAT_RETURN DAT_API dat_rsp_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_CONN_QUAL		conn_qual,
 	IN      DAT_EP_HANDLE		ep_handle,
@@ -1042,8 +1033,7 @@ DAT_RETURN dat_rsp_create (
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_RSP_CREATE (dapl_ia_handle,
@@ -1057,7 +1047,7 @@ DAT_RETURN dat_rsp_create (
 }
 
 
-DAT_RETURN dat_rsp_query (
+DAT_RETURN DAT_API dat_rsp_query (
 	IN      DAT_RSP_HANDLE		rsp_handle,
 	IN      DAT_RSP_PARAM_MASK	rsp_param_mask,
 	OUT     DAT_RSP_PARAM		*rsp_param)
@@ -1072,7 +1062,7 @@ DAT_RETURN dat_rsp_query (
 }
 
 
-DAT_RETURN dat_rsp_free (
+DAT_RETURN DAT_API dat_rsp_free (
 	IN      DAT_RSP_HANDLE		rsp_handle)
 {
     if (rsp_handle == NULL)
@@ -1083,15 +1073,14 @@ DAT_RETURN dat_rsp_free (
 }
 
 
-DAT_RETURN dat_pz_create (
+DAT_RETURN DAT_API dat_pz_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	OUT     DAT_PZ_HANDLE		*pz_handle)
 {
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_PZ_CREATE (dapl_ia_handle,
@@ -1102,7 +1091,7 @@ DAT_RETURN dat_pz_create (
 }
 
 
-DAT_RETURN dat_pz_query (
+DAT_RETURN DAT_API dat_pz_query (
 	IN      DAT_PZ_HANDLE		pz_handle,
 	IN      DAT_PZ_PARAM_MASK	pz_param_mask,
 	OUT     DAT_PZ_PARAM		*pz_param)
@@ -1117,7 +1106,7 @@ DAT_RETURN dat_pz_query (
 }
 
 
-DAT_RETURN dat_pz_free (
+DAT_RETURN DAT_API dat_pz_free (
 	IN      DAT_PZ_HANDLE		pz_handle)
 {
     if (pz_handle == NULL)
@@ -1127,7 +1116,7 @@ DAT_RETURN dat_pz_free (
     return DAT_PZ_FREE (pz_handle);
 }
 
-DAT_RETURN dat_ep_create_with_srq(
+DAT_RETURN DAT_API dat_ep_create_with_srq(
         IN      DAT_IA_HANDLE          ia_handle,
         IN      DAT_PZ_HANDLE          pz_handle,
         IN      DAT_EVD_HANDLE         recv_evd_handle,
@@ -1140,8 +1129,7 @@ DAT_RETURN dat_ep_create_with_srq(
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_EP_CREATE_WITH_SRQ (dapl_ia_handle,
@@ -1157,7 +1145,7 @@ DAT_RETURN dat_ep_create_with_srq(
     return dat_status;
 }
 
-DAT_RETURN dat_ep_recv_query(
+DAT_RETURN DAT_API dat_ep_recv_query(
         IN      DAT_EP_HANDLE         ep_handle,
         OUT     DAT_COUNT *           nbufs_allocated,
         OUT     DAT_COUNT *           bufs_alloc_span)
@@ -1171,7 +1159,7 @@ DAT_RETURN dat_ep_recv_query(
 			      bufs_alloc_span);
 }
 
-DAT_RETURN dat_ep_set_watermark(
+DAT_RETURN DAT_API dat_ep_set_watermark(
         IN      DAT_EP_HANDLE         ep_handle,
         IN      DAT_COUNT             soft_high_watermark,
         IN      DAT_COUNT             hard_high_watermark)
@@ -1187,7 +1175,7 @@ DAT_RETURN dat_ep_set_watermark(
 
 /* SRQ functions */
 
-DAT_RETURN dat_srq_create(
+DAT_RETURN DAT_API dat_srq_create(
         IN      DAT_IA_HANDLE           ia_handle,
         IN      DAT_PZ_HANDLE           pz_handle,
         IN      DAT_SRQ_ATTR            *srq_attr,
@@ -1196,8 +1184,7 @@ DAT_RETURN dat_srq_create(
     DAT_IA_HANDLE	dapl_ia_handle;
     DAT_RETURN		dat_status;
 
-    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
-				    &dapl_ia_handle);
+    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
     if (dat_status == DAT_SUCCESS)
     {
 	dat_status = DAT_SRQ_CREATE(dapl_ia_handle,
@@ -1209,13 +1196,13 @@ DAT_RETURN dat_srq_create(
     return dat_status;
 }
 
-DAT_RETURN dat_srq_free(
+DAT_RETURN DAT_API dat_srq_free(
 	IN      DAT_SRQ_HANDLE        srq_handle)
 {
     return DAT_SRQ_FREE (srq_handle);
 }
 
-DAT_RETURN dat_srq_post_recv(
+DAT_RETURN DAT_API dat_srq_post_recv(
 	IN      DAT_SRQ_HANDLE         srq_handle,
 	IN      DAT_COUNT              num_segments,
 	IN      DAT_LMR_TRIPLET *      local_iov,
@@ -1231,7 +1218,7 @@ DAT_RETURN dat_srq_post_recv(
 			      user_cookie);
 }
 
-DAT_RETURN dat_srq_query(
+DAT_RETURN DAT_API dat_srq_query(
 	IN      DAT_SRQ_HANDLE         srq_handle,
 	IN      DAT_SRQ_PARAM_MASK     srq_param_mask,
 	OUT     DAT_SRQ_PARAM *        srq_param)
@@ -1245,7 +1232,7 @@ DAT_RETURN dat_srq_query(
 			  srq_param);
 }
 
-DAT_RETURN dat_srq_resize(
+DAT_RETURN DAT_API dat_srq_resize(
 	IN      DAT_SRQ_HANDLE         srq_handle,
 	IN      DAT_COUNT              srq_max_recv_dto)
 {
@@ -1257,7 +1244,7 @@ DAT_RETURN dat_srq_resize(
 			   srq_max_recv_dto);
 }
 
-DAT_RETURN dat_srq_set_lw(
+DAT_RETURN DAT_API dat_srq_set_lw(
 	IN      DAT_SRQ_HANDLE         srq_handle,
 	IN      DAT_COUNT              low_watermark)
 {
@@ -1270,8 +1257,10 @@ DAT_RETURN dat_srq_set_lw(
 }
 
 #ifdef DAT_EXTENSIONS
+
 extern int g_dat_extensions;
-DAT_RETURN dat_extension_op(
+
+DAT_RETURN DAT_API dat_extension_op(
         IN      DAT_HANDLE              handle,
         IN      DAT_EXTENDED_OP         ext_op,
         IN      ... )
@@ -1282,8 +1271,7 @@ DAT_RETURN dat_extension_op(
      va_list args;
 
      /* If not IA handle then just passthrough */
-     if (dats_get_ia_handle((unsigned long)handle,
-				&dapl_handle) != DAT_SUCCESS)
+     if (dats_get_ia_handle(handle, &dapl_handle) != DAT_SUCCESS)
      {
 	     dapl_handle = handle;
      }
diff --git a/dat/common/dat_init.h b/dat/common/dat_init.h
index 9010afd..4596b01 100644
--- a/dat/common/dat_init.h
+++ b/dat/common/dat_init.h
@@ -66,22 +66,33 @@ typedef enum
 DAT_MODULE_STATE
 dat_module_get_state ( void ) ;
 
+#if defined(_MSC_VER) || defined(_WIN64) || defined(_WIN32)
+/* NT. MSC compiler, Win32/64 platform */
+void
+dat_init ( void );
+
+void
+dat_fini ( void );
+
+#else /* GNU C */
+
 void
 dat_init ( void ) __attribute__ ((constructor));
 
 void
 dat_fini ( void ) __attribute__ ((destructor));
+#endif
 
 extern DAT_RETURN 
 dats_handle_vector_init ( void );
 
-extern unsigned long
+extern DAT_IA_HANDLE
 dats_set_ia_handle (
 	IN  DAT_IA_HANDLE		ia_handle);
 
 extern DAT_RETURN 
 dats_get_ia_handle(
-	IN	unsigned long		handle,
+	IN	DAT_IA_HANDLE		handle,
 	OUT	DAT_IA_HANDLE		*ia_handle_p);
 
 extern DAT_BOOLEAN
@@ -90,6 +101,6 @@ dats_is_ia_handle (
 
 extern DAT_RETURN 
 dats_free_ia_handle(
-	IN	unsigned long		handle);
+	IN	DAT_IA_HANDLE		handle);
 
 #endif
diff --git a/dat/common/dat_strerror.c b/dat/common/dat_strerror.c
index 5f88336..885a261 100644
--- a/dat/common/dat_strerror.c
+++ b/dat/common/dat_strerror.c
@@ -49,12 +49,12 @@
  *                                                                   *
  *********************************************************************/
 
-DAT_RETURN
+static DAT_RETURN
 dat_strerror_major (
     IN  DAT_RETURN 		value,
     OUT const char 		**message );
 
-DAT_RETURN
+static DAT_RETURN
 dat_strerror_minor (
     IN  DAT_RETURN 		value,
     OUT const char 		**message );
@@ -66,7 +66,7 @@ dat_strerror_minor (
  *                                                                   *
  *********************************************************************/
 
-DAT_RETURN
+static DAT_RETURN
 dat_strerror_major (
     IN  DAT_RETURN 		value,
     OUT const char 		**message )
@@ -187,7 +187,7 @@ dat_strerror_major (
 }
 
 
-DAT_RETURN
+static DAT_RETURN
 dat_strerror_minor (
     IN  DAT_RETURN 		value,
     OUT const char 		**message )
@@ -600,7 +600,7 @@ dat_strerror_minor (
  *                                                                   *
  *********************************************************************/
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_strerror (
     IN  DAT_RETURN 		value,
     OUT const char 		**major_message,
diff --git a/dat/include/dat/dat.h b/dat/include/dat/dat.h
index 7fa543b..9c1632c 100755
--- a/dat/include/dat/dat.h
+++ b/dat/include/dat/dat.h
@@ -58,6 +58,11 @@
 
 #include <dat/dat_error.h>
 
+#ifdef __cplusplus
+extern "C"
+{
+#endif
+
 /* Generic DAT types */
 
 typedef char *  DAT_NAME_PTR;	/* Format for ia_name and attributes */
@@ -972,7 +977,7 @@ typedef struct dat_provider_info
  * unload the library after the last close.
  */
 
-extern DAT_RETURN dat_ia_openv (
+extern DAT_RETURN DAT_API dat_ia_openv (
 	IN      const DAT_NAME_PTR,	/* provider             */
 	IN      DAT_COUNT,		/* asynch_evd_min_qlen  */
 	INOUT   DAT_EVD_HANDLE *,	/* asynch_evd_handle    */
@@ -986,7 +991,7 @@ extern DAT_RETURN dat_ia_openv (
 		DAT_VERSION_MAJOR, DAT_VERSION_MINOR, \
 		DAT_THREADSAFE)
 
-extern DAT_RETURN dat_ia_query (
+extern  DAT_RETURN DAT_API dat_ia_query (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	OUT     DAT_EVD_HANDLE *,	/* async_evd_handle     */
 	IN      DAT_IA_ATTR_MASK,	/* ia_attr_mask         */
@@ -994,38 +999,38 @@ extern DAT_RETURN dat_ia_query (
 	IN      DAT_PROVIDER_ATTR_MASK,	/* provider_attr_mask   */
 	OUT     DAT_PROVIDER_ATTR * );	/* provider_attr        */
 
-extern DAT_RETURN dat_ia_close (
+extern  DAT_RETURN DAT_API dat_ia_close (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_CLOSE_FLAGS );	/* close_flags          */
 
 /* helper functions */
 
-extern DAT_RETURN dat_set_consumer_context (
+extern DAT_RETURN DAT_API dat_set_consumer_context (
 	IN      DAT_HANDLE,		/* dat_handle           */
 	IN      DAT_CONTEXT);		/* context              */
 
-extern DAT_RETURN dat_get_consumer_context (
+extern DAT_RETURN DAT_API dat_get_consumer_context (
 	IN      DAT_HANDLE,		/* dat_handle           */
 	OUT     DAT_CONTEXT * );	/* context              */
 
-extern DAT_RETURN dat_get_handle_type (
+extern DAT_RETURN DAT_API dat_get_handle_type (
 	IN      DAT_HANDLE,		/* dat_handle           */
 	OUT     DAT_HANDLE_TYPE * );	/* handle_type          */
 
 /* CR functions */
 
-extern DAT_RETURN dat_cr_query (
+extern DAT_RETURN DAT_API dat_cr_query (
 	IN      DAT_CR_HANDLE,		/* cr_handle            */
 	IN      DAT_CR_PARAM_MASK,	/* cr_param_mask        */
 	OUT     DAT_CR_PARAM * );	/* cr_param             */
 
-extern DAT_RETURN dat_cr_accept (
+extern DAT_RETURN DAT_API dat_cr_accept (
 	IN      DAT_CR_HANDLE,		/* cr_handle            */
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* private_data_size    */
 	IN      const DAT_PVOID );	/* private_data         */
 
-extern DAT_RETURN dat_cr_reject (
+extern DAT_RETURN DAT_API dat_cr_reject (
 	IN      DAT_CR_HANDLE, 		/* cr_handle            */
 	IN	DAT_COUNT,		/* private_data_size	*/
 	IN const DAT_PVOID );		/* private_data		*/
@@ -1033,35 +1038,35 @@ extern DAT_RETURN dat_cr_reject (
 /* For DAT-1.1 and above, this function is defined for both uDAPL and
  * kDAPL. For DAT-1.0, it is only defined for uDAPL.
  */
-extern DAT_RETURN dat_cr_handoff (
+extern DAT_RETURN DAT_API dat_cr_handoff (
 	IN      DAT_CR_HANDLE,		/* cr_handle            */
 	IN      DAT_CONN_QUAL);		/* handoff              */
 
 /* EVD functions */
 
-extern DAT_RETURN dat_evd_resize (
+extern DAT_RETURN DAT_API dat_evd_resize (
 	IN      DAT_EVD_HANDLE,	        /* evd_handle           */
 	IN      DAT_COUNT );	        /* evd_min_qlen         */
 
-extern DAT_RETURN dat_evd_post_se (
+extern DAT_RETURN DAT_API dat_evd_post_se (
 	IN      DAT_EVD_HANDLE,	        /* evd_handle           */
 	IN      const DAT_EVENT * );    /* event                */
 
-extern DAT_RETURN dat_evd_dequeue (
+extern DAT_RETURN DAT_API dat_evd_dequeue (
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	OUT     DAT_EVENT * );		/* event                */
 
-extern DAT_RETURN dat_evd_query (
+extern DAT_RETURN DAT_API dat_evd_query (
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	IN      DAT_EVD_PARAM_MASK,	/* evd_param_mask       */
 	OUT     DAT_EVD_PARAM * );	/* evd_param            */
 
-extern DAT_RETURN dat_evd_free (
+extern DAT_RETURN DAT_API dat_evd_free (
 	IN      DAT_EVD_HANDLE );	/* evd_handle           */
 
 /* EP functions */
 
-extern DAT_RETURN dat_ep_create (
+extern DAT_RETURN DAT_API dat_ep_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_PZ_HANDLE,		/* pz_handle            */
 	IN      DAT_EVD_HANDLE,		/* recv_completion_evd_handle */
@@ -1070,17 +1075,17 @@ extern DAT_RETURN dat_ep_create (
 	IN      const DAT_EP_ATTR *,	/* ep_attributes        */
 	OUT     DAT_EP_HANDLE * );	/* ep_handle            */
 
-extern DAT_RETURN dat_ep_query (
+extern DAT_RETURN DAT_API dat_ep_query (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_EP_PARAM_MASK,	/* ep_param_mask        */
 	OUT     DAT_EP_PARAM * );	/* ep_param             */
 
-extern DAT_RETURN dat_ep_modify (
+extern DAT_RETURN DAT_API dat_ep_modify (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_EP_PARAM_MASK,	/* ep_param_mask        */
 	IN      const DAT_EP_PARAM * ); /* ep_param             */
 
-extern DAT_RETURN dat_ep_connect (
+extern DAT_RETURN DAT_API dat_ep_connect (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_IA_ADDRESS_PTR,	/* remote_ia_address    */
 	IN      DAT_CONN_QUAL,		/* remote_conn_qual     */
@@ -1090,7 +1095,7 @@ extern DAT_RETURN dat_ep_connect (
 	IN      DAT_QOS,		/* quality_of_service   */
 	IN      DAT_CONNECT_FLAGS );	/* connect_flags        */
 
-extern DAT_RETURN dat_ep_dup_connect (
+extern DAT_RETURN DAT_API dat_ep_dup_connect (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_EP_HANDLE,		/* ep_dup_handle        */
 	IN      DAT_TIMEOUT,		/* timeout              */
@@ -1098,25 +1103,25 @@ extern DAT_RETURN dat_ep_dup_connect (
 	IN      const DAT_PVOID,	/* private_data         */
 	IN      DAT_QOS);		/* quality_of_service   */
 
-extern DAT_RETURN dat_ep_common_connect (
+extern DAT_RETURN DAT_API dat_ep_common_connect (
 	IN      DAT_EP_HANDLE,          /* ep_handle            */
 	IN      DAT_IA_ADDRESS_PTR,     /* remote_ia_address    */
 	IN      DAT_TIMEOUT,            /* timeout              */
 	IN      DAT_COUNT,              /* private_data_size    */
 	IN      const DAT_PVOID );      /* private_data         */
 
-extern DAT_RETURN dat_ep_disconnect (
+extern DAT_RETURN DAT_API dat_ep_disconnect (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_CLOSE_FLAGS );	/* close_flags          */
 
-extern DAT_RETURN dat_ep_post_send (
+extern DAT_RETURN DAT_API dat_ep_post_send (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* num_segments         */
 	IN      DAT_LMR_TRIPLET *,	/* local_iov            */
 	IN      DAT_DTO_COOKIE,		/* user_cookie          */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_post_send_with_invalidate (
+extern DAT_RETURN DAT_API dat_ep_post_send_with_invalidate (
 	IN      DAT_EP_HANDLE,          /* ep_handle            */
 	IN      DAT_COUNT,              /* num_segments         */
 	IN      DAT_LMR_TRIPLET *,      /* local_iov            */
@@ -1125,14 +1130,14 @@ extern DAT_RETURN dat_ep_post_send_with_invalidate (
 	IN	DAT_BOOLEAN,		/* invalidate_flag 	*/
 	IN	DAT_RMR_CONTEXT );	/* RMR to invalidate	*/
 
-extern DAT_RETURN dat_ep_post_recv (
+extern DAT_RETURN DAT_API dat_ep_post_recv (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* num_segments         */
 	IN      DAT_LMR_TRIPLET *,	/* local_iov            */
 	IN      DAT_DTO_COOKIE,		/* user_cookie          */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_post_rdma_read (
+extern DAT_RETURN DAT_API dat_ep_post_rdma_read (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* num_segments         */
 	IN      DAT_LMR_TRIPLET *,	/* local_iov            */
@@ -1140,14 +1145,14 @@ extern DAT_RETURN dat_ep_post_rdma_read (
 	IN      const DAT_RMR_TRIPLET *,/* remote_iov           */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_post_rdma_read_to_rmr (
+extern DAT_RETURN DAT_API dat_ep_post_rdma_read_to_rmr (
 	IN      DAT_EP_HANDLE,          /* ep_handle            */
 	IN const DAT_RMR_TRIPLET *,      /* local_iov            */
 	IN      DAT_DTO_COOKIE,         /* user_cookie          */
 	IN      const DAT_RMR_TRIPLET *,/* remote_iov           */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_post_rdma_write (
+extern DAT_RETURN DAT_API dat_ep_post_rdma_write (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_COUNT,		/* num_segments         */
 	IN      DAT_LMR_TRIPLET *,	/* local_iov            */
@@ -1155,19 +1160,19 @@ extern DAT_RETURN dat_ep_post_rdma_write (
 	IN      const DAT_RMR_TRIPLET *,/* remote_iov           */
 	IN      DAT_COMPLETION_FLAGS ); /* completion_flags     */
 
-extern DAT_RETURN dat_ep_get_status (
+extern DAT_RETURN DAT_API dat_ep_get_status (
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	OUT     DAT_EP_STATE *,		/* ep_state             */
 	OUT     DAT_BOOLEAN *,		/* recv_idle            */
 	OUT     DAT_BOOLEAN * );	/* request_idle         */
 
-extern DAT_RETURN dat_ep_free (
+extern DAT_RETURN DAT_API dat_ep_free (
 	IN      DAT_EP_HANDLE);		/* ep_handle            */
 
-extern DAT_RETURN dat_ep_reset (
+extern DAT_RETURN DAT_API dat_ep_reset (
 	IN      DAT_EP_HANDLE);		/* ep_handle            */
 
-extern DAT_RETURN dat_ep_create_with_srq (
+extern DAT_RETURN DAT_API dat_ep_create_with_srq (
         IN      DAT_IA_HANDLE,          /* ia_handle            */
         IN      DAT_PZ_HANDLE,          /* pz_handle            */
         IN      DAT_EVD_HANDLE,         /* recv_evd_handle      */
@@ -1177,49 +1182,49 @@ extern DAT_RETURN dat_ep_create_with_srq (
         IN      const DAT_EP_ATTR *,    /* ep_attributes        */
         OUT     DAT_EP_HANDLE *);       /* ep_handle            */
 
-extern DAT_RETURN dat_ep_recv_query (
+extern DAT_RETURN DAT_API dat_ep_recv_query (
         IN      DAT_EP_HANDLE,          /* ep_handle            */
         OUT     DAT_COUNT *,            /* nbufs_allocated      */
         OUT     DAT_COUNT *);           /* bufs_alloc_span      */
 
-extern DAT_RETURN dat_ep_set_watermark (
+extern DAT_RETURN DAT_API dat_ep_set_watermark (
         IN      DAT_EP_HANDLE,          /* ep_handle            */
         IN      DAT_COUNT,              /* soft_high_watermark  */
         IN      DAT_COUNT);             /* hard_high_watermark  */
 
 /* LMR functions */
 
-extern DAT_RETURN dat_lmr_free (
+extern DAT_RETURN DAT_API dat_lmr_free (
 	IN      DAT_LMR_HANDLE);	/* lmr_handle           */
 
 /* Non-coherent memory functions */
 
-extern DAT_RETURN dat_lmr_sync_rdma_read (
+extern DAT_RETURN DAT_API dat_lmr_sync_rdma_read (
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      const DAT_LMR_TRIPLET *, /* local_segments      */
 	IN      DAT_VLEN);              /* num_segments         */
 
-extern DAT_RETURN dat_lmr_sync_rdma_write (
+extern DAT_RETURN DAT_API dat_lmr_sync_rdma_write (
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      const DAT_LMR_TRIPLET *, /* local_segments      */
 	IN      DAT_VLEN);              /* num_segments         */
 
 /* RMR functions */
 
-extern DAT_RETURN dat_rmr_create (
+extern DAT_RETURN DAT_API dat_rmr_create (
 	IN      DAT_PZ_HANDLE,		/* pz_handle            */
 	OUT     DAT_RMR_HANDLE *);	/* rmr_handle           */
 
-extern DAT_RETURN dat_rmr_create_for_ep (
+extern DAT_RETURN DAT_API dat_rmr_create_for_ep (
 	IN      DAT_PZ_HANDLE,          /* pz_handle            */
 	OUT     DAT_RMR_HANDLE *);      /* rmr_handle           */
 
-extern DAT_RETURN dat_rmr_query (
+extern DAT_RETURN DAT_API dat_rmr_query (
 	IN      DAT_RMR_HANDLE,		/* rmr_handle           */
 	IN      DAT_RMR_PARAM_MASK,	/* rmr_param_mask       */
 	OUT     DAT_RMR_PARAM *);	/* rmr_param            */
 
-extern DAT_RETURN dat_rmr_bind (
+extern DAT_RETURN DAT_API dat_rmr_bind (
 	IN      DAT_RMR_HANDLE,		/* rmr_handle           */
 	IN	DAT_LMR_HANDLE,		/* lmr_handle		*/
 	IN      const DAT_LMR_TRIPLET *,/* lmr_triplet          */
@@ -1230,114 +1235,114 @@ extern DAT_RETURN dat_rmr_bind (
 	IN      DAT_COMPLETION_FLAGS,	/* completion_flags     */
 	OUT     DAT_RMR_CONTEXT * );	/* context              */
 
-extern DAT_RETURN dat_rmr_free (
+extern DAT_RETURN DAT_API dat_rmr_free (
 	IN      DAT_RMR_HANDLE);	/* rmr_handle           */
 
 /* PSP functions */
 
-extern DAT_RETURN dat_psp_create (
+extern DAT_RETURN DAT_API dat_psp_create (
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      DAT_CONN_QUAL,          /* conn_qual            */
 	IN      DAT_EVD_HANDLE,         /* evd_handle           */
 	IN      DAT_PSP_FLAGS,          /* psp_flags            */
 	OUT     DAT_PSP_HANDLE * );     /* psp_handle           */
 
-extern DAT_RETURN dat_psp_create_any (
+extern DAT_RETURN DAT_API dat_psp_create_any (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	OUT     DAT_CONN_QUAL *,	/* conn_qual            */
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	IN      DAT_PSP_FLAGS,		/* psp_flags            */
 	OUT     DAT_PSP_HANDLE * );	/* psp_handle           */
 
-extern DAT_RETURN dat_psp_query (
+extern DAT_RETURN DAT_API dat_psp_query (
 	IN      DAT_PSP_HANDLE,		/* psp_handle           */
 	IN      DAT_PSP_PARAM_MASK,	/* psp_param_mask       */
 	OUT     DAT_PSP_PARAM * );	/* psp_param            */
 
-extern DAT_RETURN dat_psp_free (
+extern DAT_RETURN DAT_API dat_psp_free (
 	IN      DAT_PSP_HANDLE );	/* psp_handle           */
 
 /* RSP functions */
 
-extern DAT_RETURN dat_rsp_create (
+extern DAT_RETURN DAT_API dat_rsp_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_CONN_QUAL,		/* conn_qual            */
 	IN      DAT_EP_HANDLE,		/* ep_handle            */
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	OUT     DAT_RSP_HANDLE * );	/* rsp_handle           */
 
-extern DAT_RETURN dat_rsp_query (
+extern DAT_RETURN DAT_API dat_rsp_query (
 	IN      DAT_RSP_HANDLE,		/* rsp_handle           */
 	IN      DAT_RSP_PARAM_MASK,	/* rsp_param_mask       */
 	OUT     DAT_RSP_PARAM * );	/* rsp_param            */
 
-extern DAT_RETURN dat_rsp_free (
+extern DAT_RETURN DAT_API dat_rsp_free (
 	IN      DAT_RSP_HANDLE );	/* rsp_handle           */
 
 /* CSP functions */
 
-extern DAT_RETURN dat_csp_create (
+extern DAT_RETURN DAT_API dat_csp_create (
 	IN      DAT_IA_HANDLE,          /* ia_handle            */
 	IN      DAT_COMM *,          	/* communicator		*/
 	IN      DAT_IA_ADDRESS_PTR,     /* address		*/
 	IN      DAT_EVD_HANDLE,         /* evd_handle           */
 	OUT     DAT_CSP_HANDLE * );     /* csp_handle           */
 
-extern DAT_RETURN dat_csp_query (
+extern DAT_RETURN DAT_API dat_csp_query (
 	IN      DAT_CSP_HANDLE,         /* csp_handle           */
 	IN      DAT_CSP_PARAM_MASK,     /* csp_param_mask       */
 	OUT     DAT_CSP_PARAM * );      /* csp_param            */
 
-extern DAT_RETURN dat_csp_free (
+extern DAT_RETURN DAT_API dat_csp_free (
 	IN      DAT_CSP_HANDLE );       /* csp_handle           */
 
 /* PZ functions */
 
-extern DAT_RETURN dat_pz_create (
+extern DAT_RETURN DAT_API dat_pz_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	OUT     DAT_PZ_HANDLE * );	/* pz_handle            */
 
-extern DAT_RETURN dat_pz_query (
+extern DAT_RETURN DAT_API dat_pz_query (
 	IN      DAT_PZ_HANDLE,		/* pz_handle            */
 	IN      DAT_PZ_PARAM_MASK,	/* pz_param_mask        */
 	OUT     DAT_PZ_PARAM *);	/* pz_param             */
 
-extern DAT_RETURN dat_pz_free (
+extern DAT_RETURN DAT_API dat_pz_free (
 	IN      DAT_PZ_HANDLE );	/* pz_handle            */
 
 /* SRQ functions */
 
-extern DAT_RETURN dat_srq_create (
+extern DAT_RETURN DAT_API dat_srq_create (
         IN      DAT_IA_HANDLE,          /* ia_handle            */
         IN      DAT_PZ_HANDLE,          /* pz_handle            */
         IN      DAT_SRQ_ATTR *,         /* srq_attr             */
         OUT     DAT_SRQ_HANDLE *);      /* srq_handle           */
 
-extern DAT_RETURN dat_srq_free (
+extern DAT_RETURN DAT_API dat_srq_free (
 	IN      DAT_SRQ_HANDLE);        /* srq_handle           */
 
-extern DAT_RETURN dat_srq_post_recv (
+extern DAT_RETURN DAT_API dat_srq_post_recv (
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_COUNT,              /* num_segments         */
 	IN      DAT_LMR_TRIPLET *,      /* local_iov            */
 	IN      DAT_DTO_COOKIE);        /* user_cookie          */
 
-extern DAT_RETURN dat_srq_query (
+extern DAT_RETURN DAT_API dat_srq_query (
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_SRQ_PARAM_MASK,     /* srq_param_mask       */
 	OUT     DAT_SRQ_PARAM *);       /* srq_param            */
 
-extern DAT_RETURN dat_srq_resize (
+extern DAT_RETURN DAT_API dat_srq_resize (
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_COUNT);             /* srq_max_recv_dto     */
 
-extern DAT_RETURN dat_srq_set_lw (
+extern DAT_RETURN DAT_API dat_srq_set_lw (
 	IN      DAT_SRQ_HANDLE,         /* srq_handle           */
 	IN      DAT_COUNT);             /* low_watermark        */
 
 #ifdef DAT_EXTENSIONS
 typedef int	DAT_EXTENDED_OP;
-extern DAT_RETURN dat_extension_op(
+extern DAT_RETURN DAT_API dat_extension_op(
 	IN	DAT_HANDLE,		/* handle */
 	IN      DAT_EXTENDED_OP,	/* operation */
 	IN	... );			/* args */
@@ -1349,7 +1354,7 @@ extern DAT_RETURN dat_extension_op(
  * Note the dat_ia_open and dat_ia_close functions are linked to
  * registration code which "redirects" to the appropriate provider.
  */
-extern DAT_RETURN dat_registry_list_providers (
+extern DAT_RETURN DAT_API dat_registry_list_providers (
 	IN      DAT_COUNT,		/* max_to_return        */
 	OUT     DAT_COUNT *,		/* entries_returned     */
 	OUT     DAT_PROVIDER_INFO *(dat_provider_list[]) ); /* dat_provider_list */
@@ -1357,10 +1362,14 @@ extern DAT_RETURN dat_registry_list_providers (
 /*
  * DAT error functions.
  */
-extern DAT_RETURN dat_strerror (
+extern DAT_RETURN DAT_API dat_strerror (
 	IN      DAT_RETURN,		/* dat function return */
 	OUT     const char ** ,		/* major message string */
 	OUT     const char ** );	/* minor message string */
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _DAT_H_ */
 
diff --git a/dat/include/dat/dat_platform_specific.h b/dat/include/dat/dat_platform_specific.h
index 0314035..b43fe2c 100644
--- a/dat/include/dat/dat_platform_specific.h
+++ b/dat/include/dat/dat_platform_specific.h
@@ -123,6 +123,9 @@ typedef unsigned long long	DAT_UVERYLONG;	/* unsigned longest native to compiler
 typedef void *                  DAT_PVOID;
 typedef int                     DAT_COUNT;
 
+#define DAT_IA_HANDLE_TO_UL(a) (unsigned long)(a)
+#define UL_TO_DAT_IA_HANDLE(a) (DAT_IA_HANDLE)(a)
+
 #include <sys/socket.h>
 #include <netinet/in.h>
 typedef struct sockaddr        DAT_SOCK_ADDR;  /* Socket address header native to OS */
@@ -133,6 +136,9 @@ typedef struct sockaddr_in6    DAT_SOCK_ADDR6; /* Socket address header native t
 
 typedef DAT_UINT64		DAT_PADDR;
 
+#define DAT_API
+#define DAT_EXPORT		extern
+
 /* Solaris ends */
 
 
@@ -156,6 +162,9 @@ typedef DAT_UINT64		DAT_PADDR;
 #define UINT64_C(c)	c ## ULL
 #endif /* UINT64_C */
 
+#define DAT_IA_HANDLE_TO_UL(a) (unsigned long)(a)
+#define UL_TO_DAT_IA_HANDLE(a) (DAT_IA_HANDLE)(a)
+
 #if defined(__KERNEL__)
 #include <linux/socket.h>
 #include <linux/in.h>
@@ -174,37 +183,74 @@ typedef int DAT_FD;		/* DAT File Descriptor */
 
 typedef struct sockaddr         DAT_SOCK_ADDR; /* Socket address header native to OS */
 typedef struct sockaddr_in6     DAT_SOCK_ADDR6; /* Socket address header native to OS */
-#define DAT_AF_INET AF_INET
-#define DAT_AF_INET6 AF_INET6
-/* Linux ends */
+#define DAT_AF_INET		AF_INET
+#define DAT_AF_INET6		AF_INET6
 
+#define DAT_API
+#define DAT_EXPORT		extern
 
-/* Win32 begins */
-#elif defined(_MSC_VER) || defined(_WIN32) /* NT. MSC compiler, Win32 platform */
+/* Linux ends */
+
+/* Win32/64 begins */
+#elif defined(_MSC_VER) || defined(_WIN32) || defined(_WIN64)
+/* NT. MSC compiler, Win32/64 platform */
 
 typedef unsigned __int32        DAT_UINT32;	/* Unsigned host order, 32 bits */
 typedef unsigned __int64        DAT_UINT64;	/* unsigned host order, 64 bits */
-typedef unsigned  long		DAT_UVERYLONG;	/* unsigned longest native to compiler */
+typedef unsigned  long	        DAT_UVERYLONG;	/* unsigned longest native to compiler */
+
+#if defined(_WIN64)
+#define DAT_IA_HANDLE_TO_UL(a) (unsigned long)((DAT_UINT64)(a))
+#define UL_TO_DAT_IA_HANDLE(a) (DAT_IA_HANDLE)((DAT_UINT64)(a))
+#else // _WIN32
+#define DAT_IA_HANDLE_TO_UL(a) (unsigned long)(a)
+#define UL_TO_DAT_IA_HANDLE(a) (DAT_IA_HANDLE)(a)
+#endif
 
 typedef void *                  DAT_PVOID;
-typedef long                    DAT_COUNT;
+typedef int                     DAT_COUNT;
+typedef DAT_UINT64              DAT_PADDR;
 
-typedef struct sockaddr         DAT_SOCK_ADDR;	/* Socket address header native to OS */
-typedef struct sockaddr_in6     DAT_SOCK_ADDR6; /* Socket address header native to OS */
+typedef struct dat_comm {
+	int	domain;
+	int	type;
+	int	protocol;
+} DAT_COMM;
+
+typedef int DAT_FD;		/* DAT File Descriptor */
+
+typedef struct sockaddr     DAT_SOCK_ADDR; /* Sock addr header native to OS */
+typedef struct sockaddr_in6 DAT_SOCK_ADDR6;/* Sock addr header native to OS */
 
 #ifndef UINT64_C
 #define UINT64_C(c) c ## i64
 #endif /* UINT64_C */
 
-#define DAT_AF_INET             AF_INET
-#define DAT_AF_INET6            AF_INET6
+#define DAT_AF_INET        AF_INET
+#define DAT_AF_INET6       AF_INET6
+
+#if defined(EXPORT_DAT_SYMBOLS)
+#define DAT_EXPORT	__declspec(dllexport)
+#else
+#define DAT_EXPORT	__declspec(dllimport)
+#endif
+
+#define DAT_API		__stdcall
+
+#ifndef __inline__
+#define __inline__	__inline
+#endif
+
+#ifndef INLINE
+#define INLINE		__inline
+#endif
 
 #if defined(__KDAPL__)
 /* must have the DDK for this definition */
 typedef PHYSICAL_ADDRESS	DAT_PADDR;
 #endif /* __KDAPL__ */
 
-/* Win32 ends */
+/* Windoze ends */
 
 
 #else
diff --git a/dat/include/dat/dat_registry.h b/dat/include/dat/dat_registry.h
index fe9db4b..80c3801 100644
--- a/dat/include/dat/dat_registry.h
+++ b/dat/include/dat/dat_registry.h
@@ -59,6 +59,11 @@
 #ifndef _DAT_REGISTRY_H_
 #define _DAT_REGISTRY_H_
 
+#ifdef __cplusplus
+extern "C"
+{
+#endif
+
 #if defined(_UDAT_H_)
 #include <dat/udat_redirection.h>
 #elif defined(_KDAT_H_)
@@ -78,11 +83,11 @@
  *
  */
 
-extern DAT_RETURN dat_registry_add_provider (
+extern DAT_RETURN DAT_API dat_registry_add_provider (
 	IN  const DAT_PROVIDER *,               /* provider          */
 	IN  const DAT_PROVIDER_INFO* );         /* provider info     */
 
-extern DAT_RETURN dat_registry_remove_provider (
+extern DAT_RETURN DAT_API dat_registry_remove_provider (
 	IN  const DAT_PROVIDER *,               /* provider          */
 	IN  const DAT_PROVIDER_INFO* );         /* provider info     */
 
@@ -99,11 +104,11 @@ extern DAT_RETURN dat_registry_remove_provider (
 #define DAT_PROVIDER_INIT_FUNC_STR   "dat_provider_init"
 #define DAT_PROVIDER_FINI_FUNC_STR   "dat_provider_fini"
 
-typedef void ( *DAT_PROVIDER_INIT_FUNC) (
+typedef void ( DAT_API *DAT_PROVIDER_INIT_FUNC) (
 	IN const DAT_PROVIDER_INFO *,           /* provider info     */
 	IN const char *);                       /* instance data     */
 
-typedef void ( *DAT_PROVIDER_FINI_FUNC) (
+typedef void ( DAT_API *DAT_PROVIDER_FINI_FUNC) (
 	IN const DAT_PROVIDER_INFO *);          /* provider info     */
 
 typedef enum dat_ha_relationship
@@ -115,9 +120,13 @@ typedef enum dat_ha_relationship
 	DAT_HA_EXTENSION_BASE
 } DAT_HA_RELATIONSHIP;
 
-extern DAT_RETURN dat_registry_providers_related (
+extern DAT_RETURN DAT_API dat_registry_providers_related (
 	IN      const DAT_NAME_PTR,
 	IN      const DAT_NAME_PTR,
 	OUT     DAT_HA_RELATIONSHIP * );
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _DAT_REGISTRY_H_ */
diff --git a/dat/include/dat/udat.h b/dat/include/dat/udat.h
index 7a89241..a9bb2ac 100755
--- a/dat/include/dat/udat.h
+++ b/dat/include/dat/udat.h
@@ -60,6 +60,11 @@
 
 #include <dat/dat_platform_specific.h>
 
+#ifdef __cplusplus
+extern "C"
+{
+#endif
+
 typedef enum dat_mem_type
 {
         /* Shared between udat and kdat */
@@ -152,7 +157,7 @@ typedef char (* DAT_LMR_COOKIE)[DAT_LMR_COOKIE_SIZE];
 
 /* Format for OS wait proxy agent function */
 
-typedef void (*DAT_AGENT_FUNC)
+typedef void (DAT_API *DAT_AGENT_FUNC)
 (
     DAT_PVOID,                 /* instance data */
     DAT_EVD_HANDLE             /* Event Dispatcher*/
@@ -410,7 +415,7 @@ struct dat_provider_attr
  * User DAT function call definitions,
  */
 
-extern DAT_RETURN dat_lmr_create (
+extern DAT_RETURN DAT_API dat_lmr_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_MEM_TYPE,		/* mem_type             */
 	IN      DAT_REGION_DESCRIPTION, /* region_description   */
@@ -424,72 +429,76 @@ extern DAT_RETURN dat_lmr_create (
 	OUT     DAT_VLEN *,		/* registered_length    */
 	OUT     DAT_VADDR * );		/* registered_address   */
 
-extern DAT_RETURN dat_lmr_query (
+extern DAT_RETURN DAT_API dat_lmr_query (
 	IN      DAT_LMR_HANDLE,		/* lmr_handle           */
 	IN      DAT_LMR_PARAM_MASK,	/* lmr_param_mask       */
 	OUT     DAT_LMR_PARAM * );	/* lmr_param            */
 
 /* Event Functions */
 
-extern DAT_RETURN dat_evd_create (
+extern DAT_RETURN DAT_API dat_evd_create (
 	IN      DAT_IA_HANDLE,		/* ia_handle            */
 	IN      DAT_COUNT,		/* evd_min_qlen         */
 	IN      DAT_CNO_HANDLE,		/* cno_handle           */
 	IN      DAT_EVD_FLAGS,		/* evd_flags            */
 	OUT     DAT_EVD_HANDLE * );	/* evd_handle           */
 
-extern DAT_RETURN dat_evd_modify_cno (
+extern DAT_RETURN DAT_API dat_evd_modify_cno (
 	IN      DAT_EVD_HANDLE,		/* evd_handle           */
 	IN      DAT_CNO_HANDLE);	/* cno_handle           */
 
-extern DAT_RETURN dat_cno_create (
+extern DAT_RETURN DAT_API dat_cno_create (
 	IN 	DAT_IA_HANDLE,		/* ia_handle            */
 	IN 	DAT_OS_WAIT_PROXY_AGENT,/* agent                */
 	OUT 	DAT_CNO_HANDLE *);	/* cno_handle           */
 
-extern DAT_RETURN dat_cno_modify_agent (
+extern DAT_RETURN DAT_API dat_cno_modify_agent (
 	IN 	DAT_CNO_HANDLE,		 /* cno_handle           */
 	IN 	DAT_OS_WAIT_PROXY_AGENT);/* agent                */
 
-extern DAT_RETURN dat_cno_query (
+extern DAT_RETURN DAT_API dat_cno_query (
 	IN      DAT_CNO_HANDLE,		/* cno_handle            */
 	IN      DAT_CNO_PARAM_MASK,	/* cno_param_mask        */
 	OUT     DAT_CNO_PARAM * );	/* cno_param             */
 
-extern DAT_RETURN dat_cno_free (
+extern DAT_RETURN DAT_API dat_cno_free (
 	IN DAT_CNO_HANDLE);		/* cno_handle            */
 
-extern DAT_RETURN dat_cno_wait (
+extern DAT_RETURN DAT_API dat_cno_wait (
 	IN  	DAT_CNO_HANDLE,		/* cno_handle            */
 	IN  	DAT_TIMEOUT,		/* timeout               */
 	OUT 	DAT_EVD_HANDLE *);	/* evd_handle            */
 
-extern DAT_RETURN dat_evd_enable (
+extern DAT_RETURN DAT_API dat_evd_enable (
 	IN      DAT_EVD_HANDLE);	/* evd_handle            */
 
-extern DAT_RETURN dat_evd_wait (
+extern DAT_RETURN DAT_API dat_evd_wait (
 	IN  	DAT_EVD_HANDLE,		/* evd_handle            */
 	IN  	DAT_TIMEOUT,		/* timeout               */
 	IN  	DAT_COUNT,		/* threshold             */
 	OUT 	DAT_EVENT *,		/* event                 */
 	OUT 	DAT_COUNT * );		/* n_more_events         */
 
-extern DAT_RETURN dat_evd_disable (
+extern DAT_RETURN DAT_API dat_evd_disable (
 	IN      DAT_EVD_HANDLE);	/* evd_handle            */
 
-extern DAT_RETURN dat_evd_set_unwaitable (
+extern DAT_RETURN DAT_API dat_evd_set_unwaitable (
 	IN DAT_EVD_HANDLE);		/* evd_handle            */
 
-extern DAT_RETURN dat_evd_clear_unwaitable (
+extern DAT_RETURN DAT_API dat_evd_clear_unwaitable (
 	IN DAT_EVD_HANDLE);		/* evd_handle            */
 
-extern DAT_RETURN dat_cno_fd_create (
+extern DAT_RETURN DAT_API dat_cno_fd_create (
 	IN	DAT_IA_HANDLE,		/* ia_handle		*/
 	OUT	DAT_FD *,		/* file descriptor	*/
 	OUT	DAT_CNO_HANDLE * );	/* cno_handle		*/
 
-extern DAT_RETURN dat_cno_trigger (
+extern DAT_RETURN DAT_API dat_cno_trigger (
 	IN	DAT_CNO_HANDLE,		/* cno_handle		*/
 	OUT	DAT_EVD_HANDLE * );	/* evd_handle		*/
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _UDAT_H_ */
diff --git a/dat/udat/udat.c b/dat/udat/udat.c
index 8356482..9137e65 100755
--- a/dat/udat/udat.c
+++ b/dat/udat/udat.c
@@ -82,15 +82,15 @@ int g_dat_extensions = 0;
  * Function: dat_registry_add_provider
  ***********************************************************************/
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_registry_add_provider (
 	IN  const DAT_PROVIDER	 	*provider,
 	IN  const DAT_PROVIDER_INFO	*provider_info )
 {
-    DAT_DR_ENTRY 		entry;
+    DAT_DR_ENTRY 	entry;
 
     dat_os_dbg_print (DAT_OS_DBG_TYPE_PROVIDER_API,
-		      "DAT Registry: dat_registry_add_provider (%s,%x:%x,%x)\n",
+	      "DAT Registry: %s (%s,%x:%x,%x)\n", __FUNCTION__,
 		      provider_info->ia_name,
 		      provider_info->dapl_version_major,
 		      provider_info->dapl_version_minor,
@@ -123,7 +123,7 @@ dat_registry_add_provider (
 // Function: dat_registry_remove_provider
 //***********************************************************************
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_registry_remove_provider (
 	IN  const DAT_PROVIDER 		*provider,
 	IN  const DAT_PROVIDER_INFO	*provider_info )
@@ -155,7 +155,7 @@ dat_registry_remove_provider (
  * Function: dat_ia_open
  ***********************************************************************/
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_ia_openv (
     IN	   const DAT_NAME_PTR	name,
     IN	   DAT_COUNT		async_event_qlen,
@@ -271,7 +271,7 @@ dat_ia_openv (
  * Function: dat_ia_close
  ***********************************************************************/
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_ia_close (
     IN DAT_IA_HANDLE	ia_handle,
     IN DAT_CLOSE_FLAGS	ia_flags)
@@ -366,7 +366,7 @@ dat_ia_close (
 // Function: dat_registry_list_providers
 //***********************************************************************
 
-DAT_RETURN
+DAT_RETURN DAT_API
 dat_registry_list_providers (
     IN  DAT_COUNT   		max_to_return,
     OUT DAT_COUNT   		*entries_returned,
diff --git a/dat/udat/udat_api.c b/dat/udat/udat_api.c
index 58813fe..8dea05c 100644
--- a/dat/udat/udat_api.c
+++ b/dat/udat/udat_api.c
@@ -52,7 +52,7 @@
 
 #define UDAT_IS_BAD_HANDLE(h) ( NULL == (p) )
 
-DAT_RETURN dat_lmr_create (
+DAT_RETURN DAT_API dat_lmr_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_MEM_TYPE		mem_type,
 	IN      DAT_REGION_DESCRIPTION	region_description,
@@ -91,7 +91,7 @@ DAT_RETURN dat_lmr_create (
 }
 
 
-DAT_RETURN dat_evd_create (
+DAT_RETURN DAT_API dat_evd_create (
 	IN      DAT_IA_HANDLE		ia_handle,
 	IN      DAT_COUNT		evd_min_qlen,
 	IN      DAT_CNO_HANDLE		cno_handle,
@@ -116,7 +116,7 @@ DAT_RETURN dat_evd_create (
 }
 
 
-DAT_RETURN dat_evd_modify_cno (
+DAT_RETURN DAT_API dat_evd_modify_cno (
 	IN      DAT_EVD_HANDLE		evd_handle,
 	IN      DAT_CNO_HANDLE		cno_handle)
 {
@@ -129,7 +129,7 @@ DAT_RETURN dat_evd_modify_cno (
 }
 
 
-DAT_RETURN dat_cno_create (
+DAT_RETURN DAT_API dat_cno_create (
 	IN 	DAT_IA_HANDLE		ia_handle,
 	IN 	DAT_OS_WAIT_PROXY_AGENT agent,
 	OUT 	DAT_CNO_HANDLE		*cno_handle)
@@ -149,7 +149,7 @@ DAT_RETURN dat_cno_create (
     return dat_status;
 }
 
-DAT_RETURN dat_cno_fd_create (
+DAT_RETURN DAT_API dat_cno_fd_create (
 	IN 	DAT_IA_HANDLE		ia_handle,
 	OUT	DAT_FD			*fd,
 	OUT 	DAT_CNO_HANDLE		*cno_handle)
@@ -169,7 +169,7 @@ DAT_RETURN dat_cno_fd_create (
     return dat_status;
 }
 
-DAT_RETURN dat_cno_modify_agent (
+DAT_RETURN DAT_API dat_cno_modify_agent (
 	IN 	DAT_CNO_HANDLE		 cno_handle,
 	IN 	DAT_OS_WAIT_PROXY_AGENT	 agent)
 {
@@ -182,7 +182,7 @@ DAT_RETURN dat_cno_modify_agent (
 }
 
 
-DAT_RETURN dat_cno_query (
+DAT_RETURN DAT_API dat_cno_query (
 	IN      DAT_CNO_HANDLE		cno_handle,
 	IN      DAT_CNO_PARAM_MASK	cno_param_mask,
 	OUT     DAT_CNO_PARAM		*cno_param)
@@ -193,7 +193,7 @@ DAT_RETURN dat_cno_query (
 }
 
 
-DAT_RETURN dat_cno_free (
+DAT_RETURN DAT_API dat_cno_free (
 	IN DAT_CNO_HANDLE		cno_handle)
 {
     if (cno_handle == NULL)
@@ -204,7 +204,7 @@ DAT_RETURN dat_cno_free (
 }
 
 
-DAT_RETURN dat_cno_wait (
+DAT_RETURN DAT_API dat_cno_wait (
 	IN  	DAT_CNO_HANDLE		cno_handle,
 	IN  	DAT_TIMEOUT		timeout,
 	OUT 	DAT_EVD_HANDLE		*evd_handle)
@@ -219,7 +219,7 @@ DAT_RETURN dat_cno_wait (
 }
 
 
-DAT_RETURN dat_evd_enable (
+DAT_RETURN DAT_API dat_evd_enable (
 	IN      DAT_EVD_HANDLE		evd_handle)
 {
     if (evd_handle == NULL)
@@ -230,7 +230,7 @@ DAT_RETURN dat_evd_enable (
 }
 
 
-DAT_RETURN dat_evd_wait (
+DAT_RETURN DAT_API dat_evd_wait (
 	IN  	DAT_EVD_HANDLE		evd_handle,
 	IN  	DAT_TIMEOUT		Timeout,
 	IN  	DAT_COUNT		Threshold,
@@ -249,7 +249,7 @@ DAT_RETURN dat_evd_wait (
 }
 
 
-DAT_RETURN dat_evd_disable (
+DAT_RETURN DAT_API dat_evd_disable (
 	IN      DAT_EVD_HANDLE		evd_handle)
 {
     if (evd_handle == NULL)
@@ -260,7 +260,7 @@ DAT_RETURN dat_evd_disable (
 }
 
 
-DAT_RETURN dat_evd_set_unwaitable (
+DAT_RETURN DAT_API dat_evd_set_unwaitable (
 	IN 	DAT_EVD_HANDLE		 evd_handle)
 {
     if (evd_handle == NULL)
@@ -271,7 +271,7 @@ DAT_RETURN dat_evd_set_unwaitable (
 }
 
 
-DAT_RETURN dat_evd_clear_unwaitable (
+DAT_RETURN DAT_API dat_evd_clear_unwaitable (
 	IN 	DAT_EVD_HANDLE		 evd_handle)
 {
     if (evd_handle == NULL)
@@ -283,7 +283,7 @@ DAT_RETURN dat_evd_clear_unwaitable (
 
 
-DAT_RETURN dat_cr_handoff (
+DAT_RETURN DAT_API dat_cr_handoff (
     IN          DAT_CR_HANDLE		cr_handle,
     IN          DAT_CONN_QUAL		handoff)
 {
@@ -296,7 +296,7 @@ DAT_RETURN dat_cr_handoff (
 }
 
 
-DAT_RETURN dat_lmr_query (
+DAT_RETURN DAT_API dat_lmr_query (
 	IN      DAT_LMR_HANDLE		lmr_handle,
 	IN      DAT_LMR_PARAM_MASK	lmr_param_mask,
 	OUT     DAT_LMR_PARAM		*lmr_param)
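
For readers skimming the patch, the recurring pattern is a calling-convention macro placed between the return type and the function name (or inside the parentheses of a function-pointer typedef), with the declarations wrapped in an extern "C" guard. The following is only a minimal sketch of that pattern. Every MY_* and my_* name is hypothetical and stands in for the DAT_* names in the patch; nothing here is the actual DAT header.

```c
#include <stddef.h>

/* Hypothetical stand-in for DAT_API: __stdcall on Windows builds,
 * empty everywhere else, mirroring what the patch defines per platform. */
#if defined(_MSC_VER) || defined(_WIN32) || defined(_WIN64)
#define MY_API __stdcall
#else
#define MY_API
#endif

/* Same guard the patch adds so C++ consumers get C linkage. */
#ifdef __cplusplus
extern "C" {
#endif

typedef int MY_RETURN;  /* hypothetical stand-in for DAT_RETURN */

/* Prototype: the macro sits between the return type and the name. */
extern MY_RETURN MY_API my_evd_enable(void *evd_handle);

/* Function-pointer typedef: the macro goes inside the parentheses,
 * before the '*', as in the DAT_AGENT_FUNC change. */
typedef void (MY_API *MY_AGENT_FUNC)(void *instance_data, void *evd_handle);

#ifdef __cplusplus
}
#endif

/* The definition must repeat the macro so it matches the prototype. */
MY_RETURN MY_API my_evd_enable(void *evd_handle)
{
    return (evd_handle == NULL) ? -1 : 0;
}

static int g_agent_calls = 0;

/* A callback whose calling convention matches MY_AGENT_FUNC. */
void MY_API my_count_agent(void *instance_data, void *evd_handle)
{
    (void)instance_data;
    (void)evd_handle;
    g_agent_calls++;
}

/* Enables the (hypothetical) EVD, invokes the agent once, and
 * returns how many times the counting agent has run so far. */
int my_run_agent(MY_AGENT_FUNC agent, void *evd_handle)
{
    if (my_evd_enable(evd_handle) != 0)
        return -1;
    agent(NULL, evd_handle);
    return g_agent_calls;
}
```

On non-Windows builds the macro expands to nothing, so the declarations are unchanged there; on 32-bit Windows a prototype and definition that disagree about the convention would normally fail to compile or link, which is presumably why the patch touches both headers and .c files.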

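The patch also adds DAT_IA_HANDLE_TO_UL and UL_TO_DAT_IA_HANDLE, and on _WIN64 it casts through DAT_UINT64 first. On 64-bit Windows unsigned long is 32 bits while pointers are 64 bits, so the intermediate cast most likely exists to make the narrowing explicit and keep the compiler quiet; that motive is an assumption, not something the patch states. A minimal sketch of the same idea, using uintptr_t and hypothetical MY_* names rather than the real DAT types:

```c
#include <stdint.h>

/* Hypothetical stand-in for DAT_IA_HANDLE; assumed to be pointer-sized. */
typedef void *MY_IA_HANDLE;

/* Going through an integer type as wide as a pointer makes the
 * pointer-to-integer step well defined before narrowing to unsigned long.
 * The outer parentheses are a defensive addition not present in the patch. */
#define MY_IA_HANDLE_TO_UL(a) ((unsigned long)(uintptr_t)(a))
#define UL_TO_MY_IA_HANDLE(a) ((MY_IA_HANDLE)(uintptr_t)(a))

/* Round-trips a value that started life as an unsigned long.
 * This is lossless; a real 64-bit pointer pushed through
 * MY_IA_HANDLE_TO_UL on an LLP64 platform would be truncated. */
unsigned long my_handle_round_trip(unsigned long value)
{
    MY_IA_HANDLE h = UL_TO_MY_IA_HANDLE(value);
    return MY_IA_HANDLE_TO_UL(h);
}
```

The round trip is only safe in the direction shown, that is, when the handle was created from an unsigned long in the first place.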

From hrosenstock at xsigo.com  Tue Dec 11 10:34:37 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Tue, 11 Dec 2007 10:34:37 -0800
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <20071211164657.GJ23319@sashak.voltaire.com>
References: <1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
	<1197386753.8114.694.camel@hrosenstock-ws.xsigo.com>
	<20071211164657.GJ23319@sashak.voltaire.com>
Message-ID: <1197398079.8114.718.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-11 at 16:46 +0000, Sasha Khapyorsky wrote:
> On 07:25 Tue 11 Dec     , Hal Rosenstock wrote:
> > On Tue, 2007-12-11 at 15:27 +0000, Sasha Khapyorsky wrote:
> > > On 06:57 Tue 11 Dec     , Hal Rosenstock wrote:
> > > > On Tue, 2007-12-11 at 13:46 +0000, Sasha Khapyorsky wrote:
> > > > > For CAs query performance counters only for single ports by lid and port
> > > > > number, and not whole node with 'all ports' option.
> > > > 
> > > > Should the description also reference the bug # ?
> > > 
> > > I will add.
> > > 
> > > > Will a similar thing be done to the other diag scripts which have this
> > > > same issue (but haven't been reported yet) ?
> > > 
> > > It is reasonable. I will try to check other scripts too.
> > > 
> > > > Would it be better to fix this in the underlying tool used (perfquery)
> > > > and in that way address it for all the diag scripts ?
> > > 
> > > I think perfquery could/should be improved as well, but it is not the
> > > same issue. 
> > 
> > Why not ?
> > 
> > If perfquery paved over the lack of support for all ports, then all the
> > scripts would be fine as is, right ?
> 
> Yes, but I think that it is more accurate to query CA ports and not just
> nodes (even if the 'all ports' option is supported).

Router ports need the same handling as CA ports.
 
> > > I think that in general it is more accurate when whole
> > > fabric is checked to query endport's by port and not by node - multiport
> > > CA can have disconnected ports and/or ports which connected to another
> > > subnet - in this way its counters are irrelevant to the check. Right?
> > 
> > Yes, but doing it on a node basis cuts down on the number of queries.
> 
> True, but doing the right thing is more important here than the number of
> queries IMO (BTW in practice the difference in the number of queries is not
> so significant - it is a matter of percent, not of multiples).

As long as switch ports aren't individually queried.

> > One can always go back and dive down to the port level after seeing
> > which nodes are of interest.
> 
> The problem is that one can get an invalid error report with such a script -
> for example when a CA has a "bad" port which is connected to another subnet.

It is a bad port; just not in this particular subnet.

-- Hal

> Sasha
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
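
The query policy argued over in this thread can be sketched roughly as
follows. This is only an illustration, not code from infiniband-diags:
the names node_type, plan_queries and port_in_subnet are hypothetical,
and the caller is assumed to know which CA/router ports sit on the
subnet being checked. The one fixed value is 0xFF, the PortSelect value
that requests all ports.

```c
#include <assert.h>
#include <stddef.h>

enum node_type { NODE_CA, NODE_SWITCH, NODE_ROUTER };   /* hypothetical */

#define ALL_PORTS 0xFF  /* PortSelect value meaning "all ports" */

/*
 * Fill ports[] with the PortSelect values to query for one node and
 * return how many entries were written.  ports[] must have room for
 * at least max(1, nports) entries.
 *
 * Switches get a single all-ports query; CAs and routers get one
 * query per port that is connected to the subnet being checked.
 */
size_t plan_queries(enum node_type type, const int *port_in_subnet,
                    size_t nports, unsigned char *ports)
{
    size_t n = 0;

    if (type == NODE_SWITCH) {
        ports[n++] = ALL_PORTS;
        return n;
    }
    for (size_t i = 0; i < nports; i++)
        if (port_in_subnet[i])
            ports[n++] = (unsigned char)(i + 1);  /* ports are 1-based */
    return n;
}
```

With this split a two-port CA that has one port on another subnet
produces a single query for the local port, while a switch still costs
one query regardless of its port count.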


From hrosenstock at xsigo.com  Tue Dec 11 10:41:44 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Tue, 11 Dec 2007 10:41:44 -0800
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <20071211165617.GK23319@sashak.voltaire.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
	<1197386753.8114.694.camel@hrosenstock-ws.xsigo.com>
	<20071211164657.GJ23319@sashak.voltaire.com>
	<20071211165617.GK23319@sashak.voltaire.com>
Message-ID: <1197398504.8114.725.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-11 at 16:56 +0000, Sasha Khapyorsky wrote:
> On 16:46 Tue 11 Dec     , Sasha Khapyorsky wrote:
> > On 07:25 Tue 11 Dec     , Hal Rosenstock wrote:
> > > On Tue, 2007-12-11 at 15:27 +0000, Sasha Khapyorsky wrote:
> > > > On 06:57 Tue 11 Dec     , Hal Rosenstock wrote:
> > > > > On Tue, 2007-12-11 at 13:46 +0000, Sasha Khapyorsky wrote:
> > > > > > For CAs query performance counters only for single ports by lid and port
> > > > > > number, and not whole node with 'all ports' option.
> > > > > 
> > > > > Should the description also reference the bug # ?
> > > > 
> > > > I will add.
> > > > 
> > > > > Will a similar thing be done to the other diag scripts which have this
> > > > > same issue (but haven't been reported yet) ?
> > > > 
> > > > It is reasonable. I will try to check other scripts too.
> > > > 
> > > > > Would it be better to fix this in the underlying tool used (perfquery)
> > > > > and in that way address it for all the diag scripts ?
> > > > 
> > > > I think perfquery could/should be improved as well, but it is not the
> > > > same issue. 
> > > 
> > > Why not ?
> > > 
> > > If perfquery paved over the lack of support for all ports, then all the
> > > scripts would be fine as is, right ?
> 
> Another aspect of this.
> 
> I'm not sure that 'all ports' simulation in perfquery is a great thing.

In a sense, it's no different than what the agent itself might be doing;
albeit over a larger time span.

> perfquery is a low-level tool and it should be able to indicate clearly
> that the 'all ports' option is not supported by a port instead of hiding
> this behind simulation. Maybe 'all ports' simulation should be optional...

Guess you did a 180 turn on this. Last I recall on the list you wanted
this functionality.

> I'm not sure yet.
> 
> Also I think that when perfquery targets a CA port just by LID and the
> port number is not specified, 'all ports' should not be the default, but
> instead the port number of this LID. Such behavior seems more "native"
> to me.

Not sure what you mean by "native" but there is some precedent in the
IB spec for your preference depending on one's interpretation of the
PortSelect component description.

-- Hal

> Sasha
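
A rough sketch of what an 'all ports' simulation could look like,
assuming the aggregate is formed by summing the per-port readings of a
16-bit counter and sticking at the maximum rather than wrapping. Both
that assumption and the name sum_saturating16 are illustrative; this is
not taken from perfquery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sum one 16-bit counter over per-port readings, sticking at 0xFFFF
 * instead of wrapping around.
 */
uint16_t sum_saturating16(const uint16_t *per_port, size_t nports)
{
    uint32_t total = 0;

    for (size_t i = 0; i < nports; i++) {
        total += per_port[i];
        if (total >= UINT16_MAX)
            return UINT16_MAX;
    }
    return (uint16_t)total;
}
```

Because the per-port reads happen one after another, the simulated total
covers a wider time window than a single all-ports query would.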


From hrosenstock at xsigo.com  Tue Dec 11 10:50:01 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Tue, 11 Dec 2007 10:50:01 -0800
Subject: [ofa-general] [PATCH] IB/CM: add support for routed paths
In-Reply-To: <475EC4FF.9080702@ichips.intel.com>
References: <20071210203544.GI30090@obsidianresearch.com>
	<000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com>
	<1197377985.8114.637.camel@hrosenstock-ws.xsigo.com>
	<475EC4FF.9080702@ichips.intel.com>
Message-ID: <1197399001.8114.731.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-11 at 09:12 -0800, Sean Hefty wrote:
> > Could the subnet local component(s) in the REQ be used on the passive
> > side instead of permissive LIDs ? That might be more "standard" than
> > using permissive LIDs.
> 
> It could be used.  I wanted this to work on the passive side if the LIDs 
> were set correctly in the REQ.

By correctly, do you mean set to permissive in the REQ ?

Guess I don't see much difference (on the passive side) in checking the
LIDs for permissive or the subnet local field to determine whether to
use the LIDs from the LRH. The only difference is this additional
special meaning to the permissive LIDs.

-- Hal

> 
> - Sean


From mshefty at ichips.intel.com  Tue Dec 11 11:02:00 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Tue, 11 Dec 2007 11:02:00 -0800
Subject: [ofa-general] [PATCH] IB/CM: add support for routed paths
In-Reply-To: <1197399001.8114.731.camel@hrosenstock-ws.xsigo.com>
References: <20071210203544.GI30090@obsidianresearch.com>	
	<000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com>	
	<1197377985.8114.637.camel@hrosenstock-ws.xsigo.com>	
	<475EC4FF.9080702@ichips.intel.com>
	<1197399001.8114.731.camel@hrosenstock-ws.xsigo.com>
Message-ID: <475EDEA8.2020906@ichips.intel.com>

> By correctly, do you mean set to permissive in the REQ ?

I meant if the LIDs in the REQ were set to the actual values that the 
passive side should use.

- Sean


From ssufficool at rov.sbcounty.gov  Tue Dec 11 11:07:55 2007
From: ssufficool at rov.sbcounty.gov (Sufficool, Stanley)
Date: Tue, 11 Dec 2007 11:07:55 -0800
Subject: [ofa-general] Cisco Topspin drivers available
In-Reply-To: <c177de4a0712110703p594ea7dalbf5f124689d41107@mail.gmail.com>
Message-ID: <C2F174F99918D54CA2A96E57C5079B6F3551AD@sbc-exmsg2.sbcounty.gov>

The Mellanox site has instructions on identifying the PSID of the card:
http://www.mellanox.com/support/HCA_FW_identification.php

Once you have the correct chipset ID, you should be able to update using
the Mellanox firmware:

http://www.mellanox.com/support/firmware_table_IH3Ex.php

However, your PSID will be reflashed to read as a Mellanox card unless
you follow the instructions for using an OEM flash INI file.
 
 
	-----Original Message-----
	From: general-bounces at lists.openfabrics.org
	[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Chuck Hartley
	Sent: Tuesday, December 11, 2007 7:03 AM
	To: general at lists.openfabrics.org
	Subject: [ofa-general] Cisco Topspin drivers available

	Does anyone here know where I can find up-to-date drivers for the
	Cisco Topspin HCAs?  I tried the Cisco and IBM websites and the
	drivers there are at least a year old and do not support any recent
	kernels.  We have an IBM QS21 blade running Fedora 7 (or possibly
	RHEL 5.1 soon) and need a driver for the HCA.  The Cisco release
	notes say to build the driver from the source tarball if you have an
	unsupported OS, but there do not seem to be any tarballs on the
	images available for download.  Even though the HCA apparently uses
	a Mellanox chip, it is not supported by the mthca driver.  Here is
	what lspci says:

	InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev a0)

	Hopefully someone here is using one of these with a modern kernel /
	distribution...

	Thanks,
	Chuck
	


From tziporet at dev.mellanox.co.il  Tue Dec 11 12:34:51 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Tue, 11 Dec 2007 22:34:51 +0200
Subject: [ofa-general] rc1 this week?
In-Reply-To: <20071211174223.GB19090@sgi.com>
References: <20071211174223.GB19090@sgi.com>
Message-ID: <475EF46B.8010601@mellanox.co.il>

akepner at sgi.com wrote:
> Hi Tziporet; 
>
> Is OFED-1.3 rc1 still expected to be out this week?
>
>   
yes - I will update status tomorrow
Tziporet


From tziporet at dev.mellanox.co.il  Tue Dec 11 12:37:10 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Tue, 11 Dec 2007 22:37:10 +0200
Subject: [ofa-general] [PATCH] mstflint: Convert project to autoconf tools.
In-Reply-To: <20071210133554.44bac886.weiny2@llnl.gov>
References: <20071210133554.44bac886.weiny2@llnl.gov>
Message-ID: <475EF4F6.6000401@mellanox.co.il>

Adding Oren, who is the mstflint maintainer

Tziporet

Ira Weiny wrote:
> This patch removes the makefile and converts the mstflint git tree over to
> autoconf tools.  This works great on x86_64 but has not been tested on other
> arches.  (Although it is simple enough that I don't see how it would not work.)
>
> Thanks,
> Ira
>
>
> From efb3a07a1f333ea95204d2a2e9462e285e29a65f Mon Sep 17 00:00:00 2001
> From: Ira K. Weiny <weiny2 at llnl.gov>
> Date: Mon, 10 Dec 2007 13:30:22 -0800
> Subject: [PATCH] Convert project to autoconf tools.
>
>
> Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
> ---
>  Makefile         |   47 -----------------------------------------------
>  Makefile.am      |   21 +++++++++++++++++++++
>  autogen.sh       |   11 +++++++++++
>  configure.in     |   22 ++++++++++++++++++++++
>  mstflint.spec.in |   45 +++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 99 insertions(+), 47 deletions(-)
>  delete mode 100644 Makefile
>  create mode 100644 Makefile.am
>  create mode 100755 autogen.sh
>  create mode 100644 configure.in
>  create mode 100644 mstflint.spec.in
>
> diff --git a/Makefile b/Makefile
> deleted file mode 100644
> index 889c97a..0000000
> --- a/Makefile
> +++ /dev/null
> @@ -1,47 +0,0 @@
> -#default options
> -CFLAGS += -O2
> -CFLAGS += -g
> -CFLAGS += -Wall
> -CXXFLAGS += -fno-exceptions
> -CFLAGS += -I.
> -LD=$(CXX)
> -EXTRA_LOADLIBES=-lz
> -LOADLIBES+=${EXTRA_LOADLIBES}
> -
> -all: default
> -bin: mstflint mstmread mstmwrite mstregdump mstvpd
> -
> -default: bin
> -static: bin
> -shared: bin
> -
> -.PHONY: all bin clean static shared default
> -.DELETE_ON_ERROR:
> -
> -default: EXTRA_LOADLIBES="$(shell $(CXX) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libz.a)" "$(shell $(CXX)  ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libstdc++.a)"
> -default: LD=$(CC)
> -static: CFLAGS+=-static
> -
> -mstflint: mstflint.o mflash.o
> -	$(LD) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} mstflint.o mflash.o -o mstflint ${LOADLIBES}
> -
> -mstflint.o: flint.cpp mflash.h
> -	$(CXX) ${CFLAGS} ${CXXFLAGS} -c flint.cpp -o mstflint.o
> -
> -mflash.o: mtcr.h mflash.c mflash.h
> -	$(CC) ${CFLAGS} -c mflash.c -o mflash.o
> -
> -mstmwrite: mwrite.c mtcr.h
> -	$(CC) ${CFLAGS} mwrite.c -o mstmwrite
> -
> -mstmread: mread.c mtcr.h
> -	$(CC) ${CFLAGS} mread.c -o mstmread
> -
> -mstregdump: mstdump.c mtcr.h
> -	$(CC) ${CFLAGS} mstdump.c -o mstregdump
> -
> -mstvpd: vpd.c
> -	$(CC) ${CFLAGS} vpd.c -o mstvpd
> -
> -clean:
> -	rm -f mstvpd mstregdump mstflint mstmread mstmwrite mstflint.o mflash.o
> diff --git a/Makefile.am b/Makefile.am
> new file mode 100644
> index 0000000..f642d9d
> --- /dev/null
> +++ b/Makefile.am
> @@ -0,0 +1,21 @@
> +bin_PROGRAMS = mstmread \
> +					mstmwrite \
> +					mstflint \
> +					mstregdump \
> +					mstvpd
> +
> +mstmread_SOURCES = mread.c mtcr.h
> +
> +mstmwrite_SOURCES = mwrite.c mtcr.h
> +
> +mstflint_SOURCES = flint.cpp mtcr.h mflash.h mflash.c
> +mstflint_LDADD = -lz
> +
> +mstregdump_SOURCES = mstdump.c mtcr.h
> +
> +mstvpd_SOURCES = vpd.c
> +
> +
> +EXTRA_DIST = \
> +	mstflint.spec
> +
> diff --git a/autogen.sh b/autogen.sh
> new file mode 100755
> index 0000000..4827884
> --- /dev/null
> +++ b/autogen.sh
> @@ -0,0 +1,11 @@
> +#! /bin/sh
> +
> +# create config dir if not exist
> +test -d config || mkdir config
> +
> +set -x
> +aclocal -I config
> +libtoolize --force --copy
> +autoheader
> +automake --foreign --add-missing --copy
> +autoconf
> diff --git a/configure.in b/configure.in
> new file mode 100644
> index 0000000..0924d65
> --- /dev/null
> +++ b/configure.in
> @@ -0,0 +1,22 @@
> +dnl Process this file with autoconf to produce a configure script.
> +
> +AC_INIT(mstflint)
> +
> +AC_DEFINE_UNQUOTED([PROJECT], ["mstflint"], [Define the project name.])
> +AC_SUBST([PROJECT])
> +
> +AC_DEFINE_UNQUOTED([VERSION], ["1.3"], [Define the project version.])
> +AC_SUBST([VERSION])
> +
> +AC_CONFIG_AUX_DIR(config)
> +AC_CONFIG_SRCDIR([README])
> +AM_INIT_AUTOMAKE(mstflint, 1.3)
> +
> +dnl Checks for programs
> +AC_PROG_CC
> +AC_PROG_CXX
> +AC_PROG_LIBTOOL
> +AC_CONFIG_HEADERS
> +
> +AC_CONFIG_FILES([Makefile mstflint.spec])
> +AC_OUTPUT
> diff --git a/mstflint.spec.in b/mstflint.spec.in
> new file mode 100644
> index 0000000..b5937be
> --- /dev/null
> +++ b/mstflint.spec.in
> @@ -0,0 +1,45 @@
> +Summary: Mellanox firmware burning application
> +Name: mstflint
> +Version: @VERSION@
> +Release: 1
> +License: GPL/BSD
> +Url: http://openib.org/
> +Group: System Environment/Base
> +BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}
> +Source: mstflint-@VERSION@.tar.gz
> +ExclusiveArch: i386 x86_64 ia64 ppc ppc64
> +BuildRequires: zlib-devel
> +Requires(post): chkconfig
> +
> +%description
> +This package contains a tool for burning updated firmware on to
> +Mellanox manufactured InfiniBand adapters.
> +
> +%prep
> +%setup -q
> +
> +%build
> +%configure
> +make
> +
> +%install
> +rm -rf $RPM_BUILD_ROOT
> +make DESTDIR=${RPM_BUILD_ROOT} install
> +# remove unpackaged files from the buildroot
> +rm -f $RPM_BUILD_ROOT%{_libdir}/*.la
> +
> +%clean
> +rm -rf $RPM_BUILD_ROOT
> +
> +%files
> +%defattr(-,root,root)
> +%{_bindir}/mstmread
> +%{_bindir}/mstmwrite
> +%{_bindir}/mstflint
> +%{_bindir}/mstregdump
> +%{_bindir}/mstvpd
> +
> +%changelog
> +* Fri Dec 07 2007 Ira Weiny <weiny2 at llnl.gov> 1.0.0
> +   initial creation
> +
>   


From weiny2 at llnl.gov  Tue Dec 11 13:00:00 2007
From: weiny2 at llnl.gov (Ira Weiny)
Date: Tue, 11 Dec 2007 13:00:00 -0800
Subject: [ofa-general] [PATCH] mstflint: Convert project to autoconf tools.
In-Reply-To: <475EF4F6.6000401@mellanox.co.il>
References: <20071210133554.44bac886.weiny2@llnl.gov>
	<475EF4F6.6000401@mellanox.co.il>
Message-ID: <20071211130000.41b3025b.weiny2@llnl.gov>

On Tue, 11 Dec 2007 22:37:10 +0200
Tziporet Koren <tziporet at dev.mellanox.co.il> wrote:

> Adding Oren who is mstflint maintainer

Ah my bad, sorry.

Thanks,
Ira

> 
> Tziporet
> 


From jgunthorpe at obsidianresearch.com  Tue Dec 11 13:02:12 2007
From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe)
Date: Tue, 11 Dec 2007 14:02:12 -0700
Subject: [ofa-general] [PATCH] IB/CM: add support for routed paths
In-Reply-To: <1197399001.8114.731.camel@hrosenstock-ws.xsigo.com>
References: <20071210203544.GI30090@obsidianresearch.com>
	<000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com>
	<1197377985.8114.637.camel@hrosenstock-ws.xsigo.com>
	<475EC4FF.9080702@ichips.intel.com>
	<1197399001.8114.731.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071211210212.GO6530@obsidianresearch.com>

On Tue, Dec 11, 2007 at 10:50:01AM -0800, Hal Rosenstock wrote:

> Guess I don't see much difference (on the passive side) in checking the
> LIDs for permissive or the subnet local field to determine whether to
> use the LIDs from the LRH. The only difference is this additional
> special meaning to the permissive LIDs.

I think there are three cases here:

1) It is subnet local; the node should just use the LIDs.
2) It is not subnet local, and the necessary local LIDs are not known.
   The node should do the LRH copy workaround.
3) It is not subnet local, and the necessary local LIDs are provided.
   The node should just use the LIDs.

Just using the subnet local field does not provide enough information
to tell which of the three cases we need to use.

I expect the main use of the subnet local bit should be to control use
of a GRH on that path - we have overloaded hoplimit for that case,
which is probably not entirely correct.

Jason
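
The three cases above can be sketched as a small classifier. This is
only an illustration under stated assumptions: it follows the idea
raised earlier in the thread that permissive LIDs (0xFFFF) in the REQ
mark "local LIDs not known", and the names classify_req and lid_source
are hypothetical, not taken from the IB CM code.

```c
#include <assert.h>
#include <stdint.h>

#define PERMISSIVE_LID 0xFFFF

enum lid_source {                /* hypothetical names */
    USE_REQ_LIDS_LOCAL,          /* case 1: subnet local, use REQ LIDs   */
    COPY_LIDS_FROM_LRH,          /* case 2: routed, local LIDs not known */
    USE_REQ_LIDS_ROUTED          /* case 3: routed, local LIDs provided  */
};

/*
 * Decide where the passive side takes its LIDs from.  The subnet_local
 * flag alone cannot separate case 2 from case 3, so the LIDs carried
 * in the REQ are inspected as well.
 */
enum lid_source classify_req(int subnet_local, uint16_t local_lid,
                             uint16_t remote_lid)
{
    if (subnet_local)
        return USE_REQ_LIDS_LOCAL;
    if (local_lid == PERMISSIVE_LID || remote_lid == PERMISSIVE_LID)
        return COPY_LIDS_FROM_LRH;
    return USE_REQ_LIDS_ROUTED;
}
```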


From hartlch14 at gmail.com  Tue Dec 11 13:29:53 2007
From: hartlch14 at gmail.com (Chuck Hartley)
Date: Tue, 11 Dec 2007 16:29:53 -0500
Subject: [ofa-general] OFED 1.2.5.4 build fails for kernel 2.6.23
Message-ID: <c177de4a0712111329y72e4580bub9e23d75d17338a0@mail.gmail.com>

I tried building OFED 1.2.5.4 on a Fedora 7 system with kernel
2.6.23.1-21.fc7 and got a fatal compile error.  Apparently the number of
arguments to kmem_cache_create() changed from 6 to 5 starting with kernel
version 2.6.23. Error output below:

  gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/.mad.o.d \
-nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.1.2/include -D__KERNEL__ \
 \
-I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include \
-I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/include \
-Iinclude \
 \
-include include/linux/autoconf.h \
-include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/include/linux/autoconf.h\
 -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing
-fno-common -Werror-implicit-function-declaration -Os  -mtune=generic -m64
-mno-red-zone -mcmodel=kernel -pipe -Wno-sign-compare
-fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2
-mno-3dnow -maccumulate-outgoing-args -DCONFIG_AS_CFI=1
-DCONFIG_AS_CFI_SIGNAL_FRAME=1 -fstack-protector -fomit-frame-pointer -g
-fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign
-DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(mad)"
-D"KBUILD_MODNAME=KBUILD_STR(ib_mad)" -c -o
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/mad.o
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/mad.c
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/mad.c: In function 'ib_mad_init_module':
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/mad.c:2970: error: too many arguments to function 'kmem_cache_create'
make[4]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core/mad.o] Error 1
make[3]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband/core] Error 2
make[2]: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4/drivers/infiniband] Error 2
make[1]: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2.5.4] Error 2
make[1]: Leaving directory `/usr/src/kernels/2.6.23.1-21.fc7-x86_64'
make: *** [kernel] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.12849 (%install)
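
The version dependence reported above can be sketched in user space as
follows. This is a stand-alone illustration, not the OFED backport: the
helper kmem_cache_create_nargs is hypothetical, and KERNEL_VERSION is
redefined locally so the snippet builds without kernel headers.

```c
#include <assert.h>

/* Same encoding as the kernel's KERNEL_VERSION() for 2.6-era versions */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Number of arguments kmem_cache_create() takes for a given version
 * code, per the report above: 6 before 2.6.23, 5 from 2.6.23 on.
 */
int kmem_cache_create_nargs(unsigned int version_code)
{
    return version_code < KERNEL_VERSION(2, 6, 23) ? 6 : 5;
}

/*
 * In kernel code the same test would be a preprocessor guard, e.g.
 *
 *   #if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 23)
 *       ... six-argument call (with the trailing destructor) ...
 *   #else
 *       ... five-argument call ...
 *   #endif
 */
```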

From jlentini at netapp.com  Tue Dec 11 13:50:44 2007
From: jlentini at netapp.com (James Lentini)
Date: Tue, 11 Dec 2007 16:50:44 -0500 (EST)
Subject: [ofa-general] [PATCH 2/2] uDAT/uDAPL v2 - (master branch) changes
	to sync common code base with WinOF 1.01
In-Reply-To: <000001c83c1e$e4e1fc40$4297070a@amr.corp.intel.com>
References: <000101c8392e$86f2ff50$1dfd070a@amr.corp.intel.com>
	<Pine.LNX.4.64.0712101253210.12638@jlentini-linux.nane.netapp.com>
	<000001c83c1e$e4e1fc40$4297070a@amr.corp.intel.com>
Message-ID: <Pine.LNX.4.64.0712111649050.12638@jlentini-linux.nane.netapp.com>


On Tue, 11 Dec 2007, Arlin Davis wrote:

> >
> >Looks good. A few more minor questions:
> >
> >
> >What are the changes above? I don't see any difference in the text
> >or the white space.
> >
> 
> Extra linefeeds removed.

Ok. 

I looked at the diff in a hex editor to see if that was the change, 
but I didn't see anything. Maybe these were stripped out somewhere 
along the way...


From jlentini at netapp.com  Tue Dec 11 13:57:59 2007
From: jlentini at netapp.com (James Lentini)
Date: Tue, 11 Dec 2007 16:57:59 -0500 (EST)
Subject: [ofa-general] [PATCH 1/2 rev2] uDAT/uDAPL v2 - (master branch)
	changes to sync common code base with WinOF 1.01
In-Reply-To: <000101c83c23$34399510$4297070a@amr.corp.intel.com>
References: <000001c8392d$ae801400$1dfd070a@amr.corp.intel.com>
	<Pine.LNX.4.64.0712101249000.12638@jlentini-linux.nane.netapp.com>
	<000101c83c23$34399510$4297070a@amr.corp.intel.com>
Message-ID: <Pine.LNX.4.64.0712111655210.12638@jlentini-linux.nane.netapp.com>


On Tue, 11 Dec 2007, Arlin Davis wrote:

>  
> >> -    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
> >> -				    &dapl_ia_handle);
> >> +    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
> >
> >For consistency with your change above, should the cast 
> >be changed to 
> >
> >+    dat_status = dats_get_ia_handle((DAT_IA_HANDLE)ia_handle, 
> >&dapl_ia_handle);
> >
> 
> Good catch. I missed some dat_api.c changes from Stan. Here is rev2. 
> 
>   - add DAT_API to specify calling conventions (windows=__stdcall, linux= ) 
>   - cleanup platform specific definitions for windows
>   - c++ support
>   - add handle check macros DAT_IA_HANDLE_TO_UL and UL_TO_DAT_IA_HANDLE

For naming consistency, I'd suggest DAT_UL_TO_IA_HANDLE instead of 
UL_TO_DAT_IA_HANDLE. I defer to your judgement on which to use.

Other than that, looks good.
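
A minimal sketch of the two ideas in the change list above, a
calling-convention macro and handle/unsigned-long conversion macros.
The definitions here are assumptions made for illustration (including
the DAT_IA_HANDLE stand-in typedef and the ia_handle_roundtrip helper);
the real ones live in the DAT headers and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Calling convention: __stdcall on Windows, nothing elsewhere */
#if defined(_WIN32)
#define DAT_API __stdcall
#else
#define DAT_API
#endif

typedef void *DAT_IA_HANDLE;    /* stand-in for the real DAT type */

/* Conversion macros, named as in the change list above */
#define DAT_IA_HANDLE_TO_UL(h)  ((unsigned long)(uintptr_t)(h))
#define UL_TO_DAT_IA_HANDLE(u)  ((DAT_IA_HANDLE)(uintptr_t)(u))

/* Hypothetical helper: convert a value to a handle and back */
unsigned long DAT_API ia_handle_roundtrip(unsigned long ul)
{
    DAT_IA_HANDLE h = UL_TO_DAT_IA_HANDLE(ul);

    return DAT_IA_HANDLE_TO_UL(h);
}
```

Renaming the second macro to DAT_UL_TO_IA_HANDLE, as suggested above,
would only change the macro name, not the conversion.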


From Arkady.Kanevsky at netapp.com  Tue Dec 11 14:07:16 2007
From: Arkady.Kanevsky at netapp.com (Kanevsky, Arkady)
Date: Tue, 11 Dec 2007 17:07:16 -0500
Subject: [ofa-general] iWARP peer-to-peer revised proposal
Message-ID: <C98692FD98048C41885E0B0FACD9DFB805ACF8B9@exnane01.hq.netapp.com>

The goal is to have something implemented now in IW_CM to solve the interop
issues, which we can use as a starting point for a submission to the IETF
as an MPA "extension".

The proposal is for the initiator side to generate the first message (RTU)
in RDDP mode.
RTU stands for the 3rd MPA message - Ready to Use.
 
Initiator side:
1. IW_CM sends first MPA message (request).
 
Option A: no change to the MPA request.
Option B: Steal a bit from the reserved field to indicate that the Initiator
"supports" the peer-to-peer model and wants to use it.
The default value is 0, indicating that the Initiator does
not support the peer-to-peer model, which is the same as the current MPA format.
A value of 1 indicates support.
Option C: The same as Option B, but steal a bit from the private data
instead of the reserved field.

[For the quick fix Option A is the easiest. For the MPA
extension proposal Option B looks better.]

2. Responder side
MPA response indicates whether or not Responder can handle RTU message 
and what type of message RTU should be.
 
Option A: Steal a couple of bits from the reserved field.
They will encode the format of the RTU message the Responder can handle.
The default value of 0 means that it can NOT handle an RTU message.
(This means that the Initiator must send the 1st message as part of ULP to ULP
traffic.) The other 3 values represent 0b RR, 0b RW, untagged 0b Send.
Option B: the same as above, but steal the bits from the private data instead
of the reserved field.
 
3. The ULP must use the same post-to-Send-Queue model as for IB.
That is, no posting to the SQ until you get the connection setup completion event.
(I am not aware of any ULP that does not do that now.
If you know of one, please point it out.)
 
4. Initiator (IW_CM):
Send RTU message based on the MPA response.
Either:
A. 0B RR (signalled?)
B. 0B RW (unsignalled?)
C. untagged 0B Send (unsignalled?)
 
If the RTU was sent signalled, then IW_CM generates the Connection Established
event when it reaps the completion.
If the RTU was sent unsignalled, then the Connection Established event is
generated when the post returns successfully.
[I prefer unsignalled to avoid contaminating the CQ, which can be
shared.]
 
5. Responder:
The responder does _not_ emit any FPDUs until it has received the RTU.
There is no need to dictate how this is done under the covers.

One way of doing it is for the iWARP vendor driver/FW/HW not to generate
the Connection Established event when the MPA response is sent.
Instead it waits for the RTU message to arrive, handles it under
the covers (vendor magic), and generates the Connection Established event
when it does.
It may temporarily use RR or RW credits or a slot in an "under the covers" CQ.

Another implementation _could_ pass up the ESTABLISHED event before the
RTU arrives but stall the SQ until it does arrive...

6. While the MPA response message does not have a timeout parameter,
the RDMA_CM connect_accept message will have a timeout parameter
if an RTU message was requested. If the RTU message does not arrive within
the timeout, the RDMA connection is torn down.
Currently, RDMA_CM has a default timeout value for IB. I suggest
we keep that as the default value for the MPA response.
It corresponds to the user-specified timeout value of 0.

7. For OFA interop, let's agree on what the RTU message type will be.
My recommendation is an unsignalled 0B Read.
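
The negotiation in steps 2 and 4 could be sketched roughly as below. This is
only an illustration under stated assumptions: the bit position
(MPA_RTU_SHIFT), the enum values and all function names are made up for the
example and are not taken from the MPA specification or from any driver.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical encoding of the two bits the Responder would steal from the
 * reserved field of the MPA response (step 2, Option A).
 * 0 keeps today's meaning: the Responder cannot handle an RTU message. */
enum rtu_type {
    RTU_NONE       = 0, /* no RTU: Initiator sends the first ULP message */
    RTU_ZERO_READ  = 1, /* 0B RDMA Read */
    RTU_ZERO_WRITE = 2, /* 0B RDMA Write */
    RTU_ZERO_SEND  = 3  /* untagged 0B Send */
};

#define MPA_RTU_SHIFT 0u                      /* assumed bit position */
#define MPA_RTU_MASK  (0x3u << MPA_RTU_SHIFT)

/* Store the RTU type in the reserved byte, leaving the other bits alone. */
static inline uint8_t mpa_set_rtu(uint8_t reserved, enum rtu_type t)
{
    return (uint8_t)((reserved & ~MPA_RTU_MASK) |
                     (((unsigned)t << MPA_RTU_SHIFT) & MPA_RTU_MASK));
}

/* Extract the RTU type the Responder advertised. */
static inline enum rtu_type mpa_get_rtu(uint8_t reserved)
{
    return (enum rtu_type)((reserved & MPA_RTU_MASK) >> MPA_RTU_SHIFT);
}

/* Step 4: may the Initiator-side IW_CM report Connection Established?
 *   no RTU requested -> nothing to wait for
 *   signalled RTU    -> only after its completion has been reaped
 *   unsignalled RTU  -> as soon as the post returned successfully */
static inline bool initiator_established(enum rtu_type t, bool signalled,
                                         bool post_ok, bool completion_reaped)
{
    if (t == RTU_NONE)
        return true;
    if (!post_ok)
        return false;
    return signalled ? completion_reaped : true;
}
```

With this layout a reserved byte of 0 decodes to RTU_NONE, so a peer that
does not know about the extension behaves exactly as it does today.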

Arkady Kanevsky                     email: arkady at netapp.com
Network Appliance Inc.              phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.        Fax: 781-895-1195
Waltham, MA 02451                   central phone: 781-768-5300


From stan.smith at intel.com  Tue Dec 11 14:48:43 2007
From: stan.smith at intel.com (Smith, Stan)
Date: Tue, 11 Dec 2007 14:48:43 -0800
Subject: [ofa-general] [PATCH 2/2] uDAT/uDAPL v2 - (master branch) changes
	to sync common code base with WinOF 1.01
In-Reply-To: <Pine.LNX.4.64.0712111649050.12638@jlentini-linux.nane.netapp.com>
References: <000101c8392e$86f2ff50$1dfd070a@amr.corp.intel.com>
	<Pine.LNX.4.64.0712101253210.12638@jlentini-linux.nane.netapp.com>
	<000001c83c1e$e4e1fc40$4297070a@amr.corp.intel.com>
	<Pine.LNX.4.64.0712111649050.12638@jlentini-linux.nane.netapp.com>
Message-ID: <55CE0347B98FCA468923E5FBC25CB4DC028A111E@orsmsx413.amr.corp.intel.com>

James Lentini wrote:
> On Tue, 11 Dec 2007, Arlin Davis wrote:
> 
>> 
>> 
>>> 
>>> Looks good. A few more minor questions:
>>> 
>>> 
>>> What are the changes above? I don't see any difference in the
>>> text or the white space. 
>>> 
>> 
>> Extra linefeeds removed.
> 
> Ok.
> 
> I looked at the diff in a hex editor to see if that was the change,
> but I didn't see anything. Maybe these were stripped out somewhere
> along the way...

The ^M (carriage return) characters inadvertently introduced by Visual Studio
edit sessions were removed by the patch.
vim on Linux would show them as ^M.
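
For anyone who wants to check a buffer for the same problem, a minimal
sketch in C follows. The function names are hypothetical and are not part
of the uDAPL tree; the code only looks for a carriage return (0x0D)
directly followed by a line feed.

```c
#include <stddef.h>

/* Count carriage returns (0x0D, shown by vim as ^M) that directly
 * precede a line feed. */
size_t count_crlf(const char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i + 1 < len; i++)
        if (buf[i] == '\r' && buf[i + 1] == '\n')
            n++;
    return n;
}

/* Rewrite buf in place, dropping the CR of every CRLF pair.
 * Returns the new length; the buffer is not NUL-terminated here. */
size_t strip_crlf(char *buf, size_t len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
            continue; /* skip the CR, keep the LF that follows */
        buf[out++] = buf[i];
    }
    return out;
}
```

A lone CR that is not followed by LF is left untouched, so only DOS-style
line endings are converted.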

S.


From stan.smith at intel.com  Tue Dec 11 14:49:43 2007
From: stan.smith at intel.com (Smith, Stan)
Date: Tue, 11 Dec 2007 14:49:43 -0800
Subject: [ofa-general] [PATCH 1/2 rev2] uDAT/uDAPL v2 - (master branch)
	changes to sync common code base with WinOF 1.01
In-Reply-To: <Pine.LNX.4.64.0712111655210.12638@jlentini-linux.nane.netapp.com>
References: <000001c8392d$ae801400$1dfd070a@amr.corp.intel.com>
	<Pine.LNX.4.64.0712101249000.12638@jlentini-linux.nane.netapp.com>
	<000101c83c23$34399510$4297070a@amr.corp.intel.com>
	<Pine.LNX.4.64.0712111655210.12638@jlentini-linux.nane.netapp.com>
Message-ID: <55CE0347B98FCA468923E5FBC25CB4DC028A1123@orsmsx413.amr.corp.intel.com>

James Lentini wrote:
> On Tue, 11 Dec 2007, Arlin Davis wrote:
> 
>> 
>>>> -    dat_status = dats_get_ia_handle((unsigned long)ia_handle,
>>>> -				    &dapl_ia_handle);
>>>> +    dat_status = dats_get_ia_handle(ia_handle, &dapl_ia_handle);
>>> 
>>> For consistency with your change above, should the cast
>>> be changed to
>>> 
>>> +    dat_status = dats_get_ia_handle((DAT_IA_HANDLE)ia_handle,
>>> &dapl_ia_handle); 
>>> 
>> 
>> Good catch. I missed some dat_api.c changes from Stan. Here is rev2.
>> 
>>   - add DAT_API to specify calling conventions (windows=__stdcall,
>> linux= ) 
>>   - cleanup platform specific definitions for windows
>>   - c++ support
>>   - add handle check macros DAT_IA_HANDLE_TO_UL and
>> UL_TO_DAT_IA_HANDLE 
> 
> For naming consistency, I'd suggest DAT_UL_TO_IA_HANDLE instead of
> UL_TO_DAT_IA_HANDLE. I defer to your judgement on which to use.

Makes sense to me.

> 
> Other than that, looks good.


From pradeeps at linux.vnet.ibm.com  Tue Dec 11 15:35:08 2007
From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana)
Date: Tue, 11 Dec 2007 15:35:08 -0800
Subject: [ofa-general] Trivial patch -IPoIB compile failure fix
Message-ID: <475F1EAC.9030709@linux.vnet.ibm.com>

ipoib_main.c fails to compile if "Connected Mode" is not enabled
in the .config. Compile tested.

Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
---
--- linux-2.6.24-rc3/drivers/infiniband/ulp/ipoib/ipoib_main.c.orig	2007-12-11 16:10:28.000000000 -0800
+++ linux-2.6.24-rc3/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-12-11 16:11:04.000000000 -0800
@@ -1266,7 +1266,9 @@ static int __init ipoib_init_module(void
 	ipoib_sendq_size = min(ipoib_sendq_size, IPOIB_MAX_QUEUE_SIZE);
 	ipoib_sendq_size = max(ipoib_sendq_size, IPOIB_MIN_QUEUE_SIZE);
 
+#ifdef CONFIG_INFINIBAND_IPOIB_CM
 	ipoib_max_conn_qp = min(ipoib_max_conn_qp, IPOIB_CM_MAX_CONN_QP);
+#endif
 
 	ret = ipoib_register_debugfs();
 	if (ret)


From kliteyn at mellanox.co.il  Tue Dec 11 21:11:15 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 12 Dec 2007 07:11:15 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-12:normal completion
Message-ID: <MTLEXCH01zPZrIkyPzG000004de@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-11
OpenSM git rev = Tue_Dec_11_03:55:20_2007 [3a9e90568ad06c7efbd150b1c341d14d83811ea1]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=520  Fail=0
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From vlad at lists.openfabrics.org  Wed Dec 12 03:07:31 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Wed, 12 Dec 2007 03:07:31 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071212-0200 daily build status
Message-ID: <20071212110731.BBE48E601DA@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.15
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.13
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.15
Passed on ppc64 with linux-2.6.12
Passed on ia64 with linux-2.6.23
Passed on ppc64 with linux-2.6.18
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.14
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.15
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ppc64 with linux-2.6.18-8.el5

Failed:


From moshek at voltaire.com  Wed Dec 12 03:54:40 2007
From: moshek at voltaire.com (Moshe Kazir)
Date: Wed, 12 Dec 2007 13:54:40 +0200
Subject: [ofa-general] Re: [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes and
	5.0firmware support
In-Reply-To: <475D4B49.4090709@dev.mellanox.co.il>
Message-ID: <39C75744D164D948A170E9792AF8E7CA4D2CD5@exil.voltaire.com>

What is the planned rate of releasing OFED-1.2.5.X versions?

And what is the level of testing and QA?

Isn't the pace of moving from 1.2.5.1 to 1.2.5.5 too fast?

Moshe

____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

  
-----Original Message-----
From: general-bounces at lists.openfabrics.org
[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vladimir
Sokolovsky
Sent: Monday, December 10, 2007 4:21 PM
To: Steve Wise
Cc: OpenFabrics EWG; OpenFabrics General
Subject: [ofa-general] Re: [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes
and 5.0firmware support


Hi Steve,
Sorry, I missed your libcxgb3 updates for ofed-1.2.5.
It is updated now.
There is OFED-1.2.5.4-20071210-0614 build which includes updated
libcxgb3 library.

In any case we are going to release OFED-1.2.5.5 in a few days.

Regards,
Vladimir

Steve Wise wrote:
> Vlad, it looks like you didn't pull in version 1.1.0 of libcxgb3 for
> ofed-1.2.5?
> 
> Right now the ofed-1.2.5.4 is broken from chelsio's perspective 
> because
> the kernel drivers require 5.0 firmware, but the library doesn't have 
> 5.0 firmware support.
> 
> Can you please pull in 1.1.0 of libcxgb3 and crank a new ofed-1.2.5.4
> release?
> 
> Pull from:
> 
> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
> 
> 
> Thanks,
> 
> Steve.
> 
> 
> Steve Wise wrote:
>> Vlad, please pull cxgb3 fixes for ofed-1.2.5 from:
>>
>> git://git.openfabrics.org/~swise/ofed-1.2.5 stevo
>>
>> These are cxgb3 bug fixes and PPC64 additions that we need for
>> ofed-1.2.5  (stay tuned for ofed-1.3 patches soon).
>>
>> The patches are all accepted upstream and were posted here:
>>
>> http://www.spinics.net/lists/netdev/msg47492.html
>>
>> and here:
>>
>> http://www.spinics.net/lists/netdev/msg48240.html
>>
>>
>> Also, please pull version 1.1.0 of libcxgb3 from:
>>
>> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
>>
>> The library and drivers need to be included together as they are both
>> needed to support the chelsio 5.0 firmware.
>>
>> Alsoalso: After you integrate these, can you crank a daily
>> OFED-1.2.5.3 build including all this?
>>
>>
>> Thanks,
>>
>> Steve.
>>
> 

_______________________________________________
general mailing list
general at lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit
http://openib.org/mailman/listinfo/openib-general


From ogerlitz at voltaire.com  Wed Dec 12 04:14:25 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Wed, 12 Dec 2007 14:14:25 +0200
Subject: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize HCA-related
	hCalls on POWER5
In-Reply-To: <OFD9564F75.44193623-ONC12573AE.002EA542-C12573AE.002F5FBC@de.ibm.com>
References: <OFD9564F75.44193623-ONC12573AE.002EA542-C12573AE.002F5FBC@de.ibm.com>
Message-ID: <475FD0A1.3090304@voltaire.com>

Joachim Fenkes wrote:
> Roland Dreier <rdreier at cisco.com> wrote on 10.12.2007 22:47:37:

>> It's an optional device feature, so this should be OK
>> (although the iSER driver currently seems to depend on a device
>> supporting FMRs, which is probably going to be a problem with iWARP
>> support in the future anyway).

> I'm not comfortable with removing code from the driver that iSER seems 
> to depend on. Are there plans to fix this in iSER?

What fix do you suggest: adding a device query that tells you for
which verbs the documentation does not apply, or enhancing the code of the
map_phys_fmr verb within the ehca driver to return an error if called
from a non-sleepable context?

Or.


From hrosenstock at xsigo.com  Wed Dec 12 04:17:43 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 12 Dec 2007 04:17:43 -0800
Subject: [ofa-general] [PATCH] IB/CM: add support for routed paths
In-Reply-To: <20071211210212.GO6530@obsidianresearch.com>
References: <20071210203544.GI30090@obsidianresearch.com>
	<000d01c83b87$dacc6070$9c98070a@amr.corp.intel.com>
	<1197377985.8114.637.camel@hrosenstock-ws.xsigo.com>
	<475EC4FF.9080702@ichips.intel.com>
	<1197399001.8114.731.camel@hrosenstock-ws.xsigo.com>
	<20071211210212.GO6530@obsidianresearch.com>
Message-ID: <1197461863.15966.53.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-11 at 14:02 -0700, Jason Gunthorpe wrote:
> On Tue, Dec 11, 2007 at 10:50:01AM -0800, Hal Rosenstock wrote:
> 
> > Guess I don't see much difference (on the passive side) in checking the
> > LIDs for permissive or the subnet local field to determine whether to
> > use the LIDs from the LRH. The only difference is this additional
> > special meaning to the permissive LIDs.
> 
> I think there are three cases here:
> 
> 1) It is subnet local, the node should just use the lids
> 2) It is not subnet local, and the necessary local lids are not known.
>    The node should do the LRH copy work around
> 3) It is not subnet local, and the necessary local lids are provided.
>    The node should just use the lids.
> 
> Just using the subnet local field does not provide enough information
> to tell which of the three cases we need to use.

Thanks; that was what I was missing.
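
The three cases Jason lists can be written down as a small decision
function. This is a sketch only: the enum, the function names and the use
of the permissive LID (0xFFFF) as the "not known" marker are assumptions
made for illustration, not code from the posted patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define LID_PERMISSIVE 0xFFFFu /* IB permissive LID */

enum lid_source {
    USE_PATH_LIDS, /* cases 1 and 3: just use the LIDs in the path */
    COPY_LRH_LIDS  /* case 2: fall back to the LRH copy workaround */
};

/* Assumed convention: a permissive LID means "the sender did not know
 * the necessary local LID". */
static bool lids_known(uint16_t slid, uint16_t dlid)
{
    return slid != LID_PERMISSIVE && dlid != LID_PERMISSIVE;
}

static enum lid_source choose_lids(bool subnet_local, bool local_lids_known)
{
    if (subnet_local)
        return USE_PATH_LIDS;      /* case 1 */
    if (!local_lids_known)
        return COPY_LRH_LIDS;      /* case 2 */
    return USE_PATH_LIDS;          /* case 3 */
}
```

The second argument is what the subnet local field alone cannot supply,
which is why the field by itself cannot separate case 2 from case 3.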

-- Hal

> 
> I expect the main use of the subnet local bit should be to control use
> of a GRH on that path - we have overloaded hoplimit for that case,
> which is probably not entirely correct.
> 
> Jason


From hrosenstock at xsigo.com  Wed Dec 12 04:21:53 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 12 Dec 2007 04:21:53 -0800
Subject: [ofa-general] who can give me info on the utility vendstat?
In-Reply-To: <1196870831.30768.318.camel@hrosenstock-ws.xsigo.com>
References: <4756ACBB.6050604@dev.mellanox.co.il>
	<1196863318.30768.288.camel@hrosenstock-ws.xsigo.com>
	<4756B32F.5010801@dev.mellanox.co.il>
	<1196864753.30768.296.camel@hrosenstock-ws.xsigo.com>
	<4756BBF6.8040008@dev.mellanox.co.il>
	<1196867101.30768.306.camel@hrosenstock-ws.xsigo.com>
	<4756CAF8.2060605@dev.mellanox.co.il>
	<1196870831.30768.318.camel@hrosenstock-ws.xsigo.com>
Message-ID: <1197462113.15966.60.camel@hrosenstock-ws.xsigo.com>

Dotan,

On Wed, 2007-12-05 at 08:07 -0800, Hal Rosenstock wrote: 
> On Wed, 2007-12-05 at 17:59 +0200, Dotan Barak wrote:
> > > Yes, IS3 definitely works for this. I think Anafa works too, as well
> > > as Tavor HCA, although I can't be 100% sure in my current test
> > > environment.
> > >
> > > -- Hal
> > >   
> > 
> > After some digging I noticed that only the switches support this MAD
> > (I didn't see this in the HCA's PRMs).
> 
> It seems they do respond though but maybe not with the proper info so
> should they respond with some error then ?

> Should vendstat query node type first and make sure it is a switch for
> this option ?

Any idea on the above two questions ? Thanks.

-- Hal

> -- Hal
> 
> > thanks for the quick response
> > Dotan


From ogerlitz at voltaire.com  Wed Dec 12 04:52:20 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Wed, 12 Dec 2007 14:52:20 +0200
Subject: [ofa-general] RE: [PATCH] librdmacm/man: fix-up man pages
In-Reply-To: <000801c832b6$81feb850$f5d8180a@amr.corp.intel.com>
References: <000101c81a64$3582de80$9c98070a@amr.corp.intel.com>	<4726EEAC.3070105@voltaire.com>
	<472755C4.10600@ichips.intel.com>	<47285F53.4060402@voltaire.com>
	<4728BF4A.1060301@ichips.intel.com>	<15ddcffd0710311320v6b91b3cm3be0f7882e30ad2b@mail.gmail.com>	<000001c81cb5$4ce12160$9c98070a@amr.corp.intel.com>	<15ddcffd0711270435t12a18dc3waac2596b3884ac72@mail.gmail.com>	<000001c8311a$176cdbe0$63248686@amr.corp.intel.com>	<15ddcffd0711280307u7a89c6c2q2854b071f74d9123@mail.gmail.com>
	<000801c832b6$81feb850$f5d8180a@amr.corp.intel.com>
Message-ID: <475FD984.6080203@voltaire.com>

Sean Hefty wrote:
> These have been updated and pushed upstream.  Please let me know if you're aware
> of any other documentation changes.

OK, here's some feedback on the documentation (man pages)

1. for rdma_disconnect - mention that

- it applies only to connected service
- the QP is moved to the error state, after which all the posted 
work requests will be flushed to the completion queue
- the disconnected event will be generated on both sides of the connection

2. for rdma_join_multicast
- mention that, as with unicast, if a source address is provided to 
rdma_resolve_addr then the routing table need not be set to route this 
group to an ipoib device

- mention that the attach operation is done once the SA join query
is finished

- point from this page to the page of rdma_get_cm_event

3. librdmacm return codes

I understand it's either zero (success) or unary-minus-some-errno-value; 
is that correct? If yes, can you document that on the main page? Down the 
road, it would be nice to document the return values/reasons for each API 
entry, but I guess this might not fit into the timeline of OFED 1.3?!

Or.


From changquing.tang at hp.com  Wed Dec 12 07:24:38 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Wed, 12 Dec 2007 15:24:38 +0000
Subject: [ofa-general] XRC cleanup order issue
Message-ID: <D89C2C212795564B837FA1665CAE02990FDE143AE8@G5W0278.americas.hpqcorp.net>


Hi,
        This question is mainly for Mellanox engineers.

        With XRC, the rank that creates the QP used for transport to all
ranks on that node can NOT exit first if other ranks are still using the
transport. This restriction is a problem for our dynamic process definition,
where any rank could die for any reason without tearing down the whole
application.

        I am thinking about shared memory usage, where the creator does not
have to stay alive while other processes are still using it; when the last
process exits, the system cleans up the shared memory.

        Can't XRC mimic the shared memory behavior?


Thanks.

--CQ Tang


From kliteyn at dev.mellanox.co.il  Wed Dec 12 07:38:24 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Wed, 12 Dec 2007 17:38:24 +0200
Subject: [ofa-general] CMA can't establish connection with QoS on
Message-ID: <47600070.8050008@dev.mellanox.co.il>

Hi Sean,

I'm having trouble using rdma when QoS is enabled on the subnet.
Everything works fine as long as the QoS policy enforces SL=0, but when
the policy defines SL!=0, the client application hangs.

The only thing I see is that CMA sends a PathRecord query to OpenSM
and OpenSM returns the QoS-enforced SL.

I've also opened a bugzilla bug for this - num. 821
You can see more details and instructions to reproduce it there.

-- Yevgeny


From RAISCH at de.ibm.com  Wed Dec 12 08:02:23 2007
From: RAISCH at de.ibm.com (Christoph Raisch)
Date: Wed, 12 Dec 2007 17:02:23 +0100
Subject: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize HCA-related
	hCalls on POWER5
In-Reply-To: <475FD0A1.3090304@voltaire.com>
References: <OFD9564F75.44193623-ONC12573AE.002EA542-C12573AE.002F5FBC@de.ibm.com>
	<475FD0A1.3090304@voltaire.com>
Message-ID: <OF2694B462.1504C1FC-ONC12573AF.0056AAC0-C12573AF.0056F862@de.ibm.com>

Or Gerlitz <ogerlitz at voltaire.com> wrote on 12.12.2007 13:14:25:

> Joachim Fenkes wrote:
> > Roland Dreier <rdreier at cisco.com> wrote on 10.12.2007 22:47:37:
>
> >> It's an optional device feature, so this should be OK
> >> (although the iSER driver currently seems to depend on a device
> >> supporting FMRs, which is probably going to be a problem with iWARP
> >> support in the future anyway).
>
> > I'm not comfortable with removing code from the driver that iSER seems
> > to depend on. Are there plans to fix this in iSER?
>
> What fix do you suggest: adding a device query that tells you for
> which verbs the documentation does not apply, or enhancing the code of the
> map_phys_fmr verb within the ehca driver to return an error if called
> from a non-sleepable context?

Roland,
what is your suggestion here?

We could implement both versions Or is proposing, but having both
at the same time sounds like overkill.

Christoph R.


From kliteyn at dev.mellanox.co.il  Wed Dec 12 07:52:04 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Wed, 12 Dec 2007 17:52:04 +0200
Subject: [ofa-general] [PATCH] opensm: trivial change of log message
Message-ID: <476003A4.1090004@dev.mellanox.co.il>

Improving pkey format in log messages

Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_sa_multipath_record.c |    2 +-
 opensm/opensm/osm_sa_path_record.c      |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_sa_multipath_record.c b/opensm/opensm/osm_sa_multipath_record.c
index efc6a07..a37b726 100644
--- a/opensm/opensm/osm_sa_multipath_record.c
+++ b/opensm/opensm/osm_sa_multipath_record.c
@@ -819,7 +819,7 @@ __osm_mpr_rcv_get_path_parms(IN osm_mpr_rcv_t * const p_rcv,
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_mpr_rcv_get_path_parms: MultiPath params:"
 			" mtu = %u, rate = %u, packet lifetime = %u,"
-			" pkey = %u, sl = %u, hops = %u\n", mtu, rate,
+			" pkey = 0x%04X, sl = %u, hops = %u\n", mtu, rate,
 			pkt_life, cl_ntoh16(required_pkey), required_sl, hops);

       Exit:
diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
index f46a3be..4f20d8e 100644
--- a/opensm/opensm/osm_sa_path_record.c
+++ b/opensm/opensm/osm_sa_path_record.c
@@ -835,7 +835,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
 		osm_log(p_rcv->p_log, OSM_LOG_DEBUG,
 			"__osm_pr_rcv_get_path_parms: Path params:"
 			" mtu = %u, rate = %u, packet lifetime = %u,"
-			" pkey = %u, sl = %u\n",
+			" pkey = 0x%04X, sl = %u\n",
 			mtu, rate, pkt_life, cl_ntoh16(pkey), sl);
       Exit:
 	OSM_LOG_EXIT(p_rcv->p_log);
-- 
1.5.1.4
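
To see what the format change buys, the two format strings can be compared
side by side. The helper below is a standalone sketch and is not part of
OpenSM; it takes a host-order pkey and formats it the old way (%u) and the
new way (0x%04X).

```c
#include <stdio.h>
#include <stdint.h>

/* Format a host-order pkey with the old and the new format string. */
static void format_pkey(uint16_t pkey,
                        char *dec, size_t dec_len,
                        char *hex, size_t hex_len)
{
    snprintf(dec, dec_len, "pkey = %u", (unsigned)pkey);     /* before */
    snprintf(hex, hex_len, "pkey = 0x%04X", (unsigned)pkey); /* after  */
}
```

For the default full-membership pkey 0xFFFF the old format gives
"pkey = 65535" and the new one "pkey = 0xFFFF"; in hex the most significant
(membership) bit can be read off directly.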


From dotanb at dev.mellanox.co.il  Wed Dec 12 07:58:48 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Wed, 12 Dec 2007 17:58:48 +0200
Subject: [ofa-general] in PPC64 i'm able to register the code segment with
	write permission
Message-ID: <47600538.8030101@dev.mellanox.co.il>

Hi all.

I'm using the following machine attributes:
*************************************************************
Host Name         : mtlsqt185
Host Architecture : ppc64
Linux Distribution: SUSE Linux Enterprise Server 10 (ppc) VERSION = 10 
PATCHLEVEL = 1
Kernel Version    : 2.6.16.53-0.16-ppc64
GCC Version       : gcc (GCC) 4.1.2 20070115 (prerelease) (SUSE Linux)
Memory size       : 1740232 kB
Number of CPUs    : 8
cpu MHz           : 4005.000000MHz
MST Version       : 4.4.3
Driver Version    : OFED-1.2.5.4-20071210-0614
HCA ID(s)         : mlx4_0
HCA model(s)      : 25418
FW version(s)     : 2.3.906
Board(s)          : IBM08A0000001
*************************************************************

I'm executing the gen2_basic test (I guess any other test would do the 
trick too)
and I try to register the address of one of the functions (which is in the
code segment) with write permission enabled.

On all the other machines in our regression this fails, but on PPC64 it
succeeds.
When I checked the address being registered and the permissions
of its VMA, I noticed
that the VMA of this function has write permission enabled.

The function that I try to register is at address 0x1005ac80.

mtlsqt185:~ # cat /proc/17366/maps
00100000-00103000 r-xp 00100000 00:00 0
10000000-1004a000 r-xp 00000000 08:03 1063667                            /tmp/tsscr/svn.mlx_tp/branches/ofed1.2.5/gen2/userspace/useraccess/gen2_basic/gen2_basic
1005a000-1005e000 rw-p 0004a000 08:03 1063667                            /tmp/tsscr/svn.mlx_tp/branches/ofed1.2.5/gen2/userspace/useraccess/gen2_basic/gen2_basic
1005e000-1015f000 rw-p 1005e000 00:00 0                                  [heap]


Is this an IB issue?
Is this a security hole in the Linux kernel on PPC (because viruses 
could change the code in the code segment ...)?

thanks
Dotan
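
The check described above, finding which mapping holds an address and
reading its permissions, can be automated with a few lines of C. This is a
minimal sketch: maps_line_covers is a hypothetical helper, and it parses
only the "start-end perms" prefix of one /proc/<pid>/maps line.

```c
#include <stdio.h>
#include <stdbool.h>

/* Return true if addr lies inside the mapping described by one line of
 * /proc/<pid>/maps. The permission string (for example "rw-p") is copied
 * into perms, which must hold at least 5 bytes. */
static bool maps_line_covers(const char *line, unsigned long addr,
                             char perms[5])
{
    unsigned long start, end;

    if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
        return false;
    return addr >= start && addr < end; /* the end address is exclusive */
}
```

Applied to the listing above, 0x1005ac80 is outside the r-xp mapping
10000000-1004a000 and inside the rw-p mapping 1005a000-1005e000 of the same
binary, which matches the writable VMA reported in this message.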


From sashak at voltaire.com  Wed Dec 12 09:48:06 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 17:48:06 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <1197398504.8114.725.camel@hrosenstock-ws.xsigo.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
	<1197386753.8114.694.camel@hrosenstock-ws.xsigo.com>
	<20071211164657.GJ23319@sashak.voltaire.com>
	<20071211165617.GK23319@sashak.voltaire.com>
	<1197398504.8114.725.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071212174806.GL23319@sashak.voltaire.com>

Hi Hal,

On 10:41 Tue 11 Dec     , Hal Rosenstock wrote:
> > 
> > I'm not close that 'all ports' simulation in perfquery is great thing.
> 
> In a sense, it's no different than what the agent itself might be doing;
> albeit over a larger time span.
> 
> > perfquery is a low-level tool and it should be able to indicate clearly
> > that the 'all ports' option is not supported by a port instead of hiding
> > this behind simulation. Maybe 'all ports' simulation should be optional...
> 
> Guess you did a 180 turn on this. Last I recall on the list you wanted
> this functionality.

Just wanted it fully functional...

Sasha


From sashak at voltaire.com  Wed Dec 12 10:24:13 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 18:24:13 +0000
Subject: [ofa-general] Re: [PATCH] opensm: trivial change of log message
In-Reply-To: <476003A4.1090004@dev.mellanox.co.il>
References: <476003A4.1090004@dev.mellanox.co.il>
Message-ID: <20071212182413.GN23319@sashak.voltaire.com>

On 17:52 Wed 12 Dec     , Yevgeny Kliteynik wrote:
> Improving pkey format in log messages
> 
> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied. Thanks.

Sasha


From sashak at voltaire.com  Wed Dec 12 10:27:46 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 18:27:46 +0000
Subject: [ofa-general] Re: [PATCH] opensm: fixing coredump in QoS policy pkey
	validation
In-Reply-To: <475EAE8B.70206@dev.mellanox.co.il>
References: <475EAE8B.70206@dev.mellanox.co.il>
Message-ID: <20071212182746.GO23319@sashak.voltaire.com>

On 17:36 Tue 11 Dec     , Yevgeny Kliteynik wrote:
> Fixing segmentation fault in validating pkeys in QoS policy
> 
> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied. Thanks.

Sasha


From sashak at voltaire.com  Wed Dec 12 10:30:31 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 18:30:31 +0000
Subject: [ofa-general] Re: [PATCH] opensm: QoS policy - fixing pkey range
	implementation
In-Reply-To: <475EAF2C.2020507@dev.mellanox.co.il>
References: <475EAF2C.2020507@dev.mellanox.co.il>
Message-ID: <20071212183031.GP23319@sashak.voltaire.com>

On 17:39 Tue 11 Dec     , Yevgeny Kliteynik wrote:
> Fixing pkey range implementation in QoS policy.
> Ignoring the "special" most significant bit of
> the pkey was causing problems.
> 
> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied. Thanks.

Sasha


From hrosenstock at xsigo.com  Wed Dec 12 10:30:05 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 12 Dec 2007 10:30:05 -0800
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <20071212174806.GL23319@sashak.voltaire.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
	<1197386753.8114.694.camel@hrosenstock-ws.xsigo.com>
	<20071211164657.GJ23319@sashak.voltaire.com>
	<20071211165617.GK23319@sashak.voltaire.com>
	<1197398504.8114.725.camel@hrosenstock-ws.xsigo.com>
	<20071212174806.GL23319@sashak.voltaire.com>
Message-ID: <1197484205.23465.37.camel@hrosenstock-ws.xsigo.com>

Hi Sasha,

On Wed, 2007-12-12 at 17:48 +0000, Sasha Khapyorsky wrote:
> Hi Hal,
> 
> On 10:41 Tue 11 Dec     , Hal Rosenstock wrote:
> > > 
> > > I'm not sure that 'all ports' simulation in perfquery is a great thing.
> > 
> > In a sense, it's no different than what the agent itself might be doing;
> > albeit over a larger time span.
> > 
> > > perfquery is a low-level tool and it should be able to indicate clearly
> > > that the 'all ports' option is not supported by a port instead of hiding
> > > this behind simulation. Maybe 'all ports' simulation should be optional...
> > 
> > Guess you did a 180 turn on this. Last I recall on the list you wanted
> > this functionality.
> 
> Just wanted it fully functional...

And now ?

-- Hal

> 
> Sasha


From sean.hefty at intel.com  Wed Dec 12 10:30:53 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Wed, 12 Dec 2007 10:30:53 -0800
Subject: [ofa-general] RE: CMA can't establish connection with QoS on
In-Reply-To: <47600070.8050008@dev.mellanox.co.il>
References: <47600070.8050008@dev.mellanox.co.il>
Message-ID: <000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com>

>I'm having trouble using rdma when QoS is enabled on the subnet.
>Everything works fine as long as the QoS policy enforces SL=0, but when
>the policy defines SL!=0, the client application hangs.
>
>The only thing I see is that CMA sends a PathRecord query to OpenSM
>and OpenSM returns the QoS-enforced SL.
>
>I've also opened a bugzilla bug for this - num. 821
>You can see more details and instructions to reproduce it there.

I haven't tried to reproduce this yet, but looking at the code:

I didn't notice anything that either the rdma_cm or ib_cm did differently based
on the value of the SL.  The rdma_cm doesn't do anything with SL except pass
whatever it gets in the path record through to the ib_cm.  The ib_cm copies the
SL into the REQ, but it does end up being used to create an address handle to
send the MAD.  Is there any chance that the creation of the address handle is
causing the problem?  Does any other QoS related code with SL != 0 work?

- Sean


From sashak at voltaire.com  Wed Dec 12 10:47:53 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 18:47:53 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: for CAs
	query only single ports
In-Reply-To: <1197484205.23465.37.camel@hrosenstock-ws.xsigo.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
	<1197386753.8114.694.camel@hrosenstock-ws.xsigo.com>
	<20071211164657.GJ23319@sashak.voltaire.com>
	<20071211165617.GK23319@sashak.voltaire.com>
	<1197398504.8114.725.camel@hrosenstock-ws.xsigo.com>
	<20071212174806.GL23319@sashak.voltaire.com>
	<1197484205.23465.37.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071212184753.GQ23319@sashak.voltaire.com>

On 10:30 Wed 12 Dec     , Hal Rosenstock wrote:
> Hi Sasha,
> 
> On Wed, 2007-12-12 at 17:48 +0000, Sasha Khapyorsky wrote:
> > Hi Hal,
> > 
> > On 10:41 Tue 11 Dec     , Hal Rosenstock wrote:
> > > > 
> > > > I'm not sure that 'all ports' simulation in perfquery is a great thing.
> > > 
> > > In a sense, it's no different than what the agent itself might be doing;
> > > albeit over a larger time span.
> > > 
> > > > perfquery is a low-level tool and it should be able to indicate clearly
> > > > that the 'all ports' option is not supported by a port instead of hiding
> > > > this behind simulation. Maybe 'all ports' simulation should be optional...
> > > 
> > > Guess you did a 180 turn on this. Last I recall on the list you wanted
> > > this functionality.
> > 
> > Just wanted it fully functional...
> 
> And now ?

And now too. But think it should be off by default, a new command line
option could be added to turn it on.

Sasha


From Thomas.Talpey at netapp.com  Wed Dec 12 10:40:48 2007
From: Thomas.Talpey at netapp.com (Talpey, Thomas)
Date: Wed, 12 Dec 2007 13:40:48 -0500
Subject: [ofa-general] rdma cm timeout option, was [iWARP issues]
In-Reply-To: <469958e00712101115i53613b89iea3a9793a1d6d039@mail.gmail.co
 m>
References: <C98692FD98048C41885E0B0FACD9DFB805559DA6@exnane01.hq.netapp.com>
	<000701c81cd2$3d4178f0$9c98070a@amr.corp.intel.com>
	<EXNANE01hzL1f6Ykwn900000878@exnane01.hq.netapp.com>
	<472B4831.7030303@ichips.intel.com>
	<EXNANE017ddhlHs46tf0000087e@exnane01.hq.netapp.com>
	<001301c81d74$426c8840$9c98070a@amr.corp.intel.com>
	<C98692FD98048C41885E0B0FACD9DFB80555A19E@exnane01.hq.netapp.com>
	<472B71DF.2090408@ichips.intel.com>
	<C98692FD98048C41885E0B0FACD9DFB80555A1DB@exnane01.hq.netapp.com>
	<469958e00712101115i53613b89iea3a9793a1d6d039@mail.gmail.com>
Message-ID: <EXNANE01vXee4fDGCl900000b3b@exnane01.hq.netapp.com>

At 02:15 PM 12/10/2007, Caitlin Bestler wrote:
>So there is need for *some* mechanism to time out a stalled iWARP connection
>process even after a valid TCP connection is established. A vendor-specific,
>non-configurable method is probably more than adequate. But *something* has
>to be there, you cannot just rely on the TCP mechanisms. Nor can you assume
>that the TCP stack servicing RDMA connections has the same defaults as the
>host stack.

Yes, protection from upper layer timeouts is important. But I'm not
certain whether you're advocating a CM layer timeout, or a vendor-
specific requirement in each card's lower stack.

I tend to prefer the latter - to avoid assumptions in the CM. At
connection time, CM can't tell the difference between a TCP delay
and an MPA delay. This could lead to bad assumptions and misleading
errors.

Tom.


From sashak at voltaire.com  Wed Dec 12 10:56:23 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 18:56:23 +0000
Subject: [ofa-general] Re: [PATCH 1/3] OpenSM: Add null dereference checks
In-Reply-To: <33238.128.15.244.71.1197246714.squirrel@127.0.0.1>
References: <1197075265.29314.131.camel@cardanus.llnl.gov>
	<20071209192747.GH708@sashak.voltaire.com>
	<33238.128.15.244.71.1197246714.squirrel@127.0.0.1>
Message-ID: <20071212185623.GS23319@sashak.voltaire.com>

Hi Al,

On 16:31 Sun 09 Dec     , Albert Chu wrote:
> 
> > Is it possible/legal usage to have context = NULL in those destructors?
> > If not, I don't think we need such checks.
> 
> I don't think any current code logic in opensm can allow it, but I felt it
> would be prudent to add the checks for future code change safety anyways

I don't like this kind of "for sure" code - a segfault is easier to debug
than a memory leak :). If it is needed in the future, we can add it then.

> (one of my initial ideas to fix the routing engine reporting issue
> required it).  It's also in the ftree destroy function already too.

Probably just remove it there.

Sasha


From Caitlin.Bestler at neterion.com  Wed Dec 12 10:48:56 2007
From: Caitlin.Bestler at neterion.com (Caitlin Bestler)
Date: Wed, 12 Dec 2007 13:48:56 -0500
Subject: [ofa-general] rdma cm timeout option, was [iWARP issues]
In-Reply-To: <EXNANE01vXee4fDGCl900000b3b@exnane01.hq.netapp.com>
References: <C98692FD98048C41885E0B0FACD9DFB805559DA6@exnane01.hq.netapp.com>
	<000701c81cd2$3d4178f0$9c98070a@amr.corp.intel.com>
	<EXNANE01hzL1f6Ykwn900000878@exnane01.hq.netapp.com>
	<472B4831.7030303@ichips.intel.com>
	<EXNANE017ddhlHs46tf0000087e@exnane01.hq.netapp.com>
	<001301c81d74$426c8840$9c98070a@amr.corp.intel.com>
	<C98692FD98048C41885E0B0FACD9DFB80555A19E@exnane01.hq.netapp.com>
	<472B71DF.2090408@ichips.intel.com>
	<C98692FD98048C41885E0B0FACD9DFB80555A1DB@exnane01.hq.netapp.com>
	<469958e00712101115i53613b89iea3a9793a1d6d039@mail.gmail.com>
	<EXNANE01vXee4fDGCl900000b3b@exnane01.hq.netapp.com>
Message-ID: <78C9135A3D2ECE4B8162EBDCE82CAD7702B13CDD@nekter>


> -----Original Message-----
> From: Talpey, Thomas [mailto:Thomas.Talpey at netapp.com]
> Sent: Wednesday, December 12, 2007 10:41 AM
> To: Caitlin Bestler
> Cc: OpenFabrics General
> Subject: Re: [ofa-general] rdma cm timeout option, was [iWARP issues]
> 
> At 02:15 PM 12/10/2007, Caitlin Bestler wrote:
> >So there is need for *some* mechanism to time out a stalled iWARP connection
> >process even after a valid TCP connection is established. A vendor-specific,
> >non-configurable method is probably more than adequate. But *something* has
> >to be there, you cannot just rely on the TCP mechanisms. Nor can you assume
> >that the TCP stack servicing RDMA connections has the same defaults as the
> >host stack.
> 
> Yes, protection from upper layer timeouts is important. But I'm not
> certain whether you're advocating a CM layer timeout, or a vendor-
> specific requirement in each card's lower stack.
> 
> I tend to prefer the latter - to avoid assumptions in the CM. At
> connection time, CM can't tell the difference between a TCP delay
> and an MPA delay. This could lead to bad assumptions and misleading
> errors.
> 

Either approach makes sense to me. The strongest argument for the latter
is that the "pending MPA" state is one that Consumers will tend not to
be aware of, and will probably never be aware of unless someone actually
uses it for some sort of DoS attack.

So if it is actually handled by vendor specific code then consumers will
be able to remain ignorant of this intermediate step, which is probably
what they would prefer.

Of course that means that OFA, as a whole, cannot be ignorant of this,
because that would inevitably mean that some vendor will forget to do it.
So it needs to be highlighted in OFA to vendor doc, and ignored in OFA
to Consumer doc.
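
To make the "pending MPA" point concrete, here is a minimal sketch of a deadline applied after the TCP connection is already up, so that a peer which connects and then stalls cannot hold the connection in the pending state forever. It uses ordinary sockets only; the function name, the byte count and the deadline value are hypothetical and do not come from any iWARP driver or from the rdma_cm.

```python
import socket
import time


def recv_with_deadline(sock, nbytes, deadline_s):
    """Read exactly nbytes from sock, or raise TimeoutError after deadline_s.

    The deadline covers the whole read, so a peer that trickles bytes
    one at a time cannot keep the connection pending indefinitely.
    """
    end = time.monotonic() + deadline_s
    buf = b""
    while len(buf) < nbytes:
        remaining = end - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("peer did not finish connection setup in time")
        sock.settimeout(remaining)
        try:
            chunk = sock.recv(nbytes - len(buf))
        except socket.timeout:
            raise TimeoutError("peer did not finish connection setup in time") from None
        if not chunk:
            raise ConnectionError("peer closed before connection setup finished")
        buf += chunk
    return buf
```

A listener would call something like this right after accept() for the first frame it expects and drop the connection on TimeoutError. Whether that logic lives in a CM layer or in vendor-specific code is exactly the open question in this thread; the sketch takes no side on it.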


From rdreier at cisco.com  Wed Dec 12 11:09:27 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 12 Dec 2007 11:09:27 -0800
Subject: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize HCA-related
	hCalls on POWER5
In-Reply-To: <475FD0A1.3090304@voltaire.com> (Or Gerlitz's message of "Wed,
	12 Dec 2007 14:14:25 +0200")
References: <OFD9564F75.44193623-ONC12573AE.002EA542-C12573AE.002F5FBC@de.ibm.com>
	<475FD0A1.3090304@voltaire.com>
Message-ID: <adabq8v91co.fsf@cisco.com>

 > What is the fix you suggest, to add a device query that tells you for 
 > which verbs the documentation does not apply? or enhance the code of the 
 >   map_phys_fmr verb within the ehca driver to return error if called 
 > from non-sleepable context?

I think the right fix for iSER would be to make iSER work even for
devices that don't support FMRs.  For example cxgb3 doesn't implement
FMRs so if anyone ever updates iSER to work on iWARP and not just IB,
then this is something that has to be tackled anyway.  Then ehca could
just get rid of the FMR support it has.


From mshefty at ichips.intel.com  Wed Dec 12 11:12:31 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Wed, 12 Dec 2007 11:12:31 -0800
Subject: [ofa-general] RE: CMA can't establish connection with QoS on
In-Reply-To: <000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com>
References: <47600070.8050008@dev.mellanox.co.il>
	<000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com>
Message-ID: <4760329F.4020204@ichips.intel.com>

As just another data point on this:

If OpenSM is not changed to support QoS, but the rdma_cm changes the SL 
in the returned path record to 1, things work fine.  I can connect and 
transfer data between my test systems.

I'm going to enable QoS on OpenSM and retest, but I need to manually 
reconfigure my systems first.

- Sean


From caitlin.bestler at gmail.com  Wed Dec 12 11:14:00 2007
From: caitlin.bestler at gmail.com (Caitlin Bestler)
Date: Wed, 12 Dec 2007 11:14:00 -0800
Subject: [ofa-general] iWARP peer-to-peer revised proposal
In-Reply-To: <C98692FD98048C41885E0B0FACD9DFB805ACF8B9@exnane01.hq.netapp.com>
References: <C98692FD98048C41885E0B0FACD9DFB805ACF8B9@exnane01.hq.netapp.com>
Message-ID: <469958e00712121114s16384619l7bbf9aa35b51ccfe@mail.gmail.com>

On Dec 11, 2007 2:07 PM, Kanevsky, Arkady <Arkady.Kanevsky at netapp.com> wrote:
> Goal is to have something implemented now in IW_CM to solve interop
> issues which we can use as a starting point for submission to an IETF MPA
> "extension".
>
> The proposal is for an initiator side to generate the first message (RTU)
> in RDDP mode.
> RTU stands for the 3rd MPA message - Ready to Use.
>
> Initiator side:
> 1. IW_CM sends first MPA message (request).
>
> Option A: no change for MPA request.
> Option B: Steal a bit from the reserved field to indicate that Initiator
> "supports" peer-to-peer model and wants to use it.
> The default value is 0 indicating that Initiator does
> not support peer-to-peer model which is the same as current MPA format.
> Value 1 indicates support.
> Option C: The same as option B but steal a bit from private data for it
> instead of reserved field.
>

The key phrase here is "and wants to use it". For most RDMA connections,
and *all* RDMA connections for many ULPs, there is a natural active/initiator
side message that will be immediately or promptly available after connection
establishment. No special unblocking action is required, nor should one be
taken.

An active/initiator ULP that believes it is at risk of not producing a
prompt initial message should have an option to so indicate to find out if
its remote peer will let it off the hook to some degree. The two degrees
discussed to date are allowing use of zero-length RDMA Reads or RDMA Writes
to unblock.

As long as we are clear that this feature is not forced upon the ULP unless the
ULP sees a need for it, I have no problem with any of the proposals. Changes to
the MPA Header, however, should be proposed through the IETF. OFA coming
to a consensus on a proposal is fine, but it should go to the IETF after that. A
convention on use of private data could be reached within OFA alone.
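
As an illustration of what Option B would mean on the wire, here is a small sketch that packs and parses an MPA request header with one reserved bit repurposed. The layout (16-byte key, one flags byte holding M, C, R and five reserved bits, a revision byte, a 16-bit private data length) is my reading of RFC 5044 and should be checked against the RFC. The choice of which reserved bit to use, the constant FLAG_P2P_HYPOTHETICAL, and both function names are purely hypothetical; nothing here has been agreed in OFA or the IETF.

```python
import struct

MPA_REQ_KEY = b"MPA ID Req Frame"  # 16-byte key of an MPA request frame
FLAG_MARKERS = 0x80                # M bit
FLAG_CRC = 0x40                    # C bit
FLAG_REJECT = 0x20                 # R bit
FLAG_P2P_HYPOTHETICAL = 0x10       # hypothetical: one of the five reserved bits

HEADER_FORMAT = "!16sBBH"          # key, flags, revision, private data length
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def build_mpa_request(private_data, flags=0, revision=1):
    """Pack an MPA request header followed by the private data."""
    header = struct.pack(HEADER_FORMAT, MPA_REQ_KEY, flags, revision,
                         len(private_data))
    return header + private_data


def peer_wants_p2p(frame):
    """Return True if the hypothetical peer-to-peer bit is set in the frame."""
    key, flags, _revision, _pd_length = struct.unpack(
        HEADER_FORMAT, frame[:HEADER_SIZE])
    if key != MPA_REQ_KEY:
        raise ValueError("not an MPA request frame")
    return bool(flags & FLAG_P2P_HYPOTHETICAL)
```

With the default flags=0 the frame is identical to the current format, which matches the "default value is 0" wording in the proposal. Option C would leave the header alone and carry the same indication inside private_data instead.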


> [For the quick fix Option A is the easiest. For the MPA

>
> 7. For OFA interop let's agree what the RTU message type will be.
> My recommendation is unsignalled 0b Read.
>

I see no problem with OFA interop testing limiting the tested options
to those that interop participants have declared an intention to support.


From sean.hefty at intel.com  Wed Dec 12 12:14:42 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Wed, 12 Dec 2007 12:14:42 -0800
Subject: [ofa-general] opensm build error
Message-ID: <000101c83cfb$a1db70b0$ff0da8c0@amr.corp.intel.com>

I hit the following build error after pulling the latest opensm:

opensm-osm_node_desc_rcv.o(.text+0x9f): In function `__osm_nd_rcv_process_nd':
/home/mshefty/management/opensm/opensm/osm_node_desc_rcv.c:82: undefined reference to `remap_node_name'
opensm-osm_opensm.o(.text+0x301): In function `osm_opensm_destroy':
/home/mshefty/management/opensm/opensm/osm_opensm.c:187: undefined reference to `close_node_name_map'
opensm-osm_opensm.o(.text+0x6f6): In function `osm_opensm_init':
/home/mshefty/management/opensm/opensm/osm_opensm.c:314: undefined reference to `open_node_name_map'
collect2: ld returned 1 exit status
make[1]: *** [opensm] Error 1

I worked around the problem by running make && make install from
management/opensm/complib subdirectory before running make && make install from
management/opensm.  I don't know if this is a problem with the documentation, or
if the build process itself could work around this.

- Sean


From rdreier at cisco.com  Wed Dec 12 12:26:41 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 12 Dec 2007 12:26:41 -0800
Subject: [ofa-general] Re: [PATCH] IB/ehca: Return correct #SGEs for SRQ
In-Reply-To: <200712101220.57876.fenkes@de.ibm.com> (Joachim Fenkes's message
	of "Mon, 10 Dec 2007 12:20:57 +0100")
References: <200712101220.57876.fenkes@de.ibm.com>
Message-ID: <ada7ijj8xry.fsf@cisco.com>

thanks, applied.


From kilian at stanford.edu  Wed Dec 12 12:28:14 2007
From: kilian at stanford.edu (Kilian CAVALOTTI)
Date: Wed, 12 Dec 2007 12:28:14 -0800
Subject: [ofa-general] Cisco Topspin drivers available
In-Reply-To: <C2F174F99918D54CA2A96E57C5079B6F3551AD@sbc-exmsg2.sbcounty.gov>
References: <C2F174F99918D54CA2A96E57C5079B6F3551AD@sbc-exmsg2.sbcounty.gov>
Message-ID: <200712121228.15086.kilian@stanford.edu>

On Tuesday 11 December 2007 11:07:55 am Sufficool, Stanley wrote:
> Mellanox site has instructions on identifying the PSID of the card (
> http://www.mellanox.com/support/HCA_FW_identification.php ) .

I'm afraid this won't work. Cisco HCAs are provided with a modified 
firmware, and the Board ID is not the standard Mellanox one:

# flint -d /dev/mst/mt25204_pci_cr0 query
Image type:      Failsafe
FW Version:      1.2.917
I.S. Version:    1
Device ID:       25204
Chip Revision:   A0
Description:     Node             Port1            Sys image
GUIDs:           0005ad000008c9d8 0005ad000008c9d9 0005ad000100d050
Board ID:        4��  <- should be something like MT_0xxxx
VSD:             4��
PSID:

# ibv_devinfo
hca_id: mthca0
        fw_ver:                         1.2.917
        node_guid:                      0005:ad00:0008:c9d8
        sys_image_guid:                 0005:ad00:0100:d050
        vendor_id:                      0x05ad
        vendor_part_id:                 25204
        hw_ver:                         0xA0
        board_id:                       HCA.Cheetah-DDR.20      
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 2
                        port_lid:               132
                        port_lmc:               0x00


Although I think it should be safe to flash a Cisco HCA with a Mellanox 
firmware (I doubt the chips are different), identifying the right image 
to apply is more challenging.

Cheers,
-- 
Kilian


From rdreier at cisco.com  Wed Dec 12 12:29:09 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 12 Dec 2007 12:29:09 -0800
Subject: [ofa-general] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls if
	necessary
In-Reply-To: <200712101859.11218.fenkes@de.ibm.com> (Joachim Fenkes's message
	of "Mon, 10 Dec 2007 18:59:10 +0100")
References: <200712101859.11218.fenkes@de.ibm.com>
Message-ID: <ada3au78xnu.fsf@cisco.com>

thanks, applied.

With your next batch of patches for 2.6.25, could you clean up:

 > --- a/drivers/infiniband/hw/ehca/hcp_if.c
 > +++ b/drivers/infiniband/hw/ehca/hcp_if.c
 > @@ -89,6 +89,7 @@
 >  #define HCALL9_REGS_FORMAT HCALL7_REGS_FORMAT " r11=%lx r12=%lx"
 >  
 >  static DEFINE_SPINLOCK(hcall_lock);
 > +extern int ehca_lock_hcalls;

and move that extern declaration into an appropriate header file?

Thanks...


From sashak at voltaire.com  Wed Dec 12 13:30:57 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 21:30:57 +0000
Subject: [ofa-general] Re: [PATCH 2/3] OpenSM: Fix incorrect reporting of
	routing engine/algorithm used
In-Reply-To: <33254.128.15.244.71.1197249143.squirrel@127.0.0.1>
References: <1197075342.29314.133.camel@cardanus.llnl.gov>
	<20071209204822.GI708@sashak.voltaire.com>
	<33254.128.15.244.71.1197249143.squirrel@127.0.0.1>
Message-ID: <20071212213057.GU23319@sashak.voltaire.com>

Hi Al,

On 17:12 Sun 09 Dec     , Albert Chu wrote:
> >
> > So I'm not sure that just "renaming" helps a lot.
> 
> Maybe I'm misunderstanding your comments or maybe I didn't explain the
> issue and patch well enough.

No, actually it seems that I misread this part of the patch. My fault,
sorry. I will comment again shortly.

Sasha


From kliteyn at dev.mellanox.co.il  Wed Dec 12 13:44:00 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Wed, 12 Dec 2007 23:44:00 +0200
Subject: [ofa-general] Re: CMA can't establish connection with QoS on
In-Reply-To: <000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com>
References: <47600070.8050008@dev.mellanox.co.il>
	<000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com>
Message-ID: <47605620.3070105@dev.mellanox.co.il>

Sean Hefty wrote:
>> I'm having trouble using rdma when QoS is enabled on the subnet.
>> Everything works fine as long as the QoS policy enforces SL=0, but when
>> the policy defines SL!=0, the client application hangs.
>>
>> The only thing I see is that CMA sends a PathRecord query to OpenSM
>> and OpenSM returns the QoS-enforced SL.
>>
>> I've also opened a bugzilla bug for this - num. 821
>> You can see more details and instructions to reproduce it there.
> 
> I haven't tried to reproduce this yet, but looking at the code:
> 
> I didn't notice anything that either the rdma_cm or ib_cm did differently based
> on the value of the SL.  The rdma_cm doesn't do anything with SL except pass
> whatever it gets in the path record through to the ib_cm.  The ib_cm copies the
> SL into the REQ, but it does end up being used to create an address handle to
> send the MAD.  Is there any chance that the creation of the address handle is
> causing the problem?  Does any other QoS related code with SL != 0 work?

Not sure if it helps, but "ibv_rc_pingpong -l <any SL>" works.

-- Yevgeny

> - Sean
> 


From sashak at voltaire.com  Wed Dec 12 14:05:22 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 22:05:22 +0000
Subject: [ofa-general] opensm build error
In-Reply-To: <000101c83cfb$a1db70b0$ff0da8c0@amr.corp.intel.com>
References: <000101c83cfb$a1db70b0$ff0da8c0@amr.corp.intel.com>
Message-ID: <20071212220522.GV23319@sashak.voltaire.com>

Hi Sean,

On 12:14 Wed 12 Dec     , Sean Hefty wrote:
> I hit the following build error after pulling the latest opensm:
> 
> opensm-osm_node_desc_rcv.o(.text+0x9f): In function `__osm_nd_rcv_process_nd':
> /home/mshefty/management/opensm/opensm/osm_node_desc_rcv.c:82: undefined reference to `remap_node_name'
> opensm-osm_opensm.o(.text+0x301): In function `osm_opensm_destroy':
> /home/mshefty/management/opensm/opensm/osm_opensm.c:187: undefined reference to `close_node_name_map'
> opensm-osm_opensm.o(.text+0x6f6): In function `osm_opensm_init':
> /home/mshefty/management/opensm/opensm/osm_opensm.c:314: undefined reference to `open_node_name_map'
> collect2: ld returned 1 exit status
> make[1]: *** [opensm] Error 1
> 
> I worked around the problem by running make && make install from
> management/opensm/complib subdirectory before running make && make install from
> management/opensm.  I don't know if this is a problem with the documentation, or
> if the build process itself could work around this.

Thanks for reporting this. I think it is a bug in the OpenSM build:
-L$(libdir) is placed in the opensm linker command before the local path,
so you get such results.

The actual problem is that the OpenSM configure script adds common OpenSM
library paths (for libibumad, etc.), which include $(libdir), to LDFLAGS
instead of LDADD. LDFLAGS go before LDADD in a linker command
line, and $(libdir)/libosmcomp.so is grabbed this way.

The patch will follow shortly.

Sasha
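
For anyone who wants to see the ordering effect in isolation, below is a minimal sketch that models how a linker resolves -l<name> by scanning its -L directories in command-line order. It is a simulation with empty placeholder files, not the real linker or the OpenSM build; the directory names usr_lib and build_complib and the helper resolve_library are made up for the illustration.

```python
import os
import tempfile


def resolve_library(name, search_dirs):
    """Return the first 'lib<name>.so' found, scanning search_dirs in order."""
    for directory in search_dirs:
        candidate = os.path.join(directory, "lib%s.so" % name)
        if os.path.exists(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as root:
    installed = os.path.join(root, "usr_lib")    # stands in for $(libdir)
    local = os.path.join(root, "build_complib")  # stands in for the local build dir
    for directory in (installed, local):
        os.makedirs(directory)
        # Empty placeholder files; only their location matters here.
        open(os.path.join(directory, "libosmcomp.so"), "w").close()

    # Search path coming from LDFLAGS first: the installed copy is found first.
    stale_first = resolve_library("osmcomp", [installed, local])
    # Local path first: the freshly built copy is found first.
    local_first = resolve_library("osmcomp", [local, installed])

    print(os.path.basename(os.path.dirname(stale_first)))
    print(os.path.basename(os.path.dirname(local_first)))
```

If the installed copy is older than the tree being built, picking it up first would explain undefined references to newly added symbols such as the ones in Sean's log.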


From sashak at voltaire.com  Wed Dec 12 14:15:02 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 22:15:02 +0000
Subject: [ofa-general] [PATCH] opensm/config/osmvsel.m4: update LDADD
	variable, not LDFLAGS
In-Reply-To: <20071212220522.GV23319@sashak.voltaire.com>
References: <000101c83cfb$a1db70b0$ff0da8c0@amr.corp.intel.com>
	<20071212220522.GV23319@sashak.voltaire.com>
Message-ID: <20071212221502.GW23319@sashak.voltaire.com>


When OSMV_LDADD is selected it should update the LDADD variable and not
LDFLAGS - many Makefiles in OpenSM use prog_LDADD for their own
additions, and since LDFLAGS go before any LDADD in a
linker command such customizations could be skipped.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/config/osmvsel.m4 |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/opensm/config/osmvsel.m4 b/opensm/config/osmvsel.m4
index aa20c2f..2350b32 100644
--- a/opensm/config/osmvsel.m4
+++ b/opensm/config/osmvsel.m4
@@ -139,11 +139,9 @@ if test "$disable_libcheck" != "yes"; then
 
  dnl based on the with_osmv we can try the vendor flag
  if test $with_osmv = "openib"; then
-   osmv_save_ldflags=$LDFLAGS
-   LDFLAGS="$LDFLAGS $OSMV_LDADD"
+   LDADD="$LDADD $OSMV_LDADD"
    AC_CHECK_LIB(ibumad, umad_init, [],
 	 AC_MSG_ERROR([umad_init() not found. libosmvendor of type openib requires libibumad.]))
-   LD_FLAGS=$osmv_save_ldflags
  elif test $with_osmv = "sim" ; then
    LDFLAGS="$LDFLAGS -L$with_sim/lib"
    AC_CHECK_FILE([$with_sim/lib/libibmscli.a], [], 
-- 
1.5.3.4.206.g58ba4


From weiny2 at llnl.gov  Wed Dec 12 14:09:47 2007
From: weiny2 at llnl.gov (Ira Weiny)
Date: Wed, 12 Dec 2007 14:09:47 -0800
Subject: [ofa-general] [PATCH] mstflint: Convert project to autoconf tools.
In-Reply-To: <20071211130000.41b3025b.weiny2@llnl.gov>
References: <20071210133554.44bac886.weiny2@llnl.gov>
	<475EF4F6.6000401@mellanox.co.il>
	<20071211130000.41b3025b.weiny2@llnl.gov>
Message-ID: <20071212140947.2d039a85.weiny2@llnl.gov>

Well, as luck would have it the patch has a bug.  I accidentally had the same
.c files for mstmread and mstregdump in Makefile.am.

This new attached patch should be correct,
Ira

On Tue, 11 Dec 2007 13:00:00 -0800
Ira Weiny <weiny2 at llnl.gov> wrote:

> On Tue, 11 Dec 2007 22:37:10 +0200
> Tziporet Koren <tziporet at dev.mellanox.co.il> wrote:
> 
> > Adding Oren who is mstflint maintainer
> 
> Ah my bad, sorry.
> 
> Thanks,
> Ira
> 
> > 
> > Tziporet
> > 
> > Ira Weiny wrote:
> > > This patch removes the makefile and converts the mstflint git tree over to
> > > autoconf tools.  This works great on x86_64 but has not been tested on other
> > > arch's.  (Although it is simple enough I don't see how it would not work.)
> > >
> > > Thanks,
> > > Ira
> > >

>From e453fd0f1773e9e3715d16ca1641948191c84b07 Mon Sep 17 00:00:00 2001
From: Ira K. Weiny <weiny2 at wopri.(none)>
Date: Mon, 10 Dec 2007 13:30:22 -0800
Subject: [PATCH] Convert project to autoconf tools.


Signed-off-by: Ira K. Weiny <weiny2 at woprjr0.(none)>
---
 Makefile         |   47 -----------------------------------------------
 Makefile.am      |   21 +++++++++++++++++++++
 autogen.sh       |   11 +++++++++++
 configure.in     |   22 ++++++++++++++++++++++
 mstflint.spec.in |   45 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 99 insertions(+), 47 deletions(-)
 delete mode 100644 Makefile
 create mode 100644 Makefile.am
 create mode 100755 autogen.sh
 create mode 100644 configure.in
 create mode 100644 mstflint.spec.in

diff --git a/Makefile b/Makefile
deleted file mode 100644
index 889c97a..0000000
--- a/Makefile
+++ /dev/null
@@ -1,47 +0,0 @@
-#default options
-CFLAGS += -O2
-CFLAGS += -g
-CFLAGS += -Wall
-CXXFLAGS += -fno-exceptions
-CFLAGS += -I.
-LD=$(CXX)
-EXTRA_LOADLIBES=-lz
-LOADLIBES+=${EXTRA_LOADLIBES}
-
-all: default
-bin: mstflint mstmread mstmwrite mstregdump mstvpd
-
-default: bin
-static: bin
-shared: bin
-
-.PHONY: all bin clean static shared default
-.DELETE_ON_ERROR:
-
-default: EXTRA_LOADLIBES="$(shell $(CXX) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libz.a)" "$(shell $(CXX)  ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libstdc++.a)"
-default: LD=$(CC)
-static: CFLAGS+=-static
-
-mstflint: mstflint.o mflash.o
-	$(LD) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} mstflint.o mflash.o -o mstflint ${LOADLIBES}
-
-mstflint.o: flint.cpp mflash.h
-	$(CXX) ${CFLAGS} ${CXXFLAGS} -c flint.cpp -o mstflint.o
-
-mflash.o: mtcr.h mflash.c mflash.h
-	$(CC) ${CFLAGS} -c mflash.c -o mflash.o
-
-mstmwrite: mwrite.c mtcr.h
-	$(CC) ${CFLAGS} mwrite.c -o mstmwrite
-
-mstmread: mread.c mtcr.h
-	$(CC) ${CFLAGS} mread.c -o mstmread
-
-mstregdump: mstdump.c mtcr.h
-	$(CC) ${CFLAGS} mstdump.c -o mstregdump
-
-mstvpd: vpd.c
-	$(CC) ${CFLAGS} vpd.c -o mstvpd
-
-clean:
-	rm -f mstvpd mstregdump mstflint mstmread mstmwrite mstflint.o mflash.o
diff --git a/Makefile.am b/Makefile.am
new file mode 100644
index 0000000..1a71e75
--- /dev/null
+++ b/Makefile.am
@@ -0,0 +1,21 @@
+bin_PROGRAMS = mstmread \
+					mstmwrite \
+					mstflint \
+					mstregdump \
+					mstvpd
+
+mstmread_SOURCES = mread.c mtcr.h
+
+mstmwrite_SOURCES = mwrite.c mtcr.h
+
+mstflint_SOURCES = flint.cpp mtcr.h mflash.h mflash.c
+mstflint_LDFLAGS = -lz
+
+mstregdump_SOURCES = mstdump.c mtcr.h
+
+mstvpd_SOURCES = vpd.c
+
+
+EXTRA_DIST = \
+	mstflint.spec
+
diff --git a/autogen.sh b/autogen.sh
new file mode 100755
index 0000000..4827884
--- /dev/null
+++ b/autogen.sh
@@ -0,0 +1,11 @@
+#! /bin/sh
+
+# create config dir if not exist
+test -d config || mkdir config
+
+set -x
+aclocal -I config
+libtoolize --force --copy
+autoheader
+automake --foreign --add-missing --copy
+autoconf
diff --git a/configure.in b/configure.in
new file mode 100644
index 0000000..0924d65
--- /dev/null
+++ b/configure.in
@@ -0,0 +1,22 @@
+dnl Process this file with autoconf to produce a configure script.
+
+AC_INIT(mstflint)
+
+AC_DEFINE_UNQUOTED([PROJECT], ["mstflint"], [Define the project name.])
+AC_SUBST([PROJECT])
+
+AC_DEFINE_UNQUOTED([VERSION], ["1.3"], [Define the project version.])
+AC_SUBST([VERSION])
+
+AC_CONFIG_AUX_DIR(config)
+AC_CONFIG_SRCDIR([README])
+AM_INIT_AUTOMAKE(mstflint, 1.3)
+
+dnl Checks for programs
+AC_PROG_CC
+AC_PROG_CXX
+AC_PROG_LIBTOOL
+AC_CONFIG_HEADERS
+
+AC_CONFIG_FILES([Makefile mstflint.spec])
+AC_OUTPUT
diff --git a/mstflint.spec.in b/mstflint.spec.in
new file mode 100644
index 0000000..b5937be
--- /dev/null
+++ b/mstflint.spec.in
@@ -0,0 +1,45 @@
+Summary: Mellanox firmware burning application
+Name: mstflint
+Version: @VERSION@
+Release: 1
+License: GPL/BSD
+Url: http://openib.org/
+Group: System Environment/Base
+BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}
+Source: mstflint-@VERSION@.tar.gz
+ExclusiveArch: i386 x86_64 ia64 ppc ppc64
+BuildRequires: zlib-devel
+Requires(post): chkconfig
+
+%description
+This package contains a tool for burning updated firmware on to
+Mellanox manufactured InfiniBand adapters.
+
+%prep
+%setup -q
+
+%build
+%configure
+make
+
+%install
+rm -rf $RPM_BUILD_ROOT
+make DESTDIR=${RPM_BUILD_ROOT} install
+# remove unpackaged files from the buildroot
+rm -f $RPM_BUILD_ROOT%{_libdir}/*.la
+
+%clean
+rm -rf $RPM_BUILD_ROOT
+
+%files
+%defattr(-,root,root)
+%{_bindir}/mstmread
+%{_bindir}/mstmwrite
+%{_bindir}/mstflint
+%{_bindir}/mstregdump
+%{_bindir}/mstvpd
+
+%changelog
+* Fri Dec 07 2007 Ira Weiny <weiny2 at llnl.gov> 1.0.0
+   initial creation
+
-- 
1.5.1

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Convert-project-to-autoconf-tools.patch
Type: application/octet-stream
Size: 4664 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071212/68f9c744/attachment.obj>

From rdreier at cisco.com  Wed Dec 12 14:10:13 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 12 Dec 2007 14:10:13 -0800
Subject: [ofa-general] [PATCH 6/6] IB/ipath - Add the work completion
	error code to the QP error debug output
In-Reply-To: <20071207160104.13957.31784.stgit@eng-46.internal.keyresearch.com>
	(Arthur Jones's message of "Fri, 07 Dec 2007 08:01:04 -0800")
References: <20071207160033.13957.93150.stgit@eng-46.internal.keyresearch.com>
	<20071207160104.13957.31784.stgit@eng-46.internal.keyresearch.com>
Message-ID: <aday7bz7eey.fsf@cisco.com>

thanks, applied all 6 for 2.6.25


From prescott at hpc.ufl.edu  Wed Dec 12 14:19:06 2007
From: prescott at hpc.ufl.edu (Craig Prescott)
Date: Wed, 12 Dec 2007 17:19:06 -0500
Subject: [ofa-general] SDP and iWARP in OFED 1.2?
Message-ID: <47605E5A.8060704@hpc.ufl.edu>


Hi;

I am trying to run the netperf SDP_STREAM test between a
pair of Chelsio S310-SR iWARP cards in x86_64 hosts
(CentOS 5.0, OFED 1.2).

My netperf client command starts like so:

[root@tebow1 ~]# LD_PRELOAD=/usr/lib64/libsdp.so /opt/netperf/bin/netperf -l 10 -H xx.xx.xx.xx -L yy.yy.yy.yy -c -C -t SDP_STREAM
SDP STREAM TEST from yy.yy.yy.yy (yy.yy.yy.yy) port 0 AF_INET to xx.xx.xx.xx (xx.xx.xx.xx) port 0 AF_INET

And then the client panics.  The RIP on the console refers to
ib_sdp:sdp_connected_handler, and I can see rdma_cm:cma_iw_handler
and iw_cm:cm_work_handler in the traceback.  From strace on the
client, the last output I get is:

connect(8, {sa_family=AF_INET, sin_port=htons(12865), sin_addr=inet_addr("128.227.253.92")}, 128

I can see the connection on the netserver host in sdpnetstat:

Proto Recv-Q Send-Q Local Address           Foreign Address
sdp        0      0 xx.xx.xx.xx:51804       yy.yy.yy.yy:48590

My libsdp.conf on both hosts is very simple:

use both server * *:*
use both client * *:*

The hosts I'm using also have 4X SDR IB HCAs (mthca),
if that matters.  The SDP_STREAM test runs perfectly
over IB.

Is SDP on iWARP supposed to work in OFED 1.2?  Any advice
or hints on getting it working would be most appreciated.

Thanks,
Craig


From sweitzen at cisco.com  Wed Dec 12 15:25:56 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Wed, 12 Dec 2007 15:25:56 -0800
Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes
	inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
In-Reply-To: <000001c8338c$206e80d0$614b8270$@rr.com>
References: <47445630.10000@dev.mellanox.co.il><F57121538EA0C94F86018DDD40ADA1D1911FFC@mtiexch01.mti.com><A15335FBE9BD2449AF2C9EF3D1EB8EA304A34151@xmb-sjc-216.amer.cisco.com>
	<000001c8338c$206e80d0$614b8270$@rr.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304B048E4@xmb-sjc-216.amer.cisco.com>

Jim, when do you plan to enable bzcopy by default?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems


> -----Original Message-----
> From: general-bounces at lists.openfabrics.org 
> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Jim Mott
> Sent: Friday, November 30, 2007 12:04 PM
> To: ewg at lists.openfabrics.org
> Cc: general at lists.openfabrics.org
> Subject: [ofa-general] RE: [ewg] Not seeing any SDP 
> performance changes inOFED 1.3 beta, and I get Oops when 
> enabling sdp_zcopy_thresh
> 
> Hi,
>   This kernel Oops is new and I will look at it.  Dotan and 
> the Mellanox regression tests have been keeping me busy 
> recently.  There
> was a problem like this, but only in multi-threaded apps 
> using a single socket or when doing cleanup after ^C.
> 
>   I will re-enable default bzcopy behavior once all the 
> important Mellanox regression tests are passing.  Until then, 
> setting the
> sdp_zcopy_thresh variable by hand (8192 and up should give 
> better performance) and running simple tests like netperf should be
> working fine.  You should not be seeing any problem here.  [I 
> have only tested locally with x86_64 rhat4u4, rhat5, 2.6.23.8, and
> 2.6.24-rc2.  Mellanox regression tests everything and they 
> have not submitted this Oops yet.]
> 
>   I have opened bugs in the openfabrics bugzilla for 
> everything I am currently working on.  It is down right now 
> or I would add
> pointers.
> 
> 
> Here is my work list; additions or priority changes welcome:
> 
> SDP OPEN ISSUES LIST (Priority order)
> =====================================
> 1) DONE: BUG: Unload of mlx4 and ib_sdp fails while SDP active
>   11/6 [PATCH 1/1 V2] SDP - Fix reference count bug ...
> 
> 2) DONE: BUG: Many data corruption failures
>   11/11 [PATCH 1/1] SDP - Fix bug where zcopy bcopy returns ...
> 
> 3) DONE: Bug 793 - kernel BUG at net/core/skbuff.c:95!
>   11/26 [PATCH 1/1] SDP - bug793; skbuff changes ...
> 
> 4) TODO: BUG: kernel oops in SDP regression 
>   Replicated problem by hitting ^C during a transfer.  I have 
> created a patch that fixes the problem, but it needs more work
> to move into production.  There are some side effects I do not
> yet understand.
>   This is the one I am working on now.  I hope to drop it soon.
> There is a bug open tracking it.
> 
> 5) TODO: BUG: libsdp returns good RC when it should fail
> 
> 6) TODO: BUG: aio_test fails in SDP regression
> 
> 7) TODO: Bug 779 - Lock ordering problem during accept on 1.2.5
>   After building a 2.6.23.8 kernel with lock checking enabled, I
> can not reproduce this problem.  Looks like I'll need more input
> from the reporter.  (Bug updated to say this).  I will continue to
> code review though.
> 
> 8) DONE: Bug 294 - connect does not allow AF_INET_SDP
>   [fix in bugzilla dropped] 
> 
> 9) DONE: Backport work needed to support 2.6.24
> 
> 10) TODO: Package user space libsdp for Redhat
>   This is supposed to be easy to do, but it will take me some time
> to figure out the detail.  
> 
> 11) DONE: BUG: Memory leak
>   11/20 [PATCH 1/1 v2] SDP - Fix a memory leak in bzcopy
> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org 
> [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Scott 
> Weitzenkamp (sweitzen)
> Sent: Friday, November 30, 2007 12:37 PM
> To: Jim Mott; Scott Weitzenkamp (sweitzen); ewg at lists.openfabrics.org
> Cc: general at lists.openfabrics.org
> Subject: [ewg] Not seeing any SDP performance changes in OFED 
> 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
> 
> Jim,
> 
> Using netperf with TCP_STREAM and TCP_RR, I'm not seeing any 
> changes in
> SDP throughput or CPU utilization comparing OFED 1.3 beta and OFED
> 1.2.5.  Looks like I need to set a non-zero value in
> /sys/module/ib_sdp/sdp_zcopy_thresh?  Do you plan to enable this by
> default soon?
> 
> I tried "echo 4096 > /sys/module/ib_sdp/sdp_zcopy_thresh" on RHEL4 and
> then tried netperf, and got an Oops.
> 
> Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
> <Nov/30 10:33 am><ffffffff80163ff0>{put_page+0}
> <Nov/30 10:33 am>PML4 1a3047067 PGD 1a7a6d067 PMD 0
> <Nov/30 10:33 am>Oops: 0000 [1] SMP
> <Nov/30 10:33 am>CPU 0
> <Nov/30 10:33 am>Modules linked in: parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc rdma_ucm(U) rds(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) mlx4_ib(U) mlx4_core(U) ds yenta_socket pcmcia_core dm_mirror dm_multipath dm_mod joydev button battery ac uhci_hcd ehci_hcd shpchp ib_mthca(U) ib_ipoib(U) ib_umad(U) ib_ucm(U) ib_uverbs(U) ib_cm(U) ib_sa(U) ib_mad(U) ib_core(U) md5 ipv6 e1000 floppy ata_piix libata sg ext3 jbd mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
> <Nov/30 10:33 am>Pid: 6802, comm: netperf241 Not tainted 2.6.9-55.ELlargesmp
> <Nov/30 10:33 am>RIP: 0010:[<ffffffff80163ff0>] <ffffffff80163ff0>{put_page+0}
> <Nov/30 10:33 am>RSP: 0018:00000101a7bcbbc0  EFLAGS: 00010203
> <Nov/30 10:33 am>RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000202
> <Nov/30 10:33 am>RDX: 00000101b0b43e80 RSI: 0000000000000202 RDI: 0000000000000000
> <Nov/30 10:33 am>RBP: 00000101b85761c0 R08: 0000000000000000 R09: 0000000000000000
> <Nov/30 10:33 am>R10: 0000000000000246 R11: ffffffffa02e0e36 R12: 00000101a4b33080
> <Nov/30 10:33 am>R13: 00000101a7bcbd58 R14: 0000000000000000 R15: 0000000000010000
> <Nov/30 10:33 am>FS:  0000002a95696940(0000) GS:ffffffff80500380(0000) knlGS:0000000000000000
> <Nov/30 10:33 am>CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> <Nov/30 10:33 am>CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006e0
> <Nov/30 10:33 am>Process netperf241 (pid: 6802, threadinfo 00000101a7bca000, task 00000101a70df030)
> <Nov/30 10:33 am>Stack: ffffffffa02e110a 0000000000000100 0000000000000000 0000000000529780
> <Nov/30 10:33 am>       0001000000000246 0000000000000246 000000008013feac 00000800ffffffe0
> <Nov/30 10:33 am>       0000000000000000 00000101a7bcbe88
> <Nov/30 10:33 am>Call Trace:<ffffffffa02e110a>{:ib_sdp:sdp_sendmsg+724} <ffffffff801478b2>{queue_delayed_work+101}
> <Nov/30 10:33 am>       <ffffffffa02c6200>{:ib_addr:queue_req+122} <ffffffff802a7ecb>{sock_sendmsg+271}
> <Nov/30 10:33 am>       <ffffffff80169a61>{do_no_page+916} <ffffffff801359a8>{autoremove_wake_function+0}
> <Nov/30 10:33 am>       <ffffffff802a7c53>{sockfd_lookup+16} <ffffffff802a939a>{sys_sendto+195}
> <Nov/30 10:33 am>       <ffffffff801242b9>{do_page_fault+577} <ffffffff801934c8>{dnotify_parent+34}
> <Nov/30 10:33 am>       <ffffffff80179335>{vfs_read+248} <ffffffff8011026a>{system_call+126}
> <Nov/30 10:33 am>
> 
> <Nov/30 10:33 am>Code: 8b 07 48 89 fa f6 c4 80 74 3b 48 8b 57 10 8b 02 48 89 d1 f6
> <Nov/30 10:33 am>RIP <ffffffff80163ff0>{put_page+0} RSP <00000101a7bcbbc0>
> <Nov/30 10:33 am>CR2: 0000000000000000
> <Nov/30 10:33 am> <0>Kernel panic - not syncing: Oops
> 
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems


From sashak at voltaire.com  Wed Dec 12 15:46:17 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 23:46:17 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/ibcheckerrors: fix port
	errors count
In-Reply-To: <20071211152727.GI23319@sashak.voltaire.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
Message-ID: <20071212234617.GX23319@sashak.voltaire.com>


This fixes the port error counter (pcnterr) update and also removes
unused variables.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 infiniband-diags/scripts/ibcheckerrors.in |   18 +++++++-----------
 1 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/infiniband-diags/scripts/ibcheckerrors.in b/infiniband-diags/scripts/ibcheckerrors.in
index 5cfabc6..a45bd63 100644
--- a/infiniband-diags/scripts/ibcheckerrors.in
+++ b/infiniband-diags/scripts/ibcheckerrors.in
@@ -21,7 +21,6 @@ brief=""
 v=0
 ntype=""
 nodeguid=""
-oldlid=""
 topofile=""
 ca_info=""
 
@@ -81,14 +80,14 @@ BEGIN {
 }
 function check_node(lid, port)
 {
-	nodechecked=1
 	if (system("'$IBPATH'/ibchecknode '"$ca_info"' '$gflags' '$verbose' " lid)) {
 		ne++
 		print "\n# " ntype ": nodeguid 0x" nodeguid " failed"
-		return
+		return 1;
 	}
 	if (system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port))
-		nodeerr=1;
+		return 2;
+	return 0;
 }
 
 /^Ca/ || /^Switch/ || /^Rt/ {
@@ -97,15 +96,13 @@ function check_node(lid, port)
 			if ('$v')
 				print "\n# Checking " ntype ": nodeguid 0x" nodeguid
 
-			nodechecked=0
-			nodeerr=0
-			badnode=0
+			err = 0;
 			if (ntype != "Switch")
 				next
 
 			lid = substr($0, index($0, "port 0 lid ") + 11)
 			lid = substr(lid, 1, index(lid, " ") - 1)
-			check_node(lid, 255)
+			err = check_node(lid, 255)
 		}
 /^\[/	{
 		nports++
@@ -115,10 +112,9 @@ function check_node(lid, port)
 		if (ntype != "Switch") {
 			lid = substr($0, index($0, " lid ") + 5)
 			lid = substr(lid, 1, index(lid, " ") - 1)
-			check_node(lid, port)
-			if (nodeerr)
+			if (check_node(lid, port) == 2)
 				pcnterr++;
-		} else if (nodeerr &&
+		} else if (err &&
 			   system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port))
 			pcnterr++;
 }
-- 
1.5.3.4.206.g58ba4
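
The return-value convention that check_node() adopts in this patch can be
sketched in C roughly as follows. This is only an illustration, not code
from infiniband-diags: the enum names, the node_ok/counters_ok flags (which
stand in for the exit statuses of ibchecknode and ibcheckerrs) and
count_port_error() are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical names mirroring the values check_node() returns in the patch. */
enum check_result {
	CHECK_OK = 0,          /* both checks passed */
	CHECK_NODE_FAILED = 1, /* ibchecknode reported a failure */
	CHECK_PORT_ERRORS = 2  /* ibcheckerrs reported counter errors */
};

/*
 * node_ok and counters_ok stand in for the exit statuses of ibchecknode
 * and ibcheckerrs; the counter check only matters when the node check passed.
 */
static enum check_result check_node(bool node_ok, bool counters_ok)
{
	if (!node_ok)
		return CHECK_NODE_FAILED;
	if (!counters_ok)
		return CHECK_PORT_ERRORS;
	return CHECK_OK;
}

/* Only counter errors bump the per-port error count (pcnterr in the script). */
static int count_port_error(int pcnterr, enum check_result res)
{
	return res == CHECK_PORT_ERRORS ? pcnterr + 1 : pcnterr;
}
```

Returning a distinct code for each failure lets the caller tell a failed
node apart from a port with counter errors without relying on globals such
as nodeerr or badnode.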


From sashak at voltaire.com  Wed Dec 12 15:47:37 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 12 Dec 2007 23:47:37 +0000
Subject: [ofa-general] [PATCH] infiniband-diags/scripts: fix perfquery usage
In-Reply-To: <20071211152727.GI23319@sashak.voltaire.com>
References: <A15335FBE9BD2449AF2C9EF3D1EB8EA304A34FE4@xmb-sjc-216.amer.cisco.com>
	<1196863780.30768.293.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A35082@xmb-sjc-216.amer.cisco.com>
	<1196873579.30768.327.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A968DB@xmb-sjc-216.amer.cisco.com>
	<1196874875.30768.334.camel@hrosenstock-ws.xsigo.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A96901@xmb-sjc-216.amer.cisco.com>
	<20071211134632.GC23319@sashak.voltaire.com>
	<1197385068.8114.676.camel@hrosenstock-ws.xsigo.com>
	<20071211152727.GI23319@sashak.voltaire.com>
Message-ID: <20071212234737.GY23319@sashak.voltaire.com>


This is similar to the ibcheckerrors fix: don't use the 'all ports'
option when querying and resetting CA ports.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 infiniband-diags/scripts/ibchecknet.in      |   46 ++++++++++-----------------
 infiniband-diags/scripts/ibclearcounters.in |    6 +---
 infiniband-diags/scripts/ibclearerrors.in   |    4 +--
 infiniband-diags/scripts/ibdatacounters.in  |   41 +++++++----------------
 4 files changed, 32 insertions(+), 65 deletions(-)

diff --git a/infiniband-diags/scripts/ibchecknet.in b/infiniband-diags/scripts/ibchecknet.in
index b6e0945..ebcf22d 100644
--- a/infiniband-diags/scripts/ibchecknet.in
+++ b/infiniband-diags/scripts/ibchecknet.in
@@ -72,16 +72,16 @@ BEGIN {
 	ne=0
 	pe=0
 }
-function check_node(lid)
+function check_node(lid, port)
 {
-	nodechecked=1
-	if (system("'$IBPATH'/ibchecknode'"$ca_info"' '$gflags' '$verbose' " lid)) {
+	if (system("'$IBPATH'/ibchecknode '"$ca_info"' '$gflags' '$verbose' " lid)) {
 		ne++
-		badnode=1
-		return
+		print "\n# " ntype ": nodeguid 0x" nodeguid " failed"
+		return 1;
 	}
-	if (system("'$IBPATH'/ibcheckerrs'"$ca_info"' '$gflags' '$verbose' " lid " 255"))
-		nodeerr=1;
+	if (system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port))
+		return  2;
+	return 0;
 }
 
 /^Ca/ || /^Switch/ || /^Rt/ {
@@ -90,30 +90,27 @@ function check_node(lid)
 			if ('$v' || ntype != "Switch") 
 				print "\n# Checking " ntype ": nodeguid 0x" nodeguid
 
-			nodechecked=0
-			nodeerr=0
-			badnode=0
+			err = 0;
 			if (ntype != "Switch")
 				next
 
 			lid = substr($0, index($0, "port 0 lid ") + 11)
 			lid = substr(lid, 1, index(lid, " ") - 1)
-			check_node(lid)
+			err = check_node(lid, 255)
 		}
 /^\[/	{
 		nports++
 		port = $1
-		if (!nodechecked) {
-			lid = substr($0, index($0, " lid ") + 5)
-			lid = substr(lid, 1, index(lid, " ") - 1)
-			check_node(lid)
-		}
-		if (badnode) {
-			print "\n# " ntype ": nodeguid 0x" nodeguid " failed"
-			next
-		}
 		sub("\\(.*\\)", "", port)
 		gsub("[\\[\\]]", "", port)
+		if (ntype != "Switch") {
+			lid = substr($0, index($0, " lid ") + 5)
+			lid = substr(lid, 1, index(lid, " ") - 1)
+			if (check_node(lid, port) == 2)
+  				pcnterr++;
+		} else if (err &&
+			   system("'$IBPATH'/ibcheckerrs '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port))
+			pcnterr++;
 		if (system("'$IBPATH'/ibcheckport'"$ca_info"' '$gflags' '$verbose' " lid " " port)) {
 			if (!'$v' && oldlid != lid) {
 				print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure"
@@ -121,15 +118,6 @@ function check_node(lid)
 			}
 			pe++;
 		}
-
-		if (nodeerr)
-			if (system("'$IBPATH'/ibcheckerrs'"$ca_info"' '$gflags' '$verbose' " lid " " port)) {
-				if (!'$v' && oldlid != lid) {
-					print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure"
-					oldlid = lid
-				}
-				pcnterr++;
-			}
 }
 
 /^ib/	{print $0; next}
diff --git a/infiniband-diags/scripts/ibclearcounters.in b/infiniband-diags/scripts/ibclearcounters.in
index 0413d86..5b4e60c 100644
--- a/infiniband-diags/scripts/ibclearcounters.in
+++ b/infiniband-diags/scripts/ibclearcounters.in
@@ -18,7 +18,6 @@ trap user_abort SIGINT
 gflags=""
 verbose=""
 v=0
-oldlid=""
 topofile=""
 ca_info=""
 
@@ -67,14 +66,12 @@ echo "$text" | awk '
 
 function clear_counters(lid)
 {
-	nodecleared=1
 	if (system("'$IBPATH'/perfquery'"$ca_info"' '$gflags' -R -a " lid))
 		nodeerr++
 }
 
 function clear_port_counters(lid, port)
 {
-	nodecleared=1
 	if (system("'$IBPATH'/perfquery'"$ca_info"' '$gflags' -R " lid " " port))
 		nodeerr++
 }
@@ -82,7 +79,6 @@ function clear_port_counters(lid, port)
 /^Ca/ || /^Switch/ || /^Rt/ {
 			nnodes++
 			ntype=$1; nodeguid=substr($3, 4, 16); ports=$2
-			nodecleared=0
 			if (ntype != "Switch")
 				next
 
@@ -95,7 +91,7 @@ function clear_port_counters(lid, port)
 			port = $1
 			sub("\\(.*\\)", "", port)
 			gsub("[\\[\\]]", "", port)
-			if (!nodecleared) {
+			if (ntype != "Switch") {
 				lid = substr($0, index($0, " lid ") + 5)
 				lid = substr(lid, 1, index(lid, " ") - 1)
 				clear_port_counters(lid, port)
diff --git a/infiniband-diags/scripts/ibclearerrors.in b/infiniband-diags/scripts/ibclearerrors.in
index 930efa6..edf93f5 100644
--- a/infiniband-diags/scripts/ibclearerrors.in
+++ b/infiniband-diags/scripts/ibclearerrors.in
@@ -67,7 +67,6 @@ echo "$text" | awk '
 
 function clear_errors(lid, port)
 {
-	nodecleared=1
 	if (system("'$IBPATH'/perfquery'"$ca_info"' '$gflags' -R " lid " " port " 0x0fff"))
 		nodeerr++
 }
@@ -75,7 +74,6 @@ function clear_errors(lid, port)
 /^Ca/ || /^Switch/ || /^Rt/ {
 			nnodes++
 			ntype=$1; nodeguid=substr($3, 4, 16); ports=$2
-			nodecleared=0
 			if (ntype != "Switch")
 				next
 
@@ -88,7 +86,7 @@ function clear_errors(lid, port)
 			port = $1
 			sub("\\(.*\\)", "", port)
 			gsub("[\\[\\]]", "", port)
-			if (!nodecleared) {
+			if (ntype != "Switch") {
 				lid = substr($0, index($0, " lid ") + 5)
 				lid = substr(lid, 1, index(lid, " ") - 1)
 				clear_errors(lid, port)
diff --git a/infiniband-diags/scripts/ibdatacounters.in b/infiniband-diags/scripts/ibdatacounters.in
index 7f0df1c..fa2455b 100644
--- a/infiniband-diags/scripts/ibdatacounters.in
+++ b/infiniband-diags/scripts/ibdatacounters.in
@@ -21,7 +21,6 @@ brief=""
 v=0
 ntype=""
 nodeguid=""
-oldlid=""
 topofile=""
 ca_info=""
 
@@ -79,16 +78,14 @@ echo "$text" | awk '
 BEGIN {
 	ne=0
 }
-function check_node(lid)
+function check_node(lid, port)
 {
-	nodechecked=1
-	if (system("'$IBPATH'/ibchecknode'"$ca_info"' '$gflags' '$verbose' " lid)) {
+	if (system("'$IBPATH'/ibchecknode '"$ca_info"' '$gflags' '$verbose' " lid)) {
 		ne++
-		badnode=1
-		return
+		print "\n# " ntype ": nodeguid 0x" nodeguid " failed"
+		return 1;
 	}
-	if (system("'$IBPATH'/ibdatacounts'"$ca_info"' '$gflags' '$verbose' '$brief' " lid " 255"))
-		nodeerr=1;
+	return system("'$IBPATH'/ibdatacounts '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port);
 }
 
 /^Ca/ || /^Switch/ || /^Rt/ {
@@ -97,37 +94,25 @@ function check_node(lid)
 			if ('$v')
 				print "\n# Checking " ntype ": nodeguid 0x" nodeguid
 
-			nodechecked=0
-			nodeerr=0
-			badnode=0
+			err = 0;
 			if (ntype != "Switch")
 				next
 
 			lid = substr($0, index($0, "port 0 lid ") + 11)
 			lid = substr(lid, 1, index(lid, " ") - 1)
-			check_node(lid)
+			err = check_node(lid, 255)
 		}
 /^\[/	{
 		nports++
 		port = $1
-		if (!nodechecked) {
-			lid = substr($0, index($0, " lid ") + 5)
-			lid = substr(lid, 1, index(lid, " ") - 1)
-			check_node(lid)
-		}
-		if (badnode) {
-			print "\n# " ntype ": nodeguid 0x" nodeguid " failed"
-			next
-		}
 		sub("\\(.*\\)", "", port)
 		gsub("[\\[\\]]", "", port)
-		if (nodeerr)
-			if (system("'$IBPATH'/ibdatacounts'"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port)) {
-				if (!'$v' && oldlid != lid) {
-					print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure"
-					oldlid = lid
-				}
-			}
+		if (ntype != "Switch") {
+			lid = substr($0, index($0, " lid ") + 5)
+			lid = substr(lid, 1, index(lid, " ") - 1)
+			check_node(lid, port)
+		} else if (err) 
+			system("'$IBPATH'/ibdatacounts '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port);
 }
 
 /^ib/	{print $0; next}
-- 
1.5.3.4.206.g58ba4


From mshefty at ichips.intel.com  Wed Dec 12 16:16:44 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Wed, 12 Dec 2007 16:16:44 -0800
Subject: [ofa-general] [PATCH] opensm/config/osmvsel.m4: update LDADD
	variable, not LDFLAGS
In-Reply-To: <20071212221502.GW23319@sashak.voltaire.com>
References: <000101c83cfb$a1db70b0$ff0da8c0@amr.corp.intel.com>	<20071212220522.GV23319@sashak.voltaire.com>
	<20071212221502.GW23319@sashak.voltaire.com>
Message-ID: <476079EC.6020306@ichips.intel.com>

This fixes the problem.

Thanks,
Sean


From sashak at voltaire.com  Wed Dec 12 17:25:52 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 13 Dec 2007 01:25:52 +0000
Subject: [ofa-general] [PATCH] opensm: don't zero base LID when invalid value
	is received
Message-ID: <20071213012552.GZ23319@sashak.voltaire.com>


This addresses bug 246 (https://bugs.openfabrics.org/show_bug.cgi?id=246):
zero lid received from opensm in set port_info smp.

When an invalid LID value is received (0xffff in this case), OpenSM
currently clears it to zero. With this patch it instead tries to recover
by using the current LID value stored in osm_physp_t.port_info.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/include/opensm/osm_port.h  |   39 -----------------------------------
 opensm/opensm/osm_port_info_rcv.c |   41 ++++++++++++++++++++----------------
 2 files changed, 23 insertions(+), 57 deletions(-)

diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h
index fcb0a16..bba4e44 100644
--- a/opensm/include/opensm/osm_port.h
+++ b/opensm/include/opensm/osm_port.h
@@ -459,45 +459,6 @@ osm_physp_set_port_info(IN osm_physp_t * const p_physp,
 *	Port, Physical Port
 *********/
 
-/****f* OpenSM: Physical Port/osm_physp_trim_base_lid_to_valid_range
-* NAME
-*  osm_physp_trim_base_lid_to_valid_range
-*
-* DESCRIPTION
-*  Validates the base LID in the Physical Port object
-*  and resets it if the base LID is invalid.
-*
-* SYNOPSIS
-*/
-static inline ib_net16_t
-osm_physp_trim_base_lid_to_valid_range(IN osm_physp_t * const p_physp)
-{
-	ib_net16_t orig_lid = 0;
-
-	CL_ASSERT(osm_physp_is_valid(p_physp));
-	if ((cl_ntoh16(p_physp->port_info.base_lid) > IB_LID_UCAST_END_HO) ||
-	    (cl_ntoh16(p_physp->port_info.base_lid) < IB_LID_UCAST_START_HO)) {
-		orig_lid = p_physp->port_info.base_lid;
-		p_physp->port_info.base_lid = 0;
-	}
-	return orig_lid;
-}
-
-/*
-* PARAMETERS
-*	p_physp
-*		[in] Pointer to an osm_physp_t object.
-*
-* RETURN VALUES
-*	Returns 0 if the base LID in the Physical port object is valid.
-*	Returns original invalid LID otherwise.
-*
-* NOTES
-*
-* SEE ALSO
-*	Port, Physical Port
-*********/
-
 /****f* OpenSM: Physical Port/osm_physp_set_pkey_tbl
 * NAME
 *  osm_physp_set_pkey_tbl
diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
index 9ea8738..ea0cb21 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -98,6 +98,22 @@ __osm_pi_rcv_set_sm(IN const osm_pi_rcv_t * const p_rcv,
 
 /**********************************************************************
  **********************************************************************/
+static void pi_rcv_check_and_fix_lid(osm_log_t *log, ib_port_info_t * const pi,
+				     osm_physp_t * p)
+{
+	if ((cl_ntoh16(pi->base_lid) > IB_LID_UCAST_END_HO) ||
+	    (cl_ntoh16(pi->base_lid) < IB_LID_UCAST_START_HO)) {
+		osm_log(log, OSM_LOG_ERROR,
+			"pi_rcv_check_and_fix_lid: ERR 0F04: "
+			"Got invalid base LID 0x%x from the network. "
+			"Corrected to 0x%x.\n", cl_ntoh16(pi->base_lid),
+			cl_ntoh16(p->port_info.base_lid));
+		pi->base_lid = p->port_info.base_lid;
+	}
+}
+
+/**********************************************************************
+ **********************************************************************/
 static void
 __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv,
 			     IN osm_physp_t * const p_physp,
@@ -204,13 +220,12 @@ static void
 __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv,
 				 IN osm_node_t * const p_node,
 				 IN osm_physp_t * const p_physp,
-				 IN const ib_port_info_t * const p_pi)
+				 IN ib_port_info_t * const p_pi)
 {
 	ib_api_status_t status = IB_SUCCESS;
 	osm_madw_context_t context;
 	osm_physp_t *p_remote_physp;
 	osm_node_t *p_remote_node;
-	ib_net16_t orig_lid;
 	uint8_t port_num;
 	uint8_t remote_port_num;
 	osm_dr_path_t path;
@@ -316,19 +331,15 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv,
 	if (ib_port_info_get_port_state(p_pi) > IB_LINK_INIT && p_node->sw)
 		p_node->sw->need_update = 0;
 
+	if (port_num == 0)
+		pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp);
+
 	/*
 	   Update the PortInfo attribute.
 	 */
 	osm_physp_set_port_info(p_physp, p_pi);
 
 	if (port_num == 0) {
-		/* This is switch management port 0 */
-		if ((orig_lid =
-		     osm_physp_trim_base_lid_to_valid_range(p_physp)))
-			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
-				"__osm_pi_rcv_process_switch_port: ERR 0F04: "
-				"Invalid base LID 0x%x corrected\n",
-				cl_ntoh16(orig_lid));
 		/* Determine if base switch port 0 */
 		if (p_node->sw &&
 		    !ib_switch_info_is_enhanced_port0(&p_node->sw->switch_info))
@@ -346,21 +357,15 @@ static void
 __osm_pi_rcv_process_ca_or_router_port(IN const osm_pi_rcv_t * const p_rcv,
 				       IN osm_node_t * const p_node,
 				       IN osm_physp_t * const p_physp,
-				       IN const ib_port_info_t * const p_pi)
+				       IN ib_port_info_t * const p_pi)
 {
-	ib_net16_t orig_lid;
-
 	OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_process_ca_or_router_port);
 
 	UNUSED_PARAM(p_node);
 
-	osm_physp_set_port_info(p_physp, p_pi);
+	pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp);
 
-	if ((orig_lid = osm_physp_trim_base_lid_to_valid_range(p_physp)))
-		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
-			"__osm_pi_rcv_process_ca_or_router_port: ERR 0F08: "
-			"Invalid base LID 0x%x corrected\n",
-			cl_ntoh16(orig_lid));
+	osm_physp_set_port_info(p_physp, p_pi);
 
 	__osm_pi_rcv_process_endport(p_rcv, p_physp, p_pi);
 
-- 
1.5.3.4.206.g58ba4
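
The recovery logic of pi_rcv_check_and_fix_lid() can be illustrated in
isolation with the C sketch below. It is not the OpenSM code:
check_and_fix_lid() is a hypothetical helper, ntohs() stands in for
cl_ntoh16(), and the range constants are local definitions assumed to match
IB_LID_UCAST_START_HO and IB_LID_UCAST_END_HO (0x0001 and 0xBFFF).

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Unicast LID range in host byte order; values assumed, see the note above. */
#define LID_UCAST_START_HO 0x0001
#define LID_UCAST_END_HO   0xBFFF

/*
 * Return the base LID to store: the received one if it is a valid unicast
 * LID, otherwise the previously known one. Both are in network byte order.
 */
static uint16_t check_and_fix_lid(uint16_t received_lid, uint16_t current_lid)
{
	uint16_t lid_ho = ntohs(received_lid);

	if (lid_ho < LID_UCAST_START_HO || lid_ho > LID_UCAST_END_HO)
		return current_lid;
	return received_lid;
}
```

For example, a received 0xffff with a stored LID of 0x0012 yields 0x0012
rather than zero, which is the behavior change the patch describes.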


From mshefty at ichips.intel.com  Wed Dec 12 17:33:24 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Wed, 12 Dec 2007 17:33:24 -0800
Subject: [ofa-general] Re: CMA can't establish connection with QoS on
In-Reply-To: <47605620.3070105@dev.mellanox.co.il>
References: <47600070.8050008@dev.mellanox.co.il>	<000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com>
	<47605620.3070105@dev.mellanox.co.il>
Message-ID: <47608BE4.7020209@ichips.intel.com>

> Not sure if it helps, but "ibv_rc_pingpong -l <any SL>" works.

I'll try this tomorrow.

With QoS enabled (using the QoS file provided in bug 821), I get a 
ROUTE_ERROR on the client side, but I don't see a system hang.  I'm 
running 2.6.24-rc3 with patches in my for-roland branch.

I do see that opensm hangs when I try to kill it, but using 'kill' 
forces it to exit.

- Sean


From jim at mellanox.com  Wed Dec 12 18:29:09 2007
From: jim at mellanox.com (Jim Mott)
Date: Wed, 12 Dec 2007 18:29:09 -0800
Subject: [ofa-general] RE: [ewg] Not seeing any SDP performance changes
	inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304B048E4@xmb-sjc-216.amer.cisco.com>
References: <47445630.10000@dev.mellanox.co.il>
	<F57121538EA0C94F86018DDD40ADA1D1911FFC@mtiexch01.mti.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304A34151@xmb-sjc-216.amer.cisco.com>
	<000001c8338c$206e80d0$614b8270$@rr.com>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304B048E4@xmb-sjc-216.amer.cisco.com>
Message-ID: <F57121538EA0C94F86018DDD40ADA1D19C22BF@mtiexch01.mti.com>

I am traveling for the next 2 weeks and not able to test anymore.  That
said, I believe all outstanding problems are fixed and it is safe to
re-enable bzcopy by default.  My testing shows that the crossover size,
above which bzcopy is always a win, is about 16K.  The patch goes in
sdp_main.c and looks something like:
  -static int sdp_zcopy_thresh = 0; 
  +static int sdp_zcopy_thresh = 16384;
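
How such a threshold might gate the send path can be sketched as below.
This is an assumption made for illustration, not the ib_sdp implementation:
use_bzcopy() is a hypothetical helper, and the policy (zero disables bzcopy,
sends at or above the threshold use it) is inferred from this thread rather
than taken from sdp_main.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* 0 disables bzcopy; 16384 is the default proposed in the message above. */
static int sdp_zcopy_thresh = 16384;

/* Assumed policy: use bzcopy only for sends at or above the threshold. */
static bool use_bzcopy(size_t len)
{
	if (sdp_zcopy_thresh <= 0)
		return false;
	return len >= (size_t)sdp_zcopy_thresh;
}
```

With the proposed default, a 4096-byte send would stay on the ordinary copy
path while a 64K send would take the bzcopy path.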

-----Original Message-----
From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] 
Sent: Wednesday, December 12, 2007 5:26 PM
To: Jim Mott; ewg at lists.openfabrics.org
Cc: general at lists.openfabrics.org
Subject: RE: [ofa-general] RE: [ewg] Not seeing any SDP performance
changes inOFED 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh

Jim, when do you plan to enable bzcopy by default?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems


> -----Original Message-----
> From: general-bounces at lists.openfabrics.org 
> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Jim Mott
> Sent: Friday, November 30, 2007 12:04 PM
> To: ewg at lists.openfabrics.org
> Cc: general at lists.openfabrics.org
> Subject: [ofa-general] RE: [ewg] Not seeing any SDP 
> performance changes inOFED 1.3 beta, and I get Oops when 
> enabling sdp_zcopy_thresh
> 
> Hi,
>   This kernel Oops is new and I will look at it.  Dotan and 
> the Mellanox regression tests have been keeping me busy 
> recently.  There
> was a problem like this, but only in multi-threaded apps 
> using a single socket or when doing cleanup after ^C.
> 
>   I will re-enable default bzcopy behavior once all the 
> important Mellanox regression tests are passing.  Until then, 
> setting the
> sdp_zcopy_thresh variable by hand (8192 and up should give 
> better performance) and running simple tests like netperf should be
> working fine.  You should not be seeing any problem here.  [I 
> have only tested locally with x86_64 rhat4u4, rhat5, 2.6.23.8, and
> 2.6.24-rc2.  Mellanox regression tests everything and they 
> have not submitted this Oops yet.]
> 
>   I have opened bugs in the openfabrics bugzilla for 
> everything I am currently working on.  It is down right now 
> or I would add
> pointers.
> 
> 
> Here is my work list; additions or priority changes welcome:
> 
> SDP OPEN ISSUES LIST (Priority order)
> =====================================
> 1) DONE: BUG: Unload of mlx4 and ib_sdp fails while SDP active
>   11/6 [PATCH 1/1 V2] SDP - Fix reference count bug ...
> 
> 2) DONE: BUG: Many data corruption failures
>   11/11 [PATCH 1/1] SDP - Fix bug where zcopy bcopy returns ...
> 
> 3) DONE: Bug 793 - kernel BUG at net/core/skbuff.c:95!
>   11/26 [PATCH 1/1] SDP - bug793; skbuff changes ...
> 
> 4) TODO: BUG: kernel oops in SDP regression 
>   Replicated problem by hitting ^C during a transfer.  I have 
> created a patch that fixes the problem, but it needs more work
> to move into production.  There are some side effects I do not
> yet understand.
>   This is the one I am working on now.  I hope to drop it soon.
> There is a bug open tracking it.
> 
> 5) TODO: BUG: libsdp returns good RC when it should fail
> 
> 6) TODO: BUG: aio_test fails in SDP regression
> 
> 7) TODO: Bug 779 - Lock ordering problem during accept on 1.2.5
>   After building a 2.6.23.8 kernel with lock checking enabled, I
> can not reproduce this problem.  Looks like I'll need more input
> from the reporter.  (Bug updated to say this).  I will continue to
> code review though.
> 
> 8) DONE: Bug 294 - connect does not allow AF_INET_SDP
>   [fix in bugzilla dropped] 
> 
> 9) DONE: Backport work needed to support 2.6.24
> 
> 10) TODO: Package user space libsdp for Redhat
>   This is supposed to be easy to do, but it will take me some time
> to figure out the detail.  
> 
> 11) DONE: BUG: Memory leak
>   11/20 [PATCH 1/1 v2] SDP - Fix a memory leak in bzcopy
> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org 
> [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of Scott 
> Weitzenkamp (sweitzen)
> Sent: Friday, November 30, 2007 12:37 PM
> To: Jim Mott; Scott Weitzenkamp (sweitzen); ewg at lists.openfabrics.org
> Cc: general at lists.openfabrics.org
> Subject: [ewg] Not seeing any SDP performance changes in OFED 
> 1.3 beta, and I get Oops when enabling sdp_zcopy_thresh
> 
> Jim,
> 
> Using netperf with TCP_STREAM and TCP_RR, I'm not seeing any 
> changes in
> SDP throughput or CPU utilization comparing OFED 1.3 beta and OFED
> 1.2.5.  Looks like I need to set a non-zero value in
> /sys/module/ib_sdp/sdp_zcopy_thresh?  Do you plan to enable this by
> default soon?
> 
> I tried "echo 4096 > /sys/module/ib_sdp/sdp_zcopy_thresh" on RHEL4 and
> then tried netperf, and got an Oops.
> 
> Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
> <Nov/30 10:33 am><ffffffff80163ff0>{put_page+0}
> <Nov/30 10:33 am>PML4 1a3047067 PGD 1a7a6d067 PMD 0
> <Nov/30 10:33 am>Oops: 0000 [1] SMP
> <Nov/30 10:33 am>CPU 0
> <Nov/30 10:33 am>Modules linked in: parport_pc lp parport autofs4 i2c_dev
> i2c_core nfs lockd nfs_acl sunrpc rdma_ucm(U) rds(U) ib_sdp(U) rdma_cm(U)
> iw_cm(U) ib_addr(U) mlx4_ib(U) mlx4_core(U) ds yenta_socket pcmcia_core
> dm_mirror dm_multipath dm_mod joydev button battery ac uhci_hcd ehci_hcd
> shpchp ib_mthca(U) ib_ipoib(U) ib_umad(U) ib_ucm(U) ib_uverbs(U) ib_cm(U)
> ib_sa(U) ib_mad(U) ib_core(U) md5 ipv6 e1000 floppy ata_piix libata sg ext3
> jbd mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
> <Nov/30 10:33 am>Pid: 6802, comm: netperf241 Not tainted 2.6.9-55.ELlargesmp
> <Nov/30 10:33 am>RIP: 0010:[<ffffffff80163ff0>] <ffffffff80163ff0>{put_page+0}
> <Nov/30 10:33 am>RSP: 0018:00000101a7bcbbc0  EFLAGS: 00010203
> <Nov/30 10:33 am>RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000202
> <Nov/30 10:33 am>RDX: 00000101b0b43e80 RSI: 0000000000000202 RDI: 0000000000000000
> <Nov/30 10:33 am>RBP: 00000101b85761c0 R08: 0000000000000000 R09: 0000000000000000
> <Nov/30 10:33 am>R10: 0000000000000246 R11: ffffffffa02e0e36 R12: 00000101a4b33080
> <Nov/30 10:33 am>R13: 00000101a7bcbd58 R14: 0000000000000000 R15: 0000000000010000
> <Nov/30 10:33 am>FS:  0000002a95696940(0000) GS:ffffffff80500380(0000) knlGS:0000000000000000
> <Nov/30 10:33 am>CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> <Nov/30 10:33 am>CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006e0
> <Nov/30 10:33 am>Process netperf241 (pid: 6802, threadinfo 00000101a7bca000, task 00000101a70df030)
> <Nov/30 10:33 am>Stack: ffffffffa02e110a 0000000000000100 0000000000000000 0000000000529780
> <Nov/30 10:33 am>       0001000000000246 0000000000000246 000000008013feac 00000800ffffffe0
> <Nov/30 10:33 am>       0000000000000000 00000101a7bcbe88
> <Nov/30 10:33 am>Call Trace:<ffffffffa02e110a>{:ib_sdp:sdp_sendmsg+724} <ffffffff801478b2>{queue_delayed_work+101}
> <Nov/30 10:33 am>       <ffffffffa02c6200>{:ib_addr:queue_req+122} <ffffffff802a7ecb>{sock_sendmsg+271}
> <Nov/30 10:33 am>       <ffffffff80169a61>{do_no_page+916} <ffffffff801359a8>{autoremove_wake_function+0}
> <Nov/30 10:33 am>       <ffffffff802a7c53>{sockfd_lookup+16} <ffffffff802a939a>{sys_sendto+195}
> <Nov/30 10:33 am>       <ffffffff801242b9>{do_page_fault+577} <ffffffff801934c8>{dnotify_parent+34}
> <Nov/30 10:33 am>       <ffffffff80179335>{vfs_read+248} <ffffffff8011026a>{system_call+126}
> <Nov/30 10:33 am>
> 
> <Nov/30 10:33 am>Code: 8b 07 48 89 fa f6 c4 80 74 3b 48 8b 57 10 8b 02 48 89 d1 f6
> <Nov/30 10:33 am>RIP <ffffffff80163ff0>{put_page+0} RSP <00000101a7bcbbc0>
> <Nov/30 10:33 am>CR2: 0000000000000000
> <Nov/30 10:33 am> <0>Kernel panic - not syncing: Oops
> 
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
> _______________________________________________
> ewg mailing list
> ewg at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
> 
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit 
> http://openib.org/mailman/listinfo/openib-general
> 


From ssufficool at rov.sbcounty.gov  Wed Dec 12 20:17:16 2007
From: ssufficool at rov.sbcounty.gov (Sufficool, Stanley)
Date: Wed, 12 Dec 2007 20:17:16 -0800
Subject: [ofa-general] SRP Target Session Hangs
Message-ID: <C2F174F99918D54CA2A96E57C5079B6F3551BB@sbc-exmsg2.sbcounty.gov>

1) We are running WinIB on Windows 2003 SR2 with the OFED SRP Target.
The Windows machines initially connect fine to the target; however, when
they are restarted without using "eject io unit", SCST retains the
SRPT session and subsequent connects to the target fail with "Cannot
start device".
 
Using most recent WinIB, kernel 2.6.22 IB drivers, and recent SRPT and
SCST.
 
2) I have yet to get the WinOF 1.0 or 1.0.1 SRP Initiator to work. Is
there any reason other than my lack of trying harder that this wouldn't
work?
 
3) This may not be the forum for this, but how can you terminate a
session using SCST proc commands?
 
Thanks in advance
 

Stanley Sufficool
Systems Analyst I
County of San Bernardino 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071212/f88f8b94/attachment.html>

From kliteyn at mellanox.co.il  Wed Dec 12 21:10:52 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 13 Dec 2007 07:10:52 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-13:normal completion
Message-ID: <MTLEXCH01lQ080Gxw8Z0000064d@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-12
OpenSM git rev = Tue_Dec_11_17:39:24_2007 [975e1f995d97f45bb15fd5cd2a19bbed865a1153]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=520  Fail=0
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From kliteyn at mellanox.co.il  Wed Dec 12 23:38:17 2007
From: kliteyn at mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 13 Dec 2007 09:38:17 +0200
Subject: [ofa-general] [PATCH] opensm: don't zero base LID when invalid
	value	is received
In-Reply-To: <20071213012552.GZ23319@sashak.voltaire.com>
References: <20071213012552.GZ23319@sashak.voltaire.com>
Message-ID: <4760E169.5010901@mellanox.co.il>

Hi Sasha,

The patch looks good. A few questions:
Is the original problem reproducible?
Will the fix go to ofed_1_3 too?

-- Yevgeny

Sasha Khapyorsky wrote:
> This addresses bug 246 (https://bugs.openfabrics.org/show_bug.cgi?id=246):
> zero lid received from opensm in set port_info smp.
>
> When invalid value of LID (it was 0xffff in this case) is received,
> OpenSM clears it to zero now. Instead this patch will try to recover
> using current LID value stored in osm_physp_t.port_info.
>
> Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> ---
>  opensm/include/opensm/osm_port.h  |   39 -----------------------------------
>  opensm/opensm/osm_port_info_rcv.c |   41 ++++++++++++++++++++----------------
>  2 files changed, 23 insertions(+), 57 deletions(-)
>
> diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h
> index fcb0a16..bba4e44 100644
> --- a/opensm/include/opensm/osm_port.h
> +++ b/opensm/include/opensm/osm_port.h
> @@ -459,45 +459,6 @@ osm_physp_set_port_info(IN osm_physp_t * const p_physp,
>  *	Port, Physical Port
>  *********/
>  
> -/****f* OpenSM: Physical Port/osm_physp_trim_base_lid_to_valid_range
> -* NAME
> -*  osm_physp_trim_base_lid_to_valid_range
> -*
> -* DESCRIPTION
> -*  Validates the base LID in the Physical Port object
> -*  and resets it if the base LID is invalid.
> -*
> -* SYNOPSIS
> -*/
> -static inline ib_net16_t
> -osm_physp_trim_base_lid_to_valid_range(IN osm_physp_t * const p_physp)
> -{
> -	ib_net16_t orig_lid = 0;
> -
> -	CL_ASSERT(osm_physp_is_valid(p_physp));
> -	if ((cl_ntoh16(p_physp->port_info.base_lid) > IB_LID_UCAST_END_HO) ||
> -	    (cl_ntoh16(p_physp->port_info.base_lid) < IB_LID_UCAST_START_HO)) {
> -		orig_lid = p_physp->port_info.base_lid;
> -		p_physp->port_info.base_lid = 0;
> -	}
> -	return orig_lid;
> -}
> -
> -/*
> -* PARAMETERS
> -*	p_physp
> -*		[in] Pointer to an osm_physp_t object.
> -*
> -* RETURN VALUES
> -*	Returns 0 if the base LID in the Physical port object is valid.
> -*	Returns original invalid LID otherwise.
> -*
> -* NOTES
> -*
> -* SEE ALSO
> -*	Port, Physical Port
> -*********/
> -
>  /****f* OpenSM: Physical Port/osm_physp_set_pkey_tbl
>  * NAME
>  *  osm_physp_set_pkey_tbl
> diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
> index 9ea8738..ea0cb21 100644
> --- a/opensm/opensm/osm_port_info_rcv.c
> +++ b/opensm/opensm/osm_port_info_rcv.c
> @@ -98,6 +98,22 @@ __osm_pi_rcv_set_sm(IN const osm_pi_rcv_t * const p_rcv,
>  
>  /**********************************************************************
>   **********************************************************************/
> +static void pi_rcv_check_and_fix_lid(osm_log_t *log, ib_port_info_t * const pi,
> +				     osm_physp_t * p)
> +{
> +	if ((cl_ntoh16(pi->base_lid) > IB_LID_UCAST_END_HO) ||
> +	    (cl_ntoh16(pi->base_lid) < IB_LID_UCAST_START_HO)) {
> +		osm_log(log, OSM_LOG_ERROR,
> +			"pi_rcv_check_and_fix_lid: ERR 0F04: "
> +			"Got invalid base LID 0x%x from the network. "
> +			"Corrected to 0x%x.\n", cl_ntoh16(pi->base_lid),
> +			cl_ntoh16(p->port_info.base_lid));
> +		pi->base_lid = p->port_info.base_lid;
> +	}
> +}
> +
> +/**********************************************************************
> + **********************************************************************/
>  static void
>  __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv,
>  			     IN osm_physp_t * const p_physp,
> @@ -204,13 +220,12 @@ static void
>  __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv,
>  				 IN osm_node_t * const p_node,
>  				 IN osm_physp_t * const p_physp,
> -				 IN const ib_port_info_t * const p_pi)
> +				 IN ib_port_info_t * const p_pi)
>  {
>  	ib_api_status_t status = IB_SUCCESS;
>  	osm_madw_context_t context;
>  	osm_physp_t *p_remote_physp;
>  	osm_node_t *p_remote_node;
> -	ib_net16_t orig_lid;
>  	uint8_t port_num;
>  	uint8_t remote_port_num;
>  	osm_dr_path_t path;
> @@ -316,19 +331,15 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv,
>  	if (ib_port_info_get_port_state(p_pi) > IB_LINK_INIT && p_node->sw)
>  		p_node->sw->need_update = 0;
>  
> +	if (port_num == 0)
> +		pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp);
> +
>  	/*
>  	   Update the PortInfo attribute.
>  	 */
>  	osm_physp_set_port_info(p_physp, p_pi);
>  
>  	if (port_num == 0) {
> -		/* This is switch management port 0 */
> -		if ((orig_lid =
> -		     osm_physp_trim_base_lid_to_valid_range(p_physp)))
> -			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
> -				"__osm_pi_rcv_process_switch_port: ERR 0F04: "
> -				"Invalid base LID 0x%x corrected\n",
> -				cl_ntoh16(orig_lid));
>  		/* Determine if base switch port 0 */
>  		if (p_node->sw &&
>  		    !ib_switch_info_is_enhanced_port0(&p_node->sw->switch_info))
> @@ -346,21 +357,15 @@ static void
>  __osm_pi_rcv_process_ca_or_router_port(IN const osm_pi_rcv_t * const p_rcv,
>  				       IN osm_node_t * const p_node,
>  				       IN osm_physp_t * const p_physp,
> -				       IN const ib_port_info_t * const p_pi)
> +				       IN ib_port_info_t * const p_pi)
>  {
> -	ib_net16_t orig_lid;
> -
>  	OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_process_ca_or_router_port);
>  
>  	UNUSED_PARAM(p_node);
>  
> -	osm_physp_set_port_info(p_physp, p_pi);
> +	pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp);
>  
> -	if ((orig_lid = osm_physp_trim_base_lid_to_valid_range(p_physp)))
> -		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
> -			"__osm_pi_rcv_process_ca_or_router_port: ERR 0F08: "
> -			"Invalid base LID 0x%x corrected\n",
> -			cl_ntoh16(orig_lid));
> +	osm_physp_set_port_info(p_physp, p_pi);
>  
>  	__osm_pi_rcv_process_endport(p_rcv, p_physp, p_pi);
>  
>   
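
As a standalone illustration of the recovery rule in the quoted patch (keep
the currently stored LID when the received one falls outside the unicast
range), here is a minimal sketch. It is not the OpenSM code: the function
name and the macro names are hypothetical, values are taken in host byte
order (so no cl_ntoh16() conversion), and the range bounds 0x0001..0xBFFF
are assumed to match IB_LID_UCAST_START_HO and IB_LID_UCAST_END_HO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed unicast LID range, host byte order (hypothetical local names). */
#define LID_UCAST_START 0x0001
#define LID_UCAST_END   0xBFFF

/*
 * Hypothetical helper: returns the LID that should be stored.
 * If 'received' is outside the unicast range, fall back to 'current'
 * instead of zeroing it, and report the correction through *corrected.
 */
uint16_t check_and_fix_lid(uint16_t received, uint16_t current, int *corrected)
{
	assert(corrected != NULL);

	*corrected = (received < LID_UCAST_START || received > LID_UCAST_END);
	return *corrected ? current : received;
}
```

With this sketch, an invalid received value such as 0xffff leaves the stored
LID unchanged, while a valid value replaces it.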


From ogerlitz at voltaire.com  Thu Dec 13 00:30:56 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 13 Dec 2007 10:30:56 +0200
Subject: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize HCA-related
	hCalls on POWER5
In-Reply-To: <adabq8v91co.fsf@cisco.com>
References: <OFD9564F75.44193623-ONC12573AE.002EA542-C12573AE.002F5FBC@de.ibm.com><475FD0A1.3090304@voltaire.com>
	<adabq8v91co.fsf@cisco.com>
Message-ID: <4760EDC0.5020703@voltaire.com>

Roland Dreier wrote:
> I think the right fix for iSER would be to make iSER work even for
> devices that don't support FMRs.  For example cxgb3 doesn't implement
> FMRs so if anyone ever updates iSER to work on iWARP and not just IB,
> then this is something that has to be tackled anyway.  Then ehca could
> just get rid of the FMR support it has.

OK. The iSER design took into account the case of many initiators 
running on strong/modern machines talking to a possibly lightweight 
embedded target, for which the processing cost per I/O at the target side 
should be minimized; that is, at most --one-- RDMA operation should be 
issued by the target to serve an I/O request.

To that end, iSER works with one descriptor (called stag in iWARP and 
rkey in IB) per I/O direction sent from the initiator to the target, and 
hence can't work without some sort of FMR implementation.

The current implementation of the open-iscsi initiator makes sure to 
issue commands in thread (sleepable) context (see iscsi_xmitworker and 
references to it in drivers/scsi/libiscsi.c), so this keeps ehca users 
safe for the time being.

Or.


From vst at vlnb.net  Thu Dec 13 02:20:24 2007
From: vst at vlnb.net (Vladislav Bolkhovitin)
Date: Thu, 13 Dec 2007 13:20:24 +0300
Subject: [ofa-general] Re: [Scst-devel] SRP Target Session Hangs
In-Reply-To: <C2F174F99918D54CA2A96E57C5079B6F3551BB@sbc-exmsg2.sbcounty.gov>
References: <C2F174F99918D54CA2A96E57C5079B6F3551BB@sbc-exmsg2.sbcounty.gov>
Message-ID: <47610768.8080203@vlnb.net>

Sufficool, Stanley wrote:
> 1) We are running WinIB on Windows 2003 SR2 with the OFED SRP Target. 
> The windows machines initially connect fine to the target, however when 
> they are restarted without using the "eject io unit" SCST retains the 
> SRPT session and subsequent connects to the target fail with "Cannot 
> start device".

This looks to me like an SRP target driver problem. Hopefully, Vu 
(CC'ed) can answer you.

> Using most recent WinIB, kernel 2.6.22 IB drivers, and recent SRPT and SCST.
>  
> 2) I have yet to get the WinOF 1.0 or 1.0.1 SRP Initiator to work. Is 
> there any reason other than my lack of trying harder that this wouldn't 
> work?

This is a question for Vu as well.

> 3) This may not be the forum for this, but how can you terminate a 
> session using SCST proc commands?

SCST can't (and shouldn't) do that, because it has no knowledge of 
how sessions for a particular target transport are created and destroyed. 
Session management is the target driver's duty. Ask Vu about that feature.

> Thanks in advance
>  
> 
> *Stanley Sufficool
> *Systems Analyst I
> County of San Bernardino 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Scst-devel mailing list
> Scst-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scst-devel


From sashak at voltaire.com  Thu Dec 13 02:54:43 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 13 Dec 2007 10:54:43 +0000
Subject: [ofa-general] [PATCH] opensm: don't zero base LID when invalid
	value is received
In-Reply-To: <4760E169.5010901@mellanox.co.il>
References: <20071213012552.GZ23319@sashak.voltaire.com>
	<4760E169.5010901@mellanox.co.il>
Message-ID: <20071213105443.GA23319@sashak.voltaire.com>

Hi Yevgeny,

On 09:38 Thu 13 Dec     , Yevgeny Kliteynik wrote:
> 
>  The patch looks good. Few questions:
>  Is the original problem reproducible?

Not sure, the bug was reported a long time ago. At that time the case was
thoroughly investigated (see
https://bugs.openfabrics.org/show_bug.cgi?id=246 for details).

>  Will the fix go to ofed_1_3 too?

Yes.

Sasha


From vlad at lists.openfabrics.org  Thu Dec 13 03:09:26 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Thu, 13 Dec 2007 03:09:26 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071213-0200 daily build status
Message-ID: <20071213110926.ABEDFE601B1@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on powerpc with linux-2.6.12
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.14
Passed on ppc64 with linux-2.6.12
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18
Passed on powerpc with linux-2.6.13
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.12
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.13
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.19
Passed on ia64 with linux-2.6.21.1
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.22
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.14
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-53.el5

Failed:


From sashak at voltaire.com  Thu Dec 13 03:51:23 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 13 Dec 2007 11:51:23 +0000
Subject: [ofa-general] Re: [PATCH 2/3] OpenSM: Fix incorrect reporting of
	routing engine/algorithm used
In-Reply-To: <1197075342.29314.133.camel@cardanus.llnl.gov>
References: <1197075342.29314.133.camel@cardanus.llnl.gov>
Message-ID: <20071213115123.GB23319@sashak.voltaire.com>

Al, Hi again,

On 16:55 Fri 07 Dec     , Al Chu wrote:
> Hey Sasha,
> 
> I noticed that when a routing algorithm failed and defaulted back to
> 'minhop', the logs and the console did not report this change.  This is
> because most of that code outputs the routing algorithm name that was
> stored during configuration/setup.  The name isn't adjusted depending on
> the routing algorithm's success/failure.
> 
> There are several ways this could be fixed.  I decided the easiest was to
> stick a new routed_name field + lock into struct osm_routing_engine, and
> set/use this new field respectively.

Do we need a special lock for this?

> Note that within osm_ucast_mgr_process(), there is a slight logic change
> from what was there before.  If the routing engine's call to
> build_lid_matrices() failed, I've changed the logic to not call the
> routing engine's ucast_build_fwd_tables() function.  This felt like the
> correct logic and seems to be fine given all the routing algorithms in
> OpenSM.  PLMK if there is some behavior subtlety I missed.

I think it will break the "file" routing engine, where lid matrices and/or
LFTs can be loaded from dump files. With this "engine" a user can
decide to load only LFTs from a file and use calculated lid matrices
(actually this is the most common usage); then only an LFT dump file is
provided and the lid matrix creator legitimately falls back.
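
To make the point concrete, here is a minimal sketch of the two-step logic
as it stands before this patch, where the two fallbacks are independent of
each other. The struct and function names are simplified, hypothetical
stand-ins (not the real struct osm_routing_engine), and the two sample
callbacks only imitate a "file"-like engine that has no lid matrix dump
but does have an LFT dump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a routing engine. */
struct routing_engine {
	const char *name;
	int (*build_lid_matrices)(void *ctx);     /* returns 0 on success */
	int (*ucast_build_fwd_tables)(void *ctx); /* returns 0 on success */
	void *ctx;
};

/* Records which default (minhop) steps had to be used. */
struct route_result {
	int default_matrices; /* 1 if the default lid matrix builder ran */
	int default_tables;   /* 1 if the default forwarding tables were built */
};

/* Sample callbacks for illustration only. */
int always_fail(void *ctx) { (void)ctx; return -1; }
int always_ok(void *ctx)   { (void)ctx; return 0; }

struct route_result route(const struct routing_engine *e)
{
	struct route_result r = { 0, 0 };

	assert(e != NULL);

	/* Step 1: the engine may legitimately decline to build lid matrices. */
	if (!e->build_lid_matrices || e->build_lid_matrices(e->ctx) != 0)
		r.default_matrices = 1;

	/*
	 * Step 2: evaluated regardless of the outcome of step 1, so an
	 * engine that only provides forwarding tables still gets to run.
	 */
	if (!e->ucast_build_fwd_tables || e->ucast_build_fwd_tables(e->ctx) != 0)
		r.default_tables = 1;

	return r;
}
```

In this sketch an engine whose first callback fails and whose second
succeeds ends up with calculated lid matrices and its own forwarding
tables; gating step 2 on the success of step 1 would lose that case.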

> 
> Thanks,
> Al
> -- 
> Albert Chu
> chu11 at llnl.gov
> 925-422-5311
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory

> From d83332b4c6cb1cdb0fc960808f9fa53615f61201 Mon Sep 17 00:00:00 2001
> From: Albert L. Chu <chu11 at llnl.gov>
> Date: Fri, 7 Dec 2007 13:44:04 -0800
> Subject: [PATCH] fix incorrect reporting of routing engine
> 
> 
> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>
> ---
>  opensm/include/opensm/osm_opensm.h |    9 +++++++
>  opensm/opensm/osm_console.c        |    6 +++-
>  opensm/opensm/osm_opensm.c         |    6 +++++
>  opensm/opensm/osm_ucast_mgr.c      |   43 ++++++++++++++++++++++++++----------
>  4 files changed, 50 insertions(+), 14 deletions(-)
> 
> diff --git a/opensm/include/opensm/osm_opensm.h b/opensm/include/opensm/osm_opensm.h
> index 1b5edb8..3fc70fd 100644
> --- a/opensm/include/opensm/osm_opensm.h
> +++ b/opensm/include/opensm/osm_opensm.h
> @@ -107,6 +107,8 @@ struct osm_routing_engine {
>  	int (*ucast_build_fwd_tables) (void *context);
>  	void (*ucast_dump_tables) (void *context);
>  	void (*delete) (void *context);
> +	const char *routed_name;
> +	cl_plock_t routed_name_lock;
>  };
>  /*
>  * FIELDS
> @@ -129,6 +131,13 @@ struct osm_routing_engine {
>  *	delete
>  *		The delete method, may be used for routing engine
>  *		internals cleanup.
> +*
> +*	routed_name
> +*		The routing engine name used for routing (for example,
> +*		the specified one failed and we used the default)
> +*
> +*	routed_name_lock
> +*		Shared lock guarding reads and writes to routed_name.
>  */
>  
>  typedef struct _osm_console_t {
> diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c
> index f669240..d8d16db 100644
> --- a/opensm/opensm/osm_console.c
> +++ b/opensm/opensm/osm_console.c
> @@ -380,9 +380,11 @@ static void print_status(osm_opensm_t * p_osm, FILE * out)
>  			sm_state_mgr_str(p_osm->sm.state_mgr.state));
>  		fprintf(out, "   SA State           : %s\n",
>  			sa_state_str(p_osm->sa.state));
> +                cl_plock_acquire(&p_osm->routing_engine.routed_name_lock);

Other variables are not protected by any locks here. Probably
p_osm->lock should be used for all racy variables in this function
(instead of just protecting routed_name)?

>  		fprintf(out, "   Routing Engine     : %s\n",
> -			p_osm->routing_engine.name ? p_osm->routing_engine.
> -			name : "null (minhop)");
> +			p_osm->routing_engine.routed_name ? 
> +			p_osm->routing_engine.routed_name : "null (minhop)");

What about always setting routed_name (initially already in opensm_init())?
Then here and in the 3/3 patch it will not be necessary to check it for NULL.
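
A minimal sketch of that idea, using a hypothetical simplified struct and
hypothetical function names rather than the real OpenSM types: the reported
name is given a default at init time and is only ever replaced by another
non-NULL string, so printing code can use it directly.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_ROUTED_NAME "null (minhop)"

/* Hypothetical, simplified stand-in for the routing engine state. */
struct engine_status {
	const char *name;        /* configured engine name, may be NULL */
	const char *routed_name; /* name actually used; never NULL after init */
};

void engine_status_init(struct engine_status *s, const char *configured)
{
	assert(s != NULL);
	s->name = configured;
	s->routed_name = DEFAULT_ROUTED_NAME;
}

/* Called after each routing pass with the engine's overall result. */
void engine_status_update(struct engine_status *s, int engine_succeeded)
{
	assert(s != NULL);
	s->routed_name = (engine_succeeded && s->name) ?
	    s->name : DEFAULT_ROUTED_NAME;
}
```

A caller can then print s->routed_name without a NULL check; whether the
update needs its own lock is a separate question from this sketch.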

> +                cl_plock_release(&p_osm->routing_engine.routed_name_lock);
>  #ifdef ENABLE_OSM_PERF_MGR
>  		fprintf(out, "\n   PerfMgr state/sweep state : %s/%s\n",
>  			osm_perfmgr_get_state_str(&(p_osm->perfmgr)),
> diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
> index 26b9969..b55a638 100644
> --- a/opensm/opensm/osm_opensm.c
> +++ b/opensm/opensm/osm_opensm.c
> @@ -188,6 +188,8 @@ void osm_opensm_destroy(IN osm_opensm_t * const p_osm)
>  
>  	cl_plock_destroy(&p_osm->lock);
>  
> +	cl_plock_destroy(&p_osm->routing_engine.routed_name_lock);
> +
>  	osm_log_destroy(&p_osm->log);
>  }
>  
> @@ -224,6 +226,10 @@ osm_opensm_init(IN osm_opensm_t * const p_osm,
>  	if (status != IB_SUCCESS)
>  		goto Exit;
>  
> +	status = cl_plock_init(&p_osm->routing_engine.routed_name_lock);
> +	if (status != IB_SUCCESS)
> +		goto Exit;
> +
>  	if (p_opt->single_thread) {
>  		osm_log(&p_osm->log, OSM_LOG_INFO,
>  			"osm_opensm_init: Forcing single threaded dispatcher\n");
> diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
> index 8bb4739..fe15666 100644
> --- a/opensm/opensm/osm_ucast_mgr.c
> +++ b/opensm/opensm/osm_ucast_mgr.c
> @@ -769,6 +769,9 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
>  	struct osm_routing_engine *p_routing_eng;
>  	osm_signal_t signal = OSM_SIGNAL_DONE;
>  	cl_qmap_t *p_sw_guid_tbl;
> +        const char *routed_name = NULL;
> +        int blm = 0;
> +        int ubft = 0;
>  
>  	OSM_LOG_ENTER(p_mgr->p_log, osm_ucast_mgr_process);
>  
> @@ -789,23 +792,39 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
>  
>  	p_mgr->any_change = FALSE;
>  
> -	if (!p_routing_eng->build_lid_matrices ||
> -	    p_routing_eng->build_lid_matrices(p_routing_eng->context) != 0)
> -		osm_ucast_mgr_build_lid_matrices(p_mgr);
> -
> -	osm_log(p_mgr->p_log, OSM_LOG_INFO,
> -		"osm_ucast_mgr_process: "
> -		"%s tables configured on all switches\n",
> -		p_routing_eng->name ? p_routing_eng->name : "null (minhop)");
> +	if (p_routing_eng->build_lid_matrices) {
> +            blm = p_routing_eng->build_lid_matrices(p_routing_eng->context);
> +            if (blm)
> +                osm_ucast_mgr_build_lid_matrices(p_mgr);
> +        }
> +        else
> +            osm_ucast_mgr_build_lid_matrices(p_mgr);
>  
>  	/*
>  	   Now that the lid matrices have been built, we can
>  	   build and download the switch forwarding tables.
>  	 */
> -	if (!p_routing_eng->ucast_build_fwd_tables ||
> -	    p_routing_eng->ucast_build_fwd_tables(p_routing_eng->context))
> -		cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
> -				   p_mgr);
> +	if (!blm && p_routing_eng->ucast_build_fwd_tables) {
> +            ubft = p_routing_eng->ucast_build_fwd_tables(p_routing_eng->context);
> +            if (ubft)
> +                cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
> +                                   p_mgr);
> +        }
> +        else
> +            cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
> +                               p_mgr);
> +
> +        if (!blm && !ubft)
> +            routed_name = p_routing_eng->name;
> +
> +        CL_PLOCK_EXCL_ACQUIRE(&p_routing_eng->routed_name_lock);
> +        p_routing_eng->routed_name = routed_name;
> +        CL_PLOCK_RELEASE(&p_routing_eng->routed_name_lock);

Basically this section is already protected by p_mgr->p_lock (which is
a reference to &osm.lock).

Sasha

> +
> +	osm_log(p_mgr->p_log, OSM_LOG_INFO,
> +		"osm_ucast_mgr_process: "
> +		"%s tables configured on all switches\n",
> +		routed_name ? routed_name : "null (minhop)");
>  
>  	if (p_mgr->any_change) {
>  		signal = OSM_SIGNAL_DONE_PENDING;
> -- 
> 1.5.1
> 


From fenkes at de.ibm.com  Thu Dec 13 04:35:57 2007
From: fenkes at de.ibm.com (Joachim Fenkes)
Date: Thu, 13 Dec 2007 13:35:57 +0100
Subject: [ofa-general] [PATCH] IB/ehca: Fix lock flag location,
	bump version number
Message-ID: <200712131335.58136.fenkes@de.ibm.com>

Signed-off-by: Joachim Fenkes <fenkes at de.ibm.com>
---

This addresses a comment from Roland and bumps the version number. If it's not
too late, please apply for 2.6.24. Thanks!

 drivers/infiniband/hw/ehca/ehca_classes.h |    1 +
 drivers/infiniband/hw/ehca/ehca_main.c    |    2 +-
 drivers/infiniband/hw/ehca/hcp_if.c       |    1 -
 3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 87f12d4..74d2b72 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -322,6 +322,7 @@ extern int ehca_static_rate;
 extern int ehca_port_act_time;
 extern int ehca_use_hp_mr;
 extern int ehca_scaling_code;
+extern int ehca_lock_hcalls;
 
 struct ipzu_queue_resp {
 	u32 qe_size;      /* queue entry size */
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index c7bff3e..6a56d86 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -50,7 +50,7 @@
 #include "ehca_tools.h"
 #include "hcp_if.h"
 
-#define HCAD_VERSION "0024"
+#define HCAD_VERSION "0025"
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Christoph Raisch <raisch at de.ibm.com>");
diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c
index 331b5e8..7029aa6 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -89,7 +89,6 @@
 #define HCALL9_REGS_FORMAT HCALL7_REGS_FORMAT " r11=%lx r12=%lx"
 
 static DEFINE_SPINLOCK(hcall_lock);
-extern int ehca_lock_hcalls;
 
 static u32 get_longbusy_msecs(int longbusy_rc)
 {
-- 
1.5.2


From dotanb at dev.mellanox.co.il  Thu Dec 13 04:52:35 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Thu, 13 Dec 2007 14:52:35 +0200
Subject: [ofa-general] The qperf doesn't compile IB test cases
Message-ID: <47612B13.4000001@dev.mellanox.co.il>

Hi.

It seems that qperf doesn't compile the InfiniBand test cases (for 
example: ud_lat, ud_bw).

Can you please check this issue?
(Maybe a problem with the autotools?)

Other test cases, for example tcp_bw and tcp_lat, work fine.

thanks
Dotan


From kliteyn at dev.mellanox.co.il  Thu Dec 13 05:19:22 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 13 Dec 2007 15:19:22 +0200
Subject: [ofa-general] Re: CMA can't establish connection with QoS on
In-Reply-To: <47608BE4.7020209@ichips.intel.com>
References: <47600070.8050008@dev.mellanox.co.il>	<000001c83ced$2149f5b0$ff0da8c0@amr.corp.intel.com>
	<47605620.3070105@dev.mellanox.co.il>
	<47608BE4.7020209@ichips.intel.com>
Message-ID: <4761315A.1070306@dev.mellanox.co.il>

Sean Hefty wrote:
>> Not sure if it helps, but "ibv_rc_pingpong -l <any SL>" works.
> 
> I'll try this tomorrow.
> 
> With QoS enabled (using the QoS file provided in bug 821), I get a 
> ROUTE_ERROR on the client side, but I don't see a system hang.  I'm 
> running 2.6.24-rc3 with patches in my for-roland branch.
> 
> I do see that opensm hangs when I try to kill it, but using 'kill' 
> forces it to exit.

Sean,

Sorry, but I forgot to add an important detail in the instructions to
reproduce this problem. Your FW has to be QoS-enabled.

Use the latest ConnectX FW release 2_3_000. If you don't have it, you
can get it from the Mellanox site:

     http://www.mellanox.com/support/firmware_download.php

When you burn it, you need to enable QoS by adding the following
line in the [hca] section of the .ini file:

    sx_vlarb_en = true

-- Yevgeny

> - Sean
> 


From johann.george at qlogic.com  Thu Dec 13 07:29:10 2007
From: johann.george at qlogic.com (Johann George)
Date: Thu, 13 Dec 2007 07:29:10 -0800
Subject: [ofa-general] Re: The qperf doesn't compile IB test cases
In-Reply-To: <47612B13.4000001@dev.mellanox.co.il>
References: <47612B13.4000001@dev.mellanox.co.il>
Message-ID: <20071213152910.GA23928@cuprite.pathscale.com>

Hello Dotan.

What sort of error does it get?  I just tried the latest version from
the git repository and it compiled without problems.

Johann

On Thu, Dec 13, 2007 at 02:52:35PM +0200, Dotan Barak wrote:
> Hi.
> 
> It seems that qperf doesn't compile the InfiniBand test cases (for 
> example: ud_lat, ud_bw).
> 
> Can you please check this issue?
> (Maybe a problem with the autotools?)
> 
> Other test cases, for example tcp_bw and tcp_lat, work fine.
> 
> thanks
> Dotan


From tziporet at mellanox.co.il  Thu Dec 13 08:40:22 2007
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Thu, 13 Dec 2007 18:40:22 +0200
Subject: [ofa-general] OFED 1.3-rc1 release is available
Message-ID: <6C2C79E72C305246B504CBA17B5500C902E2D65D@mtlexch01.mtl.com>

Hi, 

OFED 1.3 RC1 release is available on
http://www.openfabrics.org/downloads/OFED/ofed-1.3/OFED-1.3-rc1.tgz
To get BUILD_ID run ofed_info 

Please report any issues in bugzilla https://bugs.openfabrics.org/

The RC2 release is expected on December 27

Tziporet & Vlad 

========================================================================

Release information: 
--------------------
OS support: 
Novell: 
    - SLES10 
    - SLES10 SP1 and up1
Redhat: 
    - Redhat EL4 up4 and up5 
    - Redhat EL5 and up1
kernel.org: 
    - 2.6.23 and 2.6.24-rc2

Systems: 
    * x86_64 
    * x86 
    * ia64 
    * ppc64

Main Changes from OFED 1.3-beta
===============================
*	Fix SDP stability issues
*	Force 32bit libraries installation on the SLES10 SP1 U1
*	Open MPI: Enable compilation when using compilers that were not
installed as RPMs.
*	RDS: clean up handling of congested destinations vs poll
*	RDS: Fix download issue when removing low level driver (fix was
in CMA)
*	IPoIB: Fix kernel Oops resulting from xmit following dev_down.
*	MPI packages update:
    - mvapich-1.0.0-1639.src.rpm
    - openmpi-1.2.5rc1-1.src.rpm
    - mpitests-3.0-773.src.rpm

mlx4 specific changes:
*	Fix segmentation fault in mlx4_clear_xrc_srq.
*	Fix max_eq's read from FW in QUERY_DEV_CAP.
*	Post send in the kernel is now using WQE building block.
*	Set default CQ moderation parameters for IPoIB

ehca specific changes:
*	Fix error of sense context opts with multiple adapter
*	Add files for older abi_versions


Tasks that should be completed for RC2 release:
===============================================
1. IPoIB performance improvements for small messages
2. Fix bugs



From rdreier at cisco.com  Thu Dec 13 09:39:29 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 13 Dec 2007 09:39:29 -0800
Subject: [ofa-general] [GIT PULL] please pull infiniband.git
Message-ID: <adar6hq7aum.fsf@cisco.com>

Linus, please pull from

    master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This will pull a few small ehca driver fixes for 2.6.24:

Joachim Fenkes (3):
      IB/ehca: Return correct number of SGEs for SRQ
      IB/ehca: Serialize HCA-related hCalls if necessary
      IB/ehca: Fix lock flag variable location, bump version number

 drivers/infiniband/hw/ehca/ehca_classes.h |    1 +
 drivers/infiniband/hw/ehca/ehca_main.c    |   15 ++++++++++++++-
 drivers/infiniband/hw/ehca/ehca_qp.c      |    4 ++--
 drivers/infiniband/hw/ehca/hcp_if.c       |   27 ++++++++++-----------------
 drivers/infiniband/hw/ehca/hipz_hw.h      |    1 +
 5 files changed, 28 insertions(+), 20 deletions(-)


diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 87f12d4..74d2b72 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -322,6 +322,7 @@ extern int ehca_static_rate;
 extern int ehca_port_act_time;
 extern int ehca_use_hp_mr;
 extern int ehca_scaling_code;
+extern int ehca_lock_hcalls;
 
 struct ipzu_queue_resp {
 	u32 qe_size;      /* queue entry size */
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 90d4334..6a56d86 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -43,13 +43,14 @@
 #ifdef CONFIG_PPC_64K_PAGES
 #include <linux/slab.h>
 #endif
+
 #include "ehca_classes.h"
 #include "ehca_iverbs.h"
 #include "ehca_mrmw.h"
 #include "ehca_tools.h"
 #include "hcp_if.h"
 
-#define HCAD_VERSION "0024"
+#define HCAD_VERSION "0025"
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Christoph Raisch <raisch at de.ibm.com>");
@@ -66,6 +67,7 @@ int ehca_poll_all_eqs  = 1;
 int ehca_static_rate   = -1;
 int ehca_scaling_code  = 0;
 int ehca_mr_largepage  = 1;
+int ehca_lock_hcalls   = -1;
 
 module_param_named(open_aqp1,     ehca_open_aqp1,     int, S_IRUGO);
 module_param_named(debug_level,   ehca_debug_level,   int, S_IRUGO);
@@ -77,6 +79,7 @@ module_param_named(poll_all_eqs,  ehca_poll_all_eqs,  int, S_IRUGO);
 module_param_named(static_rate,   ehca_static_rate,   int, S_IRUGO);
 module_param_named(scaling_code,  ehca_scaling_code,  int, S_IRUGO);
 module_param_named(mr_largepage,  ehca_mr_largepage,  int, S_IRUGO);
+module_param_named(lock_hcalls,   ehca_lock_hcalls,   bool, S_IRUGO);
 
 MODULE_PARM_DESC(open_aqp1,
 		 "AQP1 on startup (0: no (default), 1: yes)");
@@ -102,6 +105,9 @@ MODULE_PARM_DESC(scaling_code,
 MODULE_PARM_DESC(mr_largepage,
 		 "use large page for MR (0: use PAGE_SIZE (default), "
 		 "1: use large page depending on MR size");
+MODULE_PARM_DESC(lock_hcalls,
+		 "serialize all hCalls made by the driver "
+		 "(default: autodetect)");
 
 DEFINE_RWLOCK(ehca_qp_idr_lock);
 DEFINE_RWLOCK(ehca_cq_idr_lock);
@@ -258,6 +264,7 @@ static struct cap_descr {
 	{ HCA_CAP_UD_LL_QP, "HCA_CAP_UD_LL_QP" },
 	{ HCA_CAP_RESIZE_MR, "HCA_CAP_RESIZE_MR" },
 	{ HCA_CAP_MINI_QP, "HCA_CAP_MINI_QP" },
+	{ HCA_CAP_H_ALLOC_RES_SYNC, "HCA_CAP_H_ALLOC_RES_SYNC" },
 };
 
 static int ehca_sense_attributes(struct ehca_shca *shca)
@@ -333,6 +340,12 @@ static int ehca_sense_attributes(struct ehca_shca *shca)
 		if (EHCA_BMASK_GET(hca_cap_descr[i].mask, shca->hca_cap))
 			ehca_gen_dbg("   %s", hca_cap_descr[i].descr);
 
+	/* Autodetect hCall locking -- the "H_ALLOC_RESOURCE synced" flag is
+	 * a firmware property, so it's valid across all adapters
+	 */
+	if (ehca_lock_hcalls == -1)
+		ehca_lock_hcalls = !(shca->hca_cap & HCA_CAP_H_ALLOC_RES_SYNC);
+
 	/* translate supported MR page sizes; always support 4K */
 	shca->hca_cap_mr_pgsize = EHCA_PAGESIZE;
 	if (ehca_mr_largepage) { /* support extra sizes only if enabled */
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index dd12668..eff5fb5 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -838,7 +838,7 @@ struct ib_srq *ehca_create_srq(struct ib_pd *pd,
 
 	/* copy back return values */
 	srq_init_attr->attr.max_wr = qp_init_attr.cap.max_recv_wr;
-	srq_init_attr->attr.max_sge = qp_init_attr.cap.max_recv_sge;
+	srq_init_attr->attr.max_sge = 3;
 
 	/* drive SRQ into RTR state */
 	mqpcb = ehca_alloc_fw_ctrlblock(GFP_KERNEL);
@@ -1750,7 +1750,7 @@ int ehca_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr)
 	}
 
 	srq_attr->max_wr = qpcb->max_nr_outst_recv_wr - 1;
-	srq_attr->max_sge = qpcb->actual_nr_sges_in_rq_wqe;
+	srq_attr->max_sge = 3;
 	srq_attr->srq_limit = EHCA_BMASK_GET(
 		MQPCB_CURR_SRQ_LIMIT, qpcb->curr_srq_limit);
 
diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c
index c16a213..7029aa6 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -120,26 +120,21 @@ static long ehca_plpar_hcall_norets(unsigned long opcode,
 				    unsigned long arg7)
 {
 	long ret;
-	int i, sleep_msecs, do_lock;
-	unsigned long flags;
+	int i, sleep_msecs;
+	unsigned long flags = 0;
 
 	ehca_gen_dbg("opcode=%lx " HCALL7_REGS_FORMAT,
 		     opcode, arg1, arg2, arg3, arg4, arg5, arg6, arg7);
 
-	/* lock H_FREE_RESOURCE(MR) against itself and H_ALLOC_RESOURCE(MR) */
-	if ((opcode == H_FREE_RESOURCE) && (arg7 == 5)) {
-		arg7 = 0; /* better not upset firmware */
-		do_lock = 1;
-	}
-
 	for (i = 0; i < 5; i++) {
-		if (do_lock)
+		/* serialize hCalls to work around firmware issue */
+		if (ehca_lock_hcalls)
 			spin_lock_irqsave(&hcall_lock, flags);
 
 		ret = plpar_hcall_norets(opcode, arg1, arg2, arg3, arg4,
 					 arg5, arg6, arg7);
 
-		if (do_lock)
+		if (ehca_lock_hcalls)
 			spin_unlock_irqrestore(&hcall_lock, flags);
 
 		if (H_IS_LONG_BUSY(ret)) {
@@ -174,24 +169,22 @@ static long ehca_plpar_hcall9(unsigned long opcode,
 			      unsigned long arg9)
 {
 	long ret;
-	int i, sleep_msecs, do_lock;
+	int i, sleep_msecs;
 	unsigned long flags = 0;
 
 	ehca_gen_dbg("INPUT -- opcode=%lx " HCALL9_REGS_FORMAT, opcode,
 		     arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8, arg9);
 
-	/* lock H_ALLOC_RESOURCE(MR) against itself and H_FREE_RESOURCE(MR) */
-	do_lock = ((opcode == H_ALLOC_RESOURCE) && (arg2 == 5));
-
 	for (i = 0; i < 5; i++) {
-		if (do_lock)
+		/* serialize hCalls to work around firmware issue */
+		if (ehca_lock_hcalls)
 			spin_lock_irqsave(&hcall_lock, flags);
 
 		ret = plpar_hcall9(opcode, outs,
 				   arg1, arg2, arg3, arg4, arg5,
 				   arg6, arg7, arg8, arg9);
 
-		if (do_lock)
+		if (ehca_lock_hcalls)
 			spin_unlock_irqrestore(&hcall_lock, flags);
 
 		if (H_IS_LONG_BUSY(ret)) {
@@ -821,7 +814,7 @@ u64 hipz_h_free_resource_mr(const struct ipz_adapter_handle adapter_handle,
 	return ehca_plpar_hcall_norets(H_FREE_RESOURCE,
 				       adapter_handle.handle,    /* r4 */
 				       mr->ipz_mr_handle.handle, /* r5 */
-				       0, 0, 0, 0, 5);
+				       0, 0, 0, 0, 0);
 }
 
 u64 hipz_h_reregister_pmr(const struct ipz_adapter_handle adapter_handle,
diff --git a/drivers/infiniband/hw/ehca/hipz_hw.h b/drivers/infiniband/hw/ehca/hipz_hw.h
index 485b840..bf996c7 100644
--- a/drivers/infiniband/hw/ehca/hipz_hw.h
+++ b/drivers/infiniband/hw/ehca/hipz_hw.h
@@ -378,6 +378,7 @@ struct hipz_query_hca {
 #define HCA_CAP_UD_LL_QP              EHCA_BMASK_IBM(16, 16)
 #define HCA_CAP_RESIZE_MR             EHCA_BMASK_IBM(17, 17)
 #define HCA_CAP_MINI_QP               EHCA_BMASK_IBM(18, 18)
+#define HCA_CAP_H_ALLOC_RES_SYNC      EHCA_BMASK_IBM(19, 19)
 
 /* query port response block */
 struct hipz_query_port {


From rdreier at cisco.com  Thu Dec 13 09:39:48 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 13 Dec 2007 09:39:48 -0800
Subject: [ofa-general] Re: [PATCH] IB/ehca: Fix lock flag location,
	bump version number
In-Reply-To: <200712131335.58136.fenkes@de.ibm.com> (Joachim Fenkes's message
	of "Thu, 13 Dec 2007 13:35:57 +0100")
References: <200712131335.58136.fenkes@de.ibm.com>
Message-ID: <adamyse7au3.fsf@cisco.com>

applied, thanks


From chu11 at llnl.gov  Thu Dec 13 10:42:02 2007
From: chu11 at llnl.gov (Al Chu)
Date: Thu, 13 Dec 2007 10:42:02 -0800
Subject: [ofa-general] Re: [PATCH 2/3] OpenSM: Fix incorrect reporting of
	routing engine/algorithm used
In-Reply-To: <20071213115123.GB23319@sashak.voltaire.com>
References: <1197075342.29314.133.camel@cardanus.llnl.gov>
	<20071213115123.GB23319@sashak.voltaire.com>
Message-ID: <1197571322.29314.242.camel@cardanus.llnl.gov>

Hey Sasha,

Various responses inlined below.

On Thu, 2007-12-13 at 11:51 +0000, Sasha Khapyorsky wrote:
> Al, Hi again,
> 
> On 16:55 Fri 07 Dec     , Al Chu wrote:
> > Hey Sasha,
> > 
> > I noticed that when a routing algorithm failed and defaulted back to
> > 'minhop', the logs and the console did not report this change.  This is
> > because most of that code outputs the routing algorithm name that was
> > stored during configuration/setup.  The name isn't adjusted depending on
> > the routing algorithm's success/failure.
> > 
> > There are several ways this could be fixed.  I decided the easiest was to
> > stick a new routed_name field + lock into struct osm_routing_engine, and
> > set/use this new field respectively.
> 
> Do we need special lock for this?

For this particular implementation, I added it to protect at least the
thread in the osm console.  The osm console does:

fprintf(out, "   Routing Engine     : %s\n",
        p_osm->routing_engine.name ? 
        p_osm->routing_engine.name : "null (minhop)");

So we can race while 'name' is being changed from NULL to non-NULL
or vice versa.

> > Note that within osm_ucast_mgr_process(), there is a slight logic change
> > from what was there before.  If the routing engine's call to
> > build_lid_matrices() failed, I've changed the logic to not call the
> > routing engine's ucast_build_fwd_tables() function.  This felt like the
> > correct logic and seems to be fine given all the routing algorithms in
> > OpenSM.  PLMK if there is some behavior subtlety I missed.
> 
> I think it will break the "file" routing engine, where lid matrices and/or
> LFTs can be loaded from dump files. With this "engine" a user can
> decide to load only LFTs from a file and use calculated lid matrices
> (actually this is the most common usage); in that case only the LFT dump
> file is provided and the lid matrix creator legitimately falls back.

I can revert this logic change then.
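
To make the reverted behaviour concrete, here is a rough standalone sketch of
the pre-patch control flow, where each step falls back on its own. Everything
in it (struct routing_engine, the default_* helpers, the file_* callbacks) is
a hypothetical stand-in for illustration, not the real OpenSM code:

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct osm_routing_engine. */
struct routing_engine {
	const char *name;
	int (*build_lid_matrices)(void *context);
	int (*ucast_build_fwd_tables)(void *context);
	void *context;
};

/* Counters standing in for the default (minhop) code paths. */
static int default_lid_matrices_calls;
static int default_fwd_tables_calls;

static void default_build_lid_matrices(void) { default_lid_matrices_calls++; }
static void default_build_fwd_tables(void) { default_fwd_tables_calls++; }

/*
 * Each step falls back independently: a failing (or missing)
 * build_lid_matrices() does not stop ucast_build_fwd_tables()
 * from being tried.
 */
static void ucast_process(struct routing_engine *eng)
{
	if (!eng->build_lid_matrices ||
	    eng->build_lid_matrices(eng->context) != 0)
		default_build_lid_matrices();

	if (!eng->ucast_build_fwd_tables ||
	    eng->ucast_build_fwd_tables(eng->context) != 0)
		default_build_fwd_tables();
}

/* A "file"-like engine that was given an LFT dump but no lid matrix dump. */
static int file_lid_matrices(void *context)
{
	(void)context;
	return -1;		/* nothing to load, ask for the fallback */
}

static int file_fwd_tables(void *context)
{
	(*(int *)context)++;	/* pretend the LFTs were loaded from the file */
	return 0;
}
```

With a file-like engine set up this way, the lid matrices come from the
default code while the forwarding tables still come from the engine, which is
the case Sasha describes.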

> > 
> > Thanks,
> > Al
> > -- 
> > Albert Chu
> > chu11 at llnl.gov
> > 925-422-5311
> > Computer Scientist
> > High Performance Systems Division
> > Lawrence Livermore National Laboratory
> 
> > From d83332b4c6cb1cdb0fc960808f9fa53615f61201 Mon Sep 17 00:00:00 2001
> > From: Albert L. Chu <chu11 at llnl.gov>
> > Date: Fri, 7 Dec 2007 13:44:04 -0800
> > Subject: [PATCH] fix incorrect reporting of routing engine
> > 
> > 
> > Signed-off-by: Albert L. Chu <chu11 at llnl.gov>
> > ---
> >  opensm/include/opensm/osm_opensm.h |    9 +++++++
> >  opensm/opensm/osm_console.c        |    6 +++-
> >  opensm/opensm/osm_opensm.c         |    6 +++++
> >  opensm/opensm/osm_ucast_mgr.c      |   43 ++++++++++++++++++++++++++----------
> >  4 files changed, 50 insertions(+), 14 deletions(-)
> > 
> > diff --git a/opensm/include/opensm/osm_opensm.h b/opensm/include/opensm/osm_opensm.h
> > index 1b5edb8..3fc70fd 100644
> > --- a/opensm/include/opensm/osm_opensm.h
> > +++ b/opensm/include/opensm/osm_opensm.h
> > @@ -107,6 +107,8 @@ struct osm_routing_engine {
> >  	int (*ucast_build_fwd_tables) (void *context);
> >  	void (*ucast_dump_tables) (void *context);
> >  	void (*delete) (void *context);
> > +	const char *routed_name;
> > +	cl_plock_t routed_name_lock;
> >  };
> >  /*
> >  * FIELDS
> > @@ -129,6 +131,13 @@ struct osm_routing_engine {
> >  *	delete
> >  *		The delete method, may be used for routing engine
> >  *		internals cleanup.
> > +*
> > +*	routed_name
> > +*		The routing engine name used for routing (for example,
> > +*		the specified one failed and we used the default)
> > +*
> > +*	routed_name_lock
> > +*		Shared lock guarding reads and writes to routed_name.
> >  */
> >  
> >  typedef struct _osm_console_t {
> > diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c
> > index f669240..d8d16db 100644
> > --- a/opensm/opensm/osm_console.c
> > +++ b/opensm/opensm/osm_console.c
> > @@ -380,9 +380,11 @@ static void print_status(osm_opensm_t * p_osm, FILE * out)
> >  			sm_state_mgr_str(p_osm->sm.state_mgr.state));
> >  		fprintf(out, "   SA State           : %s\n",
> >  			sa_state_str(p_osm->sa.state));
> > +                cl_plock_acquire(&p_osm->routing_engine.routed_name_lock);
> 
> The other variables here are not protected by any lock. Probably
> p_osm->lock should be used for all racy variables in this function
> (instead of just protecting routed_name)?
>
> >  		fprintf(out, "   Routing Engine     : %s\n",
> > -			p_osm->routing_engine.name ? p_osm->routing_engine.
> > -			name : "null (minhop)");
> > +			p_osm->routing_engine.routed_name ? 
> > +			p_osm->routing_engine.routed_name : "null (minhop)");
> 
> What about always setting routed_name (starting in opensm_init())?
> Then here and in the 3/3 patch it will not be necessary to check it for NULL.

I had considered that.  I decided against it since the common usage of
'routing_engine.name' was NULL means min-hop and non-NULL means
something else.  So I wanted to keep the same usage pattern.

Another idea I had was to create an enumeration to list all of the
possible routing algorithms that could be chosen, something like:

typedef enum {
   OSM_ROUTING_NOT_ROUTED = 0,
   OSM_ROUTING_MINHOP = 1,
   OSM_ROUTING_UPDN = 2,
   ... etc. ...
} osm_routing_algorithm_t;

then put an osm_routing_algorithm_t variable into osm_opensm_t to
signify what routing algorithm was last used to route the subnet.

Since no pointers are used, this method would eliminate the need to
check for a non-null pointer.  The p_osm->lock pointer could be used as
needed rather than creating a new lock.  It would remove a number of
string comparisons used throughout the code.  Also, this method would be
useful when routing engine chaining is developed, since there will
need to be something to indicate which routing algorithm in the chain was
used.  All we would have to add is the enumeration plus several
enum2string and string2enum functions.

For some reason, I believed that this would have been more invasive than
the current patch.  But thinking about it now, it seems like it would be
less invasive.  Would this be a preferred fix to the problem?
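
As a rough sketch of what the enumeration plus the enum2string/string2enum
helpers might look like: the names routing_enum2string, routing_string2enum,
the OSM_ROUTING_COUNT sentinel and the name strings are made up for this
example, and only the three enum values above are covered.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
	OSM_ROUTING_NOT_ROUTED = 0,
	OSM_ROUTING_MINHOP = 1,
	OSM_ROUTING_UPDN = 2,
	OSM_ROUTING_COUNT	/* hypothetical sentinel, number of entries */
} osm_routing_algorithm_t;

/* Name table indexed by enum value; the strings are assumptions. */
static const char *const routing_names[OSM_ROUTING_COUNT] = {
	[OSM_ROUTING_NOT_ROUTED] = "not routed",
	[OSM_ROUTING_MINHOP] = "minhop",
	[OSM_ROUTING_UPDN] = "updn",
};

/* Never returns NULL, so callers can print the result directly. */
const char *routing_enum2string(osm_routing_algorithm_t alg)
{
	if ((unsigned int)alg >= (unsigned int)OSM_ROUTING_COUNT)
		return "unknown";
	return routing_names[alg];
}

/*
 * A NULL name maps to minhop, matching the existing convention that
 * a NULL routing_engine.name means min-hop.  Unknown names map to
 * OSM_ROUTING_NOT_ROUTED here; that choice is arbitrary.
 */
osm_routing_algorithm_t routing_string2enum(const char *name)
{
	size_t i;

	if (!name)
		return OSM_ROUTING_MINHOP;
	for (i = 0; i < OSM_ROUTING_COUNT; i++)
		if (strcmp(name, routing_names[i]) == 0)
			return (osm_routing_algorithm_t)i;
	return OSM_ROUTING_NOT_ROUTED;
}
```

The console and log code could then print routing_enum2string(...) without
any NULL check.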

> > +                cl_plock_release(&p_osm->routing_engine.routed_name_lock);
> >  #ifdef ENABLE_OSM_PERF_MGR
> >  		fprintf(out, "\n   PerfMgr state/sweep state : %s/%s\n",
> >  			osm_perfmgr_get_state_str(&(p_osm->perfmgr)),
> > diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
> > index 26b9969..b55a638 100644
> > --- a/opensm/opensm/osm_opensm.c
> > +++ b/opensm/opensm/osm_opensm.c
> > @@ -188,6 +188,8 @@ void osm_opensm_destroy(IN osm_opensm_t * const p_osm)
> >  
> >  	cl_plock_destroy(&p_osm->lock);
> >  
> > +	cl_plock_destroy(&p_osm->routing_engine.routed_name_lock);
> > +
> >  	osm_log_destroy(&p_osm->log);
> >  }
> >  
> > @@ -224,6 +226,10 @@ osm_opensm_init(IN osm_opensm_t * const p_osm,
> >  	if (status != IB_SUCCESS)
> >  		goto Exit;
> >  
> > +	status = cl_plock_init(&p_osm->routing_engine.routed_name_lock);
> > +	if (status != IB_SUCCESS)
> > +		goto Exit;
> > +
> >  	if (p_opt->single_thread) {
> >  		osm_log(&p_osm->log, OSM_LOG_INFO,
> >  			"osm_opensm_init: Forcing single threaded dispatcher\n");
> > diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
> > index 8bb4739..fe15666 100644
> > --- a/opensm/opensm/osm_ucast_mgr.c
> > +++ b/opensm/opensm/osm_ucast_mgr.c
> > @@ -769,6 +769,9 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
> >  	struct osm_routing_engine *p_routing_eng;
> >  	osm_signal_t signal = OSM_SIGNAL_DONE;
> >  	cl_qmap_t *p_sw_guid_tbl;
> > +        const char *routed_name = NULL;
> > +        int blm = 0;
> > +        int ubft = 0;
> >  
> >  	OSM_LOG_ENTER(p_mgr->p_log, osm_ucast_mgr_process);
> >  
> > @@ -789,23 +792,39 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
> >  
> >  	p_mgr->any_change = FALSE;
> >  
> > -	if (!p_routing_eng->build_lid_matrices ||
> > -	    p_routing_eng->build_lid_matrices(p_routing_eng->context) != 0)
> > -		osm_ucast_mgr_build_lid_matrices(p_mgr);
> > -
> > -	osm_log(p_mgr->p_log, OSM_LOG_INFO,
> > -		"osm_ucast_mgr_process: "
> > -		"%s tables configured on all switches\n",
> > -		p_routing_eng->name ? p_routing_eng->name : "null (minhop)");
> > +	if (p_routing_eng->build_lid_matrices) {
> > +            blm = p_routing_eng->build_lid_matrices(p_routing_eng->context);
> > +            if (blm)
> > +                osm_ucast_mgr_build_lid_matrices(p_mgr);
> > +        }
> > +        else
> > +            osm_ucast_mgr_build_lid_matrices(p_mgr);
> >  
> >  	/*
> >  	   Now that the lid matrices have been built, we can
> >  	   build and download the switch forwarding tables.
> >  	 */
> > -	if (!p_routing_eng->ucast_build_fwd_tables ||
> > -	    p_routing_eng->ucast_build_fwd_tables(p_routing_eng->context))
> > -		cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
> > -				   p_mgr);
> > +	if (!blm && p_routing_eng->ucast_build_fwd_tables) {
> > +            ubft = p_routing_eng->ucast_build_fwd_tables(p_routing_eng->context);
> > +            if (ubft)
> > +                cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
> > +                                   p_mgr);
> > +        }
> > +        else
> > +            cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
> > +                               p_mgr);
> > +
> > +        if (!blm && !ubft)
> > +            routed_name = p_routing_eng->name;
> > +
> > +        CL_PLOCK_EXCL_ACQUIRE(&p_routing_eng->routed_name_lock);
> > +        p_routing_eng->routed_name = routed_name;
> > +        CL_PLOCK_RELEASE(&p_routing_eng->routed_name_lock);
> 
> Basically this section is already protected by p_mgr->p_lock (which is
> reference to &osm.lock).

Ahh, I wasn't aware it pointed to the same lock.

Thanks,
Al

> Sasha
> 
> > +
> > +	osm_log(p_mgr->p_log, OSM_LOG_INFO,
> > +		"osm_ucast_mgr_process: "
> > +		"%s tables configured on all switches\n",
> > +		routed_name ? routed_name : "null (minhop)");
> >  
> >  	if (p_mgr->any_change) {
> >  		signal = OSM_SIGNAL_DONE_PENDING;
> > -- 
> > 1.5.1
> > 
-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory


From sashak at voltaire.com  Thu Dec 13 11:23:09 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 13 Dec 2007 19:23:09 +0000
Subject: [ofa-general] Re: [PATCH 2/3] OpenSM: Fix incorrect reporting of
	routing engine/algorithm used
In-Reply-To: <1197571322.29314.242.camel@cardanus.llnl.gov>
References: <1197075342.29314.133.camel@cardanus.llnl.gov>
	<20071213115123.GB23319@sashak.voltaire.com>
	<1197571322.29314.242.camel@cardanus.llnl.gov>
Message-ID: <20071213192309.GK708@sashak.voltaire.com>

Hi Al,

On 10:42 Thu 13 Dec     , Al Chu wrote:
> > > 
> > > There are several ways this could be fixed.  I decided the easiest was to
> > > stick a new routed_name field + lock into struct osm_routing_engine, and
> > > set/use this new field respectively.
> > 
> > Do we need special lock for this?
> 
> For this particular implementation, I added it to protect at least the
> thread in the osm console.  The osm console does:
> 
> fprintf(out, "   Routing Engine     : %s\n",
>         p_osm->routing_engine.name ? 
>         p_osm->routing_engine.name : "null (minhop)");
> 
> So we can race while 'name' is being changed from NULL to non-NULL
> or vice versa.

Sure. I thought about using osm.lock instead or, even better, some sort of
lockless operation where the NULL comparison is not needed.
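
One possible shape for such a lockless variant, as a sketch only: it assumes
C11 atomics are available and that every name is a string with static
lifetime. The helpers set_routed_name and get_routed_name are invented for
this example and are not existing OpenSM functions.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Always points at a valid string, so readers never see NULL. */
static _Atomic(const char *) routed_name = "null (minhop)";

/* Writer side: map NULL (the min-hop convention) to a printable name. */
void set_routed_name(const char *name)
{
	atomic_store(&routed_name, name ? name : "null (minhop)");
}

/*
 * Reader side (e.g. a console thread): a single atomic load, no lock
 * and no NULL comparison.  Safe only while the strings stored here
 * are never freed.
 */
const char *get_routed_name(void)
{
	return atomic_load(&routed_name);
}
```

The reader takes one snapshot of the pointer and uses it, which avoids the
"test, then read again" pattern in the fprintf() call quoted above.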

> Another idea I had was to create an enumeration to list all of the
> possible routing algorithms that could be chosen, something like:
> 
> typedef enum {
>    OSM_ROUTING_NOT_ROUTED = 0,
>    OSM_ROUTING_MINHOP = 1,
>    OSM_ROUTING_UPDN = 2,
>    ... etc. ...
> } osm_routing_algorithm_t;
> 
> then put an osm_routing_algorithm_t variable into osm_opensm_t to
> signify what routing algorithm was last used to route the subnet.
> 
> Since no pointers are used, this method would eliminate the need to
> check for a non-null pointer.  The p_osm->lock pointer could be used as
> needed rather than creating a new lock.  It would remove a number of
> string comparisons used throughout the code.

That is great. :)

> Also, this method would be
> useful when routing engine chaining is developed, since there will
> need to be something to indicate which routing algorithm in the chain was
> used.  All we would have to add is the enumeration plus several
> enum2string and string2enum functions.
> 
> For some reason, I believed that this would have been more invasive than
> the current patch.  But thinking about it now, it seems like it would be
> less invasive.  Would this be a preferred fix to the problem?

I agree with you, it looks like a better solution.

> > > +
> > > +        CL_PLOCK_EXCL_ACQUIRE(&p_routing_eng->routed_name_lock);
> > > +        p_routing_eng->routed_name = routed_name;
> > > +        CL_PLOCK_RELEASE(&p_routing_eng->routed_name_lock);
> > 
> > Basically this section is already protected by p_mgr->p_lock (which is
> > reference to &osm.lock).
> 
> Ahh, I wasn't aware it pointed to the same lock.

Right, it is not always easy to figure out - there is a huge pointer
duplication mess over various sub-structures in OpenSM. I think we will
need to clean it up. Hopefully after this OFED...

Sasha


From caitlin.bestler at neterion.com  Thu Dec 13 11:22:49 2007
From: caitlin.bestler at neterion.com (Caitlin Bestler)
Date: Thu, 13 Dec 2007 11:22:49 -0800
Subject: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize
	HCA-related hCalls on POWER5
In-Reply-To: <4760EDC0.5020703@voltaire.com>
References: <OFD9564F75.44193623-ONC12573AE.002EA542-C12573AE.002F5FBC@de.ibm.com>
	<475FD0A1.3090304@voltaire.com> <adabq8v91co.fsf@cisco.com>
	<4760EDC0.5020703@voltaire.com>
Message-ID: <469958e00712131122s661dd970ud359389e1c6637d4@mail.gmail.com>

On Dec 13, 2007 12:30 AM, Or Gerlitz <ogerlitz at voltaire.com> wrote:
> Roland Dreier wrote:
> > I think the right fix for iSER would be to make iSER work even for
> > devices that don't support FMRs.  For example cxgb3 doesn't implement
> > FMRs so if anyone ever updates iSER to work on iWARP and not just IB,
> > then this is something that has to be tackled anyway.  Then ehca could
> > just get rid of the FMR support it has.
>
> OK, the iSER design took into account the case of many initiators
> running on strong/modern machines talking to a possibly lightweight
> embedded target, for which the processing cost per I/O at the target side
> should be minimized; that is, at most --one-- RDMA operation should be
> issued by the target to serve an I/O request.
>
> To that end, iSER works with one descriptor (called stag in iWARP and
> rkey in IB) per I/O direction sent from the initiator to the target and
> hence can't work without some sort of FMR implementation.
>
> The current implementation of the open iscsi initiator makes sure to
> issue commands in thread (sleepable) context, see iscsi_xmitworker and
> references to it in drivers/scsi/libiscsi.c , so this keeps ehca users
> safe for the time being.
>
> Or.
>

I agree, *some* form of FMR support is important for iSER (and probably
for NFS over RDMA as well). Rather than adding a crippled NO FMR
mode it would make more sense to add support for FMR Work Requests.
I'm not certain what, if any, impact that would have on the Power5 problem,
but that's certainly a cleaner path for iWARP.


From changquing.tang at hp.com  Thu Dec 13 11:49:11 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Thu, 13 Dec 2007 19:49:11 +0000
Subject: [ofa-general] OFED 1.3-rc1 release is available
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C902E2D65D@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C902E2D65D@mtlexch01.mtl.com>
Message-ID: <D89C2C212795564B837FA1665CAE02990FDE1B42C3@G5W0278.americas.hpqcorp.net>


Hi,

When can you fix the backward compatibility issue with OFED 1.2?  Thanks.


--CQ

________________________________
From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Tziporet Koren
Sent: Thursday, December 13, 2007 10:40 AM
To: ewg at lists.openfabrics.org
Cc: general at lists.openfabrics.org
Subject: [ofa-general] OFED 1.3-rc1 release is available


Hi,

OFED 1.3 RC1 release is available on
http://www.openfabrics.org/downloads/OFED/ofed-1.3/OFED-1.3-rc1.tgz
To get BUILD_ID run ofed_info

Please report any issues in bugzilla https://bugs.openfabrics.org/

The RC2 release is expected on December 27

Tziporet & Vlad

========================================================================

Release information:
--------------------
OS support:
Novell:
    - SLES10
    - SLES10 SP1 and up1
Redhat:
    - Redhat EL4 up4 and up5
    - Redhat EL5 and up1
kernel.org:
    - 2.6.23 and 2.6.24-rc2

Systems:
    * x86_64
    * x86
    * ia64
    * ppc64

Main Changes from OFED 1.3-beta
===============================

 *   Fix SDP stability issues
 *   Force 32bit libraries installation on the SLES10 SP1 U1
 *   Open MPI: Enable compilation when using compilers that were not installed as RPMs.
 *   RDS: clean up handling of congested destinations vs poll
 *   RDS: Fix download issue when removing low level driver (fix was in CMA)
 *   IPoIB: Fix kernel Oops resulting from xmit following dev_down.
 *   MPI packages update:
    *   mvapich-1.0.0-1639.src.rpm
    *   openmpi-1.2.5rc1-1.src.rpm
    *   mpitests-3.0-773.src.rpm

mlx4 specific changes:

 *   Fix segmentation fault in mlx4_clear_xrc_srq.
 *   Fix max_eq's read from FW in QUERY_DEV_CAP.
 *   Post send in the kernel is now using WQE building block.
 *   Set default CQ moderation parameters for IPoIB

ehca specific changes:

 *   Fix error of sense context opts with multiple adapter
 *   Add files for older abi_versions


Tasks that should be completed for RC2 release:
===============================================
1. IPoIB performance improvements for small messages
2. Fix bugs



From FENKES at de.ibm.com  Thu Dec 13 12:59:50 2007
From: FENKES at de.ibm.com (Joachim Fenkes)
Date: Thu, 13 Dec 2007 21:59:50 +0100
Subject: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize HCA-related
	hCalls on POWER5
In-Reply-To: <469958e00712131122s661dd970ud359389e1c6637d4@mail.gmail.com>
Message-ID: <OFD5B7D5E6.3538DC2A-ONC12573B0.0072D42A-C12573B0.00735730@de.ibm.com>

caitlin.bestler at gmail.com wrote on 13.12.2007 20:22:49:

> On Dec 13, 2007 12:30 AM, Or Gerlitz <ogerlitz at voltaire.com> wrote:
> > The current implementation of the open iscsi initiator makes sure to
> > issue commands in thread (sleepable) context, see iscsi_xmitworker and
> > references to it in drivers/scsi/libiscsi.c , so this keeps ehca users
> > safe for the time being.

> I agree, *some* form of FMR support is important for iSER (and probably
> for NFS over RDMA as well). Rather than adding a crippled NO FMR
> mode it would make more sense to add support for FMR Work Requests.
> I'm not certain what, if any, impact that would have on the Power5 problem,
> but that's certainly a cleaner path for iWARP.

Well, FMR WRs wouldn't change the eHCA issue -- the driver would have to 
make an hCall in any case, and the architecture says that the hCalls used 
in this scenario might return H_LONG_BUSY, causing the driver to sleep. No 
way around that. Because of this, eHCA's FMRs are actually standard MRs 
with a different API.
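
As a rough sketch of why the caller has to be in sleepable context, the
retry pattern described above might look like the following. Everything in
it is hypothetical (the status codes, hcall_retry_sleeping, the demo stubs);
it is not the ehca driver's code, and the real hCall return values and sleep
intervals differ.

```c
/* Hypothetical status codes; the real hCall return values differ. */
enum hcall_status {
	HCALL_SUCCESS,
	HCALL_LONG_BUSY,
	HCALL_ERROR
};

typedef enum hcall_status (*hcall_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned int msecs);

/*
 * Issue a call and retry while it reports "long busy", sleeping between
 * attempts. Returns 0 on success, -1 on a hard error, and -2 if all
 * max_tries attempts were busy.
 */
int hcall_retry_sleeping(hcall_fn call, void *ctx, sleep_fn do_sleep,
			 unsigned int max_tries)
{
	unsigned int i;

	for (i = 0; i < max_tries; i++) {
		enum hcall_status st = call(ctx);

		if (st == HCALL_SUCCESS)
			return 0;
		if (st != HCALL_LONG_BUSY)
			return -1;
		do_sleep(1);	/* this is why the caller must be able to sleep */
	}
	return -2;
}

/* Demo stub: reports busy until the counter behind ctx reaches zero. */
enum hcall_status demo_hcall(void *ctx)
{
	unsigned int *busy_left = ctx;

	if (*busy_left > 0) {
		(*busy_left)--;
		return HCALL_LONG_BUSY;
	}
	return HCALL_SUCCESS;
}

/* Demo stub: a real driver would actually sleep here. */
void demo_sleep(unsigned int msecs)
{
	(void)msecs;
}
```

Since the sleep sits inside the registration path, it applies whichever
API (FMR-style or standard MR) the consumer used to get there.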

If, as Or said, the iSCSI initiator issues commands in sleepable context 
anyway, nothing would be lost by using standard MRs as a fallback solution 
if FMRs aren't available, would it?

J.


From Caitlin.Bestler at neterion.com  Thu Dec 13 13:08:34 2007
From: Caitlin.Bestler at neterion.com (Caitlin Bestler)
Date: Thu, 13 Dec 2007 16:08:34 -0500
Subject: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize
	HCA-related hCalls on POWER5
In-Reply-To: <OFD5B7D5E6.3538DC2A-ONC12573B0.0072D42A-C12573B0.00735730@de.ibm.com>
References: <469958e00712131122s661dd970ud359389e1c6637d4@mail.gmail.com>
	<OFD5B7D5E6.3538DC2A-ONC12573B0.0072D42A-C12573B0.00735730@de.ibm.com>
Message-ID: <78C9135A3D2ECE4B8162EBDCE82CAD7702B14004@nekter>


> -----Original Message-----
> From: Joachim Fenkes [mailto:FENKES at de.ibm.com]
> Sent: Thursday, December 13, 2007 1:00 PM
> To: Caitlin Bestler
> Cc: Arnd Bergmann; caitlin.bestler at gmail.com; OF-General; LKML;
> linuxppc-dev at ozlabs.org; Or Gerlitz; Roland Dreier; Stefan Roscher
> Subject: Re: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize
> HCA-related hCalls on POWER5
> 
> caitlin.bestler at gmail.com wrote on 13.12.2007 20:22:49:
> 
> > On Dec 13, 2007 12:30 AM, Or Gerlitz <ogerlitz at voltaire.com> wrote:
> > > The current implementation of the open iscsi initiator makes sure to
> > > issue commands in thread (sleepable) context, see iscsi_xmitworker and
> > > references to it in drivers/scsi/libiscsi.c , so this keeps ehca users
> > > safe for the time being.
> 
> > I agree, *some* form of FMR support is important for iSER (and probably
> > for NFS over RDMA as well). Rather than adding a crippled NO FMR
> > mode it would make more sense to add support for FMR Work Requests.
> > I'm not certain what, if any, impact that would have on the Power5 problem,
> > but that's certainly a cleaner path for iWARP.
> 
> Well, FMR WRs wouldn't change the eHCA issue -- the driver would have to
> make an hCall in any case, and the architecture says that the hCalls used
> in this scenario might return H_LONG_BUSY, causing the driver to sleep. No
> way around that. Because of this, eHCA's FMRs are actually standard MRs
> with a different API.
> 
> If, as Or said, the iSCSI initiator issues commands in sleepable context
> anyway, nothing would be lost by using standard MRs as a fallback solution
> if FMRs aren't available, would it?
> 

To clarify, an FMR Work Request is simply posted to the SendQ like
any other Work Request (of course the QP has to be privileged, or
it will complete in error). An SQ Post should never block.

But yes, if the current iSCSI initiator always does all call-based
FMRs in a sleepable context, then I would agree that any changes can
wait for the first vendor that wants to support FMR Work Requests.

FMR Work Requests can be pipelined, so anyone with hardware that
supported them would have strong motivation to enable the open
iSCSI initiator to take advantage of this.


From FENKES at de.ibm.com  Thu Dec 13 13:35:52 2007
From: FENKES at de.ibm.com (Joachim Fenkes)
Date: Thu, 13 Dec 2007 22:35:52 +0100
Subject: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize HCA-related
	hCalls on POWER5
In-Reply-To: <78C9135A3D2ECE4B8162EBDCE82CAD7702B14004@nekter>
Message-ID: <OFD36374CC.9307FC29-ONC12573B0.00763BCF-C12573B0.0076A3D3@de.ibm.com>

"Caitlin Bestler" <Caitlin.Bestler at neterion.com> wrote on 13.12.2007 
22:08:34:

> To clarify, an FMR Work Request is simply posted to the SendQ like
> any other Work Request (of course the QP has to be privileged, or
> it will complete in error). An SQ Post should never block.

This would require hardware support, wouldn't it? eHCA2 doesn't have this 
kind of support, so FMR WRs are not an option here.

J.


From hrosenstock at xsigo.com  Thu Dec 13 13:37:51 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 13 Dec 2007 13:37:51 -0800
Subject: [ofa-general] Re: [PATCH] opensm: don't zero base LID when invalid
	value is received
In-Reply-To: <20071213012552.GZ23319@sashak.voltaire.com>
References: <20071213012552.GZ23319@sashak.voltaire.com>
Message-ID: <1197581871.23465.107.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-12-13 at 01:25 +0000, Sasha Khapyorsky wrote:
> This addresses bug 246 (https://bugs.openfabrics.org/show_bug.cgi?id=246):
> zero lid received from opensm in set port_info smp.
> 
> When an invalid LID value (0xffff in this case) is received,
> OpenSM currently clears it to zero. Instead, this patch tries to recover
> using the current LID value stored in osm_physp_t.port_info.

This looks like a step in the right direction, as that LID could be
valid. It might also be 0 or 0xffff, right? In that case, it ends up
being the same until the SMA is updated to be 1.2.1 conformant.

-- Hal

> Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> ---
>  opensm/include/opensm/osm_port.h  |   39 -----------------------------------
>  opensm/opensm/osm_port_info_rcv.c |   41 ++++++++++++++++++++----------------
>  2 files changed, 23 insertions(+), 57 deletions(-)
> 
> diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h
> index fcb0a16..bba4e44 100644
> --- a/opensm/include/opensm/osm_port.h
> +++ b/opensm/include/opensm/osm_port.h
> @@ -459,45 +459,6 @@ osm_physp_set_port_info(IN osm_physp_t * const p_physp,
>  *	Port, Physical Port
>  *********/
>  
> -/****f* OpenSM: Physical Port/osm_physp_trim_base_lid_to_valid_range
> -* NAME
> -*  osm_physp_trim_base_lid_to_valid_range
> -*
> -* DESCRIPTION
> -*  Validates the base LID in the Physical Port object
> -*  and resets it if the base LID is invalid.
> -*
> -* SYNOPSIS
> -*/
> -static inline ib_net16_t
> -osm_physp_trim_base_lid_to_valid_range(IN osm_physp_t * const p_physp)
> -{
> -	ib_net16_t orig_lid = 0;
> -
> -	CL_ASSERT(osm_physp_is_valid(p_physp));
> -	if ((cl_ntoh16(p_physp->port_info.base_lid) > IB_LID_UCAST_END_HO) ||
> -	    (cl_ntoh16(p_physp->port_info.base_lid) < IB_LID_UCAST_START_HO)) {
> -		orig_lid = p_physp->port_info.base_lid;
> -		p_physp->port_info.base_lid = 0;
> -	}
> -	return orig_lid;
> -}
> -
> -/*
> -* PARAMETERS
> -*	p_physp
> -*		[in] Pointer to an osm_physp_t object.
> -*
> -* RETURN VALUES
> -*	Returns 0 if the base LID in the Physical port object is valid.
> -*	Returns original invalid LID otherwise.
> -*
> -* NOTES
> -*
> -* SEE ALSO
> -*	Port, Physical Port
> -*********/
> -
>  /****f* OpenSM: Physical Port/osm_physp_set_pkey_tbl
>  * NAME
>  *  osm_physp_set_pkey_tbl
> diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
> index 9ea8738..ea0cb21 100644
> --- a/opensm/opensm/osm_port_info_rcv.c
> +++ b/opensm/opensm/osm_port_info_rcv.c
> @@ -98,6 +98,22 @@ __osm_pi_rcv_set_sm(IN const osm_pi_rcv_t * const p_rcv,
>  
>  /**********************************************************************
>   **********************************************************************/
> +static void pi_rcv_check_and_fix_lid(osm_log_t *log, ib_port_info_t * const pi,
> +				     osm_physp_t * p)
> +{
> +	if ((cl_ntoh16(pi->base_lid) > IB_LID_UCAST_END_HO) ||
> +	    (cl_ntoh16(pi->base_lid) < IB_LID_UCAST_START_HO)) {
> +		osm_log(log, OSM_LOG_ERROR,
> +			"pi_rcv_check_and_fix_lid: ERR 0F04: "
> +			"Got invalid base LID 0x%x from the network. "
> +			"Corrected to 0x%x.\n", cl_ntoh16(pi->base_lid),
> +			cl_ntoh16(p->port_info.base_lid));
> +		pi->base_lid = p->port_info.base_lid;
> +	}
> +}
> +
> +/**********************************************************************
> + **********************************************************************/
>  static void
>  __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv,
>  			     IN osm_physp_t * const p_physp,
> @@ -204,13 +220,12 @@ static void
>  __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv,
>  				 IN osm_node_t * const p_node,
>  				 IN osm_physp_t * const p_physp,
> -				 IN const ib_port_info_t * const p_pi)
> +				 IN ib_port_info_t * const p_pi)
>  {
>  	ib_api_status_t status = IB_SUCCESS;
>  	osm_madw_context_t context;
>  	osm_physp_t *p_remote_physp;
>  	osm_node_t *p_remote_node;
> -	ib_net16_t orig_lid;
>  	uint8_t port_num;
>  	uint8_t remote_port_num;
>  	osm_dr_path_t path;
> @@ -316,19 +331,15 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv,
>  	if (ib_port_info_get_port_state(p_pi) > IB_LINK_INIT && p_node->sw)
>  		p_node->sw->need_update = 0;
>  
> +	if (port_num == 0)
> +		pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp);
> +
>  	/*
>  	   Update the PortInfo attribute.
>  	 */
>  	osm_physp_set_port_info(p_physp, p_pi);
>  
>  	if (port_num == 0) {
> -		/* This is switch management port 0 */
> -		if ((orig_lid =
> -		     osm_physp_trim_base_lid_to_valid_range(p_physp)))
> -			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
> -				"__osm_pi_rcv_process_switch_port: ERR 0F04: "
> -				"Invalid base LID 0x%x corrected\n",
> -				cl_ntoh16(orig_lid));
>  		/* Determine if base switch port 0 */
>  		if (p_node->sw &&
>  		    !ib_switch_info_is_enhanced_port0(&p_node->sw->switch_info))
> @@ -346,21 +357,15 @@ static void
>  __osm_pi_rcv_process_ca_or_router_port(IN const osm_pi_rcv_t * const p_rcv,
>  				       IN osm_node_t * const p_node,
>  				       IN osm_physp_t * const p_physp,
> -				       IN const ib_port_info_t * const p_pi)
> +				       IN ib_port_info_t * const p_pi)
>  {
> -	ib_net16_t orig_lid;
> -
>  	OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_process_ca_or_router_port);
>  
>  	UNUSED_PARAM(p_node);
>  
> -	osm_physp_set_port_info(p_physp, p_pi);
> +	pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp);
>  
> -	if ((orig_lid = osm_physp_trim_base_lid_to_valid_range(p_physp)))
> -		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
> -			"__osm_pi_rcv_process_ca_or_router_port: ERR 0F08: "
> -			"Invalid base LID 0x%x corrected\n",
> -			cl_ntoh16(orig_lid));
> +	osm_physp_set_port_info(p_physp, p_pi);
>  
>  	__osm_pi_rcv_process_endport(p_rcv, p_physp, p_pi);
>  


From sean.hefty at intel.com  Thu Dec 13 13:48:40 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Thu, 13 Dec 2007 13:48:40 -0800
Subject: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: SerializeHCA-related
	hCalls on POWER5
In-Reply-To: <78C9135A3D2ECE4B8162EBDCE82CAD7702B14004@nekter>
References: <469958e00712131122s661dd970ud359389e1c6637d4@mail.gmail.com><OFD5B7D5E6.3538DC2A-ONC12573B0.0072D42A-C12573B0.00735730@de.ibm.com>
	<78C9135A3D2ECE4B8162EBDCE82CAD7702B14004@nekter>
Message-ID: <000301c83dd1$ec49d520$9b37170a@amr.corp.intel.com>

>To clarify, an FMR Work Request is simply posted to the SendQ like
>any other Work Request (of course the QP has to be privileged, or
>it will complete in error). An SQ Post should never block.

FMRs as defined by the IB spec and those created by Mellanox are not the same.
Unfortunately, they share only the name and acronym.  Mellanox FMRs use an
API that is more like that of standard MRs.

- Sean


From sashak at voltaire.com  Thu Dec 13 14:37:04 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 13 Dec 2007 22:37:04 +0000
Subject: [ofa-general] Re: [PATCH] opensm: don't zero base LID when invalid
	value is received
In-Reply-To: <1197581871.23465.107.camel@hrosenstock-ws.xsigo.com>
References: <20071213012552.GZ23319@sashak.voltaire.com>
	<1197581871.23465.107.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071213223704.GL708@sashak.voltaire.com>

On 13:37 Thu 13 Dec     , Hal Rosenstock wrote:
> On Thu, 2007-12-13 at 01:25 +0000, Sasha Khapyorsky wrote:
> > This addresses bug 246 (https://bugs.openfabrics.org/show_bug.cgi?id=246):
> > zero lid received from opensm in set port_info smp.
> > 
> > When an invalid LID value (0xffff in this case) is received,
> > OpenSM currently clears it to zero. Instead, this patch tries to recover
> > using the current LID value stored in osm_physp_t.port_info.
> 
> This looks like a step in the right direction as that lid could be
> valid. It might also be 0 or 0xffff, right ?

Do you mean the value stored by OpenSM in the port's port_info? Assuming
so, it can be 0 when not initialized.

Sasha
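
For readers following the thread, a standalone sketch of the recovery rule
under discussion is below. It is a simplification of
pi_rcv_check_and_fix_lid from the patch, not a replacement for it: it works
on host-order values, the helper names (lid_is_valid_unicast,
choose_base_lid) are made up for this example, and the range constants are
assumed to match IB_LID_UCAST_START_HO and IB_LID_UCAST_END_HO (0x0001 and
0xBFFF).

```c
#include <stdint.h>

/* Assumed unicast LID range, host byte order. */
#define LID_UCAST_START 0x0001
#define LID_UCAST_END   0xBFFF

/* Nonzero if lid falls inside the unicast range. */
static int lid_is_valid_unicast(uint16_t lid)
{
	return lid >= LID_UCAST_START && lid <= LID_UCAST_END;
}

/*
 * Pick the base LID to keep: the received one if it is valid, otherwise
 * whatever was stored previously. If the stored value is itself 0 (not yet
 * initialized), the result is 0, which matches the old behavior.
 */
uint16_t choose_base_lid(uint16_t received, uint16_t stored)
{
	return lid_is_valid_unicast(received) ? received : stored;
}
```

The last case in the comment is the one raised above: when nothing valid
has been stored yet, the fallback cannot do better than zero.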

> In that case, it ends up
> being the same until the SMA is updated to be 1.2.1 conformant.
> 
> -- Hal
> 
> > Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> > ---
> >  opensm/include/opensm/osm_port.h  |   39 -----------------------------------
> >  opensm/opensm/osm_port_info_rcv.c |   41 ++++++++++++++++++++----------------
> >  2 files changed, 23 insertions(+), 57 deletions(-)
> > 
> > diff --git a/opensm/include/opensm/osm_port.h b/opensm/include/opensm/osm_port.h
> > index fcb0a16..bba4e44 100644
> > --- a/opensm/include/opensm/osm_port.h
> > +++ b/opensm/include/opensm/osm_port.h
> > @@ -459,45 +459,6 @@ osm_physp_set_port_info(IN osm_physp_t * const p_physp,
> >  *	Port, Physical Port
> >  *********/
> >  
> > -/****f* OpenSM: Physical Port/osm_physp_trim_base_lid_to_valid_range
> > -* NAME
> > -*  osm_physp_trim_base_lid_to_valid_range
> > -*
> > -* DESCRIPTION
> > -*  Validates the base LID in the Physical Port object
> > -*  and resets it if the base LID is invalid.
> > -*
> > -* SYNOPSIS
> > -*/
> > -static inline ib_net16_t
> > -osm_physp_trim_base_lid_to_valid_range(IN osm_physp_t * const p_physp)
> > -{
> > -	ib_net16_t orig_lid = 0;
> > -
> > -	CL_ASSERT(osm_physp_is_valid(p_physp));
> > -	if ((cl_ntoh16(p_physp->port_info.base_lid) > IB_LID_UCAST_END_HO) ||
> > -	    (cl_ntoh16(p_physp->port_info.base_lid) < IB_LID_UCAST_START_HO)) {
> > -		orig_lid = p_physp->port_info.base_lid;
> > -		p_physp->port_info.base_lid = 0;
> > -	}
> > -	return orig_lid;
> > -}
> > -
> > -/*
> > -* PARAMETERS
> > -*	p_physp
> > -*		[in] Pointer to an osm_physp_t object.
> > -*
> > -* RETURN VALUES
> > -*	Returns 0 if the base LID in the Physical port object is valid.
> > -*	Returns original invalid LID otherwise.
> > -*
> > -* NOTES
> > -*
> > -* SEE ALSO
> > -*	Port, Physical Port
> > -*********/
> > -
> >  /****f* OpenSM: Physical Port/osm_physp_set_pkey_tbl
> >  * NAME
> >  *  osm_physp_set_pkey_tbl
> > diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
> > index 9ea8738..ea0cb21 100644
> > --- a/opensm/opensm/osm_port_info_rcv.c
> > +++ b/opensm/opensm/osm_port_info_rcv.c
> > @@ -98,6 +98,22 @@ __osm_pi_rcv_set_sm(IN const osm_pi_rcv_t * const p_rcv,
> >  
> >  /**********************************************************************
> >   **********************************************************************/
> > +static void pi_rcv_check_and_fix_lid(osm_log_t *log, ib_port_info_t * const pi,
> > +				     osm_physp_t * p)
> > +{
> > +	if ((cl_ntoh16(pi->base_lid) > IB_LID_UCAST_END_HO) ||
> > +	    (cl_ntoh16(pi->base_lid) < IB_LID_UCAST_START_HO)) {
> > +		osm_log(log, OSM_LOG_ERROR,
> > +			"pi_rcv_check_and_fix_lid: ERR 0F04: "
> > +			"Got invalid base LID 0x%x from the network. "
> > +			"Corrected to 0x%x.\n", cl_ntoh16(pi->base_lid),
> > +			cl_ntoh16(p->port_info.base_lid));
> > +		pi->base_lid = p->port_info.base_lid;
> > +	}
> > +}
> > +
> > +/**********************************************************************
> > + **********************************************************************/
> >  static void
> >  __osm_pi_rcv_process_endport(IN const osm_pi_rcv_t * const p_rcv,
> >  			     IN osm_physp_t * const p_physp,
> > @@ -204,13 +220,12 @@ static void
> >  __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv,
> >  				 IN osm_node_t * const p_node,
> >  				 IN osm_physp_t * const p_physp,
> > -				 IN const ib_port_info_t * const p_pi)
> > +				 IN ib_port_info_t * const p_pi)
> >  {
> >  	ib_api_status_t status = IB_SUCCESS;
> >  	osm_madw_context_t context;
> >  	osm_physp_t *p_remote_physp;
> >  	osm_node_t *p_remote_node;
> > -	ib_net16_t orig_lid;
> >  	uint8_t port_num;
> >  	uint8_t remote_port_num;
> >  	osm_dr_path_t path;
> > @@ -316,19 +331,15 @@ __osm_pi_rcv_process_switch_port(IN const osm_pi_rcv_t * const p_rcv,
> >  	if (ib_port_info_get_port_state(p_pi) > IB_LINK_INIT && p_node->sw)
> >  		p_node->sw->need_update = 0;
> >  
> > +	if (port_num == 0)
> > +		pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp);
> > +
> >  	/*
> >  	   Update the PortInfo attribute.
> >  	 */
> >  	osm_physp_set_port_info(p_physp, p_pi);
> >  
> >  	if (port_num == 0) {
> > -		/* This is switch management port 0 */
> > -		if ((orig_lid =
> > -		     osm_physp_trim_base_lid_to_valid_range(p_physp)))
> > -			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
> > -				"__osm_pi_rcv_process_switch_port: ERR 0F04: "
> > -				"Invalid base LID 0x%x corrected\n",
> > -				cl_ntoh16(orig_lid));
> >  		/* Determine if base switch port 0 */
> >  		if (p_node->sw &&
> >  		    !ib_switch_info_is_enhanced_port0(&p_node->sw->switch_info))
> > @@ -346,21 +357,15 @@ static void
> >  __osm_pi_rcv_process_ca_or_router_port(IN const osm_pi_rcv_t * const p_rcv,
> >  				       IN osm_node_t * const p_node,
> >  				       IN osm_physp_t * const p_physp,
> > -				       IN const ib_port_info_t * const p_pi)
> > +				       IN ib_port_info_t * const p_pi)
> >  {
> > -	ib_net16_t orig_lid;
> > -
> >  	OSM_LOG_ENTER(p_rcv->p_log, __osm_pi_rcv_process_ca_or_router_port);
> >  
> >  	UNUSED_PARAM(p_node);
> >  
> > -	osm_physp_set_port_info(p_physp, p_pi);
> > +	pi_rcv_check_and_fix_lid(p_rcv->p_log, p_pi, p_physp);
> >  
> > -	if ((orig_lid = osm_physp_trim_base_lid_to_valid_range(p_physp)))
> > -		osm_log(p_rcv->p_log, OSM_LOG_ERROR,
> > -			"__osm_pi_rcv_process_ca_or_router_port: ERR 0F08: "
> > -			"Invalid base LID 0x%x corrected\n",
> > -			cl_ntoh16(orig_lid));
> > +	osm_physp_set_port_info(p_physp, p_pi);
> >  
> >  	__osm_pi_rcv_process_endport(p_rcv, p_physp, p_pi);
> >  


From joe at perches.com  Thu Dec 13 15:39:00 2007
From: joe at perches.com (Joe Perches)
Date: Thu, 13 Dec 2007 15:39:00 -0800
Subject: [ofa-general] [PATCH net-2.6.25 7/8] drivers/infiniband: Use
	ipv4_is_<type>
In-Reply-To: <1197589141-7020-6-git-send-email-joe@perches.com>
References: <1197589141-7020-1-git-send-email-joe@perches.com>
	<1197589141-7020-2-git-send-email-joe@perches.com>
	<1197589141-7020-3-git-send-email-joe@perches.com>
	<1197589141-7020-4-git-send-email-joe@perches.com>
	<1197589141-7020-5-git-send-email-joe@perches.com>
	<1197589141-7020-6-git-send-email-joe@perches.com>
Message-ID: <1197589141-7020-7-git-send-email-joe@perches.com>


Signed-off-by: Joe Perches <joe at perches.com>
---
 drivers/infiniband/core/addr.c |    4 ++--
 drivers/infiniband/core/cma.c  |    5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 5381c80..0802b79 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -265,11 +265,11 @@ static int addr_resolve_local(struct sockaddr_in *src_in,
 	if (!dev)
 		return -EADDRNOTAVAIL;
 
-	if (ZERONET(src_ip)) {
+	if (ipv4_is_zeronet(src_ip)) {
 		src_in->sin_family = dst_in->sin_family;
 		src_in->sin_addr.s_addr = dst_ip;
 		ret = rdma_copy_addr(addr, dev, dev->dev_addr);
-	} else if (LOOPBACK(src_ip)) {
+	} else if (ipv4_is_loopback(src_ip)) {
 		ret = rdma_translate_ip((struct sockaddr *)dst_in, addr);
 		if (!ret)
 			memcpy(addr->dst_dev_addr, dev->dev_addr, MAX_ADDR_LEN);
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 0751697..b37045c 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -624,7 +624,8 @@ static inline int cma_zero_addr(struct sockaddr *addr)
 	struct in6_addr *ip6;
 
 	if (addr->sa_family == AF_INET)
-		return ZERONET(((struct sockaddr_in *) addr)->sin_addr.s_addr);
+		return ipv4_is_zeronet(
+			((struct sockaddr_in *)addr)->sin_addr.s_addr);
 	else {
 		ip6 = &((struct sockaddr_in6 *) addr)->sin6_addr;
 		return (ip6->s6_addr32[0] | ip6->s6_addr32[1] |
@@ -634,7 +635,7 @@ static inline int cma_zero_addr(struct sockaddr *addr)
 
 static inline int cma_loopback_addr(struct sockaddr *addr)
 {
-	return LOOPBACK(((struct sockaddr_in *) addr)->sin_addr.s_addr);
+	return ipv4_is_loopback(((struct sockaddr_in *) addr)->sin_addr.s_addr);
 }
 
 static inline int cma_any_addr(struct sockaddr *addr)
-- 
1.5.3.7.949.g2221a6
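
For reference, the checks behind the two helpers used in this patch can be
approximated in userspace as below. This is only a sketch: the sketch_*
names are invented here so they are not mistaken for the kernel's inline
functions, and plain uint32_t stands in for __be32. The address argument is
expected in network byte order, as with sin_addr.s_addr.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* True for any address in 127.0.0.0/8; addr is in network byte order. */
int sketch_ipv4_is_loopback(uint32_t addr)
{
	return (addr & htonl(0xff000000)) == htonl(0x7f000000);
}

/* True for any address in 0.0.0.0/8; addr is in network byte order. */
int sketch_ipv4_is_zeronet(uint32_t addr)
{
	return (addr & htonl(0xff000000)) == htonl(0x00000000);
}
```

Both tests look only at the top octet, so 0.0.0.1 counts as zeronet and
127.1.2.3 counts as loopback.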


From ssufficool at rov.sbcounty.gov  Thu Dec 13 15:46:26 2007
From: ssufficool at rov.sbcounty.gov (Sufficool, Stanley)
Date: Thu, 13 Dec 2007 15:46:26 -0800
Subject: [ofa-general] WinOF 1.0.1 & SRPT Data Corruption
Message-ID: <C2F174F99918D54CA2A96E57C5079B6F3551C2@sbc-exmsg2.sbcounty.gov>

Running SQLIOSIM Stress Test on Windows 2003 SR2 / WinOF 1.0.1 gives
multiple buffer validation failures. The same occurs with WinIB 1.3.0. 
 
    Error: 0x80070467
    Error Text: While accessing the hard disk, a disk operation failed even after retries.
    Description: Buffer validation failed on e:\sqlio.mdf Page: 3728, offset 0x1000
 
In actual use, data is corrupted.
 
The Linux target side does not seem to provide any clues as to why this
is happening. But attached is dmesg anyways.
 
Current Configuration:
- Windows 2003 SR2 SP1 / WinOF 1.0.1 
- Windows 2003 SR2 SP1 / WinIB 1.3.0
- Linux 2.6.22-r9 / SRP Target git current / SCST svn Rev 234
 
SRP Target startup:
==============================
modprobe scst
modprobe scst_vdisk
#echo "open nullio none 1024 NULLIO" > /proc/scsi_tgt/vdisk/vdisk
echo "open testvol /root/testvol 512" > /proc/scsi_tgt/vdisk/vdisk
 
echo "add_group sql01" > /proc/scsi_tgt/scsi_tgt
 
#Mask LUNs to specific initiators
echo "add 0x001a4bffff0cd041001708ffffd0dd60" > /proc/scsi_tgt/groups/sql01/names
echo "add 0x001a4bffff0cd042001708ffffd0dd60" > /proc/scsi_tgt/groups/sql01/names
 
# Allocate disks to specific groups
#echo "add nullio 9" > /proc/scsi_tgt/groups/Default/devices
echo "add testvol 0" > /proc/scsi_tgt/groups/sql01/devices
 
# Load Infiniband SRP Target
modprobe ib_srpt
==============================
 
Let me know what I can do to provide more detail on the issue on either
Windows or Linux side.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg.log
Type: application/octet-stream
Size: 26763 bytes
Desc: dmesg.log
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071213/fa73d721/attachment.obj>

From glenn at lists.openfabrics.org  Thu Dec 13 16:30:16 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 13 Dec 2007 16:30:16 -0800 (PST)
Subject: [ofa-general] [PATCH 1/6] nes: cosmetic and whitespace cleanup
Message-ID: <20071214003016.9C527E622FF@openfabrics.org>


Run cleanfile to fix up whitespace

Taken from Roland's infiniband tree to update OFED.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index d101117..638bc51 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -283,7 +283,7 @@ struct sk_buff *form_cm_frame(struct sk_buff *skb, struct nes_cm_node *cm_node,
 	iph->tos = 0;
 	iph->tot_len = htons(packetsize);
 	iph->id = htons(++cm_node->tcp_cntxt.loc_id);
-	
+
 	iph->frag_off = ntohs(0x4000);
 	iph->ttl = 0x40;
 	iph->protocol= 0x06;	/* IPPROTO_TCP */
@@ -1078,8 +1078,8 @@ static struct nes_cm_node *make_cm_node(struct nes_cm_core *cm_core,
 	atomic_inc(&cm_core->session_id);
 	cm_node->session_id = (u32)(atomic_read(&cm_core->session_id) + current->tgid);
 	cm_node->conn_type = cm_info->conn_type;
- 	cm_node->apbvt_set = 0;
- 	cm_node->accept_pend = 0;
+	cm_node->apbvt_set = 0;
+	cm_node->accept_pend = 0;
 
 	cm_node->nesvnic = nesvnic;
 	/* get some device handles, for arp lookup */
@@ -2019,22 +2019,22 @@ static int nes_cm_init_tsa_conn(struct nes_qp *nesqp, struct nes_cm_node *cm_nod
 			NES_QPCONTEXT_PDWSCALE_RCV_WSCALE_MASK);
 
 	nesqp->nesqp_context->keepalive = cpu_to_le32(0x80);
-	nesqp->nesqp_context->ts_recent = 0;	
-	nesqp->nesqp_context->ts_age = 0;	
-	nesqp->nesqp_context->snd_nxt = cpu_to_le32(cm_node->tcp_cntxt.loc_seq_num);	
-	nesqp->nesqp_context->snd_wnd = cpu_to_le32(cm_node->tcp_cntxt.snd_wnd);	
-	nesqp->nesqp_context->rcv_nxt = cpu_to_le32(cm_node->tcp_cntxt.rcv_nxt);	
+	nesqp->nesqp_context->ts_recent = 0;
+	nesqp->nesqp_context->ts_age = 0;
+	nesqp->nesqp_context->snd_nxt = cpu_to_le32(cm_node->tcp_cntxt.loc_seq_num);
+	nesqp->nesqp_context->snd_wnd = cpu_to_le32(cm_node->tcp_cntxt.snd_wnd);
+	nesqp->nesqp_context->rcv_nxt = cpu_to_le32(cm_node->tcp_cntxt.rcv_nxt);
 	nesqp->nesqp_context->rcv_wnd = cpu_to_le32(cm_node->tcp_cntxt.rcv_wnd <<
-			cm_node->tcp_cntxt.rcv_wscale);	
-	nesqp->nesqp_context->snd_max = cpu_to_le32(cm_node->tcp_cntxt.loc_seq_num);	
-	nesqp->nesqp_context->snd_una = cpu_to_le32(cm_node->tcp_cntxt.loc_seq_num);	
-	nesqp->nesqp_context->srtt = 0;		
-	nesqp->nesqp_context->rttvar = cpu_to_le32(0x6);	
-	nesqp->nesqp_context->ssthresh = cpu_to_le32(0x3FFFC000);	
-	nesqp->nesqp_context->cwnd = cpu_to_le32(2*cm_node->tcp_cntxt.mss);	
-	nesqp->nesqp_context->snd_wl1 = cpu_to_le32(cm_node->tcp_cntxt.rcv_nxt);	
-	nesqp->nesqp_context->snd_wl2 = cpu_to_le32(cm_node->tcp_cntxt.loc_seq_num);	
-	nesqp->nesqp_context->max_snd_wnd = cpu_to_le32(cm_node->tcp_cntxt.max_snd_wnd);	
+			cm_node->tcp_cntxt.rcv_wscale);
+	nesqp->nesqp_context->snd_max = cpu_to_le32(cm_node->tcp_cntxt.loc_seq_num);
+	nesqp->nesqp_context->snd_una = cpu_to_le32(cm_node->tcp_cntxt.loc_seq_num);
+	nesqp->nesqp_context->srtt = 0;
+	nesqp->nesqp_context->rttvar = cpu_to_le32(0x6);
+	nesqp->nesqp_context->ssthresh = cpu_to_le32(0x3FFFC000);
+	nesqp->nesqp_context->cwnd = cpu_to_le32(2*cm_node->tcp_cntxt.mss);
+	nesqp->nesqp_context->snd_wl1 = cpu_to_le32(cm_node->tcp_cntxt.rcv_nxt);
+	nesqp->nesqp_context->snd_wl2 = cpu_to_le32(cm_node->tcp_cntxt.loc_seq_num);
+	nesqp->nesqp_context->max_snd_wnd = cpu_to_le32(cm_node->tcp_cntxt.max_snd_wnd);
 
 	nes_debug(NES_DBG_CM, "QP%u: rcv_nxt = 0x%08X, snd_nxt = 0x%08X,"
 			" Setting MSS to %u, PDWscale = 0x%08X, rcv_wnd = %u, context misc = 0x%08X.\n",
diff --git a/drivers/infiniband/hw/nes/nes_cm.h b/drivers/infiniband/hw/nes/nes_cm.h
index 00eaeb1..cd8e003 100644
--- a/drivers/infiniband/hw/nes/nes_cm.h
+++ b/drivers/infiniband/hw/nes/nes_cm.h
@@ -122,7 +122,7 @@ struct nes_timer_entry {
 	unsigned long timetosend;	/* jiffies */
 	struct sk_buff *skb;
 	u32 type;
-	u32 retrycount;	 
+	u32 retrycount;
 	u32 retranscount;
 	u32 context;
 	u32 seq_num;
@@ -142,15 +142,15 @@ struct nes_timer_entry {
 #define NES_LONG_TIME       (2000*HZ/1000)
 
 #define NES_CM_HASHTABLE_SIZE         1024
-#define NES_CM_TCP_TIMER_INTERVAL     3000	
+#define NES_CM_TCP_TIMER_INTERVAL     3000
 #define NES_CM_DEFAULT_MTU            1540
 #define NES_CM_DEFAULT_FRAME_CNT      10
-#define NES_CM_THREAD_STACK_SIZE      256	
+#define NES_CM_THREAD_STACK_SIZE      256
 #define NES_CM_DEFAULT_RCV_WND        64240	// before we know that window scaling is allowed
 #define NES_CM_DEFAULT_RCV_WND_SCALED 256960  // after we know that window scaling is allowed
 #define NES_CM_DEFAULT_RCV_WND_SCALE  2
-#define NES_CM_DEFAULT_FREE_PKTS      0x000A	
-#define NES_CM_FREE_PKT_LO_WATERMARK  2	
+#define NES_CM_DEFAULT_FREE_PKTS      0x000A
+#define NES_CM_FREE_PKT_LO_WATERMARK  2
 
 #define NES_CM_DEF_SEQ       0x159bf75f
 #define NES_CM_DEF_LOCAL_ID  0x3b47
@@ -192,12 +192,12 @@ enum nes_cm_conn_type {
 
 /* CM context params */
 struct nes_cm_tcp_context {
-	u8  client;			
+	u8  client;
 
-	u32 loc_seq_num;	
-	u32 loc_ack_num;	
-	u32 rem_ack_num;	
-	u32 rcv_nxt;		
+	u32 loc_seq_num;
+	u32 loc_ack_num;
+	u32 rem_ack_num;
+	u32 rcv_nxt;
 
 	u32 loc_id;
 	u32 rem_id;
@@ -210,8 +210,8 @@ struct nes_cm_tcp_context {
 	u8  snd_wscale;
 	u8  rcv_wscale;
 
-	struct nes_cm_tsa_context tsa_cntxt;	
-	struct timeval            sent_ts;		
+	struct nes_cm_tsa_context tsa_cntxt;
+	struct timeval            sent_ts;
 };
 
 
@@ -223,7 +223,7 @@ enum nes_cm_listener_state {
 
 struct nes_cm_listener {
 	struct list_head           list;
-	u64                        session_id;	
+	u64                        session_id;
 	struct nes_cm_core         *cm_core;
 	u8                         loc_mac[ETH_ALEN];
 	nes_addr_t                 loc_addr;
@@ -240,17 +240,17 @@ struct nes_cm_listener {
 
 /* per connection node and node state information */
 struct nes_cm_node {
-	u64                       session_id;	
-	u32                       hashkey;	
+	u64                       session_id;
+	u32                       hashkey;
 
 	nes_addr_t                loc_addr, rem_addr;
 	u16                       loc_port, rem_port;
 
-	u8                        loc_mac[ETH_ALEN];	
-	u8                        rem_mac[ETH_ALEN];	
+	u8                        loc_mac[ETH_ALEN];
+	u8                        rem_mac[ETH_ALEN];
 
-	enum nes_cm_node_state    state;	
-	struct nes_cm_tcp_context tcp_cntxt;	
+	enum nes_cm_node_state    state;
+	struct nes_cm_tcp_context tcp_cntxt;
 	struct nes_cm_core        *cm_core;
 	struct sk_buff_head       resend_list;
 	atomic_t                  ref_count;
@@ -261,10 +261,10 @@ struct nes_cm_node {
 	spinlock_t                retrans_list_lock;
 	struct list_head          recv_list;
 	spinlock_t                recv_list_lock;
-	
+
 	int                       send_write0;
 	union {
-		struct ietf_mpa_frame mpa_frame;	
+		struct ietf_mpa_frame mpa_frame;
 		u8                    mpa_frame_buf[NES_CM_DEFAULT_MTU];
 	};
 	u16                       mpa_frame_size;
@@ -324,25 +324,25 @@ struct nes_cm_event {
 };
 
 struct nes_cm_core {
-	enum nes_cm_node_state  state;	
-	atomic_t                session_id;			
+	enum nes_cm_node_state  state;
+	atomic_t                session_id;
 
-	atomic_t                listen_node_cnt;			
-	struct nes_cm_node      listen_list;	
-	spinlock_t              listen_list_lock;	
+	atomic_t                listen_node_cnt;
+	struct nes_cm_node      listen_list;
+	spinlock_t              listen_list_lock;
 
-	u32                     mtu;						
+	u32                     mtu;
 	u32                     free_tx_pkt_max;
-	u32                     rx_pkt_posted;				
-	struct sk_buff_head     tx_free_list;	
-	atomic_t                ht_node_cnt;			
+	u32                     rx_pkt_posted;
+	struct sk_buff_head     tx_free_list;
+	atomic_t                ht_node_cnt;
 	struct list_head        connected_nodes;
 	/* struct list_head hashtable[NES_CM_HASHTABLE_SIZE]; */
-	spinlock_t              ht_lock;				
+	spinlock_t              ht_lock;
 
-	struct timer_list       tcp_timer;	
+	struct timer_list       tcp_timer;
 
-	struct nes_cm_ops       *api;			
+	struct nes_cm_ops       *api;
 
 	int (*post_event)(struct nes_cm_event *event);
 	atomic_t                events_posted;
diff --git a/drivers/infiniband/hw/nes/nes_hw.c b/drivers/infiniband/hw/nes/nes_hw.c
index 3a21a08..674ce32 100644
--- a/drivers/infiniband/hw/nes/nes_hw.c
+++ b/drivers/infiniband/hw/nes/nes_hw.c
@@ -605,18 +605,18 @@ int nes_init_serdes(struct nes_device *nesdev, u8 hw_rev, u8 port_count, u8  One
 
 	if (hw_rev != NE020_REV) {
 		/* init serdes 0 */
-		
+
 		nes_write_indexed(nesdev, NES_IDX_ETH_SERDES_CDR_CONTROL0, 0x000000FF);
 		if (!OneG_Mode) {
-			
+
 			nes_write_indexed(nesdev, NES_IDX_ETH_SERDES_TX_HIGHZ_LANE_MODE0, 0x11110000);
 		}
 		if (port_count > 1) {
 			/* init serdes 1 */
-			
+
 			nes_write_indexed(nesdev, NES_IDX_ETH_SERDES_CDR_CONTROL1, 0x000000FF);
 			if (!OneG_Mode) {
-				
+
 				nes_write_indexed(nesdev, NES_IDX_ETH_SERDES_TX_HIGHZ_LANE_MODE1, 0x11110000);
 			}
 		}
@@ -681,73 +681,73 @@ void nes_init_csr_ne020(struct nes_device *nesdev, u8 hw_rev, u8 port_count)
 
 	nes_debug(NES_DBG_INIT, "port_count=%d\n", port_count);
 
-	nes_write_indexed(nesdev, 0x000001E4, 0x00000007);	
-	/* nes_write_indexed(nesdev, 0x000001E8, 0x000208C4); */	
-	nes_write_indexed(nesdev, 0x000001E8, 0x00020844);	
-	nes_write_indexed(nesdev, 0x000001D8, 0x00048002);	
-	/* nes_write_indexed(nesdev, 0x000001D8, 0x0004B002); */  
-	nes_write_indexed(nesdev, 0x000001FC, 0x00050005);	
-	nes_write_indexed(nesdev, 0x00000600, 0x55555555);	
-	nes_write_indexed(nesdev, 0x00000604, 0x55555555);	
+	nes_write_indexed(nesdev, 0x000001E4, 0x00000007);
+	/* nes_write_indexed(nesdev, 0x000001E8, 0x000208C4); */
+	nes_write_indexed(nesdev, 0x000001E8, 0x00020844);
+	nes_write_indexed(nesdev, 0x000001D8, 0x00048002);
+	/* nes_write_indexed(nesdev, 0x000001D8, 0x0004B002); */
+	nes_write_indexed(nesdev, 0x000001FC, 0x00050005);
+	nes_write_indexed(nesdev, 0x00000600, 0x55555555);
+	nes_write_indexed(nesdev, 0x00000604, 0x55555555);
 
 	/* TODO: move these MAC register settings to NIC bringup */
-	nes_write_indexed(nesdev, 0x00002000, 0x00000001);	
-	nes_write_indexed(nesdev, 0x00002004, 0x00000001);	
-	nes_write_indexed(nesdev, 0x00002008, 0x0000FFFF);	
-	nes_write_indexed(nesdev, 0x0000200C, 0x00000001);	
-	nes_write_indexed(nesdev, 0x00002010, 0x000003c1);	
-	nes_write_indexed(nesdev, 0x0000201C, 0x75345678);	
+	nes_write_indexed(nesdev, 0x00002000, 0x00000001);
+	nes_write_indexed(nesdev, 0x00002004, 0x00000001);
+	nes_write_indexed(nesdev, 0x00002008, 0x0000FFFF);
+	nes_write_indexed(nesdev, 0x0000200C, 0x00000001);
+	nes_write_indexed(nesdev, 0x00002010, 0x000003c1);
+	nes_write_indexed(nesdev, 0x0000201C, 0x75345678);
 	if (port_count > 1) {
-		nes_write_indexed(nesdev, 0x00002200, 0x00000001);	
-		nes_write_indexed(nesdev, 0x00002204, 0x00000001);	
-		nes_write_indexed(nesdev, 0x00002208, 0x0000FFFF);	
-		nes_write_indexed(nesdev, 0x0000220C, 0x00000001);	
-		nes_write_indexed(nesdev, 0x00002210, 0x000003c1);	
-		nes_write_indexed(nesdev, 0x0000221C, 0x75345678);	
-		nes_write_indexed(nesdev, 0x00000908, 0x20000001);	
+		nes_write_indexed(nesdev, 0x00002200, 0x00000001);
+		nes_write_indexed(nesdev, 0x00002204, 0x00000001);
+		nes_write_indexed(nesdev, 0x00002208, 0x0000FFFF);
+		nes_write_indexed(nesdev, 0x0000220C, 0x00000001);
+		nes_write_indexed(nesdev, 0x00002210, 0x000003c1);
+		nes_write_indexed(nesdev, 0x0000221C, 0x75345678);
+		nes_write_indexed(nesdev, 0x00000908, 0x20000001);
 	}
 	if (port_count > 2) {
-		nes_write_indexed(nesdev, 0x00002400, 0x00000001);	
-		nes_write_indexed(nesdev, 0x00002404, 0x00000001);	
-		nes_write_indexed(nesdev, 0x00002408, 0x0000FFFF);	
-		nes_write_indexed(nesdev, 0x0000240C, 0x00000001);	
-		nes_write_indexed(nesdev, 0x00002410, 0x000003c1);	
-		nes_write_indexed(nesdev, 0x0000241C, 0x75345678);	
-		nes_write_indexed(nesdev, 0x00000910, 0x20000001);	
-
-		nes_write_indexed(nesdev, 0x00002600, 0x00000001);	
-		nes_write_indexed(nesdev, 0x00002604, 0x00000001);	
-		nes_write_indexed(nesdev, 0x00002608, 0x0000FFFF);	
-		nes_write_indexed(nesdev, 0x0000260C, 0x00000001);	
-		nes_write_indexed(nesdev, 0x00002610, 0x000003c1);	
-		nes_write_indexed(nesdev, 0x0000261C, 0x75345678);	
-		nes_write_indexed(nesdev, 0x00000918, 0x20000001);	
-	}
-
-	nes_write_indexed(nesdev, 0x00005000, 0x00018000);	
-	/* nes_write_indexed(nesdev, 0x00005000, 0x00010000); */  
-	nes_write_indexed(nesdev, 0x00005004, 0x00020001);	
-	nes_write_indexed(nesdev, 0x00005008, 0x1F1F1F1F);	
-	nes_write_indexed(nesdev, 0x00005010, 0x1F1F1F1F);	
-	nes_write_indexed(nesdev, 0x00005018, 0x1F1F1F1F);	
-	nes_write_indexed(nesdev, 0x00005020, 0x1F1F1F1F);	
-	nes_write_indexed(nesdev, 0x00006090, 0xFFFFFFFF);	
+		nes_write_indexed(nesdev, 0x00002400, 0x00000001);
+		nes_write_indexed(nesdev, 0x00002404, 0x00000001);
+		nes_write_indexed(nesdev, 0x00002408, 0x0000FFFF);
+		nes_write_indexed(nesdev, 0x0000240C, 0x00000001);
+		nes_write_indexed(nesdev, 0x00002410, 0x000003c1);
+		nes_write_indexed(nesdev, 0x0000241C, 0x75345678);
+		nes_write_indexed(nesdev, 0x00000910, 0x20000001);
+
+		nes_write_indexed(nesdev, 0x00002600, 0x00000001);
+		nes_write_indexed(nesdev, 0x00002604, 0x00000001);
+		nes_write_indexed(nesdev, 0x00002608, 0x0000FFFF);
+		nes_write_indexed(nesdev, 0x0000260C, 0x00000001);
+		nes_write_indexed(nesdev, 0x00002610, 0x000003c1);
+		nes_write_indexed(nesdev, 0x0000261C, 0x75345678);
+		nes_write_indexed(nesdev, 0x00000918, 0x20000001);
+	}
+
+	nes_write_indexed(nesdev, 0x00005000, 0x00018000);
+	/* nes_write_indexed(nesdev, 0x00005000, 0x00010000); */
+	nes_write_indexed(nesdev, 0x00005004, 0x00020001);
+	nes_write_indexed(nesdev, 0x00005008, 0x1F1F1F1F);
+	nes_write_indexed(nesdev, 0x00005010, 0x1F1F1F1F);
+	nes_write_indexed(nesdev, 0x00005018, 0x1F1F1F1F);
+	nes_write_indexed(nesdev, 0x00005020, 0x1F1F1F1F);
+	nes_write_indexed(nesdev, 0x00006090, 0xFFFFFFFF);
 
 	/* TODO: move this to code, get from EEPROM */
-	nes_write_indexed(nesdev, 0x00000900, 0x20000001);	
-	nes_write_indexed(nesdev, 0x000060C0, 0x0000028e);	
-	nes_write_indexed(nesdev, 0x000060C8, 0x00000020);	
+	nes_write_indexed(nesdev, 0x00000900, 0x20000001);
+	nes_write_indexed(nesdev, 0x000060C0, 0x0000028e);
+	nes_write_indexed(nesdev, 0x000060C8, 0x00000020);
 														//
-	nes_write_indexed(nesdev, 0x000001EC, 0x5b2625a0);	
-	/* nes_write_indexed(nesdev, 0x000001EC, 0x5f2625a0); */  
-	
+	nes_write_indexed(nesdev, 0x000001EC, 0x5b2625a0);
+	/* nes_write_indexed(nesdev, 0x000001EC, 0x5f2625a0); */
+
 	if (hw_rev != NE020_REV) {
 		u32temp = nes_read_indexed(nesdev, 0x000008e8);
-		u32temp |= 0x80000000;  
+		u32temp |= 0x80000000;
 		nes_write_indexed(nesdev, 0x000008e8, u32temp);
 		u32temp = nes_read_indexed(nesdev, 0x000021f8);
 		u32temp &= 0x7fffffff;
-		u32temp |= 0x7fff0010;  
+		u32temp |= 0x7fff0010;
 		nes_write_indexed(nesdev, 0x000021f8, u32temp);
 	}
 }
@@ -1435,7 +1435,7 @@ int nes_init_nic_qp(struct nes_device *nesdev, struct net_device *netdev)
 	nes_debug(NES_DBG_INIT, "RX_WINDOW_BUFFER_PAGE_TABLE_SIZE = 0x%08X, RX_WINDOW_BUFFER_SIZE = 0x%08X\n",
 			nes_read_indexed(nesdev, NES_IDX_RX_WINDOW_BUFFER_PAGE_TABLE_SIZE),
 			nes_read_indexed(nesdev, NES_IDX_RX_WINDOW_BUFFER_SIZE));
-	if (nes_read_indexed(nesdev, NES_IDX_RX_WINDOW_BUFFER_SIZE) != 0) {	
+	if (nes_read_indexed(nesdev, NES_IDX_RX_WINDOW_BUFFER_SIZE) != 0) {
 		nic_context->context_words[NES_NIC_CTX_MISC_IDX] |= cpu_to_le32(NES_NIC_BACK_STORE);
 	}
 
diff --git a/drivers/infiniband/hw/nes/nes_hw.h b/drivers/infiniband/hw/nes/nes_hw.h
index 51bb87f..178f3d5 100644
--- a/drivers/infiniband/hw/nes/nes_hw.h
+++ b/drivers/infiniband/hw/nes/nes_hw.h
@@ -624,8 +624,8 @@ enum nes_aeqe_bits {
 	NES_AEQE_VALID = (1<<31),
 };
 
-#define NES_AEQE_IWARP_STATE_SHIFT 	20
-#define NES_AEQE_TCP_STATE_SHIFT 	24
+#define NES_AEQE_IWARP_STATE_SHIFT	20
+#define NES_AEQE_TCP_STATE_SHIFT	24
 
 enum nes_aeqe_iwarp_state {
 	NES_AEQE_IWARP_STATE_NON_EXISTANT = 0,
@@ -1193,5 +1193,4 @@ struct nes_ib_device {
 #define nes_netif_rx netif_rx
 #endif
 
-#endif 		/* __NES_HW_H */
-
+#endif		/* __NES_HW_H */
diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index 4133a44..dcff1b8 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -257,7 +257,7 @@ static int nes_netdev_open(struct net_device *netdev)
 				((((u32)nesvnic->nic_index) << 16)));
 	}
 
-	
+
 	if (netdev->ip_ptr) {
 		struct in_device *ip = netdev->ip_ptr;
 		struct in_ifaddr *in = NULL;
@@ -276,11 +276,12 @@ static int nes_netdev_open(struct net_device *netdev)
 		/* Enable network packets */
 		nesvnic->linkup = 1;
 		netif_start_queue(netdev);
-		netif_carrier_on(netdev);
-	}
 #ifdef NES_NAPI
-	napi_enable(&nesvnic->napi);
+		napi_enable(&nesvnic->napi);
 #endif
+	} else {
+		netif_carrier_off(netdev);
+	}
 	nesvnic->netdev_open = 1;
 
 	return 0;
@@ -647,10 +648,10 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 				}
 				wqe_misc |= NES_NIC_SQ_WQE_COMPLETION | (u16)skb_is_gso(skb);
 				if ((tso_wqe_length + original_first_length) > skb_is_gso(skb)) {
-  					wqe_misc |= NES_NIC_SQ_WQE_LSO_ENABLE;
-  				} else {
+					wqe_misc |= NES_NIC_SQ_WQE_LSO_ENABLE;
+				} else {
 					iph->tot_len = htons(tso_wqe_length + original_first_length - nhoffset);
-  				}
+				}
 
 				nic_sqe->wqe_words[NES_NIC_SQ_WQE_MISC_IDX] = cpu_to_le32(wqe_misc);
 				nic_sqe->wqe_words[NES_NIC_SQ_WQE_LSO_INFO_IDX] =
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index 36d34f4..ea7625a 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -899,7 +899,7 @@ static int nes_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
 		}
 		if (remap_pfn_range(vma, vma->vm_start,
 				virt_to_phys(nesqp->hwqp.sq_vbase) >> PAGE_SHIFT,
-				vma->vm_end - vma->vm_start,	
+				vma->vm_end - vma->vm_start,
 				vma->vm_page_prot)) {
 			nes_debug(NES_DBG_MMAP, "remap_pfn_range failed.\n");
 			return -EAGAIN;
@@ -1373,7 +1373,7 @@ static struct ib_qp *nes_create_qp(struct ib_pd *ibpd,
 						find_next_zero_bit(nes_ucontext->allocated_wqs,
 								   NES_MAX_USER_WQ_REGIONS, nes_ucontext->first_free_wq);
 					/* nes_debug(NES_DBG_QP, "find_first_zero_biton wqs returned %u\n",
-				   			nespd->mmap_db_index); */
+							nespd->mmap_db_index); */
 					if (nesqp->mmap_sq_db_index > NES_MAX_USER_WQ_REGIONS) {
 						nes_debug(NES_DBG_QP,
 							  "db index > max user regions, failing create QP\n");
diff --git a/drivers/infiniband/hw/nes/nes_verbs.h b/drivers/infiniband/hw/nes/nes_verbs.h
index b53e492..96d59ce 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.h
+++ b/drivers/infiniband/hw/nes/nes_verbs.h
@@ -46,11 +46,11 @@ struct nes_ucontext {
 	unsigned long      mmap_wq_offset;
 	unsigned long      mmap_cq_offset; /* to be removed */
 	int                index;		/* rnic index (minor) */
-	unsigned long      allocated_doorbells[BITS_TO_LONGS(NES_MAX_USER_DB_REGIONS)];	
-	u16                mmap_db_index[NES_MAX_USER_DB_REGIONS];	 
+	unsigned long      allocated_doorbells[BITS_TO_LONGS(NES_MAX_USER_DB_REGIONS)];
+	u16                mmap_db_index[NES_MAX_USER_DB_REGIONS];
 	u16                first_free_db;
-	unsigned long      allocated_wqs[BITS_TO_LONGS(NES_MAX_USER_WQ_REGIONS)];  
-	struct nes_qp      *mmap_nesqp[NES_MAX_USER_WQ_REGIONS];  
+	unsigned long      allocated_wqs[BITS_TO_LONGS(NES_MAX_USER_WQ_REGIONS)];
+	struct nes_qp      *mmap_nesqp[NES_MAX_USER_WQ_REGIONS];
 	u16                first_free_wq;
 	struct list_head   cq_reg_mem_list;
 	struct list_head   qp_reg_mem_list;
@@ -72,7 +72,7 @@ struct nes_mr {
 	struct ib_umem    *region;
 	u16               pbls_used;
 	u8                mode;
-	u8                pbl_4k; 
+	u8                pbl_4k;
 };
 
 struct nes_hw_pb {


From glenn at lists.openfabrics.org  Thu Dec 13 16:38:16 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 13 Dec 2007 16:38:16 -0800 (PST)
Subject: [ofa-general] [PATCH 2/6] nes: fix external crc32c() dependency
Message-ID: <20071214003816.90CD8E622FF@openfabrics.org>


Add "select LIBCRC32C" to the Kconfig entry so that crc32c() can be
resolved.  Without this, the driver may fail to link.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/Kconfig b/drivers/infiniband/hw/nes/Kconfig
index d5f2a12..2aeb7ac 100644
--- a/drivers/infiniband/hw/nes/Kconfig
+++ b/drivers/infiniband/hw/nes/Kconfig
@@ -1,6 +1,7 @@
 config INFINIBAND_NES
 	tristate "NetEffect RNIC Driver"
 	depends on PCI && INET && INFINIBAND
+	select LIBCRC32C
 	---help---
 	  This is a low-level driver for NetEffect RDMA enabled
 	  Network Interface Cards (RNIC).


From glenn at lists.openfabrics.org  Thu Dec 13 16:44:33 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 13 Dec 2007 16:44:33 -0800 (PST)
Subject: [ofa-general] [PATCH 3/6] nes: add missing unlock in error path of
	nes_alloc_fmr()
Message-ID: <20071214004433.C3B61E28702@openfabrics.org>


A spin_unlock_irqrestore() was missing in an error case
of nes_alloc_fmr().

From Roland's infiniband tree to update OFED.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index ea7625a..cd95aba 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -479,6 +479,7 @@ static struct ib_fmr *nes_alloc_fmr(struct ib_pd *ibpd,
 
 			if (!vpbl.pbl_vbase) {
 				ret = -ENOMEM;
+				spin_unlock_irqrestore(&nesadapter->pbl_lock, flags);
 				goto failed_leaf_vpbl_pages_alloc;
 			}
 
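
The pattern behind this fix can be sketched in plain userspace C. This is
only an illustration, not the driver code: the toy_lock type, its helpers,
and alloc_under_lock() are hypothetical names standing in for the spinlock
and the allocation inside nes_alloc_fmr(). The point is that every early
exit taken while the lock is held has to release it first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a spinlock: it only records whether it is held. */
struct toy_lock {
	int held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { l->held = 0; }

/*
 * Allocate a buffer while holding the lock.  'fail_alloc' forces the
 * allocation-failure branch so the error path can be exercised.
 */
int alloc_under_lock(struct toy_lock *l, int fail_alloc, void **out)
{
	void *buf;

	toy_lock_acquire(l);
	buf = fail_alloc ? NULL : malloc(64);
	if (!buf) {
		/* Error path: drop the lock before bailing out. */
		toy_lock_release(l);
		return -ENOMEM;
	}
	*out = buf;
	toy_lock_release(l);
	return 0;
}

/* Small self-check: the lock must not stay held after a failed call. */
void alloc_under_lock_check(void)
{
	struct toy_lock l = { 0 };
	void *p = NULL;

	assert(alloc_under_lock(&l, 1, &p) == -ENOMEM);
	assert(l.held == 0);
}
```

Without the release in the error branch, the next caller that tried to take
the same lock would block forever, which is the class of bug the one-line
patch above closes.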

From glenn at lists.openfabrics.org  Thu Dec 13 16:48:54 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 13 Dec 2007 16:48:54 -0800 (PST)
Subject: [ofa-general] [PATCH 4/6] nes: fix nes_get_encoded_size() prototype
Message-ID: <20071214004854.E62B8E623A0@openfabrics.org>


Change the argument type from u32 * to int * to match how
it is called.

From Roland's infiniband git tree to update OFED.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index cd95aba..97cb51e 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -1046,7 +1046,7 @@ static int nes_destroy_ah(struct ib_ah *ah)
 /**
  * nes_get_encoded_size
  */
-static inline u8 nes_get_encoded_size(u32 *size)
+static inline u8 nes_get_encoded_size(int *size)
 {
 	u8 encoded_size = 0;
 	if (*size <= 32) {
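
As a side note on what the pointer type changes in C: the commit message
only says the prototype now matches its callers, so the sketch below is not
a claim about the motivation. It shows one general consequence of int versus
unsigned for a comparison like the "<= 32" test visible in the hunk. Both
helper functions are hypothetical and are not the body of
nes_get_encoded_size().

```c
#include <assert.h>

/* Hypothetical helpers; only the "<= 32" comparison comes from the patch. */
static int fits_first_bucket_int(const int *size)
{
	return *size <= 32;
}

static int fits_first_bucket_u32(const unsigned int *size)
{
	return *size <= 32;
}

/* Returns 1 when the signed and unsigned versions disagree for 'value'. */
int versions_disagree(int value)
{
	unsigned int as_unsigned = (unsigned int)value;

	return fits_first_bucket_int(&value) !=
	       fits_first_bucket_u32(&as_unsigned);
}

/* Small self-check: ordinary sizes agree, a negative value does not. */
void versions_check(void)
{
	assert(!versions_disagree(16));
	assert(versions_disagree(-1));
}
```

For non-negative sizes the two versions behave identically; they differ only
when a negative int is reinterpreted as a large unsigned value.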


From glenn at lists.openfabrics.org  Thu Dec 13 16:52:50 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 13 Dec 2007 16:52:50 -0800 (PST)
Subject: [ofa-general] [PATCH 5/6] nes: fix napi enable for multiport boards
Message-ID: <20071214005250.E12AEE6239B@openfabrics.org>


napi_enable() was not being called for ports after
the first.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index dcff1b8..0f50cd5 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -276,12 +276,11 @@ static int nes_netdev_open(struct net_device *netdev)
 		/* Enable network packets */
 		nesvnic->linkup = 1;
 		netif_start_queue(netdev);
+		netif_carrier_on(netdev);
+	}
 #ifdef NES_NAPI
-		napi_enable(&nesvnic->napi);
+	napi_enable(&nesvnic->napi);
 #endif
-	} else {
-		netif_carrier_off(netdev);
-	}
 	nesvnic->netdev_open = 1;
 
 	return 0;
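
The control-flow change in this hunk can be modelled in a few lines of
userspace C. This is a sketch under assumptions: struct toy_port and its
fields are hypothetical, and link_ready merely stands in for whatever
condition the enclosing if statement tests (it is not visible in the hunk).
poll_enabled stands in for napi_enable() having been called.

```c
#include <assert.h>

/* Hypothetical model of a port; none of these names are from the driver. */
struct toy_port {
	int link_ready;    /* stands in for the enclosing if condition */
	int queue_started; /* stands in for netif_start_queue() having run */
	int poll_enabled;  /* stands in for napi_enable() having run */
};

/* Shape before the patch: polling is enabled only inside the conditional. */
void toy_open_conditional(struct toy_port *p)
{
	if (p->link_ready) {
		p->queue_started = 1;
		p->poll_enabled = 1;
	}
}

/* Shape after the patch: polling is enabled on every open. */
void toy_open_unconditional(struct toy_port *p)
{
	if (p->link_ready)
		p->queue_started = 1;
	p->poll_enabled = 1;
}

/* Small self-check: a port that skips the conditional still gets polling. */
void toy_open_check(void)
{
	struct toy_port p = { 0, 0, 0 };

	toy_open_unconditional(&p);
	assert(p.poll_enabled == 1);
	assert(p.queue_started == 0);
}
```

In the conditional shape, any port for which the condition is false is
opened without polling enabled, which matches the symptom described in the
commit message for ports after the first.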


From glenn at lists.openfabrics.org  Thu Dec 13 16:57:47 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 13 Dec 2007 16:57:47 -0800 (PST)
Subject: [ofa-general] [PATCH 6/6] nes: remove unnecessary kernel macro
	#ifdefs
Message-ID: <20071214005747.89AD2E623AF@openfabrics.org>


NETIF_F_xxx and HAVE_xxx macro #ifdefs are not needed and
have been removed.

From Roland's infiniband git tree to update OFED.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index 0f50cd5..f196c43 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -357,9 +357,7 @@ static int nes_nic_send(struct sk_buff *skb, struct net_device *netdev)
 	struct nes_device *nesdev = nesvnic->nesdev;
 	struct nes_hw_nic *nesnic = &nesvnic->nic;
 	struct nes_hw_nic_sq_wqe *nic_sqe;
-#ifdef NETIF_F_TSO
 	struct tcphdr *tcph;
-#endif
 	u16 *wqe_fragment_length;
 	u32 wqe_misc;
 	u16 wqe_fragment_index = 1;	/* first fragment (0) is used by copy buffer */
@@ -385,7 +383,6 @@ static int nes_nic_send(struct sk_buff *skb, struct net_device *netdev)
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		tcph = tcp_hdr(skb);
 		if (1) {
-#ifdef NETIF_F_TSO
 			if (skb_is_gso(skb)) {
 				/* nes_debug(NES_DBG_NIC_TX, "%s: TSO request... seg size = %u\n",
 						netdev->name, skb_is_gso(skb)); */
@@ -395,11 +392,8 @@ static int nes_nic_send(struct sk_buff *skb, struct net_device *netdev)
 						cpu_to_le32(((u32)tcph->doff) |
 						(((u32)(((unsigned char *)tcph) - skb->data)) << 4));
 			} else {
-#endif
 				wqe_misc |= NES_NIC_SQ_WQE_COMPLETION;
-#ifdef NETIF_F_TSO
 			}
-#endif
 		}
 	} else {	/* CHECKSUM_HW */
 		wqe_misc |= NES_NIC_SQ_WQE_DISABLE_CHKSUM | NES_NIC_SQ_WQE_COMPLETION;
@@ -475,7 +469,6 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	struct nes_device *nesdev = nesvnic->nesdev;
 	struct nes_hw_nic *nesnic = &nesvnic->nic;
 	struct nes_hw_nic_sq_wqe *nic_sqe;
-#ifdef NETIF_F_TSO
 	struct tcphdr *tcph;
 	/* struct udphdr *udph; */
 #define NES_MAX_TSO_FRAGS 18
@@ -486,7 +479,6 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	u32 tso_frag_count;
 	u32 tso_wqe_length;
 	u32 curr_tcp_seq;
-#endif
 	u32 wqe_count=1;
 	u32 send_rc;
 	struct iphdr *iph;
@@ -499,10 +491,8 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	u16 wqe_fragment_index=1;
 	u16 hoffset;
 	u16 nhoffset;
-#ifdef NETIF_F_TSO
 	u16 wqes_needed;
 	u16 wqes_available;
-#endif
 	u32 old_head;
 	u32 wqe_misc;
 
@@ -532,7 +522,6 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 	}
 	/* Check if too many fragments */
 	if (unlikely((nr_frags > 4))) {
-#ifdef NETIF_F_TSO
 		if (skb_is_gso(skb)) {
 			nesvnic->segmented_tso_requests++;
 			nesvnic->tso_requests++;
@@ -663,7 +652,6 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 				nesnic->sq_head &= nesnic->sq_size-1;
 			}
 		} else {
-#endif
 			nesvnic->linearized_skbs++;
 			hoffset = skb_transport_header(skb) - skb->data;
 			nhoffset = skb_network_header(skb) - skb->data;
@@ -675,9 +663,7 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 				spin_unlock_irqrestore(&nesnic->sq_lock, flags);
 				return NETDEV_TX_OK;
 			}
-#ifdef NETIF_F_TSO
 		}
-#endif
 	} else {
 		send_rc = nes_nic_send(skb, netdev);
 		if (send_rc != NETDEV_TX_OK) {
@@ -802,7 +788,6 @@ static void nes_netdev_tx_timeout(struct net_device *netdev)
 }
 
 
-#ifdef HAVE_SET_MAC_ADDR
 /**
  * nes_netdev_set_mac_address
  */
@@ -845,10 +830,8 @@ static int nes_netdev_set_mac_address(struct net_device *netdev, void *p)
 	}
 	return 0;
 }
-#endif
 
 
-#ifdef HAVE_MULTICAST
 /**
  * nes_netdev_set_multicast_list
  */
@@ -931,7 +914,6 @@ void nes_netdev_set_multicast_list(struct net_device *netdev)
 		}
 	}
 }
-#endif
 
 
 /**
@@ -1500,14 +1482,11 @@ static struct ethtool_ops nes_ethtool_ops = {
 	.set_tx_csum = ethtool_op_set_tx_csum,
 	.set_rx_csum = nes_netdev_set_rx_csum,
 	.set_sg = ethtool_op_set_sg,
-#ifdef NETIF_F_TSO
 	.get_tso = ethtool_op_get_tso,
 	.set_tso = ethtool_op_set_tso,
-#endif
 };
 
 
-#ifdef NETIF_F_HW_VLAN_TX
 static void nes_netdev_vlan_rx_register(struct net_device *netdev, struct vlan_group *grp)
 {
 	struct nes_vnic *nesvnic = netdev_priv(netdev);
@@ -1525,7 +1504,6 @@ static void nes_netdev_vlan_rx_register(struct net_device *netdev, struct vlan_g
 
 	nes_write_indexed(nesdev, NES_IDX_PCIX_DIAG, u32temp);
 }
-#endif
 
 
 /**
@@ -1561,9 +1539,7 @@ struct net_device *nes_netdev_init(struct nes_device *nesdev,
 	netdev->get_stats = nes_netdev_get_stats;
 	netdev->tx_timeout = nes_netdev_tx_timeout;
 	netdev->set_mac_address = nes_netdev_set_mac_address;
-#ifdef HAVE_MULTICAST
 	netdev->set_multicast_list = nes_netdev_set_multicast_list;
-#endif
 	netdev->change_mtu = nes_netdev_change_mtu;
 	netdev->watchdog_timeo = NES_TX_TIMEOUT;
 	netdev->irq = nesdev->pcidev->irq;
@@ -1576,14 +1552,10 @@ struct net_device *nes_netdev_init(struct nes_device *nesdev,
 #ifdef NES_NAPI
 	netif_napi_add(netdev, &nesvnic->napi, nes_netdev_poll, 128);
 #endif
-#ifdef NETIF_F_HW_VLAN_TX
 	nes_debug(NES_DBG_INIT, "Enabling VLAN Insert/Delete.\n");
 	netdev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
 	netdev->vlan_rx_register = nes_netdev_vlan_rx_register;
-#endif
-#ifdef NETIF_F_LLTX
 	netdev->features |= NETIF_F_LLTX;
-#endif
 
 	/* Fill in the port structure */
 	nesvnic->netdev = netdev;
@@ -1611,12 +1583,7 @@ struct net_device *nes_netdev_init(struct nes_device *nesdev,
 	memcpy(netdev->perm_addr, netdev->dev_addr, 6);
 
 	if ((nesvnic->logical_port < 2) || (nesdev->nesadapter->hw_rev != NE020_REV)) {
-#ifdef NETIF_F_TSO
 		netdev->features |= NETIF_F_TSO | NETIF_F_SG | NETIF_F_IP_CSUM;
-#endif
-#ifdef NETIF_F_GSO
-		netdev->features |= NETIF_F_GSO | NETIF_F_TSO | NETIF_F_SG | NETIF_F_IP_CSUM;
-#endif
 	} else {
 		netdev->features |= NETIF_F_SG | NETIF_F_IP_CSUM;
 	}


From rdreier at cisco.com  Thu Dec 13 21:02:21 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 13 Dec 2007 21:02:21 -0800
Subject: [ofa-general] Re: [PATCH 3/6] nes: add missing unlock in error path
	of nes_alloc_fmr()
In-Reply-To: <20071214004433.C3B61E28702@openfabrics.org> (Glenn Grundstrom's
	message of "Thu, 13 Dec 2007 16:44:33 -0800 (PST)")
References: <20071214004433.C3B61E28702@openfabrics.org>
Message-ID: <adazlwd6f8i.fsf@cisco.com>

Be careful how you forward patches on so that you preserve author and
Signed-off-by information.  This patch is fairly trivial so it's no
big deal, but I did write the original patch and it appears in my tree
as:

    commit 772bbcb6b5d310bf29a184d280a4d1d8b3350422
    Author: Roland Dreier <rolandd at cisco.com>
    Date:   Wed Dec 12 14:09:29 2007 -0800

    RDMA/nes: Add missing unlock in error path of nes_alloc_fmr()

    Signed-off-by: Roland Dreier <rolandd at cisco.com>

To avoid problems in the future you should fix up your procedure to
preserve the Author: and Signed-off-by: lines when assimilating a
patch.

 - R.


From kliteyn at mellanox.co.il  Thu Dec 13 21:12:57 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 14 Dec 2007 07:12:57 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-14:normal completion
Message-ID: <MTLEXCH016SLWSOEyCP000007eb@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-13
OpenSM git rev = Thu_Dec_13_04:35:53_2007 [e9b0b6dfaeee43707d551fa0ff00127a163c5905]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=640  Pass=577  Fail=63
 
 
Pass:
48 Stability IS1-16.topo
48 Pkey IS1-16.topo
48 OsmTest IS1-16.topo
48 OsmStress IS1-16.topo
48 Multicast IS1-16.topo
16 Stability IS3-loop.topo
16 Stability IS3-128.topo
16 Pkey IS3-128.topo
16 OsmTest IS3-loop.topo
16 OsmTest IS3-128.topo
16 OsmStress IS3-128.topo
16 Multicast IS3-loop.topo
16 Multicast IS3-128.topo
16 FatTree merge-roots-4-ary-2-tree.topo
16 FatTree merge-root-4-ary-3-tree.topo
16 FatTree gnu-stallion-64.topo
16 FatTree blend-4-ary-2-tree.topo
16 FatTree RhinoDDR.topo
16 FatTree FullGnu.topo
16 FatTree 4-ary-2-tree.topo
16 FatTree 2-ary-4-tree.topo
16 FatTree 12-node-spaced.topo
16 FTreeFail 4-ary-2-tree-missing-sw-link.topo
16 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
16 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
16 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
1 LidMgr IS1-16.topo

Failures:
47 LidMgr IS1-16.topo
16 LidMgr IS3-128.topo


From sashak at voltaire.com  Thu Dec 13 21:50:57 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 14 Dec 2007 05:50:57 +0000
Subject: [ofa-general] Re: nightly osm_sim report 2007-12-14:normal
	completion
In-Reply-To: <MTLEXCH016SLWSOEyCP000007eb@mtlexch01.mtl.com>
References: <MTLEXCH016SLWSOEyCP000007eb@mtlexch01.mtl.com>
Message-ID: <20071214055057.GD23319@sashak.voltaire.com>

On 07:12 Fri 14 Dec     , kliteyn at mellanox.co.il wrote:
> 
> Failures:
> 47 LidMgr IS1-16.topo
> 16 LidMgr IS3-128.topo

Interesting... Any details?

Sasha


From vu at mellanox.com  Thu Dec 13 23:11:44 2007
From: vu at mellanox.com (Vu Pham)
Date: Thu, 13 Dec 2007 23:11:44 -0800
Subject: [ofa-general] WinOF 1.0.1 & SRPT Data Corruption
In-Reply-To: <C2F174F99918D54CA2A96E57C5079B6F3551C2@sbc-exmsg2.sbcounty.gov>
References: <C2F174F99918D54CA2A96E57C5079B6F3551C2@sbc-exmsg2.sbcounty.gov>
Message-ID: <47622CB0.1040207@mellanox.com>

I hit the same kind of corruption with Linux SRP and Oracle
several months ago.

I do not have a real fix yet; however, running scst in
single-threaded mode works around the corruption.

Please apply the attached patch and re-test.

thanks,
-vu

> Running SQLIOSIM Stress Test on Windows 2003 SR2 / WinOF 1.0.1 gives 
> multiple buffer validation failures. The same occurs with WinIB 1.3.0.
>  
>     Error: 0x80070467
>     Error Text: While accessing the hard disk, a disk operation failed 
> even after retries.
>     Description: Buffer validation failed on e:\sqlio.mdf Page: 3728, 
> offset 0x1000
>  
> In actual use, data is corrupted.
>  
> The Linux target side does not seem to provide any clues as to why this 
> is happening. But attached is dmesg anyways.
>  
> Current Configuration:
> - Windows 2003 SR2 SP1 / WinOF 1.0.1
> - Windows 2003 SR2 SP1 / WinIB 1.3.0
> - Linux 2.6.22-r9 / SRP Target git current / SCST svn Rev 234
>  
> SRP Target startup:
> ==============================
> modprobe scst
> modprobe scst_vdisk
> #echo "open nullio none 1024 NULLIO" > /proc/scsi_tgt/vdisk/vdisk
> echo "open testvol /root/testvol 512" > /proc/scsi_tgt/vdisk/vdisk
>  
> echo "add_group sql01" > /proc/scsi_tgt/scsi_tgt
>  
> #Mask LUNs to specific initiators
> echo "add 0x001a4bffff0cd041001708ffffd0dd60" > 
> /proc/scsi_tgt/groups/sql01/names
> echo "add 0x001a4bffff0cd042001708ffffd0dd60" > 
> /proc/scsi_tgt/groups/sql01/names
>  
> # Allocate disks to specific groups
> #echo "add nullio 9" > /proc/scsi_tgt/groups/Default/devices
> echo "add testvol 0" > /proc/scsi_tgt/groups/sql01/devices
>  
> # Load Infiniband SRP Target
> modprobe ib_srpt
> ==============================
>  
> Let me know what I can do to provide more detail on the issue on either 
> Windows or Linux side.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: single_threaded_scst.patch
Type: text/x-diff
Size: 428 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071213/c2ce3cbb/attachment.patch>

From vst at vlnb.net  Thu Dec 13 23:36:36 2007
From: vst at vlnb.net (Vladislav Bolkhovitin)
Date: Fri, 14 Dec 2007 10:36:36 +0300
Subject: [ofa-general] WinOF 1.0.1 & SRPT Data Corruption
In-Reply-To: <C2F174F99918D54CA2A96E57C5079B6F3551C2@sbc-exmsg2.sbcounty.gov>
References: <C2F174F99918D54CA2A96E57C5079B6F3551C2@sbc-exmsg2.sbcounty.gov>
Message-ID: <47623284.1040603@vlnb.net>

Are there any messages in the kernel log on the target?



From vst at vlnb.net  Thu Dec 13 23:41:17 2007
From: vst at vlnb.net (Vladislav Bolkhovitin)
Date: Fri, 14 Dec 2007 10:41:17 +0300
Subject: [ofa-general] WinOF 1.0.1 & SRPT Data Corruption
In-Reply-To: <47623284.1040603@vlnb.net>
References: <C2F174F99918D54CA2A96E57C5079B6F3551C2@sbc-exmsg2.sbcounty.gov>
	<47623284.1040603@vlnb.net>
Message-ID: <4762339D.2020602@vlnb.net>

Vladislav Bolkhovitin wrote:
> Are there any messages in the kernel log on the target?

I mean with timestamps. Dmesg doesn't have them.

Vlad
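
A minimal sketch of how timestamped kernel messages could be collected
on the target, assuming the syslog daemon writes kernel messages to
/var/log/messages and tags them with "kernel:". The helper name and the
scst|srpt filter are illustrative assumptions, not part of SCST:

```shell
# Hypothetical helper: print timestamped kernel lines that mention
# scst or srpt from a syslog file (default: /var/log/messages).
target_kernel_log() {
    logfile=${1:-/var/log/messages}
    grep ' kernel: ' "$logfile" | grep -i -E 'scst|srpt'
}

# Usage: target_kernel_log > srpt-trace.txt
```

Unlike plain dmesg output, each line kept this way carries the syslog
timestamp, which makes it easier to line up with the initiator log.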



From vu at mellanox.com  Thu Dec 13 23:41:43 2007
From: vu at mellanox.com (Vu Pham)
Date: Thu, 13 Dec 2007 23:41:43 -0800
Subject: [ofa-general] Re: [Scst-devel] SRP Target Session Hangs
In-Reply-To: <47610768.8080203@vlnb.net>
References: <C2F174F99918D54CA2A96E57C5079B6F3551BB@sbc-exmsg2.sbcounty.gov>
	<47610768.8080203@vlnb.net>
Message-ID: <476233B7.70102@mellanox.com>

Vladislav Bolkhovitin wrote:
> Sufficool, Stanley wrote:
>> 1) We are running WinIB on Windows 2003 SR2 with the OFED SRP Target. 
>> The windows machines initially connect fine to the target, however 
>> when they are restarted without using the "eject io unit" SCST retains 
>> the SRPT session and subsequent connects to the target fail with 
>> "Cannot start device".
> 
> This looks to me like an SRP target driver problem. Hopefully, Vu 
> (CC'ed) can answer you.


The old {srp connection, scst session} tuple will be 
terminated before the new {srp connection, scst session} 
tuple is created. Please collect some traces on both the 
Windows initiator and the SRP target (/var/log/messages). 
I want to see why and where it fails.

> 
>> Using most recent WinIB, kernel 2.6.22 IB drivers, and recent SRPT and 
>> SCST.
>>  
>> 2) I have yet to get the WinOF 1.0 or 1.0.1 SRP Initiator to work. Is 
>> there any reason other than my lack of trying harder that this 
>> wouldn't work?
> 
> This is a question for Vu as well


Please use the latest srp target driver in ofed-1.3


> 
>> 3) This may not be the forum for this, but how can you terminate a 
>> session using SCST proc commands?
> 
> SCST can't (and shouldn't) do that, because it has no knowledge about 
> how sessions with a particular target transport are created and 
> destroyed. Session management is the target driver's duty. Ask Vu 
> about that feature.


Normally the {srp connection, scst session} tuple is 
destroyed upon a disconnect request or a new connection 
request (without the multi-channel flag) from the initiator, 
or when the QP is in an error condition.

Sometimes we fail to terminate the scst session when the 
number of outstanding I/Os on the scst session does not 
match the number of outstanding I/Os on the srp connection. 
Right after srp calls scst_init_cmd_done, it increments the 
active command counter, and it decrements the counter in 
on_free_command.

Any idea Vlad?

thanks,
-vu

> 
>> Thanks in advance
>>  
>>
>> *Stanley Sufficool
>> *Systems Analyst I
>> County of San Bernardino


From vlad at lists.openfabrics.org  Fri Dec 14 03:06:41 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Fri, 14 Dec 2007 03:06:41 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071214-0200 daily build status
Message-ID: <20071214110642.0029AE61CE6@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.20
Passed on ppc64 with linux-2.6.15
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.18
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.17
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.19
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.12
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.14
Passed on powerpc with linux-2.6.13
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:


From vst at vlnb.net  Fri Dec 14 01:24:54 2007
From: vst at vlnb.net (Vladislav Bolkhovitin)
Date: Fri, 14 Dec 2007 12:24:54 +0300
Subject: ***SPAM*** Re: [Scst-devel] [ofa-general] Re: SRP Target Session Hangs
In-Reply-To: <476233B7.70102@mellanox.com>
References: <C2F174F99918D54CA2A96E57C5079B6F3551BB@sbc-exmsg2.sbcounty.gov>	<47610768.8080203@vlnb.net>
	<476233B7.70102@mellanox.com>
Message-ID: <47624BE6.8090909@vlnb.net>

Vu Pham wrote:
>>>3) This may not be the forum for this, but how can you terminate a 
>>>session using SCST proc commands?
>>
>>SCST can't (and shouldn't) do that, because it has no knowledge about 
>>how sessions with a particular target transport are created and 
>>destroyed. Session management is the target driver's duty. Ask Vu 
>>about that feature.
> 
> Normally the {srp connection, scst session} tuple is 
> destroyed upon a disconnect request or a new connection 
> request (without the multi-channel flag) from the initiator, 
> or when the QP is in an error condition.
> 
> Sometimes we fail to terminate the scst session when the 
> number of outstanding I/Os on the scst session does not 
> match the number of outstanding I/Os on the srp connection. 
> Right after srp calls scst_init_cmd_done, it increments the 
> active command counter, and it decrements the counter in 
> on_free_command.
> 
> Any idea Vlad?

Sorry, I don't understand, what's the problem with them?

Vlad


From swise at opengridcomputing.com  Fri Dec 14 09:36:06 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 14 Dec 2007 11:36:06 -0600
Subject: [ofa-general] SDP and iWARP in OFED 1.2?
In-Reply-To: <47605E5A.8060704@hpc.ufl.edu>
References: <47605E5A.8060704@hpc.ufl.edu>
Message-ID: <4762BF06.3000809@opengridcomputing.com>


Craig Prescott wrote:
>
> Hi;
>
> I am trying run the netperf SDP_STREAM test between a
> pair of Chelsio S310-SR iWARP cards in x86_64 hosts
> (CentOS 5.0, OFED 1.2).
>
> My netperf client command starts like so:
>
> [root at tebow1 ~]# LD_PRELOAD=/usr/lib64/libsdp.so 
> /opt/netperf/bin/netperf -l 10 -H xx.xx.xx.xx -L yy.yy.yy.yy -c -C -t 
> SDP_STREAM
> SDP STREAM TEST from yy.yy.yy.yy (yy.yy.yy.yy) port 0 AF_INET to 
> xx.xx.xx.xx (xx.xx.xx.xx) port 0 AF_INET
>
> And then the client panics.  The RIP on the console refers to
> ib_sdp:sdp_connected_handler, and I can see rdma_cm:cma_iw_handler
> and iw_cm:cm_work_handler in the traceback.  From strace on the
> client, the last output I get is:
>
> connect(8, {sa_family=AF_INET, sin_port=htons(12865), 
> sin_addr=inet_addr("128.227.253.92")}, 128
>
> I can see the connection on the netserver host in sdpnetstat:
>
> Proto Recv-Q Send-Q Local Address           Foreign Address
> sdp        0      0 xx.xx.xx.xx:51804       yy.yy.yy.yy:48590
>
> My libsdp.conf on both hosts is very simple:
>
> use both server * *:*
> use both client * *:*
>
> The hosts I'm using also have 4X SDR IB HCAs (mthca),
> if that matters.  The SDP_STREAM test runs perfectly
> over IB.
>
> Is SDP on iWARP supposed to work in OFED 1.2?  Any advice
> or hints on getting it working would be most appreciated.
>

Unfortunately, SDP over Chelsio's iWARP rnic hasn't been tested at all 
and is not supported.

Also, if you want to use RDMA over Chelsio's rnic, I recommend you get 
the latest 1.2.5.4 daily build which has all the current cxgb3 fixes 
included.

Alsoalso, Chelsio has TOE/Sockets drivers available on their web site to 
support full TCP offload via the sockets interface.  Checkout 
http://service.chelsio.com for the source.

Steve.


From swise at opengridcomputing.com  Fri Dec 14 10:45:04 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 14 Dec 2007 12:45:04 -0600
Subject: [ofa-general] ofed-1.3-rc1 problem
Message-ID: <4762CF30.2030202@opengridcomputing.com>

linking with libibumad fails on ofed-1.3-rc1.  I get a 'cannot find 
-libumad' from ld.  I looked in /usr/lib64 and there wasn't a link from 
libibumad.so to libibumad.so.1.0.2.  I added the link and the ld works 
now.  This was on PPC64.

I think this is some install problem with libibumad.

Steve.
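
The workaround described above can be sketched as a small shell helper.
This is a minimal sketch, not a fix for the OFED installer: the function
name link_dev_so is hypothetical, and the directory and file names are
the ones reported in the message (/usr/lib64, libibumad.so.1.0.2).

```shell
# Hypothetical helper: create the unversioned development symlink
# LINK -> TARGET inside LIBDIR if it does not exist yet.
link_dev_so() {
    libdir=$1
    target=$2
    link=$3
    # Refuse to create a dangling link.
    [ -e "$libdir/$target" ] || {
        echo "missing $libdir/$target" >&2
        return 1
    }
    # Leave an existing file or link untouched.
    [ -e "$libdir/$link" ] || ln -s "$target" "$libdir/$link"
}

# Usage (as root):
#   link_dev_so /usr/lib64 libibumad.so.1.0.2 libibumad.so
```

The link is relative, so it stays valid if the library directory is
relocated as a whole.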


From prescott at hpc.ufl.edu  Fri Dec 14 11:00:46 2007
From: prescott at hpc.ufl.edu (Craig Prescott)
Date: Fri, 14 Dec 2007 14:00:46 -0500
Subject: [ofa-general] SDP and iWARP in OFED 1.2?
In-Reply-To: <4762BF06.3000809@opengridcomputing.com>
References: <47605E5A.8060704@hpc.ufl.edu>
	<4762BF06.3000809@opengridcomputing.com>
Message-ID: <4762D2DE.7000902@hpc.ufl.edu>

Steve Wise wrote:
> 
> 
> Craig Prescott wrote:
>>
<snip>
>> Is SDP on iWARP supposed to work in OFED 1.2?  Any advice
>> or hints on getting it working would be most appreciated.
>>
> 
> Unfortunately, SDP over Chelsio's iWARP rnic hasn't been tested at all 
> and is not supported.
> 
> Also, if you want to use RDMA over Chelsio's rnic, I recommend you get 
> the latest 1.2.5.4 daily build which has all the current cxgb3 fixes 
> included.
> 
> Alsoalso, Chelsio has TOE/Sockets drivers available on their web site to 
> support full TCP offload via the sockets interface.  Checkout 
> http://service.chelsio.com for the source.

Thanks for the advice!  I will try out the 1.2.5.4 daily build
on our hosts with the Chelsio RNICs and visit service.chelsio.com.
I am still interested in SDP on iWARP, though - are there any
OFED-supported RNICs over which SDP would work?

As I mentioned, the nodes with the Chelsio RNICs also have mthca
devices connected to our IB fabric.  Are 1.2 and 1.2.5.x meant to
be interoperable?

Thanks,
Craig


From swise at opengridcomputing.com  Fri Dec 14 11:34:55 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 14 Dec 2007 13:34:55 -0600
Subject: [ofa-general] SDP and iWARP in OFED 1.2?
In-Reply-To: <4762D2DE.7000902@hpc.ufl.edu>
References: <47605E5A.8060704@hpc.ufl.edu>
	<4762BF06.3000809@opengridcomputing.com>
	<4762D2DE.7000902@hpc.ufl.edu>
Message-ID: <4762DADF.8060604@opengridcomputing.com>


Craig Prescott wrote:
> Steve Wise wrote:
>>
>>
>> Craig Prescott wrote:
>>>
> <snip>
>>> Is SDP on iWARP supposed to work in OFED 1.2?  Any advice
>>> or hints on getting it working would be most appreciated.
>>>
>>
>> Unfortunately, SDP over Chelsio's iWARP rnic hasn't been tested at 
>> all and is not supported.
>>
>> Also, if you want to use RDMA over Chelsio's rnic, I recommend you 
>> get the latest 1.2.5.4 daily build which has all the current cxgb3 
>> fixes included.
>>
>> Alsoalso, Chelsio has TOE/Sockets drivers available on their web site 
>> to support full TCP offload via the sockets interface.  Checkout 
>> http://service.chelsio.com for the source.
>
> Thanks for the advice!  I will try out the 1.2.5.4 daily build
> on our hosts with the Chelsio RNICs and visit service.chelsio.com.
> I am still interested in SDP on iWARP, though - are there any
> OFED-supported RNICs over which SDP would work?
>
Chelsio is the only iWARP driver in ofed-1.2.x.  ofed-1.3 has NetEffect 
too.  Dunno if they support SDP.

If you go up to 1.2.5.4 or ofed-1.3-rc1 and try out SDP over chelsio, 
I'll help ya debug stuff as time permits.

> As I mentioned, the nodes with the Chelsio RNICs also have mthca
> devices connected to our IB fabric.  Are 1.2 and 1.2.5.x meant to
> be interoperable?

They should be.  It's just that there are a few dozen bug fixes from 1.2 
to 1.2.5.4 in the Chelsio drivers.  Also, if you go up to 1.2.5.4, 
you'll need the 5.0 firmware from Chelsio's web site.  Pull down the 
firmware and put it in /lib/firmware.  Then it should get upgraded 
automagically.

Stevo.


From swise at opengridcomputing.com  Fri Dec 14 11:53:10 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 14 Dec 2007 13:53:10 -0600
Subject: [ofa-general] SDP and iWARP in OFED 1.2?
In-Reply-To: <4762DADF.8060604@opengridcomputing.com>
References: <47605E5A.8060704@hpc.ufl.edu>
	<4762BF06.3000809@opengridcomputing.com>
	<4762D2DE.7000902@hpc.ufl.edu>
	<4762DADF.8060604@opengridcomputing.com>
Message-ID: <4762DF26.1030904@opengridcomputing.com>


>
>> As I mentioned, the nodes with the Chelsio RNICs also have mthca
>> devices connected to our IB fabric.  Are 1.2 and 1.2.5.x meant to
>> be interoperable?
>
> They should be.  It's just that there are a few dozen bug fixes from 
> 1.2 to 1.2.5.4 in the Chelsio drivers.  Also, if you go up to 1.2.5.4, 
> you'll need the 5.0 firmware from Chelsio's web site.  Pull down the 
> firmware and put it in /lib/firmware.  Then it should get upgraded 
> automagically.
>
> Stevo.
>
>

Note there's no interoperability between IB and iWARP...  I mean 
ofed-1.2 should interoperate with ofed-1.2.5 using the same 
devices/transports...


From rdreier at cisco.com  Fri Dec 14 12:08:24 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 14 Dec 2007 12:08:24 -0800
Subject: [ofa-general] [ANNOUNCE] libmlx4 1.0 released
Message-ID: <adahcil59af.fsf@cisco.com>

libmlx4 is a userspace driver for Mellanox ConnectX InfiniBand HCAs.
It is a plug-in module for libibverbs that allows programs to use
Mellanox hardware directly from userspace.

The first stable release, libmlx4 1.0, is available from

    http://www.openfabrics.org/downloads/mlx4/libmlx4-1.0.tar.gz

with sha1sum

    3f11484c71f496730f9b465b2a6e2b84b5522a94  /data/home/roland/libmlx4-1.0.tar.gz
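
A minimal sketch of checking a downloaded tarball against the published
checksum, assuming GNU coreutils sha1sum is installed and the tarball
was saved in the current directory. The helper name verify_sha1 is
hypothetical:

```shell
# Hypothetical helper: succeed only if FILE's SHA-1 digest equals EXPECTED.
verify_sha1() {
    file=$1
    expected=$2
    # sha1sum prints "<digest>  <file>"; keep only the digest.
    actual=$(sha1sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Usage:
#   verify_sha1 libmlx4-1.0.tar.gz 3f11484c71f496730f9b465b2a6e2b84b5522a94 \
#       && echo "checksum OK"
```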

I also tagged the 1.0 release of libmlx4 and pushed it out to my
git tree on kernel.org:

    git://git.kernel.org/pub/scm/libs/infiniband/libmlx4.git

(the name of the tag is libmlx4-1.0).

This is the first formal release of libmlx4.  Things appear quite
usable at the moment.  Please test and let me know if you see anything
that needs to be fixed.

Builds for the Ubuntu 8.04 development release will be available by
adding the lines

    deb http://ppa.launchpad.net/roland-digitalvampire/ubuntu hardy main
    deb-src http://ppa.launchpad.net/roland-digitalvampire/ubuntu hardy main

to your /etc/apt/sources.list file.  I have also started the process of
getting libmlx4 packages into the main Debian and Fedora archives.

The complete list of changes since 1.0-rc1 is:

Roland Dreier (3):
      Update Debian policy version to 3.7.3
      Update summary line to mention "ConnectX" in Fedora RPM spec file
      Roll libmlx4 1.0 release


From prescott at hpc.ufl.edu  Fri Dec 14 12:12:54 2007
From: prescott at hpc.ufl.edu (Craig Prescott)
Date: Fri, 14 Dec 2007 15:12:54 -0500
Subject: [ofa-general] SDP and iWARP in OFED 1.2?
In-Reply-To: <4762DADF.8060604@opengridcomputing.com>
References: <47605E5A.8060704@hpc.ufl.edu>
	<4762BF06.3000809@opengridcomputing.com>
	<4762D2DE.7000902@hpc.ufl.edu>
	<4762DADF.8060604@opengridcomputing.com>
Message-ID: <4762E3C6.5040307@hpc.ufl.edu>

Steve Wise wrote:
> 
> 
> Craig Prescott wrote:
>> Steve Wise wrote:
>>>
>>>
>>> Craig Prescott wrote:
>>>>
>> <snip>
>>>> Is SDP on iWARP supposed to work in OFED 1.2?  Any advice
>>>> or hints on getting it working would be most appreciated.
>>>>
>>>
>>> Unfortunately, SDP over Chelsio's iWARP rnic hasn't been tested at 
>>> all and is not supported.
>>>
>>> Also, if you want to use RDMA over Chelsio's rnic, I recommend you 
>>> get the latest 1.2.5.4 daily build which has all the current cxgb3 
>>> fixes included.
>>>
>>> Alsoalso, Chelsio has TOE/Sockets drivers available on their web site 
>>> to support full TCP offload via the sockets interface.  Checkout 
>>> http://service.chelsio.com for the source.
>>
>> Thanks for the advice!  I will try out the 1.2.5.4 daily build
>> on our hosts with the Chelsio RNICs and visit service.chelsio.com.
>> I am still interested in SDP on iWARP, though - are there any
>> OFED-supported RNICs over which SDP would work?
>>
> Chelsio is the only iWARP driver in ofed-1.2.x.  ofed-1.3 has NetEffect 
> too.  Dunno if they support SDP.
> 
> If you go up to 1.2.5.4 or ofed-1.3-rc1 and try out SDP over chelsio, 
> I'll help ya debug stuff as time permits.

That would be great.  I'll try one of these on Monday and
let you know how it goes.

>> As I mentioned, the nodes with the Chelsio RNICs also have mthca
>> devices connected to our IB fabric.  Are 1.2 and 1.2.5.x meant to
>> be interoperable?
> 
> They should be.  It's just that there are a few dozen bug fixes from 1.2 
> to 1.2.5.4 in the Chelsio drivers.  Also, if you go up to 1.2.5.4, 
> you'll need the 5.0 firmware from Chelsio's web site.  Pull down the 
> firmware and put it in /lib/firmware.  Then it should get upgraded 
> automagically.
> 

Ok, will do.

Thanks a ton,
Craig


From prescott at hpc.ufl.edu  Fri Dec 14 12:14:54 2007
From: prescott at hpc.ufl.edu (Craig Prescott)
Date: Fri, 14 Dec 2007 15:14:54 -0500
Subject: [ofa-general] SDP and iWARP in OFED 1.2?
In-Reply-To: <4762DF26.1030904@opengridcomputing.com>
References: <47605E5A.8060704@hpc.ufl.edu>
	<4762BF06.3000809@opengridcomputing.com>
	<4762D2DE.7000902@hpc.ufl.edu>
	<4762DADF.8060604@opengridcomputing.com>
	<4762DF26.1030904@opengridcomputing.com>
Message-ID: <4762E43E.8070500@hpc.ufl.edu>

Steve Wise wrote:
> 
>>
>>> As I mentioned, the nodes with the Chelsio RNICs also have mthca
>>> devices connected to our IB fabric.  Are 1.2 and 1.2.5.x meant to
>>> be interoperable?
>>
>> They should be.  It's just that there are a few dozen bug fixes from 
>> 1.2 to 1.2.5.4 in the Chelsio drivers.  Also, if you go up to 1.2.5.4, 
>> you'll need the 5.0 firmware from Chelsio's web site.  Pull down the 
>> firmware and put it in /lib/firmware.  Then it should get upgraded 
>> automagically.
>>
>> Stevo.
>>
>>
> 
> Note there's no interoperability between IB and iWARP...  I mean 
> ofed-1.2 should interoperate with ofed-1.2.5 using the same 
> devices/transports...
> 

Thanks - that's what I was meaning when I asked.  Good to know.

Cheers,
Craig


From sashak at voltaire.com  Fri Dec 14 14:16:36 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 14 Dec 2007 22:16:36 +0000
Subject: [ofa-general] [PATCH] opensm: recover only for base LID values >=
	0xc000
In-Reply-To: <20071213223704.GL708@sashak.voltaire.com>
References: <20071213012552.GZ23319@sashak.voltaire.com>
	<1197581871.23465.107.camel@hrosenstock-ws.xsigo.com>
	<20071213223704.GL708@sashak.voltaire.com>
Message-ID: <20071214221636.GF23319@sashak.voltaire.com>


In the initial stage of a subnet, a LID value of 0 in a PortInfo
response can be valid. Also, when a LID was zeroed externally, OpenSM
ignored the event (this was the reason for the regression shown in
recent ibmgtsim runs). So we will try to recover the LID value only when
it is >= 0xc000. This should fix bug 246 as well, and it preserves the
original OpenSM LID "trimmer" behavior for the case of LID = 0.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/opensm/osm_port_info_rcv.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
index ea0cb21..1a2a200 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -101,8 +101,7 @@ __osm_pi_rcv_set_sm(IN const osm_pi_rcv_t * const p_rcv,
 static void pi_rcv_check_and_fix_lid(osm_log_t *log, ib_port_info_t * const pi,
 				     osm_physp_t * p)
 {
-	if ((cl_ntoh16(pi->base_lid) > IB_LID_UCAST_END_HO) ||
-	    (cl_ntoh16(pi->base_lid) < IB_LID_UCAST_START_HO)) {
+	if (cl_ntoh16(pi->base_lid) > IB_LID_UCAST_END_HO) {
 		osm_log(log, OSM_LOG_ERROR,
 			"pi_rcv_check_and_fix_lid: ERR 0F04: "
 			"Got invalid base LID 0x%x from the network. "
-- 
1.5.3.4.206.g58ba4
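
The effect of the narrowed check in the patch above can be sketched in
isolation. This is a minimal illustration in host byte order, not the
OpenSM code: the macro and function names are hypothetical, the
0x0001..0xBFFF unicast range is the standard InfiniBand LID layout, and
falling back to a previously known LID is only an assumption about what
"recover" means here.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for IB_LID_UCAST_START_HO / IB_LID_UCAST_END_HO. */
#define LID_UCAST_START 0x0001
#define LID_UCAST_END   0xBFFF

/* Check before the patch: LID 0 is rejected together with LIDs >= 0xc000. */
static int lid_needs_recovery_old(uint16_t lid)
{
	return lid > LID_UCAST_END || lid < LID_UCAST_START;
}

/* Check after the patch: only LIDs above the unicast range are rejected. */
static int lid_needs_recovery(uint16_t lid)
{
	return lid > LID_UCAST_END;
}

/* Return the LID to keep: the reported one, or the known one on corruption. */
static uint16_t check_and_fix_lid(uint16_t reported, uint16_t known)
{
	/* Assumption: the fallback value is itself not in the rejected range. */
	assert(known <= LID_UCAST_END);
	return lid_needs_recovery(reported) ? known : reported;
}
```

With these helpers a reported LID of 0 passes through unchanged, while a
value such as 0xc000 is replaced by the known LID. The real function works
on the network-order field and converts it with cl_ntoh16() first.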


From adit.262 at gmail.com  Fri Dec 14 14:38:36 2007
From: adit.262 at gmail.com (Adit Ranadive)
Date: Fri, 14 Dec 2007 17:38:36 -0500
Subject: [ofa-general] HCA Firmware Upgrade from 4.6.0 to 4.8.200
Message-ID: <d2ad857f0712141438sf0ea317kc2cf939021d3c5ff@mail.gmail.com>

Hi,

I was trying to perform an upgrade from the version 4.6.0 to the
version 4.8.200 as given on the Mellanox site.
I have the following HCA:
InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor
compatibility mode) (rev a0) - from lspci

fw_ver:                         4.6.0
        node_guid:                      0005:ad00:0005:4208
        sys_image_guid:                 0005:ad00:0005:420b
        vendor_id:                      0x05ad
        vendor_part_id:                 25208
        hw_ver:                         0xA0
        phys_port_cnt:                  2


-- 
Adit Ranadive
MS CS Candidate
Georgia Institute of Technology,
Atlanta, GA


From ssufficool at rov.sbcounty.gov  Fri Dec 14 14:44:30 2007
From: ssufficool at rov.sbcounty.gov (Sufficool, Stanley)
Date: Fri, 14 Dec 2007 14:44:30 -0800
Subject: [ofa-general] HCA Firmware Upgrade from 4.6.0 to 4.8.200
In-Reply-To: <d2ad857f0712141438sf0ea317kc2cf939021d3c5ff@mail.gmail.com>
Message-ID: <C2F174F99918D54CA2A96E57C5079B6F3551D2@sbc-exmsg2.sbcounty.gov>

Most likely: MHGA28-1T

You really need the PSID to properly identify this.

-----Original Message-----
From: general-bounces at lists.openfabrics.org
[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Adit
Ranadive
Sent: Friday, December 14, 2007 2:39 PM
To: general at lists.openfabrics.org
Subject: [ofa-general] HCA Firmware Upgrade from 4.6.0 to 4.8.200


Hi,

I was trying to perform an upgrade from the version 4.6.0 to the version
4.8.200 as given on the Mellanox site. I have the following HCA:
InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor
compatibility mode) (rev a0) - from lspci

fw_ver:                         4.6.0
        node_guid:                      0005:ad00:0005:4208
        sys_image_guid:                 0005:ad00:0005:420b
        vendor_id:                      0x05ad
        vendor_part_id:                 25208
        hw_ver:                         0xA0
        phys_port_cnt:                  2


-- 
Adit Ranadive
MS CS Candidate
Georgia Institute of Technology,
Atlanta, GA
_______________________________________________
general mailing list
general at lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit
http://openib.org/mailman/listinfo/openib-general


From adit.262 at gmail.com  Fri Dec 14 14:44:52 2007
From: adit.262 at gmail.com (Adit Ranadive)
Date: Fri, 14 Dec 2007 17:44:52 -0500
Subject: [ofa-general] Upgrade HCA Firmware from 4.6.0 to 4.8.200
Message-ID: <d2ad857f0712141444q6f1c5071rdcc991a651e85a44@mail.gmail.com>

Hi,

I was trying to perform an upgrade from the version 4.6.0 to the
version 4.8.200 as given on the Mellanox site.
I have the following HCA:
InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor
compatibility mode) (rev a0) - from lspci

(from ibv_devinfo)
fw_ver:                         4.6.0
       node_guid:                      0005:ad00:0005:4208
       sys_image_guid:                 0005:ad00:0005:420b
       vendor_id:                      0x05ad
       vendor_part_id:                 25208
       hw_ver:                         0xA0
       phys_port_cnt:                  2

I tried the following firmware versions from the Mellanox site
(http://www.mellanox.com/support/firmware_table_IH3Ex.php) :
MHEL-CF128-T
MHEL-CF128
Both give me the "Invariant sector mismatch" error.

Most of the other ones on the website give an error saying that they
are not compatible with this HCA.

Is there a new/correct version which can be used to upgrade my HCA?
Also, I am using flint to burn the image.

Thanks,
Adit
-- 
Adit Ranadive
MS CS Candidate
Georgia Institute of Technology,
Atlanta, GA


From danderson at lnxi.com  Fri Dec 14 17:31:06 2007
From: danderson at lnxi.com (David B. Anderson)
Date: Fri, 14 Dec 2007 18:31:06 -0700
Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to select
 sp4 patches for
 SLES9 kernel with minor versions equal or greater than 305
Message-ID: <47632E5A.5070008@lnxi.com>

Hi,

I've created the attached patch for OFED 1.2.5.4 to have the kernel for
SLES9 SP4 recognized (2.6.5-7.308).

Even with the patch, two backport patches did not apply cleanly
(cxg3_to_2_6_20.patch, rds_to_2_6_9.patch). I hand-patched them, but now
I'm getting the following compiler errors:

In file included from /usr/src/linux-2.6.5-7.308/include/linux/module.h:10,
                 from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/module.h:4,
                 from /usr/src/linux-2.6.5-7.308/include/linux/device.h:21,
                 from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:4,
                 from /usr/src/linux-2.6.5-7.308/include/linux/netdevice.h:38,
                 from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/netdevice.h:4,
                 from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:32:
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/sched.h:8: warning: static declaration for `wait_for_completion_timeout' follows non-static
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:67: warning: initialization from incompatible pointer type
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c: In function `addr_resolve_remote':
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:192: error: structure has no member named `idev'
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:193: error: structure has no member named `idev'
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:197: error: structure has no member named `idev'
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c: At top level:
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:48: warning: `class_create' defined but not used
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:82: warning: `class_destroy' defined but not used
/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:108: warning: `class_device_create' defined but not used
make[6]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.o] Error 1
make[5]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core] Error 2
make[4]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband] Error 2
make[3]: *** [_module_/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default] Error 2
make[2]: *** [modules] Error 2
make[1]: *** [modules] Error 2
make[1]: Leaving directory `/usr/src/linux-2.6.5-7.308-obj/x86_64/default'
make: *** [kernel] Error 2

Does anyone have OFED 1.2.5.4 building for SLES 9 SP4?

Thanks

-- 
David B. Anderson 
Linux Networx
Sr. Software Engineer
Email: danderson at lnxi.com
Phone: (801) 649-1311

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-LNXI-Fixed-ofed_scripts-configure-to-select-sp4-patc.patch
Type: text/x-patch
Size: 993 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071214/8104fb46/attachment.bin>

From kliteyn at mellanox.co.il  Fri Dec 14 21:14:22 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 15 Dec 2007 07:14:22 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-15:normal completion
Message-ID: <MTLEXCH01QXv1C3wYEQ00000919@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-14
OpenSM git rev = Thu_Dec_13_04:35:53_2007 [e9b0b6dfaeee43707d551fa0ff00127a163c5905]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=640  Pass=576  Fail=64
 
 
Pass:
48 Stability IS1-16.topo
48 OsmTest IS1-16.topo
48 OsmStress IS1-16.topo
48 Multicast IS1-16.topo
47 Pkey IS1-16.topo
16 Stability IS3-loop.topo
16 Stability IS3-128.topo
16 Pkey IS3-128.topo
16 OsmTest IS3-loop.topo
16 OsmTest IS3-128.topo
16 OsmStress IS3-128.topo
16 Multicast IS3-loop.topo
16 Multicast IS3-128.topo
16 FatTree merge-roots-4-ary-2-tree.topo
16 FatTree merge-root-4-ary-3-tree.topo
16 FatTree gnu-stallion-64.topo
16 FatTree blend-4-ary-2-tree.topo
16 FatTree RhinoDDR.topo
16 FatTree FullGnu.topo
16 FatTree 4-ary-2-tree.topo
16 FatTree 2-ary-4-tree.topo
16 FatTree 12-node-spaced.topo
16 FTreeFail 4-ary-2-tree-missing-sw-link.topo
16 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
16 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
16 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
1 LidMgr IS1-16.topo

Failures:
47 LidMgr IS1-16.topo
16 LidMgr IS3-128.topo
1 Pkey IS1-16.topo


From vlad at lists.openfabrics.org  Sat Dec 15 03:05:34 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sat, 15 Dec 2007 03:05:34 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071215-0200 daily build status
Message-ID: <20071215110534.BBE6BE62301@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.12
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.12
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.12
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.15
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:


From rdreier at cisco.com  Sat Dec 15 03:47:05 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Sat, 15 Dec 2007 03:47:05 -0800
Subject: [ofa-general] [ANNOUNCE] libmlx4 1.0 released
References: <adahcil59af.fsf@cisco.com>
Message-ID: <aday7bw41ty.fsf@cisco.com>

 > Builds for the Ubuntu 8.04 development release will be available by
 > adding the lines
 > 
 >     deb http://ppa.launchpad.net/roland-digitalvampire/ubuntu hardy main
 >     deb-src http://ppa.launchpad.net/roland-digitalvampire/ubuntu hardy main

Correction:

    deb http://ppa.launchpad.net/roland.dreier/ubuntu hardy main
    deb-src http://ppa.launchpad.net/roland.dreier/ubuntu hardy main

 - R.


From sashak at voltaire.com  Sat Dec 15 12:39:42 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 15 Dec 2007 20:39:42 +0000
Subject: [ofa-general] [PATCH] opensm/osm_port_info_rcv: node instead of port
	as parameter for osm_pi_rcv_process_set()
Message-ID: <20071215203942.GG23319@sashak.voltaire.com>


The osm_pi_rcv_process_set() function does not require an osm_port_t
object, only an osm_node_t, so pass the node instead.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/opensm/osm_port_info_rcv.c |   12 +++++-------
 1 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/opensm/opensm/osm_port_info_rcv.c b/opensm/opensm/osm_port_info_rcv.c
index 1a2a200..1987e2c 100644
--- a/opensm/opensm/osm_port_info_rcv.c
+++ b/opensm/opensm/osm_port_info_rcv.c
@@ -505,11 +505,10 @@ osm_pi_rcv_init(IN osm_pi_rcv_t * const p_rcv,
  **********************************************************************/
 void
 osm_pi_rcv_process_set(IN const osm_pi_rcv_t * const p_rcv,
-		       IN osm_port_t * const p_port,
+		       IN osm_node_t * const p_node,
 		       IN const uint8_t port_num, IN osm_madw_t * const p_madw)
 {
 	osm_physp_t *p_physp;
-	osm_node_t *p_node;
 	ib_net64_t port_guid;
 	ib_smp_t *p_smp;
 	ib_port_info_t *p_pi;
@@ -520,7 +519,6 @@ osm_pi_rcv_process_set(IN const osm_pi_rcv_t * const p_rcv,
 
 	p_context = osm_madw_get_pi_context_ptr(p_madw);
 
-	p_node = p_port->p_node;
 	CL_ASSERT(p_node);
 
 	p_physp = osm_node_get_physp_ptr(p_node, port_num);
@@ -647,6 +645,9 @@ void osm_pi_rcv_process(IN void *context, IN void *data)
 		goto Exit;
 	}
 
+	p_node = p_port->p_node;
+	CL_ASSERT(p_node);
+
 	/*
 	   If we were setting the PortInfo, then receiving
 	   this attribute was not part of sweeping the subnet.
@@ -659,7 +660,7 @@ void osm_pi_rcv_process(IN void *context, IN void *data)
 	   boolean around to determine if we were doing Get() or Set().
 	 */
 	if (p_context->set_method)
-		osm_pi_rcv_process_set(p_rcv, p_port, port_num, p_madw);
+		osm_pi_rcv_process_set(p_rcv, p_node, port_num, p_madw);
 	else {
 		p_port->discovery_count++;
 
@@ -678,9 +679,6 @@ void osm_pi_rcv_process(IN void *context, IN void *data)
 				cl_ntoh64(node_guid),
 				cl_ntoh64(p_smp->trans_id));
 
-		p_node = p_port->p_node;
-		CL_ASSERT(p_node);
-
 		p_physp = osm_node_get_physp_ptr(p_node, port_num);
 
 		/*
-- 
1.5.3.4.206.g58ba4


From sashak at voltaire.com  Sat Dec 15 12:40:26 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 15 Dec 2007 20:40:26 +0000
Subject: [ofa-general] [PATCH] opensm/osm_node_new: move p_node->print_desc
	setup
Message-ID: <20071215204026.GH23319@sashak.voltaire.com>


Move the p_node->print_desc setup into the section where p_node is
initialized. This inverts the p_node allocation check (return early on
failure, then set up the node) in order to simplify the code.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/opensm/osm_node.c |   39 ++++++++++++++++++++-------------------
 1 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/opensm/opensm/osm_node.c b/opensm/opensm/osm_node.c
index 76f4407..39f4181 100644
--- a/opensm/opensm/osm_node.c
+++ b/opensm/opensm/osm_node.c
@@ -112,25 +112,26 @@ osm_node_t *osm_node_new(IN const osm_madw_t * const p_madw)
 	size = p_ni->num_ports;
 
 	p_node = malloc(sizeof(*p_node) + sizeof(osm_physp_t) * size);
-	if (p_node != NULL) {
-		memset(p_node, 0, sizeof(*p_node) + sizeof(osm_physp_t) * size);
-		p_node->node_info = *p_ni;
-		p_node->physp_tbl_size = size + 1;
-
-		/*
-		   Construct Physical Port objects owned by this Node.
-		   Then, initialize the Physical Port through with we
-		   discovered this port.
-		   For switches, all ports have the same GUID.
-		   For CAs and routers, each port has a different GUID, so we only
-		   know the GUID for the port that responded to our
-		   Get(NodeInfo).
-		 */
-		for (i = 0; i < p_node->physp_tbl_size; i++)
-			osm_physp_construct(&p_node->physp_table[i]);
-
-		osm_node_init_physp(p_node, p_madw);
-	}
+	if (!p_node)
+		return NULL;
+
+	memset(p_node, 0, sizeof(*p_node) + sizeof(osm_physp_t) * size);
+	p_node->node_info = *p_ni;
+	p_node->physp_tbl_size = size + 1;
+
+	/*
+	   Construct Physical Port objects owned by this Node.
+	   Then, initialize the Physical Port through with we
+	   discovered this port.
+	   For switches, all ports have the same GUID.
+	   For CAs and routers, each port has a different GUID, so we only
+	   know the GUID for the port that responded to our
+	   Get(NodeInfo).
+	 */
+	for (i = 0; i < p_node->physp_tbl_size; i++)
+		osm_physp_construct(&p_node->physp_table[i]);
+
+	osm_node_init_physp(p_node, p_madw);
 	p_node->print_desc = strdup("<unknown>");
 
 	return (p_node);
-- 
1.5.3.4.206.g58ba4
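
The control flow the patch above arrives at (allocate, return early on
failure, then initialize everything including print_desc) can be sketched
on its own. Judging from the diff context, the pre-patch code assigned
p_node->print_desc outside the NULL check, so the early return also keeps
that assignment from going through a NULL pointer. The types and names
below are hypothetical stand-ins, not the OpenSM definitions; in
particular the flexible array member and the dup_string() helper (used in
place of strdup()) are assumptions made to keep the sketch portable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct physp { int constructed; };

struct node {
	char *print_desc;
	unsigned physp_tbl_size;
	struct physp physp_table[];	/* C99 flexible array member */
};

/* Portable stand-in for strdup(); returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Allocate a node with one table slot per port plus slot 0. */
static struct node *node_new(unsigned num_ports)
{
	unsigned tbl_size = num_ports + 1;
	size_t bytes = sizeof(struct node) + sizeof(struct physp) * tbl_size;
	struct node *n = malloc(bytes);
	unsigned i;

	if (!n)
		return NULL;	/* nothing below may touch a NULL node */

	memset(n, 0, bytes);
	n->physp_tbl_size = tbl_size;
	for (i = 0; i < n->physp_tbl_size; i++)
		n->physp_table[i].constructed = 1;

	n->print_desc = dup_string("<unknown>");
	return n;
}

static void node_delete(struct node *n)
{
	if (!n)
		return;
	assert(n->physp_tbl_size > 0);
	free(n->print_desc);
	free(n);
}
```

A caller would pair node_new(num_ports) with node_delete() and treat a
NULL return as an allocation failure.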


From kliteyn at mellanox.co.il  Sat Dec 15 21:16:55 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 16 Dec 2007 07:16:55 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-16:normal completion
Message-ID: <MTLEXCH01qYW8BYXifW00000a2a@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-15
OpenSM git rev = Thu_Dec_13_03:47:48_2007 [4a444ab121ac546672faf86b530f9eaeb9ec2503]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=519  Fail=1
 
 
Pass:
39 Stability IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
38 Pkey IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:
1 Pkey IS1-16.topo


From ogerlitz at voltaire.com  Sat Dec 15 22:11:20 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Sun, 16 Dec 2007 08:11:20 +0200
Subject: [ofa-general] RE: [PATCH 3/6] nes: add missing unlock in error path
	of nes_alloc_fmr()
In-Reply-To: <5E701717F2B2ED4EA60F87C8AA57B7CC07AF73CC@venom2>
References: <20071214004433.C3B61E28702@openfabrics.org>	<adazlwd6f8i.fsf@cisco.com>
	<5E701717F2B2ED4EA60F87C8AA57B7CC07AF73CC@venom2>
Message-ID: <4764C188.10609@voltaire.com>

Glenn Grundstrom wrote:
> Got it.  These 6 patches just bring the OFED tree in line with yours.

Glenn,

Can you split your postings such that upstream ones (e.g. to Roland's
tree) are sent to the general list and the others (call them the "OFED"
ones) are sent to the ewg list? Since your development is done for the
mainline kernel, such a separation will make it possible to track what's
going on development-wise.

For example, looking at this six-patch series, it seems that #2 (nes:
fix external crc32c() dependency) and #5 (nes: fix napi enable for
multiport boards) are upstream and the rest are for OFED?

Or.


From jackm at dev.mellanox.co.il  Sat Dec 15 22:27:04 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Sun, 16 Dec 2007 08:27:04 +0200
Subject: [ofa-general] XRC cleanup order issue
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FDE143AE8@G5W0278.americas.hpqcorp.net>
References: <D89C2C212795564B837FA1665CAE02990FDE143AE8@G5W0278.americas.hpqcorp.net>
Message-ID: <200712160827.04519.jackm@dev.mellanox.co.il>

On Wednesday 12 December 2007 17:24, Tang, Changqing wrote:
> 
> Hi,
>         This question is mainly for Mellanox engineers.
>
>         With XRC, the rank that creates the QP used for transport to all
> ranks on that node can NOT exit first if other ranks are still using the
> transport. This restriction is a problem for our dynamic process
> definition, where any rank could die for any reason without tearing down
> the whole application.
>
>         I am thinking about shared memory usage, where the creator does not
> have to stay alive while other processes are still using it; once the
> last process exits, the system cleans up the shared memory.
>
>         Can't XRC mimic the shared memory behavior?
> 
There is an issue that the QP needs to be associated with a protection domain (i.e., UAR area),
which is unique per user process.

One possibility is to have a separate process per host per job (XRC domain) create the XRC QPs on the receiving side.
There still would be the issue of what happens if that process somehow dies prematurely.

We'll examine the issue and see if there is some other solution.

- Jack


From moshek at voltaire.com  Sat Dec 15 23:12:24 2007
From: moshek at voltaire.com (Moshe Kazir)
Date: Sun, 16 Dec 2007 09:12:24 +0200
Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to select
	sp4 patches for SLES9 kernel with minor versions equal or
	greater than 305
In-Reply-To: <47632E5A.5070008@lnxi.com>
Message-ID: <39C75744D164D948A170E9792AF8E7CA4D2CE1@exil.voltaire.com>

See the patches in the attached message.

They were applied by Vlad.

Moshe

____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

  

-------------- next part --------------
An embedded message was scrubbed...
From: "Moshe Kazir" <moshek at voltaire.com>
Subject: [ofa-general] OFED-1.2.5 backport patches for SLES9 SP4 
Date: Sun, 25 Nov 2007 09:59:26 +0200
Size: 370044
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071216/33f1d5e6/attachment.mht>

From moshek at voltaire.com  Sun Dec 16 00:33:05 2007
From: moshek at voltaire.com (Moshe Kazir)
Date: Sun, 16 Dec 2007 10:33:05 +0200
Subject: [ofa-general] in PPC64 i'm able to register the code segment
	withwrite permission
In-Reply-To: <47600538.8030101@dev.mellanox.co.il>
Message-ID: <39C75744D164D948A170E9792AF8E7CA4D2CE6@exil.voltaire.com>

Which machine are you using for PPC64 testing?

Is it an IBM blade?

For the ConnectX HCA in IBM blades, the default firmware *.ini file
includes the line ->
Log2_uar_ber_megabytes=5

which supports 64 KB pages.

Does this help answer your question?

Moshe
 

____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

  
-----Original Message-----
From: general-bounces at lists.openfabrics.org
[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Dotan Barak
Sent: Wednesday, December 12, 2007 5:59 PM
To: openib-general
Subject: [ofa-general] in PPC64 i'm able to register the code segment
withwrite permission


Hi all.

I'm using the following machine attributes:
*************************************************************
Host Name         : mtlsqt185
Host Architecture : ppc64
Linux Distribution: SUSE Linux Enterprise Server 10 (ppc) VERSION = 10 
PATCHLEVEL = 1
Kernel Version    : 2.6.16.53-0.16-ppc64
GCC Version       : gcc (GCC) 4.1.2 20070115 (prerelease) (SUSE Linux)
Memory size       : 1740232 kB
Number of CPUs    : 8
cpu MHz           : 4005.000000MHz
MST Version       : 4.4.3
Driver Version    : OFED-1.2.5.4-20071210-0614
HCA ID(s)         : mlx4_0
HCA model(s)      : 25418
FW version(s)     : 2.3.906
Board(s)          : IBM08A0000001
*************************************************************

I'm executing the gen2_basic test (i guess any other test will do the 
trick too)
and i try to register one of the functions address (which is in the Code
Segment) with write permission enabled.

In all of the machines in our regression i fail to do it but in PPC64 i 
can do it.
When i checked the address which is being registered and the permission 
of it's VMA, i noticed
that the VMA of this function has write enable permission.

The function that i try to register is in address 0x1005ac80.

mtlsqt185:~ # cat /proc/17366/maps
00100000-00103000 r-xp 00100000 00:00 0
10000000-1004a000 r-xp 00000000 08:03 1063667 /tmp/tsscr/svn.mlx_tp/branches/ofed1.2.5/gen2/userspace/useraccess/gen2_basic/gen2_basic
1005a000-1005e000 rw-p 0004a000 08:03 1063667 /tmp/tsscr/svn.mlx_tp/branches/ofed1.2.5/gen2/userspace/useraccess/gen2_basic/gen2_basic
1005e000-1015f000 rw-p 1005e000 00:00 0 [heap]
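
The manual check against /proc/<pid>/maps shown here can also be done from
inside a test program. The following is only a minimal sketch, assuming a
Linux /proc filesystem; the helper name vma_perms() is made up for this
illustration and is not part of gen2_basic or of any OFED library.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find the mapping in /proc/self/maps that contains
 * addr and copy its permission string (for example "rw-p") into perms.
 * Returns 0 on success, -1 if the address is not mapped or the maps file
 * cannot be opened.
 */
static int vma_perms(const void *addr, char perms[5])
{
	uintptr_t target = (uintptr_t) addr;
	char line[512];
	int ret = -1;
	FILE *f;

	assert(perms != NULL);

	f = fopen("/proc/self/maps", "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof line, f)) {
		uint64_t start, end;
		char p[5];

		/* Each mapping line starts with "start-end perms ..." in hex. */
		if (sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s", &start, &end, p) != 3)
			continue;

		if (target >= start && target < end) {
			memcpy(perms, p, sizeof p);
			ret = 0;
			break;
		}
	}

	fclose(f);
	return ret;
}
```

Calling vma_perms() with the address that is about to be registered (the
function address cast to a const void *) and inspecting perms[1] tells
whether that VMA is writable, which is the same information read by hand
from the maps listing.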


Is this an IB issue?
Or is this a security hole in the Linux kernel on PPC (because viruses
could change the code in the code segment ...)?

thanks
Dotan
_______________________________________________
general mailing list
general at lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit
http://openib.org/mailman/listinfo/openib-general


From vu at mellanox.com  Sun Dec 16 01:01:29 2007
From: vu at mellanox.com (Vu Pham)
Date: Sun, 16 Dec 2007 01:01:29 -0800
Subject: [Scst-devel] [ofa-general] Re:  SRP Target Session Hangs
In-Reply-To: <47624BE6.8090909@vlnb.net>
References: <C2F174F99918D54CA2A96E57C5079B6F3551BB@sbc-exmsg2.sbcounty.gov>	<47610768.8080203@vlnb.net>
	<476233B7.70102@mellanox.com> <47624BE6.8090909@vlnb.net>
Message-ID: <4764E969.6010208@mellanox.com>

Vladislav Bolkhovitin wrote:
> Vu Pham wrote:
>>>> 3) This may not be the forum for this, but how can you terminate a 
>>>> session using SCST proc commands?
>>>
>>> SCST can't (and shouldn't) do that, because it has no knowledge about 
>>> how sessions with particular target transport created and destroyed. 
>>> Sessions management is the target driver's duty. Ask Vu that feature.
>>
>> Normally {srp connection, scst session} will be destroyed upon a
>> disconnect request or a new connection request (without the
>> multi-channel flag) coming from the initiator, or when the QP is in
>> an error condition.
>>
>> Sometimes we fail to terminate scst session when there are outstanding 
>> I/Os on scst session which does not match the outstanding I/Os on the 
>> srp connection. Right after srp calls scst_init_cmd_done, it increases 
>> the active command counter and decreases the counter on on_free_command.
>>
>> Any idea Vlad?
> 
> Sorry, I don't understand, what's the problem with them?
> 

I keep track of the number of outstanding scst_cmnd on a
{srp connection, scst session}, and I expect it to match
scst_sess->refcnt, or to be off by 1; however, it does not.
As a result, scst_unregister_session (wait 1) gets stuck
most of the time.

-vu


From vlad at lists.openfabrics.org  Sun Dec 16 03:05:32 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sun, 16 Dec 2007 03:05:32 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071216-0200 daily build status
Message-ID: <20071216110532.42D9DE6017D@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.14
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.13
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.17
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.20
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.18-1.2798.fc6

Failed:
Build failed on i686 with linux-2.6.18
Build failed on ppc64 with linux-2.6.18
Log:
Build failed on x86_64 with linux-2.6.18
Log:
Hunk #14 succeeded at 1536 (offset 4 lines).
Hunk #15 succeeded at 1824 (offset 3 lines).
Hunk #16 succeeded at 1873 (offset 3 lines).
Hunk #17 succeeded at 1884 (offset 3 lines).
2 out of 17 hunks FAILED -- rejects in file drivers/scsi/iscsi_tcp.c
patching file drivers/scsi/iscsi_tcp.h
Patch open-iscsi-tx-hash-fixes.patch does not apply (enforce with -f)

Failed executing /usr/bin/quilt
----------------------------------------------------------------------------------
Hunk #14 succeeded at 1536 (offset 4 lines).
Hunk #15 succeeded at 1824 (offset 3 lines).
Hunk #16 succeeded at 1873 (offset 3 lines).
Hunk #17 succeeded at 1884 (offset 3 lines).
2 out of 17 hunks FAILED -- rejects in file drivers/scsi/iscsi_tcp.c
patching file drivers/scsi/iscsi_tcp.h
Patch open-iscsi-tx-hash-fixes.patch does not apply (enforce with -f)

Failed executing /usr/bin/quilt
----------------------------------------------------------------------------------
Build failed on ia64 with linux-2.6.18
Log:
Hunk #14 succeeded at 1536 (offset 4 lines).
Hunk #15 succeeded at 1824 (offset 3 lines).
Hunk #16 succeeded at 1873 (offset 3 lines).
Hunk #17 succeeded at 1884 (offset 3 lines).
2 out of 17 hunks FAILED -- rejects in file drivers/scsi/iscsi_tcp.c
patching file drivers/scsi/iscsi_tcp.h
Patch open-iscsi-tx-hash-fixes.patch does not apply (enforce with -f)

Failed executing /usr/bin/quilt
----------------------------------------------------------------------------------


From tziporet at dev.mellanox.co.il  Sun Dec 16 03:14:59 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Sun, 16 Dec 2007 13:14:59 +0200
Subject: [ewg] RE: [ofa-general] Re: [GIT PULL ofed-1.2.5] - RDMA/cxgb3
	-	fixes and 5.0firmware support
In-Reply-To: <39C75744D164D948A170E9792AF8E7CA4D2CD5@exil.voltaire.com>
References: <39C75744D164D948A170E9792AF8E7CA4D2CD5@exil.voltaire.com>
Message-ID: <476508B3.9020708@mellanox.co.il>

Moshe Kazir wrote:
> What is the planned rate of releasing OFED-1.2.5.X   versions ?
>
> & what is the level of testing and QA ?
>
> Isn't the pace of moving 1.2.5.1 -> 1.2.5.5  too fast ?
>
>   
>   
We are doing partial QA for each minor release, and the frequency is a
little higher since ConnectX revealed bugs in IPoIB (some of them are
old bugs that we have not hit with Arbel).

We plan to have another minor release soon (we found 2 more kernel oopses
with IPoIB and ConnectX).
We will do more QA this time to ensure this version is more stable.

There are also requests from other HCA/NIC vendors for minor
releases.

Tziporet


From dotanb at dev.mellanox.co.il  Sun Dec 16 04:52:55 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Sun, 16 Dec 2007 14:52:55 +0200
Subject: [ofa-general] Re: The qperf doesn't compile IB test cases
In-Reply-To: <20071213152910.GA23928@cuprite.pathscale.com>
References: <47612B13.4000001@dev.mellanox.co.il>
	<20071213152910.GA23928@cuprite.pathscale.com>
Message-ID: <47651FA7.1090906@dev.mellanox.co.il>

Johann George wrote:
> Hello Dotan.
>
> What sort of error does it get?  I just tried the latest version from
> the git repository and it compiled without problems.
>
> Johann
>   
Hi.

Here is the server command + output:
[root at mtlsqt155 tmp]# qperf


Here is the client command + output:
[root at mtlsqt155 ~]# qperf mtlsqt155 rc_bi_bw
rc_bi_bw: bad test; try qperf --help

I used the qperf from the OFED distribution.

I think that in order to reproduce the issue I see, you need to use the
qperf from the OFED distribution instead of compiling it yourself.

thanks
Dotan


From taylor at hpc.ufl.edu  Sun Dec 16 06:46:24 2007
From: taylor at hpc.ufl.edu (Charles Taylor)
Date: Sun, 16 Dec 2007 09:46:24 -0500
Subject: [ofa-general] RDMA/CM Spec
Message-ID: <1A960510-3B9F-4902-B523-A2CB80B68849@hpc.ufl.edu>


I'm a little bit new to OFED/RDMA development and can't seem to find  
a spec or rfc describing the rdma/cm layer.     I don't see it in the  
iWARP consortium RDMAP spec (not even a reference to it) and googling  
turns up mainly references to the OFA-general mailing list.     Is  
there a spec for this?

Thanks,

Charlie Taylor
UF HPC Center


From tziporet at dev.mellanox.co.il  Sun Dec 16 06:57:18 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Sun, 16 Dec 2007 16:57:18 +0200
Subject: [ofa-general] OFED 1.2.5.4 build fails for kernel 2.6.23
In-Reply-To: <c177de4a0712111329y72e4580bub9e23d75d17338a0@mail.gmail.com>
References: <c177de4a0712111329y72e4580bub9e23d75d17338a0@mail.gmail.com>
Message-ID: <47653CCE.4040606@mellanox.co.il>

Chuck Hartley wrote:
> I tried building OFED 1.2.5.4 on a Fedora 7 system 
> with kernel  2.6.23.1-21.fc7 and got a fatal compile error.  
> Apparently the number of arguments to kmem_cache_create() changed from 
> 6 to 5 starting with kernel version 2.6.23. Error output below:

OFED 1.2.5.x does NOT support kernel 2.6.23 (or 2.6.24).
It does support kernel 2.6.22.

I guess you will hit more failures after you fix this one.
If you want to prepare backport patches for OFED 1.2.5.x, we will be
happy to add them to our backports.

Tziporet


From johann.george at qlogic.com  Sun Dec 16 07:12:41 2007
From: johann.george at qlogic.com (Johann George)
Date: Sun, 16 Dec 2007 07:12:41 -0800
Subject: [ofa-general] Re: The qperf doesn't compile IB test cases
In-Reply-To: <47651FA7.1090906@dev.mellanox.co.il>
References: <47612B13.4000001@dev.mellanox.co.il>
	<20071213152910.GA23928@cuprite.pathscale.com>
	<47651FA7.1090906@dev.mellanox.co.il>
Message-ID: <20071216151241.GA29625@cuprite.pathscale.com>

Dotan,

If the build script for qperf is unable to find libibverbs, it
generates a version that only contains the TCP/IP tests.  I had tested
out an earlier OFED 1.3 build several weeks ago and that version of
qperf did contain the RDMA tests.  So perhaps something changed with
the OFED build scripts.

I'll investigate.  Thanks for pointing it out.

Johann

On Sun, Dec 16, 2007 at 02:52:55PM +0200, Dotan Barak wrote:
> Here is the server command + output:
> [root at mtlsqt155 tmp]# qperf
> 
> 
> Here is the client command + output:
> [root at mtlsqt155 ~]# qperf mtlsqt155 rc_bi_bw
> rc_bi_bw: bad test; try qperf --help
> 
> I used the qperf from the OFED distribution.
> 
> I think that in order to reproduce the issue I see, you need to use the
> qperf from the OFED distribution instead of compiling it yourself.
> 
> thanks
> Dotan


From vlad at dev.mellanox.co.il  Sun Dec 16 07:25:18 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Sun, 16 Dec 2007 17:25:18 +0200
Subject: [ofa-general] Re: [ewg] ofed-1.3-rc1 problem
In-Reply-To: <4762CF30.2030202@opengridcomputing.com>
References: <4762CF30.2030202@opengridcomputing.com>
Message-ID: <4765435E.5020801@dev.mellanox.co.il>

Steve Wise wrote:
> linking with libibumad fails on ofed-1.3-rc1.  I get a 'cannot find 
> -libumad' from ld.  I looked in /usr/lib64 and there wasn't a link from 
> libibumad.so to libibumad.so.1.0.2.  I added the link and the ld works 
> now.  This was on PPC64.
> 
> I think this is some install problem with libibumad.
> 
> Steve.

Hi Steve,
Check that libibumad-devel is installed.

Regards,
Vladimir


From swise at opengridcomputing.com  Sun Dec 16 08:11:07 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Sun, 16 Dec 2007 10:11:07 -0600
Subject: [ofa-general] Re: [ewg] ofed-1.3-rc1 problem
In-Reply-To: <4765435E.5020801@dev.mellanox.co.il>
References: <4762CF30.2030202@opengridcomputing.com>
	<4765435E.5020801@dev.mellanox.co.il>
Message-ID: <47654E1B.5060308@opengridcomputing.com>

You're right!

It is not installed, but it was built by install.pl.  However, I didn't
explicitly request to build/install ibumad.  It's a prerequisite of
mvapich2, which I did ask to have built/installed.  So I think
install.pl needs to be fixed to pull this in as a prerequisite, maybe?


Vladimir Sokolovsky wrote:
> Steve Wise wrote:
>> linking with libibumad fails on ofed-1.3-rc1.  I get a 'cannot find 
>> -libumad' from ld.  I looked in /usr/lib64 and there wasn't a link 
>> from libibumad.so to libibumad.so.1.0.2.  I added the link and the ld 
>> works now.  This was on PPC64.
>>
>> I think this is some install problem with libibumad.
>>
>> Steve.
> 
> Hi Steve,
> Check that libibumad-devel is installed.
> 
> Regards,
> Vladimir


From jackm at dev.mellanox.co.il  Sun Dec 16 09:01:20 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Sun, 16 Dec 2007 19:01:20 +0200
Subject: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FDE048E18@G5W0278.americas.hpqcorp.net>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<ada4pexemsu.fsf@cisco.com>
	<D89C2C212795564B837FA1665CAE02990FDE048E18@G5W0278.americas.hpqcorp.net>
Message-ID: <200712161901.21050.jackm@dev.mellanox.co.il>

On Wednesday 05 December 2007 17:45, Tang, Changqing wrote:
> > I think the only alternative we have to preserve backwards
> > compatibility is to leave struct ibv_context_ops alone and
> > change the structure to:
> >
> > struct ibv_context {
> >         struct ibv_device      *device;
> >         struct ibv_context_ops  ops;
> >         int                     cmd_fd;
> >         int                     async_fd;
> >         int                     num_comp_vectors;
> >         pthread_mutex_t         mutex;
> >         void                   *abi_compat;
> >         struct ibv_xrc_op      *xrc_ops;
> > };
> >
> > with xrc_ops added at the end.  It's my fault for not making
> > the ops member a pointer I guess.
> >
> > Tziporet/Jack/whoever -- please fix up the libibverbs you
> > ship for OFED 1.3 to resolve this.
> >
> > We can clean this up for libibverbs 1.2 when the ABI can
> > change, if/when we have something worth breaking the ABI for.
> 

We need to have all userspace libraries zero their private context object at allocation time
(the private context object includes the ibv_context structure, which must now be zeroed out).

The other userspace driver libraries (e.g., libmthca) do not zero out their internal userspace
context structures (e.g., mthca_context), which include the ibv_context structure as the first element.
Up to now, we depended on the assignment of the ops structure in ibv_context to set unavailable
verb implementations to NULL (every userspace driver assigned the whole ops structure, and the
compiler set unimplemented operations to NULL).
This no longer holds for ops that are added outside that structure.

With that in place, anyone installing OFED will have a compatible set of userspace drivers for XRC
applications (drivers which do not implement XRC will return errors for XRC verbs).

Applications which were compiled with previous libraries will still work (since they do not use XRC).
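
To make the layout argument concrete, here is a small sketch using
stand-in structures. All names below (demo_ops, ctx_old, ctx_grown,
ctx_appended, ...) are hypothetical and simplified; they are not the real
libibverbs definitions. The sketch only compares member offsets: growing
the embedded ops structure moves the members that follow it, while
appending a pointer at the end leaves every existing offset unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real libibverbs definitions. */
struct demo_ops       { int (*query)(void); int (*poll)(void); };
struct demo_ops_grown { int (*query)(void); int (*poll)(void); int (*xrc)(void); };
struct demo_xrc_ops   { int (*open_xrc_domain)(void); };

/* Layout that already-compiled applications were built against. */
struct ctx_old {
	void            *device;
	struct demo_ops  ops;
	int              cmd_fd;
	int              async_fd;
};

/* Option A: grow the embedded ops structure; later members shift. */
struct ctx_grown {
	void                  *device;
	struct demo_ops_grown  ops;
	int                    cmd_fd;
	int                    async_fd;
};

/* Option B: leave ops alone and append a pointer at the end. */
struct ctx_appended {
	void                *device;
	struct demo_ops      ops;
	int                  cmd_fd;
	int                  async_fd;
	struct demo_xrc_ops *xrc_ops;
};

/* Returns 1 if old binaries still find cmd_fd and async_fd with option B. */
static int appended_keeps_offsets(void)
{
	/* Appending can only make the structure bigger, never rearrange it. */
	assert(sizeof(struct ctx_appended) > sizeof(struct ctx_old));

	return offsetof(struct ctx_appended, cmd_fd) == offsetof(struct ctx_old, cmd_fd) &&
	       offsetof(struct ctx_appended, async_fd) == offsetof(struct ctx_old, async_fd);
}

/* Returns 1 if option A moves cmd_fd away from where old binaries expect it. */
static int grown_ops_moves_fields(void)
{
	return offsetof(struct ctx_grown, cmd_fd) != offsetof(struct ctx_old, cmd_fd);
}
```

Both helpers return 1 for these stand-in layouts, which is the reason for
appending xrc_ops at the end rather than extending the embedded ops member.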

- Jack


From Arkady.Kanevsky at netapp.com  Sun Dec 16 09:11:41 2007
From: Arkady.Kanevsky at netapp.com (Kanevsky, Arkady)
Date: Sun, 16 Dec 2007 12:11:41 -0500
Subject: [ofa-general] RDMA/CM Spec
In-Reply-To: <1A960510-3B9F-4902-B523-A2CB80B68849@hpc.ufl.edu>
References: <1A960510-3B9F-4902-B523-A2CB80B68849@hpc.ufl.edu>
Message-ID: <C98692FD98048C41885E0B0FACD9DFB805AD014E@exnane01.hq.netapp.com>

Look for MPA http://www.ietf.org/rfc/rfc5044.txt.

Arkady Kanevsky                       email: arkady at netapp.com
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.        Fax: 781-895-1195
Waltham, MA 02451                   central phone: 781-768-5300
 

> -----Original Message-----
> From: Charles Taylor [mailto:taylor at hpc.ufl.edu] 
> Sent: Sunday, December 16, 2007 9:46 AM
> To: general at lists.openfabrics.org
> Subject: [ofa-general] RDMA/CM Spec
> 
> 
> I'm a little bit new to OFED/RDMA development and can't seem to find  
> a spec or rfc describing the rdma/cm layer.     I don't see 
> it in the  
> iWARP consortium RDMAP spec (not even a reference to it) and 
> googling  
> turns up mainly references to the OFA-general mailing list.     Is  
> there a spec for this?
> 
> Thanks,
> 
> Charlie Taylor
> UF HPC Center


From davem at davemloft.net  Sun Dec 16 13:47:42 2007
From: davem at davemloft.net (David Miller)
Date: Sun, 16 Dec 2007 13:47:42 -0800 (PST)
Subject: [ofa-general] Re: [PATCH net-2.6.25 7/8] drivers/infiniband: Use
	ipv4_is_<type>
In-Reply-To: <1197589141-7020-7-git-send-email-joe@perches.com>
References: <1197589141-7020-6-git-send-email-joe@perches.com>
	<e2ebcbf24539c0a3ef2ac77147dbfa55d8ac9e33.1197432867.git.joe@perches.com>
	<1197589141-7020-7-git-send-email-joe@perches.com>
Message-ID: <20071216.134742.224581478.davem@davemloft.net>

From: Joe Perches <joe at perches.com>
Date: Thu, 13 Dec 2007 15:39:00 -0800

> Signed-off-by: Joe Perches <joe at perches.com>

Applied.


From kliteyn at mellanox.co.il  Sun Dec 16 21:14:45 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 17 Dec 2007 07:14:45 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-17:normal completion
Message-ID: <MTLEXCH017ROdTZNQAT00000bbb@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-16
OpenSM git rev = Sat_Dec_15_15:22:10_2007 [4d6d0de291e8e4990e645202fcd5fbc02387cf27]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
 
 
Total=520  Pass=519  Fail=1
 
 
Pass:
39 Stability IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
38 Pkey IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 LidMgr IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:
1 Pkey IS1-16.topo


From vlad at mellanox.co.il  Sun Dec 16 23:32:32 2007
From: vlad at mellanox.co.il (Vladimir Sokolovsky)
Date: Mon, 17 Dec 2007 09:32:32 +0200
Subject: [ofa-general] Re: [ewg] ofed-1.3-rc1 problem
In-Reply-To: <47654E1B.5060308@opengridcomputing.com>
References: <4762CF30.2030202@opengridcomputing.com>
	<4765435E.5020801@dev.mellanox.co.il>
	<47654E1B.5060308@opengridcomputing.com>
Message-ID: <200712170932.33301.vlad@mellanox.co.il>

On Sunday 16 December 2007 18:11:07 Steve Wise wrote:
> You're right!
>
> It is not installed, but it was built by install.pl.  However I didn't
> explicitly request to build/install ibumad.  It's a prerequisite of
> mvapich2, which I did ask to have built/installed.  So I think
> install.pl needs to be fixed to prereq this maybe?
>

install.pl sets the requirements for the OFED packages selected to be
installed. As I understand it (correct me if I am wrong), the installation
completed successfully, so there is no issue in the install. If you want to
compile your own application (not from OFED) against libibumad, then you
have to select libibumad-devel to be installed during OFED installation.

Regards,
Vladimir

> Vladimir Sokolovsky wrote:
> > Steve Wise wrote:
> >> linking with libibumad fails on ofed-1.3-rc1.  I get a 'cannot find
> >> -libumad' from ld.  I looked in /usr/lib64 and there wasn't a link
> >> from libibumad.so to libibumad.so.1.0.2.  I added the link and the ld
> >> works now.  This was on PPC64.
> >>
> >> I think this is some install problem with libibumad.
> >>
> >> Steve.
> >
> > Hi Steve,
> > Check that libibumad-devel is installed.
> >
> > Regards,
> > Vladimir


From tziporet at dev.mellanox.co.il  Sun Dec 16 23:47:04 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Mon, 17 Dec 2007 09:47:04 +0200
Subject: [ofa-general] OFED 1.3-rc1 release is available
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FDE1B42C3@G5W0278.americas.hpqcorp.net>
References: <6C2C79E72C305246B504CBA17B5500C902E2D65D@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE1B42C3@G5W0278.americas.hpqcorp.net>
Message-ID: <47662978.4070104@mellanox.co.il>

Tang, Changqing wrote:
>  
> HI,
>  
> When can you fix the backward compatible issue with OFED 1.2 ?  Thanks.
>  
This will be fixed this week (it will be in RC2).
Jack will update when it's working.

Tziporet


From jackm at dev.mellanox.co.il  Mon Dec 17 00:19:21 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 17 Dec 2007 10:19:21 +0200
Subject: [ofa-general] [PATCH 1 of 5] libmlx4: zero context struct at
	allocation time (prep for additional context ops)
Message-ID: <200712171019.21801.jackm@dev.mellanox.co.il>

The ibv_context structure will be getting additional ops,
to be added at the end of the structure (and not as part of
the existing ibv_context_ops structure).

Reason: ibv_context_ops is declared directly as a member of ibv_context,
and not as a pointer.  Binaries compiled with previous libibverbs versions
will not be backwards compatible if we add new operations to ibv_context_ops,
since fields following the ops structure will move.

To enable adding new operations at the end of the existing ibv_context struct,
all driver libraries MUST zero their context structure at allocation time, so
that new ops will be NULL by default.

Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>

diff --git a/src/mlx4.c b/src/mlx4.c
index c845fc1..93c1ce8 100644
--- a/src/mlx4.c
+++ b/src/mlx4.c
@@ -115,6 +115,7 @@ static struct ibv_context *mlx4_alloc_context(struct ibv_device *ibdev, int cmd_
 	if (!context)
 		return NULL;
 
+	memset(context, 0, sizeof *context);
 	context->ibv_ctx.cmd_fd = cmd_fd;
 
 	if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd,
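
The reasoning in the patch description above can be illustrated with a small
standalone sketch. It does not use the real libibverbs definitions: the
demo_context and demo_context_ops structures, the new_op member and both
functions are hypothetical stand-ins, assumed only for illustration. The idea
shown is that an op appended at the end of a zeroed context reads as NULL
unless the driver fills it in, so the caller can test for it before
dispatching.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real libibverbs definitions. */
struct demo_context_ops {
	int (*query_device)(void);
};

struct demo_context {
	struct demo_context_ops ops;	/* embedded by value, so its size is part of the ABI */
	int cmd_fd;
	/* New operations are appended here so that existing offsets stay fixed. */
	int (*new_op)(void);
};

/* Driver-side allocation: zero first, then fill in what the driver knows. */
struct demo_context *demo_alloc_context(int cmd_fd)
{
	struct demo_context *ctx = malloc(sizeof *ctx);

	if (!ctx)
		return NULL;

	memset(ctx, 0, sizeof *ctx);	/* new_op stays NULL unless the driver sets it */
	ctx->cmd_fd = cmd_fd;
	return ctx;
}

/* Caller-side dispatch: a NULL pointer is treated as "op not provided". */
int demo_call_new_op(struct demo_context *ctx)
{
	if (!ctx->new_op)
		return -1;
	return ctx->new_op();
}
```

Without the memset, new_op would hold whatever malloc returned, and the NULL
check in demo_call_new_op could not be relied on.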


From jackm at dev.mellanox.co.il  Mon Dec 17 00:19:28 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 17 Dec 2007 10:19:28 +0200
Subject: [ofa-general] [PATCH 2 of 5] libmthca: zero context struct at
	allocation time (prep for additional context ops)
Message-ID: <200712171019.29054.jackm@dev.mellanox.co.il>

The ibv_context structure will be getting additional ops,
to be added at the end of the structure (and not as part of
the existing ibv_context_ops structure).

Reason: ibv_context_ops is declared directly as a member of ibv_context,
and not as a pointer.  Binaries compiled with previous libibverbs versions
will not be backwards compatible if we add new operations to ibv_context_ops,
since fields following the ops structure will move.

To enable adding new operations at the end of the existing ibv_context struct,
all driver libraries MUST zero their context structure at allocation time, so
that new ops will be NULL by default.

Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>

diff --git a/src/mthca.c b/src/mthca.c
index 0f7e953..702e821 100644
--- a/src/mthca.c
+++ b/src/mthca.c
@@ -142,6 +142,7 @@ static struct ibv_context *mthca_alloc_context(struct ibv_device *ibdev, int cmd
 	if (!context)
 		return NULL;
 
+	memset(context, 0, sizeof *context);
 	context->ibv_ctx.cmd_fd = cmd_fd;
 
 	if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd,


From jackm at dev.mellanox.co.il  Mon Dec 17 00:19:36 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 17 Dec 2007 10:19:36 +0200
Subject: [ofa-general] [PATCH 4 of 5] libnes: zero context struct at
	allocation time (prep for additional context ops)
Message-ID: <200712171019.36846.jackm@dev.mellanox.co.il>

The ibv_context structure will be getting additional ops,
to be added at the end of the structure (and not as part of
the existing ibv_context_ops structure).

Reason: ibv_context_ops is declared directly as a member of ibv_context,
and not as a pointer.  Binaries compiled with previous libibverbs versions
will not be backwards compatible if we add new operations to ibv_context_ops,
since fields following the ops structure will move.

To enable adding new operations at the end of the existing ibv_context struct,
all driver libraries MUST zero their context structure at allocation time, so
that new ops will be NULL by default.

Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>

diff --git a/src/nes_umain.c b/src/nes_umain.c
index c4d0642..ff920ab 100644
--- a/src/nes_umain.c
+++ b/src/nes_umain.c
@@ -41,6 +41,7 @@
 #include <errno.h>
 #include <sys/mman.h>
 #include <pthread.h>
+#include <string.h>
 
 #include "nes_umain.h"
 #include "nes-abi.h"
@@ -122,6 +123,7 @@ static struct ibv_context *nes_ualloc_context(struct ibv_device *ibdev, int cmd_
 	if (!nesvctx)
 		return NULL;
 
+	memset(nesvctx, 0, sizeof *nesvctx);
 	nesvctx->ibv_ctx.cmd_fd = cmd_fd;
 
 	if (ibv_cmd_get_context(&nesvctx->ibv_ctx, &cmd, sizeof cmd,


From jackm at dev.mellanox.co.il  Mon Dec 17 00:19:33 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 17 Dec 2007 10:19:33 +0200
Subject: [ofa-general] [PATCH 3 of 5] libcxgb3: zero context struct at
	allocation time (prep for additional context ops)
Message-ID: <200712171019.33326.jackm@dev.mellanox.co.il>

The ibv_context structure will be getting additional ops,
to be added at the end of the structure (and not as part of
the existing ibv_context_ops structure).

Reason: ibv_context_ops is declared directly as a member of ibv_context,
and not as a pointer.  Binaries compiled with previous libibverbs versions
will not be backwards compatible if we add new operations to ibv_context_ops,
since fields following the ops structure will move.

To enable adding new operations at the end of the existing ibv_context struct,
all driver libraries MUST zero their context structure at allocation time, so
that new ops will be NULL by default.

Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>

diff --git a/src/iwch.c b/src/iwch.c
index 2747518..517ff00 100644
--- a/src/iwch.c
+++ b/src/iwch.c
@@ -114,6 +114,7 @@ static struct ibv_context *iwch_alloc_context(struct ibv_device *ibdev,
 	if (!context)
 		return NULL;
 
+	memset(context, 0, sizeof *context);
 	context->ibv_ctx.cmd_fd = cmd_fd;
 
 	if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd,


From jackm at dev.mellanox.co.il  Mon Dec 17 00:19:38 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 17 Dec 2007 10:19:38 +0200
Subject: [ofa-general] [PATCH 5 of 5] libipathverbs: zero context struct at
	allocation time (prep for additional context ops)
Message-ID: <200712171019.38518.jackm@dev.mellanox.co.il>

The ibv_context structure will be getting additional ops,
to be added at the end of the structure (and not as part of
the existing ibv_context_ops structure).

Reason: ibv_context_ops is declared directly as a member of ibv_context,
and not as a pointer.  Binaries compiled with previous libibverbs versions
will not be backwards compatible if we add new operations to ibv_context_ops,
since fields following the ops structure will move.

To enable adding new operations at the end of the existing ibv_context struct,
all driver libraries MUST zero their context structure at allocation time, so
that new ops will be NULL by default.

Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>

diff --git a/src/ipathverbs.c b/src/ipathverbs.c
index eb16fb0..55d8dcf 100644
--- a/src/ipathverbs.c
+++ b/src/ipathverbs.c
@@ -42,6 +42,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include <string.h>
 
 #include "ipathverbs.h"
 #include "ipath-abi.h"
@@ -140,6 +141,7 @@ static struct ibv_context *ipath_alloc_context(struct ibv_device *ibdev,
 	context = malloc(sizeof *context);
 	if (!context)
 		return NULL;
+	memset(context, 0, sizeof *context);
 	context->ibv_ctx.cmd_fd = cmd_fd;
 	if (ibv_cmd_get_context(&context->ibv_ctx, &cmd,
 				sizeof cmd, &resp, sizeof resp))


From moshek at voltaire.com  Mon Dec 17 00:59:45 2007
From: moshek at voltaire.com (Moshe Kazir)
Date: Mon, 17 Dec 2007 10:59:45 +0200
Subject: [ofa-general] ofed-1.3-rc1 - RH 4 U 6 bacport files 
In-Reply-To: <200712170932.33301.vlad@mellanox.co.il>
Message-ID: <39C75744D164D948A170E9792AF8E7CA4D2CEC@exil.voltaire.com>


Backporting is very simple.

All I did was copy the 2.6.9_U5 directory to 2.6.9_U6 and change
ofed_scripts/ofed_patch.sh.

I chose to copy the kernel_addons and kernel_patches directories to
enable fixes in the future, in case we find that U6 needs something more.

The attached files do the work.

Moshe

____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OFED_1.3_rc1_RHEL_4_U6.ofed_scripts.diff
Type: application/octet-stream
Size: 632 bytes
Desc: OFED_1.3_rc1_RHEL_4_U6.ofed_scripts.diff
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071217/e925568e/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OFED_1.3_rc1_RHEL_4_U6.backport.diff
Type: application/octet-stream
Size: 393430 bytes
Desc: OFED_1.3_rc1_RHEL_4_U6.backport.diff
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071217/e925568e/attachment-0001.obj>

From moshek at voltaire.com  Mon Dec 17 01:11:23 2007
From: moshek at voltaire.com (Moshe Kazir)
Date: Mon, 17 Dec 2007 11:11:23 +0200
Subject: [ofa-general] ofed-1.2.5 - RH 4 U 6 bacport files 
In-Reply-To: <39C75744D164D948A170E9792AF8E7CA4D2CEC@exil.voltaire.com>
Message-ID: <39C75744D164D948A170E9792AF8E7CA4D2CED@exil.voltaire.com>


Backporting is very simple.

All I did was copy the 2.6.9_U5 directory to 2.6.9_U6 and change
ofed_scripts/ofed_patch.sh.

I chose to copy the kernel_addons and kernel_patches directories to
enable fixes in the future, in case we find that U6 needs something more.

The attached files do the work.

Moshe

____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

  
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OFED_1.2.5_RHEL_4_U6.backport.diff
Type: application/octet-stream
Size: 297008 bytes
Desc: OFED_1.2.5_RHEL_4_U6.backport.diff
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071217/072f68ee/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OFED_1.2.5_RHEL_4_U6.configure.diff
Type: application/octet-stream
Size: 433 bytes
Desc: OFED_1.2.5_RHEL_4_U6.configure.diff
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071217/072f68ee/attachment-0001.obj>

From moshek at voltaire.com  Mon Dec 17 01:13:38 2007
From: moshek at voltaire.com (Moshe Kazir)
Date: Mon, 17 Dec 2007 11:13:38 +0200
Subject: [ofa-general] ofed-1.2.5 - RH 4 U 6 bacport files 
In-Reply-To: <39C75744D164D948A170E9792AF8E7CA4D2CED@exil.voltaire.com>
Message-ID: <39C75744D164D948A170E9792AF8E7CA4D2CEE@exil.voltaire.com>


> All I did was copy the 2.6.9_U5 directory to 2.6.9_U6 and change
> ofed_scripts/ofed_patch.sh.

Typo: I changed the configure file, not
ofed_scripts/ofed_patch.sh.

Moshe
____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

  
  
From vlad at lists.openfabrics.org  Mon Dec 17 03:03:11 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Mon, 17 Dec 2007 03:03:11 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071217-0200 daily build status
Message-ID: <20071217110312.112A8E601CB@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.12
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on powerpc with linux-2.6.13
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.22
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ppc64 with linux-2.6.14
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.13
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:
Build failed on i686 with linux-2.6.18
Build failed on x86_64 with linux-2.6.18
Log:
Build failed on ia64 with linux-2.6.18
Log:
Build failed on ppc64 with linux-2.6.18
Log:
Hunk #14 succeeded at 1536 (offset 4 lines).
Hunk #15 succeeded at 1824 (offset 3 lines).
Hunk #16 succeeded at 1873 (offset 3 lines).
Hunk #17 succeeded at 1884 (offset 3 lines).
2 out of 17 hunks FAILED -- rejects in file drivers/scsi/iscsi_tcp.c
patching file drivers/scsi/iscsi_tcp.h
Patch open-iscsi-tx-hash-fixes.patch does not apply (enforce with -f)

Failed executing /usr/bin/quilt
----------------------------------------------------------------------------------
Hunk #14 succeeded at 1536 (offset 4 lines).
Hunk #15 succeeded at 1824 (offset 3 lines).
Hunk #16 succeeded at 1873 (offset 3 lines).
Hunk #17 succeeded at 1884 (offset 3 lines).
2 out of 17 hunks FAILED -- rejects in file drivers/scsi/iscsi_tcp.c
patching file drivers/scsi/iscsi_tcp.h
Patch open-iscsi-tx-hash-fixes.patch does not apply (enforce with -f)

Failed executing /usr/bin/quilt
----------------------------------------------------------------------------------
Hunk #14 succeeded at 1536 (offset 4 lines).
Hunk #15 succeeded at 1824 (offset 3 lines).
Hunk #16 succeeded at 1873 (offset 3 lines).
Hunk #17 succeeded at 1884 (offset 3 lines).
2 out of 17 hunks FAILED -- rejects in file drivers/scsi/iscsi_tcp.c
patching file drivers/scsi/iscsi_tcp.h
Patch open-iscsi-tx-hash-fixes.patch does not apply (enforce with -f)

Failed executing /usr/bin/quilt
----------------------------------------------------------------------------------


From dotanb at dev.mellanox.co.il  Mon Dec 17 03:21:28 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Mon, 17 Dec 2007 13:21:28 +0200
Subject: [ofa-general] the mckey example has internal race
Message-ID: <47665BB8.5030805@dev.mellanox.co.il>

Hi Sean.

I'm working on adding the mckey example to our daily regression and I
noticed a problem.

When executing the mckey test with several messages (for example, -C
100), the server process never ends.

It seems that the server receives only some of the messages that the
client sends, because there isn't any synchronization between the sides:
the client posts send requests (SRs) before all of the receive requests
(RRs) have been posted on the server, so some of the messages that reach
the server are (silently) dropped.
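
The race described above can be modelled with a small standalone sketch. It
does not call libibverbs or librdmacm; the demo_rq structure and the demo_*
functions are hypothetical, and the model assumes unreliable datagram
semantics, where a message that arrives with no receive request posted is
discarded without any error on either side.

```c
#include <stddef.h>

/* Hypothetical model of a receive queue; not the libibverbs API. */
struct demo_rq {
	size_t posted;		/* receive requests currently posted */
	size_t delivered;	/* messages that found a posted receive */
	size_t dropped;		/* messages that arrived with no receive posted */
};

void demo_post_recv(struct demo_rq *rq, size_t count)
{
	rq->posted += count;
}

/* One arrival on an unreliable transport: no posted receive means a silent drop. */
void demo_deliver(struct demo_rq *rq)
{
	if (rq->posted == 0) {
		rq->dropped++;
		return;
	}
	rq->posted--;
	rq->delivered++;
}

/*
 * The sender transmits `sent` messages at a point where the receiver has
 * posted only `ready` receives. Returns how many messages were delivered.
 */
size_t demo_run(size_t ready, size_t sent)
{
	struct demo_rq rq = {0, 0, 0};
	size_t i;

	demo_post_recv(&rq, ready);
	for (i = 0; i < sent; i++)
		demo_deliver(&rq);
	return rq.delivered;
}
```

In this model, a receiver that waits for 100 completions but had only 10
receives posted when the sender started would see 10 and then wait forever,
which matches the hang reported for the server side.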


thanks
Dotan


From kliteyn at dev.mellanox.co.il  Mon Dec 17 05:33:38 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Mon, 17 Dec 2007 15:33:38 +0200
Subject: [ofa-general] [PATCH] opensm: osm_state_mgr.c - stop idle queue
 processing if heavy sweep requested
Message-ID: <47667AB2.8030500@dev.mellanox.co.il>

If a heavy sweep is requested during idle queue processing, OSM continues
to process the queue to the end and only then notices the heavy sweep request.
In some cases this might leave a topology change unhandled for several
minutes.

Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_state_mgr.c |   31 ++++++++++++++++++++++++-------
 1 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 5c39f11..6ee5ee6 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1607,13 +1607,30 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
 				/* CALL the done function */
 				__process_idle_time_queue_done(p_mgr);

-				/*
-				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
-				 * so that the next element in the queue gets processed
-				 */
-
-				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
-				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
+				if (p_mgr->p_subn->force_immediate_heavy_sweep) {
+					/*
+					 * Do not read next item from the idle queue.
+					 * Immediate heavy sweep is requested, so it's
+					 * more important.
+					 * Besides, there is a chance that after the
+					 * heavy sweep completion, idle queue processing
+					 * that SM would have performed here will be obsolete.
+					 */
+					if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG))
+						osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
+						"osm_state_mgr_process: "
+						"interrupting idle time queue processing - heavy sweep requested\n");
+					signal = OSM_SIGNAL_NONE;
+					p_mgr->state = OSM_SM_STATE_IDLE;
+				}
+				else {
+					/*
+					 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
+					 * so that the next element in the queue gets processed
+					 */
+					signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
+					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
+				}
 				break;

 			default:
-- 
1.5.1.4


From kliteyn at dev.mellanox.co.il  Mon Dec 17 05:44:59 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Mon, 17 Dec 2007 15:44:59 +0200
Subject: [ofa-general] [PATCH v2] opensm: osm_state_mgr.c - stop idle queue
 processing if heavy sweep requested
Message-ID: <47667D5B.8000802@dev.mellanox.co.il>

If a heavy sweep is requested during idle queue processing, OSM continues
to process the queue to the end and only then notices the heavy sweep request.
In some cases this might leave a topology change unhandled for several
minutes.

Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_state_mgr.c |   31 ++++++++++++++++++++++++-------
 1 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 5c39f11..6ee5ee6 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1607,13 +1607,30 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
 				/* CALL the done function */
 				__process_idle_time_queue_done(p_mgr);

-				/*
-				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
-				 * so that the next element in the queue gets processed
-				 */
-
-				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
-				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
+				if (p_mgr->p_subn->force_immediate_heavy_sweep) {
+					/*
+					 * Do not read next item from the idle queue.
+					 * Immediate heavy sweep is requested, so it's
+					 * more important.
+					 * Besides, there is a chance that after the
+					 * heavy sweep completion, idle queue processing
+					 * that SM would have performed here will be obsolete.
+					 */
+					if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG))
+						osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
+						"osm_state_mgr_process: "
+						"interrupting idle time queue processing - heavy sweep requested\n");
+					signal = OSM_SIGNAL_NONE;
+					p_mgr->state = OSM_SM_STATE_IDLE;
+				}
+				else {
+					/*
+					 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
+					 * so that the next element in the queue gets processed
+					 */
+					signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
+					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
+				}
 				break;

 			default:
-- 
1.5.1.4
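
The control flow this patch introduces can be tried outside OpenSM with a small stand-alone model. This is only a sketch under stated assumptions: the types and the names `struct mgr`, `after_idle_item_done` and `drain_idle_queue` are hypothetical and are not OpenSM symbols, and resetting the state to idle when the queue runs empty is an assumption of the model rather than something the patch shows.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the OpenSM state and signal enums. */
enum sm_state  { SM_IDLE, SM_PROCESS_REQUEST };
enum sm_signal { SIG_NONE, SIG_IDLE_TIME_PROCESS };

struct mgr {
    enum sm_state state;
    bool force_immediate_heavy_sweep; /* set when a heavy sweep is requested */
    int  idle_items_left;             /* items still waiting in the idle queue */
    int  idle_items_done;             /* items processed so far */
};

/* Decision taken after one idle-queue item completes, mirroring the patch:
 * stop reading the queue if a heavy sweep is pending, otherwise continue. */
static enum sm_signal after_idle_item_done(struct mgr *m)
{
    if (m->force_immediate_heavy_sweep) {
        m->state = SM_IDLE;
        return SIG_NONE;
    }
    m->state = SM_PROCESS_REQUEST;
    return SIG_IDLE_TIME_PROCESS;
}

/* Process idle items; the heavy sweep request arrives right after item
 * number `sweep_requested_after` (0 means it never arrives). */
static int drain_idle_queue(struct mgr *m, int sweep_requested_after)
{
    enum sm_signal sig = SIG_IDLE_TIME_PROCESS;

    while (sig == SIG_IDLE_TIME_PROCESS && m->idle_items_left > 0) {
        m->idle_items_left--;
        m->idle_items_done++;
        if (m->idle_items_done == sweep_requested_after)
            m->force_immediate_heavy_sweep = true;
        sig = after_idle_item_done(m);
    }
    if (m->idle_items_left == 0)
        m->state = SM_IDLE; /* assumption: an empty queue returns to idle */
    return m->idle_items_done;
}
```

With ten queued items and a sweep requested after the third, `drain_idle_queue` stops at three processed items and leaves seven queued. Without the check it would process all ten before the sweep request is noticed.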


From jewelryrfid.com at knightpoint.com  Mon Dec 17 07:23:19 2007
From: jewelryrfid.com at knightpoint.com (Harrison Evans)
Date: Mon, 17 Dec 2007 20:23:19 +0500
Subject: [ofa-general] Don't be left out,
	join millions of men in the revolution
Message-ID: <000701c840bc$568a6980$0100007f@avflc>


Info in attach or here:
http://www.anesmouf.net/

-----
Where are you going now? she a
To the lake. I could go with y
No, you couldnt. But I� Others
Could you please come with me 
 
  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071217/80f5e97b/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: img59.jpg
Type: image/jpg
Size: 10787 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071217/80f5e97b/attachment.jpg>

From ogerlitz at voltaire.com  Mon Dec 17 07:08:20 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Mon, 17 Dec 2007 17:08:20 +0200 (IST)
Subject: [ofa-general] some questions on stale connection handling at the IB
	CM
Message-ID: <Pine.LNX.4.64.0712171638420.28805@zuben.voltaire.com>

Hi Sean,

Basically, I am trying to understand when handling of a "stale connection", as
defined by 12.4.1, is carried out by the CM, and in which cases (if any) it
must be handled at the app level.

Looking at the code, I see that the CM sends a reject message with the reason
IB_CM_REJ_STALE_CONN when it gets a REQ or REP whose <QPN, CA GUID> pair is already
present in the remote-qpn rb tree (and in another case which I don't fully understand).

On the other side, when the CM receives a reject message with that reason, the local handle
(id) is moved to the timewait state. My understanding is that it sits there for a while,
then a reject/stale-connection callback is delivered to the user and the id is removed.

What I don't see is the issuing of a "DREQ, with DREQ:remote QPN set to the remote QPN from
the REQ/REP" as stated in 12.9.8.3.1 (below). Is it really missing, or am I reading the code wrong?

Also, it's quite clear to me that, from the application's point of view, there are stale
connection cases which the CM cannot catch, e.g. a client DREQ did not arrive at the server
side CM and then the client REQ uses a different QPN. My understanding is that in such
cases the responsibility for closing the stale connection/QP lies with the server app.

Or.

12.9.8.3.1 REQ RECEIVED / REP RECEIVED
(RC, UC) A CM may receive a REQ/REP specifying a remote QPN in REQ:local QPN/REP:local QPN
that the CM already considers connected to a local QP. A local CM may receive such a REQ/REP
if its local QP has a stale connection, as described in section 12.4.1. When a CM receives
such a REQ/REP it shall abort the connection establishment by issuing REJ to the REQ/REP.
It shall then issue DREQ, with DREQ:remote QPN set to the remote QPN from the REQ/REP, until
DREP is received or Max Retries is exceeded, and place the local QP in the TimeWait state.
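
The sequence in 12.9.8.3.1 can be modelled as a small stand-alone sketch: detect the stale match, reject, repeat DREQ until DREP or the retry limit, then enter TimeWait. Everything here is hypothetical (the struct layouts, `handle_req`, the `send_dreq_fn` callback and the simulated peer) and none of it is the ib_cm kernel API. It also assumes that `max_retries` bounds the total number of DREQ transmissions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum conn_state { CONN_ESTABLISHED, CONN_TIMEWAIT };

struct conn {
    uint32_t remote_qpn;    /* QPN the local QP believes it is connected to */
    uint64_t remote_guid;   /* CA GUID of that remote node */
    enum conn_state state;
};

struct stale_result {
    bool rej_sent;   /* REJ (stale connection) was issued for the REQ/REP */
    int  dreq_sent;  /* number of DREQ transmissions */
    bool drep_seen;  /* a DREP arrived before the retry limit */
};

/* Hypothetical transport hook: send one DREQ, return true if DREP arrives. */
typedef bool (*send_dreq_fn)(uint32_t remote_qpn, void *ctx);

/* Simulated peer for experiments: answers once *ctx counts down to zero. */
static bool peer_replies_after(uint32_t remote_qpn, void *ctx)
{
    int *remaining = ctx;
    (void)remote_qpn;
    return --(*remaining) <= 0;
}

/* Handle an incoming REQ/REP that names <qpn, guid> as its local QP. */
static struct stale_result handle_req(struct conn *c, uint32_t qpn,
                                      uint64_t guid, int max_retries,
                                      send_dreq_fn send_dreq, void *ctx)
{
    struct stale_result r = { false, 0, false };

    /* Not stale unless the pair is already considered connected. */
    if (c->state != CONN_ESTABLISHED ||
        c->remote_qpn != qpn || c->remote_guid != guid)
        return r;

    r.rej_sent = true;                      /* abort the new establishment */
    while (!r.drep_seen && r.dreq_sent < max_retries) {
        r.dreq_sent++;
        r.drep_seen = send_dreq(qpn, ctx);  /* DREQ:remote QPN = qpn */
    }
    c->state = CONN_TIMEWAIT;               /* with or without a DREP */
    return r;
}
```

In this model the local connection always ends in TimeWait once the stale match is detected, whether or not the peer ever answers the DREQ.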


From tziporet at mellanox.co.il  Mon Dec 17 07:21:58 2007
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Mon, 17 Dec 2007 17:21:58 +0200
Subject: [ofa-general] Agenda for the OFED meeting today
Message-ID: <6C2C79E72C305246B504CBA17B5500C902E86128@mtlexch01.mtl.com>


Hi,

Agenda for the meeting today:
1. Review RC1 status
2. Agree on the release schedule
    These are the options for RC2:
	a. 27-Dec 07
	b. 3-Jan 08
	c. 8-Jan 08

3. Review open bugs:

bug_id	bug_severity	assigned_to	short_short_desc
804	critical	erezz at voltaire.com	open-iscsi over TCP crashes when sending data in RHAS4
750	critical	raisch at de.ibm.com	Problem with modprobe ib_ehca with older kernel versions
760	major	eli at mellanox.co.il	UDP performance on Rx is lower than Tx
761	major	eli at mellanox.co.il	Poor and jittery UDP performance at small messages
508	major	eli at mellanox.co.il	IPoIB CM multicast is hogging interrupts
780	major	orenk at dev.mellanox.co.il	Bulding ibutils is never completed
820	major	pasha at mellanox.co.il	rpm 4.4.2.2, Binary file matches Binary file,
730	major	pasha at mellanox.co.il	OFED 1.3 MPI won't compile with PGI 6.2.5 on RHEL4 x86_64
736	major	rolandd at cisco.com	IBV_WC_RETRY_EXC_ERR errors with local rdma_reads
810	major	sashak at voltaire.com	diags fail on two-port HCAs (2 ports; only 1 supported currently)
767	major	swise at opengridcomputing.com	Non backport Kernels that don't build in genalloc cause compile errors for cxgb3
824	normal	arlin.r.davis at intel.com	DAPL 32bit-PPC is not installed under /usr/lib
782	normal	eli at mellanox.co.il	kernel panic while driver load/unload in loop on RH5
788	normal	eli at mellanox.co.il	changing the Pkey table causes memory leak in IPoIB CM
814	normal	eli at mellanox.co.il	executing netpipe over  IPoIB produces alot of error messages in the /var/log/messages
763	normal	jackm at dev.mellanox.co.il	XRC domain can be closed event QP/SRQ are using it
797	normal	jim at mellanox.com	Kernel panic on sdp will running multiple tests
803	normal	jim at mellanox.com	Kernel BUG at kernel/workqueue.c:176
689	normal	jsquyres at cisco.com	When one install ofed (including the mpi-selector) and choosing prefix that end with "/", Install fails.
709	normal	orenk at dev.mellanox.co.il	ibutils binaries have wrong RPATH
800	normal	perkinjo at cse.ohio-state.edu	compile error on JS22 PPC64 RHEL5 U1
825	normal	sashak at voltaire.com	The openSM fail to answer to saquery in a loopback
723	normal	vlad at mellanox.co.il	netperf over rds failed - rds_send: data send error: Invalid argument
817	normal	vlad at mellanox.co.il	rds-gen/sink message and buffer size unexpected limitations.
818	normal	vlad at mellanox.co.il	rds-gen/sink - "Unexpected end of file.." msg as a result of file transfer through rds-gen.
826	normal	vlad at mellanox.co.il	using an application that uses RDS and restarting the driver causes mem leak
38	enhancement	jim at mellanox.com	add support for AIO SDP
39	enhancement	jim at mellanox.com	add zcopy support to non-AIO SDP

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071217/32a7f7f4/attachment.html>

From jackm at dev.mellanox.co.il  Mon Dec 17 07:20:04 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 17 Dec 2007 17:20:04 +0200
Subject: [ofa-general] OFED 1.3-rc1 release is available
In-Reply-To: <47662978.4070104@mellanox.co.il>
References: <6C2C79E72C305246B504CBA17B5500C902E2D65D@mtlexch01.mtl.com>
	<D89C2C212795564B837FA1665CAE02990FDE1B42C3@G5W0278.americas.hpqcorp.net>
	<47662978.4070104@mellanox.co.il>
Message-ID: <200712171720.04190.jackm@dev.mellanox.co.il>

On Monday 17 December 2007 09:47, you wrote:
> Tang, Changqing wrote:
> >  
> > HI,
> >  
> > When can you fix the backward compatible issue with OFED 1.2 ?  Thanks.
> >  
> This will be fixed this week (will be in RC2)
> Jack will update when its working
> 
> Tziporet

The binary compatibility fix (along the lines suggested by Roland) was just committed to libibverbs.git and libmlx4.git.
(git://git.openfabrics.org/ofed_1_3/libibverbs.git, branch="ofed_1_3" and
 git://git.openfabrics.org/ofed_1_3/libmlx4.git, branch="ofed_1_3" )
It will be in the next ofed 1.3 daily build on the openfabrics server.

- Jack


From swise at opengridcomputing.com  Mon Dec 17 07:33:42 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Mon, 17 Dec 2007 09:33:42 -0600
Subject: [ofa-general] Re: [ewg] ofed-1.3-rc1 problem
In-Reply-To: <200712170932.33301.vlad@mellanox.co.il>
References: <4762CF30.2030202@opengridcomputing.com>
	<4765435E.5020801@dev.mellanox.co.il>
	<47654E1B.5060308@opengridcomputing.com>
	<200712170932.33301.vlad@mellanox.co.il>
Message-ID: <476696D6.3040108@opengridcomputing.com>

Vladimir Sokolovsky wrote:
> On Sunday 16 December 2007 18:11:07 Steve Wise wrote:
>> You're right!
>>
>> It is not installed, but it was built by install.pl.  However I didn't
>> explicitly request to build/install ibumad.  Its a prerequisite of
>> mvapich2, which I did ask to have built/installed.  So I think
>> install.pl needs to be fixed to prereq this maybe?
>>
> 
> install.pl set requirements for OFED packages selected to be installed. As I 
> understand, correct me if I am wrong, the installation passed successfully, 
> so , there is no issues in the install. If you want to compile your 
> application (not from OFED) over libibumad, then you have to select 
> libibumad-devel to be installed during OFED installation.
>

I'm not sure where the fix needs to go, but if I install mvapich2-1.0.0 
via install.pl, it should also build/install libibumad-devel. 
Otherwise, mpi programs cannot link correctly.  I don't know if this 
dependency should be defined in the mvapich2 srpm somehow or the ofed tools.

But in install.pl I see this:

>         'mvapich2_gcc' =>
>             { name => "mvapich2_gcc", parent => "mvapich2",
>             selected => 0, installed => 0, rpm_exist => 0, rpm_exist32 => 0,
>             available => 0, mode => "user", dist_req_build => [],
>             dist_req_inst => [], ofa_req_build => ["libibumad-devel", "libibverbs-devel", "librdmacm-devel"],
>             ofa_req_inst => ["mpi-selector", "librdmacm", "libibumad"],
>             install32 => 0, exception => 0 },


And see that libibumad-devel is a build requirement.  I claim it is also 
an install requirement.

This all worked on 1.2.5 by the way...


Steve.


> Regards,
> Vladimir
> 
>> Vladimir Sokolovsky wrote:
>>> Steve Wise wrote:
>>>> linking with libibumad fails on ofed-1.3-rc1.  I get a 'cannot find
>>>> -libumad' from ld.  I looked in /usr/lib64 and there wasn't a link
>>>> from libibumad.so to libibumad.so.1.0.2.  I added the link and the ld
>>>> works now.  This was on PPC64.
>>>>
>>>> I think this is some install problem with libibumad.
>>>>
>>>> Steve.
>>> Hi Steve,
>>> Check that libibumad-devel is installed.
>>>
>>> Regards,
>>> Vladimir
> 


From changquing.tang at hp.com  Mon Dec 17 07:35:12 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Mon, 17 Dec 2007 15:35:12 +0000
Subject: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <200712161901.21050.jackm@dev.mellanox.co.il>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<ada4pexemsu.fsf@cisco.com>
	<D89C2C212795564B837FA1665CAE02990FDE048E18@G5W0278.americas.hpqcorp.net>
	<200712161901.21050.jackm@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FDE1F0548@G5W0278.americas.hpqcorp.net>


I remember someone else suggested using:

struct ibv_context {
         struct ibv_device      *device;
         struct ibv_context_ops  ops;
         int                     cmd_fd;
         int                     async_fd;
         int                     num_comp_vectors;
         pthread_mutex_t         mutex;
         void                   *abi_compat;
         struct ibv_context_extra_ops  extra_ops;
};

Here we don't use a pointer for extra_ops, and any future changes are added to 'extra_ops'.
So why not do it this way?


Thanks.
--CQ

> -----Original Message-----
> From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il]
> Sent: Sunday, December 16, 2007 11:01 AM
> To: general at lists.openfabrics.org
> Cc: Tang, Changqing; Roland Dreier;
> ewg at lists.openfabrics.org; general at lists.openfabrics.org
> Subject: Re: [ofa-general] OFED 1.3 Beta release is available
>
> On Wednesday 05 December 2007 17:45, Tang, Changqing wrote:
> > > I think the only alternative we have to preserve backwards
> > > compatibility is to leave struct ibv_context_ops alone and change
> > > the structure to:
> > >
> > > struct ibv_context {
> > >         struct ibv_device      *device;
> > >         struct ibv_context_ops  ops;
> > >         int                     cmd_fd;
> > >         int                     async_fd;
> > >         int                     num_comp_vectors;
> > >         pthread_mutex_t         mutex;
> > >         void                   *abi_compat;
> > >         struct ibv_xrc_op      *xrc_ops;
> > > };
> > >
> > > with xrc_ops added at the end.  It's my fault for not
> making the ops
> > > member a pointer I guess.
> > >
> > > Tziporet/Jack/whoever -- please fix up the libibverbs you
> ship for
> > > OFED 1.3 to resolve this.
> > >
> > > We can clean this up for libibverbs 1.2 when the ABI can change,
> > > if/when we have something worth breaking the ABI for.
> >
>
> We need to have all userspace libraries set their private
> context object to 0 at allocation time (the private context
> object includes the ibv_context structure, which must now be
> NULL-ed out).
>
> The other userspace driver libraries (e.g., libmthca) do not
> zero-out their internal userspace context structures (e.g.,
> mthca_context) which include the ibv_context structure as the
> first element.
> Up to now, we depended on the ibv_context assign to set
> unavailable verb implementations to NULL.
> (and every userspace driver assigned the ops structure, with
> unimplemented operations set to NULL by the compiler).
> This is no longer true.
>
> Thus, anyone installing OFED will have a compatible set of
> userspace drivers for XRC applications (drivers which do not
> implement XRC will return errors for XRC-verbs).
>
> Applications which were compiled with previous libraries will
> still work (since they do not use XRC).
>
> - Jack
>


From HNGUYEN at de.ibm.com  Mon Dec 17 07:36:38 2007
From: HNGUYEN at de.ibm.com (Hoang-Nam Nguyen)
Date: Mon, 17 Dec 2007 16:36:38 +0100
Subject: [ofa-general] Re: [ewg] Agenda for the OFED meeting today
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C902E86128@mtlexch01.mtl.com>
Message-ID: <OF2999BC7A.0F32A1AC-ONC12573B4.00559771-C12573B4.0055BBA4@de.ibm.com>

Hi Tziporet!
> 2. Review open bugs:
> 750     critical        raisch at de.ibm.com       Problem with
> modprobe ib_ehca with older kernel versions
We have a fix for this issue and are still testing it on several
kernels. I think we'll be able to close it this week.
Regards
Nam


From changquing.tang at hp.com  Mon Dec 17 07:40:23 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Mon, 17 Dec 2007 15:40:23 +0000
Subject: [ofa-general] XRC cleanup order issue
In-Reply-To: <200712160827.04519.jackm@dev.mellanox.co.il>
References: <D89C2C212795564B837FA1665CAE02990FDE143AE8@G5W0278.americas.hpqcorp.net>
	<200712160827.04519.jackm@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FDE1F0562@G5W0278.americas.hpqcorp.net>


Thanks, we would like to have a new solution to remove this restriction.

--CQ Tang


> -----Original Message-----
> From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il]
> Sent: Sunday, December 16, 2007 12:27 AM
> To: general at lists.openfabrics.org
> Cc: Tang, Changqing
> Subject: Re: [ofa-general] XRC cleanup order issue
>
> On Wednesday 12 December 2007 17:24, Tang, Changqing wrote:
> >
> > HI,
> >         This question is mainly for Mellanox engineers.
> >
> >         With XRC, the rank who create the QP which is used for
> > transport to all ranks on that node can NOT exit first if
> other ranks
> > are still using the transport. This restriction is a
> problem for our dynamic process definition where any rank
> could die with any reason, but without teardown the whole application.
> >
> >         I am thinking about shared memory usage, where the creator
> > does not have to keep alive while other processes can still
> use it, untill the last process exits, then the system will
> cleanup the shared memory.
> >
> >         Can't XRC mimic the shared memory behavior ?
> >
> There is an issue that the QP needs to be associated with a
> protection domain (i.e., UAR area), which is unique per user process.
>
> One possibility is to have a separate process per host per
> job (XRC domain) create the XRC QPs on the receiving side.
> There still would be the issue of what happens if that
> process somehow dies prematurely.
>
> We'll examine the issue and see if there is some other solution.
>
> - Jack
>


From jackm at dev.mellanox.co.il  Mon Dec 17 08:01:14 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 17 Dec 2007 18:01:14 +0200
Subject: [ofa-general] OFED 1.3 Beta release is available
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FDE1F0548@G5W0278.americas.hpqcorp.net>
References: <6C2C79E72C305246B504CBA17B5500C90282E357@mtlexch01.mtl.com>
	<200712161901.21050.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FDE1F0548@G5W0278.americas.hpqcorp.net>
Message-ID: <200712171801.14625.jackm@dev.mellanox.co.il>

On Monday 17 December 2007 17:35, Tang, Changqing wrote:
> 
> I remembered someone else suggested to use:
> 
> struct ibv_context {
>          struct ibv_device      *device;
>          struct ibv_context_ops  ops;
>          int                     cmd_fd;
>          int                     async_fd;
>          int                     num_comp_vectors;
>          pthread_mutex_t         mutex;
>          void                   *abi_compat;
>          struct ibv_context_extra_ops  extra_ops;
> };
> 
> Here we don't use pointer for extra_ops, and any future changes are added into 'extra_ops',
> So why not this way ?

That someone was me.  However, I think Roland's idea is better:
Roland wrote:
	Actually I'd prefer to add xrc_ops and then if we need to extend
	further with more new ops, add another structure after it.  That way
	we avoid having to put any define in libibverbs to tell drivers like
	libmlx4 that xrc support is present; libmlx4 et al can just use
	AC_CHECK_MEMBER(struct ibv_context.xrc_ops) to test with autoconf.

That is what I implemented.
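
The layout argument behind appending a member can be checked with `offsetof` on simplified stand-in structures. This is a sketch only: the `ctx_*` and `ops_*` types below are hypothetical reductions, not the real libibverbs definitions, and `struct fake_mutex` stands in for `pthread_mutex_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced ops tables. */
struct ops_v1       { void (*query)(void); void (*poll)(void); };
struct ops_v1_grown { void (*query)(void); void (*poll)(void);
                      void (*open_xrc)(void); };   /* new op added inline */
struct xrc_ops      { void (*open_xrc)(void); };
struct fake_mutex   { long opaque[5]; };           /* pthread_mutex_t stand-in */

/* Layout that already-compiled applications were built against. */
struct ctx_old {
    void *device;
    struct ops_v1 ops;
    int cmd_fd, async_fd, num_comp_vectors;
    struct fake_mutex mutex;
    void *abi_compat;
};

/* New pointer member appended at the very end. */
struct ctx_appended {
    void *device;
    struct ops_v1 ops;
    int cmd_fd, async_fd, num_comp_vectors;
    struct fake_mutex mutex;
    void *abi_compat;
    struct xrc_ops *xrc_ops;
};

/* Embedded ops table grown in place: everything after it moves. */
struct ctx_grown_ops {
    void *device;
    struct ops_v1_grown ops;
    int cmd_fd, async_fd, num_comp_vectors;
    struct fake_mutex mutex;
    void *abi_compat;
};

#define SAME_OFFSET(a, b, m) (offsetof(struct a, m) == offsetof(struct b, m))

/* Returns 1 if every member known to old binaries keeps its offset. */
static int appended_layout_is_compatible(void)
{
    return SAME_OFFSET(ctx_old, ctx_appended, device) &&
           SAME_OFFSET(ctx_old, ctx_appended, ops) &&
           SAME_OFFSET(ctx_old, ctx_appended, cmd_fd) &&
           SAME_OFFSET(ctx_old, ctx_appended, async_fd) &&
           SAME_OFFSET(ctx_old, ctx_appended, num_comp_vectors) &&
           SAME_OFFSET(ctx_old, ctx_appended, mutex) &&
           SAME_OFFSET(ctx_old, ctx_appended, abi_compat);
}

/* Returns 1 only if cmd_fd stays put after growing the embedded table. */
static int grown_ops_layout_is_compatible(void)
{
    return SAME_OFFSET(ctx_old, ctx_grown_ops, cmd_fd);
}
```

Appending keeps every existing offset. Growing the embedded `ops` table moves `cmd_fd` and every member after it, which is the kind of change that breaks binaries built against the old layout. Because an older driver never sets the appended pointer, the context has to be zeroed at allocation so that callers can test `xrc_ops` against NULL.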

- Jack


From vlad at mellanox.co.il  Mon Dec 17 08:51:21 2007
From: vlad at mellanox.co.il (Vladimir Sokolovsky)
Date: Mon, 17 Dec 2007 18:51:21 +0200
Subject: [ofa-general] Re: [ewg] ofed-1.3-rc1 problem
In-Reply-To: <476696D6.3040108@opengridcomputing.com>
References: <4762CF30.2030202@opengridcomputing.com>
	<200712170932.33301.vlad@mellanox.co.il>
	<476696D6.3040108@opengridcomputing.com>
Message-ID: <200712171851.21471.vlad@mellanox.co.il>

> I'm not sure where the fix needs to go, but if I install mvapich2-1.0.0
> via install.pl, it should also build/install libibumad-devel.
> Otherwise, mpi programs cannot link correctly.  I don't know if this
> dependency should be defined in the mvapich2 srpm somehow or the ofed
> tools.
>
> But in install.pl I see this:
> >         'mvapich2_gcc' =>
> >             { name => "mvapich2_gcc", parent => "mvapich2",
> >             selected => 0, installed => 0, rpm_exist => 0, rpm_exist32 =>
> > 0, available => 0, mode => "user", dist_req_build => [], dist_req_inst =>
> > [], ofa_req_build => ["libibumad-devel", "libibverbs-devel",
> > "librdmacm-devel"], ofa_req_inst => ["mpi-selector", "librdmacm",
> > "libibumad"], install32 => 0, exception => 0 },
>
> And see that libibumad-devel is a build requirement.  I claim it is also
> an install requirement.
>
> This all worked on 1.2.5 by the way...
>
>
> Steve.

I added libibumad-devel to the mvapich2 install requirements in install.pl.
This change will be available in tomorrow's OFED-1.3 daily build.

Regards,
Vladimir


From vintagegue95 at oceanmed.net  Mon Dec 17 09:10:40 2007
From: vintagegue95 at oceanmed.net (Imogene Parrish)
Date: Mon, 17 Dec 2007 19:10:40 +0200
Subject: [ofa-general] BestQualityCustomerSupportSafeSecure
Message-ID: <01c840e0$83defc60$73614855@vintagegue95>

ForValuedCustomerPhentrimineSpecialPriceshttp://
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071217/5bbe5650/attachment.html>

From sean.hefty at intel.com  Mon Dec 17 09:35:53 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Mon, 17 Dec 2007 09:35:53 -0800
Subject: [ofa-general] RDMA/CM Spec
In-Reply-To: <1A960510-3B9F-4902-B523-A2CB80B68849@hpc.ufl.edu>
References: <1A960510-3B9F-4902-B523-A2CB80B68849@hpc.ufl.edu>
Message-ID: <000001c840d3$45f04330$9b37170a@amr.corp.intel.com>

>I'm a little bit new to OFED/RDMA development and can't seem to find
>a spec or rfc describing the rdma/cm layer.     I don't see it in the
>iWARP consortium RDMAP spec (not even a reference to it) and googling
>turns up mainly references to the OFA-general mailing list.     Is
>there a spec for this?

For InfiniBand, see Annex A11.

- Sean


From sean.hefty at intel.com  Mon Dec 17 09:42:04 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Mon, 17 Dec 2007 09:42:04 -0800
Subject: [ofa-general] RE: the mckey example has internal race
In-Reply-To: <47665BB8.5030805@dev.mellanox.co.il>
References: <47665BB8.5030805@dev.mellanox.co.il>
Message-ID: <000101c840d4$233aa730$9b37170a@amr.corp.intel.com>

>It seems that the server only receives some of the messages that the
>client sends because there isn't
>any sync between the sides: the client posts SRs before all of the RRs were
>posted on the server, and
>some of the messages sent to the server are
>(silently) dropped.

This was intended as a simple test app only.  No synchronization was done
between the client and server, and the multicast messages are not acked.

- Sean


From ralph.campbell at qlogic.com  Mon Dec 17 10:48:02 2007
From: ralph.campbell at qlogic.com (Ralph Campbell)
Date: Mon, 17 Dec 2007 10:48:02 -0800
Subject: [ofa-general] Re: [PATCH 5 of 5] libipathverbs: zero context struct
	at allocation time (prep for additional context ops)
In-Reply-To: <200712171019.38518.jackm@dev.mellanox.co.il>
References: <200712171019.38518.jackm@dev.mellanox.co.il>
Message-ID: <1197917282.3641.68.camel@brick.pathscale.com>

Thanks, applied to ~ralphc/libipathverbs/.git ofed_1_3 branch.

On Mon, 2007-12-17 at 10:19 +0200, Jack Morgenstein wrote:
> The ibv_context structure will be getting additional ops,
> to be added at the end of the structure (and not as part of
> the existing ibv_context_ops structure).
> 
> Reason: ibv_context_ops is declared directly as a member of ibv_context,
> and not as a pointer.  Binaries compiled with previous libibverbs versions
> will not be backwards compatible if we add new operations to ibv_context_ops,
> since fields following the ops structure will move.
> 
> To enable adding new operations at the end of the existing ibv_context struct,
> all driver libraries MUST zero their context structure at allocation time, so
> that new ops will be NULL by default.
> 
> Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>
> 
> diff --git a/src/ipathverbs.c b/src/ipathverbs.c
> index eb16fb0..55d8dcf 100644
> --- a/src/ipathverbs.c
> +++ b/src/ipathverbs.c
> @@ -42,6 +42,7 @@
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <unistd.h>
> +#include <string.h>
>  
>  #include "ipathverbs.h"
>  #include "ipath-abi.h"
> @@ -140,6 +141,7 @@ static struct ibv_context *ipath_alloc_context(struct ibv_device *ibdev,
>  	context = malloc(sizeof *context);
>  	if (!context)
>  		return NULL;
> +	memset(context, 0, sizeof *context);
>  	context->ibv_ctx.cmd_fd = cmd_fd;
>  	if (ibv_cmd_get_context(&context->ibv_ctx, &cmd,
>  				sizeof cmd, &resp, sizeof resp))
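The reasoning in the quoted patch description can be illustrated with a minimal, self-contained sketch. The `demo_*` types and functions below are hypothetical stand-ins, not the real libibverbs or libipathverbs definitions; they only show why a zeroed allocation lets callers test a later-appended op pointer against NULL. The sketch uses `calloc()` where the patch uses `malloc()` followed by `memset()`, which has the same effect.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an ops table that is embedded by value. */
struct demo_context_ops {
	int (*query)(void);
};

/* Hypothetical stand-in for a context struct. */
struct demo_context {
	struct demo_context_ops ops;	/* embedded, so its size is part of the ABI */
	int cmd_fd;
	int (*new_op)(void);		/* op appended later, at the end of the struct */
};

/* Allocate a zeroed context so any op the driver does not set stays NULL. */
static struct demo_context *demo_alloc_context(int cmd_fd)
{
	struct demo_context *ctx = calloc(1, sizeof *ctx);

	if (!ctx)
		return NULL;
	ctx->cmd_fd = cmd_fd;
	return ctx;
}

/* Callers check the pointer before using an op that older drivers lack. */
static int demo_call_new_op(struct demo_context *ctx)
{
	return ctx->new_op ? ctx->new_op() : -1;
}
```

With a plain `malloc()` and no zeroing, `new_op` would hold an indeterminate value and the NULL check in `demo_call_new_op()` would not be reliable.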


From joe at perches.com  Mon Dec 17 11:30:36 2007
From: joe at perches.com (Joe Perches)
Date: Mon, 17 Dec 2007 11:30:36 -0800
Subject: [ofa-general] [PATCH] drivers/infiniband/: Spelling fixes
Message-ID: <1197919875-5288-37-git-send-email-joe@perches.com>


Signed-off-by: Joe Perches <joe at perches.com>
---
 drivers/infiniband/hw/cxgb3/cxio_hal.c |    2 +-
 drivers/infiniband/hw/ehca/ehca_av.c   |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c
index eec6a30..26b8c0e 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_hal.c
+++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c
@@ -584,7 +584,7 @@ static int cxio_hal_ctrl_qp_write_mem(struct cxio_rdev *rdev_p, u32 addr,
 {
 	u32 i, nr_wqe, copy_len;
 	u8 *copy_data;
-	u8 wr_len, utx_len;	/* lenght in 8 byte flit */
+	u8 wr_len, utx_len;	/* length in 8 byte flit */
 	enum t3_wr_flags flag;
 	__be64 *wqe;
 	u64 utx_cmd;
diff --git a/drivers/infiniband/hw/ehca/ehca_av.c b/drivers/infiniband/hw/ehca/ehca_av.c
index f7782c8..194c1c3 100644
--- a/drivers/infiniband/hw/ehca/ehca_av.c
+++ b/drivers/infiniband/hw/ehca/ehca_av.c
@@ -1,7 +1,7 @@
 /*
  *  IBM eServer eHCA Infiniband device driver for Linux on POWER
  *
- *  adress vector functions
+ *  address vector functions
  *
  *  Authors: Hoang-Nam Nguyen <hnguyen at de.ibm.com>
  *           Khadija Souissi <souissik at de.ibm.com>
-- 
1.5.3.7.949.g2221a6


From sean.hefty at intel.com  Mon Dec 17 11:57:59 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Mon, 17 Dec 2007 11:57:59 -0800
Subject: [ofa-general] RE: some questions on stale connection handling at the
	IB CM
In-Reply-To: <Pine.LNX.4.64.0712171638420.28805@zuben.voltaire.com>
References: <Pine.LNX.4.64.0712171638420.28805@zuben.voltaire.com>
Message-ID: <000301c840e7$201af600$9b37170a@amr.corp.intel.com>

>Looking at the code, I see that the CM sends a reject message with the reason
>being
>IB_CM_REJ_STALE_CONN when it gets a REQ or REP whose <QPN, CA GUID> pair is
>already
>present in the remote-qpn rb tree (and in another case which I don't fully
>understand).

IB_CM_REJ_STALE_CONN is sent in the following situations:

* Remote ID in REQ matches a connection that is in timewait.  This is treated as
a duplicate REQ that was processed after the connection had been terminated.

* Remote QPN in REQ or REP matches an existing connection, and REQ/REP was not
detected as a duplicate.
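The two situations listed above can be written as a small predicate. This is only a sketch of the decision as described in this message, not the kernel CM code: `struct demo_lookup` and `demo_should_reject_stale()` are hypothetical names, and the lookup results are assumed to have been computed elsewhere.

```c
#include <stdbool.h>

/* Hypothetical summary of what was found when looking up an incoming REQ/REP. */
struct demo_lookup {
	bool is_req;			/* true for a REQ, false for a REP */
	bool remote_id_in_timewait;	/* remote ID matches a connection in timewait */
	bool remote_qpn_in_use;		/* remote QPN matches an existing connection */
	bool is_duplicate;		/* message was detected as a duplicate */
};

/* Returns true when a stale-connection reject would be sent. */
static bool demo_should_reject_stale(const struct demo_lookup *l)
{
	/* Case 1: REQ whose remote ID matches a connection in timewait. */
	if (l->is_req && l->remote_id_in_timewait)
		return true;

	/* Case 2: REQ or REP whose remote QPN is already in use, not a duplicate. */
	if (l->remote_qpn_in_use && !l->is_duplicate)
		return true;

	return false;
}
```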

>On the other side, when the CM receives a reject message with that reason, the
>local handle
>(id) is moved to the timewait state, where my understanding is that it will sit
>there for a while
>and then a reject/stale-connection callback will be delivered to the user and the
>id will be removed.

Correct.

>What I don't see is the issuing of a "DREQ, with DREQ:remote QPN set to the remote QPN
>from the REQ/REP"
>as stated in 12.9.8.3.1 (below). Is it really missing, or am I reading the code
>wrong?

This is missing.  But neither the DREQ nor the DREP generated in this case
drives the state machines.  Both messages are simply generated and then consumed
by the CM.  (I don't even think it's clear whether the local and remote IDs in the
DREQ/DREP are relative to the stale connection or to the new connection
request/reply.)

>Also, it's quite clear to me that from the viewpoint of the application there
>are stale
>connection cases which the CM cannot catch, e.g. a client DREQ did not arrive at
>the server
>side CM and then the client REQ uses a different QPN, etc. My understanding is
>that in such
>cases the responsibility to close the stale connection/qp is on the server app.

Correct - keep-alive messages are still needed by apps to know if their
connections are still valid.  IMO, stale connection detection becomes less
useful as the number of systems being connected to increases.

- Sean


From swise at opengridcomputing.com  Mon Dec 17 12:18:04 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Mon, 17 Dec 2007 14:18:04 -0600
Subject: [ofa-general] Re: [PATCH 3 of 5] libcxgb3: zero context struct at
 allocation time (prep for additional context ops)
In-Reply-To: <200712171019.33326.jackm@dev.mellanox.co.il>
References: <200712171019.33326.jackm@dev.mellanox.co.il>
Message-ID: <4766D97C.7000205@opengridcomputing.com>

Applied. Thanks.

I've released version 1.1.1 of the library, and updated the ofed_1_3 branch.

Vlad, can you pull version 1.1.1 for ofed-1.3?

git://git.openfabrics.org/~swise/libcxgb3.git ofed_1_3


Thanks,

Steve.


Jack Morgenstein wrote:
> The ibv_context structure will be getting additional ops,
> to be added at the end of the structure (and not as part of
> the existing ibv_context_ops structure).
> 
> Reason: ibv_context_ops is declared directly as a member of ibv_context,
> and not as a pointer.  Binaries compiled with previous libibverbs versions
> will not be backwards compatible if we add new operations to ibv_context_ops,
> since fields following the ops structure will move.
> 
> To enable adding new operations at the end of the existing ibv_context struct,
> all driver libraries MUST zero their context structure at allocation time, so
> that new ops will be NULL by default.
> 
> Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>
> 
> diff --git a/src/iwch.c b/src/iwch.c
> index 2747518..517ff00 100644
> --- a/src/iwch.c
> +++ b/src/iwch.c
> @@ -114,6 +114,7 @@ static struct ibv_context *iwch_alloc_context(struct ibv_device *ibdev,
>  	if (!context)
>  		return NULL;
>  
> +	memset(context, 0, sizeof *context);
>  	context->ibv_ctx.cmd_fd = cmd_fd;
>  
>  	if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd,


From dillowda at ornl.gov  Mon Dec 17 14:03:19 2007
From: dillowda at ornl.gov (David Dillow)
Date: Mon, 17 Dec 2007 17:03:19 -0500
Subject: [ofa-general] [RFC PATCH] IB/srp: respect target credit limit
Message-ID: <1197929000.31600.19.camel@lap75545.ornl.gov>

The SRP initiator will currently send requests even if we have no
credits available. The result of sending extra requests is
vendor-specific, but on some devices, overrunning our credits will cost
us 85% of peak performance -- e.g. 100 MB/s vs 720 MB/s. Other devices
may just drop them.

This patch will tell the SCSI mid-layer to queue requests if there are
fewer than two credits remaining, and will not issue a task management
request if there are no credits remaining. The mid-layer will retry the
queued command once an outstanding command completes.

Also, it adds a queue length parameter to the target options, so that
the maximum number of commands the SCSI mid-layer will try to send can
be capped before it hits the credit limit.

Signed-off-by: David Dillow <dillowda at ornl.gov>
---

A cleanup would be to get rid of the zero_req_lim counter, as it should
never be hit now.

 ib_srp.c |   49 +++++++++++++++++++++++++++++++++++++++++++------
 ib_srp.h |    5 +++++
 2 files changed, 48 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 950228f..18dbe89 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -925,6 +925,28 @@ static int srp_post_recv(struct srp_target_port *target)
 	return ret;
 }
 
+/* Must be called with target->scsi_host->host_lock held to protect
+ * req_lim. Lock cannot be dropped between here and srp_use_credit().
+ */
+static int srp_check_credit(struct srp_target_port *target,
+				enum srp_request_type req_type)
+{
+	/* Each request requires one credit, and we hold a credit in reserve
+	 * so that we can send task management requests.
+	 */
+	s32 min = (req_type == SRP_REQ_TASK_MGMT) ? 1 : 2;
+	return target->req_lim < min;
+}
+
+/* Must be called with target->scsi_host->host_lock held to protect
+ * req_lim. Lock cannot be dropped between the call to srp_check_credit()
+ * and the call to srp_use_credit()
+ */
+static void srp_use_credit(struct srp_target_port *target)
+{
+	target->req_lim--;
+}
+
 /*
  * Must be called with target->scsi_host->host_lock held to protect
  * req_lim and tx_head.  Lock cannot be dropped between call here and
@@ -965,10 +987,8 @@ static int __srp_post_send(struct srp_target_port *target,
 
 	ret = ib_post_send(target->qp, &wr, &bad_wr);
 
-	if (!ret) {
+	if (!ret)
 		++target->tx_head;
-		--target->req_lim;
-	}
 
 	return ret;
 }
@@ -993,6 +1013,9 @@ static int srp_queuecommand(struct scsi_cmnd *scmnd,
 		return 0;
 	}
 
+	if (srp_check_credit(target, SRP_REQ_NORMAL))
+		goto err;
+
 	iu = __srp_get_tx_iu(target);
 	if (!iu)
 		goto err;
@@ -1039,6 +1062,8 @@ static int srp_queuecommand(struct scsi_cmnd *scmnd,
 		goto err_unmap;
 	}
 
+	srp_use_credit(target);
+
 	list_move_tail(&req->list, &target->req_queue);
 
 	return 0;
@@ -1180,9 +1205,6 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 
 			target->max_ti_iu_len = be32_to_cpu(rsp->max_ti_iu_len);
 			target->req_lim       = be32_to_cpu(rsp->req_lim_delta);
-
-			target->scsi_host->can_queue = min(target->req_lim,
-							   target->scsi_host->can_queue);
 		} else {
 			printk(KERN_WARNING PFX "Unhandled RSP opcode %#x\n", opcode);
 			target->status = -ECONNRESET;
@@ -1283,6 +1305,9 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 
 	init_completion(&req->done);
 
+	if (srp_check_credit(target, SRP_REQ_TASK_MGMT))
+		goto out;
+
 	iu = __srp_get_tx_iu(target);
 	if (!iu)
 		goto out;
@@ -1300,6 +1325,7 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 		goto out;
 
 	req->tsk_mgmt = iu;
+	srp_use_credit(target);
 
 	spin_unlock_irq(target->scsi_host->host_lock);
 
@@ -1540,6 +1566,7 @@ static struct scsi_host_template srp_template = {
 	.eh_device_reset_handler	= srp_reset_device,
 	.eh_host_reset_handler		= srp_reset_host,
 	.can_queue			= SRP_SQ_SIZE,
+	.max_host_blocked		= 1,
 	.this_id			= -1,
 	.cmd_per_lun			= SRP_SQ_SIZE,
 	.use_clustering			= ENABLE_CLUSTERING,
@@ -1610,6 +1637,7 @@ enum {
 	SRP_OPT_MAX_CMD_PER_LUN	= 1 << 6,
 	SRP_OPT_IO_CLASS	= 1 << 7,
 	SRP_OPT_INITIATOR_EXT	= 1 << 8,
+	SRP_OPT_QUEUE_LEN	= 1 << 9,
 	SRP_OPT_ALL		= (SRP_OPT_ID_EXT	|
 				   SRP_OPT_IOC_GUID	|
 				   SRP_OPT_DGID		|
@@ -1627,6 +1655,7 @@ static match_table_t srp_opt_tokens = {
 	{ SRP_OPT_MAX_CMD_PER_LUN,	"max_cmd_per_lun=%d" 	},
 	{ SRP_OPT_IO_CLASS,		"io_class=%x"		},
 	{ SRP_OPT_INITIATOR_EXT,	"initiator_ext=%s"	},
+	{ SRP_OPT_QUEUE_LEN,		"queue_len=%d" 		},
 	{ SRP_OPT_ERR,			NULL 			}
 };
 
@@ -1754,6 +1783,14 @@ static int srp_parse_options(const char *buf, struct srp_target_port *target)
 			kfree(p);
 			break;
 
+		case SRP_OPT_QUEUE_LEN:
+			if (match_int(args, &token)) {
+				printk(KERN_WARNING PFX "bad queue_len parameter '%s'\n", p);
+				goto out;
+			}
+			target->scsi_host->can_queue = min(token, SRP_SQ_SIZE);
+			break;
+
 		default:
 			printk(KERN_WARNING PFX "unknown parameter or missing value "
 			       "'%s' in target creation request\n", p);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index e3573e7..4a3c1f3 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -79,6 +79,11 @@ enum srp_target_state {
 	SRP_TARGET_REMOVED
 };
 
+enum srp_request_type {
+	SRP_REQ_NORMAL,
+	SRP_REQ_TASK_MGMT,
+};
+
 struct srp_device {
 	struct list_head	dev_list;
 	struct ib_device       *dev;
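The credit accounting in this patch can be modeled in userspace as follows. This is a simplified sketch, not the driver code: the `demo_*` names are hypothetical, the check and the decrement are folded into one helper, and the locking the patch requires (host_lock held across check and use) is omitted because the model is single-threaded.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_request_type {
	DEMO_REQ_NORMAL,
	DEMO_REQ_TASK_MGMT,
};

/* Hypothetical stand-in for the per-target state; only the credit count. */
struct demo_target {
	int32_t req_lim;
};

/* True when the request must not be sent. Normal requests leave one
 * credit in reserve so a task management request can still go out.
 */
static bool demo_out_of_credit(const struct demo_target *t,
			       enum demo_request_type type)
{
	int32_t min = (type == DEMO_REQ_TASK_MGMT) ? 1 : 2;

	return t->req_lim < min;
}

/* Consume one credit if the request may be sent; report whether it was. */
static bool demo_try_send(struct demo_target *t, enum demo_request_type type)
{
	if (demo_out_of_credit(t, type))
		return false;
	t->req_lim--;
	return true;
}
```

Starting from three credits, two normal requests are accepted, a third is refused, and the remaining credit is still available for a single task management request.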


From danderson at lnxi.com  Mon Dec 17 15:16:07 2007
From: danderson at lnxi.com (David B. Anderson)
Date: Mon, 17 Dec 2007 16:16:07 -0700
Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to select
	sp4 patches for SLES9 kernel with minor versions equal or greater
	than 305
In-Reply-To: <39C75744D164D948A170E9792AF8E7CA4D2CE1@exil.voltaire.com>
References: <39C75744D164D948A170E9792AF8E7CA4D2CE1@exil.voltaire.com>
Message-ID: <47670337.6080607@lnxi.com>

I have all of these patches, plus the following patch:

    kernel_patches/backport/2.6.5_sles9_sp4/cxgb3_remove_eeh.patch

My current git repo is
git://git.openfabrics.org/ofed_1_2/linux-2.6.git
commit: 6974c285e6fb06264f570f9cf919865bab66c9e6

The patch that I posted before fixes the kernel configure script so that 
it applies the 2.6.5_sles9_sp4 patches for the SP4 release kernel, 
2.6.5-7.308, and above. The configure patch from 
OFED_1.2.5_sles9_sp4_configure.diff has 2.6.5-7.305* as the only valid 
SP4 kernel, which is incorrect. I get the same compiler error as before.
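The difference between matching a single kernel release and matching "this release or later" can be sketched as below. This is an illustration in C rather than the actual configure shell logic, which is not shown in this thread; `demo_is_sles9_sp4()` is a hypothetical helper, and the threshold is left as a parameter because the subject line mentions 305 while the text above mentions 308.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if 'release' has the form 2.6.5-7.<N>...
 * with N greater than or equal to min_minor.
 */
static bool demo_is_sles9_sp4(const char *release, int min_minor)
{
	int minor;

	/* The literal prefix must match exactly; only the trailing number is parsed. */
	if (sscanf(release, "2.6.5-7.%d", &minor) != 1)
		return false;

	return minor >= min_minor;
}
```

A pattern equivalent to `2.6.5-7.305*` accepts only the one release, whereas the comparison above also accepts 2.6.5-7.308 and any later update kernel.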

 
Moshe Kazir wrote:
>  See patches in the attached message.
>
> It was applied by Vlad.
>
> Moshe
>
> ____________________________________________________________
> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>  
> Voltaire - The Grid Backbone
>  
>  www.voltaire.com
>
>   
>
>
> -----Original Message-----
> From: general-bounces at lists.openfabrics.org
> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of David B.
> Anderson
> Sent: Saturday, December 15, 2007 3:31 AM
> To: general at lists.openfabrics.org; vlad at mellanox.co.il
> Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to
> select sp4 patches for SLES9 kernel with minor versions equal or greater
> than 305
>
>
> Hi,
>
> I've created the following patch for OFED 1.2.5.4 to have the kernel for
>
> SLES9 SP4 recognized (2.6.5-7.308).
>
> Even with the patch I then had two back port patches not apply cleanly 
> (cxg3_to_2_6_20.patch, rds_to_2_6_9.patch). I hand patched them but now 
> I'm getting the following compiler errors:
>
> In file included from
> /usr/src/linux-2.6.5-7.308/include/linux/module.h:10,
>                  from 
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
> port/2.6.5_sles9_sp4/include/linux/module.h:4,
>                  from
> /usr/src/linux-2.6.5-7.308/include/linux/device.h:21,
>                  from 
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
> port/2.6.5_sles9_sp4/include/linux/device.h:4,
>                  from 
> /usr/src/linux-2.6.5-7.308/include/linux/netdevice.h:38,
>                  from 
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
> port/2.6.5_sles9_sp4/include/linux/netdevice.h:4,
>                  from 
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
> /core/addr.c:32:
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
> port/2.6.5_sles9_sp4/include/linux/sched.h:8: 
> warning: static declaration for `wait_for_completion_timeout' follows 
> non-static
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
> /core/addr.c:67: 
> warning: initialization from incompatible pointer type
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
> /core/addr.c: 
> In function `addr_resolve_remote':
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
> /core/addr.c:192: 
> error: structure has no member named `idev'
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
> /core/addr.c:193: 
> error: structure has no member named `idev'
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
> /core/addr.c:197: 
> error: structure has no member named `idev'
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
> /core/addr.c: 
> At top level:
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
> port/2.6.5_sles9_sp4/include/linux/device.h:48: 
> warning: `class_create' defined but not used
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
> port/2.6.5_sles9_sp4/include/linux/device.h:82: 
> warning: `class_destroy' defined but not used
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
> port/2.6.5_sles9_sp4/include/linux/device.h:108: 
> warning: `class_device_create' defined but not used
> make[6]: *** 
> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniban
> d/core/addr.o] 
> Error 1
> make[5]: *** 
> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniban
> d/core] 
> Error 2
> make[4]: *** 
> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniban
> d] 
> Error 2
> make[3]: *** 
> [_module_/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default] Error 2
> make[2]: *** [modules] Error 2
> make[1]: *** [modules] Error 2
> make[1]: Leaving directory
> `/usr/src/linux-2.6.5-7.308-obj/x86_64/default'
> make: *** [kernel] Error 2
>
> Does anyone have OFED 1.2.5.4 building for SLES 9 SP4?
>
> Thanks
>
>   
>
> ------------------------------------------------------------------------
>
> Subject:
> [ofa-general] OFED-1.2.5 backport patches for SLES9 SP4
> From:
> "Moshe Kazir" <moshek at voltaire.com>
> Date:
> Sun, 25 Nov 2007 09:59:26 +0200
> To:
> "Vladimir Sokolovsky" <vlad at mellanox.co.il>, 
> <general at lists.openfabrics.org>
>
>
> The attached files do the work.
>
> OFED_1.2.5_sles9_sp4_configure.diff  includes the changes to the
> configure file.
> OFED_1.2.5_sles9_sp4_backport.diff  includes the changes required in the
> kernel_patches and kernel_addons directories.
> Moshe
> ____________________________________________________________
> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>  
> Voltaire - The Grid Backbone
>  
>  www.voltaire.com
>
>   
>
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


-- 
David B. Anderson 
Linux Networx
Sr. Software Engineer
Email: danderson at lnxi.com
Phone: (801) 649-1311


From weiny2 at llnl.gov  Mon Dec 17 17:30:14 2007
From: weiny2 at llnl.gov (Ira Weiny)
Date: Mon, 17 Dec 2007 17:30:14 -0800
Subject: [ofa-general] [PATCH] opensm: Add "perfmgr print_counters node" to
 the console to print individual values
Message-ID: <20071217173014.037d4ae9.weiny2@llnl.gov>

>From 14671d63a4315a98a7f8ed17ece2bd833aed39f2 Mon Sep 17 00:00:00 2001
From: Ira K. Weiny <weiny2 at llnl.gov>
Date: Fri, 14 Dec 2007 15:57:30 -0800
Subject: [PATCH] Add "perfmgr print_counters node" to the console to print individual values

directly on the console.

Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
---
 opensm/include/opensm/osm_perfmgr.h    |    2 +
 opensm/include/opensm/osm_perfmgr_db.h |    2 +
 opensm/opensm/osm_console.c            |   10 +++++++
 opensm/opensm/osm_perfmgr.c            |   14 +++++++++
 opensm/opensm/osm_perfmgr_db.c         |   46 ++++++++++++++++++++++++++++++++
 5 files changed, 74 insertions(+), 0 deletions(-)

diff --git a/opensm/include/opensm/osm_perfmgr.h b/opensm/include/opensm/osm_perfmgr.h
index 0dd3ce4..4bd05f5 100644
--- a/opensm/include/opensm/osm_perfmgr.h
+++ b/opensm/include/opensm/osm_perfmgr.h
@@ -219,6 +219,8 @@ inline static uint16_t osm_perfmgr_get_sweep_time_s(osm_perfmgr_t * p_perfmgr)
 void osm_perfmgr_clear_counters(osm_perfmgr_t * p_perfmgr);
 void osm_perfmgr_dump_counters(osm_perfmgr_t * p_perfmgr,
 			       perfmgr_db_dump_t dump_type);
+void osm_perfmgr_print_counters(osm_perfmgr_t *pm, char *nodename,
+				FILE *fp);
 
 ib_api_status_t osm_perfmgr_bind(osm_perfmgr_t * const p_perfmgr,
 				 const ib_net64_t port_guid);
diff --git a/opensm/include/opensm/osm_perfmgr_db.h b/opensm/include/opensm/osm_perfmgr_db.h
index 0991102..2c4c520 100644
--- a/opensm/include/opensm/osm_perfmgr_db.h
+++ b/opensm/include/opensm/osm_perfmgr_db.h
@@ -186,6 +186,8 @@ perfmgr_db_err_t perfmgr_db_clear_prev_dc(perfmgr_db_t * db, uint64_t guid,
 void perfmgr_db_clear_counters(perfmgr_db_t * db);
 perfmgr_db_err_t perfmgr_db_dump(perfmgr_db_t * db, char *file,
 				 perfmgr_db_dump_t dump_type);
+void perfmgr_db_print_by_name(perfmgr_db_t * db, char *nodename, FILE *fp);
+void perfmgr_db_print_by_guid(perfmgr_db_t * db, uint64_t guid, FILE *fp);
 
 /** =========================================================================
  * helper functions to fill in the various db objects from wire objects
diff --git a/opensm/opensm/osm_console.c b/opensm/opensm/osm_console.c
index f669240..5407209 100644
--- a/opensm/opensm/osm_console.c
+++ b/opensm/opensm/osm_console.c
@@ -180,6 +180,8 @@ static void help_perfmgr(FILE * out, int detail)
 			"   [clear_counters] -- clear the counters stored\n");
 		fprintf(out,
 			"   [dump_counters [mach]] -- dump the counters (optionally in [mach]ine readable format)\n");
+		fprintf(out,
+			"   [print_counters <nodename|nodeguid>] -- print the counters for the specified node\n");
 	}
 }
 #endif				/* ENABLE_OSM_PERF_MGR */
@@ -796,6 +798,14 @@ static void perfmgr_parse(char **p_last, osm_opensm_t * p_osm, FILE * out)
 				osm_perfmgr_dump_counters(&(p_osm->perfmgr),
 							  PERFMGR_EVENT_DB_DUMP_HR);
 			}
+		} else if (strcmp(p_cmd, "print_counters") == 0) {
+			p_cmd = next_token(p_last);
+			if (p_cmd) {
+				osm_perfmgr_print_counters(&(p_osm->perfmgr), p_cmd, out);
+			} else {
+				fprintf(out,
+					"print_counters requires a node name to be specified\n");
+			}
 		} else if (strcmp(p_cmd, "sweep_time") == 0) {
 			p_cmd = next_token(p_last);
 			if (p_cmd) {
diff --git a/opensm/opensm/osm_perfmgr.c b/opensm/opensm/osm_perfmgr.c
index f2024ea..310e8cb 100644
--- a/opensm/opensm/osm_perfmgr.c
+++ b/opensm/opensm/osm_perfmgr.c
@@ -1335,4 +1335,18 @@ void osm_perfmgr_dump_counters(osm_perfmgr_t * pm, perfmgr_db_dump_t dump_type)
 			pm->event_db_dump_file, strerror(errno));
 }
 
+/*******************************************************************
+ * Have the DB print its information to the fp specified
+ *******************************************************************/
+void
+osm_perfmgr_print_counters(osm_perfmgr_t *pm, char *nodename, FILE *fp)
+{
+	uint64_t guid = strtoull(nodename, NULL, 0);
+	if (guid == 0 && errno == EINVAL) {
+		perfmgr_db_print_by_name(pm->db, nodename, fp);
+	} else {
+		perfmgr_db_print_by_guid(pm->db, guid, fp);
+	}
+}
+
 #endif				/* ENABLE_OSM_PERF_MGR */
diff --git a/opensm/opensm/osm_perfmgr_db.c b/opensm/opensm/osm_perfmgr_db.c
index 35f77ed..de36cad 100644
--- a/opensm/opensm/osm_perfmgr_db.c
+++ b/opensm/opensm/osm_perfmgr_db.c
@@ -677,6 +677,52 @@ static void __db_dump(cl_map_item_t * const p_map_item, void *context)
 }
 
 /**********************************************************************
+ * print node data to fp
+ **********************************************************************/
+void
+perfmgr_db_print_by_name(perfmgr_db_t * db, char *nodename, FILE *fp)
+{
+	cl_map_item_t *item = NULL;
+	_db_node_t *node = NULL;
+
+	cl_plock_acquire(&(db->lock));
+
+	/* find the node */
+	item = cl_qmap_head(&(db->pc_data));
+	while (item != cl_qmap_end(&(db->pc_data))) {
+		node = (_db_node_t *)item;
+		if (strcmp(node->node_name, nodename) == 0) {
+			__dump_node_hr(node, fp);
+			goto done;
+		}
+		item = cl_qmap_next(item);
+	}
+
+	fprintf(fp, "Node %s not found...\n", nodename);
+done:
+	cl_plock_release(&(db->lock));
+}
+
+/**********************************************************************
+ * print node data to fp
+ **********************************************************************/
+void
+perfmgr_db_print_by_guid(perfmgr_db_t * db, uint64_t nodeguid, FILE *fp)
+{
+	cl_map_item_t *node = NULL;
+
+	cl_plock_acquire(&(db->lock));
+
+	node = cl_qmap_get(&(db->pc_data), nodeguid);
+	if (node != cl_qmap_end(&(db->pc_data)))
+		__dump_node_hr((_db_node_t *)node, fp);
+	else
+		fprintf(fp, "Node %"PRIx64" not found...\n", nodeguid);
+
+	cl_plock_release(&(db->lock));
+}
+
+/**********************************************************************
  * dump the data to the file "file"
  **********************************************************************/
 perfmgr_db_err_t
-- 
1.5.1
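
A note on the name-versus-GUID test in osm_perfmgr_print_counters(): the
patch relies on `guid == 0 && errno == EINVAL` after strtoull(). strtoull()
never resets errno on its own, and POSIX only says it may set EINVAL when no
conversion is performed, so a stale errno or a libc that leaves errno alone
could send a node name down the GUID path. Below is a minimal sketch of an
alternative that checks the end pointer instead. parse_node_guid() is a
hypothetical helper, not part of the patch or of OpenSM.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (assumption, not in the patch): returns 1 and stores
 * the value in *guid if str is entirely a number (decimal, 0x-hex or
 * octal); returns 0 so the caller can treat str as a node name otherwise.
 */
static int parse_node_guid(const char *str, uint64_t *guid)
{
	char *end = NULL;
	unsigned long long val;

	if (str == NULL || *str == '\0')
		return 0;

	errno = 0;		/* strtoull() does not clear errno itself */
	val = strtoull(str, &end, 0);

	if (end == str)		/* no digits were consumed */
		return 0;
	if (*end != '\0')	/* trailing characters, e.g. "12abc" */
		return 0;
	if (errno == ERANGE)	/* value out of range for the type */
		return 0;

	*guid = (uint64_t)val;
	return 1;
}
```

With something like this, the console handler could call
perfmgr_db_print_by_guid() when the helper returns 1 and
perfmgr_db_print_by_name() otherwise.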

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-perfmgr-print_counters-node-to-the-console-to.patch
Type: application/octet-stream
Size: 5814 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071217/607a8b8c/attachment.obj>

From swise at opengridcomputing.com  Mon Dec 17 20:35:59 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Mon, 17 Dec 2007 22:35:59 -0600
Subject: [ofa-general] [GIT PULL ofed-1.2.5 and ofed-1.3] cxgb3 fixes
Message-ID: <47674E2F.9000502@opengridcomputing.com>

Vlad,

Please pull 3 new cxgb3 driver fixes + backport support for ofed-1.2.5 
and ofed-1.3. 

The 3 patches have been submitted and merged upstream.

First two patches are submitted here:

http://www.spinics.net/lists/kernel/msg659899.html

And the third here:

http://www.spinics.net/lists/kernel/msg660541.html

For ofed-1.2.5, please pull from:

git://git.openfabrics.org/~swise/ofed-1.2.5.git ofed_1_2_c

For ofed-1.3, please pull from:

git://git.openfabrics.org/~swise/ofed-1.3.git ofed_kernel

Thanks,

Steve.


From kliteyn at mellanox.co.il  Mon Dec 17 21:08:32 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 18 Dec 2007 07:08:32 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-18:normal completion
Message-ID: <MTLEXCH01LGQtXrJIOY00000de6@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-17
OpenSM git rev = Sat_Dec_15_15:22:10_2007 [4d6d0de291e8e4990e645202fcd5fbc02387cf27]
ibutils git rev = Sun_Nov_11_14:54:58_2007 [10bd4760a06c40452f62b911a7e64c93c65f3810]
 
 
Total=520  Pass=518  Fail=2
 
 
Pass:
39 Stability IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
38 Pkey IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
12 LidMgr IS3-128.topo

Failures:
1 Pkey IS1-16.topo
1 LidMgr IS3-128.topo


From rdreier at cisco.com  Mon Dec 17 21:22:39 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 17 Dec 2007 21:22:39 -0800
Subject: [ofa-general] Re: [PATCH] mlx4_core: fix max_eqs masking in
	QUERY_DEV_CAP
References: <200712100525.23484.jackm@dev.mellanox.co.il>
Message-ID: <adazlw837c0.fsf@cisco.com>

thanks, applied.


From rdreier at cisco.com  Mon Dec 17 21:48:07 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 17 Dec 2007 21:48:07 -0800
Subject: [ofa-general] Re: [PATCH] drivers/infiniband/: Spelling fixes
In-Reply-To: <1197919875-5288-37-git-send-email-joe@perches.com> (Joe
	Perches's message of "Mon, 17 Dec 2007 11:30:36 -0800")
References: <1197919875-5288-37-git-send-email-joe@perches.com>
Message-ID: <adave6w365k.fsf@cisco.com>

what the heck, applied


From rdreier at cisco.com  Mon Dec 17 21:58:17 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 17 Dec 2007 21:58:17 -0800
Subject: [ofa-general] [RFC PATCH] IB/srp: respect target credit limit
In-Reply-To: <1197929000.31600.19.camel@lap75545.ornl.gov> (David Dillow's
	message of "Mon, 17 Dec 2007 17:03:19 -0500")
References: <1197929000.31600.19.camel@lap75545.ornl.gov>
Message-ID: <adar6hk35om.fsf@cisco.com>

 > The SRP initiator will currently send requests, even if we have no
 > credits available. The results of sending extra requests is
 > vendor-specific, but on some devices, overrunning our credits will cost
 > us 85% of peak performance -- e.g. 100 MB/s vs 720 MB/s. Other devices
 > may just drop them.

I guess this happens because a target sometimes completes commands
with a value of 0 for the req_lim delta in the response?  Anyway, I
didn't realize this was a real problem in practice, but given that it
is, we definitely want to fix it.  Your patch looks like the right idea
overall, but a few comments:

 > Also, it adds a queue length parameter to the target options, so that
 > the maximum number of commands the SCSI mid-layer will try to send can
 > be capped before it hits the credit limit.

Is there anything sensible someone can do with this new parameter?

 > +/* Must be called with target->scsi_host->host_lock held to protect
 > + * req_lim. Lock cannot be dropped between here and srp_use_credit().
 > + */
 > +static int srp_check_credit(struct srp_target_port *target,
 > +				enum srp_request_type req_type)
 > +{
 > +	/* Each request requires one credit, and we hold a credit in reserve
 > +	 * so that we can send task management requests.
 > +	 */
 > +	s32 min = (req_type == SRP_REQ_TASK_MGMT) ? 1 : 2;
 > +	return target->req_lim < min;
 > +}

It seems that things would be simpler if this test of credits went
into __srp_get_tx_iu() -- just have it return NULL if no credits are
available instead of bumping zero_req_lim.  And I guess add a
parameter to know whether the IU is for a task management command or
not.

 > +/* Must be called with target->scsi_host->host_lock held to protect
 > + * req_lim. Lock cannot be dropped between the call to srp_check_credit()
 > + * and the call to srp_use_credit()
 > + */
 > +static void srp_use_credit(struct srp_target_port *target)
 > +{
 > +	target->req_lim--;
 > +}

 > @@ -965,10 +987,8 @@ static int __srp_post_send(struct srp_target_port *target,
 > +	if (!ret)
 >  		++target->tx_head;
 > -		--target->req_lim;
 > -	}

Similarly I don't see anything gained aside from an opportunity for
incorrect code by moving the decrement of req_lim into its own
function that every caller of __srp_post_send() has to remember to
call right after __srp_post_send().
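
To make the two comments above concrete, here is a rough, self-contained
sketch of the shape being suggested: the credit test lives in the function
that hands out a TX IU, and the credit is consumed inside the post function
itself. All names here (struct target, get_tx_iu(), post_send(), the ring
size) are simplified stand-ins, not the actual ib_srp code, and the real
post would call into the verbs layer rather than always succeed.

```c
#include <stddef.h>

enum req_type { REQ_NORMAL, REQ_TASK_MGMT };	/* hypothetical names */

struct tx_iu {
	int in_use;
};

struct target {
	int req_lim;		/* credits currently granted by the target */
	unsigned int tx_head;
	struct tx_iu ring[4];
};

/*
 * Hand out an IU only if enough credits remain. Normal requests need two
 * credits available so that one is always held back for task management.
 */
static struct tx_iu *get_tx_iu(struct target *t, enum req_type type)
{
	int min = (type == REQ_TASK_MGMT) ? 1 : 2;

	if (t->req_lim < min)
		return NULL;
	return &t->ring[t->tx_head % 4];
}

/*
 * Post the IU and consume the credit in one place, so callers cannot
 * forget the decrement. ret stands in for the result of the real post.
 */
static int post_send(struct target *t, struct tx_iu *iu)
{
	int ret = 0;

	iu->in_use = 1;
	if (!ret) {
		++t->tx_head;
		--t->req_lim;
	}
	return ret;
}
```

In this shape a caller that gets NULL back simply reports the host as busy,
and there is no separate check/use pair that has to stay under the same lock
hold by convention.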

 > -			target->scsi_host->can_queue = min(target->req_lim,
 > -							   target->scsi_host->can_queue);

Why do we want to delete this?  It seems that for most sane targets,
the initial request limit in the login response is a good
approximation to what we should limit our queue depth to.

 > +	.max_host_blocked		= 1,

Documentation of what max_host_blocked does is pretty scant.  What
does this change do?

 - R.


From dotanb at dev.mellanox.co.il  Mon Dec 17 22:05:56 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Tue, 18 Dec 2007 08:05:56 +0200
Subject: [ofa-general] Re: the mckey example has internal race
In-Reply-To: <000101c840d4$233aa730$9b37170a@amr.corp.intel.com>
References: <47665BB8.5030805@dev.mellanox.co.il>
	<000101c840d4$233aa730$9b37170a@amr.corp.intel.com>
Message-ID: <47676344.1080603@dev.mellanox.co.il>

Sean Hefty wrote:
>> It seems that the server receives only some of the messages that the
>> client sends, because there isn't any sync between the sides: the
>> client posts SRs before all of the RRs have been posted on the server,
>> and some of the messages received by the server are (silently) dropped.
>>     
>
> This was intended as a simple test app only.  No synchronization was done
> between the client and server, and the multicast messages are not acked.
>
>   
 From my point of view, the examples in the libraries should be full-flow 
(mini) applications that give other users a good reference on how to use 
the library...

thanks
Dotan


From ogerlitz at voltaire.com  Tue Dec 18 00:17:07 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Tue, 18 Dec 2007 10:17:07 +0200 (IST)
Subject: [ofa-general] creating QP with zero wr and/or null CQ
Message-ID: <Pine.LNX.4.64.0712181004360.28805@zuben.voltaire.com>

Roland,

Calling ibv_create_qp() to create an RC QP with both max_send_wr and
max_recv_wr set to zero fails; however, if only one of them is set to
zero, it works fine.

Also, if either recv_cq or send_cq is NULL, the program crashes, and
the crash seems to come from these lines in ibv_cmd_create_qp():

	cmd->send_cq_handle  = attr->send_cq->handle;
	cmd->recv_cq_handle  = attr->recv_cq->handle;

So, are these limitations originating from the spec or from the implementation?
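
Whichever way the spec question resolves, the crash itself could be turned
into an error return by checking the pointers before they are dereferenced.
A minimal sketch with stand-in types follows; struct cq, struct
qp_init_attr, struct create_qp_cmd and fill_cq_handles() are hypothetical
and only mirror the two assignments quoted above. This is not what
libibverbs currently does.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins (assumptions) for the verbs structures in the two lines above. */
struct cq {
	uint32_t handle;
};

struct qp_init_attr {
	struct cq *send_cq;
	struct cq *recv_cq;
};

struct create_qp_cmd {
	uint32_t send_cq_handle;
	uint32_t recv_cq_handle;
};

/*
 * Copy the CQ handles into the command, refusing NULL pointers instead of
 * dereferencing them. Returns 0 on success or EINVAL on a missing CQ.
 */
static int fill_cq_handles(struct create_qp_cmd *cmd,
			   const struct qp_init_attr *attr)
{
	if (attr == NULL || attr->send_cq == NULL || attr->recv_cq == NULL)
		return EINVAL;

	cmd->send_cq_handle = attr->send_cq->handle;
	cmd->recv_cq_handle = attr->recv_cq->handle;
	return 0;
}
```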

Or.


From ogerlitz at voltaire.com  Tue Dec 18 00:53:36 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Tue, 18 Dec 2007 10:53:36 +0200
Subject: [ofa-general] Re: some questions on stale connection handling at the
	IB CM
In-Reply-To: <000301c840e7$201af600$9b37170a@amr.corp.intel.com>
References: <Pine.LNX.4.64.0712171638420.28805@zuben.voltaire.com>
	<000301c840e7$201af600$9b37170a@amr.corp.intel.com>
Message-ID: <47678A90.8060804@voltaire.com>

Sean Hefty wrote:
> IB_CM_REJ_STALE_CONN is sent in the following situations:
> * Remote ID in REQ matches a connection that is in timewait.  This is treated as
> a duplicate REQ that was processed after the connection had been terminated.
> * Remote QPN in REQ or REP matches an existing connection, and REQ/REP was not
> detected as a duplicate.
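
The two cases listed above can be sketched as a small decision function.
This is only an illustration of the rule as described in the quoted text:
struct conn, the state names and req_is_stale() are hypothetical, and a
real CM would also scope the match to the remote node rather than compare
bare IDs and QPNs.

```c
#include <stddef.h>
#include <stdint.h>

enum conn_state { CONN_ESTABLISHED, CONN_TIMEWAIT };	/* hypothetical */

struct conn {
	uint32_t remote_id;	/* remote communication ID */
	uint32_t remote_qpn;	/* remote QP number */
	enum conn_state state;
};

/*
 * Returns 1 if an incoming message should be answered with a
 * stale-connection reject. is_duplicate says whether the message was
 * already recognized as a retransmission of a known REQ/REP.
 */
static int req_is_stale(const struct conn *conns, size_t n,
			uint32_t remote_id, uint32_t remote_qpn,
			int is_duplicate)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Case 1: remote ID matches a connection in timewait. */
		if (conns[i].state == CONN_TIMEWAIT &&
		    conns[i].remote_id == remote_id)
			return 1;
		/* Case 2: remote QPN already in use, and not a duplicate. */
		if (conns[i].remote_qpn == remote_qpn && !is_duplicate)
			return 1;
	}
	return 0;
}
```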

OK, thanks for the clarification.

>> On the other side, when the CM receives a reject message with that reason, the
>> local handle (id) is moved to the timewait state, where my understanding is that
>> it will sit there for a while and then a reject/stale-connection callback will be 
>> delivered to the user, the id will be removed.

> correct

I don't see what the user can do when the CM detects a remote QPN 
match: if they continue to use the same QPN, this will repeat in an 
endless loop, correct?

> This is missing.  But neither the DREQ or DREP that are generated in this case
> drive the state machines.  Both messages are simply generated and then consumed
> by the CM.  (I don't even think it's clear if the local and remote IDs in the
> DREQ/DREP are relative to the stale connection, or the new connection
> request/reply.)

I agree that it's quite unclear from the spec whether the IDs to be used 
in the DREQ are those of the new connection or the stale one. 
Specifically, those of the stale connection might no longer exist in the 
CM that gets the DREQ, which would then just drop it, so there's no real 
gain in implementing this.

> Correct - keep-alive messages are still needed by apps to know if their
> connections are still valid.  IMO, stale connection detection becomes less
> useful as the number of systems being connected to increase.

Is there anything the IB stack can do here to make app coding simpler? 
In the past I suggested having the IB CM register for "GID out" inform 
info reports to catch remote ports going down. Thinking about it again, 
though, when a port goes down an RC QP pair doesn't, unless there was 
in-flight data, so a disconnect event delivered by the CM might be a 
false alarm. This registration would also put load on the SA, so it does 
not scale well, unless we make it a CM feature that users enable on 
target nodes and not on initiators...

Or.


From ogerlitz at voltaire.com  Tue Dec 18 00:59:17 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Tue, 18 Dec 2007 10:59:17 +0200 (IST)
Subject: [ofa-general] IPOIB_FLAG_UMCAST bit default value
Message-ID: <Pine.LNX.4.64.0712181053550.17439@zuben.voltaire.com>

Roland,

I was asked to check with you: what's your take on having the
IPOIB_FLAG_UMCAST bit set by default? The reasoning is that if there is
no rdma-cm user space multicast activity on the node for the ipoib
group, the flag has no influence; if there is such activity, the
driver's behavior with the flag set is typically what the user wants,
and where that is not the case, they can turn it off through sysfs.

Or.


From moshek at voltaire.com  Tue Dec 18 01:54:07 2007
From: moshek at voltaire.com (Moshe Kazir)
Date: Tue, 18 Dec 2007 11:54:07 +0200
Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to select
	sp4 patches for SLES9 kernel with minor versions equal or
	greater than 305
In-Reply-To: <47670337.6080607@lnxi.com>
Message-ID: <39C75744D164D948A170E9792AF8E7CA4D2CF2@exil.voltaire.com>

Did you try changing the configure script to match 2.6.5-7.305 and above?

Moshe

____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

  
-----Original Message-----
From: David B. Anderson [mailto:danderson at lnxi.com] 
Sent: Tuesday, December 18, 2007 1:16 AM
To: Moshe Kazir
Cc: general at lists.openfabrics.org; vlad at mellanox.co.il
Subject: Re: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to
select sp4 patches for SLES9 kernel with minor versions equal or greater
than 305


I have all of these patches, plus the following patch:

    kernel_patches/backport/2.6.5_sles9_sp4/cxgb3_remove_eeh.patch

My current git repo is git://git.openfabrics.org/ofed_1_2/linux-2.6.git
commit: 6974c285e6fb06264f570f9cf919865bab66c9e6

My patch that I posted before fixes the kernel configure script so that 
it applies 2.6.5_sles9_sp4 patches for the SP4 release kernel of 
2.6.5-7.308 and above. The configure patch from 
OFED_1.2.5_sles9_sp4_configure.diff has 2.6.5-7.305* as the only valid 
SP4 kernel, which is incorrect. I get the same compiler error as before.
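
The difference between matching 2.6.5-7.305* exactly and matching "305 and
above" comes down to a numeric comparison on the last component of the
release string. The configure script itself is shell; the C fragment below
is only a sketch of that comparison, with is_sles9_sp4_kernel() and the
min_build parameter being hypothetical names.

```c
#include <stdio.h>

/*
 * Returns 1 if release looks like "2.6.5-7.N" with N >= min_build, and 0
 * for anything else (including other kernel series). Any suffix after N,
 * such as a flavour name, is ignored.
 */
static int is_sles9_sp4_kernel(const char *release, int min_build)
{
	int build = 0;

	if (sscanf(release, "2.6.5-7.%d", &build) != 1)
		return 0;
	return build >= min_build;
}
```

With min_build set to 305, both 2.6.5-7.305 and the 2.6.5-7.308 kernel
mentioned in this thread would select the SP4 backport patches.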

 
Moshe Kazir wrote:
>  See patches in the attached message.
>
> It was applied by Vlad.
>
> Moshe
>
> ____________________________________________________________
> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>  
> Voltaire - The Grid Backbone
>  
>  www.voltaire.com
>
>   
>
>
> -----Original Message-----
> From: general-bounces at lists.openfabrics.org
> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of David B. 
> Anderson
> Sent: Saturday, December 15, 2007 3:31 AM
> To: general at lists.openfabrics.org; vlad at mellanox.co.il
> Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to 
> select sp4 patches for SLES9 kernel with minor versions equal or 
> greater than 305
>
>
> Hi,
>
> I've created the following patch for OFED 1.2.5.4 to have the kernel 
> for SLES9 SP4 recognized (2.6.5-7.308).
>
> Even with the patch I then had two back port patches not apply cleanly
> (cxg3_to_2_6_20.patch, rds_to_2_6_9.patch). I hand patched them but
> now I'm getting the following compiler errors:
>
> In file included from /usr/src/linux-2.6.5-7.308/include/linux/module.h:10,
>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/module.h:4,
>                  from /usr/src/linux-2.6.5-7.308/include/linux/device.h:21,
>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:4,
>                  from /usr/src/linux-2.6.5-7.308/include/linux/netdevice.h:38,
>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/netdevice.h:4,
>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:32:
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/sched.h:8: warning: static declaration for `wait_for_completion_timeout' follows non-static
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:67: warning: initialization from incompatible pointer type
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c: In function `addr_resolve_remote':
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:192: error: structure has no member named `idev'
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:193: error: structure has no member named `idev'
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:197: error: structure has no member named `idev'
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c: At top level:
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:48: warning: `class_create' defined but not used
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:82: warning: `class_destroy' defined but not used
> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:108: warning: `class_device_create' defined but not used
> make[6]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.o] Error 1
> make[5]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core] Error 2
> make[4]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband] Error 2
> make[3]: *** [_module_/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default] Error 2
> make[2]: *** [modules] Error 2
> make[1]: *** [modules] Error 2
> make[1]: Leaving directory `/usr/src/linux-2.6.5-7.308-obj/x86_64/default'
> make: *** [kernel] Error 2
>
> Does anyone have OFED 1.2.5.4 building for SLES 9 SP4?
>
> Thanks
>
>   
>
> ------------------------------------------------------------------------
>
> Subject:
> [ofa-general] OFED-1.2.5 backport patches for SLES9 SP4
> From:
> "Moshe Kazir" <moshek at voltaire.com>
> Date:
> Sun, 25 Nov 2007 09:59:26 +0200
> To:
> "Vladimir Sokolovsky" <vlad at mellanox.co.il>,
> <general at lists.openfabrics.org>
>
>
> The attached files do the work.
>
> OFED_1.2.5_sles9_sp4_configure.diff includes the changes to the 
> configure file. OFED_1.2.5_sles9_sp4_backport.diff includes the changes
> required in the kernel_patches and kernel_addons directories.
>
> Moshe ____________________________________________________________
> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>  
> Voltaire - The Grid Backbone
>  
>  www.voltaire.com
>
>   
>
>   


-- 
David B. Anderson 
Linux Networx
Sr. Software Engineer
Email: danderson at lnxi.com
Phone: (801) 649-1311


From vlad at dev.mellanox.co.il  Tue Dec 18 02:13:14 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Tue, 18 Dec 2007 12:13:14 +0200
Subject: [ofa-general] Re: [PATCH 3 of 5] libcxgb3: zero context struct at
 allocation time (prep for additional context ops)
In-Reply-To: <4766D97C.7000205@opengridcomputing.com>
References: <200712171019.33326.jackm@dev.mellanox.co.il>
	<4766D97C.7000205@opengridcomputing.com>
Message-ID: <47679D3A.8060202@dev.mellanox.co.il>

Steve Wise wrote:
> Applied. Thanks.
> 
> I've released version 1.1.1 of the library, and updated the ofed_1_3 
> branch.
> 
> Vlad, can you pull version 1.1.1 for ofed-1.3?
> 
> git://git.openfabrics.org/~swise/libcxgb3.git ofed_1_3
> 
> 
> Thanks,
> 
> Steve.
> 

Done,

Regards,
Vladimir


From ogerlitz at voltaire.com  Tue Dec 18 02:37:01 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Tue, 18 Dec 2007 12:37:01 +0200
Subject: [ofa-general] peer to peer connections support
Message-ID: <4767A2CD.8030209@voltaire.com>

Hi Sean,

I wonder what it will take to implement peer to peer connection 
establishment support in the IB CM (and RDMA-CM, of course). Also, do you 
expect any API changes to be needed in order to support that?

Reading through section 12.10.4, I am not sure I fully understand the 
following: "The ServiceID implicitly defines whether the service is 
client/server or peer to peer, but the application must inform the CM so 
that the CM will handle the inbound REQ correctly."

Such support would be useful in symmetric schemes such as MPIs that open 
connections on demand, and in other applications where each party can both 
accept and initiate connections. For example, I understand that some 
work is being done now in the Open MPI community to use the rdma-cm as a 
possible channel for connection establishment.

Or.


From vlad at lists.openfabrics.org  Tue Dec 18 03:11:14 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Tue, 18 Dec 2007 03:11:14 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071218-0200 daily build status
Message-ID: <20071218111114.1FB14E61C05@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.15
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on ia64 with linux-2.6.13
Passed on x86_64 with linux-2.6.19
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.15
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.14
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ppc64 with linux-2.6.18-8.el5

Failed:


From vlad at dev.mellanox.co.il  Tue Dec 18 03:25:14 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Tue, 18 Dec 2007 13:25:14 +0200
Subject: [ofa-general] [ANNOUNCE] ofed_1_3/linux-2.6.git updated to
	2.6.24-rc5
Message-ID: <4767AE1A.7020808@dev.mellanox.co.il>

FYI,

git://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel

I've merged in 2.6.24-rc5.

Regards,
Vladimir


From kliteyn at dev.mellanox.co.il  Tue Dec 18 05:12:15 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 18 Dec 2007 15:12:15 +0200
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: trivial fix in log
	message
Message-ID: <4767C72F.9090303@dev.mellanox.co.il>


Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_pkey_mgr.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c
index eb6cf54..dd1f49a 100644
--- a/opensm/opensm/osm_pkey_mgr.c
+++ b/opensm/opensm/osm_pkey_mgr.c
@@ -355,7 +355,7 @@ static boolean_t pkey_mgr_update_port(osm_log_t * p_log,
 					"pkey_mgr_update_port: ERR 0505: "
 					"Failed to set PKey 0x%04x in block %u idx %u "
 					"for node 0x%016" PRIx64 " port %u\n",
-					p_pending->pkey, block_index,
+					cl_ntoh16(p_pending->pkey), block_index,
 					pkey_index,
 					cl_ntoh64(osm_node_get_node_guid
 						  (p_node)),
-- 
1.5.1.4
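
For context on why the conversion matters in a log message: the change
suggests that p_pending->pkey is kept in network byte order, so printing it
unconverted on a little-endian host shows the bytes swapped. A minimal
sketch using the POSIX ntohs()/htons() pair follows, assuming cl_ntoh16()
behaves like ntohs(); format_pkey() is a hypothetical helper, not OpenSM
code.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a PKey that is stored in network byte order, converting it to
 * host order first so the printed value is the same on every host.
 * Returns the number of characters snprintf() would have written.
 */
static int format_pkey(char *buf, size_t len, uint16_t pkey_net)
{
	return snprintf(buf, len, "PKey 0x%04x",
			(unsigned int)ntohs(pkey_net));
}
```

Called with htons(0x7fff), this yields "PKey 0x7fff" regardless of host
endianness, whereas passing the raw network-order value straight to the
format string would print 0xff7f on a little-endian machine.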


From vlad at dev.mellanox.co.il  Tue Dec 18 05:25:20 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Tue, 18 Dec 2007 15:25:20 +0200
Subject: [ofa-general] Re: [ewg] [GIT PULL ofed-1.2.5 and ofed-1.3] cxgb3
	fixes
In-Reply-To: <47674E2F.9000502@opengridcomputing.com>
References: <47674E2F.9000502@opengridcomputing.com>
Message-ID: <4767CA40.6080504@dev.mellanox.co.il>

Steve Wise wrote:
> Vlad,
> 
> Please pull 3 new cxgb3 driver fixes + backport support for ofed-1.2.5 
> and ofed-1.3.
> The 3 patches have been submitted and merged upstream.
> 
> First two patches are submitted here:
> 
> http://www.spinics.net/lists/kernel/msg659899.html
> 
> And the third here:
> 
> http://www.spinics.net/lists/kernel/msg660541.html
> 
> For ofed-1.2.5, please pull from:
> 
> git://git.openfabrics.org/~swise/ofed-1.2.5.git ofed_1_2_c
> 
> For ofed-1.3, please pull from:
> 
> git://git.openfabrics.org/~swise/ofed-1.3.git ofed_kernel
> 
> Thanks,
> 
> Steve.
> 

Done.

Note:
1. In ofed_1_3/linux-2.6.git ofed_kernel I removed cxgb3_0200_T3C_support_update.patch - it was applied in 2.6.24-rc5 kernel.
2. cxgb3_0100_napi.patch was updated under kernel_patches/backport/2.6.22_suse10_3

Regards,
Vladimir


From erezz at voltaire.com  Tue Dec 18 05:52:54 2007
From: erezz at voltaire.com (Erez Zilber)
Date: Tue, 18 Dec 2007 15:52:54 +0200
Subject: [ofa-general] [PATCH] IB/iser: update url of iSER docs
Message-ID: <4767D0B6.1030708@voltaire.com>

The url for iSER docs in Kconfig has changed.

Signed-off-by: Erez Zilber <erezz at voltaire.com>
---
 drivers/infiniband/ulp/iser/Kconfig |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/Kconfig b/drivers/infiniband/ulp/iser/Kconfig
index fe604c8..77dedba 100644
--- a/drivers/infiniband/ulp/iser/Kconfig
+++ b/drivers/infiniband/ulp/iser/Kconfig
@@ -8,5 +8,5 @@ config INFINIBAND_ISER
           that speak iSCSI over iSER over InfiniBand.
 
 	  The iSER protocol is defined by IETF.
-	  See <http://www.ietf.org/internet-drafts/draft-ietf-ips-iser-05.txt>
-	  and <http://www.infinibandta.org/members/spec/iser_annex_060418.pdf>
+	  See <http://www.ietf.org/rfc/rfc5046.txt>
+	  and <http://www.infinibandta.org/members/spec/Annex_iSER.PDF>
-- 
1.5.3.7

Of course, this patch is not urgent and should go to 2.6.25.

Thanks,
Erez


From Arkady.Kanevsky at netapp.com  Tue Dec 18 06:54:10 2007
From: Arkady.Kanevsky at netapp.com (Kanevsky, Arkady)
Date: Tue, 18 Dec 2007 09:54:10 -0500
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <4767A2CD.8030209@voltaire.com>
References: <4767A2CD.8030209@voltaire.com>
Message-ID: <C98692FD98048C41885E0B0FACD9DFB805AD04AF@exnane01.hq.netapp.com>

I think we need at least a timeout parameter for accept.
One way to handle it is to include it in rdma_conn_param.
That way it can be used on both sides without changing the function
signatures.
But you would still need to recompile.
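
A rough sketch of that idea, under loud assumptions: the struct below is a made-up stand-in, its field names are illustrative rather than copied from librdmacm, and the timeout field, its unit and the default value are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DEFAULT_TIMEOUT_MS 5000u	/* arbitrary value for the sketch */

/* Hypothetical connection-parameter struct with the proposed field added. */
struct demo_conn_param {
	const void *private_data;
	uint8_t private_data_len;
	uint8_t retry_count;
	uint32_t timeout_ms;	/* hypothetical: 0 means "use the default" */
};

/* Entry points keep taking a pointer to the struct, so their signatures
 * stay the same; only the struct definition grows a member. */
static uint32_t demo_effective_timeout(const struct demo_conn_param *p)
{
	assert(p != NULL);
	return p->timeout_ms ? p->timeout_ms : DEMO_DEFAULT_TIMEOUT_MS;
}
```

Code built against the older struct definition would never set the new member, which is one way to read the remark that a recompile is still needed.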

Thanks,

Arkady Kanevsky                       email: arkady at netapp.com
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.        Fax: 781-895-1195
Waltham, MA 02451                   central phone: 781-768-5300
 

> -----Original Message-----
> From: Or Gerlitz [mailto:ogerlitz at voltaire.com] 
> Sent: Tuesday, December 18, 2007 5:37 AM
> To: Sean Hefty; OpenFabrics General; Jeff Squyres (jsquyres); 
> Jon Mason
> Subject: [ofa-general] peer to peer connections support
> 
> Hi Sean,
> 
> I wonder what will it take to implement peer to peer 
> connection establishment support in the IB CM (and RDMA-CM, 
> ofcourse). Also do you expect any API changes needed in order 
> to support that?
> 
> Reading through section 12.10.4, I was not sure to fully 
> understand the
> following: "The ServiceID implicitly defines whether the 
> service is client/server or peer to peer, but the application 
> must inform the CM so that the CM will handle the inbound REQ 
> correctly."
> 
> Such support would be useful in symmetric schemes such as 
> MPIs that open connections on demand and more applications 
> where each party can both accept and initiate connections. 
> For example, I understand that some work is done now at the 
> open mpi community to use the rdma-cm as a possible channel 
> for connection establishment.
> 
> Or.
> 


From dillowda at ornl.gov  Tue Dec 18 07:16:58 2007
From: dillowda at ornl.gov (Dave Dillow)
Date: Tue, 18 Dec 2007 10:16:58 -0500
Subject: [ofa-general] [RFC PATCH] IB/srp: respect target credit limit
In-Reply-To: <adar6hk35om.fsf@cisco.com>
References: <1197929000.31600.19.camel@lap75545.ornl.gov>
	<adar6hk35om.fsf@cisco.com>
Message-ID: <20071218151658.GA9142@ornl.gov>

On Mon, Dec 17, 2007 at 09:58:17PM -0800, Roland Dreier wrote:
>  > The SRP initiator will currently send requests, even if we have no
>  > credits available. The results of sending extra requests is
>  > vendor-specific, but on some devices, overrunning our credits will cost
>  > us 85% of peak performance -- e.g. 100 MB/s vs 720 MB/s. Other devices
>  > may just drop them.
> 
> I guess this happens because a target sometimes completes commands
> with a value of 0 for req_lim delta in the response?

I didn't instrument the delta in the responses, I just noticed that
req_lim_zero was growing very, very quickly when the performance dropped.
I added a sysfs entry to give a view of req_lim, and I saw it get to -3
several times, so I expect we're actually getting some negative deltas back.

That may make some sense, as I was hitting this with 1 MB and 4 MB I/O sizes,
so perhaps the target was using this to throttle the initiator. The spec
mentions a limit on how negative the delta can be, so it seems to imply that
this is OK, but it seems like it complicates the target's management of
buffers -- we could send a request thinking we have 1 credit just before we
process a response that takes 3 away...

I can instrument the deltas tomorrow if you like, when I'm back in the office.

>  > Also, it adds a queue length parameter to the target options, so that
>  > the maximum number of commands the SCSI mid-layer will try to send can
>  > be capped before it hits the credit limit.
> 
> Is there anything sensible someone can do with this new parameter?

If you know the request limit for the SRP target, you can set it here and
(hopefully) avoid invoking the command requeuing. I say hopefully, as the
initial delta is not always the queue depth, and there is the potential for
negative deltas.

It also has some potential use for balancing, I suppose.

Honestly, I hadn't thought much about it, but it seemed a good complement
to the per-lun limit.

> It seems that things would be simpler if this test of credits went
> into __srp_get_tx_iu() -- just have it return NULL if no credits are
> available instead of bumping zero_req_lim.  And I guess add a
> parameter to know whether the IU is for a task management command or
> not.
[snip]
> Similarly I don't see anything gained aside from an opportunity for
> incorrect code by moving the decrement of req_lim into its own
> function that every caller of __srp_post_send() has to remember to
> call right after __srp_post_send().

They're a hold-over from when I thought I was going to have to add more
code around them to stop the request queue, but I found a simpler way.

I'll merge them into existing code for the next iteration.
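
A sketch of what the merged version might look like, following Roland's two suggestions quoted above. Everything here is hypothetical: the demo_ names, the single-IU "pool", and in particular the idea that the task-management flag is used to hold one credit in reserve, which the thread does not spell out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TSK_MGMT_RSV 1	/* assumed: credits held back for task management */

struct demo_iu {
	int in_use;
};

struct demo_srp_target {
	int req_lim;		/* remaining credits */
	struct demo_iu iu;	/* a single IU keeps the sketch small */
};

/* Return an IU only if a credit is available for this kind of request. */
static struct demo_iu *demo_get_tx_iu(struct demo_srp_target *t, int is_tsk_mgmt)
{
	int rsv = is_tsk_mgmt ? 0 : DEMO_TSK_MGMT_RSV;

	assert(t != NULL);
	if (t->req_lim <= rsv)
		return NULL;
	return &t->iu;
}

/* Posting consumes the credit itself, so callers cannot forget to do it. */
static void demo_post_send(struct demo_srp_target *t, struct demo_iu *iu)
{
	assert(t != NULL && iu != NULL);
	iu->in_use = 1;
	t->req_lim--;
}
```

With one credit left, demo_get_tx_iu(&t, 0) returns NULL while demo_get_tx_iu(&t, 1) still succeeds, and the decrement happens inside demo_post_send() rather than in a separate helper.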

>  > -			target->scsi_host->can_queue = min(target->req_lim,
>  > -							   target->scsi_host->can_queue);
> 
> Why do we want to delete this?  It seems that for most sane targets,
> the initial request limit in the login response is a good
> approximation to what we should limit our queue depth to.

The command line interface for one of our arrays suggests that the limit
is 32 cmds for the host, but the initial deltas received ranged from 29 to 33.
Further deltas would often push it back to 32 or above pretty quickly. Now,
losing the ability to queue a few more commands may not affect performance
all that much, but I'd rather not restrict it if possible.

>  > +	.max_host_blocked		= 1,
> 
> Documentation of what max_host_blocked does is pretty scant.  What
> does this change do?

This controls how long the mid-layer waits until unblocking the request
queue for this target when we return SCSI_MLQUEUE_HOST_BUSY from
srp_queuecommand(). The default is 7. I had thought that this was the
number of commands that needed to complete before it would automatically
unblock the queue and try again, but looking at it some more to give you
a better explanation, it appears that this will actually drain the
queue of all outstanding requests before it starts decrementing this
for each new request submitted to the mid-layer.

Ugh, that's a big stall. :( At least it only costs a small amount
compared to overrunning the credits.

I was avoiding the use of scsi_block_requests(), as we cannot hold the
host_lock when unblocking with scsi_unblock_requests() since it runs the
queue again. There may be some games to play with can_queue, but I'm not
keen on that. Same goes for just zero-ing out scsi_host->host_blocked when
we have positive credits, as scsi_end_request() will cause a queue run once
we complete a command. That may have some merit, though I'm still not happy
about the internal SCSI knowledge.

Again, for tomorrow.
Dave


From sashak at voltaire.com  Tue Dec 18 07:40:33 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 18 Dec 2007 15:40:33 +0000
Subject: [ofa-general] Re: [PATCH] opensm: osm_state_mgr.c - stop idle queue
	processing if heavy sweep requested
In-Reply-To: <47667AB2.8030500@dev.mellanox.co.il>
References: <47667AB2.8030500@dev.mellanox.co.il>
Message-ID: <20071218154033.GA4232@sashak.voltaire.com>

Hi Yevgeny,

On 15:33 Mon 17 Dec     , Yevgeny Kliteynik wrote:
> If a heavy sweep requested during idle queue processing, OSM continues
> to process it till the end and only then notices the heavy sweep request.
> In some cases this might leave a topology change unhandled for several
> minutes.

Could you provide more details about such cases?

As far as I know the idle queue is used only for multicast re-routing.
If so, it would be interesting to know why it takes minutes, and where.
Is there an MCG join/leave storm? Or does a single re-routing cycle take
minutes?

Sasha

> 
> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> ---
>  opensm/opensm/osm_state_mgr.c |   31 ++++++++++++++++++++++++-------
>  1 files changed, 24 insertions(+), 7 deletions(-)
> 
> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
> index 5c39f11..6ee5ee6 100644
> --- a/opensm/opensm/osm_state_mgr.c
> +++ b/opensm/opensm/osm_state_mgr.c
> @@ -1607,13 +1607,30 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>  				/* CALL the done function */
>  				__process_idle_time_queue_done(p_mgr);
> 
> -				/*
> -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
> -				 * so that the next element in the queue gets processed
> -				 */
> -
> -				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
> -				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
> +				if (p_mgr->p_subn->force_immediate_heavy_sweep) {
> +					/*
> +					 * Do not read next item from the idle queue.
> +					 * Immediate heavy sweep is requested, so it's
> +					 * more important.
> +					 * Besides, there is a chance that after the
> +					 * heavy sweep completion, idle queue processing
> +					 * that SM would have performed here will be obsolete.
> +					 */
> +					if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG))
> +						osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> +						"osm_state_mgr_process: "
> +						"interrupting idle time queue processing - heavy sweep requested\n");
> +					signal = OSM_SIGNAL_NONE;
> +					p_mgr->state = OSM_SM_STATE_IDLE;
> +				}
> +				else {
> +					/*
> +					 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
> +					 * so that the next element in the queue gets processed
> +					 */
> +					signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
> +					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
> +				}
>  				break;
> 
>  			default:
> -- 
> 1.5.1.4
> 


From sashak at voltaire.com  Tue Dec 18 08:21:04 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 18 Dec 2007 16:21:04 +0000
Subject: [ofa-general] Re: [PATCH] opensm/osm_pkey_mgr.c: trivial fix in log
	message
In-Reply-To: <4767C72F.9090303@dev.mellanox.co.il>
References: <4767C72F.9090303@dev.mellanox.co.il>
Message-ID: <20071218162104.GD4232@sashak.voltaire.com>

On 15:12 Tue 18 Dec     , Yevgeny Kliteynik wrote:
> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied. Thanks.

Sasha


From orenk at dev.mellanox.co.il  Tue Dec 18 08:14:26 2007
From: orenk at dev.mellanox.co.il (Oren Kladnitsky)
Date: Tue, 18 Dec 2007 18:14:26 +0200
Subject: FW: [Fwd: [ofa-general] [PATCH] mstflint: Convert project to
	autoconf tools.]
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C902E86938@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C902E86938@mtlexch01.mtl.com>
Message-ID: <85a3349f0712180814q10a7c60ew4fdc7caa6b4420ba@mail.gmail.com>

 I applied this patch + added mstmcra tool (will replace mread and mwrite).

Vlad - Please change installer to use autoconf method and take spec from
this dir.

Thanks,
ORen


>
> ---------- Forwarded message ----------
> From: "Ira Weiny" <weiny2 at llnl.gov>
> To: "openfabrics" <general at lists.openfabrics.org>
> Date: Mon, 10 Dec 2007 23:35:54 +0200
> Subject: [ofa-general] [PATCH] mstflint: Convert project to autoconf
> tools.
> This patch removes the makefile and converts the mstflint git tree over to
> autoconf tools.  This works great on x86_64 but has not been tested on
> other arch's.  (Although it is simple enough that I don't see how it would
> not work.)
>
> Thanks,
> Ira
>
>
> >From efb3a07a1f333ea95204d2a2e9462e285e29a65f Mon Sep 17 00:00:00 2001
> From: Ira K. Weiny <weiny2 at llnl.gov>
> Date: Mon, 10 Dec 2007 13:30:22 -0800
> Subject: [PATCH] Convert project to autoconf tools.
>
>
> Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
> ---
> Makefile         |   47 -----------------------------------------------
> Makefile.am      |   21 +++++++++++++++++++++
> autogen.sh       |   11 +++++++++++
> configure.in     |   22 ++++++++++++++++++++++
> mstflint.spec.in |   45 +++++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 99 insertions(+), 47 deletions(-)
> delete mode 100644 Makefile
> create mode 100644 Makefile.am
> create mode 100755 autogen.sh
> create mode 100644 configure.in
> create mode 100644 mstflint.spec.in
>
> diff --git a/Makefile b/Makefile
> deleted file mode 100644
> index 889c97a..0000000
> --- a/Makefile
> +++ /dev/null
> @@ -1,47 +0,0 @@
> -#default options
> -CFLAGS += -O2
> -CFLAGS += -g
> -CFLAGS += -Wall
> -CXXFLAGS += -fno-exceptions
> -CFLAGS += -I.
> -LD=$(CXX)
> -EXTRA_LOADLIBES=-lz
> -LOADLIBES+=${EXTRA_LOADLIBES}
> -
> -all: default
> -bin: mstflint mstmread mstmwrite mstregdump mstvpd
> -
> -default: bin
> -static: bin
> -shared: bin
> -
> -.PHONY: all bin clean static shared default
> -.DELETE_ON_ERROR:
> -
> -default: EXTRA_LOADLIBES="$(shell $(CXX) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS}
> -print-file-name=libz.a)" "$(shell $(CXX)  ${LDFLAGS} ${CFLAGS}
> ${CXXFLAGS} -print-file-name=libstdc++.a)"
> -default: LD=$(CC)
> -static: CFLAGS+=-static
> -
> -mstflint: mstflint.o mflash.o
> -       $(LD) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} mstflint.o mflash.o -o
> mstflint ${LOADLIBES}
> -
> -mstflint.o: flint.cpp mflash.h
> -       $(CXX) ${CFLAGS} ${CXXFLAGS} -c flint.cpp -o mstflint.o
> -
> -mflash.o: mtcr.h mflash.c mflash.h
> -       $(CC) ${CFLAGS} -c mflash.c -o mflash.o
> -
> -mstmwrite: mwrite.c mtcr.h
> -       $(CC) ${CFLAGS} mwrite.c -o mstmwrite
> -
> -mstmread: mread.c mtcr.h
> -       $(CC) ${CFLAGS} mread.c -o mstmread
> -
> -mstregdump: mstdump.c mtcr.h
> -       $(CC) ${CFLAGS} mstdump.c -o mstregdump
> -
> -mstvpd: vpd.c
> -       $(CC) ${CFLAGS} vpd.c -o mstvpd
> -
> -clean:
> -       rm -f mstvpd mstregdump mstflint mstmread mstmwrite mstflint.o
> mflash.o
> diff --git a/Makefile.am b/Makefile.am
> new file mode 100644
> index 0000000..f642d9d
> --- /dev/null
> +++ b/Makefile.am
> @@ -0,0 +1,21 @@
> +bin_PROGRAMS = mstmread \
> +                                       mstmwrite \
> +                                       mstflint \
> +                                       mstregdump \
> +                                       mstvpd
> +
> +mstmread_SOURCES = mread.c mtcr.h
> +
> +mstmwrite_SOURCES = mwrite.c mtcr.h
> +
> +mstflint_SOURCES = flint.cpp mtcr.h mflash.h mflash.c
> +mstflint_LDFLAGS = -lz
> +
> +mstregdump_SOURCES = mstdump.c mtcr.h
> +
> +mstvpd_SOURCES = vpd.c
> +
> +
> +EXTRA_DIST = \
> +       mstflint.spec
> +
> diff --git a/autogen.sh b/autogen.sh
> new file mode 100755
> index 0000000..4827884
> --- /dev/null
> +++ b/autogen.sh
> @@ -0,0 +1,11 @@
> +#! /bin/sh
> +
> +# create config dir if not exist
> +test -d config || mkdir config
> +
> +set -x
> +aclocal -I config
> +libtoolize --force --copy
> +autoheader
> +automake --foreign --add-missing --copy
> +autoconf
> diff --git a/configure.in b/configure.in
> new file mode 100644
> index 0000000..0924d65
> --- /dev/null
> +++ b/configure.in
> @@ -0,0 +1,22 @@
> +dnl Process this file with autoconf to produce a configure script.
> +
> +AC_INIT(mstflint)
> +
> +AC_DEFINE_UNQUOTED([PROJECT], ["mstflint"], [Define the project name.])
> +AC_SUBST([PROJECT])
> +
> +AC_DEFINE_UNQUOTED([VERSION], ["1.3"], [Define the project version.])
> +AC_SUBST([VERSION])
> +
> +AC_CONFIG_AUX_DIR(config)
> +AC_CONFIG_SRCDIR([README])
> +AM_INIT_AUTOMAKE(mstflint, 1.3)
> +
> +dnl Checks for programs
> +AC_PROG_CC
> +AC_PROG_CXX
> +AC_PROG_LIBTOOL
> +AC_CONFIG_HEADERS
> +
> +AC_CONFIG_FILES([Makefile mstflint.spec])
> +AC_OUTPUT
> diff --git a/mstflint.spec.in b/mstflint.spec.in
> new file mode 100644
> index 0000000..b5937be
> --- /dev/null
> +++ b/mstflint.spec.in
> @@ -0,0 +1,45 @@
> +Summary: Mellanox firmware burning application
> +Name: mstflint
> +Version: @VERSION@
> +Release: 1
> +License: GPL/BSD
> +Url: http://openib.org/
> +Group: System Environment/Base
> +BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}
> +Source: mstflint-@VERSION@.tar.gz
> +ExclusiveArch: i386 x86_64 ia64 ppc ppc64
> +BuildRequires: zlib-devel
> +Requires(post): chkconfig
> +
> +%description
> +This package contains a tool for burning updated firmware on to
> +Mellanox manufactured InfiniBand adapters.
> +
> +%prep
> +%setup -q
> +
> +%build
> +%configure
> +make
> +
> +%install
> +rm -rf $RPM_BUILD_ROOT
> +make DESTDIR=${RPM_BUILD_ROOT} install
> +# remove unpackaged files from the buildroot
> +rm -f $RPM_BUILD_ROOT%{_libdir}/*.la
> +
> +%clean
> +rm -rf $RPM_BUILD_ROOT
> +
> +%files
> +%defattr(-,root,root)
> +%{_bindir}/mstmread
> +%{_bindir}/mstmwrite
> +%{_bindir}/mstflint
> +%{_bindir}/mstregdump
> +%{_bindir}/mstvpd
> +
> +%changelog
> +* Fri Dec 07 2007 Ira Weiny <weiny2 at llnl.gov> 1.0.0
> +   initial creation
> +
> --
> 1.5.1
>
>
>
>

From tziporet at mellanox.co.il  Tue Dec 18 08:33:41 2007
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Tue, 18 Dec 2007 18:33:41 +0200
Subject: [ofa-general] OFED Dec-17 meeting summary on RC1 status
Message-ID: <6C2C79E72C305246B504CBA17B5500C90282E523@mtlexch01.mtl.com>

OFED Dec-17 meeting summary on OFED 1.3-rc1 status


1. RC1 status:
	Intel - runs fine on a 16-node cluster; has an issue with mvapich
	compilation on Itanium RHEL5.1
	Voltaire - just started regression; will do more testing this week
	Qlogic - only a few installations; will do more testing this week
	IBM - fixed their critical bug; besides this, things are good;
	will do more testing this week
	Neteffect - just started; will do more testing this week
	Mellanox - regression at 97% pass
	Cisco - no update (Scott was not on the call)

2. RC2 is scheduled for Jan-8, 2008.

3. Bugs review: the 2 critical bugs:
   804 - iSER - under debug
   750 - EHCA - fix under test

Note - our next meeting is on Jan-7 (Happy holidays and new year for
all)

Tziporet


From vst at vlnb.net  Tue Dec 18 09:00:35 2007
From: vst at vlnb.net (Vladislav Bolkhovitin)
Date: Tue, 18 Dec 2007 20:00:35 +0300
Subject: ***SPAM*** Re: [Scst-devel] [ofa-general] Re: SRP Target Session Hangs
In-Reply-To: <4764E969.6010208@mellanox.com>
References: <C2F174F99918D54CA2A96E57C5079B6F3551BB@sbc-exmsg2.sbcounty.gov>	<47610768.8080203@vlnb.net>	<476233B7.70102@mellanox.com>
	<47624BE6.8090909@vlnb.net> <4764E969.6010208@mellanox.com>
Message-ID: <4767FCB3.1040206@vlnb.net>

Vu Pham wrote:
> Vladislav Bolkhovitin wrote:
> 
>>Vu Pham wrote:
>>
>>>>>3) This may not be the forum for this, but how can you terminate a 
>>>>>session using SCST proc commands?
>>>>
>>>>SCST can't (and shouldn't) do that, because it has no knowledge about 
>>>>how sessions with particular target transport created and destroyed. 
>>>>Sessions management is the target driver's duty. Ask Vu that feature.
>>>
>>>Normally {srp connection, scst session} will be destroyed upon 
>>>disconnect reques or new connection request (without multi-channels 
>>>flag) coming from initiator or QP is in error condition.
>>>
>>>Sometimes we fail to terminate scst session when there are outstanding 
>>>I/Os on scst session which does not match the outstanding I/Os on the 
>>>srp connection. Right after srp calls scst_init_cmd_done, it increases 
>>>the active command counter and decreases the counter on on_free_command.
>>>
>>>Any idea Vlad?
>>
>>Sorry, I don't understand, what's the problem with them?
> 
> I keep track of the number of outstanding scst_cmnd on a 
> {srp connection, scst session} and it does not match with 
> scst_sess->refcnt - or it should be off by 1; however, it 
> does not. Therefore scst_unregister_session (wait 1) stuck 
> most of the time.

It's still not clear to me. Why does scst_unregister_session() get stuck?
Do you have any logs/traces? I've not seen such problems with other target
drivers for a lo-o-ong time.

> -vu
> 


From sean.hefty at intel.com  Tue Dec 18 09:15:16 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Tue, 18 Dec 2007 09:15:16 -0800
Subject: [ofa-general] RE: some questions on stale connection handling at the
	IB CM
In-Reply-To: <47678A90.8060804@voltaire.com>
References: <Pine.LNX.4.64.0712171638420.28805@zuben.voltaire.com>
	<000301c840e7$201af600$9b37170a@amr.corp.intel.com>
	<47678A90.8060804@voltaire.com>
Message-ID: <000101c84199$8f333a40$c1d4180a@amr.corp.intel.com>

>I don't see what the user can do for the case of the CM detecting a
>remote qpn match, if they will continue to use the same qpn this will
>happen in an endless loop, correct?

I guess so.

>Is there anything the IB stack can do here to make apps coding simpler?

Not explicitly.  Although after I thought about it more, I do like the idea of
using LAP/APR messages as a sort of keep-alive.

- Sean


From rdreier at cisco.com  Tue Dec 18 11:14:35 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 18 Dec 2007 11:14:35 -0800
Subject: [ofa-general] Re: [PATCH 1 of 5] libmlx4: zero context struct at
	allocation time (prep for additional context ops)
In-Reply-To: <200712171019.21801.jackm@dev.mellanox.co.il> (Jack Morgenstein's
	message of "Mon, 17 Dec 2007 10:19:21 +0200")
References: <200712171019.21801.jackm@dev.mellanox.co.il>
Message-ID: <ada63yv3jdw.fsf@cisco.com>

thanks, applied for both mthca and mlx4 (although I used calloc(1, ..)
instead of malloc(..) + memset())
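
For reference, the two allocation styles side by side, as a minimal sketch. The struct demo_context and the helper names are hypothetical and not taken from libmlx4 or libmthca; the point is only that both paths hand back a zeroed context, so ops added to the struct later start out unset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NUM_OPS 4

/* Hypothetical context with a table of optional operations. */
struct demo_context {
	int fd;
	void (*ops[DEMO_NUM_OPS])(void);
};

/* Two-step version: allocate, then clear. */
static struct demo_context *demo_alloc_memset(void)
{
	struct demo_context *ctx = malloc(sizeof(*ctx));

	if (ctx)
		memset(ctx, 0, sizeof(*ctx));
	return ctx;
}

/* One-step version: calloc() returns zero-filled memory. */
static struct demo_context *demo_alloc_calloc(void)
{
	return calloc(1, sizeof(struct demo_context));
}

/* Returns 1 if no op has been filled in yet (relies on all-bits-zero
 * reading back as a null pointer, which holds on mainstream platforms). */
static int demo_ops_unset(const struct demo_context *ctx)
{
	size_t i;

	assert(ctx != NULL);
	for (i = 0; i < DEMO_NUM_OPS; i++)
		if (ctx->ops[i] != NULL)
			return 0;
	return 1;
}
```

The calloc() form is shorter and cannot forget the clearing step, which is presumably why it was preferred here.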

 - R.


From phillips.ken at gmail.com  Tue Dec 18 12:02:49 2007
From: phillips.ken at gmail.com (Ken Phillips)
Date: Tue, 18 Dec 2007 15:02:49 -0500
Subject: [ofa-general] SDP memory allocation policy problem?
Message-ID: <d1b07a720712181202s4f731055sa69d9f97b0322bc1@mail.gmail.com>

Greetings,

We would like to confirm that this patch actually helps quite a bit
when the system is writing to a block device over SDP. We believe
that it should be included in 1.3.
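
For readers skimming the quoted patch, the flag selection it performs can be modelled in user space as follows. This is a sketch only: the DEMO_ flag values are made up and are not the kernel's real GFP bits, and the function names are hypothetical. What it mirrors from the patch is the rule "use the socket's sk_allocation if it is set, otherwise fall back to the old defaults".

```c
#include <assert.h>

/* Illustrative flag bits; NOT the kernel's real GFP values. */
#define DEMO_GFP_WAIT		0x01u
#define DEMO_GFP_IO		0x02u
#define DEMO_GFP_FS		0x04u
#define DEMO_GFP_HIGHMEM	0x08u
#define DEMO_GFP_NOIO		(DEMO_GFP_WAIT)
#define DEMO_GFP_KERNEL		(DEMO_GFP_WAIT | DEMO_GFP_IO | DEMO_GFP_FS)
#define DEMO_GFP_HIGHUSER	(DEMO_GFP_KERNEL | DEMO_GFP_HIGHMEM)

/* Flags for skb allocation: honour the socket's policy when one is set. */
static unsigned demo_skb_gfp(unsigned sk_allocation)
{
	return sk_allocation ? sk_allocation : DEMO_GFP_KERNEL;
}

/* Flags for page allocation: same policy, plus highmem. */
static unsigned demo_page_gfp(unsigned sk_allocation)
{
	return sk_allocation ? (sk_allocation | DEMO_GFP_HIGHMEM)
			     : DEMO_GFP_HIGHUSER;
}

/* Self-check: a NOIO socket never ends up with the IO bit set. */
static void demo_gfp_self_check(void)
{
	assert((demo_skb_gfp(DEMO_GFP_NOIO) & DEMO_GFP_IO) == 0);
	assert((demo_page_gfp(DEMO_GFP_NOIO) & DEMO_GFP_IO) == 0);
	assert(demo_skb_gfp(0) == DEMO_GFP_KERNEL);
}
```

In this model a socket created with a NOIO policy never asks the allocator to start I/O, which is the property the original report says is needed to avoid the deadlock.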

Regards
KP


On 9/26/07, Jim Mott <jimmott at austin.rr.com> wrote:
> I have reworked your patch slightly and run my simple unit tests on it.  No correctness problems detected in latency or bandwidth
> paths.  No performance regressions either.
>
> If your proposed patch worked for you, then this one ought to work too.  Could you please give it a go and let me know?
>
> Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c
> ===================================================================
> --- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/sdp/sdp_bcopy.c      2007-09-26 13:27:43.000000000 -0500
> +++ ofa_1_3_dev_kernel/drivers/infiniband/ulp/sdp/sdp_bcopy.c   2007-09-26 17:52:12.000000000 -0500
> @@ -221,16 +221,26 @@ static void sdp_post_recv(struct sdp_soc
>         skb_frag_t *frag;
>         struct sdp_bsdh *h;
>         int id = ssk->rx_head;
> +       unsigned int gfp_page;
>
>         /* Now, allocate and repost recv */
>         /* TODO: allocate from cache */
> -       skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
> -                                 GFP_KERNEL);
> +
> +       if (unlikely(ssk->isk.sk.sk_allocation)) {
> +               skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
> +                                         ssk->isk.sk.sk_allocation);
> +               gfp_page = ssk->isk.sk.sk_allocation | __GFP_HIGHMEM;
> +       } else {
> +               skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
> +                                         GFP_KERNEL);
> +               gfp_page = GFP_HIGHUSER;
> +       }
> +
>         /* FIXME */
>         BUG_ON(!skb);
>         h = (struct sdp_bsdh *)skb->head;
>         for (i = 0; i < ssk->recv_frags; ++i) {
> -               page = alloc_pages(GFP_HIGHUSER, 0);
> +               page = alloc_pages(gfp_page, 0);
>                 BUG_ON(!page);
>                 frag = &skb_shinfo(skb)->frags[i];
>                 frag->page                = page;
> @@ -404,6 +414,7 @@ void sdp_post_sends(struct sdp_sock *ssk
>         /* TODO: nonagle? */
>         struct sk_buff *skb;
>         int c;
> +       int gfp_page;
>
>         if (unlikely(!ssk->id)) {
>                 if (ssk->isk.sk.sk_send_head) {
> @@ -415,6 +426,11 @@ void sdp_post_sends(struct sdp_sock *ssk
>                 return;
>         }
>
> +       if (unlikely(ssk->isk.sk.sk_allocation))
> +               gfp_page = ssk->isk.sk.sk_allocation;
> +       else
> +               gfp_page = GFP_KERNEL;
> +
>         if (ssk->recv_request &&
>             ssk->rx_tail >= ssk->recv_request_head &&
>             ssk->bufs >= SDP_MIN_BUFS &&
> @@ -424,7 +440,7 @@ void sdp_post_sends(struct sdp_sock *ssk
>                 skb = sk_stream_alloc_skb(&ssk->isk.sk,
>                                           sizeof(struct sdp_bsdh) +
>                                           sizeof(*resp_size),
> -                                         GFP_KERNEL);
> +                                         gfp_page);
>                 /* FIXME */
>                 BUG_ON(!skb);
>                 resp_size = (struct sdp_chrecvbuf *)skb_put(skb, sizeof *resp_size);
> @@ -449,7 +465,7 @@ void sdp_post_sends(struct sdp_sock *ssk
>                 skb = sk_stream_alloc_skb(&ssk->isk.sk,
>                                           sizeof(struct sdp_bsdh) +
>                                           sizeof(*req_size),
> -                                         GFP_KERNEL);
> +                                         gfp_page);
>                 /* FIXME */
>                 BUG_ON(!skb);
>                 ssk->sent_request = SDP_MAX_SEND_SKB_FRAGS * PAGE_SIZE;
> @@ -480,7 +496,7 @@ void sdp_post_sends(struct sdp_sock *ssk
>                 ssk->bufs) {
>                 skb = sk_stream_alloc_skb(&ssk->isk.sk,
>                                           sizeof(struct sdp_bsdh),
> -                                         GFP_KERNEL);
> +                                         gfp_page);
>                 /* FIXME */
>                 BUG_ON(!skb);
>                 sdp_post_send(ssk, skb, SDP_MID_DISCONN);
>
> -----Original Message-----
> From: general-bounces at lists.openfabrics.org [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Nathan Dauchy
> Sent: Tuesday, September 25, 2007 5:50 PM
> To: general at lists.openfabrics.org
> Subject: Re: [ofa-general] SDP memory allocation policy problem?
>
> Is there anyone among the OFED development team that is looking into
> this issue?  I believe that it is causing nodes to hang at our site.  We
> are running ofed-1.2 and the 2.6.9-55.ELsmp kernel.
>
> Workarounds or even untested patches would be appreciated.
>
> Thanks!
>
> -Nathan
>
>
> Ken Phillips wrote:
> > Greetings,
> >
> > Teammates here report the following:
> >
> > Problem
> >
> > The method SDP uses to allocate socket buffers may cause the
> > node to hang under memory pressure.
> >
> > Details
> >
> > Each kernel level socket has an allocation flag to specify the
> > memory allocation policy for socket buffers, the default is GFP_ATOMIC
> > (or GFP_KERNEL for SDP).  If the caller creates a socket with the
> > policy set to GFP_NOFS or GFP_NOIO this should be the allocation
> > policy used by the SDP layer.
> >
> > The problem we are seeing is that if a node is under load, and
> > a memory allocation fails (say in sock_sendmsg()), the kernel will
> > use the allocation policy to decide how to proceed with the allocation.
> > If GFP_KERNEL is specified, then the kernel may attempt to free pages
> > through the iSCSI block device that is making the socket call, which
> > would result in a deadlock.  Use of GFP_NOIO should prevent the kernel
> > from using the IO backend to free memory resources.
> >
> > here is a sample stack trace from Alt-Sysrq during one of these
> > lockups,
> >
> >> tx_worker     D ffffff0014d14000     0 10195      1         10196 10194
> >> (L-TLB)
> >> 00000100707e98d8 0000000000000046 0000000000000004 0000000000000212
> >> 0000000000000212 ffffffffa018ccae 0000000000000246 0000000000000246
> >> 000001007873c7f0 0000000000000320
> >> Call Trace:<ffffffffa018ccae>{:ib_mthca:mthca_poll_cq+2258}
> >> <ffffffff8030ad5c>{schedule_timeout+224}
> >> <ffffffff802a9db7>{lock_sock+152}
> >> <ffffffff80135756>{autoremove_wake_function+0}
> >> <ffffffffa0538b13>{:ib_sdp:sdp_poll_cq+58}
> >> <ffffffff80135756>{autoremove_wake_function+0}
> >> <ffffffff802a9dfd>{release_sock+16}
> >> <ffffffffa0534b18>{:ib_sdp:sdp_sendmsg+33}
> >> <ffffffff802a730f>{sock_sendmsg+271}
> >> <ffffffffa05386ad>{:ib_sdp:sdp_post_sends+619}
> >> <ffffffff802a9dfd>{release_sock+16}
> >> <ffffffffa05353a5>{:ib_sdp:sdp_sendmsg+2222}
> >> <ffffffff80135756>{autoremove_wake_function+0}
> >> <ffffffffa057708f>{:rs_iscsi:iscsi_sock_msg+1265}
> >> <ffffffffa057708b>{:rs_iscsi:iscsi_sock_msg+1261}
> >> <ffffffff80132159>{recalc_task_prio+337}
> >> <ffffffffa055bfdb>{:rs_iscsi:scsi_command_i+5283}
> >> <ffffffff8030a2c9>{thread_return+0}
> >> <ffffffff8030a321>{thread_return+88}
> >> <ffffffff8013fdf7>{del_timer+107}
> >> <ffffffff8013feb4>{del_singleshot_timer_sync+9}
> >> <ffffffff8030adf3>{schedule_timeout+375}
> >> <ffffffffa056829e>{:rs_iscsi:tx_worker_proc_i+6819}
> >> <ffffffff80110f47>{child_rip+8}
> >> <ffffffffa05667fb>{:rs_iscsi:tx_worker_proc_i+0}
> >> <ffffffff80110f3f>{child_rip+0}
> >>
> >>
> >
> > We still don't know the full scope of the changes needed, but we think
> > that, at a minimum, some of the memory allocations in SDP should be
> > changed, for example:
> >
> > diff -Naur old/drivers/infiniband/ulp/sdp/sdp_bcopy.c new/drivers/infiniband/ulp/sdp/sdp_bcopy.c
> > --- old/drivers/infiniband/ulp/sdp/sdp_bcopy.c    2007-06-21 10:38:47.000000000 -0400
> > +++ new/drivers/infiniband/ulp/sdp/sdp_bcopy.c    2007-08-31 12:25:58.000000000 -0400
> > @@ -224,13 +224,27 @@
> >
> >      /* Now, allocate and repost recv */
> >      /* TODO: allocate from cache */
> > +
> > +#if (PROPOSED_SDP_FIX == 1)
> > +    skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
> > +        (ssk->isk.sk.sk_allocation == 0) ? (GFP_KERNEL) :
> > +        ssk->isk.sk.sk_allocation);
> > +#else
> >      skb = sk_stream_alloc_skb(&ssk->isk.sk, SDP_HEAD_SIZE,
> >                    GFP_KERNEL);
> > +#endif
> >      /* FIXME */
> >      BUG_ON(!skb);
> >      h = (struct sdp_bsdh *)skb->head;
> >      for (i = 0; i < ssk->recv_frags; ++i) {
> > +#if (PROPOSED_SDP_FIX == 1)
> > +        page = alloc_pages((ssk->isk.sk.sk_allocation == 0)
> > +            ? (GFP_HIGHUSER) :
> > +            (ssk->isk.sk.sk_allocation | (__GFP_HIGHMEM)),
> > +            0);
> > +#else
> >          page = alloc_pages(GFP_HIGHUSER, 0);
> > +#endif
> >          BUG_ON(!page);
> >          frag = &skb_shinfo(skb)->frags[i];
> >          frag->page                = page;
> > @@ -406,10 +420,18 @@
> >          ssk->tx_head - ssk->tx_tail < SDP_TX_SIZE) {
> >          struct sdp_chrecvbuf *resp_size;
> >          ssk->recv_request = 0;
> > +#if (PROPOSED_SDP_FIX == 1)
> > +        skb = sk_stream_alloc_skb(&ssk->isk.sk,
> > +            sizeof(struct sdp_bsdh) +
> > +            sizeof(*resp_size),
> > +            (ssk->isk.sk.sk_allocation == 0) ? (GFP_KERNEL) :
> > +            ssk->isk.sk.sk_allocation);
> > +#else
> >          skb = sk_stream_alloc_skb(&ssk->isk.sk,
> >                        sizeof(struct sdp_bsdh) +
> >                        sizeof(*resp_size),
> >                        GFP_KERNEL);
> > +#endif
> >          /* FIXME */
> >          BUG_ON(!skb);
> >          resp_size = (struct sdp_chrecvbuf *)skb_put(skb, sizeof *resp_size);
> > @@ -431,10 +453,18 @@
> >          ssk->tx_head > ssk->sent_request_head + SDP_RESIZE_WAIT &&
> >          ssk->tx_head - ssk->tx_tail < SDP_TX_SIZE) {
> >          struct sdp_chrecvbuf *req_size;
> > +#if (PROPOSED_SDP_FIX == 1)
> > +        skb = sk_stream_alloc_skb(&ssk->isk.sk,
> > +            sizeof(struct sdp_bsdh) +
> > +            sizeof(*req_size),
> > +            (ssk->isk.sk.sk_allocation == 0) ? (GFP_KERNEL) :
> > +            ssk->isk.sk.sk_allocation);
> > +#else
> >          skb = sk_stream_alloc_skb(&ssk->isk.sk,
> >                        sizeof(struct sdp_bsdh) +
> >                        sizeof(*req_size),
> >                        GFP_KERNEL);
> > +#endif
> >          /* FIXME */
> >          BUG_ON(!skb);
> >          ssk->sent_request = SDP_MAX_SEND_SKB_FRAGS * PAGE_SIZE;
> > @@ -463,9 +493,16 @@
> >              (TCPF_FIN_WAIT1 | TCPF_LAST_ACK)) &&
> >          !ssk->isk.sk.sk_send_head &&
> >          ssk->bufs) {
> > +#if (PROPOSED_SDP_FIX == 1)
> > +        skb = sk_stream_alloc_skb(&ssk->isk.sk,
> > +            sizeof(struct sdp_bsdh),
> > +            (ssk->isk.sk.sk_allocation == 0) ? (GFP_KERNEL) :
> > +            ssk->isk.sk.sk_allocation);
> > +#else
> >          skb = sk_stream_alloc_skb(&ssk->isk.sk,
> >                        sizeof(struct sdp_bsdh),
> >                        GFP_KERNEL);
> > +#endif
> >          /* FIXME */
> >          BUG_ON(!skb);
> >          sdp_post_send(ssk, skb, SDP_MID_DISCONN);
> > diff -Naur old/drivers/infiniband/ulp/sdp/sdp.h new/drivers/infiniband/ulp/sdp/sdp.h
> > --- old/drivers/infiniband/ulp/sdp/sdp.h    2007-06-21 10:38:47.000000000 -0400
> > +++ new/drivers/infiniband/ulp/sdp/sdp.h    2007-08-31 12:25:45.000000000 -0400
> > @@ -7,6 +7,8 @@
> >  #include <net/tcp.h> /* For urgent data flags */
> >  #include <rdma/ib_verbs.h>
> >
> > +#define PROPOSED_SDP_FIX 1
> > +
> >  #define sdp_printk(level, sk, format, arg...)                \
> >      printk(level "sdp_sock(%d:%d): " format,             \
> >             (sk) ? inet_sk(sk)->num : -1,                 \
> >
> >
> >
> >
> > ---------------------
> > Best Regards
> > K Phillips
> > _______________________________________________
> > general mailing list
> > general at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >
> > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>
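
The ternary that the quoted patch repeats at every allocation site could be
folded into two small helpers. What follows is only a minimal userspace sketch
of that idea, not driver code: the DEMO_* constants are made-up stand-ins for
the kernel's GFP flags (the real values live in the kernel headers), and
demo_skb_gfp() and demo_page_gfp() are hypothetical names that do not exist in
SDP.

```c
/* Made-up stand-ins for kernel GFP flags, for illustration only. */
typedef unsigned int demo_gfp_t;

#define DEMO_GFP_KERNEL   0x01u
#define DEMO_GFP_NOIO     0x02u
#define DEMO_GFP_HIGHMEM  0x10u
#define DEMO_GFP_HIGHUSER (DEMO_GFP_KERNEL | DEMO_GFP_HIGHMEM)

/*
 * Flags for skb allocations: honour the socket's allocation policy when the
 * socket owner (for example an iSCSI initiator) has set one, otherwise fall
 * back to the old default.
 */
static demo_gfp_t demo_skb_gfp(demo_gfp_t sk_allocation)
{
	return sk_allocation ? sk_allocation : DEMO_GFP_KERNEL;
}

/*
 * Flags for receive-page allocations: same policy, but the pages may still
 * come from high memory, as in the quoted patch.
 */
static demo_gfp_t demo_page_gfp(demo_gfp_t sk_allocation)
{
	return sk_allocation ? (sk_allocation | DEMO_GFP_HIGHMEM)
			     : DEMO_GFP_HIGHUSER;
}
```

With helpers like these, each call site would pass the helper's result in
place of the hard-coded GFP_KERNEL or GFP_HIGHUSER, and the #if/#else pairs
would no longer be needed.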


From mshefty at ichips.intel.com  Tue Dec 18 12:07:59 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Tue, 18 Dec 2007 12:07:59 -0800
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <4767A2CD.8030209@voltaire.com>
References: <4767A2CD.8030209@voltaire.com>
Message-ID: <4768289F.6040907@ichips.intel.com>

> I wonder what it will take to implement peer to peer connection
> establishment support in the IB CM (and RDMA-CM, of course). Also, do you
> expect any API changes to be needed in order to support that?

Peer to peer connection was never fully implemented in the ib_cm.  I 
don't think it would be that hard to implement at that level, and it 
shouldn't require API changes.

Support at the rdma_cm level may require an API change.  There's no easy 
way for the rdma_cm to know if it should invoke the IB peer-to-peer 
connection model.  I'm not even sure how one peer would know the other 
peer's port number, unless well known ports are used on both sides.

> Reading through section 12.10.4, I was not sure I fully understood the
> following: "The ServiceID implicitly defines whether the service is 
> client/server or peer to peer, but the application must inform the CM so 
> that the CM will handle the inbound REQ correctly."

My interpretation of this is:

An app uses a single service ID and connects using either peer to peer 
or client/server.  This implies that a service ID is either for peer to 
peer or client/server connections.

The CM needs to know the connection model selected by the app, to avoid 
matching an incoming REQ incorrectly.  For example, suppose an app tries 
to connect to SID 1, while another app on the same machine is listening 
on SID 1.  Does a received REQ with SID 1 match for a peer to peer
connection, or client/server?  In this case, you can probably guess 
client/server correctly, but suppose the REQ is received before the 
server app starts.  The CM needs to know not to match the REQ with the 
client.
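
The matching rule described above can be sketched in a few lines of C. This is
only an illustration of the reasoning, not ib_cm code: every type and function
name below (demo_endpoint, demo_match_req, and so on) is hypothetical, and
returning the first match in table order is an arbitrary choice made for the
sketch.

```c
#include <stddef.h>
#include <stdint.h>

enum demo_conn_model { DEMO_CLIENT_SERVER, DEMO_PEER_TO_PEER };
enum demo_role { DEMO_LISTENER, DEMO_ACTIVE };	/* ACTIVE: has sent its own REQ */

struct demo_endpoint {
	uint64_t service_id;
	enum demo_conn_model model;	/* told to the CM by the application */
	enum demo_role role;
};

/*
 * Return the endpoint that should receive an inbound REQ for 'sid',
 * or NULL if the REQ must not be matched to anything.
 */
static const struct demo_endpoint *
demo_match_req(const struct demo_endpoint *eps, size_t n, uint64_t sid)
{
	for (size_t i = 0; i < n; i++) {
		if (eps[i].service_id != sid)
			continue;
		/* A listener always accepts REQs for its service ID. */
		if (eps[i].role == DEMO_LISTENER)
			return &eps[i];
		/* An active side may be matched only in peer to peer mode. */
		if (eps[i].model == DEMO_PEER_TO_PEER)
			return &eps[i];
		/* Active client/server endpoint: never match it, keep looking. */
	}
	return NULL;
}
```

The case from the text is the last branch: if the only endpoint on SID 1 is a
client that is still connecting, the REQ finds no match, no matter whether the
server application has started yet.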

> Such support would be useful in symmetric schemes such as MPIs that open 
> connections on demand and more applications where each party can both 
> accept and initiate connections. For example, I understand that some 
> work is done now at the open mpi community to use the rdma-cm as a 
> possible channel for connection establishment.

I would need to better understand the expected usage model, like how the 
peers find each other, but this is something that could be added if needed.

- Sean


From dboehme at cs.uni-potsdam.de  Tue Dec 18 12:15:43 2007
From: dboehme at cs.uni-potsdam.de (David Boehme)
Date: Tue, 18 Dec 2007 21:15:43 +0100
Subject: [ofa-general] MTHCA driver from OFED 1.3a package
In-Reply-To: <474C4B24.4080809@mellanox.co.il>
References: <20071122140554.GB13609@ics.muni.cz>
	<20071127160803.GD4365@ics.muni.cz>
	<474C4B24.4080809@mellanox.co.il>
Message-ID: <200712182115.44085.dboehme@cs.uni-potsdam.de>

On Dienstag 27 November 2007, Tziporet Koren wrote:
> Lukas Hejtmanek wrote:
> > Hello,
> >
> > I just found that OFED 1.3a with the 2.6.23 kernel runs at 2/3 the speed
> > of the 2.6.23 kernel with the built-in driver. Any reason for this?
>
> Which benchmark?
> Which HCA?
> Is it the same with ofed beta release?

Hi,

I'd like to note that this performance problem still seems to exist in OFED
1.3rc1.

On our testing environment (2 Hosts, Xeon 3040, Mellanox MT25208 HCA) with 
OFED 1.3rc1 on Fedora 8/Kernel 2.6.23.9-rc1, ib_rdma_bw yields a bandwidth 
of 658 MB/sec with a one megabyte message size. Other benchmarks show 
similar results.

David


From mshefty at ichips.intel.com  Tue Dec 18 12:44:49 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Tue, 18 Dec 2007 12:44:49 -0800
Subject: [ofa-general] Re: the mckey example has internal race
In-Reply-To: <47676344.1080603@dev.mellanox.co.il>
References: <47665BB8.5030805@dev.mellanox.co.il>	<000101c840d4$233aa730$9b37170a@amr.corp.intel.com>
	<47676344.1080603@dev.mellanox.co.il>
Message-ID: <47683141.3080407@ichips.intel.com>

>  From my point of view, the examples in the libraries should be a full 
> flow (mini) applications
> that give a good reference to other users on how to use the library...

In general I agree, but multicast is more difficult than simple 
client/server, and I'm not sure that acknowledging multicast traffic is 
a good reference.  I often test with multiple receivers to verify that 
the multicast is actually working.

I have no objection to adding this; there just needs to be a good way to 
do it.

- Sean


From weiny2 at llnl.gov  Tue Dec 18 14:50:36 2007
From: weiny2 at llnl.gov (Ira Weiny)
Date: Tue, 18 Dec 2007 14:50:36 -0800
Subject: FW: [Fwd: [ofa-general] [PATCH] mstflint: Convert project to
	autoconf tools.]
In-Reply-To: <85a3349f0712180814q10a7c60ew4fdc7caa6b4420ba@mail.gmail.com>
References: <6C2C79E72C305246B504CBA17B5500C902E86938@mtlexch01.mtl.com>
	<85a3349f0712180814q10a7c60ew4fdc7caa6b4420ba@mail.gmail.com>
Message-ID: <20071218145036.237afd3a.weiny2@llnl.gov>

Oren,

I just cloned your tree at:

git://git.openfabrics.org/~orenk/mstflint.git

And I don't see the patch, nor mstmcra.  Am I looking at the correct tree?

Ira


On Tue, 18 Dec 2007 18:14:26 +0200
"Oren Kladnitsky" <orenk at dev.mellanox.co.il> wrote:

>  I applied this patch + added mstmcra tool (will replace mread and mwrite).
> 
> Vlad - Please change installer to use autoconf method and take spec from
> this dir.
> 
> Thanks,
> ORen
> 
> 
> 
> 
> 
> >
> > ---------- Forwarded message ----------
> > From: "Ira Weiny" <weiny2 at llnl.gov>
> > To: "openfabrics" <general at lists.openfabrics.org>
> > Date: Mon, 10 Dec 2007 23:35:54 +0200
> > Subject: [ofa-general] [PATCH] mstflint: Convert project to autoconf
> > tools.
> > This patch removes the makefile and converts the mstflint git tree over to
> > autoconf tools.  This works great on x86_64 but has not been tested on other
> > architectures.  (Although it is simple enough that I don't see how it would
> > not work.)
> >
> > Thanks,
> > Ira
> >
> >
> > From efb3a07a1f333ea95204d2a2e9462e285e29a65f Mon Sep 17 00:00:00 2001
> > From: Ira K. Weiny <weiny2 at llnl.gov>
> > Date: Mon, 10 Dec 2007 13:30:22 -0800
> > Subject: [PATCH] Convert project to autoconf tools.
> >
> >
> > Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
> > ---
> > Makefile         |   47 -----------------------------------------------
> > Makefile.am      |   21 +++++++++++++++++++++
> > autogen.sh       |   11 +++++++++++
> > configure.in     |   22 ++++++++++++++++++++++
> > mstflint.spec.in |   45 +++++++++++++++++++++++++++++++++++++++++++++
> > 5 files changed, 99 insertions(+), 47 deletions(-)
> > delete mode 100644 Makefile
> > create mode 100644 Makefile.am
> > create mode 100755 autogen.sh
> > create mode 100644 configure.in
> > create mode 100644 mstflint.spec.in
> >
> > diff --git a/Makefile b/Makefile
> > deleted file mode 100644
> > index 889c97a..0000000
> > --- a/Makefile
> > +++ /dev/null
> > @@ -1,47 +0,0 @@
> > -#default options
> > -CFLAGS += -O2
> > -CFLAGS += -g
> > -CFLAGS += -Wall
> > -CXXFLAGS += -fno-exceptions
> > -CFLAGS += -I.
> > -LD=$(CXX)
> > -EXTRA_LOADLIBES=-lz
> > -LOADLIBES+=${EXTRA_LOADLIBES}
> > -
> > -all: default
> > -bin: mstflint mstmread mstmwrite mstregdump mstvpd
> > -
> > -default: bin
> > -static: bin
> > -shared: bin
> > -
> > -.PHONY: all bin clean static shared default
> > -.DELETE_ON_ERROR:
> > -
> > -default: EXTRA_LOADLIBES="$(shell $(CXX) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libz.a)" "$(shell $(CXX)  ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libstdc++.a)"
> > -default: LD=$(CC)
> > -static: CFLAGS+=-static
> > -
> > -mstflint: mstflint.o mflash.o
> > -       $(LD) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} mstflint.o mflash.o -o mstflint ${LOADLIBES}
> > -
> > -mstflint.o: flint.cpp mflash.h
> > -       $(CXX) ${CFLAGS} ${CXXFLAGS} -c flint.cpp -o mstflint.o
> > -
> > -mflash.o: mtcr.h mflash.c mflash.h
> > -       $(CC) ${CFLAGS} -c mflash.c -o mflash.o
> > -
> > -mstmwrite: mwrite.c mtcr.h
> > -       $(CC) ${CFLAGS} mwrite.c -o mstmwrite
> > -
> > -mstmread: mread.c mtcr.h
> > -       $(CC) ${CFLAGS} mread.c -o mstmread
> > -
> > -mstregdump: mstdump.c mtcr.h
> > -       $(CC) ${CFLAGS} mstdump.c -o mstregdump
> > -
> > -mstvpd: vpd.c
> > -       $(CC) ${CFLAGS} vpd.c -o mstvpd
> > -
> > -clean:
> > -       rm -f mstvpd mstregdump mstflint mstmread mstmwrite mstflint.o mflash.o
> > diff --git a/Makefile.am b/Makefile.am
> > new file mode 100644
> > index 0000000..f642d9d
> > --- /dev/null
> > +++ b/Makefile.am
> > @@ -0,0 +1,21 @@
> > +bin_PROGRAMS = mstmread \
> > +                                       mstmwrite \
> > +                                       mstflint \
> > +                                       mstregdump \
> > +                                       mstvpd
> > +
> > +mstmread_SOURCES = mread.c mtcr.h
> > +
> > +mstmwrite_SOURCES = mwrite.c mtcr.h
> > +
> > +mstflint_SOURCES = flint.cpp mtcr.h mflash.h mflash.c
> > +mstflint_LDFLAGS = -lz
> > +
> > +mstregdump_SOURCES = mstdump.c mtcr.h
> > +
> > +mstvpd_SOURCES = vpd.c
> > +
> > +
> > +EXTRA_DIST = \
> > +       mstflint.spec
> > +
> > diff --git a/autogen.sh b/autogen.sh
> > new file mode 100755
> > index 0000000..4827884
> > --- /dev/null
> > +++ b/autogen.sh
> > @@ -0,0 +1,11 @@
> > +#! /bin/sh
> > +
> > +# create config dir if not exist
> > +test -d config || mkdir config
> > +
> > +set -x
> > +aclocal -I config
> > +libtoolize --force --copy
> > +autoheader
> > +automake --foreign --add-missing --copy
> > +autoconf
> > diff --git a/configure.in b/configure.in
> > new file mode 100644
> > index 0000000..0924d65
> > --- /dev/null
> > +++ b/configure.in
> > @@ -0,0 +1,22 @@
> > +dnl Process this file with autoconf to produce a configure script.
> > +
> > +AC_INIT(mstflint)
> > +
> > +AC_DEFINE_UNQUOTED([PROJECT], ["mstflint"], [Define the project name.])
> > +AC_SUBST([PROJECT])
> > +
> > +AC_DEFINE_UNQUOTED([VERSION], ["1.3"], [Define the project version.])
> > +AC_SUBST([VERSION])
> > +
> > +AC_CONFIG_AUX_DIR(config)
> > +AC_CONFIG_SRCDIR([README])
> > +AM_INIT_AUTOMAKE(mstflint, 1.3)
> > +
> > +dnl Checks for programs
> > +AC_PROG_CC
> > +AC_PROG_CXX
> > +AC_PROG_LIBTOOL
> > +AC_CONFIG_HEADERS
> > +
> > +AC_CONFIG_FILES([Makefile mstflint.spec])
> > +AC_OUTPUT
> > diff --git a/mstflint.spec.in b/mstflint.spec.in
> > new file mode 100644
> > index 0000000..b5937be
> > --- /dev/null
> > +++ b/mstflint.spec.in
> > @@ -0,0 +1,45 @@
> > +Summary: Mellanox firmware burning application
> > +Name: mstflint
> > +Version: @VERSION@
> > +Release: 1
> > +License: GPL/BSD
> > +Url: http://openib.org/
> > +Group: System Environment/Base
> > +BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}
> > +Source: mstflint-@VERSION@.tar.gz
> > +ExclusiveArch: i386 x86_64 ia64 ppc ppc64
> > +BuildRequires: zlib-devel
> > +Requires(post): chkconfig
> > +
> > +%description
> > +This package contains a tool for burning updated firmware on to
> > +Mellanox manufactured InfiniBand adapters.
> > +
> > +%prep
> > +%setup -q
> > +
> > +%build
> > +%configure
> > +make
> > +
> > +%install
> > +rm -rf $RPM_BUILD_ROOT
> > +make DESTDIR=${RPM_BUILD_ROOT} install
> > +# remove unpackaged files from the buildroot
> > +rm -f $RPM_BUILD_ROOT%{_libdir}/*.la
> > +
> > +%clean
> > +rm -rf $RPM_BUILD_ROOT
> > +
> > +%files
> > +%defattr(-,root,root)
> > +%{_bindir}/mstmread
> > +%{_bindir}/mstmwrite
> > +%{_bindir}/mstflint
> > +%{_bindir}/mstregdump
> > +%{_bindir}/mstvpd
> > +
> > +%changelog
> > +* Fri Dec 07 2007 Ira Weiny <weiny2 at llnl.gov> 1.0.0
> > +   initial creation
> > +
> > --
> > 1.5.1
> >
> >
> >
> >
> 


From rdreier at cisco.com  Tue Dec 18 15:59:11 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Tue, 18 Dec 2007 15:59:11 -0800
Subject: [ofa-general] creating QP with zero wr and/or null CQ
In-Reply-To: <Pine.LNX.4.64.0712181004360.28805@zuben.voltaire.com> (Or
	Gerlitz's message of "Tue, 18 Dec 2007 10:17:07 +0200 (IST)")
References: <Pine.LNX.4.64.0712181004360.28805@zuben.voltaire.com>
Message-ID: <adad4t31rn4.fsf@cisco.com>

 > Calling ibv_create_qp() to create an RC QP with both max_send_wr and
 > max_recv_wr set to zero fails; however, if only one of them is set to zero,
 > it works fine.

Not too surprising.

 > Also, if either of recv_cq or send_cq is NULL then the program crashes,
 > where it seems the crash relates to these lines at ibv_cmd_create_qp()
 > 
 > 	cmd->send_cq_handle  = attr->send_cq->handle;
 > 	cmd->recv_cq_handle  = attr->recv_cq->handle;

Makes sense.

 > So, are these limitations originating from the spec or from the implementation?

I'm not sure what the spec says about 0-sized QPs.  However for the
create QP verb, the spec does say that the required inputs include:

    The QP attributes that must be specified at QP create time are:
        . The CQ to be associated with the Send Queue.
        . The CQ to be associated with the Receive Queue.

and I'm not sure it's worth adding special cases to deal with null CQs
for QPs that don't have a send/receive queue.  And I'm not sure how
much it's worth worrying about 0-sized QPs either.
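
Since the library is unlikely to grow special cases here, an application can
check its attributes before calling the create QP verb. The sketch below shows
one way to do that. It deliberately uses made-up demo_* structs that mirror
only the fields discussed in this thread (send_cq, recv_cq, cap.max_send_wr,
cap.max_recv_wr) so that it stands alone; demo_check_qp_attr() is a
hypothetical helper, not a libibverbs call.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the fields discussed above. */
struct demo_cq {
	uint32_t handle;
};

struct demo_qp_cap {
	uint32_t max_send_wr;
	uint32_t max_recv_wr;
};

struct demo_qp_init_attr {
	struct demo_cq *send_cq;
	struct demo_cq *recv_cq;
	struct demo_qp_cap cap;
};

/* Return 0 if the attributes look safe to pass on, -1 otherwise. */
static int demo_check_qp_attr(const struct demo_qp_init_attr *attr)
{
	if (attr == NULL)
		return -1;
	/* Both CQs are dereferenced for their handles, so neither may be NULL. */
	if (attr->send_cq == NULL || attr->recv_cq == NULL)
		return -1;
	/* Creation was reported to fail when both queue sizes are zero. */
	if (attr->cap.max_send_wr == 0 && attr->cap.max_recv_wr == 0)
		return -1;
	return 0;
}
```

A QP that really only needs one queue can still pass the same CQ for both
send_cq and recv_cq, which satisfies the check without allocating a second CQ.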

 - R.


From chu11 at llnl.gov  Tue Dec 18 16:41:32 2007
From: chu11 at llnl.gov (Al Chu)
Date: Tue, 18 Dec 2007 16:41:32 -0800
Subject: [ofa-general] [PATCH 1/3] OpenSM: osm routing engine type
Message-ID: <1198024892.28105.64.camel@cardanus.llnl.gov>

Hey Sasha,

Here's the first patch in my set of patches to fix the incorrect routing
engine reporting problem described in the earlier thread (thread: [PATCH
2/3] OpenSM: Fix incorrect reporting of routing engine/algorithm used).
It's been redone as discussed in the thread.

This patch 1/3 just defines the enumeration of routing engine types, a
few functions for mapping between enums and strings, and sticks a value
into osm_opensm_t for tracking the routing engine.  The only interesting
thing to note is due to current implementation, the function
osm_routing_engine_type() considers a NULL pointer and the string "null"
to mean that the minhop algorithm was specified.
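
For readers without the attachment, the enum/string mapping described above
might look roughly like the sketch below. It is not taken from the patch: the
demo_* names are hypothetical, and the engine list (minhop, updn, ftree) is
only an illustrative subset. The one behaviour carried over from the
description is that a NULL pointer and the string "null" both map to minhop.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
	DEMO_ROUTING_ENGINE_NONE = 0,
	DEMO_ROUTING_ENGINE_MINHOP,
	DEMO_ROUTING_ENGINE_UPDN,
	DEMO_ROUTING_ENGINE_FTREE,
	DEMO_ROUTING_ENGINE_UNKNOWN
} demo_routing_engine_type_t;

static const char *const demo_engine_names[] = {
	"none", "minhop", "updn", "ftree", "unknown"
};

/* Enum to string; out-of-range values are reported as "unknown". */
static const char *demo_routing_engine_type_str(demo_routing_engine_type_t t)
{
	if ((unsigned int)t > DEMO_ROUTING_ENGINE_UNKNOWN)
		t = DEMO_ROUTING_ENGINE_UNKNOWN;
	return demo_engine_names[t];
}

/* String to enum; NULL and "null" are treated as a request for minhop. */
static demo_routing_engine_type_t demo_routing_engine_type(const char *name)
{
	if (name == NULL || strcmp(name, "null") == 0 ||
	    strcmp(name, "minhop") == 0)
		return DEMO_ROUTING_ENGINE_MINHOP;
	if (strcmp(name, "updn") == 0)
		return DEMO_ROUTING_ENGINE_UPDN;
	if (strcmp(name, "ftree") == 0)
		return DEMO_ROUTING_ENGINE_FTREE;
	return DEMO_ROUTING_ENGINE_UNKNOWN;
}
```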

Thanks,
Al

-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-support-osm_routing_engine_type_t-enumeration.patch
Type: text/x-patch
Size: 5497 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071218/b9e2cee0/attachment.bin>

From chu11 at llnl.gov  Tue Dec 18 16:41:34 2007
From: chu11 at llnl.gov (Al Chu)
Date: Tue, 18 Dec 2007 16:41:34 -0800
Subject: [ofa-general] [PATCH 2/3] OpenSM: Fix incorrect reporting of routing
	engine used
Message-ID: <1198024894.28105.65.camel@cardanus.llnl.gov>

Hey Sasha,

This patch 2/3 fixes the incorrect reporting of what routing engine was
used in the logs and in the console.  Based on which routing engine
succeeded in osm_ucast_mgr_process(), the result is stored in the
'routing_engine_used' and that result is used for the eventual output.
The lock in print_status() now covers all p_osm data and uses
p_osm->lock.  The logic has been reverted in osm_ucast_mgr_process().

Some "special case" handling had to be done in osm_ucast_mgr_process()
to determine if a routing engine succeeded or failed.  It's not pretty.
I figure when routing engine chains are supported later on some re-org
in the routing engine code will have to be done, so this could be fixed
more properly at that time.
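
The bookkeeping described in this message can be pictured with a small sketch.
It is not the attached patch: all demo_* names are hypothetical, and the
fall-back to minhop when the configured engine fails is an assumption made for
the example. The point is only that the engine that actually succeeded is
recorded at routing time, and that later reporting reads the recorded value
instead of the configured name.

```c
#include <stddef.h>

enum demo_engine {
	DEMO_ENGINE_NONE,
	DEMO_ENGINE_MINHOP,
	DEMO_ENGINE_CONFIGURED
};

/* A routing engine builds forwarding tables and returns 0 on success. */
typedef int (*demo_build_tables_fn)(void *ctx);

struct demo_sm {
	demo_build_tables_fn configured_engine;	/* may be NULL */
	void *engine_ctx;
	enum demo_engine routing_engine_used;	/* what reporting should print */
};

/* Stand-in engines used only to exercise the sketch. */
static int demo_engine_fails(void *ctx) { (void)ctx; return -1; }
static int demo_engine_succeeds(void *ctx) { (void)ctx; return 0; }
static int demo_minhop(void *ctx) { (void)ctx; return 0; }

static void demo_ucast_process(struct demo_sm *sm)
{
	sm->routing_engine_used = DEMO_ENGINE_NONE;

	/* Try the configured engine first, if there is one. */
	if (sm->configured_engine != NULL &&
	    sm->configured_engine(sm->engine_ctx) == 0) {
		sm->routing_engine_used = DEMO_ENGINE_CONFIGURED;
		return;
	}

	/* Assumed fall-back: route with minhop and record that fact. */
	if (demo_minhop(NULL) == 0)
		sm->routing_engine_used = DEMO_ENGINE_MINHOP;
}
```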

Thanks,
Al

-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-fix-incorrect-reporting-of-routing-engine.patch
Type: text/x-patch
Size: 4466 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071218/c450897b/attachment.bin>

From chu11 at llnl.gov  Tue Dec 18 16:41:38 2007
From: chu11 at llnl.gov (Al Chu)
Date: Tue, 18 Dec 2007 16:41:38 -0800
Subject: [ofa-general] [PATCH 3/3] OpenSM: Fix incorrect identification of
	routing engine used
Message-ID: <1198024898.28105.66.camel@cardanus.llnl.gov>

Hey Sasha,

And like my previous series of patches, this patch 3/3 fixes several
locations in the code that incorrectly determined what routing algorithm
was used to route the subnet.

Thanks,
Al
-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-fix-incorrect-identification-of-routing-engine.patch
Type: text/x-patch
Size: 4536 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071218/00ccfd1b/attachment.bin>

From kliteyn at mellanox.co.il  Tue Dec 18 21:40:26 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 19 Dec 2007 07:40:26 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-19:normal completion
Message-ID: <MTLEXCH01s7nlT2NqFJ00000fb7@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-18
OpenSM git rev = Mon_Dec_17_15:20:43_2007 [9988f459cb81dd025bde8b2dd53b3c551616be0c]
ibutils git rev = Sun_Nov_11_14:54:58_2007 [10bd4760a06c40452f62b911a7e64c93c65f3810]
 
 
Total=520  Pass=517  Fail=3
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
10 LidMgr IS3-128.topo

Failures:
3 LidMgr IS3-128.topo


From ogerlitz at voltaire.com  Tue Dec 18 22:11:20 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Wed, 19 Dec 2007 08:11:20 +0200
Subject: [ofa-general] creating QP with zero wr and/or null CQ
In-Reply-To: <adad4t31rn4.fsf@cisco.com>
References: <Pine.LNX.4.64.0712181004360.28805@zuben.voltaire.com>
	<adad4t31rn4.fsf@cisco.com>
Message-ID: <4768B608.7050101@voltaire.com>

Roland Dreier wrote:
>> So, are these limitations originating from the spec or from the implementation?

> I'm not sure what the spec says about 0-sized QPs.  However for the
> create QP verb, the spec does say that the required inputs include:
>     The QP attributes that must be specified at QP create time are:
>         . The CQ to be associated with the Send Queue.
>         . The CQ to be associated with the Receive Queue.
> and I'm not sure it's worth adding special cases to deal with null CQs
> for QPs that don't have a send/receive queue.  And I'm not sure how
> much it's worth worrying about 0-sized QPs either.

OK, thanks for clarifying this. At this point I don't think there's a 
real need for these special cases.

Or.


From kliteyn at dev.mellanox.co.il  Tue Dec 18 23:40:08 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Wed, 19 Dec 2007 09:40:08 +0200
Subject: [ofa-general] Re: [PATCH] opensm: osm_state_mgr.c - stop idle queue
 processing if heavy sweep requested
In-Reply-To: <20071218154033.GA4232@sashak.voltaire.com>
References: <47667AB2.8030500@dev.mellanox.co.il>
	<20071218154033.GA4232@sashak.voltaire.com>
Message-ID: <4768CAD8.4010407@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> Hi Yevgeny,
> 
> On 15:33 Mon 17 Dec     , Yevgeny Kliteynik wrote:
>> If a heavy sweep is requested during idle queue processing, OSM continues
>> to process it till the end and only then notices the heavy sweep request.
>> In some cases this might leave a topology change unhandled for several
>> minutes.
> 
> Could you provide more details about such cases?
> 
> As far as I know the idle queue is used only for multicast re-routing.
> If so, it is interesting by itself why it takes minutes and where. Is
> where MCG join/leave storm?

Exactly. The problem was discovered on a big cluster with hundreds of mcast groups,
when there is some massive change in the subnet (like rebooting hundreds of nodes).
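
Stripped of the state machine, the idea behind the change is a queue drain
that checks a flag after each item. The sketch below is a simplified stand-in,
not OpenSM code: demo_subnet, demo_drain_idle_queue() and demo_process_item()
are hypothetical names, and the callback that raises the flag while item 2 is
handled only simulates a topology change arriving mid-queue.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_subnet {
	bool force_immediate_heavy_sweep;
};

/* Example work item: pretend a topology change arrives during item 2. */
static void demo_process_item(struct demo_subnet *subn, size_t index)
{
	if (index == 2)
		subn->force_immediate_heavy_sweep = true;
}

/*
 * Process up to 'n' queued items, but stop as soon as a heavy sweep has been
 * requested; the remaining items may be obsolete after the sweep anyway.
 * Returns the number of items that were processed.
 */
static size_t demo_drain_idle_queue(struct demo_subnet *subn, size_t n,
				    void (*process)(struct demo_subnet *, size_t))
{
	size_t done = 0;

	while (done < n) {
		process(subn, done);
		done++;
		if (subn->force_immediate_heavy_sweep)
			break;	/* let the heavy sweep run first */
	}
	return done;
}
```

With hundreds of multicast groups queued, the difference is between finishing
every item before noticing the request and stopping after the item in flight.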

-- Yevgeny

> Or does a single re-routing cycle take minutes?
> 
> Sasha
> 
>> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>> ---
>>  opensm/opensm/osm_state_mgr.c |   31 ++++++++++++++++++++++++-------
>>  1 files changed, 24 insertions(+), 7 deletions(-)
>>
>> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
>> index 5c39f11..6ee5ee6 100644
>> --- a/opensm/opensm/osm_state_mgr.c
>> +++ b/opensm/opensm/osm_state_mgr.c
>> @@ -1607,13 +1607,30 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>>  				/* CALL the done function */
>>  				__process_idle_time_queue_done(p_mgr);
>>
>> -				/*
>> -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>> -				 * so that the next element in the queue gets processed
>> -				 */
>> -
>> -				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>> -				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>> +				if (p_mgr->p_subn->force_immediate_heavy_sweep) {
>> +					/*
>> +					 * Do not read next item from the idle queue.
>> +					 * Immediate heavy sweep is requested, so it's
>> +					 * more important.
>> +					 * Besides, there is a chance that after the
>> +					 * heavy sweep completion, idle queue processing
>> +					 * that SM would have performed here will be obsolete.
>> +					 */
>> +					if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG))
>> +						osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>> +						"osm_state_mgr_process: "
>> +						"interrupting idle time queue processing - heavy sweep requested\n");
>> +					signal = OSM_SIGNAL_NONE;
>> +					p_mgr->state = OSM_SM_STATE_IDLE;
>> +				}
>> +				else {
>> +					/*
>> +					 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>> +					 * so that the next element in the queue gets processed
>> +					 */
>> +					signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>> +					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>> +				}
>>  				break;
>>
>>  			default:
>> -- 
>> 1.5.1.4
>>
> 
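
The behavior the patch aims for can be illustrated in isolation. Below is a minimal sketch, assuming a simplified queue structure; `struct idle_queue`, `process_one_item()` and `drain_idle_queue()` are hypothetical names and are not part of OpenSM. Only the `force_immediate_heavy_sweep` flag name is taken from the patch. The point is that the flag is re-checked after every idle queue item rather than once the whole queue is empty.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state manager data; not the real OpenSM types. */
struct idle_queue {
	size_t pending;                   /* items still waiting in the idle queue */
	size_t processed;                 /* items handled so far */
	size_t sweep_requested_after;     /* simulate a request arriving after N items (0 = never) */
	bool force_immediate_heavy_sweep; /* mirrors p_subn->force_immediate_heavy_sweep */
};

/* Handle one idle queue item; a heavy sweep request may arrive meanwhile. */
static void process_one_item(struct idle_queue *q)
{
	q->pending--;
	q->processed++;
	if (q->processed == q->sweep_requested_after)
		q->force_immediate_heavy_sweep = true;
}

/*
 * Drain the idle queue, re-checking the heavy sweep flag after every item.
 * Returns true if processing stopped because a heavy sweep was requested.
 */
bool drain_idle_queue(struct idle_queue *q)
{
	while (q->pending > 0) {
		process_one_item(q);
		if (q->force_immediate_heavy_sweep)
			return true; /* remaining items wait until after the sweep */
	}
	return false;
}
```

With hundreds of queued multicast re-routing items, this bounds the delay before the sweep starts to the cost of a single item instead of the whole queue.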


From orenk at dev.mellanox.co.il  Wed Dec 19 01:40:16 2007
From: orenk at dev.mellanox.co.il (Oren Kladnitsky)
Date: Wed, 19 Dec 2007 11:40:16 +0200
Subject: FW: [Fwd: [ofa-general] [PATCH] mstflint: Convert project to
	autoconf tools.]
In-Reply-To: <20071218145036.237afd3a.weiny2@llnl.gov>
References: <6C2C79E72C305246B504CBA17B5500C902E86938@mtlexch01.mtl.com>
	<85a3349f0712180814q10a7c60ew4fdc7caa6b4420ba@mail.gmail.com>
	<20071218145036.237afd3a.weiny2@llnl.gov>
Message-ID: <85a3349f0712190140s60915c21r52b355c04a0aa350@mail.gmail.com>

I put it in the ofed_1_3 branch.

It's now also in master + Vlad's AC_INIT patch.


On 12/19/07, Ira Weiny <weiny2 at llnl.gov> wrote:
>
> Oren,
>
> I just cloned your tree at:
>
> git://git.openfabrics.org/~orenk/mstflint.git
>
> And I don't see the patch, nor mstmcra.  Am I looking at the correct
> tree?
>
> Ira
>
>
> On Tue, 18 Dec 2007 18:14:26 +0200
> "Oren Kladnitsky" <orenk at dev.mellanox.co.il> wrote:
>
> >  I applied this patch + added mstmcra tool (will replace mread and mwrite).
> >
> > Vlad - Please change installer to use autoconf method and take spec from
> > this dir.
> >
> > Thanks,
> > ORen
> >
> >
> >
> >
> >
> > >
> > > ---------- Forwarded message ----------
> > > From: "Ira Weiny" <weiny2 at llnl.gov>
> > > To: "openfabrics" <general at lists.openfabrics.org>
> > > Date: Mon, 10 Dec 2007 23:35:54 +0200
> > > Subject: [ofa-general] [PATCH] mstflint: Convert project to autoconf
> > > tools.
> > > This patch removes the makefile and converts the mstflint git tree over to
> > > autoconf tools.  This works great on x86_64 but has not been tested on
> > > other architectures.  (Although it is simple enough that I don't see how
> > > it would not work.)
> > >
> > > Thanks,
> > > Ira
> > >
> > >
> > > From efb3a07a1f333ea95204d2a2e9462e285e29a65f Mon Sep 17 00:00:00 2001
> > > From: Ira K. Weiny <weiny2 at llnl.gov>
> > > Date: Mon, 10 Dec 2007 13:30:22 -0800
> > > Subject: [PATCH] Convert project to autoconf tools.
> > >
> > >
> > > Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
> > > ---
> > > Makefile         |   47 -----------------------------------------------
> > > Makefile.am      |   21 +++++++++++++++++++++
> > > autogen.sh       |   11 +++++++++++
> > > configure.in     |   22 ++++++++++++++++++++++
> > > mstflint.spec.in |   45 +++++++++++++++++++++++++++++++++++++++++++++
> > > 5 files changed, 99 insertions(+), 47 deletions(-)
> > > delete mode 100644 Makefile
> > > create mode 100644 Makefile.am
> > > create mode 100755 autogen.sh
> > > create mode 100644 configure.in
> > > create mode 100644 mstflint.spec.in
> > >
> > > diff --git a/Makefile b/Makefile
> > > deleted file mode 100644
> > > index 889c97a..0000000
> > > --- a/Makefile
> > > +++ /dev/null
> > > @@ -1,47 +0,0 @@
> > > -#default options
> > > -CFLAGS += -O2
> > > -CFLAGS += -g
> > > -CFLAGS += -Wall
> > > -CXXFLAGS += -fno-exceptions
> > > -CFLAGS += -I.
> > > -LD=$(CXX)
> > > -EXTRA_LOADLIBES=-lz
> > > -LOADLIBES+=${EXTRA_LOADLIBES}
> > > -
> > > -all: default
> > > -bin: mstflint mstmread mstmwrite mstregdump mstvpd
> > > -
> > > -default: bin
> > > -static: bin
> > > -shared: bin
> > > -
> > > -.PHONY: all bin clean static shared default
> > > -.DELETE_ON_ERROR:
> > > -
> > > -default: EXTRA_LOADLIBES="$(shell $(CXX) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libz.a)" "$(shell $(CXX)  ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libstdc++.a)"
> > > -default: LD=$(CC)
> > > -static: CFLAGS+=-static
> > > -
> > > -mstflint: mstflint.o mflash.o
> > > -       $(LD) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS} mstflint.o mflash.o -o mstflint ${LOADLIBES}
> > > -
> > > -mstflint.o: flint.cpp mflash.h
> > > -       $(CXX) ${CFLAGS} ${CXXFLAGS} -c flint.cpp -o mstflint.o
> > > -
> > > -mflash.o: mtcr.h mflash.c mflash.h
> > > -       $(CC) ${CFLAGS} -c mflash.c -o mflash.o
> > > -
> > > -mstmwrite: mwrite.c mtcr.h
> > > -       $(CC) ${CFLAGS} mwrite.c -o mstmwrite
> > > -
> > > -mstmread: mread.c mtcr.h
> > > -       $(CC) ${CFLAGS} mread.c -o mstmread
> > > -
> > > -mstregdump: mstdump.c mtcr.h
> > > -       $(CC) ${CFLAGS} mstdump.c -o mstregdump
> > > -
> > > -mstvpd: vpd.c
> > > -       $(CC) ${CFLAGS} vpd.c -o mstvpd
> > > -
> > > -clean:
> > > -       rm -f mstvpd mstregdump mstflint mstmread mstmwrite mstflint.o mflash.o
> > > diff --git a/Makefile.am b/Makefile.am
> > > new file mode 100644
> > > index 0000000..f642d9d
> > > --- /dev/null
> > > +++ b/Makefile.am
> > > @@ -0,0 +1,21 @@
> > > +bin_PROGRAMS = mstmread \
> > > +                                       mstmwrite \
> > > +                                       mstflint \
> > > +                                       mstregdump \
> > > +                                       mstvpd
> > > +
> > > +mstmread_SOURCES = mread.c mtcr.h
> > > +
> > > +mstmwrite_SOURCES = mwrite.c mtcr.h
> > > +
> > > +mstflint_SOURCES = flint.cpp mtcr.h mflash.h mflash.c
> > > +mstflint_LDFLAGS = -lz
> > > +
> > > +mstregdump_SOURCES = mstdump.c mtcr.h
> > > +
> > > +mstvpd_SOURCES = vpd.c
> > > +
> > > +
> > > +EXTRA_DIST = \
> > > +       mstflint.spec
> > > +
> > > diff --git a/autogen.sh b/autogen.sh
> > > new file mode 100755
> > > index 0000000..4827884
> > > --- /dev/null
> > > +++ b/autogen.sh
> > > @@ -0,0 +1,11 @@
> > > +#! /bin/sh
> > > +
> > > +# create config dir if not exist
> > > +test -d config || mkdir config
> > > +
> > > +set -x
> > > +aclocal -I config
> > > +libtoolize --force --copy
> > > +autoheader
> > > +automake --foreign --add-missing --copy
> > > +autoconf
> > > diff --git a/configure.in b/configure.in
> > > new file mode 100644
> > > index 0000000..0924d65
> > > --- /dev/null
> > > +++ b/configure.in
> > > @@ -0,0 +1,22 @@
> > > +dnl Process this file with autoconf to produce a configure script.
> > > +
> > > +AC_INIT(mstflint)
> > > +
> > > +AC_DEFINE_UNQUOTED([PROJECT], ["mstflint"], [Define the project name.])
> > > +AC_SUBST([PROJECT])
> > > +
> > > +AC_DEFINE_UNQUOTED([VERSION], ["1.3"], [Define the project version.])
> > > +AC_SUBST([VERSION])
> > > +
> > > +AC_CONFIG_AUX_DIR(config)
> > > +AC_CONFIG_SRCDIR([README])
> > > +AM_INIT_AUTOMAKE(mstflint, 1.3)
> > > +
> > > +dnl Checks for programs
> > > +AC_PROG_CC
> > > +AC_PROG_CXX
> > > +AC_PROG_LIBTOOL
> > > +AC_CONFIG_HEADERS
> > > +
> > > +AC_CONFIG_FILES([Makefile mstflint.spec])
> > > +AC_OUTPUT
> > > diff --git a/mstflint.spec.in b/mstflint.spec.in
> > > new file mode 100644
> > > index 0000000..b5937be
> > > --- /dev/null
> > > +++ b/mstflint.spec.in
> > > @@ -0,0 +1,45 @@
> > > +Summary: Mellanox firmware burning application
> > > +Name: mstflint
> > > +Version: @VERSION@
> > > +Release: 1
> > > +License: GPL/BSD
> > > +Url: http://openib.org/
> > > +Group: System Environment/Base
> > > +BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}
> > > +Source: mstflint-@VERSION@.tar.gz
> > > +ExclusiveArch: i386 x86_64 ia64 ppc ppc64
> > > +BuildRequires: zlib-devel
> > > +Requires(post): chkconfig
> > > +
> > > +%description
> > > +This package contains a tool for burning updated firmware on to
> > > +Mellanox manufactured InfiniBand adapters.
> > > +
> > > +%prep
> > > +%setup -q
> > > +
> > > +%build
> > > +%configure
> > > +make
> > > +
> > > +%install
> > > +rm -rf $RPM_BUILD_ROOT
> > > +make DESTDIR=${RPM_BUILD_ROOT} install
> > > +# remove unpackaged files from the buildroot
> > > +rm -f $RPM_BUILD_ROOT%{_libdir}/*.la
> > > +
> > > +%clean
> > > +rm -rf $RPM_BUILD_ROOT
> > > +
> > > +%files
> > > +%defattr(-,root,root)
> > > +%{_bindir}/mstmread
> > > +%{_bindir}/mstmwrite
> > > +%{_bindir}/mstflint
> > > +%{_bindir}/mstregdump
> > > +%{_bindir}/mstvpd
> > > +
> > > +%changelog
> > > +* Fri Dec 07 2007 Ira Weiny <weiny2 at llnl.gov> 1.0.0
> > > +   initial creation
> > > +
> > > --
> > > 1.5.1
> > >
> > >
> > >
> > >
> >
>

From kliteyn at dev.mellanox.co.il  Wed Dec 19 03:12:56 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Wed, 19 Dec 2007 13:12:56 +0200
Subject: [ofa-general] [PATCH] ibutils: pkey test - removing trailing blanks
Message-ID: <4768FCB8.7020700@dev.mellanox.co.il>


Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 ibmgtsim/tests/pkey.check.tcl |   66 ++++++++++++++++++++--------------------
 ibmgtsim/tests/pkey.sim.tcl   |   40 ++++++++++++------------
 2 files changed, 53 insertions(+), 53 deletions(-)

diff --git a/ibmgtsim/tests/pkey.check.tcl b/ibmgtsim/tests/pkey.check.tcl
index 7bc5a9d..7083808 100644
--- a/ibmgtsim/tests/pkey.check.tcl
+++ b/ibmgtsim/tests/pkey.check.tcl
@@ -11,8 +11,8 @@ proc parseNodePortGroup {simDir} {
    return $res
 }

-# given the node port group defined by the sim flow
-# setup the partitions policy file for the SM
+# given the node port group defined by the sim flow
+# setup the partitions policy file for the SM
 proc setupPartitionPolicyFile {fileName} {
    global nodePortGroupList
 	for {set g 1} {$g <= 3} {incr g} {
@@ -34,7 +34,7 @@ proc setupPartitionPolicyFile {fileName} {
             set GROUP_PKEYS($grp) [lindex $png 4]
          } elseif {$grp == 3} {
             # group 3 ports are members of both other groups
-            lappend guids [lindex $png 3]
+            lappend guids [lindex $png 3]
          }
       }

@@ -58,8 +58,8 @@ proc setupPartitionPolicyFile {fileName} {
    close $f
 }

-# validate osmtest.dat versus the list of node port group
-# group 1 must only have info regarding nodes of group1
+# validate osmtest.dat versus the list of node port group
+# group 1 must only have info regarding nodes of group1
 # group 2 is similar
 # group 3 see all others
 #
@@ -71,7 +71,7 @@ proc validateInventoryVsGroup {simDir group nodePortGroupList} {
    set GUIDS(1) {}
    set GUIDS(2) {}
    set GUIDS(3) {}
-
+
    foreach npg $nodePortGroupList {
       set g [lindex $npg 2]
       set guid [lindex $npg 3]
@@ -120,9 +120,9 @@ proc validateInventoryVsGroup {simDir group nodePortGroupList} {
          if {[lindex $sLine 0] == "DEFINE_NODE"} {
             set state node
          } elseif {[lindex $sLine 0] == "DEFINE_PORT"} {
-            set state port
+            set state port
          } elseif {[lindex $sLine 0] == "DEFINE_PATH"} {
-            set state path
+            set state path
          }
       } elseif {$state == "node"} {
          set field [lindex $sLine 0]
@@ -132,7 +132,7 @@ proc validateInventoryVsGroup {simDir group nodePortGroupList} {
             set GUID_BY_LID($lid) $guid

             # now we can check if the guid is expected or not
-            if {[info exist DISALLOWED($guid)]} {
+            if {[info exist DISALLOWED($guid)]} {
                puts "-E- Got disallowed guid:$guid of group $DISALLOWED($guid)"
                incr errCnt
             } else {
@@ -147,16 +147,16 @@ proc validateInventoryVsGroup {simDir group nodePortGroupList} {
             set state none
          }
       } elseif {$state == "port"} {
-         # we only care about lid line
+         # we only care about lid line
          set field [lindex $sLine 0]
          if {$field == "base_lid"} {
             set lid [lindex $sLine 1]
             # ignore lid 0x0 on physp of switches...
             if {$lid != "0x0"} {
                set guid $GUID_BY_LID($lid)
-
+
                # now we can check if the guid is expected or not
-               if {[info exist DISALLOWED($guid)]} {
+               if {[info exist DISALLOWED($guid)]} {
                   puts "-E- Got disallowed guid:$guid of group $DISALLOWED($guid)"
                   incr errCnt
                } else {
@@ -174,14 +174,14 @@ proc validateInventoryVsGroup {simDir group nodePortGroupList} {
          set field [lindex $sLine 0]
          if {$field == "sgid"} {
             set sguid [lindex $sLine 2]
-            if {[info exist DISALLOWED($sguid)]} {
+            if {[info exist DISALLOWED($sguid)]} {
                puts "-E- Got disallowed path from guid:$sguid of group $DISALLOWED($sguid)"
                incr errCnt
             } else {

                # the path is allowed only if the ends are allowed to
                # see wach other - catch cases where they are not:
-               if {[info exist GUID_GRP($sguid)] &&
+               if {[info exist GUID_GRP($sguid)] &&
                    [info exist GUID_GRP($dguid)] &&
                    ((($GUID_GRP($sguid) == 1) && ($GUID_GRP($dguid) == 2)) ||
                     (($GUID_GRP($sguid) == 2) && ($GUID_GRP($dguid) == 1)))} {
@@ -196,10 +196,10 @@ proc validateInventoryVsGroup {simDir group nodePortGroupList} {
             }
          } elseif {$field == "dgid"} {
             set dguid [lindex $sLine 2]
-            if {[info exist DISALLOWED($dguid)]} {
+            if {[info exist DISALLOWED($dguid)]} {
                puts "-E- Got disallowed path to guid:$dguid of group $DISALLOWED($dguid)"
                incr errCnt
-            }
+            }
          } elseif {$field == "END"} {
             set state none
          }
@@ -209,7 +209,7 @@ proc validateInventoryVsGroup {simDir group nodePortGroupList} {
    foreach sguid [array names REQUIRED] {
       if {$REQUIRED($sguid) != 2} {
          puts "-E- Missing port or node for guid $sguid"
-         incr errCnt
+         incr errCnt
       }

       foreach dguid [array names REQUIRED] {
@@ -231,12 +231,12 @@ proc validateInventoryVsGroup {simDir group nodePortGroupList} {
 }

 ##############################################################################
-#
+#
 # Start up the test applications
 # This is the default flow that will start OpenSM only in 0x43 verbosity
 # Return a list of process ids it started (to be killed on exit)
 #
-proc runner {simDir osmPath osmPortGuid} {
+proc runner {simDir osmPath osmPortGuid} {
    global simCtrlSock
    global env
    global nodePortGroupList
@@ -251,7 +251,7 @@ proc runner {simDir osmPath osmPortGuid} {
    puts "SIM: [gets $simCtrlSock]"
    puts $simCtrlSock "dumpHcaPKeyGroupFile $simDir"
    puts "SIM: [gets $simCtrlSock]"
-
+
    # parse the node/port/pkey_group file from the sim dir:
    set nodePortGroupList [parseNodePortGroup $simDir]

@@ -264,10 +264,10 @@ proc runner {simDir osmPath osmPortGuid} {
    set osmCmd "$osmPath -P$partitionPolicyFile -D 0x3 -d2 -t 8000 -f $osmLog -g $osmPortGuid"
    puts "-I- Starting: $osmCmd"
    set osmPid [eval "exec $osmCmd > $osmStdOutLog &"]
-
+
    # start a tracker on the log file and process:
    startOsmLogAnalyzer $osmLog
-
+
    return $osmPid
 }

@@ -292,10 +292,10 @@ proc checker {simDir osmPath osmPortGuid} {
    }

    # randomly sellect several nodes and create inventory by running osmtest
-   # on them - then check only valid entries were reported
+   # on them - then check only valid entries were reported
    for {set i 0 } {$i < 5} {incr i} {
-
-      # decide which will be the node name we will use
+
+      # decide which will be the node name we will use
       set r [expr int([rmRand]*[llength $nodePortGroupList])]
       set nodeNPortNGroup [lindex $nodePortGroupList $r]
       set nodeName [lindex $nodeNPortNGroup 0]
@@ -308,10 +308,10 @@ proc checker {simDir osmPath osmPortGuid} {

       set osmTestCmd1 "$osmTestPath -t 8000 -g $portGuid -l $osmTestLog -f c -i $osmTestInventory"
       puts "-I- Invoking: $osmTestCmd1 ..."
-
+
       # HACK: we currently ignore osmtest craches on exit flow:
       catch {set res [eval "exec $osmTestCmd1 >& $osmTestStdOutLog"]}
-
+
       if {[catch {exec grep "OSMTEST: TEST \"Create Inventory\" PASS" $osmTestStdOutLog}]} {
          puts "-E- osmtest Create Inventory failed"
          return 1
@@ -325,7 +325,7 @@ proc checker {simDir osmPath osmPortGuid} {
    }

    ###### Verifing the pkey manager behaviour ################
-
+
    # Remove the default pkey from the HCA ports (except the SM)
 	# HACK: for now the SM does not refresh PKey tables no matter what...
 	if {0} {
@@ -338,12 +338,12 @@ proc checker {simDir osmPath osmPortGuid} {
    puts $simCtrlSock "verifyCorrectPKeyIndexForAllHcaPorts \$fabric"
    set res [gets $simCtrlSock]
    puts "SIM: $res"
-
+
    if {$res != 0} {
       puts "-E- $res ports have miss-placed pkeys"
 		return 1
    }
-	
+
    #Inject a changebit - to force heavy sweep
    puts $simCtrlSock "setOneSwitchChangeBit \$fabric"
    puts "SIM: [gets $simCtrlSock]"
@@ -352,16 +352,16 @@ proc checker {simDir osmPath osmPortGuid} {
    if {[osmWaitForUpOrDead $osmLog 1]} {
       return 1
    }
-
+
    # wait 3 seconds
-   after 1000
+   after 1000

    #Verify that the default port is in the PKey table of all ports
    puts "-I- Calling simulator to verify all HCA ports have either 0x7fff or 0xffff"
    puts $simCtrlSock "verifyDefaultPKeyForAllHcaPorts \$fabric"
    set res [gets $simCtrlSock]
    puts "SIM: $res"
-
+
    if {$res == 0} {
       puts "-I- Pkey check flow completed successfuly"
    } else {
diff --git a/ibmgtsim/tests/pkey.sim.tcl b/ibmgtsim/tests/pkey.sim.tcl
index 6a1b460..69689cc 100644
--- a/ibmgtsim/tests/pkey.sim.tcl
+++ b/ibmgtsim/tests/pkey.sim.tcl
@@ -4,12 +4,12 @@ puts "Running Simulation flow for PKey test"
 # Group 1 : .. 0x81
 # Group 2 : ........ 0x82 ...
 # Group 3 : ... 0x82 ... 0x81 ...
-#
+#
 # So osmtest run from nodes of group1 should only see group1
 # Group2 should only see group 2 and group 3 should see all.

-# to prevent the case where randomized pkeys match (on ports
-# from different group we only randomize partial membership
+# to prevent the case where randomized pkeys match (on ports
+# from different group we only randomize partial membership
 # pkeys (while the group pkeys are full)

 # In order to prevent cases where partial Pkey matches Full Pkey
@@ -32,10 +32,10 @@ proc getPartialMemberPkeysWithGivenPkey {numPkeys pkeys} {
    # also select an index for each of the given pkeys and
    # replace the random pkey with the given one

-
+
    # flat pkey list (no blocks)
    set res {}
-
+
    # init both lists
    for {set i 0} {$i < $numPkeys - [llength $pkeys] } {incr i} {
       lappend res [getPartialMemberPkey]
@@ -74,7 +74,7 @@ proc getPartialMemberPkeysWithGivenPkey {numPkeys pkeys} {
 proc getPkeyBlocks {pkeys} {
    set blocks {}
    set extra {0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0}
-
+
    set nKeys [llength $pkeys]
    while {$nKeys} {
       if {$nKeys < 32} {
@@ -94,7 +94,7 @@ proc getAllActiveHCAPorts {fabric} {
    # go over all nodes:
    foreach nodeNameId [IBFabric_NodeByName_get $fabric] {
       set node [lindex $nodeNameId 1]
-
+
       # we do care about non switches only
       if {[IBNode_type_get $node] != 1} {
          # go over all ports:
@@ -113,7 +113,7 @@ proc getAllActiveHCAPorts {fabric} {
 # then randomly set the active HCA ports PKey tables
 # Note that the H-1/P1 has to have a slightly different PKey table
 # with 0xffff such that all nodes can query the SA:
-# we track the assignments in the arrays:
+# we track the assignments in the arrays:
 # PORT_PKEY_GROUP(port) -> group
 # PORT_GROUP_PKEY_IDX(port) -> index of pkey (if set or -1)
 proc setAllHcaPortsPKeyTable {fabric} {
@@ -127,7 +127,7 @@ proc setAllHcaPortsPKeyTable {fabric} {
    set G1 [list $pkey1 $pkey3]
    set G2 [list $pkey2 $pkey3]
    set G3 [list $pkey1 $pkey2 $pkey3]
-
+
    set GROUP_PKEY(1) $pkey1
    set GROUP_PKEY(2) $pkey2
    set GROUP_PKEY(3) $pkey3
@@ -185,10 +185,10 @@ proc setAllHcaPortsPKeyTable {fabric} {

       set pkeys [getPartialMemberPkeysWithGivenPkey $nPkeys $group]
       set blocks [getPkeyBlocks $pkeys]
-
+
 		# we track the pkey index of the assigned pkey (or -1)
 		set PORT_GROUP_PKEY_IDX($port) [lsearch $pkeys $pkey]
-		
+
       set blockNum 0
       foreach block $blocks {
          # now set the PKey tables
@@ -196,7 +196,7 @@ proc setAllHcaPortsPKeyTable {fabric} {
          IBMSNode_setPKeyTblBlock sim$node $portNum $blockNum $block
          incr blockNum
       }
-   }
+   }
    # all HCA active ports
    return "Set PKeys on [array size PORT_PKEY_GROUP] ports"
 }
@@ -234,10 +234,10 @@ proc removeDefaultPKeyFromTableForHcaPorts {fabric} {
    return "Remove Default PKey from HCA ports"
 }

-# Verify correct PKey index is used
+# Verify correct PKey index is used
 proc verifyCorrectPKeyIndexForAllHcaPorts {fabric} {
    global PORT_PKEY_GROUP PORT_GROUP_PKEY_IDX GROUP_PKEY
-   set hcaPorts [getAllActiveHCAPorts $fabric]
+   set hcaPorts [getAllActiveHCAPorts $fabric]
 	set anyErr 0


@@ -248,7 +248,7 @@ proc verifyCorrectPKeyIndexForAllHcaPorts {fabric} {
       set partcap [ib_node_info_t_partition_cap_get $ni]
 		set grp  $PORT_PKEY_GROUP($port)
 		set pkey $GROUP_PKEY($grp)
-		
+
 		set pkey_idx $PORT_GROUP_PKEY_IDX($port)
 		if {$pkey_idx == -1} {
 			puts "-I- Ignoring non-definitive port [IBPort_getName $port]"
@@ -270,7 +270,7 @@ proc verifyCorrectPKeyIndexForAllHcaPorts {fabric} {
 			puts "    pkeys:$block"
          incr anyErr
       } else {
-         puts "-I- [IBPort_getName $port] found pkey:$pkey at block:$blockIdx idx:$idx "			
+         puts "-I- [IBPort_getName $port] found pkey:$pkey at block:$blockIdx idx:$idx "
 		}
    }
    # all HCA active ports
@@ -280,7 +280,7 @@ proc verifyCorrectPKeyIndexForAllHcaPorts {fabric} {
 # Verify that 0x7fff or 0xffff is in the PKey table for all HCA ports
 proc verifyDefaultPKeyForAllHcaPorts {fabric} {
    global PORT_PKEY_GROUP
-   set hcaPorts [getAllActiveHCAPorts $fabric]
+   set hcaPorts [getAllActiveHCAPorts $fabric]
    foreach port $hcaPorts {
       set portNum [IBPort_num_get $port]
       set node [IBPort_p_node_get $port]
@@ -305,7 +305,7 @@ proc verifyDefaultPKeyForAllHcaPorts {fabric} {
       if {$hasDefaultPKey == 0} {
          puts "-E- Default PKey not found for $node port:$portNum"
          return 1
-      }
+      }
    }
    # all HCA active ports
    return 0
@@ -314,7 +314,7 @@ proc verifyDefaultPKeyForAllHcaPorts {fabric} {
 # dump out the current set of pkey tables:
 proc dumpPKeyTables {fabric} {
 	set f [open "pkeys.txt" w]
-   set hcaPorts [getAllActiveHCAPorts $fabric]
+   set hcaPorts [getAllActiveHCAPorts $fabric]
    foreach port $hcaPorts {
       set portNum [IBPort_num_get $port]
       set node [IBPort_p_node_get $port]
@@ -346,7 +346,7 @@ proc setOneSwitchChangeBit {fabric} {
 				set new [expr ($old & 0xf0) | 0x2]
 				ib_port_info_t_state_info1_set $pi $new
 			}
-			
+
          set swi [IBMSNode_getSwitchInfo sim$node]
          set lifeState [ib_switch_info_t_life_state_get $swi]
          set lifeState [expr ($lifeState & 0xf8) | 4 ]
-- 
1.5.1.4


From kliteyn at dev.mellanox.co.il  Wed Dec 19 03:14:44 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Wed, 19 Dec 2007 13:14:44 +0200
Subject: [ofa-general] [PATCH] ibutils: pkey test - check generated pkeys
Message-ID: <4768FD24.9070005@dev.mellanox.co.il>

Make sure that generated pkeys are not the same.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 ibmgtsim/tests/pkey.sim.tcl |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/ibmgtsim/tests/pkey.sim.tcl b/ibmgtsim/tests/pkey.sim.tcl
index 69689cc..0f90696 100644
--- a/ibmgtsim/tests/pkey.sim.tcl
+++ b/ibmgtsim/tests/pkey.sim.tcl
@@ -120,8 +120,12 @@ proc setAllHcaPortsPKeyTable {fabric} {
    global PORT_PKEY_GROUP PORT_GROUP_PKEY_IDX
    global GROUP_PKEY

+   # while setting pkeys, make sure that they are not equal
    set pkey1 [getFullMemberPkey]
-   set pkey2 [getFullMemberPkey]
+   set pkey2 $pkey1
+   while {$pkey2 == $pkey1} {
+      set pkey2 [getFullMemberPkey]
+   }
    set pkey3 [getPartialMemberPkey]

    set G1 [list $pkey1 $pkey3]
-- 
1.5.1.4
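
For readers who do not follow Tcl, the same draw-until-different idea can be written as a short C sketch. It assumes that a full-member pkey is a random 15-bit value with the membership bit (0x8000) set; `random_full_member_pkey()` and `pick_distinct_pkeys()` are hypothetical helpers and may not match what getFullMemberPkey actually does in the simulator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical generator: random 15-bit base with the full-membership bit set. */
static uint16_t random_full_member_pkey(void)
{
	return (uint16_t)(0x8000u | ((unsigned)rand() & 0x7fffu));
}

/* Pick two full-member pkeys that are guaranteed to differ. */
void pick_distinct_pkeys(uint16_t *pkey1, uint16_t *pkey2)
{
	*pkey1 = random_full_member_pkey();
	do {
		*pkey2 = random_full_member_pkey();	/* redraw on collision */
	} while (*pkey2 == *pkey1);
}
```

A collision is rare with a 15-bit range, so the loop almost always runs once, but without it the test could occasionally give groups 1 and 2 the same pkey.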


From ogerlitz at voltaire.com  Wed Dec 19 03:20:45 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Wed, 19 Dec 2007 13:20:45 +0200
Subject: [ofa-general] Re: some questions on stale connection handling at the
	IB CM
In-Reply-To: <000101c84199$8f333a40$c1d4180a@amr.corp.intel.com>
References: <Pine.LNX.4.64.0712171638420.28805@zuben.voltaire.com>
	<000301c840e7$201af600$9b37170a@amr.corp.intel.com>
	<47678A90.8060804@voltaire.com>
	<000101c84199$8f333a40$c1d4180a@amr.corp.intel.com>
Message-ID: <4768FE8D.5070408@voltaire.com>

Sean Hefty wrote:
>> I don't see what the user can do for the case of the CM detecting a
>> remote qpn match, if they will continue to use the same qpn this will
>> happen in an endless loop, correct?

> I guess so.

So in the case of a lost DREQ etc., in cm_match_req() we will pass the
check for duplicate REQs but get caught by the check for stale
connections, and this can repeat in an endless loop? This seems like a
bug to me.

Can't the CM use the remote QPN database to synthesize a disconnect on 
the stale connection in that case?

>> Is there anything the IB stack can do here to make apps coding simpler?

> Not explicitly.  Although after I thought about it more, I do like the idea of
> using LAP/APR messages as a sort of keep-alive.

Yes, this seems able to solve the keep-alive problem in a generic
fashion for all ULPs using the IB CM. Will you be able to look into this
in the next few weeks or so?

Or.


From vlad at lists.openfabrics.org  Wed Dec 19 03:24:06 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Wed, 19 Dec 2007 03:24:06 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071219-0200 daily build status
Message-ID: <20071219112406.A11E0E60182@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on ia64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.12
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on x86_64 with linux-2.6.15
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.14
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.13
Passed on powerpc with linux-2.6.13
Passed on ia64 with linux-2.6.15
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.16
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.20
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.23
Passed on ppc64 with linux-2.6.13
Passed on powerpc with linux-2.6.12
Passed on x86_64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-53.el5

Failed:


From ogerlitz at voltaire.com  Wed Dec 19 03:33:48 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Wed, 19 Dec 2007 13:33:48 +0200
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <4768289F.6040907@ichips.intel.com>
References: <4767A2CD.8030209@voltaire.com> <4768289F.6040907@ichips.intel.com>
Message-ID: <4769019C.10602@voltaire.com>

Sean Hefty wrote:
> Peer to peer connection was never fully implemented in the ib_cm.  I 
> don't think it would be that hard to implement at that level, and it 
> shouldn't require API changes.

With your comment below that the "CM needs to know the connection model
selected by the app" I am somewhat confused. Reading your other
comments, I see two options here, based on whether the implementation
differentiates between peer-to-peer SIDs and client/server SIDs:

If there's no difference, then in the peer-to-peer model too, the
application must first tell the CM to listen on a SID, and it's up to the
CM to break the symmetry and decide who sends the REP and who ignores
the REQ.

If there is a difference, then peer-to-peer SIDs are in a different domain
than client/server SIDs.

> Support at the rdma_cm level may require an API change.  There's no easy 
> way for the rdma_cm to know if it should invoke the IB peer-to-peer 
> connection model.  I'm not even sure how one peer would know the other 
> peer's port number, unless well known ports are used on both sides.

Why should there be a difference between the rdma-cm and the cm? If
the cm has a model that needs no API change, wouldn't it also apply to
the rdma-cm?

>> Such support would be useful in symmetric schemes such as MPIs that 
>> open connections on demand and more applications where each party can 
>> both accept and initiate connections. For example, I understand that 
>> some work is done now at the open mpi community to use the rdma-cm as 
>> a possible channel for connection establishment.

> I would need to better understand the expected usage model, like how the 
> peers find each other, but this is something that could be added if needed.

I think that in the MPI world each rank gets a SID from the local CM and
the ranks exchange the SIDs out-of-band; then connections are opened. If it's
a connection-on-demand scheme, then whenever the rank process calls
mpi_send() to a peer for which the local MPI library does not have a
connection, it tries to connect. So if this happens "at once" between
some pair of ranks, there should be a way to form one connection out of
these two connection requests. My thinking/motivation is that support for
this scheme should be at the IB stack (cm and rdma-cm) level and not at
the specific MPI implementation level.
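
To make the "one connection out of two requests" idea concrete, here is a
minimal sketch of a deterministic tie-break. It is not how ib_cm or rdma_cm
behave; ConnectRequest, surviving_request and local_role are hypothetical
names, and using the rank as the unique identifier is an assumption (any
value that both sides agree on and that differs between them would do).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectRequest:
    """Hypothetical model of one connection request between two ranks."""
    initiator: int  # unique id (here: rank) of the side that sent the request
    target: int     # unique id of the side the request was sent to


def surviving_request(a: ConnectRequest, b: ConnectRequest) -> ConnectRequest:
    """Given two crossed requests between the same pair, pick the one to keep.

    Both sides apply the same rule (lower initiator id wins), so they reach
    the same decision without exchanging any extra messages.
    """
    crossed = (a.initiator, a.target) == (b.target, b.initiator)
    if not crossed or a.initiator == a.target:
        raise ValueError("requests are not a crossed pair between two peers")
    return a if a.initiator < b.initiator else b


def local_role(my_rank: int, peer_rank: int) -> str:
    """Role this side ends up with once the crossed requests are resolved."""
    mine = ConnectRequest(my_rank, peer_rank)
    theirs = ConnectRequest(peer_rank, my_rank)
    return "active" if surviving_request(mine, theirs) == mine else "passive"


# Ranks 3 and 7 connect to each other at the same time.
print(local_role(3, 7))  # side whose request survives
print(local_role(7, 3))  # side that drops its own request and replies
```

The side that ends up "passive" would discard its own outstanding request and
answer the peer's; where exactly that logic lives (cm, rdma-cm or MPI) is the
open question in this thread.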

Jeff, Jon, any comments?

Or.


From atabachnik.of at gmail.com  Wed Dec 19 03:56:36 2007
From: atabachnik.of at gmail.com (Alex Tabachnik)
Date: Wed, 19 Dec 2007 13:56:36 +0200
Subject: [ofa-general] MTHCA driver from OFED 1.3a package
References: <20071122140554.GB13609@ics.muni.cz><20071127160803.GD4365@ics.muni.cz><474C4B24.4080809@mellanox.co.il>
	<200712182115.44085.dboehme@cs.uni-potsdam.de>
Message-ID: <005101c84236$365cae90$2d0519ac@voltaire.com>


----- Original Message ----- 
From: "David Boehme" <dboehme at cs.uni-potsdam.de>
To: <general at lists.openfabrics.org>
Sent: Tuesday, December 18, 2007 10:15 PM
Subject: Re: [ofa-general] MTHCA driver from OFED 1.3a package


> On Tuesday 27 November 2007, Tziporet Koren wrote:
>> Lukas Hejtmanek wrote:
>> > Hello,
>> >
>> > just found, that OFED 1.3a with 2.6.23 kernel runs at 2/3 speed
>> > compared to 2.6.23 kernel with built in driver. Any reason for this?
>>
>> Which benchmark?
>> Which HCA?
>> Is it the same with ofed beta release?
>
> Hi,
>
> I'd like to note that this performance problem still seems to exist in
> OFED 1.3rc1.
>
> On our testing environment (2 Hosts, Xeon 3040, Mellanox MT25208 HCA) with
> OFED 1.3rc1 on Fedora 8/Kernel 2.6.23.9-rc1, ib_rdma_bw yields a bandwidth
> of 658 MB/sec with a one megabyte message size. Other benchmarks show
> similar results.
>
> David
>
Hi,
I've checked this issue on a system with 4 x Intel(R) Xeon(R) CPU 5130 @ 2.00GHz and:
- SLES10
- MT25208 HCA
- FW 4.8.200
- OFED 1.3-20071211-0600
and I got 935.68 MBps.
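
As a rough cross-check, the two figures quoted in this thread can be put side
by side. This is only indicative: the two measurements come from different
hosts, distributions and OFED builds, and the benchmark behind the second
figure is not named above. relative_bandwidth is a hypothetical helper.

```python
def relative_bandwidth(measured_mb_s: float, reference_mb_s: float) -> float:
    """Return measured bandwidth as a fraction of a reference bandwidth."""
    if reference_mb_s <= 0:
        raise ValueError("reference bandwidth must be positive")
    return measured_mb_s / reference_mb_s


# 658 MB/sec: David's ib_rdma_bw result with OFED 1.3rc1.
# 935.68 MBps: the result reported in this message.
ratio = relative_bandwidth(658.0, 935.68)
print(f"{ratio:.2f}")  # prints 0.70
```

A ratio of about 0.70 is in the same range as the "2/3 speed" mentioned in the
original report.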

Alex

> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit 
> http://openib.org/mailman/listinfo/openib-general 


From amar.mudrankit at gmail.com  Wed Dec 19 05:02:35 2007
From: amar.mudrankit at gmail.com (Amar Mudrankit)
Date: Wed, 19 Dec 2007 18:32:35 +0530
Subject: [ofa-general] SA: Retrieving multiple paths between SGID and DGID
Message-ID: <c8028d330712190502q21a867abr6ccdd6e26cabec5c@mail.gmail.com>

Is there any way to retrieve information about all paths between 2 IB hosts,
when their GIDs are available ? For example when OpenSM LMC is set to > 0.

With the ib_sa_path_rec_get() it looks like it can give only one path
between a given Source GID and Destination GID and there is no way to
retrieve multiple paths with a single call to the SA module. Can anybody
please confirm the same?

I experimented with setting values for the following module parameters of
the ib_sa.ko module:

        paths_per_dest = 8  (I have set OpenSM LMC=3)
        lookup_method = 0 (Round robin way to return paths)

in an attempt to retrieve multiple paths. Within a single run of my module,
multiple calls to ib_sa_path_rec_get always returned the same path.
When my module was unloaded and loaded again, it returned a different path
than in the first run. But in this second run, too, successive calls to
ib_sa_path_rec_get kept returning one and the same path. Thus, it looks
like successive calls to ib_sa_path_rec_get do not return a different
path on each call.

If I have to retrieve all 8 paths in a single call, is there any way
to accomplish that?
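
For reference, the relation between LMC and the number of LIDs per port can be
sketched as follows: with LMC = n a port answers to 2^n consecutive LIDs
starting at its base LID, so LMC=3 gives 8 LIDs. lids_for_port is a
hypothetical helper and the base LID 16 is made up for the example.

```python
def lids_for_port(base_lid: int, lmc: int) -> list[int]:
    """List the LIDs a port answers to for a given base LID and LMC.

    LMC is a 3-bit value (0..7); the low `lmc` bits of the base LID are
    zero, and the port owns the 2**lmc LIDs starting at the base LID.
    """
    if not 0 <= lmc <= 7:
        raise ValueError("LMC must be in the range 0..7")
    count = 1 << lmc
    if base_lid % count != 0:
        raise ValueError("base LID must have its low LMC bits clear")
    return list(range(base_lid, base_lid + count))


# With OpenSM LMC=3, each port has 8 LIDs.
dest_lids = lids_for_port(16, 3)
print(len(dest_lids))  # 8
print(dest_lids)       # [16, 17, 18, 19, 20, 21, 22, 23]
```

Each destination LID can correspond to a different path through the fabric,
which is why 8 paths per destination are expected here.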
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071219/407856d9/attachment.html>

From Arkady.Kanevsky at netapp.com  Wed Dec 19 05:15:37 2007
From: Arkady.Kanevsky at netapp.com (Kanevsky, Arkady)
Date: Wed, 19 Dec 2007 08:15:37 -0500
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <4769019C.10602@voltaire.com>
References: <4767A2CD.8030209@voltaire.com> <4768289F.6040907@ichips.intel.com>
	<4769019C.10602@voltaire.com>
Message-ID: <C98692FD98048C41885E0B0FACD9DFB805BD3208@exnane01.hq.netapp.com>

Or,
are you proposing that rdma_cm try to separate 2 cases:
one where each of the 2 sides is trying to set up its own connection to the
other side, vs. one where the 2 sides are trying to set up 1 connection but
each side issues a connection request?
Isn't it easier to handle this in MPI, which has a unique rank, so that only
one side issues a connection request?
Thanks,

Arkady Kanevsky                       email: arkady at netapp.com
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.        Fax: 781-895-1195
Waltham, MA 02451                   central phone: 781-768-5300
 

> -----Original Message-----
> From: Or Gerlitz [mailto:ogerlitz at voltaire.com] 
> Sent: Wednesday, December 19, 2007 6:34 AM
> To: Sean Hefty
> Cc: OpenFabrics General
> Subject: Re: [ofa-general] peer to peer connections support
> 
> Sean Hefty wrote:
> > Peer to peer connection was never fully implemented in the 
> ib_cm.  I 
> > don't think it would be that hard to implement at that 
> level, and it 
> > shouldn't require API changes.
> 
> With you below comment of "CM needs to know the connection 
> model selected by the app" I am somehow confused. With 
> reading your other comments, I see two options here based on 
> whether the implementation differentiate between peer-to-peer 
> SIDs to client/server SIDs:
> 
> if there's no difference, then also in the peer-to-peer 
> model, the application must first tell the CM to listen on a 
> SID and its up to the CM to break the symmetry and decide who 
> sends the REP and who ignores the REQ.
> 
> if there is a diff, then peer-to-peer SIDs are in a different 
> domain then client/server SIDs.
> 
> > Support at the rdma_cm level may require an API change.  There's no 
> > easy way for the rdma_cm to know if it should invoke the IB 
> > peer-to-peer connection model.  I'm not even sure how one 
> peer would 
> > know the other peer's port number, unless well known ports 
> are used on both sides.
> 
> Why there should be a difference between the rdma-cm to the 
> cm? if in the cm you have a model without API change, 
> wouldn't it apply also to the rdma-cm?
> 
> >> Such support would be useful in symmetric schemes such as 
> MPIs that 
> >> open connections on demand and more applications where 
> each party can 
> >> both accept and initiate connections. For example, I 
> understand that 
> >> some work is done now at the open mpi community to use the 
> rdma-cm as 
> >> a possible channel for connection establishment.
> 
> > I would need to better understand the expected usage model, 
> like how 
> > the peers find each other, but this is something that could 
> be added if needed.
> 
> I think that in the MPI world each rank gets a SID from the 
> local CM and they exchange the SIDs out-of-band, then 
> connections are opened. If its a connection-on-demand scheme, 
> then when ever the rank process calls
> mpi_send() to peer for which the local MPI library does not 
> have a connection, it tries to connect. So if this happens 
> "at once" between some pair of ranks, there should be a way 
> to form one connection out of these two connecting requests. 
> My thinking/motivation is that support of this scheme should 
> be in the IB stack (cm and rdma-cm) level and not in the 
> specific MPI implementation level.
> 
> Jeff, Jon, any comments?
> 
> Or.
> 
> 


From moshek at voltaire.com  Wed Dec 19 06:13:55 2007
From: moshek at voltaire.com (Moshe Kazir)
Date: Wed, 19 Dec 2007 16:13:55 +0200
Subject: FW: [Fwd: [ofa-general] [PATCH] mstflint: Convert project
	toautoconf tools.]
In-Reply-To: <85a3349f0712180814q10a7c60ew4fdc7caa6b4420ba@mail.gmail.com>
Message-ID: <39C75744D164D948A170E9792AF8E7CA4D2D01@exil.voltaire.com>

Can you please check whether it works on ppc64 systems before adding it
to the OFED product?
 
Moshe
 
 
____________________________________________________________

Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)

 
Voltaire - The Grid Backbone

 
 www.voltaire.com <http://www.voltaire.com/> 

<mailto:g at voltaire.com> 

  
	-----Original Message-----
	From: general-bounces at lists.openfabrics.org
[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Oren
Kladnitsky
	Sent: Tuesday, December 18, 2007 6:14 PM
	To: vlad at dev.mellanox.co.il; weiny2 at llnl.gov
	Cc: general at lists.openfabrics.org
	Subject: Re: FW: [Fwd: [ofa-general] [PATCH] mstflint: Convert
project toautoconf tools.]
	
	
	 I applied this patch + added mstmcra tool (will replace mread
and mwrite).

	Vlad - Please change installer to use autoconf method and take
spec from this dir.

	Thanks,
	ORen

	 
		---------- Forwarded message ----------
		From: "Ira Weiny" < weiny2 at llnl.gov
<mailto:weiny2 at llnl.gov> >
		To: "openfabrics" <general at lists.openfabrics.org>
		Date: Mon, 10 Dec 2007 23:35:54 +0200
		Subject: [ofa-general] [PATCH] mstflint: Convert project
to autoconf tools. 
		This patch removes the makefile and converts the
mstflint git tree over to
		autoconf tools.  This works great on x86_64 but has not
been tested on other
		arch's.  (Although it is simple enough that I don't see how it
would not work.) 
		
		Thanks,
		Ira
		
		
		>From efb3a07a1f333ea95204d2a2e9462e285e29a65f Mon Sep
17 00:00:00 2001
		From: Ira K. Weiny <weiny2 at llnl.gov>
		Date: Mon, 10 Dec 2007 13:30:22 -0800 
		Subject: [PATCH] Convert project to autoconf tools.
		
		
		Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
		---
		Makefile         |   47
----------------------------------------------- 
		Makefile.am      |   21 +++++++++++++++++++++
		autogen.sh       |   11 +++++++++++
		configure.in     |   22 ++++++++++++++++++++++
		mstflint.spec.in |   45
+++++++++++++++++++++++++++++++++++++++++++++
		5 files changed, 99 insertions(+), 47 deletions(-)
		delete mode 100644 Makefile
		create mode 100644 Makefile.am 
		create mode 100755 autogen.sh
		create mode 100644 configure.in
		create mode 100644 mstflint.spec.in
		
		diff --git a/Makefile b/Makefile 
		deleted file mode 100644
		index 889c97a..0000000
		--- a/Makefile
		+++ /dev/null
		@@ -1,47 +0,0 @@
		-#default options
		-CFLAGS += -O2
		-CFLAGS += -g
		-CFLAGS += -Wall
		-CXXFLAGS += -fno-exceptions
		-CFLAGS += -I.
		-LD=$(CXX)
		-EXTRA_LOADLIBES=-lz
		-LOADLIBES+=${EXTRA_LOADLIBES}
		-
		-all: default
		-bin: mstflint mstmread mstmwrite mstregdump mstvpd
		-
		-default: bin
		-static: bin
		-shared: bin
		-
		-.PHONY: all bin clean static shared default
		-.DELETE_ON_ERROR:
		-
		-default: EXTRA_LOADLIBES="$(shell $(CXX) ${LDFLAGS}
${CFLAGS} ${CXXFLAGS} -print-file-name=libz.a)" "$(shell $(CXX)
${LDFLAGS} ${CFLAGS} ${CXXFLAGS} -print-file-name=libstdc++.a)" 
		-default: LD=$(CC)
		-static: CFLAGS+=-static
		-
		-mstflint: mstflint.o mflash.o
		-       $(LD) ${LDFLAGS} ${CFLAGS} ${CXXFLAGS}
mstflint.o mflash.o -o mstflint ${LOADLIBES}
		-
		-mstflint.o: flint.cpp mflash.h 
		-       $(CXX) ${CFLAGS} ${CXXFLAGS} -c flint.cpp -o
mstflint.o
		-
		-mflash.o: mtcr.h mflash.c mflash.h
		-       $(CC) ${CFLAGS} -c mflash.c -o mflash.o
		-
		-mstmwrite: mwrite.c mtcr.h
		-       $(CC) ${CFLAGS} mwrite.c -o mstmwrite
		-
		-mstmread: mread.c mtcr.h
		-       $(CC) ${CFLAGS} mread.c -o mstmread
		-
		-mstregdump: mstdump.c mtcr.h
		-       $(CC) ${CFLAGS} mstdump.c -o mstregdump
		-
		-mstvpd: vpd.c
		-       $(CC) ${CFLAGS} vpd.c -o mstvpd
		-
		-clean:
		-       rm -f mstvpd mstregdump mstflint mstmread
mstmwrite mstflint.o mflash.o
		diff --git a/Makefile.am b/Makefile.am
		new file mode 100644
		index 0000000..f642d9d
		--- /dev/null 
		+++ b/Makefile.am
		@@ -0,0 +1,21 @@
		+bin_PROGRAMS = mstmread \
		+                                       mstmwrite \
		+                                       mstflint \
		+                                       mstregdump \ 
		+                                       mstvpd
		+
		+mstmread_SOURCES = mread.c mtcr.h
		+
		+mstmwrite_SOURCES = mwrite.c mtcr.h
		+
		+mstflint_SOURCES = flint.cpp mtcr.h mflash.h mflash.c
		+mstflint_LDFLAGS = -lz 
		+
		+mstregdump_SOURCES = mstdump.c mtcr.h
		+
		+mstvpd_SOURCES = vpd.c
		+
		+
		+EXTRA_DIST = \
		+       mstflint.spec
		+
		diff --git a/autogen.sh b/autogen.sh
		new file mode 100755
		index 0000000..4827884 
		--- /dev/null
		+++ b/autogen.sh
		@@ -0,0 +1,11 @@
		+#! /bin/sh
		+
		+# create config dir if not exist
		+test -d config || mkdir config
		+
		+set -x
		+aclocal -I config
		+libtoolize --force --copy
		+autoheader
		+automake --foreign --add-missing --copy
		+autoconf
		diff --git a/configure.in b/configure.in
		new file mode 100644
		index 0000000..0924d65
		--- /dev/null
		+++ b/configure.in
		@@ -0,0 +1,22 @@ 
		+dnl Process this file with autoconf to produce a
configure script.
		+
		+AC_INIT(mstflint)
		+
		+AC_DEFINE_UNQUOTED([PROJECT], ["mstflint"], [Define the
project name.])
		+AC_SUBST([PROJECT])
		+
		+AC_DEFINE_UNQUOTED([VERSION], ["1.3"], [Define the
project version.])
		+AC_SUBST([VERSION])
		+
		+AC_CONFIG_AUX_DIR(config)
		+AC_CONFIG_SRCDIR([README])
		+AM_INIT_AUTOMAKE(mstflint, 1.3)
		+
		+dnl Checks for programs 
		+AC_PROG_CC
		+AC_PROG_CXX
		+AC_PROG_LIBTOOL
		+AC_CONFIG_HEADERS
		+
		+AC_CONFIG_FILES([Makefile mstflint.spec])
		+AC_OUTPUT
		diff --git a/mstflint.spec.in b/mstflint.spec.in
		new file mode 100644
		index 0000000..b5937be 
		--- /dev/null
		+++ b/mstflint.spec.in
		@@ -0,0 +1,45 @@
		+Summary: Mellanox firmware burning application
		+Name: mstflint
		+Version: @VERSION@
		+Release: 1
		+License: GPL/BSD
		+Url: http://openib.org/
		+Group: System Environment/Base
		+BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}
		+Source: mstflint-@VERSION@.tar.gz
		+ExclusiveArch: i386 x86_64 ia64 ppc ppc64
		+BuildRequires: zlib-devel 
		+Requires(post): chkconfig
		+
		+%description
		+This package contains a tool for burning updated
firmware on to
		+Mellanox manufactured InfiniBand adapters.
		+
		+%prep
		+%setup -q
		+
		+%build
		+%configure 
		+make
		+
		+%install
		+rm -rf $RPM_BUILD_ROOT
		+make DESTDIR=${RPM_BUILD_ROOT} install
		+# remove unpackaged files from the buildroot
		+rm -f $RPM_BUILD_ROOT%{_libdir}/*.la
		+
		+%clean
		+rm -rf $RPM_BUILD_ROOT 
		+
		+%files
		+%defattr(-,root,root)
		+%{_bindir}/mstmread
		+%{_bindir}/mstmwrite
		+%{_bindir}/mstflint
		+%{_bindir}/mstregdump
		+%{_bindir}/mstvpd
		+
		+%changelog
		+* Fri Dec 07 2007 Ira Weiny < weiny2 at llnl.gov
<mailto:weiny2 at llnl.gov> > 1.0.0
		+   initial creation
		+
		--
		1.5.1
		
		
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071219/050da5d9/attachment.html>

From vlad at dev.mellanox.co.il  Wed Dec 19 07:14:12 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Wed, 19 Dec 2007 17:14:12 +0200
Subject: [ofa-general] Re: [ewg] ofed-1.3-rc1 - RH 4 U 6 bacport files
In-Reply-To: <39C75744D164D948A170E9792AF8E7CA4D2CEC@exil.voltaire.com>
References: <39C75744D164D948A170E9792AF8E7CA4D2CEC@exil.voltaire.com>
Message-ID: <47693544.80809@dev.mellanox.co.il>

Moshe Kazir wrote:
> Back porting is very simple.
> 
> All I did was copy the 2.6.9_U5 directory to 2.6.9_U6 and change
> ofed_scripts/ofed_patch.sh .
> 
> I chose to copy the kernel_addons and kernel_patches directories to
> enable fixes in the future
> if we find that U6 needs something more.
> 
> The attached files do the work.
> 
> Moshe
> 
> ____________________________________________________________
> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>  
> Voltaire - The Grid Backbone
>  
>  www.voltaire.com
> 

Added RHEL4.0U6 support to the OFED-1.3.

Thanks,
Vladimir


From hrosenstock at xsigo.com  Wed Dec 19 07:35:51 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 19 Dec 2007 07:35:51 -0800
Subject: [ofa-general] SA: Retrieving multiple paths between SGID and DGID
In-Reply-To: <c8028d330712190502q21a867abr6ccdd6e26cabec5c@mail.gmail.com>
References: <c8028d330712190502q21a867abr6ccdd6e26cabec5c@mail.gmail.com>
Message-ID: <1198078551.23465.458.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-19 at 18:32 +0530, Amar Mudrankit wrote:
> Is there any way to retrieve information about all paths between 2 IB
> hosts, when their GIDs are available ? For example when OpenSM LMC is
> set to > 0.
> 
> With the ib_sa_path_rec_get() it looks like it can give only one path
> between a given Source GID and Destination GID 

That is true as it uses SubAdmGet.

> and there is no way to 
> retrieve multiple paths with a single call to the SA module. Can
> anybody please confirm the same?
> I experimented with setting values for module parameters of ib_sa.ko
> module namely :-
> 
>         paths_per_dest = 8  (I have set OpenSM LMC=3) 
>         lookup_method = 0 (Round robin way to return paths)
> 
> in an intent to retrieve multiple paths. On a single run of my module,
> it always returned me the same path on multiple calls to
> ib_sa_path_rec_get. When my  module was unloaded and loaded again, it
> returned me different path than first run. But, in this second run on
> successive calls to ib_sa_path_rec_get, it again kept returning me the
> same path as obtained on the first run. Thus, it looks like successive
> calls to ib_sa_path_rec_get is not returning me different path on each
> call. 
> 
> If I have to retrieve all 8 paths in a single call, is there any way
> that I can accomplish the same?


Multiple paths are retrieved from the SA via SubAdmGetTable rather than
SubAdmGet which is used by local_sa and it sounds like you have Sean's
local_sa module so you should be using a different API (in
include/rdma/ib_local_sa.h) and iterating over the paths.
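
The difference between the two query methods can be pictured with a toy
model. This is not the ib_sa or ib_local_sa API: ToySA, PathRecord, get and
get_table are hypothetical names, and the GIDs and LIDs are made up. The
single-record query is modelled as returning one matching record, the table
query as returning all of them for the caller to iterate over.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathRecord:
    """Hypothetical, heavily reduced path record."""
    sgid: str
    dgid: str
    slid: int
    dlid: int


class ToySA:
    """Toy stand-in for an SA that holds a list of path records."""

    def __init__(self, records):
        self._records = list(records)

    def _matching(self, sgid: str, dgid: str) -> list[PathRecord]:
        return [r for r in self._records if r.sgid == sgid and r.dgid == dgid]

    def get(self, sgid: str, dgid: str) -> PathRecord:
        """Models a SubAdmGet-style query: one record comes back."""
        matches = self._matching(sgid, dgid)
        if not matches:
            raise LookupError("no path record for this SGID/DGID pair")
        return matches[0]

    def get_table(self, sgid: str, dgid: str) -> list[PathRecord]:
        """Models a SubAdmGetTable-style query: all matching records."""
        return self._matching(sgid, dgid)


# Eight paths to the same destination port, as with LMC=3.
sa = ToySA(PathRecord("fe80::1", "fe80::2", 16, 32 + i) for i in range(8))

single = sa.get("fe80::1", "fe80::2")
table = sa.get_table("fe80::1", "fe80::2")
print(single.dlid)              # 32
print([r.dlid for r in table])  # [32, 33, 34, 35, 36, 37, 38, 39]
```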

-- Hal



From amar.mudrankit at gmail.com  Wed Dec 19 08:14:26 2007
From: amar.mudrankit at gmail.com (Amar Mudrankit)
Date: Wed, 19 Dec 2007 21:44:26 +0530
Subject: [ofa-general] SA: Retrieving multiple paths between SGID and DGID
In-Reply-To: <1198078551.23465.458.camel@hrosenstock-ws.xsigo.com>
References: <c8028d330712190502q21a867abr6ccdd6e26cabec5c@mail.gmail.com>
	<1198078551.23465.458.camel@hrosenstock-ws.xsigo.com>
Message-ID: <c8028d330712190814n6309570n2fbc26496115669c@mail.gmail.com>

On Dec 19, 2007 9:05 PM, Hal Rosenstock <hrosenstock at xsigo.com> wrote:
>
> On Wed, 2007-12-19 at 18:32 +0530, Amar Mudrankit wrote:
> > If I have to retrieve all 8 paths in a single call, is there any way
> > that I can accomplish the same?
>
>
> Multiple paths are retrieved from the SA via SubAdmGetTable rather than
> SubAdmGet which is used by local_sa and it sounds like you have Sean's
> local_sa module so you should be using a different API (in
> include/rdma/ib_local_sa.h) and iterating over the paths.

Yes, I am using OFED-1.2.5.1 on RHEL 5, which applies patches to add
Sean's local_sa module.

But I am not able to locate any file named ib_local_sa.h under
include/rdma in OFED-1.2.5.1 (or in OFED-1.3-rc1). Can you please
point me to the API that you mentioned above? I am not able to find
any API that would allow me to iterate through multiple paths between
a given SGID and DGID.


From vlad at dev.mellanox.co.il  Wed Dec 19 08:19:20 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Wed, 19 Dec 2007 18:19:20 +0200
Subject: [ofa-general] ofed-1.2.5 - RH 4 U 6 bacport files
In-Reply-To: <39C75744D164D948A170E9792AF8E7CA4D2CED@exil.voltaire.com>
References: <39C75744D164D948A170E9792AF8E7CA4D2CED@exil.voltaire.com>
Message-ID: <47694488.1050405@dev.mellanox.co.il>

Moshe Kazir wrote:
> Back porting is very simple.
> 
> All I did was copy the 2.6.9_U5 directory to 2.6.9_U6 and change
> ofed_scripts/ofed_patch.sh .
> 
> I chose to copy the kernel_addons and kernel_patches directories to
> enable fixes in the future
> if we find that U6 needs something more.
> 
> The attached files do the work.
> 
> Moshe
> 
> ____________________________________________________________
> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>  
> Voltaire - The Grid Backbone
>  
>  www.voltaire.com
> 
>   

Added to ofed_1_2/linux-2.6.git ofed_1_2_c

Thanks,
Vladimir


From hrosenstock at xsigo.com  Wed Dec 19 08:30:33 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 19 Dec 2007 08:30:33 -0800
Subject: [ofa-general] SA: Retrieving multiple paths between SGID and DGID
In-Reply-To: <c8028d330712190814n6309570n2fbc26496115669c@mail.gmail.com>
References: <c8028d330712190502q21a867abr6ccdd6e26cabec5c@mail.gmail.com>
	<1198078551.23465.458.camel@hrosenstock-ws.xsigo.com>
	<c8028d330712190814n6309570n2fbc26496115669c@mail.gmail.com>
Message-ID: <1198081833.23465.474.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-19 at 21:44 +0530, Amar Mudrankit wrote:
> On Dec 19, 2007 9:05 PM, Hal Rosenstock <hrosenstock at xsigo.com> wrote:
> >
> > On Wed, 2007-12-19 at 18:32 +0530, Amar Mudrankit wrote:
> > > If I have to retrieve all 8 paths in a single call, is there any way
> > > that I can accomplish the same?
> >
> >
> > Multiple paths are retrieved from the SA via SubAdmGetTable rather than
> > SubAdmGet which is used by local_sa and it sounds like you have Sean's
> > local_sa module so you should be using a different API (in
> > include/rdma/ib_local_sa.h) and iterating over the paths.
> 
> Yes, I am using OFED-1.2.5.1 on RHEL 5 which applies patches to have
> Sean's local_sa module.
> 
> But, I am not able to locate any file named ib_local_sa.h under
> include/rdma in OFED-1.2.5.1
> (and also in OFED-1.3-rc1). Can you please point me to the API that
> you mentioned above ?
> I am not able to find  any API which would allow me to iterate through
> multiple paths between given
> SGID and DGID.

If it's not in OFED 1.2.5.x (and I'm not sure whether it is and just
turned off or not there), one can backport it. All the patches were sent
on this list.

-- Hal



From sweitzen at cisco.com  Wed Dec 19 09:13:45 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Wed, 19 Dec 2007 09:13:45 -0800
Subject: [ofa-general] OFED 1.3-rc1 release is available
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C902E2D65D@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C902E2D65D@mtlexch01.mtl.com>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304BCA30B@xmb-sjc-216.amer.cisco.com>

I have added version 1.3rc1 to bugzilla, sorry for the delay.
 
Scott
 

________________________________

	From: general-bounces at lists.openfabrics.org
[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Tziporet
Koren
	Sent: Thursday, December 13, 2007 8:40 AM
	To: ewg at lists.openfabrics.org
	Cc: general at lists.openfabrics.org
	Subject: [ofa-general] OFED 1.3-rc1 release is available
	
	
	Hi, 

	OFED 1.3 RC1 release is available on
	http://www.openfabrics.org/downloads/OFED/ofed-1.3/OFED-1.3-rc1.tgz
	To get BUILD_ID run ofed_info

	Please report any issues in bugzilla: https://bugs.openfabrics.org/

	The RC2 release is expected on December 27 

	Tziporet & Vlad 

	
========================================================================


	Release information: 
	-------------------- 
	OS support: 
	Novell: 
	    - SLES10 
	    - SLES10 SP1 and up1 
	Redhat: 
	    - Redhat EL4 up4 and up5 
	    - Redhat EL5 and up1 
	kernel.org: 
	    - 2.6.23 and 2.6.24-rc2 

	Systems: 
	    * x86_64 
	    * x86 
	    * ia64 
	    * ppc64 

	Main Changes from OFED 1.3-beta 
	=============================== 

	*	Fix SDP stability issues
	*	Force 32bit libraries installation on the SLES10 SP1 U1
	*	Open MPI: Enable compilation when using compilers that were not installed as RPMs.
	*	RDS: clean up handling of congested destinations vs poll
	*	RDS: Fix download issue when removing low level driver (fix was in CMA)
	*	IPoIB: Fix kernel Oops resulting from xmit following dev_down.
	*	MPI packages update:

		*	mvapich-1.0.0-1639.src.rpm
		*	openmpi-1.2.5rc1-1.src.rpm
		*	mpitests-3.0-773.src.rpm
			

	mlx4 specific changes: 

	*	Fix segmentation fault in mlx4_clear_xrc_srq.
	*	Fix max_eq's read from FW in QUERY_DEV_CAP.
	*	Post send in the kernel is now using WQE building block.
	*	Set default CQ moderation parameters for IPoIB
		

	ehca specific changes: 

	*	Fix error of sense context opts with multiple adapter 
	*	Add files for older abi_versions 
		
		
	Tasks that should be completed for RC2 release: 
	=============================================== 
	1. IPoIB performance improvements for small messages 
	2. Fix bugs 


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071219/5e001dc0/attachment.html>

From jeroen.vanaken at intec.ugent.be  Wed Dec 19 06:54:13 2007
From: jeroen.vanaken at intec.ugent.be (Jeroen Van Aken)
Date: Wed, 19 Dec 2007 15:54:13 +0100
Subject: [ofa-general] ***SPAM*** SFS 3012 SRP problem
Message-ID: <040b01c8424f$04e03030$0ea09090$@vanaken@intec.ugent.be>

Hello

 
We are doing some SRP tests with the Cisco SFS 3012 Gateway. We connected 4
hosts to the SFS 3012 gateway, each with 2 InfiniBand cables from one
dual-port InfiniBand card. The gateway is also connected to our Fibre Channel
storage. Each host runs OFED-1.3-beta2. The InfiniBand cards are Mellanox
Technologies MT25208 InfiniHost III Ex (rev a0) and Mellanox Technologies
MT23108 InfiniHost (rev a1).

When generating heavy load over the switch (by reading from our FC storage
over all the LUNs simultaneously), we sometimes get the following errors:

On the hosts: 

 
Dec 13 13:07:54 gpfs4n1 syslog-ng[8212]: STATS: dropped 0

Dec 13 13:20:26 gpfs4n1 run_srp_daemon[8422]: failed srp_daemon:
[HCA=mthca0] [port=1] [exit status=110]. Will try to restart srp_daemon
periodically. No more warnings will be issued in the next 7200 seconds if
the same problem repeats

Dec 13 13:20:27 gpfs4n1 run_srp_daemon[8428]: starting srp_daemon:
[HCA=mthca0] [port=1]

Dec 13 14:01:20 gpfs4n1 sshd[8539]: Accepted keyboard-interactive/pam for
root from 172.16.0.18 port 3545 ssh2

Dec 13 14:07:55 gpfs4n1 syslog-ng[8212]: STATS: dropped 0

Dec 13 14:13:01 gpfs4n1 syslog-ng[8212]: Changing permissions on special
file /dev/xconsole

Dec 13 14:13:01 gpfs4n1 syslog-ng[8212]: Changing permissions on special
file /dev/tty10

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed send status 12

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed send status 12

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

 
On the switch ts_log

**************************************SWITCH
LOG***************************************************************

Dec 13 14:04:30 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
multicast membership change

Dec 13 14:05:49 topspin-cc ib_sm.x[1383]: [INFO]: Session not initiated:
Cold Sync Limit exceeded for Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:07:49 topspin-cc ib_sm.x[1383]: [INFO]: Initialize a backup
session with Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:07:59 topspin-cc ib_sm.x[1383]: [INFO]: Session initialization
failed with Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:09:59 topspin-cc ib_sm.x[1383]: [INFO]: Initialize a backup
session with Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:10:09 topspin-cc ib_sm.x[1383]: [INFO]: Session initialization
failed with Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:12:06 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM OUT_OF_SERVICE
trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:21

Dec 13 14:12:06 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM OUT_OF_SERVICE
trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:22

Dec 13 14:12:06 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
discovering removed ports

Dec 13 14:12:07 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
multicast membership change

Dec 13 14:12:09 topspin-cc ib_sm.x[1383]: [INFO]: Session not initiated:
Cold Sync Limit exceeded for Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:12:18 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for read, err=11, t1=1, t2=0

Dec 13 14:12:22 topspin-cc last message repeated 4 times

Dec 13 14:12:36 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
discovering new ports

Dec 13 14:12:37 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM IN_SERVICE
trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:21

Dec 13 14:12:37 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM IN_SERVICE
trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:22

13 14:12:3

Dec 13 14:12:38 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
multicast membership change

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

Dec 13 14:13:28 topspin-cc chassis_mgr.x[1084]: [WARN]: tsIpcMessageSend
failed, fd=28, vp=2, err=104, Connection reset by peer

Dec 13 14:13:39 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:13:39 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:13:40 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

Dec 13 14:13:46 topspin-cc web_agent.x[1370]: [INFO]: ipc: select(fd=3)
failed for read, err=11, t1=10, t2=0

Dec 13 14:13:50 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

Dec 13 14:13:50 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:13:50 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:14:18 topspin-cc ib_sm.x[1383]: [INFO]: Session not initiated:
Cold Sync Limit exceeded for Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:14:38 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:14:38 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:14:40 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

Dec 13 14:14:49 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:14:50 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:14:50 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

Dec 13 14:15:00 topspin-cc web_agent.x[1370]: [INFO]: ipc: select(fd=3)
failed for read, err=11, t1=10, t2=0

Dec 13 14:15:00 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:15:00 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:15:00 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

 
It looks like some of the log entries are incomplete.

I think it is a switch-related issue: first because of the strange
format of the logs, and second because when this error occurs in the switch,
no SRP communication is possible on any of the IB hosts. I already tried
increasing the Node timeout and setting RENICE_IB_MAD to yes, as described in
this thread:
http://lists.openfabrics.org/pipermail/general/2007-May/036465.html, but
this didn't help.

This issue occurs randomly.  So it isn't easily reproduced.

Does anybody have an idea what went wrong?

 
Thanks in advance!

 
Jeroen Van Aken
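
[Editor's sketch] The numeric codes in the logs above can be read with a small lookup table. This is a minimal sketch under two assumptions that the thread does not confirm: that the ib_srp "status" numbers are values of the kernel's enum ib_wc_status (include/rdma/ib_verbs.h), and that the srp_daemon exit status and the switch "err=" numbers are Linux errno values. The tables are hand-copied excerpts and the helper name decode is made up for this example.

```python
# Excerpt of enum ib_wc_status; values are ASSUMED to match
# include/rdma/ib_verbs.h in the kernel tree of that era.
WC_STATUS = {
    0: "IB_WC_SUCCESS",
    5: "IB_WC_WR_FLUSH_ERR",
    12: "IB_WC_RETRY_EXC_ERR",
    13: "IB_WC_RNR_RETRY_EXC_ERR",
}

# Linux errno values. 11 and 104 agree with the text the switch log
# prints next to them; treating exit status 110 as an errno is an assumption.
LINUX_ERRNO = {
    11: "EAGAIN (Resource temporarily unavailable)",
    104: "ECONNRESET (Connection reset by peer)",
    110: "ETIMEDOUT (Connection timed out)",
}


def decode(table, code):
    """Return the symbolic name for code, or a marker if the excerpt lacks it."""
    return table.get(code, "unknown (%d)" % code)


if __name__ == "__main__":
    print("failed send status 12   ->", decode(WC_STATUS, 12))
    print("failed receive status 5 ->", decode(WC_STATUS, 5))
    print("exit status=110         ->", decode(LINUX_ERRNO, 110))
    print("err=11                  ->", decode(LINUX_ERRNO, 11))
```

If those assumptions hold, "failed send status 12" would mean the transport retry count was exceeded and "failed receive status 5" would mean posted receives were flushed after the QP entered the error state, which is what one would expect when the target stops answering.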

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071219/6e008ff8/attachment.html>

From sweitzen at cisco.com  Wed Dec 19 09:31:39 2007
From: sweitzen at cisco.com (Scott Weitzenkamp (sweitzen))
Date: Wed, 19 Dec 2007 09:31:39 -0800
Subject: [ofa-general] ***SPAM*** SFS 3012 SRP problem
In-Reply-To: <040b01c8424f$04e03030$0ea09090$@vanaken@intec.ugent.be>
References: <040b01c8424f$04e03030$0ea09090$@vanaken@intec.ugent.be>
Message-ID: <A15335FBE9BD2449AF2C9EF3D1EB8EA304BCA334@xmb-sjc-216.amer.cisco.com>

If you have a Cisco support contract, you should open a case with the
Cisco TAC.
 
What kind of FC storage are you using?
 
The chassis syslog messages show the host is unresponsive (the
OUT_OF_SERVICE and IN_SERVICE messages).  Does the timing of these messages
match the ib_srp messages on the host?
 
Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
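
[Editor's sketch] The timing question above can be checked mechanically. The sketch below takes four lines copied from the logs in the original report, parses their syslog timestamps, and prints the gaps. The year 2007 is taken from the mail date because syslog omits it, and the helper names parse_ts and first_ts are made up for this example. It assumes the host and switch clocks are synchronized, which the thread does not state.

```python
from datetime import datetime


def parse_ts(line, year=2007):
    """Parse the 15-character syslog prefix, e.g. 'Dec 13 14:12:06'."""
    return datetime.strptime("%d %s" % (year, line[:15]), "%Y %b %d %H:%M:%S")


def first_ts(lines, needle):
    """Timestamp of the first line containing needle, or None."""
    for line in lines:
        if needle in line:
            return parse_ts(line)
    return None


# Lines copied from the logs in the original report.
switch_log = [
    "Dec 13 14:12:06 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM "
    "OUT_OF_SERVICE trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:21",
    "Dec 13 14:12:37 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM "
    "IN_SERVICE trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:21",
]
host_log = [
    "Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called",
    "Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed send status 12",
]

out_ts = first_ts(switch_log, "OUT_OF_SERVICE")
in_ts = first_ts(switch_log, "IN_SERVICE")
abort_ts = first_ts(host_log, "SRP abort")

outage_s = (in_ts - out_ts).total_seconds()
abort_delay_s = (abort_ts - out_ts).total_seconds()

if __name__ == "__main__":
    print("port out of service for %.0f s" % outage_s)
    print("first SRP abort %.0f s after OUT_OF_SERVICE" % abort_delay_s)
```

With the lines shown, the port is reported out of service for 31 seconds and the first SRP abort is logged 55 seconds after the OUT_OF_SERVICE trap, so the host errors follow the switch event rather than precede it, subject to the clock assumption.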


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071219/141fc508/attachment.html>

From mshefty at ichips.intel.com  Wed Dec 19 09:57:27 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Wed, 19 Dec 2007 09:57:27 -0800
Subject: [ofa-general] Re: some questions on stale connection handling
	at the	IB CM
In-Reply-To: <4768FE8D.5070408@voltaire.com>
References: <Pine.LNX.4.64.0712171638420.28805@zuben.voltaire.com>	<000301c840e7$201af600$9b37170a@amr.corp.intel.com>	<47678A90.8060804@voltaire.com>	<000101c84199$8f333a40$c1d4180a@amr.corp.intel.com>
	<4768FE8D.5070408@voltaire.com>
Message-ID: <47695B87.5000209@ichips.intel.com>

> So in the case of lost DREQ etc, in cm_match_req() we will pass the 
> checking for duplicate REQs but fall in the check for stale connections 
> and it can happen in endless loop? this seems like a bug to me.

This problem isn't limited to stale connections.  If a client tries to 
connect, gets a reject for whatever reason, ignores the reject, then 
tries to reconnect with the same parameters, then they've put themselves 
into an endless loop.

> Yes, this seems to be able to solve the keep-alive thing in a generic 
> fashion for all ULPs using the IB CM, will you be able to look on this 
> during the next weeks or so?

This method can be used by apps today.  The only enhancement that I can 
see being made is having the CM automatically send the messages at 
regular intervals.  But I hesitate to add this to the CM since it 
doesn't have knowledge of traffic occurring over the QP, and may 
interfere with an app wanting to actually change alternate path information.

- Sean
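
[Editor's sketch] The retry loop described above can be modelled in a few lines. This is a toy simulation, not the ib_cm API: FakePassiveSide, connect and the (qpn, service_id) tuple are all hypothetical names chosen for this example. The passive side rejects any REQ whose parameters it still considers connected, which stands in for the stale-connection check.

```python
class FakePassiveSide:
    """Hypothetical stand-in for the remote CM; not the ib_cm API."""

    def __init__(self):
        self.known = set()  # parameters of connections it believes are live

    def handle_req(self, params):
        if params in self.known:
            return "REJ"  # stands in for a stale-connection reject
        self.known.add(params)
        return "REP"


def connect(passive, params, max_attempts=3, on_reject=None):
    """Send REQs until a REP arrives; give up after max_attempts.

    on_reject, if given, maps the rejected parameters to new ones.
    Without it the same REQ is resent and the same reject comes back.
    """
    for attempt in range(1, max_attempts + 1):
        if passive.handle_req(params) == "REP":
            return attempt, params
        if on_reject is not None:
            params = on_reject(params)
    raise ConnectionError("still rejected after %d attempts" % max_attempts)


if __name__ == "__main__":
    passive = FakePassiveSide()
    print(connect(passive, (42, 0x1000)))  # first connection succeeds
    # The DREQ is lost: the passive side still holds (42, 0x1000).
    try:
        connect(passive, (42, 0x1000))  # ignores the reject, retries unchanged
    except ConnectionError as exc:
        print("looping client:", exc)
    print(connect(passive, (42, 0x1000), on_reject=lambda p: (p[0] + 1, p[1])))
```

The only thing that stops the first retry sequence is the attempt limit; a client that ignores the reject and has no such limit would loop forever, which is the behaviour described above for any reject reason, not only stale connections.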


From mshefty at ichips.intel.com  Wed Dec 19 10:16:34 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Wed, 19 Dec 2007 10:16:34 -0800
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <4769019C.10602@voltaire.com>
References: <4767A2CD.8030209@voltaire.com> <4768289F.6040907@ichips.intel.com>
	<4769019C.10602@voltaire.com>
Message-ID: <47696002.4030903@ichips.intel.com>

> With you below comment of "CM needs to know the connection model 
> selected by the app" I am somehow confused. With reading your other 
> comments, I see two options here based on whether the implementation 
> differentiate between peer-to-peer SIDs to client/server SIDs:
> 
> if there's no difference, then also in the peer-to-peer model, the 
> application must first tell the CM to listen on a SID and its up to the 
> CM to break the symmetry and decide who sends the REP and who ignores 
> the REQ.
> 
> if there is a diff, then peer-to-peer SIDs are in a different domain 
> then client/server SIDs.

I didn't follow this.

To add to my comments on the CM API, struct ib_cm_req_param, which is 
used to send the REQ, includes service_id and peer_to_peer fields.  The 
latter is a boolean used by the CM to distinguish if incoming REQs can 
be matched with the outgoing REQ.

Peer to peer SIDs are in a different domain than client/server SIDs, and 
the peer_to_peer field is used to indicate which domain a SID is in.

> Why there should be a difference between the rdma-cm to the cm? if in 
> the cm you have a model without API change, wouldn't it apply also to 
> the rdma-cm?

The rdma_cm does not know how to set the peer_to_peer field in the 
ib_cm_req_param.  It sets this field to 0 today.

> I think that in the MPI world each rank gets a SID from the local CM and 
> they exchange the SIDs out-of-band, then connections are opened. If its 
> a connection-on-demand scheme, then when ever the rank process calls 
> mpi_send() to peer for which the local MPI library does not have a 
> connection, it tries to connect. So if this happens "at once" between 
> some pair of ranks, there should be a way to form one connection out of 
> these two connecting requests. My thinking/motivation is that support of 
> this scheme should be in the IB stack (cm and rdma-cm) level and not in 
> the specific MPI implementation level.

Are the out of band connections used by MPI formed using client/server 
or peer to peer?  I believe that Intel MPI has each rank listen for 
connections from the ranks below it using client/server.

There are a couple of problems with the peer to peer model.  First, 
unless the connections occur at exactly the same time, they miss 
connecting (rejected with invalid SID).  Second, if multiple peer to 
peer connections need to form between the same pair of nodes, things can 
go screwy (that's the technical term) trying to match up the peer requests.

- Sean
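
[Editor's sketch] The two SID domains and the timing problem described above can be illustrated with a toy model. Only the field names service_id and peer_to_peer come from the message; ToyCM and its methods are hypothetical and do not reproduce the kernel CM's matching logic.

```python
class ToyCM:
    """Toy model of one node's CM with two separate SID domains."""

    def __init__(self):
        self.listeners = set()     # client/server SIDs being listened on
        self.pending_peer = set()  # SIDs of outstanding peer-to-peer REQs

    def listen(self, service_id):
        self.listeners.add(service_id)

    def send_req(self, service_id, peer_to_peer):
        # Only a peer-to-peer REQ can later be matched by an incoming REQ.
        if peer_to_peer:
            self.pending_peer.add(service_id)

    def req_finished(self, service_id):
        # The outgoing REQ timed out or completed; nothing left to match.
        self.pending_peer.discard(service_id)

    def recv_req(self, service_id, peer_to_peer):
        domain = self.pending_peer if peer_to_peer else self.listeners
        return "match" if service_id in domain else "reject: invalid SID"


if __name__ == "__main__":
    cm = ToyCM()
    cm.listen(0x10)
    print(cm.recv_req(0x10, peer_to_peer=False))  # listener exists
    print(cm.recv_req(0x10, peer_to_peer=True))   # other domain, no match

    cm.send_req(0x20, peer_to_peer=True)
    print(cm.recv_req(0x20, peer_to_peer=True))   # both REQs in flight
    cm.req_finished(0x20)
    print(cm.recv_req(0x20, peer_to_peer=True))   # arrived too late
```

In this model a peer-to-peer REQ only matches while the local side has its own REQ for the same SID outstanding, which is why requests that do not overlap in time miss each other. The model keeps one entry per SID, so it also cannot tell apart several simultaneous peer requests between the same pair of nodes.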


From mshefty at ichips.intel.com  Wed Dec 19 10:28:53 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Wed, 19 Dec 2007 10:28:53 -0800
Subject: [ofa-general] SA: Retrieving multiple paths between SGID and DGID
In-Reply-To: <c8028d330712190502q21a867abr6ccdd6e26cabec5c@mail.gmail.com>
References: <c8028d330712190502q21a867abr6ccdd6e26cabec5c@mail.gmail.com>
Message-ID: <476962E5.9040608@ichips.intel.com>

>         paths_per_dest = 8  (I have set OpenSM LMC=3)
>         lookup_method = 0 (Round robin way to return paths)
> 
> in an intent to retrieve multiple paths. On a single run of my module, 
> it always returned me the same path on multiple calls to 
> ib_sa_path_rec_get. When my  module was unloaded and loaded again, it 
> returned me different path than first run. But, in this second run on 
> successive calls to ib_sa_path_rec_get, it again kept returning me the 
> same path as obtained on the first run. Thus, it looks like successive 
> calls to ib_sa_path_rec_get is not returning me different path on each 
> call.
> 
> If I have to retrieve all 8 paths in a single call, is there any way 
> that I can accomplish the same?

As Hal pointed out, this is using the SA caching.  If you want to obtain 
all paths with a single call, you will need to send your own GetTable 
MAD to the SA using the ib_mad or libibumad interface.

 From your description, either the cache that you're using isn't being 
used or has a bug.  Does this version have a 'refresh' parameter?  If 
so, can you try writing a '1' to that and see if the behavior changes? 
You could also try changing the lookup_method to random (1) to see if 
you see other path records.

- Sean
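
[Editor's sketch] For comparison with the behaviour reported above, this is what a working cache would be expected to do with the two lookup_method settings. It is a toy model, not the local_sa module: ToyPathCache, lids_for_port and the base LID 8 are hypothetical, and a path is represented only by its destination LID. The values lookup_method 0 (round robin) and 1 (random) and LMC=3 come from the thread.

```python
import itertools
import random


def lids_for_port(base_lid, lmc):
    """A port with LID mask control lmc answers to 2**lmc consecutive LIDs."""
    return [base_lid + i for i in range(1 << lmc)]


class ToyPathCache:
    """Hypothetical cache that hands out one cached path per lookup."""

    ROUND_ROBIN = 0
    RANDOM = 1

    def __init__(self, paths, lookup_method=ROUND_ROBIN, seed=None):
        self._paths = list(paths)
        self._cycle = itertools.cycle(self._paths)
        self._rng = random.Random(seed)
        self.lookup_method = lookup_method

    def get(self):
        if self.lookup_method == self.ROUND_ROBIN:
            return next(self._cycle)
        return self._rng.choice(self._paths)


if __name__ == "__main__":
    paths = lids_for_port(base_lid=8, lmc=3)  # 8 LIDs, matching paths_per_dest = 8
    cache = ToyPathCache(paths, lookup_method=ToyPathCache.ROUND_ROBIN)
    print([cache.get() for _ in range(9)])    # 8 distinct LIDs, then wraps around
```

A cache behaving like this would return a different path on each of eight successive lookups. Getting the same path every time, as reported, is consistent with the suggestion above that the cache is either not in use or has a bug.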


From akepner at sgi.com  Wed Dec 19 11:58:39 2007
From: akepner at sgi.com (akepner at sgi.com)
Date: Wed, 19 Dec 2007 11:58:39 -0800
Subject: [ofa-general] smpquery regression in 1.3-rc1
Message-ID: <20071219195839.GS412@sgi.com>


We're seeing a regression in smpquery from alpha2 to rc1. 

For example, with alpha2 I get:
grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
# Node info: Lid 3
BaseVers:........................1
ClassVers:.......................1
NodeType:........................Channel Adapter
NumPorts:........................2
SystemGuid:......................0x00066a009800737c
Guid:............................0x00066a009800737c
PortGuid:........................0x00066a01a000737c
PartCap:.........................64
DevId:...........................0x6278
Revision:........................0x000000a0
LocalPort:.......................2
VendorId:........................0x00066a
grommit:~ # 


And with rc1, I get:
grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
ibwarn: [5650] ib_path_query: sa call path_query failed
smpquery: iberror: failed: can't resolve destination port 0x66a01a000737c
grommit:~ #  

But using a LID works fine:
grommit:~ # smpquery nodeinfo 3
# Node info: Lid 3
BaseVers:........................1
ClassVers:.......................1
NodeType:........................Channel Adapter
NumPorts:........................2
SystemGuid:......................0x00066a009800737c
Guid:............................0x00066a009800737c
PortGuid:........................0x00066a01a000737c
PartCap:.........................64
DevId:...........................0x6278
Revision:........................0x000000a0
LocalPort:.......................2
VendorId:........................0x00066a
grommit:~ # 

Strangest of all, running it under strace also works:
grommit:~ # strace smpquery -G nodeinfo 0x66a01a000737c > /tmp/smpquery.out 
.....
grommit:~ # cat /tmp/smpquery.out
# Node info: Lid 3
BaseVers:........................1
ClassVers:.......................1
NodeType:........................Channel Adapter
NumPorts:........................2
SystemGuid:......................0x00066a009800737c
Guid:............................0x00066a009800737c
PortGuid:........................0x00066a01a000737c
PartCap:.........................64
DevId:...........................0x6278
Revision:........................0x000000a0
LocalPort:.......................2
VendorId:........................0x00066a
grommit:~ #

Some weird race condition...

Anyone else seeing the same?

-- 
Arthur


From changquing.tang at hp.com  Wed Dec 19 12:01:51 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Wed, 19 Dec 2007 20:01:51 +0000
Subject: [ofa-general] XRC cleanup order issue
In-Reply-To: <200712160827.04519.jackm@dev.mellanox.co.il>
References: <D89C2C212795564B837FA1665CAE02990FDE143AE8@G5W0278.americas.hpqcorp.net>
	<200712160827.04519.jackm@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FE215EE5C@G5W0278.americas.hpqcorp.net>


Hi, Jack:

        If I need to use an XRC domain for communication between two ranks on the same node, how do I do it?
The reason I ask is that HP-MPI has a NIC mode in which no shared memory is used.

Thanks.
--CQ

> -----Original Message-----
> From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il]
> Sent: Sunday, December 16, 2007 12:27 AM
> To: general at lists.openfabrics.org
> Cc: Tang, Changqing
> Subject: Re: [ofa-general] XRC cleanup order issue
>
> On Wednesday 12 December 2007 17:24, Tang, Changqing wrote:
> >
> > HI,
> >         This question is mainly for Mellanox engineers.
> >
> >         With XRC, the rank that creates the QP used for transport to
> > all ranks on that node can NOT exit first if other ranks are still
> > using the transport. This restriction is a problem for our dynamic
> > process definition, where any rank could die for any reason without
> > tearing down the whole application.
> >
> >         I am thinking about shared memory usage, where the creator
> > does not have to stay alive while other processes can still use it;
> > when the last process exits, the system cleans up the shared
> > memory.
> >
> >         Can't XRC mimic the shared memory behavior ?
> >
> There is an issue that the QP needs to be associated with a
> protection domain (i.e., UAR area), which is unique per user process.
>
> One possibility is to have a separate process per host per
> job (XRC domain) create the XRC QPs on the receiving side.
> There still would be the issue of what happens if that
> process somehow dies prematurely.
>
> We'll examine the issue and see if there is some other solution.
>
> - Jack
>


From hrosenstock at xsigo.com  Wed Dec 19 12:10:57 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Wed, 19 Dec 2007 12:10:57 -0800
Subject: [ofa-general] smpquery regression in 1.3-rc1
In-Reply-To: <20071219195839.GS412@sgi.com>
References: <20071219195839.GS412@sgi.com>
Message-ID: <1198095057.6635.41.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-19 at 11:58 -0800, akepner at sgi.com wrote:
> We're seeing a regression in smpquery from alpha2 to rc1. 
> 
> For example, with alpha2 I get:
> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
> # Node info: Lid 3
> BaseVers:........................1
> ClassVers:.......................1
> NodeType:........................Channel Adapter
> NumPorts:........................2
> SystemGuid:......................0x00066a009800737c
> Guid:............................0x00066a009800737c
> PortGuid:........................0x00066a01a000737c
> PartCap:.........................64
> DevId:...........................0x6278
> Revision:........................0x000000a0
> LocalPort:.......................2
> VendorId:........................0x00066a
> grommit:~ # 
> 
> 
> And with rc1, I get:
> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
> ibwarn: [5650] ib_path_query: sa call path_query failed
> smpquery: iberror: failed: can't resolve destination port 0x66a01a000737c
> grommit:~ #  
> 
> But using a LID works fine:
> grommit:~ # smpquery nodeinfo 3
> # Node info: Lid 3
> BaseVers:........................1
> ClassVers:.......................1
> NodeType:........................Channel Adapter
> NumPorts:........................2
> SystemGuid:......................0x00066a009800737c
> Guid:............................0x00066a009800737c
> PortGuid:........................0x00066a01a000737c
> PartCap:.........................64
> DevId:...........................0x6278
> Revision:........................0x000000a0
> LocalPort:.......................2
> VendorId:........................0x00066a
> grommit:~ # 
> 
> Strangest of all, running it under strace also works:
> grommit:~ # strace smpquery -G nodeinfo 0x66a01a000737c > /tmp/smpquery.out 
> .....
> grommit:~ # cat /tmp/smpquery.out
> # Node info: Lid 3
> BaseVers:........................1
> ClassVers:.......................1
> NodeType:........................Channel Adapter
> NumPorts:........................2
> SystemGuid:......................0x00066a009800737c
> Guid:............................0x00066a009800737c
> PortGuid:........................0x00066a01a000737c
> PartCap:.........................64
> DevId:...........................0x6278
> Revision:........................0x000000a0
> LocalPort:.......................2
> VendorId:........................0x00066a
> grommit:~ #
> 
> Some weird race condition...
> 
> Anyone else seeing the same?

-G requires an SA path record lookup, so this could be an issue with that
lookup timing out in some cases (assuming the port is active and the SM
is operational).

-- Hal


From dave at thedillows.org  Wed Dec 19 13:52:01 2007
From: dave at thedillows.org (David Dillow)
Date: Wed, 19 Dec 2007 16:52:01 -0500
Subject: [ofa-general] [RFC PATCH] IB/srp: respect target credit limit
In-Reply-To: <20071218151658.GA9142@ornl.gov>
References: <1197929000.31600.19.camel@lap75545.ornl.gov>
	<adar6hk35om.fsf@cisco.com>  <20071218151658.GA9142@ornl.gov>
Message-ID: <1198101121.5649.13.camel@lap75545.ornl.gov>


On Tue, 2007-12-18 at 10:16 -0500, Dave Dillow wrote:
> On Mon, Dec 17, 2007 at 09:58:17PM -0800, Roland Dreier wrote:
> >  > The SRP initiator will currently send requests, even if we have no
> >  > credits available. The results of sending extra requests is
> >  > vendor-specific, but on some devices, overrunning our credits will cost
> >  > us 85% of peak performance -- e.g. 100 MB/s vs 720 MB/s. Other devices
> >  > may just drop them.
> > 
> > I guess this happens because a target sometimes completes commands
> > with a value of 0 for req_lim delta in the response?
> 
> I didn't instrument the delta in the responses, I just noticed that
> req_lim_zero was growing very, very quickly when the performance dropped.
> I added a sysfs entry to give a view of req_lim, and I saw it get to -3
> several times, so I expect we're actually getting some negative deltas back.

My guess about negative deltas was incorrect -- here's the breakdown of
8192 requests (trying to keep 40 in flight, but max credits is ~33):

rsp delta 0		2369
rsp delta 1		3905
rsp delta 2		1503
rsp delta 3		379
rsp delta 4		36

This was with a steady stream of 1MB I/O requests.
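
As a cross-check of the breakdown above, the counts can be summed directly. The sketch below is only arithmetic on the numbers quoted in this message; the array and function names are made up for illustration and do not come from ib_srp. The counts add up to the 8192 requests, and the credits handed back (delta times count) also total 8192, i.e. one credit per response on average, with roughly 29% of responses (2369 of 8192) returning no credit at all.

```c
#include <assert.h>
#include <stddef.h>

/* Histogram quoted in the message: index = req_lim_delta carried by a
 * response, value = number of responses that carried that delta.
 * (Illustrative names; not taken from the driver.) */
static const unsigned long rsp_delta_count[] = { 2369, 3905, 1503, 379, 36 };

#define N_DELTAS (sizeof(rsp_delta_count) / sizeof(rsp_delta_count[0]))

/* Number of responses covered by the histogram. */
unsigned long total_responses(void)
{
	unsigned long n = 0;
	size_t d;

	for (d = 0; d < N_DELTAS; d++)
		n += rsp_delta_count[d];
	return n;
}

/* Credits returned over all responses: sum of (delta * count). */
unsigned long total_credits_returned(void)
{
	unsigned long credits = 0;
	size_t d;

	assert(N_DELTAS > 0);
	for (d = 0; d < N_DELTAS; d++)
		credits += (unsigned long)d * rsp_delta_count[d];
	return credits;
}
```

Since each request consumes one credit, an average of exactly one returned credit per response is consistent with the target holding the initiator at a steady credit level rather than shrinking it.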

> >  > +	.max_host_blocked		= 1,
> > 
> > Documentation of what max_host_blocked does is pretty scant.  What
> > does this change do?

I've dropped this change -- when looking at the code that handles
scsi_host->host_blocked, I missed that it gets reset after each SCSI
command completes, so the mid-layer doesn't wait until the queue clears
to start dispatching commands again.

max_host_blocked/host_blocked only really come into play stall-wise when
we return HOST_BUSY and there are no commands in flight. Then, it will
block attempts to run the queue until scsi_host->host_blocked counts
down to zero. The default means it will take 7 commands to clear this
state, but we should not enter it in the first place. It also should not
be a behavior change from the existing code.
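
The countdown described above can be modelled in user space. This is a simplified sketch of the behaviour as described in this message, not the mid-layer code itself; the struct, the function names, and the in_flight bookkeeping are assumptions made for illustration.

```c
#include <assert.h>

/* Toy model of a SCSI host's blocked state (hypothetical names). */
struct host_model {
	int max_host_blocked;	/* 7 is the default mentioned above */
	int host_blocked;	/* remaining countdown, 0 = not blocked */
	int in_flight;		/* commands currently outstanding */
};

/* The low-level driver reported that the host is busy. */
void host_busy(struct host_model *h)
{
	h->host_blocked = h->max_host_blocked;
}

/* A command completed: the blocked state is reset immediately. */
void command_completed(struct host_model *h)
{
	if (h->in_flight > 0)
		h->in_flight--;
	h->host_blocked = 0;
}

/* One attempt to run the queue; returns 1 if dispatch may proceed.
 * The countdown only matters when nothing is in flight. */
int host_queue_ready(struct host_model *h)
{
	if (h->in_flight == 0 && h->host_blocked > 0) {
		if (--h->host_blocked > 0)
			return 0;
	}
	return 1;
}

/* Count the attempts needed to leave the blocked state on an idle host. */
int attempts_until_unblocked(int max_host_blocked)
{
	struct host_model h = { max_host_blocked, 0, 0 };
	int attempts = 1;

	assert(max_host_blocked > 0);
	host_busy(&h);
	while (!host_queue_ready(&h))
		attempts++;
	return attempts;
}
```

With max_host_blocked set to 7, the model needs seven attempts before dispatch resumes on an idle host, while a single completion clears the state at once.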

I'll send the new patch under separate cover -- with the queue_len
parameter split out, and an additional patch to let us easily grow to
1MB I/Os and more.

Dave


From dillowda at ornl.gov  Wed Dec 19 14:08:43 2007
From: dillowda at ornl.gov (David Dillow)
Date: Wed, 19 Dec 2007 17:08:43 -0500
Subject: [ofa-general] [PATCH 1/3] IB/srp: respect target credit limit
Message-ID: <1198102123.5649.32.camel@lap75545.ornl.gov>

The SRP initiator will currently send requests even if we have no
credits available. The result of sending extra requests is
vendor-specific, but on some devices, overrunning our credits will cost
us 85% of peak performance -- e.g. 100 MB/s vs 720 MB/s. Other devices
may just drop them.

This patch will tell the SCSI mid-layer to queue requests if there are
fewer than two credits remaining, and will not issue a task management
request if there are no credits remaining. The mid-layer will retry the
queued command once an outstanding command completes.

This removes the unlikely() in __srp_get_tx_iu(), as it is not at all
unlikely to hit this limit under heavy load.

Signed-off-by: David Dillow <dillowda at ornl.gov>
---
 ib_srp.c |   16 +++++++++-------
 ib_srp.h |    5 +++++
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 950228f..17ad144 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -930,13 +930,18 @@ static int srp_post_recv(struct srp_target_port *target)
  * req_lim and tx_head.  Lock cannot be dropped between call here and
  * call to __srp_post_send().
  */
-static struct srp_iu *__srp_get_tx_iu(struct srp_target_port *target)
+static struct srp_iu *__srp_get_tx_iu(struct srp_target_port *target,
+					enum srp_request_type req_type)
 {
+	s32 min = (req_type == SRP_REQ_TASK_MGMT) ? 1 : 2;
+
 	if (target->tx_head - target->tx_tail >= SRP_SQ_SIZE)
 		return NULL;
 
-	if (unlikely(target->req_lim < 1))
+	if (target->req_lim < min) {
 		++target->zero_req_lim;
+		return NULL;
+	}
 
 	return target->tx_ring[target->tx_head & SRP_SQ_SIZE];
 }
@@ -993,7 +998,7 @@ static int srp_queuecommand(struct scsi_cmnd *scmnd,
 		return 0;
 	}
 
-	iu = __srp_get_tx_iu(target);
+	iu = __srp_get_tx_iu(target, SRP_REQ_NORMAL);
 	if (!iu)
 		goto err;
 
@@ -1180,9 +1185,6 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 
 			target->max_ti_iu_len = be32_to_cpu(rsp->max_ti_iu_len);
 			target->req_lim       = be32_to_cpu(rsp->req_lim_delta);
-
-			target->scsi_host->can_queue = min(target->req_lim,
-							   target->scsi_host->can_queue);
 		} else {
 			printk(KERN_WARNING PFX "Unhandled RSP opcode %#x\n", opcode);
 			target->status = -ECONNRESET;
@@ -1283,7 +1285,7 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 
 	init_completion(&req->done);
 
-	iu = __srp_get_tx_iu(target);
+	iu = __srp_get_tx_iu(target, SRP_REQ_TASK_MGMT);
 	if (!iu)
 		goto out;
 
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index e3573e7..4a3c1f3 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -79,6 +79,11 @@ enum srp_target_state {
 	SRP_TARGET_REMOVED
 };
 
+enum srp_request_type {
+	SRP_REQ_NORMAL,
+	SRP_REQ_TASK_MGMT,
+};
+
 struct srp_device {
 	struct list_head	dev_list;
 	struct ib_device       *dev;
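
The credit check introduced by this patch can be exercised outside the kernel with a small model. The sketch below mirrors the logic of the patched __srp_get_tx_iu() but returns a flag instead of an IU; the struct, the names, and the MODEL_SQ_SIZE value are assumptions for illustration, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_SQ_SIZE 63	/* assumed send-queue size, illustrative only */

enum model_req_type { MODEL_REQ_NORMAL, MODEL_REQ_TASK_MGMT };

/* Minimal stand-in for the fields the check looks at. */
struct target_model {
	int req_lim;		/* credits currently granted by the target */
	unsigned int tx_head;
	unsigned int tx_tail;
	unsigned int zero_req_lim;	/* times we ran short of credits */
};

/* Returns 1 if a request of this type may be sent now, 0 otherwise.
 * Normal commands need two credits so that one is always held back
 * for a task management request. */
int model_can_send(struct target_model *t, enum model_req_type type)
{
	int min_credits = (type == MODEL_REQ_TASK_MGMT) ? 1 : 2;

	assert(t != NULL);
	if (t->tx_head - t->tx_tail >= MODEL_SQ_SIZE)
		return 0;

	if (t->req_lim < min_credits) {
		++t->zero_req_lim;
		return 0;
	}
	return 1;
}
```

With a single credit left, a normal command is refused (and the shortage counter is bumped) while a task management request still goes through; with no credits, both are refused.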


From dave at thedillows.org  Wed Dec 19 14:08:56 2007
From: dave at thedillows.org (David Dillow)
Date: Wed, 19 Dec 2007 17:08:56 -0500
Subject: [ofa-general] [PATCH 2/3] IB/srp: allow user to control host queue
	length
Message-ID: <1198102136.5649.34.camel@lap75545.ornl.gov>

It can be useful for tuning and balancing to let the user restrict the
maximum number of commands that can be queued for a particular SRP
target, similar to what the "max_cmd_per_lun" tuning parameter allows
per device on this target.

This patch adds "queue_len" to allow that flexibility.
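
For illustration, the parse-and-clamp step can be sketched in user space as follows. This is not the driver code: parse_queue_len() and MODEL_SQ_SIZE are hypothetical, the limit of 63 is an assumed value, and rejecting values below 1 is an extra check in this sketch that the patch as posted does not make.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_SQ_SIZE 63	/* assumed upper bound, illustrative only */

/* Parse a single "queue_len=<n>" option string.
 * Returns the queue length clamped to MODEL_SQ_SIZE, or -1 if the
 * string is not a well-formed queue_len option. */
int parse_queue_len(const char *opt)
{
	static const char key[] = "queue_len=";
	char *end;
	long val;

	assert(opt != NULL);
	if (strncmp(opt, key, sizeof(key) - 1) != 0)
		return -1;
	opt += sizeof(key) - 1;

	errno = 0;
	val = strtol(opt, &end, 10);
	if (end == opt || *end != '\0' || errno == ERANGE)
		return -1;
	if (val < 1)		/* extra sanity check, not in the patch */
		return -1;

	return val < MODEL_SQ_SIZE ? (int)val : MODEL_SQ_SIZE;
}
```

A request such as queue_len=16 is taken as is, while queue_len=1000 is silently capped at the send-queue size.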

Signed-off-by: David Dillow <dillowda at ornl.gov>
---
 ib_srp.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 17ad144..9cb77e6 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1612,6 +1612,7 @@ enum {
 	SRP_OPT_MAX_CMD_PER_LUN	= 1 << 6,
 	SRP_OPT_IO_CLASS	= 1 << 7,
 	SRP_OPT_INITIATOR_EXT	= 1 << 8,
+	SRP_OPT_QUEUE_LEN	= 1 << 9,
 	SRP_OPT_ALL		= (SRP_OPT_ID_EXT	|
 				   SRP_OPT_IOC_GUID	|
 				   SRP_OPT_DGID		|
@@ -1629,6 +1630,7 @@ static match_table_t srp_opt_tokens = {
 	{ SRP_OPT_MAX_CMD_PER_LUN,	"max_cmd_per_lun=%d" 	},
 	{ SRP_OPT_IO_CLASS,		"io_class=%x"		},
 	{ SRP_OPT_INITIATOR_EXT,	"initiator_ext=%s"	},
+	{ SRP_OPT_QUEUE_LEN,		"queue_len=%d"		},
 	{ SRP_OPT_ERR,			NULL 			}
 };
 
@@ -1756,6 +1758,14 @@ static int srp_parse_options(const char *buf, struct srp_target_port *target)
 			kfree(p);
 			break;
 
+		case SRP_OPT_QUEUE_LEN:
+			if (match_int(args, &token)) {
+				printk(KERN_WARNING PFX "bad queue_len parameter '%s'\n", p);
+				goto out;
+			}
+			target->scsi_host->can_queue = min(token, SRP_SQ_SIZE);
+			break;
+
 		default:
 			printk(KERN_WARNING PFX "unknown parameter or missing value "
 			       "'%s' in target creation request\n", p);


From dillowda at ornl.gov  Wed Dec 19 14:09:15 2007
From: dillowda at ornl.gov (David Dillow)
Date: Wed, 19 Dec 2007 17:09:15 -0500
Subject: [ofa-general] [PATCH 3/3] IB/srp: use scatter gather chaining
Message-ID: <1198102155.5649.37.camel@lap75545.ornl.gov>

By default, the SCSI mid-layer seems to send down 512KB requests
(sg_tablesize = 256), with some requests occasionally combined. By
allowing the mid-layer to chain requests, we can easily grow to 1024KB
or larger -- I've tested 4096KB I/O requests with no problems.

Signed-off-by: David Dillow <dillowda at ornl.gov>
---
I looked through the DMA paths on the hardware drivers to ensure they
could take advantage of the SG chaining, and it seems that every one
except iPath uses the system's DMA routines, which have been converted
to handle chaining. iPath looks like it should be OK, but I have no way
to test it.

This patch was essential in allowing me to max out bandwidth to my disk
arrays.

 ib_srp.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 9cb77e6..4f58f94 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1545,6 +1545,7 @@ static struct scsi_host_template srp_template = {
 	.this_id			= -1,
 	.cmd_per_lun			= SRP_SQ_SIZE,
 	.use_clustering			= ENABLE_CLUSTERING,
+	.use_sg_chaining		= ENABLE_SG_CHAINING,
 	.shost_attrs			= srp_host_attrs
 };
 

From rdreier at cisco.com  Wed Dec 19 19:52:42 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 19 Dec 2007 19:52:42 -0800
Subject: [ofa-general] [PATCH] Use round_jiffies() in ehca timer
In-Reply-To: <20071015054907.GE3257@kryten> (Anton Blanchard's message of "Mon,
	15 Oct 2007 00:49:07 -0500")
References: <20071015054907.GE3257@kryten>
Message-ID: <ada1w9i10qd.fsf@cisco.com>

 > Use round_jiffies() to align the 1 second timer with other timers and
 > potentially save power by sleeping cores for longer.
 > 
 > Signed-off-by: Anton Blanchard <anton at samba.org>
 > ---
 > 
 > diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
 > index 403467f..23000b7 100644
 > --- a/drivers/infiniband/hw/ehca/ehca_main.c
 > +++ b/drivers/infiniband/hw/ehca/ehca_main.c
 > @@ -902,7 +902,7 @@ void ehca_poll_eqs(unsigned long data)
 >  				ehca_process_eq(shca, 0);
 >  		}
 >  	}
 > -	mod_timer(&poll_eqs_timer, jiffies + HZ);
 > +	mod_timer(&poll_eqs_timer, round_jiffies(jiffies + HZ));
 >  	spin_unlock(&shca_list_lock);
 >  }

ehca guys -- this looks fine to me -- any objection to merging for 2.6.25?
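
For readers unfamiliar with round_jiffies(), here is a simplified user-space model of the rounding idea. It is a sketch under stated assumptions: MODEL_HZ is an assumed tick rate, the function name is made up, and the per-CPU skew that the kernel helper applies is deliberately left out.

```c
#include <assert.h>

#define MODEL_HZ 250UL	/* assumed ticks per second, illustrative only */

/* Round an expiry time to a whole-second boundary so that timers from
 * different subsystems tend to fire together. Times just past a
 * boundary are rounded down, everything else is rounded up. If the
 * rounded value would not lie in the future, the original is kept. */
unsigned long model_round_jiffies(unsigned long j, unsigned long now)
{
	unsigned long rem = j % MODEL_HZ;
	unsigned long rounded;

	assert(j >= now);
	if (rem < MODEL_HZ / 4)
		rounded = j - rem;		/* round down */
	else
		rounded = j - rem + MODEL_HZ;	/* round up */

	return rounded > now ? rounded : j;
}
```

In this model a one-second timer armed at tick 1010 expires at tick 1250 instead of 1260, and one armed at tick 1100 expires at tick 1500 instead of 1350, so the poll no longer needs a wakeup of its own between second boundaries.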


From rdreier at cisco.com  Wed Dec 19 20:20:34 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 19 Dec 2007 20:20:34 -0800
Subject: [ofa-general] Re: [PATCH 3/3] ib/cm: add basic performance counters
In-Reply-To: <000301c83092$fe2f99b0$9c98070a@amr.corp.intel.com> (Sean Hefty's
	message of "Mon, 26 Nov 2007 17:15:26 -0800")
References: <000001c83091$2859bce0$9c98070a@amr.corp.intel.com>
	<000301c83092$fe2f99b0$9c98070a@amr.corp.intel.com>
Message-ID: <adawsrayp2l.fsf@cisco.com>

thanks, I applied everything in your for-roland branch to my
for-2.6.25 branch...


From rdreier at cisco.com  Wed Dec 19 20:30:16 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Wed, 19 Dec 2007 20:30:16 -0800
Subject: [ofa-general] Re: [PATCH 3/3] ib/cm: add basic performance
	counters
In-Reply-To: <adawsrayp2l.fsf@cisco.com> (Roland Dreier's message of "Wed,
	19 Dec 2007 20:20:34 -0800")
References: <000001c83091$2859bce0$9c98070a@amr.corp.intel.com>
	<000301c83092$fe2f99b0$9c98070a@amr.corp.intel.com>
	<adawsrayp2l.fsf@cisco.com>
Message-ID: <adasl1yyomf.fsf@cisco.com>

by the way, I had to make cm_class not static, or else a build with
ib_cm and ib_ucm built into the kernel fails... I think that exported
symbols can't be static.

 - R.


From kliteyn at mellanox.co.il  Wed Dec 19 21:12:03 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 20 Dec 2007 07:12:03 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-20:normal completion
Message-ID: <MTLEXCH01VOJwnzsLqI000012b0@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-19
OpenSM git rev = Mon_Dec_17_15:20:43_2007 [9988f459cb81dd025bde8b2dd53b3c551616be0c]
ibutils git rev = Wed_Dec_19_12:06:28_2007 [9961475294fbf1d3782edb8f377a77b13fa80d70]
 
 
Total=520  Pass=518  Fail=2
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
11 LidMgr IS3-128.topo

Failures:
2 LidMgr IS3-128.topo


From jackm at dev.mellanox.co.il  Wed Dec 19 22:28:45 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Thu, 20 Dec 2007 08:28:45 +0200
Subject: [ofa-general] XRC cleanup order issue
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FE215EE5C@G5W0278.americas.hpqcorp.net>
References: <D89C2C212795564B837FA1665CAE02990FDE143AE8@G5W0278.americas.hpqcorp.net>
	<200712160827.04519.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE215EE5C@G5W0278.americas.hpqcorp.net>
Message-ID: <200712200828.46072.jackm@dev.mellanox.co.il>

On Wednesday 19 December 2007 22:01, Tang, Changqing wrote:
> Hi, Jack:
> 
>         If I need to use an XRC domain for communication between two ranks on the same node, how do I do it?
> The reason I ask is that HP-MPI has a NIC mode in which no shared memory is used.
> 
> Thanks.
> --CQ
Same as between 2 ranks on different hosts.  The sending rank needs to have an XRC connection
with some XRC QP on the local node, and needs a target XRC SRQ belonging to the destination rank.

The destination XRC QP and XRC SRQ must belong to the same XRC Domain (I would assume this is the
case anyway, since they are part of the same job -- just mentioning this to be safe).

- Jack


From vlad at dev.mellanox.co.il  Wed Dec 19 23:11:57 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Thu, 20 Dec 2007 09:11:57 +0200
Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to select
	sp4 patches for SLES9 kernel with minor versions equal or
	greater	than 305
In-Reply-To: <47670337.6080607@lnxi.com>
References: <39C75744D164D948A170E9792AF8E7CA4D2CE1@exil.voltaire.com>
	<47670337.6080607@lnxi.com>
Message-ID: <476A15BD.1050505@dev.mellanox.co.il>

David B. Anderson wrote:
> I've all of these patches plus the following patch
> 
>    kernel_patches/backport/2.6.5_sles9_sp4/cxgb3_remove_eeh.patch
> 
> My current git repo is
> git://git.openfabrics.org/ofed_1_2/linux-2.6.git
> commit: 6974c285e6fb06264f570f9cf919865bab66c9e6
> 
> My patch that I posted before fixes the kernel configure script so that 
> it applies 2.6.5_sles9_sp4 patches for the SP4 release kernel of 
> 2.6.5-7.308 and above. The configure patch from 
> OFED_1.2.5_sles9_sp4_configure.diff has 2.6.5-7.305* as the only valid 
> SP4 kernel which is incorrect. I get the same compiler error as before.
> 
> 
> 
> Moshe Kazir wrote:
>>  See patches in the attached message.
>>
>> It was applied by Vlad.
>>
>> Moshe
>>
>> ____________________________________________________________
>> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>>  
>> Voltaire - The Grid Backbone
>>  
>>  www.voltaire.com
>>
>>  
>>
>> -----Original Message-----
>> From: general-bounces at lists.openfabrics.org
>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of David B.
>> Anderson
>> Sent: Saturday, December 15, 2007 3:31 AM
>> To: general at lists.openfabrics.org; vlad at mellanox.co.il
>> Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to
>> select sp4 patches for SLES9 kernel with minor versions equal or greater
>> than 305
>>
>>
>> Hi,
>>
>> I've created the following patch for OFED 1.2.5.4 to have the kernel for
>> SLES9 SP4 recognized (2.6.5-7.308).
>>
>> Even with the patch I then had two back port patches not apply cleanly 
>> (cxg3_to_2_6_20.patch, rds_to_2_6_9.patch). I hand patched them but 
>> now I'm getting the following compiler errors:
>>
>> In file included from
>> /usr/src/linux-2.6.5-7.308/include/linux/module.h:10,
>>                  from 
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
>> port/2.6.5_sles9_sp4/include/linux/module.h:4,
>>                  from
>> /usr/src/linux-2.6.5-7.308/include/linux/device.h:21,
>>                  from 
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
>> port/2.6.5_sles9_sp4/include/linux/device.h:4,
>>                  from 
>> /usr/src/linux-2.6.5-7.308/include/linux/netdevice.h:38,
>>                  from 
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
>> port/2.6.5_sles9_sp4/include/linux/netdevice.h:4,
>>                  from 
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
>> /core/addr.c:32:
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
>> port/2.6.5_sles9_sp4/include/linux/sched.h:8: warning: static 
>> declaration for `wait_for_completion_timeout' follows non-static
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
>> /core/addr.c:67: warning: initialization from incompatible pointer type
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
>> /core/addr.c: In function `addr_resolve_remote':
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
>> /core/addr.c:192: error: structure has no member named `idev'
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
>> /core/addr.c:193: error: structure has no member named `idev'
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
>> /core/addr.c:197: error: structure has no member named `idev'
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband
>> /core/addr.c: At top level:
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
>> port/2.6.5_sles9_sp4/include/linux/device.h:48: warning: 
>> `class_create' defined but not used
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
>> port/2.6.5_sles9_sp4/include/linux/device.h:82: warning: 
>> `class_destroy' defined but not used
>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/back
>> port/2.6.5_sles9_sp4/include/linux/device.h:108: warning: 
>> `class_device_create' defined but not used
>> make[6]: *** 
>> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniban
>> d/core/addr.o] Error 1
>> make[5]: *** 
>> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniban
>> d/core] Error 2
>> make[4]: *** 
>> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniban
>> d] Error 2
>> make[3]: *** 
>> [_module_/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default] Error 2
>> make[2]: *** [modules] Error 2
>> make[1]: *** [modules] Error 2
>> make[1]: Leaving directory
>> `/usr/src/linux-2.6.5-7.308-obj/x86_64/default'
>> make: *** [kernel] Error 2
>>
>> Does anyone have OFED 1.2.5.4 building for SLES 9 SP4?
>>
>> Thanks
>>
>>  
>> ------------------------------------------------------------------------
>>
>> Subject:
>> [ofa-general] OFED-1.2.5 backport patches for SLES9 SP4
>> From:
>> "Moshe Kazir" <moshek at voltaire.com>
>> Date:
>> Sun, 25 Nov 2007 09:59:26 +0200
>> To:
>> "Vladimir Sokolovsky" <vlad at mellanox.co.il>, 
>> <general at lists.openfabrics.org>
>>
>>
>> The attached files do the work.
>>
>> OFED_1.2.5_sles9_sp4_configure.diff includes the changes in the
>> configure file.
>> OFED_1.2.5_sles9_sp4_backport.diff includes the changes required in the
>> kernel_patches and kernel_addons directories.
>>
>> Moshe
>> ____________________________________________________________
>> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>>  
>> Voltaire - The Grid Backbone
>>  
>>  www.voltaire.com
>>

Hi David,
Please try the latest OFED-1.2.5.4-20071219-0824.tgz build on your SLES9SP4.

http://www.openfabrics.org/builds/connectx/OFED-1.2.5.4-20071219-0824.tgz


Thanks,
Vladimir
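
For reference, the version test discussed in this thread (treat a SLES9 kernel as SP4 when the build number after "2.6.5-7." is 305 or higher) can be sketched as below. This is only an illustration in C: the real check lives in the OFED configure script, and the function name and the exact threshold handling here are assumptions based on the subject line.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if the kernel release string looks like a SLES9 SP4 kernel,
 * i.e. "2.6.5-7.<build>" with build >= 305, and 0 otherwise.
 * (Hypothetical helper; not part of OFED.) */
int is_sles9_sp4_kernel(const char *release)
{
	int build = 0;

	assert(release != NULL);
	/* sscanf stops at the first mismatch, so other kernel series
	 * produce no conversion and are rejected. */
	if (sscanf(release, "2.6.5-7.%d", &build) != 1)
		return 0;
	return build >= 305;
}
```

With this rule, 2.6.5-7.308 is accepted along with 2.6.5-7.305, whereas a check that only matches the literal 305 release would miss it.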


From vlad at lists.openfabrics.org  Thu Dec 20 03:17:18 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Thu, 20 Dec 2007 03:17:18 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071220-0200 daily build status
Message-ID: <20071220111718.96EDAE602F4@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ppc64 with linux-2.6.18
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.15
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.12
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.13
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:


From kliteyn at dev.mellanox.co.il  Thu Dec 20 03:42:40 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 20 Dec 2007 13:42:40 +0200
Subject: [ofa-general] smpquery regression in 1.3-rc1
In-Reply-To: <1198095057.6635.41.camel@hrosenstock-ws.xsigo.com>
References: <20071219195839.GS412@sgi.com>
	<1198095057.6635.41.camel@hrosenstock-ws.xsigo.com>
Message-ID: <476A5530.3070602@dev.mellanox.co.il>

Hal Rosenstock wrote:
> On Wed, 2007-12-19 at 11:58 -0800, akepner at sgi.com wrote:
>> We're seeing a regression in smpquery from alpha2 to rc1. 
>>
>> For example, with alpha2 I get:
>> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
>> # Node info: Lid 3
>> BaseVers:........................1
>> ClassVers:.......................1
>> NodeType:........................Channel Adapter
>> NumPorts:........................2
>> SystemGuid:......................0x00066a009800737c
>> Guid:............................0x00066a009800737c
>> PortGuid:........................0x00066a01a000737c
>> PartCap:.........................64
>> DevId:...........................0x6278
>> Revision:........................0x000000a0
>> LocalPort:.......................2
>> VendorId:........................0x00066a
>> grommit:~ # 
>>
>>
>> And with rc1, I get:
>> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
>> ibwarn: [5650] ib_path_query: sa call path_query failed
>> smpquery: iberror: failed: can't resolve destination port 0x66a01a000737c
>> grommit:~ #  
>>
>> But using a LID works fine:
>> grommit:~ # smpquery nodeinfo 3
>> # Node info: Lid 3
>> BaseVers:........................1
>> ClassVers:.......................1
>> NodeType:........................Channel Adapter
>> NumPorts:........................2
>> SystemGuid:......................0x00066a009800737c
>> Guid:............................0x00066a009800737c
>> PortGuid:........................0x00066a01a000737c
>> PartCap:.........................64
>> DevId:...........................0x6278
>> Revision:........................0x000000a0
>> LocalPort:.......................2
>> VendorId:........................0x00066a
>> grommit:~ # 
>>
>> Strangest of all, running it under strace also works:
>> grommit:~ # strace smpquery -G nodeinfo 0x66a01a000737c > /tmp/smpquery.out 
>> .....
>> grommit:~ # cat /tmp/smpquery.out
>> # Node info: Lid 3
>> BaseVers:........................1
>> ClassVers:.......................1
>> NodeType:........................Channel Adapter
>> NumPorts:........................2
>> SystemGuid:......................0x00066a009800737c
>> Guid:............................0x00066a009800737c
>> PortGuid:........................0x00066a01a000737c
>> PartCap:.........................64
>> DevId:...........................0x6278
>> Revision:........................0x000000a0
>> LocalPort:.......................2
>> VendorId:........................0x00066a
>> grommit:~ #
>>
>> Some weird race condition...
>>
>> Anyone else seeing the same?
> 
> -G requires a SA path record lookup so this could be an issue with that
> timing out in some cases (assuming the port is active and the SM is
> operational).

I'm seeing the same problem.
Sometimes the query works, and sometimes it doesn't.
I also see that when the query fails, OpenSM doesn't get PathRecord query at all.

Hal, can you elaborate on "that timing out in some cases" issue?

Adding Jack for the libibmad issue:

I see that the ib_path_query() in libibmad/sa.c sometimes fails
when calling safe_sa_call().


-- Yevgeny

> -- Hal
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> 


From dotanb at dev.mellanox.co.il  Thu Dec 20 04:32:43 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Thu, 20 Dec 2007 14:32:43 +0200
Subject: [ofa-general] can you please add a new product to OpenFabrics Linux?
Message-ID: <476A60EB.2020100@dev.mellanox.co.il>

The product "mstflint" is missing.

The owner of this product is orenk.at.dev.mellanox.co.il


thanks
Dotan


From jackm at dev.mellanox.co.il  Thu Dec 20 05:35:37 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Thu, 20 Dec 2007 15:35:37 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of any
	one user process
Message-ID: <200712201535.37527.jackm@dev.mellanox.co.il>


background:  see "XRC Cleanup order issue thread" at

	http://lists.openfabrics.org/pipermail/general/2007-December/043935.html

(userspace process which created the receiving XRC qp on a given host dies before
other processes which still need to receive XRC messages on their SRQs which are
"paired" with the now-destroyed receiving XRC QP.)

Solution: Add a userspace verb (as part of the XRC suite) which enables the user process
to create an XRC QP owned by the kernel -- which belongs to the required XRC domain.

This QP will be destroyed when the XRC domain is closed (i.e., as part of an ibv_close_xrc_domain
call, but only when the domain's reference count goes to zero).

Below, I give the new userspace API for this function.  Any feedback will be appreciated.
This API will be implemented in the upcoming OFED 1.3 release, so we need feedback ASAP.

Notes:
1. There is no query or destroy verb for this QP. There is also no userspace object for the
   QP. Userspace has ONLY the raw qp number to use when creating the (X)RC connection.

2. Since the QP is "owned" by kernel space, async events for this QP are also handled in kernel
   space (i.e., reported in /var/log/messages). There are no completion events for the QP, since
   it does not send, and all receive completions are reported in the XRC SRQ's CQ.

   If this QP enters the error state, the remote QP which sends will start receiving RETRY_EXCEEDED
   errors, so the application will be aware of the failure.

- Jack
======================================================================================
/**
 * ibv_alloc_xrc_rcv_qp - creates an XRC QP for serving as a receive-side only QP,
 *	and moves the created qp through the RESET->INIT and INIT->RTR transitions.
 *      (The RTR->RTS transition is not needed, since this QP does no sending).
 * 	The sending XRC QP uses this QP as destination, while specifying an XRC SRQ
 * 	for actually receiving the transmissions and generating all completions on the
 *	receiving side.
 *
 * 	This QP is created in kernel space, and persists until the XRC domain is closed.
 *	(i.e., its reference count goes to zero).
 *
 * @pd: protection domain to use.  At lower layer, this provides access to userspace obj
 * @xrc_domain: xrc domain to use for the QP.
 * @attr: modify-qp attributes needed to bring the QP to RTR.
 * @attr_mask:  bitmap indicating which attributes are provided in the attr struct.
 * 	used for validity checking.
 * @xrc_rcv_qpn: qp_num of created QP (if success). To be passed to the remote node. The
 *               remote node will use xrc_rcv_qpn in ibv_post_send when sending to
 *		 XRC SRQ's on this host in the same xrc domain.
 *
 * RETURNS: success (0), or a (negative) error value.
 */

int ibv_alloc_xrc_rcv_qp(struct ibv_pd *pd,
			 struct ibv_xrc_domain *xrc_domain,
			 struct ibv_qp_attr *attr,
			 enum ibv_qp_attr_mask attr_mask,
			 uint32_t *xrc_rcv_qpn);

Notes:

1. Although the kernel creates the qp in the kernel's own PD, we still need the PD
   parameter to determine the device.

2. I chose to use struct ibv_qp_attr, which is used in modify QP, rather than create
   a new structure for this purpose.  This also guards against API changes in the event
   that during development I notice that more modify-qp parameters must be specified
   for this operation to work.

3. Table of the ibv_qp_attr parameters showing what values to set:

struct ibv_qp_attr {
	enum ibv_qp_state	qp_state;		Not needed
	enum ibv_qp_state	cur_qp_state;		Not needed
		-- Driver starts from RESET and takes qp to RTR.
	enum ibv_mtu		path_mtu;		Yes
	enum ibv_mig_state	path_mig_state;		Yes
	uint32_t		qkey;			Yes
	uint32_t		rq_psn;			Yes
	uint32_t		sq_psn;			Not needed
	uint32_t		dest_qp_num;		Yes -- this is the remote side QP for the RC conn.
	int			qp_access_flags;	Yes
	struct ibv_qp_cap	cap;			Need only XRC domain. 
							Other caps will use hard-coded values:
								max_send_wr = 1;
								max_recv_wr = 0;
								max_send_sge = 1;
								max_recv_sge = 0;
								max_inline_data = 0;
	struct ibv_ah_attr	ah_attr;		Yes
	struct ibv_ah_attr	alt_ah_attr;		Optional
	uint16_t		pkey_index;		Yes
	uint16_t		alt_pkey_index;		Optional
	uint8_t			en_sqd_async_notify;	Not needed (No sq)
	uint8_t			sq_draining;		Not needed (No sq)
	uint8_t			max_rd_atomic;		Not needed (No sq)
	uint8_t			max_dest_rd_atomic;	Yes -- Total max outstanding RDMAs expected
							for ALL srq destinations using this receive QP.
							(if you are only using SENDs, this value can be 0).
	uint8_t			min_rnr_timer;		default - 0
	uint8_t			port_num;		Yes
	uint8_t			timeout;		Yes
	uint8_t			retry_cnt;		Yes
	uint8_t			rnr_retry;		Yes
	uint8_t			alt_port_num;		Optional
	uint8_t			alt_timeout;		Optional
};

4. Attribute mask bits to set:
	For RESET_to_INIT transition:
		IB_QP_ACCESS_FLAGS | IB_QP_PKEY_INDEX | IB_QP_PORT

	For INIT_to_RTR transition:
		IB_QP_AV | IB_QP_PATH_MTU |
		IB_QP_DEST_QPN | IB_QP_RQ_PSN | IB_QP_MIN_RNR_TIMER
	   If you are using RDMA or atomics, also set:
		IB_QP_MAX_DEST_RD_ATOMIC
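
As an illustration of note 4, here is a minimal sketch of how a caller might
assemble the attr_mask argument. It is self-contained so that it compiles
without libibverbs: the XRC_RCV_* names and xrc_rcv_attr_mask() are
hypothetical, and the bit positions are assumed to match enum
ibv_qp_attr_mask (a userspace caller would presumably pass the IBV_QP_*
equivalents of the IB_QP_* names listed above).

```c
/* Sketch only: local stand-ins for the attribute-mask bits listed in note 4.
 * The XRC_RCV_* names and xrc_rcv_attr_mask() are hypothetical; the bit
 * positions are assumed to match enum ibv_qp_attr_mask in libibverbs. */
#include <assert.h>
#include <stdint.h>

enum {
	XRC_RCV_ACCESS_FLAGS       = 1 << 3,
	XRC_RCV_PKEY_INDEX         = 1 << 4,
	XRC_RCV_PORT               = 1 << 5,
	XRC_RCV_AV                 = 1 << 7,
	XRC_RCV_PATH_MTU           = 1 << 8,
	XRC_RCV_RQ_PSN             = 1 << 12,
	XRC_RCV_MIN_RNR_TIMER      = 1 << 15,
	XRC_RCV_MAX_DEST_RD_ATOMIC = 1 << 17,
	XRC_RCV_DEST_QPN           = 1 << 20
};

/* Build the mask covering both transitions (RESET->INIT and INIT->RTR).
 * Pass non-zero if the remote side will issue RDMA reads or atomics. */
uint32_t xrc_rcv_attr_mask(int uses_rdma_or_atomics)
{
	uint32_t reset_to_init = XRC_RCV_ACCESS_FLAGS | XRC_RCV_PKEY_INDEX |
				 XRC_RCV_PORT;
	uint32_t init_to_rtr = XRC_RCV_AV | XRC_RCV_PATH_MTU |
			       XRC_RCV_DEST_QPN | XRC_RCV_RQ_PSN |
			       XRC_RCV_MIN_RNR_TIMER;

	/* The two transitions use disjoint attribute bits. */
	assert((reset_to_init & init_to_rtr) == 0);

	if (uses_rdma_or_atomics)
		init_to_rtr |= XRC_RCV_MAX_DEST_RD_ATOMIC;

	return reset_to_init | init_to_rtr;
}
```

A caller using only SENDs would pass xrc_rcv_attr_mask(0) and leave
max_dest_rd_atomic at 0; with RDMA or atomics it would pass
xrc_rcv_attr_mask(1) and fill in max_dest_rd_atomic as described in the
table in note 3.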


From ogerlitz at voltaire.com  Thu Dec 20 05:41:52 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 20 Dec 2007 15:41:52 +0200
Subject: [ofa-general] Re: some questions on stale connection handling
	at the	IB CM
In-Reply-To: <47695B87.5000209@ichips.intel.com>
References: <Pine.LNX.4.64.0712171638420.28805@zuben.voltaire.com>	<000301c840e7$201af600$9b37170a@amr.corp.intel.com>	<47678A90.8060804@voltaire.com>	<000101c84199$8f333a40$c1d4180a@amr.corp.intel.com>
	<4768FE8D.5070408@voltaire.com> <47695B87.5000209@ichips.intel.com>
Message-ID: <476A7120.4060201@voltaire.com>

Sean Hefty wrote:
>> So in the case of lost DREQ etc, in cm_match_req() we will pass the
>> checking for duplicate REQs but fall in the check for stale connections
>> and it can happen in endless loop? this seems like a bug to me.

> This problem isn't limited to stale connections.  If a client tries to
> connect, gets a reject for whatever reason, ignores the reject, then
> tries to reconnect with the same parameters, then they've put themselves
> into an endless loop.

I don't follow: if they don't ignore the reject, but reuse the same QP
for their successive connection requests, each new REQ will pass the ID
check (duplicate REQs) but will fail on the remote QPN check, correct?
So what can a client do to avoid falling into that? What does it mean to
not ignore the reject? Note that even if, on getting a reject, they
release the QP and allocate a new one, they can get the same QP number.

>> Yes, this seems to be able to solve the keep-alive thing in a generic
>> fashion for all ULPs using the IB CM, will you be able to look on this
>> during the next weeks or so?

> This method can be used by apps today.  The only enhancement that I can
> see being made is having the CM automatically send the messages at
> regular intervals.  But I hesitate to add this to the CM since it
> doesn't have knowledge of traffic occurring over the QP, and may
> interfere with the app wanted to actually change alternate path information.

You mean one side sends a LAP message with the current path and the
peer replies with an APR message confirming this is fine? I guess this LAP
sending has to be carried out by both sides, correct? And it's not supported
for RDMA-CM users...

As for your comments: assuming an app must notify the CM that it no
longer uses a QP (and if it doesn't, we declare it an RTFM bug), then as long
as the QP is alive from the CM's point of view, it's perfectly fine to send
these LAPs. Doing this once every few seconds or tens of seconds will not
create heavy load, I think. As for the point about interfering with apps
that want to use LAP/APR for an APM implementation over their protocols, we
can let the CM consumer specify whether they want the CM to issue keep-alives
for them, and how frequently to send the messages.

Or.


From ogerlitz at voltaire.com  Thu Dec 20 05:57:02 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 20 Dec 2007 15:57:02 +0200
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <C98692FD98048C41885E0B0FACD9DFB805BD3208@exnane01.hq.netapp.com>
References: <4767A2CD.8030209@voltaire.com> <4768289F.6040907@ichips.intel.com>
	<4769019C.10602@voltaire.com>
	<C98692FD98048C41885E0B0FACD9DFB805BD3208@exnane01.hq.netapp.com>
Message-ID: <476A74AE.1060602@voltaire.com>

Kanevsky, Arkady wrote:
> are you proposing that rdma_cm try to separate 2 cases.
> One where 2 sides each trying to set up a connection to another side,
> vs. where 2 sides are trying to set up 1 connection but each side
> issuing a connection request?

I am not proposing anything yet, but rather trying to understand from Sean
what his vision of a possible API is.

> Isn't it easier to handle in MPI which has a unique rank so only one
> side issues a connection request?

That holds for MPI schemes that connect all-to-all at job start, whereas I
am referring to the case of connections on demand.

Or.


From jeroen.vanaken at intec.ugent.be  Thu Dec 20 01:07:12 2007
From: jeroen.vanaken at intec.ugent.be (Jeroen Van Aken)
Date: Thu, 20 Dec 2007 10:07:12 +0100
Subject: ***SPAM*** RE: [ofa-general] ***SPAM*** SFS 3012 SRP problem
In-Reply-To: <A15335FBE9BD2449AF2C9EF3D1EB8EA304BCA334@xmb-sjc-216.amer.cisco.com>
References: <040b01c8424f$04e03030$0ea09090$@vanaken@intec.ugent.be>
	<A15335FBE9BD2449AF2C9EF3D1EB8EA304BCA334@xmb-sjc-216.amer.cisco.com>
Message-ID: <045d01c842e7$b65dac50$231904f0$@vanaken@intec.ugent.be>

We are using 2 IBM FAStT900's.

Normally the timestamps of the messages on both the SFS and the IB host
match.

Thanks

 
jeroen

 
From: Scott Weitzenkamp (sweitzen) [mailto:sweitzen at cisco.com] 
Sent: woensdag 19 december 2007 18:32
To: Jeroen Van Aken; general at lists.openfabrics.org
Subject: RE: [ofa-general] ***SPAM*** SFS 3012 SRP problem

 
If you have a Cisco support contract, you should open a case with the Cisco
TAC.

 
What kind of FC storage are you using?

 
The chassis syslog messages show the host is unresponsive (the OUT_OF_SERVICE
and IN_SERVICE messages).  Does the timing of these messages match the ib_srp
messages on the host?

 
Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

 
  _____  


From: general-bounces at lists.openfabrics.org
[mailto:general-bounces at lists.openfabrics.org] On Behalf Of Jeroen Van Aken
Sent: Wednesday, December 19, 2007 6:54 AM
To: general at lists.openfabrics.org
Subject: [ofa-general] ***SPAM*** SFS 3012 SRP problem

Hello

 
We are doing some SRP tests with the Cisco SFS 3012 Gateway. We connected 4
hosts to the SFS 3012 gateway, each with 2 InfiniBand cables on one dual-port
InfiniBand card. The gateway is also connected to our Fibre Channel storage.
Each host runs OFED-1.3-beta2. The InfiniBand cards are Mellanox Technologies
MT25208 InfiniHost III Ex (rev a0) and Mellanox Technologies MT23108
InfiniHost (rev a1).

When generating heavy load over the switch (by reading from our FC storage
over all the luns simultaneously), we sometimes get the following errors:

On the hosts: 

 
Dec 13 13:07:54 gpfs4n1 syslog-ng[8212]: STATS: dropped 0

Dec 13 13:20:26 gpfs4n1 run_srp_daemon[8422]: failed srp_daemon:
[HCA=mthca0] [port=1] [exit status=110]. Will try to restart srp_daemon
periodically. No more warnings will be issued in the next 7200 seconds if
the same problem repeats

Dec 13 13:20:27 gpfs4n1 run_srp_daemon[8428]: starting srp_daemon:
[HCA=mthca0] [port=1]

Dec 13 14:01:20 gpfs4n1 sshd[8539]: Accepted keyboard-interactive/pam for
root from 172.16.0.18 port 3545 ssh2

Dec 13 14:07:55 gpfs4n1 syslog-ng[8212]: STATS: dropped 0

Dec 13 14:13:01 gpfs4n1 syslog-ng[8212]: Changing permissions on special
file /dev/xconsole

Dec 13 14:13:01 gpfs4n1 syslog-ng[8212]: Changing permissions on special
file /dev/tty10

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:01 gpfs4n1 kernel: SRP abort called

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed send status 12

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed send status 12

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

Dec 13 14:13:02 gpfs4n1 kernel: ib_srp: failed receive status 5

 
On the switch ts_log

**************************************SWITCH
LOG***************************************************************

Dec 13 14:04:30 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
multicast membership change

Dec 13 14:05:49 topspin-cc ib_sm.x[1383]: [INFO]: Session not initiated:
Cold Sync Limit exceeded for Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:07:49 topspin-cc ib_sm.x[1383]: [INFO]: Initialize a backup
session with Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:07:59 topspin-cc ib_sm.x[1383]: [INFO]: Session initialization
failed with Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:09:59 topspin-cc ib_sm.x[1383]: [INFO]: Initialize a backup
session with Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:10:09 topspin-cc ib_sm.x[1383]: [INFO]: Session initialization
failed with Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:12:06 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM OUT_OF_SERVICE
trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:21

Dec 13 14:12:06 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM OUT_OF_SERVICE
trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:22

Dec 13 14:12:06 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
discovering removed ports

Dec 13 14:12:07 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
multicast membership change

Dec 13 14:12:09 topspin-cc ib_sm.x[1383]: [INFO]: Session not initiated:
Cold Sync Limit exceeded for Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:12:18 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for read, err=11, t1=1, t2=0

Dec 13 14:12:22 topspin-cc last message repeated 4 times

Dec 13 14:12:36 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
discovering new ports

Dec 13 14:12:37 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM IN_SERVICE
trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:21

Dec 13 14:12:37 topspin-cc ib_sm.x[1357]: [INFO]: Generate SM IN_SERVICE
trap for GID=fe:80:00:00:00:00:00:00:00:05:ad:00:00:1d:ce:22

13 14:12:3

Dec 13 14:12:38 topspin-cc ib_sm.x[1357]: [INFO]: Configuration caused by
multicast membership change

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:3

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

13 14:12:4

Dec 13 14:13:28 topspin-cc chassis_mgr.x[1084]: [WARN]: tsIpcMessageSend
failed, fd=28, vp=2, err=104, Connection reset by peer

Dec 13 14:13:39 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:13:39 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:13:40 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

Dec 13 14:13:46 topspin-cc web_agent.x[1370]: [INFO]: ipc: select(fd=3)
failed for read, err=11, t1=10, t2=0

Dec 13 14:13:50 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

Dec 13 14:13:50 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:13:50 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:14:18 topspin-cc ib_sm.x[1383]: [INFO]: Session not initiated:
Cold Sync Limit exceeded for Standby SM guid 00:05:ad:00:00:08:94:5d

Dec 13 14:14:38 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:14:38 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:14:40 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

Dec 13 14:14:49 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:14:50 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:14:50 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

Dec 13 14:15:00 topspin-cc web_agent.x[1370]: [INFO]: ipc: select(fd=3)
failed for read, err=11, t1=10, t2=0

Dec 13 14:15:00 topspin-cc chassis_mgr.x[1084]: [INFO]: ipc: select(fd=28)
failed for write, err=11, t1=10, t2=0

Dec 13 14:15:00 topspin-cc chassis_mgr.x[1084]: [INFO]: tsIpcMessageSend
failed, fd=28, vp=2, err=11, Resource temporarily unavailable

Dec 13 14:15:00 topspin-cc snmp_agent.x[1208]: [INFO]: ipc: select(fd=5)
failed for read, err=11, t1=10, t2=0

 
It looks like some of the log entries are incomplete.

I think it is a switch-related issue: first because of the strange
format of the logs, and second because when this error occurs in the switch,
no SRP communication is possible on any of the IB hosts. I already tried
increasing the node timeout and setting RENICE_IB_MAD to yes, as described in
this thread:
http://lists.openfabrics.org/pipermail/general/2007-May/036465.html, but
this didn't help.

The issue occurs randomly, so it isn't easily reproduced.

Does anybody have an idea what went wrong?

 
Thanks in advance!

 
Jeroen Van Aken

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071220/d75aa72d/attachment.html>

From hrosenstock at xsigo.com  Thu Dec 20 06:07:39 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 20 Dec 2007 06:07:39 -0800
Subject: [ofa-general] smpquery regression in 1.3-rc1
In-Reply-To: <476A5530.3070602@dev.mellanox.co.il>
References: <20071219195839.GS412@sgi.com>
	<1198095057.6635.41.camel@hrosenstock-ws.xsigo.com>
	<476A5530.3070602@dev.mellanox.co.il>
Message-ID: <1198159659.6635.164.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-12-20 at 13:42 +0200, Yevgeny Kliteynik wrote:
> Hal Rosenstock wrote:
> > On Wed, 2007-12-19 at 11:58 -0800, akepner at sgi.com wrote:
> >> We're seeing a regression in smpquery from alpha2 to rc1. 
> >>
> >> For example, with alpha2 I get:
> >> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
> >> # Node info: Lid 3
> >> BaseVers:........................1
> >> ClassVers:.......................1
> >> NodeType:........................Channel Adapter
> >> NumPorts:........................2
> >> SystemGuid:......................0x00066a009800737c
> >> Guid:............................0x00066a009800737c
> >> PortGuid:........................0x00066a01a000737c
> >> PartCap:.........................64
> >> DevId:...........................0x6278
> >> Revision:........................0x000000a0
> >> LocalPort:.......................2
> >> VendorId:........................0x00066a
> >> grommit:~ # 
> >>
> >>
> >> And with rc1, I get:
> >> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
> >> ibwarn: [5650] ib_path_query: sa call path_query failed
> >> smpquery: iberror: failed: can't resolve destination port 0x66a01a000737c
> >> grommit:~ #  
> >>
> >> But using a LID works fine:
> >> grommit:~ # smpquery nodeinfo 3
> >> # Node info: Lid 3
> >> BaseVers:........................1
> >> ClassVers:.......................1
> >> NodeType:........................Channel Adapter
> >> NumPorts:........................2
> >> SystemGuid:......................0x00066a009800737c
> >> Guid:............................0x00066a009800737c
> >> PortGuid:........................0x00066a01a000737c
> >> PartCap:.........................64
> >> DevId:...........................0x6278
> >> Revision:........................0x000000a0
> >> LocalPort:.......................2
> >> VendorId:........................0x00066a
> >> grommit:~ # 
> >>
> >> Strangest of all, running it under strace also works:
> >> grommit:~ # strace smpquery -G nodeinfo 0x66a01a000737c > /tmp/smpquery.out 
> >> .....
> >> grommit:~ # cat /tmp/smpquery.out
> >> # Node info: Lid 3
> >> BaseVers:........................1
> >> ClassVers:.......................1
> >> NodeType:........................Channel Adapter
> >> NumPorts:........................2
> >> SystemGuid:......................0x00066a009800737c
> >> Guid:............................0x00066a009800737c
> >> PortGuid:........................0x00066a01a000737c
> >> PartCap:.........................64
> >> DevId:...........................0x6278
> >> Revision:........................0x000000a0
> >> LocalPort:.......................2
> >> VendorId:........................0x00066a
> >> grommit:~ #
> >>
> >> Some weird race condition...
> >>
> >> Anyone else seeing the same?
> > 
> > -G requires a SA path record lookup so this could be an issue with that
> > timing out in some cases (assuming the port is active and the SM is
> > operational).
> 
> I'm seeing the same problem.
> Sometimes the query works, and sometimes it doesn't.
> I also see that when the query fails, OpenSM doesn't get PathRecord query at all.
> 
> Hal, can you elaborate on "that timing out in some cases" issue?

I just meant that the SM not responding (for an unknown reason right
now) would yield this effect.

> Adding Jack for the libibmad issue:
> 
> I see that the ib_path_query() in libibmad/sa.c sometimes fails
> when calling safe_sa_call().

This could just be more detail on the same thing in terms of the
(smpquery) client which is layered on top of libibmad: the SA path query
timeout.

I would suggest running OpenSM in verbose mode (both instances are with
OpenSM) and seeing if it responds to the PathRecord query used by this
form of smpquery and continue troubleshooting from there based on the
result.

-- Hal

> -- Yevgeny
> 
> > -- Hal
> > _______________________________________________
> > general mailing list
> > general at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > 
> > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> > 
> 


From hnguyen at linux.vnet.ibm.com  Thu Dec 20 06:06:33 2007
From: hnguyen at linux.vnet.ibm.com (Hoang-Nam Nguyen)
Date: Thu, 20 Dec 2007 15:06:33 +0100
Subject: [ofa-general] [PATCH] IB/ehca: Forward event
	client-reregister-required to registered clients
Message-ID: <200712201506.34253.hnguyen@linux.vnet.ibm.com>

This patch allows ehca to forward the client-reregister-required event to
registered clients. Such an event is generated by the switch, e.g. after
its reboot.

Signed-off-by: Hoang-Nam Nguyen <hnguyen at de.ibm.com>
---
 drivers/infiniband/hw/ehca/ehca_irq.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c
index 3f617b2..4c734ec 100644
--- a/drivers/infiniband/hw/ehca/ehca_irq.c
+++ b/drivers/infiniband/hw/ehca/ehca_irq.c
@@ -62,6 +62,7 @@
 #define NEQE_PORT_NUMBER       EHCA_BMASK_IBM( 8, 15)
 #define NEQE_PORT_AVAILABILITY EHCA_BMASK_IBM(16, 16)
 #define NEQE_DISRUPTIVE        EHCA_BMASK_IBM(16, 16)
+#define NEQE_SPECIFIC_EVENT    EHCA_BMASK_IBM(16, 23)
 
 #define ERROR_DATA_LENGTH      EHCA_BMASK_IBM(52, 63)
 #define ERROR_DATA_TYPE        EHCA_BMASK_IBM( 0,  7)
@@ -354,6 +355,7 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe)
 {
 	u8 ec   = EHCA_BMASK_GET(NEQE_EVENT_CODE, eqe);
 	u8 port = EHCA_BMASK_GET(NEQE_PORT_NUMBER, eqe);
+	u8 spec_event;
 
 	switch (ec) {
 	case 0x30: /* port availability change */
@@ -394,6 +396,16 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe)
 	case 0x33:  /* trace stopped */
 		ehca_err(&shca->ib_device, "Traced stopped.");
 		break;
+	case 0x34: /* util async event */
+		spec_event = EHCA_BMASK_GET(NEQE_SPECIFIC_EVENT, eqe);
+		if (spec_event == 0x80) /* client reregister required */
+			dispatch_port_event(shca, port,
+					    IB_EVENT_CLIENT_REREGISTER,
+					    "client reregister req.");
+		else
+			ehca_warn(&shca->ib_device, "Unknown util async "
+				  "event %x on port %x", spec_event, port);
+		break;
 	default:
 		ehca_err(&shca->ib_device, "Unknown event code: %x on %s.",
 			 ec, shca->ib_device.name);
-- 
1.5.2
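
For readers unfamiliar with the bit numbering used by the NEQE_* masks, the
field extraction in the patch above can be sketched as follows. This is a
minimal standalone illustration, not the driver's EHCA_BMASK_GET macro. It
assumes IBM bit numbering (bit 0 is the most significant bit of the 64-bit
EQE), and ibm_field() and is_client_reregister() are hypothetical helper
names. The check of the event code (0x34) is left out for brevity.

```c
#include <stdint.h>

/*
 * Extract the field spanning IBM-numbered bits [from, to] of a 64-bit word.
 * Assumption: IBM numbering, i.e. bit 0 is the most significant bit and
 * bit 63 is the least significant one.
 */
static uint64_t ibm_field(uint64_t value, unsigned from, unsigned to)
{
	unsigned width = to - from + 1;
	uint64_t mask = (width >= 64) ? ~0ULL : ((1ULL << width) - 1);

	return (value >> (63 - to)) & mask;
}

/* Hypothetical decoder: returns 1 if the EQE asks clients to reregister. */
static int is_client_reregister(uint64_t eqe)
{
	/* Bits 16..23 carry the specific event (NEQE_SPECIFIC_EVENT). */
	return ibm_field(eqe, 16, 23) == 0x80;
}
```

With this numbering the port number (bits 8..15) sits 48 bits above the
least significant bit, and the specific event (bits 16..23) sits 40 bits
above it.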


From ogerlitz at voltaire.com  Thu Dec 20 07:08:45 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 20 Dec 2007 17:08:45 +0200
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <47696002.4030903@ichips.intel.com>
References: <4767A2CD.8030209@voltaire.com> <4768289F.6040907@ichips.intel.com>
	<4769019C.10602@voltaire.com> <47696002.4030903@ichips.intel.com>
Message-ID: <476A857D.3090608@voltaire.com>

Sean Hefty wrote:
...
> I didn't follow this.
...
> Peer to peer SIDs are in a different domain than client/server SIDs, and
> the peer_to_peer field is used to indicate which domain a SID is in.

Sorry if I wasn't clear. Let me see if I understand you: with this
different-domain implementation, under client/server the passive side
calls cm listen and the active side calls cm connect, whereas under
peer-to-peer both sides call cm listen and later either both sides or
only one side may call cm connect. Correct?

> To add to my comments on the CM API, struct ib_cm_req_param, which is
> used to send the REQ, includes service_id and peer_to_peer fields.  The
> latter is a boolean used by the CM to distinguish if incoming REQs can
> be matched with the outgoing REQ.

OK, this makes things clearer.

>> Why there should be a difference between the rdma-cm to the cm? if in
>> the cm you have a model without API change, wouldn't it apply also to
>> the rdma-cm?

> The rdma_cm does not know how to set the peer_to_peer field in the
> ib_cm_req_param.  It sets this field to 0 today.

But it could set it to one as well... Assuming my understanding above of
the suggested implementation is correct, we can change the RDMA-CM API
to let users specify on rdma_connect that they want peer-to-peer
support, so such apps can issue an rdma_listen call and later call
rdma_connect with this bit set, and they are done (or almost done... I
guess there is some more devil in the details here, isn't there?)

>  > I think that in the MPI world each rank gets a SID from the local CM and
>  > they exchange the SIDs out-of-band, then connections are opened. If its
>  > a connection-on-demand scheme, then when ever the rank process calls
>  > mpi_send() to peer for which the local MPI library does not have a
>  > connection, it tries to connect. So if this happens "at once" between
>  > some pair of ranks, there should be a way to form one connection out of
>  > these two connecting requests. My thinking/motivation is that support of
>  > this scheme should be in the IB stack (cm and rdma-cm) level and not in
>  > the specific MPI implementation level.
> 
> Are the out of band connections used by MPI formed using client/server
> or peer to peer?  I believe that Intel MPI has each rank listen for
> connections from the ranks below it using client/server.

Yes, MPIs that do all-to-all connect on job start typically use
client/server, where all ranks > 0 issue a listen call and then all
lower ranks connect to higher ranks, or they use some other
symmetry-breaking scheme. I am trying to see what needs to be supported
by the IB stack to let MPIs that connect on demand use the RDMA-CM.
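
The rank-based symmetry breaking mentioned above can be sketched as
follows. This is only an illustration of the "lower ranks connect to
higher ranks" rule, not code from any MPI implementation. The names
conn_role and pick_role are hypothetical.

```c
#include <stdint.h>

enum conn_role { ROLE_ACTIVE, ROLE_PASSIVE };

/*
 * Rank-based symmetry breaking: for any pair of distinct ranks, the
 * lower rank connects (active side) and the higher rank listens
 * (passive side). Equal ranks are not expected; they map to passive.
 */
static enum conn_role pick_role(uint32_t my_rank, uint32_t peer_rank)
{
	return (my_rank < peer_rank) ? ROLE_ACTIVE : ROLE_PASSIVE;
}
```

Because both ends evaluate the same rule with the arguments swapped,
exactly one of them ends up active. A connect-on-demand scheme could
apply the same rule to decide which of two crossing requests to keep.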

> There are a couple of problems with the peer to peer model.  First,
> unless the connections occur at exactly the same time, they miss
> connecting (rejected with invalid SID).  

This makes the whole peer-to-peer model useless, since an app cannot make
sure that connections occur at exactly the same time! My understanding of
the spec is that the peer-to-peer model can handle connections that occur
at exactly the same time, but not only those.

> Second, if multiple peer to
> peer connections need to form between the same pair of nodes, things can
> go screwy (that's the technical term) trying to match up the peer requests.

Under MPI each rank uses a different SID, so I think we are safe from 
this problem.

Or


From pasha at dev.mellanox.co.il  Thu Dec 20 07:14:48 2007
From: pasha at dev.mellanox.co.il (Pavel Shamis (Pasha))
Date: Thu, 20 Dec 2007 17:14:48 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent
	of any	one user process
In-Reply-To: <200712201535.37527.jackm@dev.mellanox.co.il>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
Message-ID: <476A86E8.8020308@dev.mellanox.co.il>

Adding Open MPI and MVAPICH community to the thread.

Pasha (Pavel Shamis)

Jack Morgenstein wrote:
> background:  see "XRC Cleanup order issue thread" at
>
> 	http://lists.openfabrics.org/pipermail/general/2007-December/043935.html
>
> (userspace process which created the receiving XRC qp on a given host dies before
> other processes which still need to receive XRC messages on their SRQs which are
> "paired" with the now-destroyed receiving XRC QP.)
>
> Solution: Add a userspace verb (as part of the XRC suite) which enables the user process
> to create an XRC QP owned by the kernel -- which belongs to the required XRC domain.
>
> This QP will be destroyed when the XRC domain is closed (i.e., as part of a ibv_close_xrc_domain
> call, but only when the domain's reference count goes to zero).
>
> Below, I give the new userspace API for this function.  Any feedback will be appreciated.
> This API will be implemented in the upcoming OFED 1.3 release, so we need feedback ASAP.
>
> Notes:
> 1. There is no query or destroy verb for this QP. There is also no userspace object for the
>    QP. Userspace has ONLY the raw qp number to use when creating the (X)RC connection.
>
> 2. Since the QP is "owned" by kernel space, async events for this QP are also handled in kernel
>    space (i.e., reported in /var/log/messages). There are no completion events for the QP, since
>    it does not send, and all receives completions are reported in the XRC SRQ's cq.
>
>    If this QP enters the error state, the remote QP which sends will start receiving RETRY_EXCEEDED
>    errors, so the application will be aware of the failure.
>
> - Jack
> ======================================================================================
> /**
>  * ibv_alloc_xrc_rcv_qp - creates an XRC QP for serving as a receive-side only QP,
>  *	and moves the created qp through the RESET->INIT and INIT->RTR transitions.
>  *      (The RTR->RTS transition is not needed, since this QP does no sending).
>  * 	The sending XRC QP uses this QP as destination, while specifying an XRC SRQ
>  * 	for actually receiving the transmissions and generating all completions on the
>  *	receiving side.
>  *
>  * 	This QP is created in kernel space, and persists until the XRC domain is closed.
>  *	(i.e., its reference count goes to zero).
>  *
>  * @pd: protection domain to use.  At lower layer, this provides access to userspace obj
>  * @xrc_domain: xrc domain to use for the QP.
>  * @attr: modify-qp attributes needed to bring the QP to RTR.
>  * @attr_mask:  bitmap indicating which attributes are provided in the attr struct.
>  * 	used for validity checking.
>  * @xrc_rcv_qpn: qp_num of created QP (if success). To be passed to the remote node. The
>  *               remote node will use xrc_rcv_qpn in ibv_post_send when sending to
>  *		 XRC SRQ's on this host in the same xrc domain.
>  *
>  * RETURNS: success (0), or a (negative) error value.
>  */
>
> int ibv_alloc_xrc_rcv_qp(struct ibv_pd *pd,
> 			 struct ibv_xrc_domain *xrc_domain,
> 			 struct ibv_qp_attr *attr,
> 			 enum ibv_qp_attr_mask attr_mask,
> 			 uint32_t *xrc_rcv_qpn);
>
> Notes:
>
> 1. Although the kernel creates the qp in the kernel's own PD, we still need the PD
>    parameter to determine the device.
>
> 2. I chose to use struct ibv_qp_attr, which is used in modify QP, rather than create
>    a new structure for this purpose.  This also guards against API changes in the event
>    that during development I notice that more modify-qp parameters must be specified
>    for this operation to work.
>
> 3. Table of the ibv_qp_attr parameters showing what values to set:
>
> struct ibv_qp_attr {
> 	enum ibv_qp_state	qp_state;		Not needed
> 	enum ibv_qp_state	cur_qp_state;		Not needed
> 		-- Driver starts from RESET and takes qp to RTR.
> 	enum ibv_mtu		path_mtu;		Yes
> 	enum ibv_mig_state	path_mig_state;		Yes
> 	uint32_t		qkey;			Yes
> 	uint32_t		rq_psn;			Yes
> 	uint32_t		sq_psn;			Not needed
> 	uint32_t		dest_qp_num;		Yes -- this is the remote side QP for the RC conn.
> 	int			qp_access_flags;	Yes
> 	struct ibv_qp_cap	cap;			Need only XRC domain. 
> 							Other caps will use hard-coded values:
> 								max_send_wr = 1;
> 								max_recv_wr = 0;
> 								max_send_sge = 1;
> 								max_recv_sge = 0;
> 								max_inline_data = 0;
> 	struct ibv_ah_attr	ah_attr;		Yes
> 	struct ibv_ah_attr	alt_ah_attr;		Optional
> 	uint16_t		pkey_index;		Yes
> 	uint16_t		alt_pkey_index;		Optional
> 	uint8_t			en_sqd_async_notify;	Not needed (No sq)
> 	uint8_t			sq_draining;		Not needed (No sq)
> 	uint8_t			max_rd_atomic;		Not needed (No sq)
> 	uint8_t			max_dest_rd_atomic;	Yes -- Total max outstanding RDMAs expected
> 							for ALL srq destinations using this receive QP.
> 							(if you are only using SENDs, this value can be 0).
> 	uint8_t			min_rnr_timer;		default - 0
> 	uint8_t			port_num;		Yes
> 	uint8_t			timeout;		Yes
> 	uint8_t			retry_cnt;		Yes
> 	uint8_t			rnr_retry;		Yes
> 	uint8_t			alt_port_num;		Optional
> 	uint8_t			alt_timeout;		Optional
> };
>
> 4. Attribute mask bits to set:
> 	For RESET_to_INIT transition:
> 		IB_QP_ACCESS_FLAGS | IB_QP_PKEY_INDEX | IB_QP_PORT
>
> 	For INIT_to_RTR transition:
> 		IB_QP_AV | IB_QP_PATH_MTU |
> 		IB_QP_DEST_QPN | IB_QP_RQ_PSN | IB_QP_MIN_RNR_TIMER
> 	   If you are using RDMA or atomics, also set:
> 		IB_QP_MAX_DEST_RD_ATOMIC
>
>


-- 
Pavel Shamis (Pasha)
Mellanox Technologies
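
For readers following note 4 of the RFC quoted above, the attribute mask
for the proposed verb could be assembled as sketched below. The EX_QP_*
names are stand-ins declared locally so the example compiles on its own.
Their bit positions are illustrative assumptions and are not copied from
the verbs headers. The helper xrc_rcv_qp_attr_mask() is hypothetical.

```c
/* Stand-in flags; bit positions are illustrative, not the verbs.h values. */
enum ex_qp_attr_mask {
	EX_QP_ACCESS_FLAGS       = 1 << 0,
	EX_QP_PKEY_INDEX         = 1 << 1,
	EX_QP_PORT               = 1 << 2,
	EX_QP_AV                 = 1 << 3,
	EX_QP_PATH_MTU           = 1 << 4,
	EX_QP_DEST_QPN           = 1 << 5,
	EX_QP_RQ_PSN             = 1 << 6,
	EX_QP_MIN_RNR_TIMER      = 1 << 7,
	EX_QP_MAX_DEST_RD_ATOMIC = 1 << 8
};

/* Build the mask covering both RESET->INIT and INIT->RTR, per note 4. */
static int xrc_rcv_qp_attr_mask(int uses_rdma_or_atomics)
{
	/* RESET->INIT */
	int mask = EX_QP_ACCESS_FLAGS | EX_QP_PKEY_INDEX | EX_QP_PORT;

	/* INIT->RTR */
	mask |= EX_QP_AV | EX_QP_PATH_MTU | EX_QP_DEST_QPN |
		EX_QP_RQ_PSN | EX_QP_MIN_RNR_TIMER;

	/* Only needed when the remote side issues RDMA reads or atomics. */
	if (uses_rdma_or_atomics)
		mask |= EX_QP_MAX_DEST_RD_ATOMIC;

	return mask;
}
```

A SEND-only application would call xrc_rcv_qp_attr_mask(0) and leave
max_dest_rd_atomic at 0, as the table in note 3 suggests.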


From sashak at voltaire.com  Thu Dec 20 07:43:01 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 20 Dec 2007 15:43:01 +0000
Subject: [ofa-general] Re: [PATCH] opensm: osm_state_mgr.c - stop idle queue
	processing if heavy sweep requested
In-Reply-To: <4768CAD8.4010407@dev.mellanox.co.il>
References: <47667AB2.8030500@dev.mellanox.co.il>
	<20071218154033.GA4232@sashak.voltaire.com>
	<4768CAD8.4010407@dev.mellanox.co.il>
Message-ID: <20071220154301.GC15888@sashak.voltaire.com>

On 09:40 Wed 19 Dec     , Yevgeny Kliteynik wrote:
>  Sasha Khapyorsky wrote:
> > Hi Yevgeny,
> > On 15:33 Mon 17 Dec     , Yevgeny Kliteynik wrote:
> >> If a heavy sweep requested during idle queue processing, OSM continues
> >> to process it till the end and only then notices the heavy sweep request.
> >> In some cases this might leave a topology change unhandled for several
> >> minutes.
> > Could you provide more details about such cases?
> > As far as I know the idle queue is used only for multicast re-routing.
> > If so, it is interesting by itself why it takes minutes and where. Is
> > where MCG join/leave storm?
> 
>  Exactly. The problem was discovered on a big cluster with hundreds of mcast
>  groups, when there is some massive change in the subnet (like rebooting
>  hundreds of nodes).

Ok, then the proposed patch looks like half a solution to me.

During a mcast join/leave storm the idle queue will be filled with requests
to rebuild mcast routing. OpenSM will process them one by one (and this will
take a lot of time) instead of processing all pending mcast groups in one
run. I think this is the first improvement needed here.

Even with such an improvement we will not be able to control the order of
heavy sweep/mcast join requests, so basically the idea of breaking idle
queue processing looks fine to me, but it is not all that should be
done here. A heavy sweep by itself recalculates mcast routing for all
existing groups, so it should invalidate all pending mcast rerouting
requests instead of continuing idle queue processing after the heavy
sweep. Makes sense?
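
A rough sketch of what I mean, as a toy model only. This is not OpenSM
code: struct idle_queue, idle_queue_push() and idle_queue_drain() are
hypothetical names, and the fixed-size array just keeps the example short.

```c
#include <stddef.h>

#define MAX_PENDING 64

struct idle_queue {
	unsigned mgrp_ids[MAX_PENDING];	/* groups awaiting mcast rerouting */
	size_t count;
};

/*
 * Queue a rerouting request. Duplicate requests for the same group are
 * coalesced. Returns 0 on success, -1 if the queue is full.
 */
static int idle_queue_push(struct idle_queue *q, unsigned mgrp_id)
{
	for (size_t i = 0; i < q->count; i++)
		if (q->mgrp_ids[i] == mgrp_id)
			return 0;	/* already pending */
	if (q->count == MAX_PENDING)
		return -1;
	q->mgrp_ids[q->count++] = mgrp_id;
	return 0;
}

/*
 * Handle all pending groups in one run and return how many needed
 * rerouting. If a heavy sweep was requested, the pending requests are
 * dropped instead, since the sweep recalculates routing for all groups.
 */
static size_t idle_queue_drain(struct idle_queue *q, int heavy_sweep_requested)
{
	size_t rerouted = heavy_sweep_requested ? 0 : q->count;

	q->count = 0;
	return rerouted;
}
```

The point is that the queue is emptied in both cases: either all pending
groups are handled in a single pass, or the heavy sweep makes them
obsolete.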

Sasha

> 
>  -- Yevgeny
> 
> > Or single re-routing cycle takes minutes?
> > Sasha
> >> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> >> ---
> >>  opensm/opensm/osm_state_mgr.c |   31 ++++++++++++++++++++++++-------
> >>  1 files changed, 24 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
> >> index 5c39f11..6ee5ee6 100644
> >> --- a/opensm/opensm/osm_state_mgr.c
> >> +++ b/opensm/opensm/osm_state_mgr.c
> >> @@ -1607,13 +1607,30 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
> >>  				/* CALL the done function */
> >>  				__process_idle_time_queue_done(p_mgr);
> >>
> >> -				/*
> >> -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
> >> -				 * so that the next element in the queue gets processed
> >> -				 */
> >> -
> >> -				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
> >> -				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
> >> +				if (p_mgr->p_subn->force_immediate_heavy_sweep) {
> >> +					/*
> >> +					 * Do not read next item from the idle queue.
> >> +					 * Immediate heavy sweep is requested, so it's
> >> +					 * more important.
> >> +					 * Besides, there is a chance that after the
> >> +					 * heavy sweep completion, idle queue processing
> >> +					 * that SM would have performed here will be obsolete.
> >> +					 */
> >> +					if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG))
> >> +						osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> >> +						"osm_state_mgr_process: "
> >> +						"interrupting idle time queue processing - heavy sweep requested\n");
> >> +					signal = OSM_SIGNAL_NONE;
> >> +					p_mgr->state = OSM_SM_STATE_IDLE;
> >> +				}
> >> +				else {
> >> +					/*
> >> +					 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
> >> +					 * so that the next element in the queue gets processed
> >> +					 */
> >> +					signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
> >> +					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
> >> +				}
> >>  				break;
> >>
> >>  			default:
> >> -- 
> >> 1.5.1.4
> >>
> 


From kliteyn at dev.mellanox.co.il  Thu Dec 20 07:43:37 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 20 Dec 2007 17:43:37 +0200
Subject: [ofa-general] smpquery regression in 1.3-rc1
In-Reply-To: <1198159659.6635.164.camel@hrosenstock-ws.xsigo.com>
References: <20071219195839.GS412@sgi.com>	
	<1198095057.6635.41.camel@hrosenstock-ws.xsigo.com>	
	<476A5530.3070602@dev.mellanox.co.il>
	<1198159659.6635.164.camel@hrosenstock-ws.xsigo.com>
Message-ID: <476A8DA9.8040408@dev.mellanox.co.il>

Hal Rosenstock wrote:
> On Thu, 2007-12-20 at 13:42 +0200, Yevgeny Kliteynik wrote:
>> Hal Rosenstock wrote:
>>> On Wed, 2007-12-19 at 11:58 -0800, akepner at sgi.com wrote:
>>>> We're seeing a regression in smpquery from alpha2 to rc1. 
>>>>
>>>> For example, with alpha2 I get:
>>>> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
>>>> # Node info: Lid 3
>>>> BaseVers:........................1
>>>> ClassVers:.......................1
>>>> NodeType:........................Channel Adapter
>>>> NumPorts:........................2
>>>> SystemGuid:......................0x00066a009800737c
>>>> Guid:............................0x00066a009800737c
>>>> PortGuid:........................0x00066a01a000737c
>>>> PartCap:.........................64
>>>> DevId:...........................0x6278
>>>> Revision:........................0x000000a0
>>>> LocalPort:.......................2
>>>> VendorId:........................0x00066a
>>>> grommit:~ # 
>>>>
>>>>
>>>> And with rc1, I get:
>>>> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
>>>> ibwarn: [5650] ib_path_query: sa call path_query failed
>>>> smpquery: iberror: failed: can't resolve destination port 0x66a01a000737c
>>>> grommit:~ #  
>>>>
>>>> But using a LID works fine:
>>>> grommit:~ # smpquery nodeinfo 3
>>>> # Node info: Lid 3
>>>> BaseVers:........................1
>>>> ClassVers:.......................1
>>>> NodeType:........................Channel Adapter
>>>> NumPorts:........................2
>>>> SystemGuid:......................0x00066a009800737c
>>>> Guid:............................0x00066a009800737c
>>>> PortGuid:........................0x00066a01a000737c
>>>> PartCap:.........................64
>>>> DevId:...........................0x6278
>>>> Revision:........................0x000000a0
>>>> LocalPort:.......................2
>>>> VendorId:........................0x00066a
>>>> grommit:~ # 
>>>>
>>>> Strangest of all, running it under strace also works:
>>>> grommit:~ # strace smpquery -G nodeinfo 0x66a01a000737c > /tmp/smpquery.out 
>>>> .....
>>>> grommit:~ # cat /tmp/smpquery.out
>>>> # Node info: Lid 3
>>>> BaseVers:........................1
>>>> ClassVers:.......................1
>>>> NodeType:........................Channel Adapter
>>>> NumPorts:........................2
>>>> SystemGuid:......................0x00066a009800737c
>>>> Guid:............................0x00066a009800737c
>>>> PortGuid:........................0x00066a01a000737c
>>>> PartCap:.........................64
>>>> DevId:...........................0x6278
>>>> Revision:........................0x000000a0
>>>> LocalPort:.......................2
>>>> VendorId:........................0x00066a
>>>> grommit:~ #
>>>>
>>>> Some weird race condition...
>>>>
>>>> Anyone else seeing the same?
>>> -G requires a SA path record lookup so this could be an issue with that
>>> timing out in some cases (assuming the port is active and the SM is
>>> operational).
>> I'm seeing the same problem.
>> Sometimes the query works, and sometimes it doesn't.
>> I also see that when the query fails, OpenSM doesn't get PathRecord query at all.
>>
>> Hal, can you elaborate on "that timing out in some cases" issue?
> 
> I just meant that the SM not responding (for an unknown reason right
> now) would yield this effect.
> 
>> Adding Jack for the libibmad issue:
>>
>> I see that the ib_path_query() in libibmad/sa.c sometimes fails
>> when calling safe_sa_call().
> 
> This could just be more detail on the same thing in terms of the
> (smpquery) client which is layered on top of libibmad: the SA path query
> timeout.
> I would suggest running OpenSM in verbose mode (both instances are with
> OpenSM) and seeing if it responds to the PathRecord query used by this
> form of smpquery and continue troubleshooting from there based on the
> result.

This is actually what I was saying here.
I have *debugged* smpquery, and saw that the failing function is
ib_path_query() in libibmad/sa.c
As I've mentioned, I did run it with OpenSM in verbose mode, and saw
that when smpquery fails, OpenSM log does not have any PathRecord request.
When smpquery passes, I see the PathRecord request and response in the
OpenSM log.

-- Yevgeny

> -- Hal
> 
>> -- Yevgeny
>>
>>> -- Hal
> 


From Arkady.Kanevsky at netapp.com  Thu Dec 20 07:48:29 2007
From: Arkady.Kanevsky at netapp.com (Kanevsky, Arkady)
Date: Thu, 20 Dec 2007 10:48:29 -0500
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <476A857D.3090608@voltaire.com>
References: <4767A2CD.8030209@voltaire.com>
	<4768289F.6040907@ichips.intel.com><4769019C.10602@voltaire.com>
	<47696002.4030903@ichips.intel.com> <476A857D.3090608@voltaire.com>
Message-ID: <C98692FD98048C41885E0B0FACD9DFB805BD355A@exnane01.hq.netapp.com>

So in a nutshell the proposal is to add
an identifier to the "CM private data" which indicates that this
is the peer-to-peer model, plus unique peer IDs for the requested
connection.

Is this the model?
Thanks,

Arkady Kanevsky                       email: arkady at netapp.com
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.        Fax: 781-895-1195
Waltham, MA 02451                   central phone: 781-768-5300
 

> -----Original Message-----
> From: Or Gerlitz [mailto:ogerlitz at voltaire.com] 
> Sent: Thursday, December 20, 2007 10:09 AM
> To: Sean Hefty
> Cc: OpenFabrics General
> Subject: Re: [ofa-general] peer to peer connections support
> 
> Sean Hefty wrote:
> ...
> > I didn't follow this.
> ...
> > Peer to peer SIDs are in a different domain than 
> client/server SIDs, 
> > and the peer_to_peer field is used to indicate which domain 
> a SID is in.
> 
> Sorry if I wasn't clear, let me see if I understand you: with 
> this different domain implementation, under both 
> client/server the passive calls cm listen and the active call 
> cm connect, where under peer/to/peer both sides call cm 
> listen and later both sides may call cm connect or only one 
> side, correct?
> 
> > To add to my comments on the CM API, struct 
> ib_cm_req_param, which is 
> > used to send the REQ, includes service_id and peer_to_peer fields.  
> > The latter is a boolean used by the CM to distinguish if 
> incoming REQs 
> > can be matched with the outgoing REQ.
> 
> OK, this makes things clearer.
> 
> >> Why there should be a difference between the rdma-cm to 
> the cm? if in 
> >> the cm you have a model without API change, wouldn't it 
> apply also to 
> >> the rdma-cm?
> 
> > The rdma_cm does not know how to set the peer_to_peer field in the 
> > ib_cm_req_param.  It sets this field to 0 today.
> 
> But it could set it to one as well... assuming my 
> understanding above of the suggested implementation is 
> correct, we can change the RDMA-CM API to let users specify 
> on rdma_connect that they want peer to peer support, so such 
> apps can issue rdma_listen call and later call rdma_connect 
> with this bit set and they are done (or almost done... I 
> guess there some more devil in the details here, isn't it?)
> 
> >  > I think that in the MPI world each rank gets a SID from 
> the local 
> > CM and  > they exchange the SIDs out-of-band, then connections are 
> > opened. If its  > a connection-on-demand scheme, then when ever the 
> > rank process calls  > mpi_send() to peer for which the local MPI 
> > library does not have a  > connection, it tries to connect. 
> So if this 
> > happens "at once" between  > some pair of ranks, there 
> should be a way 
> > to form one connection out of  > these two connecting requests. My 
> > thinking/motivation is that support of  > this scheme 
> should be in the 
> > IB stack (cm and rdma-cm) level and not in  > the specific 
> MPI implementation level.
> > 
> > Are the out of band connections used by MPI formed using 
> client/server 
> > or peer to peer?  I believe that Intel MPI has each rank listen for 
> > connections from the ranks below it using client/server.
> 
> yes, MPIs that do all-to-all-connect on job start, typically 
> use client/server where all the ranks > 0 issue listen call 
> and then all lower ranks connect to higher ranks or etc some 
> other symmetry breaking scheme. I am trying to see what needs 
> to be supported by the IB stack to let MPIs that do connect 
> on demand use the RDMA-CM.
> 
> > There are a couple of problems with the peer to peer model.  First, 
> > unless the connections occur at exactly the same time, they miss 
> > connecting (rejected with invalid SID).
> 
> This makes the all peer to peer model useless, since an app 
> can not make sure that connection occur at exactly the same 
> time! my understanding of the spec is that peer to peer model 
> has the ability to handle also connections that occur at 
> exactly the same time but not only.
> 
> > Second, if multiple peer to
> > peer connections need to form between the same pair of 
> nodes, things 
> > can go screwy (that's the technical term) trying to match 
> up the peer requests.
> 
> Under MPI each rank uses a different SID, so I think we are 
> safe from this problem.
> 
> Or
> 
> 
> 
> 
> 


From changquing.tang at hp.com  Thu Dec 20 08:24:09 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Thu, 20 Dec 2007 16:24:09 +0000
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any	one user process
In-Reply-To: <476A86E8.8020308@dev.mellanox.co.il>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<476A86E8.8020308@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FE239BCBD@G5W0278.americas.hpqcorp.net>


Jack:
        Thanks for adding this new function; this is what we need. There is one issue I want to clarify.

This new "kernel"-owned QP "will be destroyed when the XRC domain is closed
(i.e., as part of a ibv_close_xrc_domain call, but only when the domain's reference count goes to zero)".

        Suppose I have an MPI server process on a node, and many other MPI client processes dynamically
connect to and disconnect from the server. The server uses the same XRC domain throughout.

        Will this cause "kernel" QPs to accumulate for such an application? We want the server to run 365 days
a year.
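
To make the concern concrete, here is a toy model of the lifetime rule as
I read it from the RFC. It is not the OFED implementation. The names
toy_xrc_domain, toy_open(), toy_alloc_rcv_qp() and toy_close() are
hypothetical.

```c
struct toy_xrc_domain {
	int refcount;	/* processes that currently have the domain open */
	int kernel_qps;	/* kernel-owned receive QPs attached to the domain */
};

static void toy_open(struct toy_xrc_domain *d)
{
	d->refcount++;
}

/* Models one successful call to the proposed ibv_alloc_xrc_rcv_qp. */
static void toy_alloc_rcv_qp(struct toy_xrc_domain *d)
{
	d->kernel_qps++;
}

/* The QPs are destroyed only when the last user closes the domain. */
static void toy_close(struct toy_xrc_domain *d)
{
	if (d->refcount > 0 && --d->refcount == 0)
		d->kernel_qps = 0;
}
```

In this model a long-running server that never closes the domain keeps
every receive QP created for clients that have already disconnected.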


Thanks.
--CQ


> -----Original Message-----
> From: Pavel Shamis (Pasha) [mailto:pasha at dev.mellanox.co.il]
> Sent: Thursday, December 20, 2007 9:15 AM
> To: Jack Morgenstein
> Cc: Tang, Changqing; Roland Dreier;
> general at lists.openfabrics.org; Open MPI Developers;
> mvapich-discuss at cse.ohio-state.edu
> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> independent of any one user process
>
> Adding Open MPI and MVAPICH community to the thread.
>
> Pasha (Pavel Shamis)
>
> Jack Morgenstein wrote:
> > background:  see "XRC Cleanup order issue thread" at
> >
> >
> >
> http://lists.openfabrics.org/pipermail/general/2007-December/043935.ht
> > ml
> >
> > (userspace process which created the receiving XRC qp on a
> given host
> > dies before other processes which still need to receive XRC
> messages
> > on their SRQs which are "paired" with the now-destroyed
> receiving XRC
> > QP.)
> >
> > Solution: Add a userspace verb (as part of the XRC suite) which
> > enables the user process to create an XRC QP owned by the
> kernel -- which belongs to the required XRC domain.
> >
> > This QP will be destroyed when the XRC domain is closed
> (i.e., as part
> > of a ibv_close_xrc_domain call, but only when the domain's
> reference count goes to zero).
> >
> > Below, I give the new userspace API for this function.  Any
> feedback will be appreciated.
> > This API will be implemented in the upcoming OFED 1.3
> release, so we need feedback ASAP.
> >
> > Notes:
> > 1. There is no query or destroy verb for this QP. There is
> also no userspace object for the
> >    QP. Userspace has ONLY the raw qp number to use when
> creating the (X)RC connection.
> >
> > 2. Since the QP is "owned" by kernel space, async events
> for this QP are also handled in kernel
> >    space (i.e., reported in /var/log/messages). There are
> no completion events for the QP, since
> >    it does not send, and all receives completions are
> reported in the XRC SRQ's cq.
> >
> >    If this QP enters the error state, the remote QP which
> sends will start receiving RETRY_EXCEEDED
> >    errors, so the application will be aware of the failure.
> >
> > - Jack
> >
> ======================================================================
> > ================
> > /**
> >  * ibv_alloc_xrc_rcv_qp - creates an XRC QP for serving as a receive-side only QP,
> >  *    and moves the created qp through the RESET->INIT and INIT->RTR transitions.
> >  *      (The RTR->RTS transition is not needed, since this QP does no sending).
> >  *    The sending XRC QP uses this QP as destination, while specifying an XRC SRQ
> >  *    for actually receiving the transmissions and generating all completions on the
> >  *    receiving side.
> >  *
> >  *    This QP is created in kernel space, and persists until the XRC domain is closed.
> >  *    (i.e., its reference count goes to zero).
> >  *
> >  * @pd: protection domain to use.  At lower layer, this provides access to userspace obj
> >  * @xrc_domain: xrc domain to use for the QP.
> >  * @attr: modify-qp attributes needed to bring the QP to RTR.
> >  * @attr_mask:  bitmap indicating which attributes are provided in the attr struct.
> >  *    used for validity checking.
> >  * @xrc_rcv_qpn: qp_num of created QP (if success). To be passed to the remote node. The
> >  *               remote node will use xrc_rcv_qpn in ibv_post_send when sending to
> >  *               XRC SRQ's on this host in the same xrc domain.
> >  *
> >  * RETURNS: success (0), or a (negative) error value.
> >  */
> >
> > int ibv_alloc_xrc_rcv_qp(struct ibv_pd *pd,
> >                        struct ibv_xrc_domain *xrc_domain,
> >                        struct ibv_qp_attr *attr,
> >                        enum ibv_qp_attr_mask attr_mask,
> >                        uint32_t *xrc_rcv_qpn);
> >
> > Notes:
> >
> > 1. Although the kernel creates the qp in the kernel's own PD, we still need the PD
> >    parameter to determine the device.
> >
> > 2. I chose to use struct ibv_qp_attr, which is used in modify QP, rather than create
> >    a new structure for this purpose.  This also guards against API changes in the event
> >    that during development I notice that more modify-qp parameters must be specified
> >    for this operation to work.
> >
> > 3. Table of the ibv_qp_attr parameters showing what values to set:
> >
> > struct ibv_qp_attr {
> >       enum ibv_qp_state       qp_state;               Not needed
> >       enum ibv_qp_state       cur_qp_state;           Not needed
> >               -- Driver starts from RESET and takes qp to RTR.
> >       enum ibv_mtu            path_mtu;               Yes
> >       enum ibv_mig_state      path_mig_state;         Yes
> >       uint32_t                qkey;                   Yes
> >       uint32_t                rq_psn;                 Yes
> >       uint32_t                sq_psn;                 Not needed
> >       uint32_t                dest_qp_num;            Yes -- this is the remote side QP for the RC conn.
> >       int                     qp_access_flags;        Yes
> >       struct ibv_qp_cap       cap;                    Need only XRC domain.
> >                                                       Other caps will use hard-coded values:
> >                                                         max_send_wr = 1;
> >                                                         max_recv_wr = 0;
> >                                                         max_send_sge = 1;
> >                                                         max_recv_sge = 0;
> >                                                         max_inline_data = 0;
> >       struct ibv_ah_attr      ah_attr;                Yes
> >       struct ibv_ah_attr      alt_ah_attr;            Optional
> >       uint16_t                pkey_index;             Yes
> >       uint16_t                alt_pkey_index;         Optional
> >       uint8_t                 en_sqd_async_notify;    Not needed (No sq)
> >       uint8_t                 sq_draining;            Not needed (No sq)
> >       uint8_t                 max_rd_atomic;          Not needed (No sq)
> >       uint8_t                 max_dest_rd_atomic;     Yes -- Total max outstanding RDMAs expected
> >                                                       for ALL srq destinations using this receive QP.
> >                                                       (if you are only using SENDs, this value can be 0).
> >       uint8_t                 min_rnr_timer;          default - 0
> >       uint8_t                 port_num;               Yes
> >       uint8_t                 timeout;                Yes
> >       uint8_t                 retry_cnt;              Yes
> >       uint8_t                 rnr_retry;              Yes
> >       uint8_t                 alt_port_num;           Optional
> >       uint8_t                 alt_timeout;            Optional
> > };
> >
> > 4. Attribute mask bits to set:
> >       For RESET_to_INIT transition:
> >               IB_QP_ACCESS_FLAGS | IB_QP_PKEY_INDEX | IB_QP_PORT
> >
> >       For INIT_to_RTR transition:
> >               IB_QP_AV | IB_QP_PATH_MTU |
> >               IB_QP_DEST_QPN | IB_QP_RQ_PSN | IB_QP_MIN_RNR_TIMER
> >          If you are using RDMA or atomics, also set:
> >               IB_QP_MAX_DEST_RD_ATOMIC
> >
> >
> > _______________________________________________
> > general mailing list
> > general at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >
> > To unsubscribe, please visit
> > http://openib.org/mailman/listinfo/openib-general
> >
> >
>
>
> --
> Pavel Shamis (Pasha)
> Mellanox Technologies
>
>


From kliteyn at dev.mellanox.co.il  Thu Dec 20 08:41:51 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 20 Dec 2007 18:41:51 +0200
Subject: [ofa-general] Re: [PATCH] opensm: osm_state_mgr.c - stop idle queue
 processing if heavy sweep requested
In-Reply-To: <20071220154301.GC15888@sashak.voltaire.com>
References: <47667AB2.8030500@dev.mellanox.co.il>
	<20071218154033.GA4232@sashak.voltaire.com>
	<4768CAD8.4010407@dev.mellanox.co.il>
	<20071220154301.GC15888@sashak.voltaire.com>
Message-ID: <476A9B4F.8060807@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> On 09:40 Wed 19 Dec     , Yevgeny Kliteynik wrote:
>>  Sasha Khapyorsky wrote:
>>> Hi Yevgeny,
>>> On 15:33 Mon 17 Dec     , Yevgeny Kliteynik wrote:
>>>> If a heavy sweep is requested during idle queue processing, OSM continues
>>>> to process it till the end and only then notices the heavy sweep request.
>>>> In some cases this might leave a topology change unhandled for several
>>>> minutes.
>>> Could you provide more details about such cases?
>>> As far as I know the idle queue is used only for multicast re-routing.
>>> If so, it is interesting by itself why it takes minutes and where. Is
>>> there an MCG join/leave storm?
>>  Exactly. The problem was discovered on a big cluster with hundreds of mcast groups,
>>  when there is some massive change in the subnet (like rebooting hundreds of nodes).
> 
> Ok, then proposed patch looks like half solution for me.
> 
> During mcast join/leave storm idle queue will be filled with requests to
> rebuild mcast routing. OpenSM will process it one by one (and this will
> take a lot of time) instead of process all pended mcast groups in one
> run. I think it is first improvement needed here.
> 
> Even with such improvement we will not be able to control the order of
> heavy sweep/mcast join requests, so basically idea of breaking idle
> queue processing looks fine for me, but it is not all what should be
> done here. Heavy sweep by itself recalculates mcast routing for all
> existing groups, it should invalidate all pended mcast rerouting
> requests instead of continuing idle queue processing after heavy
> sweep. Make sense?

OK, makes sense.
So bottom line, when breaking the idle queue processing because of an immediate
sweep request, the state manager should just purge the whole idle queue and then
start the new heavy sweep.

I'll work on it.
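
A minimal sketch of the "purge, then sweep" behaviour agreed on above. The
types and function names here (struct state_mgr, struct idle_item,
purge_idle_queue, idle_item_done) are hypothetical stand-ins for
illustration only; they are not the real OpenSM structures or API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real OpenSM types. */
struct idle_item {
	struct idle_item *next;
	int mgrp_id;                 /* which mcast group needs re-routing */
};

struct state_mgr {
	struct idle_item *idle_head; /* pending idle-time requests */
	bool force_heavy_sweep;      /* an immediate heavy sweep was requested */
};

/* Drop every pending idle request; returns the number of items purged. */
static size_t purge_idle_queue(struct state_mgr *mgr)
{
	size_t purged = 0;

	while (mgr->idle_head) {
		struct idle_item *item = mgr->idle_head;

		mgr->idle_head = item->next;
		free(item);
		purged++;
	}
	return purged;
}

/*
 * Called after one idle item has been handled.  Returns true if idle
 * processing should continue with the next item, false if the caller
 * should stop (either the queue is empty or a heavy sweep takes over).
 */
static bool idle_item_done(struct state_mgr *mgr)
{
	if (mgr->force_heavy_sweep) {
		/* The heavy sweep recalculates mcast routing for all groups,
		 * so the queued requests are obsolete. */
		purge_idle_queue(mgr);
		return false;
	}
	return mgr->idle_head != NULL;
}
```

The point of the design is that the queue is emptied before the sweep
starts, so nothing stale is left to resume afterwards.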

-- Yevgeny

> Sasha
> 
>>  -- Yevgeny
>>
>>> Or single re-routing cycle takes minutes?
>>> Sasha
>>>> Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>>>> ---
>>>>  opensm/opensm/osm_state_mgr.c |   31 ++++++++++++++++++++++++-------
>>>>  1 files changed, 24 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
>>>> index 5c39f11..6ee5ee6 100644
>>>> --- a/opensm/opensm/osm_state_mgr.c
>>>> +++ b/opensm/opensm/osm_state_mgr.c
>>>> @@ -1607,13 +1607,30 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>>>>  				/* CALL the done function */
>>>>  				__process_idle_time_queue_done(p_mgr);
>>>>
>>>> -				/*
>>>> -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>>>> -				 * so that the next element in the queue gets processed
>>>> -				 */
>>>> -
>>>> -				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>>>> -				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>>>> +				if (p_mgr->p_subn->force_immediate_heavy_sweep) {
>>>> +					/*
>>>> +					 * Do not read next item from the idle queue.
>>>> +					 * Immediate heavy sweep is requested, so it's
>>>> +					 * more important.
>>>> +					 * Besides, there is a chance that after the
>>>> +					 * heavy sweep completion, idle queue processing
>>>> +					 * that SM would have performed here will be obsolete.
>>>> +					 */
>>>> +					if (osm_log_is_active(p_mgr->p_log, OSM_LOG_DEBUG))
>>>> +						osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>> +						"osm_state_mgr_process: "
>>>> +						"interrupting idle time queue processing - heavy sweep requested\n");
>>>> +					signal = OSM_SIGNAL_NONE;
>>>> +					p_mgr->state = OSM_SM_STATE_IDLE;
>>>> +				}
>>>> +				else {
>>>> +					/*
>>>> +					 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>>>> +					 * so that the next element in the queue gets processed
>>>> +					 */
>>>> +					signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>>>> +					p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>>>> +				}
>>>>  				break;
>>>>
>>>>  			default:
>>>> -- 
>>>> 1.5.1.4
>>>>
> 


From hrosenstock at xsigo.com  Thu Dec 20 08:49:26 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 20 Dec 2007 08:49:26 -0800
Subject: [ofa-general] smpquery regression in 1.3-rc1
In-Reply-To: <476A8DA9.8040408@dev.mellanox.co.il>
References: <20071219195839.GS412@sgi.com>
	<1198095057.6635.41.camel@hrosenstock-ws.xsigo.com>
	<476A5530.3070602@dev.mellanox.co.il>
	<1198159659.6635.164.camel@hrosenstock-ws.xsigo.com>
	<476A8DA9.8040408@dev.mellanox.co.il>
Message-ID: <1198169366.6635.181.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-12-20 at 17:43 +0200, Yevgeny Kliteynik wrote:
> Hal Rosenstock wrote:
> > On Thu, 2007-12-20 at 13:42 +0200, Yevgeny Kliteynik wrote:
> >> Hal Rosenstock wrote:
> >>> On Wed, 2007-12-19 at 11:58 -0800, akepner at sgi.com wrote:
> >>>> We're seeing a regression in smpquery from alpha2 to rc1. 
> >>>>
> >>>> For example, with alpha2 I get:
> >>>> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
> >>>> # Node info: Lid 3
> >>>> BaseVers:........................1
> >>>> ClassVers:.......................1
> >>>> NodeType:........................Channel Adapter
> >>>> NumPorts:........................2
> >>>> SystemGuid:......................0x00066a009800737c
> >>>> Guid:............................0x00066a009800737c
> >>>> PortGuid:........................0x00066a01a000737c
> >>>> PartCap:.........................64
> >>>> DevId:...........................0x6278
> >>>> Revision:........................0x000000a0
> >>>> LocalPort:.......................2
> >>>> VendorId:........................0x00066a
> >>>> grommit:~ # 
> >>>>
> >>>>
> >>>> And with rc1, I get:
> >>>> grommit:~ # smpquery -G nodeinfo 0x66a01a000737c
> >>>> ibwarn: [5650] ib_path_query: sa call path_query failed
> >>>> smpquery: iberror: failed: can't resolve destination port 0x66a01a000737c
> >>>> grommit:~ #  
> >>>>
> >>>> But using a LID works fine:
> >>>> grommit:~ # smpquery nodeinfo 3
> >>>> # Node info: Lid 3
> >>>> BaseVers:........................1
> >>>> ClassVers:.......................1
> >>>> NodeType:........................Channel Adapter
> >>>> NumPorts:........................2
> >>>> SystemGuid:......................0x00066a009800737c
> >>>> Guid:............................0x00066a009800737c
> >>>> PortGuid:........................0x00066a01a000737c
> >>>> PartCap:.........................64
> >>>> DevId:...........................0x6278
> >>>> Revision:........................0x000000a0
> >>>> LocalPort:.......................2
> >>>> VendorId:........................0x00066a
> >>>> grommit:~ # 
> >>>>
> >>>> Strangest of all, running it under strace also works:
> >>>> grommit:~ # strace smpquery -G nodeinfo 0x66a01a000737c > /tmp/smpquery.out 
> >>>> .....
> >>>> grommit:~ # cat /tmp/smpquery.out
> >>>> # Node info: Lid 3
> >>>> BaseVers:........................1
> >>>> ClassVers:.......................1
> >>>> NodeType:........................Channel Adapter
> >>>> NumPorts:........................2
> >>>> SystemGuid:......................0x00066a009800737c
> >>>> Guid:............................0x00066a009800737c
> >>>> PortGuid:........................0x00066a01a000737c
> >>>> PartCap:.........................64
> >>>> DevId:...........................0x6278
> >>>> Revision:........................0x000000a0
> >>>> LocalPort:.......................2
> >>>> VendorId:........................0x00066a
> >>>> grommit:~ #
> >>>>
> >>>> Some weird race condition...
> >>>>
> >>>> Anyone else seeing the same?
> >>> -G requires a SA path record lookup so this could be an issue with that
> >>> timing out in some cases (assuming the port is active and the SM is
> >>> operational).
> >> I'm seeing the same problem.
> >> Sometimes the query works, and sometimes it doesn't.
> >> I also see that when the query fails, OpenSM doesn't get PathRecord query at all.
> >>
> >> Hal, can you elaborate on "that timing out in some cases" issue?
> > 
> > I just meant that the SM not responding (for an unknown reason right
> > now) would yield this effect.
> > 
> >> Adding Jack for the libibmad issue:
> >>
> >> I see that the ib_path_query() in libibmad/sa.c sometimes fails
> >> when calling safe_sa_call().
> > 
> > This could just be more detail on the same thing in terms of the
> > (smpquery) client which is layered on top of libibmad: the SA path query
> > timeout.
> > I would suggest running OpenSM in verbose mode (both instances are with
> > OpenSM) and seeing if it responds to the PathRecord query used by this
> > form of smpquery and continue troubleshooting from there based on the
> > result.
> 
> This is actually what I was saying here.
> I have *debugged* smpquery, and saw that the failing function is
> ib_path_query() in libibmad/sa.c
> As I've mentioned, I did run it with OpenSM in verbose mode, and saw
> that when smpquery fails, OpenSM log does not have any PathRecord request.
> When smpquery passes, I see the PathRecord request and response in the
> OpenSM log.

OK; that wasn't clear before but is now (that the failure appears to be
a client and not SM issue) :-) FWIW, I don't know what has changed that
would affect this so it could be a latent bug as opposed to a
regression.

-- Hal

> -- Yevgeny
> 
> > -- Hal
> > 
> >> -- Yevgeny
> >>
> >>> -- Hal
> > 
> 


From sashak at voltaire.com  Thu Dec 20 09:06:32 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 20 Dec 2007 17:06:32 +0000
Subject: [ofa-general] Re: [PATCH] opensm: osm_state_mgr.c - stop idle queue
	processing if heavy sweep requested
In-Reply-To: <476A9B4F.8060807@dev.mellanox.co.il>
References: <47667AB2.8030500@dev.mellanox.co.il>
	<20071218154033.GA4232@sashak.voltaire.com>
	<4768CAD8.4010407@dev.mellanox.co.il>
	<20071220154301.GC15888@sashak.voltaire.com>
	<476A9B4F.8060807@dev.mellanox.co.il>
Message-ID: <20071220170632.GD15888@sashak.voltaire.com>

On 18:41 Thu 20 Dec     , Yevgeny Kliteynik wrote:
>  Sasha Khapyorsky wrote:
> > On 09:40 Wed 19 Dec     , Yevgeny Kliteynik wrote:
> >>  Sasha Khapyorsky wrote:
> >>> Hi Yevgeny,
> >>> On 15:33 Mon 17 Dec     , Yevgeny Kliteynik wrote:
> >>>> If a heavy sweep is requested during idle queue processing, OSM continues
> >>>> to process it till the end and only then notices the heavy sweep request.
> >>>> In some cases this might leave a topology change unhandled for several
> >>>> minutes.
> >>> Could you provide more details about such cases?
> >>> As far as I know the idle queue is used only for multicast re-routing.
> >>> If so, it is interesting by itself why it takes minutes and where. Is
> >>> there an MCG join/leave storm?
> >>  Exactly. The problem was discovered on a big cluster with hundreds of mcast groups,
> >>  when there is some massive change in the subnet (like rebooting hundreds of nodes).
> > Ok, then proposed patch looks like half solution for me.
> > During mcast join/leave storm idle queue will be filled with requests to
> > rebuild mcast routing. OpenSM will process it one by one (and this will
> > take a lot of time) instead of process all pended mcast groups in one
> > run. I think it is first improvement needed here.
> > Even with such improvement we will not be able to control the order of
> > heavy sweep/mcast join requests, so basically idea of breaking idle
> > queue processing looks fine for me, but it is not all what should be
> > done here. Heavy sweep by itself recalculates mcast routing for all
> > existing groups, it should invalidate all pended mcast rerouting
> > requests instead of continuing idle queue processing after heavy
> > sweep. Make sense?
> 
>  OK, makes sense.
>  So bottom line, when breaking the idle queue processing because of immediate
>  sweep request, state manager should just purge the whole idle queue and then
>  start the new heavy sweep.

Yes, that is one patch. Another expected patch, to improve OpenSM's handling
of mcast join request/node reboot storms, would recalculate mcast routing
for more than one mcast group at a time (actually I think the requested mcast
groups should be queued in a list and the mcast re-routing requests merged,
plus some trivial processor function in osm_mcast_mgr.c). Maybe the whole idle
queue mechanism can be killed as useless, and then this will impact the heavy
sweep related patch.

Sasha


From sashak at voltaire.com  Thu Dec 20 09:13:18 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 20 Dec 2007 17:13:18 +0000
Subject: [ofa-general] smpquery regression in 1.3-rc1
In-Reply-To: <1198169366.6635.181.camel@hrosenstock-ws.xsigo.com>
References: <20071219195839.GS412@sgi.com>
	<1198095057.6635.41.camel@hrosenstock-ws.xsigo.com>
	<476A5530.3070602@dev.mellanox.co.il>
	<1198159659.6635.164.camel@hrosenstock-ws.xsigo.com>
	<476A8DA9.8040408@dev.mellanox.co.il>
	<1198169366.6635.181.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071220171318.GE15888@sashak.voltaire.com>

On 08:49 Thu 20 Dec     , Hal Rosenstock wrote:
> > >>>>
> > >>>> Anyone else seeing the same?
> > >>> -G requires a SA path record lookup so this could be an issue with that
> > >>> timing out in some cases (assuming the port is active and the SM is
> > >>> operational).
> > >> I'm seeing the same problem.
> > >> Sometimes the query works, and sometimes it doesn't.
> > >> I also see that when the query fails, OpenSM doesn't get PathRecord query at all.
> > >>
> > >> Hal, can you elaborate on "that timing out in some cases" issue?
> > > 
> > > I just meant that the SM not responding (for an unknown reason right
> > > now) would yield this effect.
> > > 
> > >> Adding Jack for the libibmad issue:
> > >>
> > >> I see that the ib_path_query() in libibmad/sa.c sometimes fails
> > >> when calling safe_sa_call().
> > > 
> > > This could just be more detail on the same thing in terms of the
> > > (smpquery) client which is layered on top of libibmad: the SA path query
> > > timeout.
> > > I would suggest running OpenSM in verbose mode (both instances are with
> > > OpenSM) and seeing if it responds to the PathRecord query used by this
> > > form of smpquery and continue troubleshooting from there based on the
> > > result.
> > 
> > This is actually what I was saying here.
> > I have *debugged* smpquery, and saw that the failing function is
> > ib_path_query() in libibmad/sa.c
> > As I've mentioned, I did run it with OpenSM in verbose mode, and saw
> > that when smpquery fails, OpenSM log does not have any PathRecord request.
> > When smpquery passes, I see the PathRecord request and response in the
> > OpenSM log.
> 
> OK; that wasn't clear before but is now (that the failure appears to be
> a client and not SM issue) :-) FWIW, I don't know what has changed that
> would affect this so it could be a latent bug as opposed to a
> regression.

Right, there were no changes in this area in this period, so likely the issue
was just triggered now. I'm not sure, but I probably saw something like this in
the past, and thought then that it was a cabling issue.

Yevgeny, Arthur, could you rerun smpquery with -dddd (for a lot of debug
stuff)?

Sasha


From akepner at sgi.com  Thu Dec 20 09:08:22 2007
From: akepner at sgi.com (akepner at sgi.com)
Date: Thu, 20 Dec 2007 09:08:22 -0800
Subject: [ofa-general] smpquery regression in 1.3-rc1
In-Reply-To: <20071220171318.GE15888@sashak.voltaire.com>
References: <20071219195839.GS412@sgi.com>
	<1198095057.6635.41.camel@hrosenstock-ws.xsigo.com>
	<476A5530.3070602@dev.mellanox.co.il>
	<1198159659.6635.164.camel@hrosenstock-ws.xsigo.com>
	<476A8DA9.8040408@dev.mellanox.co.il>
	<1198169366.6635.181.camel@hrosenstock-ws.xsigo.com>
	<20071220171318.GE15888@sashak.voltaire.com>
Message-ID: <20071220170822.GI412@sgi.com>

On Thu, Dec 20, 2007 at 05:13:18PM +0000, Sasha Khapyorsky wrote:
> ...
> Yevgeny, Arthur, could you rerun smpquery with -dddd (for lot of debug
> stuff)?
> 

Well, just about any perturbation changes the behavior - run 
it under strace, or gdb, link the IB libraries statically, or 
look at the machine funny and it works fine. 

But using the debug flags reveals an apparent problem with the 
debug code itself:

# ./smpquery_1.3_rc1 -d -G nodeinfo 0x00066a01a000737c
ibwarn: [19328] smp_query: attr 0x15 mod 0x0 route DR path 0
ibwarn: [19328] mad_rpc: data offs 64 sz 64
mad data
0000 0000 0000 0000 fe80 0000 0000 0000
0002 0002 0251 0a6a 0000 0000 0103 0302
3452 0023 4040 0008 0804 ff40 0000 005e
0000 2012 1088 0000 0000 0000 0000 0000
Segmentation fault

and gdb shows:

(gdb) bt
#0  0x00002b0b9222ed0f in _IO_default_xsputn_internal () from /lib64/libc.so.6
#1  0x00002b0b92207177 in vfprintf () from /lib64/libc.so.6
#2  0x00002b0b9229577d in __vsprintf_chk () from /lib64/libc.so.6
#3  0x00002b0b922956c0 in __sprintf_chk () from /lib64/libc.so.6
#4  0x00002b0b91c71166 in portid2str (portid=0x7fff1905bc00) at src/portid.c:91
#5  0x00002b0b91c72529 in sa_rpc_call (ibmad_port=0x7fff1905b680,
    rcvbuf=0x7fff1905bb30, portid=0x7fff1905bc00, sa=0x7fff1905bac0, timeout=0)
    at src/sa.c:58
#6  0x00002b0b91c71791 in sa_call (rcvbuf=0x7fff1905bb30,
    portid=0x7fff1905bc00, sa=0x7fff1905bac0, timeout=0) at src/rpc.c:395
#7  0x00002b0b91c723bf in ib_path_query (srcgid=0x7fff1905be30 "\200",
    destgid=0x7fff1905be30 "\200", sm_id=0x7fff1905bc00, buf=0x7fff1905bb30)
    at ./include/infiniband/mad.h:790
#8  0x00002b0b91c7144f in ib_resolve_guid (portid=0x7fff1905bde0,
    guid=0x7fff1905bd20, sm_id=0x7fff1905bc00, timeout=<value optimized out>)
    at src/resolve.c:83
#9  0x00002b0b91c71610 in ib_resolve_portid_str (portid=0x7fff1905bde0,
    addr_str=0x7fff1905d341 "0x00066a01a000737c", dest_type=2, sm_id=0x0)
    at src/resolve.c:115
#10 0x0000000000401cd1 in main (argc=2, argv=0x7fff1905bfd0)
    at smpquery_1.3_rc1.c:522

-- 
Arthur


From swise at opengridcomputing.com  Thu Dec 20 09:14:10 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 20 Dec 2007 11:14:10 -0600
Subject: [ofa-general] iommu dma mapping alignment requirements
Message-ID: <476AA2E2.5010007@opengridcomputing.com>

Hey Roland (and any iommu/ppc/dma experts out there):

I'm debugging a data corruption issue that happens on PPC64 systems 
running rdma on kernels where the iommu page size is 4KB yet the host 
page size is 64KB.  This "feature" was added to the PPC64 code recently, 
and is in kernel.org from 2.6.23.  So if the kernel is built with a 4KB 
page size, no problems.  If the kernel is prior to 2.6.23 then 64KB page 
configs work too. It's just a problem when the iommu page size != host 
page size.

It appears that my problem boils down to a single host page of memory 
that is mapped for dma, and the dma address returned by dma_map_sg() is 
_not_ 64KB aligned.  Here is an example:

app registers va 0x000000002d9a3000 len 12288
ib_umem_get() creates and maps a umem and chunk that looks like (dumping 
state from a registered user memory region):

>     umem len 12288 off 12288 pgsz 65536 shift 16
>     chunk 0: nmap 1 nents 1
>         sglist[0] page 0xc000000000930b08 off 0 len 65536 dma_addr 000000005bff4000 dma_len 65536
> 

So the kernel maps 1 full page for this MR.  But note that the dma 
address is 000000005bff4000 which is 4KB aligned, not 64KB aligned.  I 
think this is causing grief to the RDMA HW.
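
The alignment claim can be checked with a couple of lines of arithmetic.
This is only a sketch; the helper names (is_aligned, page_offset) are made
up for illustration, and the values are the ones from the dump above.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of size; size must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t size)
{
	return (addr & (size - 1)) == 0;
}

/* Offset of addr within its size-aligned page (size a power of two). */
static uint64_t page_offset(uint64_t addr, uint64_t size)
{
	return addr & (size - 1);
}
```

For dma_addr 0x5bff4000, is_aligned(addr, 4096) holds but
is_aligned(addr, 65536) does not: the address sits 0x4000 bytes into a
64KB-aligned window.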

My first question is: Is there an assumption or requirement in linux 
that dma addresses should have the same alignment as the host address 
they are mapped to?  I.e., the rdma core is mapping the entire 64KB page, 
but the mapping doesn't begin on a 64KB page boundary.

If this mapping is considered valid, then perhaps the rdma hw is at 
fault here.  But I'm wondering if this is a PPC/iommu bug.

BTW:  Here is what the Memory Region looks like to the HW:

> TPT entry:  stag idx 0x2e800 key 0xff state VAL type NSMR pdid 0x2
>             perms RW rem_inv_dis 0 addr_type VATO
>             bind_enable 1 pg_size 65536 qpid 0x0 pbl_addr 0x003c67c0
>             len 12288 va 000000002d9a3000 bind_cnt 0
> PBL: 000000005bff4000


Any thoughts?

Steve.


From tom at opengridcomputing.com  Thu Dec 20 09:29:36 2007
From: tom at opengridcomputing.com (Tom Tucker)
Date: Thu, 20 Dec 2007 11:29:36 -0600
Subject: [ofa-general] iommu dma mapping alignment requirements
In-Reply-To: <476AA2E2.5010007@opengridcomputing.com>
References: <476AA2E2.5010007@opengridcomputing.com>
Message-ID: <1198171776.23924.22.camel@trinity.ogc.int>


On Thu, 2007-12-20 at 11:14 -0600, Steve Wise wrote:
> Hey Roland (and any iommu/ppc/dma experts out there):
> 
> I'm debugging a data corruption issue that happens on PPC64 systems 
> running rdma on kernels where the iommu page size is 4KB yet the host 
> page size is 64KB.  This "feature" was added to the PPC64 code recently, 
> and is in kernel.org from 2.6.23.  So if the kernel is built with a 4KB 
> page size, no problems.  If the kernel is prior to 2.6.23 then 64KB page 
> configs work too. It's just a problem when the iommu page size != host 
> page size.
> 
> It appears that my problem boils down to a single host page of memory 
> that is mapped for dma, and the dma address returned by dma_map_sg() is 
> _not_ 64KB aligned.  Here is an example:
> 
> app registers va 0x000000002d9a3000 len 12288
> ib_umem_get() creates and maps a umem and chunk that looks like (dumping 
> state from a registered user memory region):
> 
> >     umem len 12288 off 12288 pgsz 65536 shift 16
> >     chunk 0: nmap 1 nents 1
> >         sglist[0] page 0xc000000000930b08 off 0 len 65536 dma_addr 000000005bff4000 dma_len 65536
> > 
> 
> So the kernel maps 1 full page for this MR.  But note that the dma 
> address is 000000005bff4000 which is 4KB aligned, not 64KB aligned.  I 
> think this is causing grief to the RDMA HW.
> 
> My first question is: Is there an assumption or requirement in linux 
> that dma addresses should have the same alignment as the host address 
> they are mapped to?  I.e., the rdma core is mapping the entire 64KB page, 
> but the mapping doesn't begin on a 64KB page boundary.
> 
> If this mapping is considered valid, then perhaps the rdma hw is at 
> fault here.  But I'm wondering if this is a PPC/iommu bug.
> 
> BTW:  Here is what the Memory Region looks like to the HW:
> 
> > TPT entry:  stag idx 0x2e800 key 0xff state VAL type NSMR pdid 0x2
> >             perms RW rem_inv_dis 0 addr_type VATO
> >             bind_enable 1 pg_size 65536 qpid 0x0 pbl_addr 0x003c67c0
> >             len 12288 va 000000002d9a3000 bind_cnt 0
> > PBL: 000000005bff4000
> 
> 
> 
> Any thoughts?

The Ammasso certainly works this way. If you tell it the page size is
64KB, it will ignore bits in the page address that encode 0-65535.
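
To make the consequence concrete, here is a sketch of the address
arithmetic under one loudly stated assumption: that the adapter forms the
bus address by masking the PBL entry down to the page size and adding the
offset of the VA within that page. That model, and the helper names
(hw_dma_addr, intended_dma_addr), are illustrative guesses and not taken
from any adapter documentation.

```c
#include <stdint.h>

/* Assumed HW behaviour: the low bits of the PBL entry are ignored and
 * replaced by the VA's offset within the page (pg_size a power of two). */
static uint64_t hw_dma_addr(uint64_t pbl_entry, uint64_t va, uint64_t pg_size)
{
	return (pbl_entry & ~(pg_size - 1)) | (va & (pg_size - 1));
}

/* What the mapping actually means: start of the mapped host page plus
 * the VA's offset within that page. */
static uint64_t intended_dma_addr(uint64_t dma_base, uint64_t va,
				  uint64_t pg_size)
{
	return dma_base + (va & (pg_size - 1));
}
```

With the values from Steve's dump (PBL 0x5bff4000, va 0x2d9a3000, page
size 65536), the first function gives 0x5bff3000 while the second gives
0x5bff7000, so under this model the device would be off by 0x4000 bytes.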

> 
> Steve.
> 
> 


From rdreier at cisco.com  Thu Dec 20 10:07:39 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 20 Dec 2007 10:07:39 -0800
Subject: [ofa-general] iommu dma mapping alignment requirements
In-Reply-To: <476AA2E2.5010007@opengridcomputing.com> (Steve Wise's message of
	"Thu, 20 Dec 2007 11:14:10 -0600")
References: <476AA2E2.5010007@opengridcomputing.com>
Message-ID: <adalk7pz1ck.fsf@cisco.com>

 > It appears that my problem boils down to a single host page of memory
 > that is mapped for dma, and the dma address returned by dma_map_sg()
 > is _not_ 64KB aligned.  Here is an example:

 > My first question is: Is there an assumption or requirement in linux
 > that dma addresses should have the same alignment as the host address
 > they are mapped to?  I.e., the rdma core is mapping the entire 64KB page,
 > but the mapping doesn't begin on a 64KB page boundary.

I don't think this is explicitly documented anywhere, but it certainly
seems that we want the bus address to be page-aligned in this case.
For mthca/mlx4 at least, we tell the adapter what the host page size
is (so that it knows how to align doorbell pages etc) and I think this
sort of thing would confuse the HW.

 - R.


From mshefty at ichips.intel.com  Thu Dec 20 10:04:25 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Thu, 20 Dec 2007 10:04:25 -0800
Subject: [ofa-general] Re: [PATCH 3/3] ib/cm: add basic
	performance	counters
In-Reply-To: <adasl1yyomf.fsf@cisco.com>
References: <000001c83091$2859bce0$9c98070a@amr.corp.intel.com>	<000301c83092$fe2f99b0$9c98070a@amr.corp.intel.com>	<adawsrayp2l.fsf@cisco.com>
	<adasl1yyomf.fsf@cisco.com>
Message-ID: <476AAEA9.3010104@ichips.intel.com>

Roland Dreier wrote:
> by the way, I had to make cm_class not static, or else a build with
> ib_cm and ib_ucm built into the kernel fails... I think that exported
> symbols can't be static.

thanks for fixing this


From mshefty at ichips.intel.com  Thu Dec 20 10:32:52 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Thu, 20 Dec 2007 10:32:52 -0800
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <476A857D.3090608@voltaire.com>
References: <4767A2CD.8030209@voltaire.com> <4768289F.6040907@ichips.intel.com>
	<4769019C.10602@voltaire.com> <47696002.4030903@ichips.intel.com>
	<476A857D.3090608@voltaire.com>
Message-ID: <476AB554.9080200@ichips.intel.com>

> Sorry if I wasn't clear, let me see if I understand you: with this
> different domain implementation, under both client/server the passive
> calls cm listen and the active call cm connect, where under peer/to/peer
> both sides call cm listen and later both sides may call cm connect or
> only one side, correct?

My thinking was that the peer to peer model would have both sides call 
connect only.  The peer to peer connection model only kicks in when both 
sides are in the REQ sent state.

> But it could set it to one as well... assuming my understanding above of
> the suggested implementation is correct, we can change the RDMA-CM API
> to let users specify on rdma_connect that they want peer to peer
> support, so such apps can issue rdma_listen call and later call
> rdma_connect with this bit set and they are done (or almost done... I
> guess there is some more devil in the details here, isn't there?)

This was why I said that the IB CM API was fine, but the RDMA CM API 
would require changes.

> This makes the whole peer to peer model useless, since an app cannot make
> sure that connections occur at exactly the same time!

yep - (anyone can feel free to step in and set me straight on this...)

> the spec is that the peer to peer model can also handle connections that
> occur at exactly the same time, but not only those.

Peer to peer seems inherently racy to me.

> Under MPI each rank uses a different SID, so I think we are safe from 
> this problem.

Any peer to peer implementation should handle this case however.

- Sean
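
For illustration only, the rule described above (peer to peer resolution applies only when a REQ arrives while the local side is itself in the REQ sent state) might be sketched as a small decision function. This is a userspace sketch, not the ib_cm code: every name below is hypothetical, and the tie-break by comparing two numeric ids is a placeholder assumption rather than the rule the IB spec defines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified connection states for this sketch. */
enum ex_cm_state { EX_IDLE, EX_LISTEN, EX_REQ_SENT };

/* What to do with an incoming REQ. */
enum ex_req_action {
	EX_REJECT,              /* no listener and no outstanding REQ */
	EX_ACCEPT_PASSIVE,      /* ordinary client/server: we are the server */
	EX_PEER_STAY_ACTIVE,    /* both sides sent a REQ: we remain active */
	EX_PEER_BECOME_PASSIVE  /* both sides sent a REQ: we turn passive */
};

/*
 * Decide how to handle an incoming REQ.  The id comparison is an
 * assumed placeholder tie-break, NOT the rule from the IB spec.
 */
static enum ex_req_action ex_on_req(enum ex_cm_state state,
				    uint64_t local_id, uint64_t remote_id)
{
	switch (state) {
	case EX_LISTEN:
		return EX_ACCEPT_PASSIVE;
	case EX_REQ_SENT:
		/* Peer to peer only kicks in here: both sides sent a REQ. */
		assert(local_id != remote_id);
		return local_id > remote_id ? EX_PEER_STAY_ACTIVE
					    : EX_PEER_BECOME_PASSIVE;
	default:
		return EX_REJECT;
	}
}
```

The point of the sketch is that exactly one side must end up passive, so whatever tie-break is used has to give opposite answers on the two ends.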


From rdreier at cisco.com  Thu Dec 20 11:01:44 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 20 Dec 2007 11:01:44 -0800
Subject: [ofa-general] Re: [RFC] XRC -- make receiving XRC QP independent of
	any one user process
In-Reply-To: <200712201535.37527.jackm@dev.mellanox.co.il> (Jack Morgenstein's
	message of "Thu, 20 Dec 2007 15:35:37 +0200")
References: <200712201535.37527.jackm@dev.mellanox.co.il>
Message-ID: <adad4t1yyuf.fsf@cisco.com>

 > This API will be implemented in the upcoming OFED 1.3 release, so we need feedback ASAP.

I hope we can learn some lessons about development process... clearly
changing APIs after -rc1 is not something that leads to good quality
in general.

 > int ibv_alloc_xrc_rcv_qp(struct ibv_pd *pd,
 > 			 struct ibv_xrc_domain *xrc_domain,
 > 			 struct ibv_qp_attr *attr,
 > 			 enum ibv_qp_attr_mask attr_mask,
 > 			 uint32_t *xrc_rcv_qpn);

I can't say this interface is very appealing.

Another option would be to create an XRC verb that "detaches" a
userspace QP and gives it the same lifetime as an XRC domain.  But
that doesn't seem any nicer.

And I guess we can't combine creating the QP with allocating the XRC
domain, because the consumer might want to open the XRC domain before
it has connected with the remote side.

Oh well, I guess this XRC stuff just ends up being ugly.

 - R.


From swise at opengridcomputing.com  Thu Dec 20 11:11:28 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 20 Dec 2007 13:11:28 -0600
Subject: [ofa-general] iommu dma mapping alignment requirements
In-Reply-To: <adalk7pz1ck.fsf@cisco.com>
References: <476AA2E2.5010007@opengridcomputing.com>
	<adalk7pz1ck.fsf@cisco.com>
Message-ID: <476ABE60.9030805@opengridcomputing.com>

Roland Dreier wrote:
>  > It appears that my problem boils down to a single host page of memory
>  > that is mapped for dma, and the dma address returned by dma_map_sg()
>  > is _not_ 64KB aligned.  Here is an example:
> 
>  > My first question is: Is there an assumption or requirement in linux
>  > that dma_addresses should have the same alignment as the host address
>  > they are mapped to?  IE the rdma core is mapping the entire 64KB page,
>  > but the mapping doesn't begin on a 64KB page boundary.
> 
> I don't think this is explicitly documented anywhere, but it certainly
> seems that we want the bus address to be page-aligned in this case.
> For mthca/mlx4 at least, we tell the adapter what the host page size
> is (so that it knows how to align doorbell pages etc) and I think this
> sort of thing would confuse the HW.
> 
>  - R.


In arch/powerpc/kernel/iommu.c:iommu_map_sg() I see that it calls 
iommu_range_alloc() with an alignment_order of 0:

>                 vaddr = (unsigned long)page_address(s->page) + s->offset;
>                 npages = iommu_num_pages(vaddr, slen);
>                 entry = iommu_range_alloc(tbl, npages, &handle, mask >> IOMMU_PAGE_SHIFT, 0);

But perhaps the alignment order needs to be based on the host page size?


Steve.


From swise at opengridcomputing.com  Thu Dec 20 11:29:41 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 20 Dec 2007 13:29:41 -0600
Subject: [ofa-general] iommu dma mapping alignment requirements
In-Reply-To: <476ABE60.9030805@opengridcomputing.com>
References: <476AA2E2.5010007@opengridcomputing.com>
	<adalk7pz1ck.fsf@cisco.com>
	<476ABE60.9030805@opengridcomputing.com>
Message-ID: <476AC2A5.8060200@opengridcomputing.com>

Steve Wise wrote:
> Roland Dreier wrote:
>>  > It appears that my problem boils down to a single host page of memory
>>  > that is mapped for dma, and the dma address returned by dma_map_sg()
>>  > is _not_ 64KB aligned.  Here is an example:
>>
>>  > My first question is: Is there an assumption or requirement in linux
>>  > that dma_addresses should have the same alignment as the host address
>>  > they are mapped to?  IE the rdma core is mapping the entire 64KB page,
>>  > but the mapping doesn't begin on a 64KB page boundary.
>>
>> I don't think this is explicitly documented anywhere, but it certainly
>> seems that we want the bus address to be page-aligned in this case.
>> For mthca/mlx4 at least, we tell the adapter what the host page size
>> is (so that it knows how to align doorbell pages etc) and I think this
>> sort of thing would confuse the HW.
>>
>>  - R.
> 
> 
> In arch/powerpc/kernel/iommu.c:iommu_map_sg() I see that it calls 
> iommu_range_alloc() with an alignment_order of 0:
> 
>>                 vaddr = (unsigned long)page_address(s->page) + s->offset;
>>                 npages = iommu_num_pages(vaddr, slen);
>>                 entry = iommu_range_alloc(tbl, npages, &handle, mask >> IOMMU_PAGE_SHIFT, 0);
> 
> But perhaps the alignment order needs to be based on the host page size?
> 

Or based on the alignment of vaddr actually...
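
As a rough illustration of the mismatch, the alignment of the dma address reported in this thread can be checked with plain mask arithmetic. This is a userspace sketch, not kernel code: the helper names are hypothetical, the shift values (4KB IOMMU pages, 64KB host pages) come from the report, and it assumes the alignment order is expressed in IOMMU pages.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values from this thread: 4KB IOMMU pages, 64KB host pages. */
#define EX_IOMMU_PAGE_SHIFT 12
#define EX_HOST_PAGE_SHIFT  16

/* Returns 1 if addr is aligned to (1 << shift) bytes, 0 otherwise. */
static int ex_is_aligned(uint64_t addr, unsigned int shift)
{
	assert(shift < 64);
	return (addr & ((UINT64_C(1) << shift) - 1)) == 0;
}

/*
 * Alignment order (log2, counted in IOMMU pages) that would give
 * host-page alignment: 16 - 12 = 4, i.e. 16 IOMMU pages.
 */
static unsigned int ex_host_align_order(void)
{
	return EX_HOST_PAGE_SHIFT - EX_IOMMU_PAGE_SHIFT;
}
```

With these helpers, ex_is_aligned(0x5bff4000, EX_IOMMU_PAGE_SHIFT) holds while ex_is_aligned(0x5bff4000, EX_HOST_PAGE_SHIFT) does not, which matches the 4KB-but-not-64KB alignment seen in the dump.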


From benh at au1.ibm.com  Thu Dec 20 12:17:42 2007
From: benh at au1.ibm.com (Benjamin Herrenschmidt)
Date: Fri, 21 Dec 2007 07:17:42 +1100
Subject: [ofa-general] Re: iommu dma mapping alignment requirements
In-Reply-To: <476AA2E2.5010007@opengridcomputing.com>
References: <476AA2E2.5010007@opengridcomputing.com>
Message-ID: <1198181862.6779.3.camel@pasglop>

Adding a few more people to the discussion. You may well be right and we
would have to provide the same alignment, though that sucks a bit as one
of the reasons we switched to 4K for the IOMMU is that the iommu space
available on pSeries is very small and we were running out of it with
64K pages and lots of networking activity.

On Thu, 2007-12-20 at 11:14 -0600, Steve Wise wrote:
> Hey Roland (and any iommu/ppc/dma experts out there):
> 
> I'm debugging a data corruption issue that happens on PPC64 systems 
> running rdma on kernels where the iommu page size is 4KB yet the host 
> page size is 64KB.  This "feature" was added to the PPC64 code recently, 
> and is in kernel.org from 2.6.23.  So if the kernel is built with a 4KB 
> page size, no problems.  If the kernel is prior to 2.6.23 then 64KB page 
> configs work too. It's just a problem when the iommu page size != host
> page size.
> 
> It appears that my problem boils down to a single host page of memory 
> that is mapped for dma, and the dma address returned by dma_map_sg() is 
> _not_ 64KB aligned.  Here is an example:
> 
> app registers va 0x000000002d9a3000 len 12288
> ib_umem_get() creates and maps a umem and chunk that looks like (dumping 
> state from a registered user memory region):
> 
> >     umem len 12288 off 12288 pgsz 65536 shift 16
> >     chunk 0: nmap 1 nents 1
> >         sglist[0] page 0xc000000000930b08 off 0 len 65536 dma_addr 000000005bff4000 dma_len 65536
> > 
> 
> So the kernel maps 1 full page for this MR.  But note that the dma 
> address is 000000005bff4000 which is 4KB aligned, not 64KB aligned.  I 
> think this is causing grief to the RDMA HW.
> 
> My first question is: Is there an assumption or requirement in linux 
> that dma_addresses should have the same alignment as the host address
> they are mapped to?  IE the rdma core is mapping the entire 64KB page, 
> but the mapping doesn't begin on a 64KB page boundary.
> 
> If this mapping is considered valid, then perhaps the rdma hw is at 
> fault here.  But I'm wondering if this is a PPC/iommu bug.
> 
> BTW:  Here is what the Memory Region looks like to the HW:
> 
> > TPT entry:  stag idx 0x2e800 key 0xff state VAL type NSMR pdid 0x2
> >             perms RW rem_inv_dis 0 addr_type VATO
> >             bind_enable 1 pg_size 65536 qpid 0x0 pbl_addr 0x003c67c0
> >             len 12288 va 000000002d9a3000 bind_cnt 0
> > PBL: 000000005bff4000
> 
> 
> 
> Any thoughts?
> 
> Steve.
> 


From benh at au1.ibm.com  Thu Dec 20 12:21:01 2007
From: benh at au1.ibm.com (Benjamin Herrenschmidt)
Date: Fri, 21 Dec 2007 07:21:01 +1100
Subject: [ofa-general] iommu dma mapping alignment requirements
In-Reply-To: <476AC2A5.8060200@opengridcomputing.com>
References: <476AA2E2.5010007@opengridcomputing.com>
	<adalk7pz1ck.fsf@cisco.com> <476ABE60.9030805@opengridcomputing.com>
	<476AC2A5.8060200@opengridcomputing.com>
Message-ID: <1198182061.6779.7.camel@pasglop>


On Thu, 2007-12-20 at 13:29 -0600, Steve Wise wrote:

> Or based on the alignment of vaddr actually...

The latter wouldn't be realistic. What I think might be necessary, though
it would definitely cause us problems with running out of iommu space
(which is the reason we did the switch down to 4K), is to provide
alignment to the real page size, and alignment to the allocation order
for dma_map_consistent.

It might be possible to -tweak- and only provide alignment to the page
size for allocations that are larger than IOMMU_PAGE_SIZE. That would
solve the problem with small network packets eating up too much iommu
space though.

What do you think ?

Ben.
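
A minimal sketch of the tweak described above, assuming 4KB IOMMU pages and 64KB host pages: only allocations larger than one IOMMU page get host-page alignment, so small network packets keep using a single unaligned IOMMU page. This is userspace C for illustration; the function and macro names are hypothetical and this is not the code in arch/powerpc/kernel/iommu.c.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page geometry from this thread. */
#define EX_IOMMU_PAGE_SHIFT 12
#define EX_HOST_PAGE_SHIFT  16
#define EX_IOMMU_PAGE_SIZE  ((size_t)1 << EX_IOMMU_PAGE_SHIFT)

/*
 * Pick the alignment order (log2, in IOMMU pages) for a mapping of
 * len bytes.  Mappings that fit in one IOMMU page stay unaligned.
 */
static unsigned int ex_pick_align_order(size_t len)
{
	assert(len > 0);
	if (len > EX_IOMMU_PAGE_SIZE)
		return EX_HOST_PAGE_SHIFT - EX_IOMMU_PAGE_SHIFT;
	return 0;
}
```

Under these assumptions a 1500-byte packet gets order 0, while a full 64KB host page gets order 4 and therefore a 64KB-aligned bus address.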


From glenn at lists.openfabrics.org  Thu Dec 20 12:22:41 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:22:41 -0800 (PST)
Subject: [ofa-general] [PATCH 1/10] nes: accelerated loopback fix
Message-ID: <20071220202241.80013E60399@openfabrics.org>


Accelerated loopback code did not properly handle private
data.  Add loopback connection counter to ethtool stats.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index 638bc51..79889a4 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -67,6 +67,7 @@ u32 cm_packets_received;
 u32 cm_listens_created;
 u32 cm_listens_destroyed;
 u32 cm_backlog_drops;
+atomic_t cm_loopbacks;
 atomic_t cm_nodes_created;
 atomic_t cm_nodes_destroyed;
 atomic_t cm_accel_dropped_pkts;
@@ -1638,6 +1639,7 @@ struct nes_cm_node * mini_cm_connect(struct nes_cm_core *cm_core,
 		if (loopbackremotelistener == NULL) {
 			create_event(cm_node, NES_CM_EVENT_ABORTED);
 		} else {
+			atomic_inc(&cm_loopbacks);
 			loopback_cm_info = *cm_info;
 			loopback_cm_info.loc_port = cm_info->rem_port;
 			loopback_cm_info.rem_port = cm_info->loc_port;
@@ -2445,7 +2447,13 @@ int nes_accept(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
 	cm_event.private_data = NULL;
 	cm_event.private_data_len = 0;
 	ret = cm_id->event_handler(cm_id, &cm_event);
-	nes_debug(NES_DBG_CM, "OFA CM event_handler returned, ret=%d\n", ret);
+	if (cm_node->loopbackpartner) {
+		cm_node->loopbackpartner->mpa_frame_size = nesqp->private_data_len;
+		/* copy entire MPA frame to our cm_node's frame */
+		memcpy(cm_node->loopbackpartner->mpa_frame_buf, nesqp->ietf_frame->priv_data,
+			   nesqp->private_data_len);
+		create_event(cm_node->loopbackpartner, NES_CM_EVENT_CONNECTED);
+	}
 	if (ret)
 		printk("%s[%u] OFA CM event_handler returned, ret=%d\n",
 				__FUNCTION__, __LINE__, ret);
diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index e01aab4..810a9ae 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -114,6 +114,7 @@ extern u32 cm_packets_retrans;
 extern u32 cm_listens_created;
 extern u32 cm_listens_destroyed;
 extern u32 cm_backlog_drops;
+extern atomic_t cm_loopbacks;
 extern atomic_t cm_nodes_created;
 extern atomic_t cm_nodes_destroyed;
 extern atomic_t cm_accel_dropped_pkts;
@@ -967,7 +968,7 @@ void nes_netdev_exit(struct nes_vnic *nesvnic)
 }
 
 
-#define NES_ETHTOOL_STAT_COUNT 54
+#define NES_ETHTOOL_STAT_COUNT 55
 static const char nes_ethtool_stringset[NES_ETHTOOL_STAT_COUNT][ETH_GSTRING_LEN] = {
 	"Link Change Interrupts",
 	"Linearized SKBs",
@@ -1011,6 +1012,7 @@ static const char nes_ethtool_stringset[NES_ETHTOOL_STAT_COUNT][ETH_GSTRING_LEN]
 	"CM Listens Created",
 	"CM Listens Destroyed",
 	"CM Backlog Drops",
+	"CM Loopbacks",
 	"CM Nodes Created",
 	"CM Nodes Destroyed",
 	"CM Accel Drops",
@@ -1206,11 +1208,11 @@ static void nes_netdev_get_ethtool_stats(struct net_device *netdev,
 	target_stat_values[39] = cm_listens_created;
 	target_stat_values[40] = cm_listens_destroyed;
 	target_stat_values[41] = cm_backlog_drops;
-	target_stat_values[42] = atomic_read(&cm_nodes_created);
-	target_stat_values[43] = atomic_read(&cm_nodes_destroyed);
-	target_stat_values[44] = atomic_read(&cm_accel_dropped_pkts);
-	target_stat_values[45] = atomic_read(&cm_resets_recvd);
-	target_stat_values[46] = int_mod_timer_init;
+	target_stat_values[42] = atomic_read(&cm_loopbacks);
+	target_stat_values[43] = atomic_read(&cm_nodes_created);
+	target_stat_values[44] = atomic_read(&cm_nodes_destroyed);
+	target_stat_values[45] = atomic_read(&cm_accel_dropped_pkts);
+	target_stat_values[46] = atomic_read(&cm_resets_recvd);
 	target_stat_values[47] = int_mod_cq_depth_1;
 	target_stat_values[48] = int_mod_cq_depth_4;
 	target_stat_values[49] = int_mod_cq_depth_16;
diff --git a/drivers/infiniband/hw/nes/nes_utils.c b/drivers/infiniband/hw/nes/nes_utils.c
index b6aa6d3..8d2c1ee 100644
--- a/drivers/infiniband/hw/nes/nes_utils.c
+++ b/drivers/infiniband/hw/nes/nes_utils.c
@@ -620,8 +620,6 @@ void nes_post_cqp_request(struct nes_device *nesdev,
 }
 
 
-
-
 /**
  * nes_arp_table
  */


From glenn at lists.openfabrics.org  Thu Dec 20 12:24:20 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:24:20 -0800 (PST)
Subject: [ofa-general] [PATCH 2/10] nes: add support for external flash
	update utility
Message-ID: <20071220202420.A0A90E60399@openfabrics.org>


Allows an external utility to read/write flash for
firmware upgrades.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes.c b/drivers/infiniband/hw/nes/nes.c
index a5e0bb5..1088330 100644
--- a/drivers/infiniband/hw/nes/nes.c
+++ b/drivers/infiniband/hw/nes/nes.c
@@ -780,6 +780,136 @@ static struct pci_driver nes_pci_driver = {
 	.remove = __devexit_p(nes_remove),
 };
 
+static ssize_t nes_show_ee_cmd(struct device_driver *ddp, char *buf)
+{
+	u32 eeprom_cmd;
+	struct nes_device *nesdev;
+
+	nesdev = list_entry(nes_dev_list.next, typeof(*nesdev), list);
+	eeprom_cmd = nes_read32(nesdev->regs + NES_EEPROM_COMMAND);
+
+	return snprintf(buf, PAGE_SIZE, "0x%x\n", eeprom_cmd);
+}
+
+static ssize_t nes_store_ee_cmd(struct device_driver *ddp,
+	const char *buf, size_t count)
+{
+	char *p = (char *)buf;
+	u32 val;
+	struct nes_device *nesdev;
+
+	if (p[1] == 'x' || p[1] == 'X' || p[0] == 'x' || p[0] == 'X') {
+		val = simple_strtoul(p, &p, 16);
+		nesdev = list_entry(nes_dev_list.next, typeof(*nesdev), list);
+		nes_write32(nesdev->regs + NES_EEPROM_COMMAND, val);
+	}
+	return strnlen(buf, count);
+}
+
+static ssize_t nes_show_ee_data(struct device_driver *ddp, char *buf)
+{
+	u32 eeprom_data;
+	struct nes_device *nesdev;
+
+	nesdev = list_entry(nes_dev_list.next, typeof(*nesdev), list);
+	eeprom_data = nes_read32(nesdev->regs + NES_EEPROM_DATA);
+
+	return  snprintf(buf, PAGE_SIZE, "0x%x\n", eeprom_data);
+}
+
+static ssize_t nes_store_ee_data(struct device_driver *ddp,
+	const char *buf, size_t count)
+{
+	char *p = (char *)buf;
+	u32 val;
+	struct nes_device *nesdev;
+
+	if (p[1] == 'x' || p[1] == 'X' || p[0] == 'x' || p[0] == 'X') {
+		val = simple_strtoul(p, &p, 16);
+		nesdev = list_entry(nes_dev_list.next, typeof(*nesdev), list);
+		nes_write32(nesdev->regs + NES_EEPROM_DATA, val);
+	}
+	return strnlen(buf, count);
+}
+
+static ssize_t nes_show_flash_cmd(struct device_driver *ddp, char *buf)
+{
+	u32 flash_cmd;
+	struct nes_device *nesdev;
+
+	nesdev = list_entry(nes_dev_list.next, typeof(*nesdev), list);
+	flash_cmd = nes_read32(nesdev->regs + NES_FLASH_COMMAND);
+
+	return  snprintf(buf, PAGE_SIZE, "0x%x\n", flash_cmd);
+}
+
+static ssize_t nes_store_flash_cmd(struct device_driver *ddp,
+	const char *buf, size_t count)
+{
+	char *p = (char *)buf;
+	u32 val;
+	struct nes_device *nesdev;
+
+	if (p[1] == 'x' || p[1] == 'X' || p[0] == 'x' || p[0] == 'X') {
+		val = simple_strtoul(p, &p, 16);
+		nesdev = list_entry(nes_dev_list.next, typeof(*nesdev), list);
+		nes_write32(nesdev->regs + NES_FLASH_COMMAND, val);
+	}
+	return strnlen(buf, count);
+}
+
+static ssize_t nes_show_flash_data(struct device_driver *ddp, char *buf)
+{
+	u32 flash_data;
+	struct nes_device *nesdev;
+
+	nesdev = list_entry(nes_dev_list.next, typeof(*nesdev), list);
+	flash_data = nes_read32(nesdev->regs + NES_FLASH_DATA);
+
+	return  snprintf(buf, PAGE_SIZE, "0x%x\n", flash_data);
+}
+
+static ssize_t nes_store_flash_data(struct device_driver *ddp,
+	const char *buf, size_t count)
+{
+	char *p = (char *)buf;
+	u32 val;
+	struct nes_device *nesdev;
+
+	if (p[1] == 'x' || p[1] == 'X' || p[0] == 'x' || p[0] == 'X') {
+		val = simple_strtoul(p, &p, 16);
+		nesdev = list_entry(nes_dev_list.next, typeof(*nesdev), list);
+		nes_write32(nesdev->regs + NES_FLASH_DATA, val);
+	}
+	return strnlen(buf, count);
+}
+
+DRIVER_ATTR(eeprom_cmd, S_IRUSR | S_IWUSR,
+	nes_show_ee_cmd, nes_store_ee_cmd);
+DRIVER_ATTR(eeprom_data, S_IRUSR | S_IWUSR,
+	nes_show_ee_data, nes_store_ee_data);
+DRIVER_ATTR(flash_cmd, S_IRUSR | S_IWUSR,
+	nes_show_flash_cmd, nes_store_flash_cmd);
+DRIVER_ATTR(flash_data, S_IRUSR | S_IWUSR,
+	nes_show_flash_data, nes_store_flash_data);
+
+int nes_create_driver_sysfs(struct pci_driver *drv)
+{
+	int error;
+	error  = driver_create_file(&drv->driver, &driver_attr_eeprom_cmd);
+	error |= driver_create_file(&drv->driver, &driver_attr_eeprom_data);
+	error |= driver_create_file(&drv->driver, &driver_attr_flash_cmd);
+	error |= driver_create_file(&drv->driver, &driver_attr_flash_data);
+	return error;
+}
+
+void nes_remove_driver_sysfs(struct pci_driver *drv)
+{
+	driver_remove_file(&drv->driver, &driver_attr_eeprom_cmd);
+	driver_remove_file(&drv->driver, &driver_attr_eeprom_data);
+	driver_remove_file(&drv->driver, &driver_attr_flash_cmd);
+	driver_remove_file(&drv->driver, &driver_attr_flash_data);
+}
 
 /**
  * nes_init_module - module initialization entry point
@@ -787,12 +917,20 @@ static struct pci_driver nes_pci_driver = {
 static int __init nes_init_module(void)
 {
 	int retval;
+	int retval1;
+
 	retval = nes_cm_start();
 	if (retval) {
 		printk(KERN_ERR PFX "Unable to start NetEffect iWARP CM.\n");
 		return retval;
 	}
-	return pci_register_driver(&nes_pci_driver);
+	retval = pci_register_driver(&nes_pci_driver);
+	if (retval >= 0) {
+		retval1 = nes_create_driver_sysfs(&nes_pci_driver);
+		if (retval1 < 0)
+			printk(KERN_ERR PFX "Unable to create NetEffect sys files.\n");
+	}
+	return retval;
 }
 
 
@@ -802,6 +940,8 @@ static int __init nes_init_module(void)
 static void __exit nes_exit_module(void)
 {
 	nes_cm_stop();
+	nes_remove_driver_sysfs(&nes_pci_driver);
+
 	pci_unregister_driver(&nes_pci_driver);
 }
 
diff --git a/drivers/infiniband/hw/nes/nes_hw.h b/drivers/infiniband/hw/nes/nes_hw.h
index 178f3d5..25cfda2 100644
--- a/drivers/infiniband/hw/nes/nes_hw.h
+++ b/drivers/infiniband/hw/nes/nes_hw.h
@@ -49,6 +49,8 @@ enum pci_regs {
 	NES_ONE_SHOT_CONTROL = 0x001C,
 	NES_EEPROM_COMMAND = 0x0020,
 	NES_EEPROM_DATA = 0x0024,
+	NES_FLASH_COMMAND = 0x0028,
+	NES_FLASH_DATA  = 0x002C,
 	NES_SOFTWARE_RESET = 0x0030,
 	NES_CQ_ACK = 0x0034,
 	NES_WQE_ALLOC = 0x0040,


From glenn at lists.openfabrics.org  Thu Dec 20 12:26:22 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:26:22 -0800 (PST)
Subject: [ofa-general] [PATCH 3/10] nes: nic queue start/stop and carrier fix
Message-ID: <20071220202622.1F7C2E60396@openfabrics.org>


If a full send queue occurs, netif_stop_queue() is called
but netif_start_queue() was not being called.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index 810a9ae..2ff4c41 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -203,6 +203,7 @@ static int nes_netdev_open(struct net_device *netdev)
 		return ret;
 	}
 
+	netif_carrier_off(netdev);
 	netif_stop_queue(netdev);
 
 	if ((!nesvnic->of_device_registered) && (nesvnic->rdma_enabled)) {
@@ -502,6 +503,13 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 			netdev->name, skb->len, skb_headlen(skb),
 			skb_shinfo(skb)->nr_frags, skb_is_gso(skb));
 	*/
+
+	if (!netif_carrier_ok(netdev))
+		return NETDEV_TX_OK;
+
+	if (netif_queue_stopped(netdev))
+		return NETDEV_TX_BUSY;
+
 	local_irq_save(flags);
 	if (!spin_trylock(&nesnic->sq_lock)) {
 		local_irq_restore(flags);
@@ -511,12 +519,20 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 
 	/* Check if SQ is full */
 	if ((((nesnic->sq_tail+(nesnic->sq_size*2))-nesnic->sq_head) & (nesnic->sq_size - 1)) == 1) {
-		netif_stop_queue(netdev);
-		spin_unlock_irqrestore(&nesnic->sq_lock, flags);
+		if (!netif_queue_stopped(netdev)) {
+			netif_stop_queue(netdev);
+			barrier();
+			if ((((((volatile u16)nesnic->sq_tail)+(nesnic->sq_size*2))-nesnic->sq_head) & (nesnic->sq_size - 1)) != 1) {
+				netif_start_queue(netdev);
+				goto sq_no_longer_full;
+			}
+		}
 		nesvnic->sq_full++;
+		spin_unlock_irqrestore(&nesnic->sq_lock, flags);
 		return NETDEV_TX_BUSY;
 	}
 
+sq_no_longer_full:
 	nr_frags = skb_shinfo(skb)->nr_frags;
 	if (skb_headlen(skb) > NES_FIRST_FRAG_SIZE) {
 		nr_frags++;
@@ -534,13 +550,23 @@ static int nes_netdev_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 					(nesnic->sq_size - 1);
 
 			if (unlikely(wqes_needed > wqes_available)) {
-				netif_stop_queue(netdev);
+				if (!netif_queue_stopped(netdev)) {
+					netif_stop_queue(netdev);
+					barrier();
+					wqes_available = (((((volatile u16)nesnic->sq_tail)+nesnic->sq_size)-nesnic->sq_head) - 1) &
+						(nesnic->sq_size - 1);
+					if (wqes_needed <= wqes_available) {
+						netif_start_queue(netdev);
+						goto tso_sq_no_longer_full;
+					}
+				}
+				nesvnic->sq_full++;
 				spin_unlock_irqrestore(&nesnic->sq_lock, flags);
 				nes_debug(NES_DBG_NIC_TX, "%s: HNIC SQ full- TSO request has too many frags!\n",
 						netdev->name);
-				nesvnic->sq_full++;
 				return NETDEV_TX_BUSY;
 			}
+tso_sq_no_longer_full:
 			/* Map all the buffers */
 			for (tso_frag_count=0; tso_frag_count < skb_shinfo(skb)->nr_frags;
 					tso_frag_count++) {
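
For readers following the index arithmetic in this patch: the two checks treat the send queue as a power-of-two ring with one slot kept empty. The sketch below restates them as standalone userspace C. The function names are hypothetical, and it assumes head and tail are always in the range [0, size).

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if n is a non-zero power of two. */
static int ex_is_pow2(uint16_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Same test as the "Check if SQ is full" expression in the patch. */
static int ex_sq_full(uint16_t head, uint16_t tail, uint16_t size)
{
	assert(ex_is_pow2(size));
	return (((tail + (size * 2)) - head) & (size - 1)) == 1;
}

/* Same computation as wqes_available in the TSO path. */
static uint16_t ex_sq_available(uint16_t head, uint16_t tail, uint16_t size)
{
	assert(ex_is_pow2(size));
	return (uint16_t)((((tail + size) - head) - 1) & (size - 1));
}
```

With size 8, head 7 and tail 0 the queue reports full and zero entries available; head 0 and tail 0 reports seven available. The patch evaluates these expressions a second time after netif_stop_queue() and barrier(), and restarts the queue if space has appeared in between.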


From glenn at lists.openfabrics.org  Thu Dec 20 12:28:01 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:28:01 -0800 (PST)
Subject: [ofa-general] [PATCH 4/10] nes: interrupt moderation fix
Message-ID: <20071220202801.443FBE60396@openfabrics.org>


Hardware interrupt moderation timer gave average performance
on slower systems.  These fixes increase performance.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_hw.c b/drivers/infiniband/hw/nes/nes_hw.c
index 674ce32..1048db2 100644
--- a/drivers/infiniband/hw/nes/nes_hw.c
+++ b/drivers/infiniband/hw/nes/nes_hw.c
@@ -155,26 +155,41 @@ static void nes_nic_tune_timer(struct nes_device *nesdev)
 
 	spin_lock_irqsave(&nesadapter->periodic_timer_lock, flags);
 
+	if (shared_timer->cq_count_old < shared_timer->cq_count) {
+		if (shared_timer->cq_count > shared_timer->threshold_low ) {
+			shared_timer->cq_direction_downward=0;
+		}
+	}
+	if (shared_timer->cq_count_old >= shared_timer->cq_count) {
+		shared_timer->cq_direction_downward++;
+	}
+	shared_timer->cq_count_old = shared_timer->cq_count;
+	if (shared_timer->cq_direction_downward > NES_NIC_CQ_DOWNWARD_TREND) {
+		if (shared_timer->cq_count <= shared_timer->threshold_low ) {
+			shared_timer->threshold_low = shared_timer->threshold_low/2;
+			shared_timer->cq_direction_downward=0;
+			shared_timer->cq_count = 0;
+			spin_unlock_irqrestore(&nesadapter->periodic_timer_lock, flags);
+			return;
+		}
+	}
+
 	if (shared_timer->cq_count>1) {
 		nesdev->deepcq_count += shared_timer->cq_count;
 		if (shared_timer->cq_count <= shared_timer->threshold_low ) {       /* increase timer gently */
 			shared_timer->timer_direction_upward++;
 			shared_timer->timer_direction_downward = 0;
-		}
-		else if (shared_timer->cq_count <= shared_timer->threshold_target ) { /* balanced */
+		} else if (shared_timer->cq_count <= shared_timer->threshold_target ) { /* balanced */
 			shared_timer->timer_direction_upward = 0;
 			shared_timer->timer_direction_downward = 0;
-		}
-		else if (shared_timer->cq_count <= shared_timer->threshold_high ) {  /* decrease timer gently */
+		} else if (shared_timer->cq_count <= shared_timer->threshold_high ) {  /* decrease timer gently */
 			shared_timer->timer_direction_downward++;
 			shared_timer->timer_direction_upward = 0;
-		}
-		else if (shared_timer->cq_count <= (shared_timer->threshold_high)*2) {
+		} else if (shared_timer->cq_count <= (shared_timer->threshold_high)*2) {
 			shared_timer->timer_in_use -= 2;
 			shared_timer->timer_direction_upward = 0;
 			shared_timer->timer_direction_downward++;
-		}
-		else {
+		} else {
 			shared_timer->timer_in_use -= 4;
 			shared_timer->timer_direction_upward = 0;
 			shared_timer->timer_direction_downward++;
@@ -2241,7 +2256,7 @@ void nes_nic_ce_handler(struct nes_device *nesdev, struct nes_hw_nic_cq *cq)
 				if (atomic_read(&nesvnic->rx_skbs_needed) > (nesvnic->nic.rq_size>>1)) {
 					nes_write32(nesdev->regs+NES_CQE_ALLOC,
 							cq->cq_number | (cqe_count << 16));
-                    nesadapter->tune_timer.cq_count += cqe_count;
+					nesadapter->tune_timer.cq_count += cqe_count;
 					cqe_count = 0;
 					nes_replenish_nic_rq(nesvnic);
 				}
diff --git a/drivers/infiniband/hw/nes/nes_hw.h b/drivers/infiniband/hw/nes/nes_hw.h
index 25cfda2..ca0b006 100644
--- a/drivers/infiniband/hw/nes/nes_hw.h
+++ b/drivers/infiniband/hw/nes/nes_hw.h
@@ -957,6 +957,7 @@ struct nes_arp_entry {
 #define DEFAULT_JUMBO_NES_QL_LOW    12
 #define DEFAULT_JUMBO_NES_QL_TARGET 40
 #define DEFAULT_JUMBO_NES_QL_HIGH   128
+#define NES_NIC_CQ_DOWNWARD_TREND   8
 
 struct nes_hw_tune_timer {
     u16 cq_count;
@@ -969,6 +970,8 @@ struct nes_hw_tune_timer {
     u16 timer_in_use_max;
     u8  timer_direction_upward;
     u8  timer_direction_downward;
+    u16 cq_count_old;
+    u8  cq_direction_downward;
 };
 
 #define NES_TIMER_INT_LIMIT         2
@@ -1051,17 +1054,17 @@ struct nes_adapter {
 
 	u32 nic_rx_eth_route_err;
 
-	u32	et_rx_coalesce_usecs;
+	u32 et_rx_coalesce_usecs;
 	u32	et_rx_max_coalesced_frames;
 	u32 et_rx_coalesce_usecs_irq;
-	u32	et_rx_max_coalesced_frames_irq;
-	u32	et_pkt_rate_low;
-	u32	et_rx_coalesce_usecs_low;
-	u32	et_rx_max_coalesced_frames_low;
-	u32	et_pkt_rate_high;
-	u32	et_rx_coalesce_usecs_high;
-	u32	et_rx_max_coalesced_frames_high;
-	u32	et_rate_sample_interval;
+	u32 et_rx_max_coalesced_frames_irq;
+	u32 et_pkt_rate_low;
+	u32 et_rx_coalesce_usecs_low;
+	u32 et_rx_max_coalesced_frames_low;
+	u32 et_pkt_rate_high;
+	u32 et_rx_coalesce_usecs_high;
+	u32 et_rx_max_coalesced_frames_high;
+	u32 et_rate_sample_interval;
 	u32 timer_int_limit;
 
 	/* Adapter base MAC address */
@@ -1077,7 +1080,7 @@ struct nes_adapter {
 	u16 pd_config_size[4];
 	u16 pd_config_base[4];
 
-	u16  link_interrupt_count[4];
+	u16 link_interrupt_count[4];
 
 	/* the phy index for each port */
 	u8  phy_index[4];
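
The downward-trend tracking added to nes_nic_tune_timer() can be read in isolation as the sketch below. It is a userspace restatement for illustration, with hypothetical names and a trimmed-down struct holding only the fields this logic touches; locking and the rest of the timer tuning are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors NES_NIC_CQ_DOWNWARD_TREND from the patch. */
#define EX_CQ_DOWNWARD_TREND 8

/* Only the fields used by the trend logic. */
struct ex_tune_timer {
	uint16_t cq_count;
	uint16_t cq_count_old;
	uint16_t threshold_low;
	uint8_t  cq_direction_downward;
};

/*
 * Update the trend counters for one timer period.  Returns 1 when
 * threshold_low was halved (the driver returns early in that case),
 * 0 otherwise.
 */
static int ex_track_trend(struct ex_tune_timer *t)
{
	assert(t != 0);
	/* Rising count above the low threshold resets the trend. */
	if (t->cq_count_old < t->cq_count && t->cq_count > t->threshold_low)
		t->cq_direction_downward = 0;
	/* Flat or falling count extends the downward trend. */
	if (t->cq_count_old >= t->cq_count)
		t->cq_direction_downward++;
	t->cq_count_old = t->cq_count;

	if (t->cq_direction_downward > EX_CQ_DOWNWARD_TREND &&
	    t->cq_count <= t->threshold_low) {
		t->threshold_low /= 2;
		t->cq_direction_downward = 0;
		t->cq_count = 0;
		return 1;
	}
	return 0;
}
```

Starting from threshold_low 12 with a steady cq_count of 4, the first eight periods only bump the trend counter; the ninth period halves threshold_low to 6 and resets the counters.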


From glenn at lists.openfabrics.org  Thu Dec 20 12:29:44 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:29:44 -0800 (PST)
Subject: [ofa-general] [PATCH 5/10] nes: remove unneeded arp cache update
Message-ID: <20071220202944.9FA58E607F8@openfabrics.org>


The hardware arp cache is updated by inet event notifiers.
Therefore, no arp cache update is needed at netdev_open.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index 2ff4c41..496024a 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -260,16 +260,6 @@ static int nes_netdev_open(struct net_device *netdev)
 	}
 
 
-	if (netdev->ip_ptr) {
-		struct in_device *ip = netdev->ip_ptr;
-		struct in_ifaddr *in = NULL;
-		if (ip && ip->ifa_list) {
-			in = ip->ifa_list;
-			nes_manage_arp_cache(nesvnic->netdev, netdev->dev_addr,
-					ntohl(in->ifa_address), NES_ARP_ADD);
-		}
-	}
-
 	nes_write32(nesdev->regs+NES_CQE_ALLOC, NES_CQE_ALLOC_NOTIFY_NEXT |
 			nesvnic->nic_cq.cq_number);
 	nes_read32(nesdev->regs+NES_CQE_ALLOC);


From glenn at lists.openfabrics.org  Thu Dec 20 12:32:10 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:32:10 -0800 (PST)
Subject: [ofa-general] [PATCH 6/10] nes: use control QP callback at
	connection teardown
Message-ID: <20071220203210.0CC96E603C4@openfabrics.org>


Prevents a race condition between hardware and ULPs when
tearing down connections.  Memory and data structures are
cleaned up after the hardware ce handler has run.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes.c b/drivers/infiniband/hw/nes/nes.c
index 1088330..4376bc2 100644
--- a/drivers/infiniband/hw/nes/nes.c
+++ b/drivers/infiniband/hw/nes/nes.c
@@ -263,13 +263,43 @@ void nes_add_ref(struct ib_qp *ibqp)
 	atomic_inc(&nesqp->refcount);
 }
 
+static void nes_cqp_rem_ref_callback(struct nes_device *nesdev, struct nes_cqp_request *cqp_request)
+{
+	unsigned long flags;
+	struct nes_qp *nesqp = cqp_request->cqp_callback_pointer;
+	struct nes_adapter *nesadapter = nesdev->nesadapter;
+	u32 qp_id;
+
+	atomic_inc(&qps_destroyed);
+
+	/* Free the control structures */
+
+	qp_id = nesqp->hwqp.qp_id;
+	if (nesqp->pbl_vbase) {
+		pci_free_consistent(nesdev->pcidev, nesqp->qp_mem_size,
+				nesqp->hwqp.q2_vbase, nesqp->hwqp.q2_pbase);
+		spin_lock_irqsave(&nesadapter->pbl_lock, flags);
+		nesadapter->free_256pbl++;
+		spin_unlock_irqrestore(&nesadapter->pbl_lock, flags);
+		pci_free_consistent(nesdev->pcidev, 256, nesqp->pbl_vbase, nesqp->pbl_pbase);
+		nesqp->pbl_vbase = NULL;
+		kunmap(nesqp->page);
+
+	} else {
+		pci_free_consistent(nesdev->pcidev, nesqp->qp_mem_size,
+				nesqp->hwqp.sq_vbase, nesqp->hwqp.sq_pbase);
+	}
+	nes_free_resource(nesadapter, nesadapter->allocated_qps, nesqp->hwqp.qp_id);
+
+	kfree(nesqp->allocated_buffer);
+
+}
 
 /**
  * nes_rem_ref
  */
 void nes_rem_ref(struct ib_qp *ibqp)
 {
-	unsigned long flags;
 	u64 u64temp;
 	struct nes_qp *nesqp;
 	struct nes_vnic *nesvnic = to_nesvnic(ibqp->device);
@@ -287,27 +317,7 @@ void nes_rem_ref(struct ib_qp *ibqp)
 	}
 
 	if (atomic_dec_and_test(&nesqp->refcount)) {
-		atomic_inc(&qps_destroyed);
-
-		/* Free the control structures */
-
-		if (nesqp->pbl_vbase) {
-			pci_free_consistent(nesdev->pcidev, nesqp->qp_mem_size,
-					nesqp->hwqp.q2_vbase, nesqp->hwqp.q2_pbase);
-			spin_lock_irqsave(&nesadapter->pbl_lock, flags);
-			nesadapter->free_256pbl++;
-			spin_unlock_irqrestore(&nesadapter->pbl_lock, flags);
-			pci_free_consistent(nesdev->pcidev, 256, nesqp->pbl_vbase, nesqp->pbl_pbase);
-			nesqp->pbl_vbase = NULL;
-			kunmap(nesqp->page);
-
-		} else {
-			pci_free_consistent(nesdev->pcidev, nesqp->qp_mem_size,
-					nesqp->hwqp.sq_vbase, nesqp->hwqp.sq_pbase);
-		}
-
 		nesadapter->qp_table[nesqp->hwqp.qp_id-NES_FIRST_QPN] = NULL;
-		nes_free_resource(nesadapter, nesadapter->allocated_qps, nesqp->hwqp.qp_id);
 
 		/* Destroy the QP */
 		cqp_request = nes_get_cqp_request(nesdev);
@@ -316,6 +326,9 @@ void nes_rem_ref(struct ib_qp *ibqp)
 			return;
 		}
 		cqp_request->waiting = 0;
+		cqp_request->callback = 1;
+		cqp_request->cqp_callback = nes_cqp_rem_ref_callback;
+		cqp_request->cqp_callback_pointer = nesqp;
 		cqp_wqe = &cqp_request->cqp_wqe;
 
 		cqp_wqe->wqe_words[NES_CQP_WQE_OPCODE_IDX] =
@@ -339,8 +352,6 @@ void nes_rem_ref(struct ib_qp *ibqp)
 				cpu_to_le32((u32)(u64temp >> 32));
 
 		nes_post_cqp_request(nesdev, cqp_request, NES_CQP_REQUEST_RING_DOORBELL);
-
-		kfree(nesqp->allocated_buffer);
 	}
 }
 
diff --git a/drivers/infiniband/hw/nes/nes_hw.c b/drivers/infiniband/hw/nes/nes_hw.c
index 1048db2..06d1963 100644
--- a/drivers/infiniband/hw/nes/nes_hw.c
+++ b/drivers/infiniband/hw/nes/nes_hw.c
@@ -2427,6 +2427,16 @@ void nes_cqp_ce_handler(struct nes_device *nesdev, struct nes_hw_cq *cq)
 							spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
 						}
 					}
+				} else if (cqp_request->callback) {
+					/* Invoke the callback routine */
+					cqp_request->cqp_callback(nesdev, cqp_request);
+					if (cqp_request->dynamic) {
+						kfree(cqp_request);
+					} else {
+						spin_lock_irqsave(&nesdev->cqp.lock, flags);
+						list_add_tail(&cqp_request->list, &nesdev->cqp_avail_reqs);
+						spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
+					}
 				} else {
 					nes_debug(NES_DBG_CQP, "CQP request %p (opcode 0x%02X) freed.\n",
 							cqp_request,
diff --git a/drivers/infiniband/hw/nes/nes_hw.h b/drivers/infiniband/hw/nes/nes_hw.h
index ca0b006..0279d4c 100644
--- a/drivers/infiniband/hw/nes/nes_hw.h
+++ b/drivers/infiniband/hw/nes/nes_hw.h
@@ -813,18 +813,22 @@ struct nes_hw_aeqe {
 	__le32 aeqe_words[4];
 };
 
-
 struct nes_cqp_request {
+	union {
+		u64 cqp_callback_context;
+		void *cqp_callback_pointer;
+	};
 	wait_queue_head_t     waitq;
 	struct nes_hw_cqp_wqe cqp_wqe;
 	struct list_head      list;
 	atomic_t              refcount;
+	void (*cqp_callback)(struct nes_device *nesdev, struct nes_cqp_request *cqp_request);
 	u16                   major_code;
 	u16                   minor_code;
 	u8                    waiting;
 	u8                    request_done;
 	u8                    dynamic;
-	u8                    padding[1];
+	u8                    callback;
 };
 
 struct nes_hw_cqp {
@@ -1158,6 +1162,8 @@ struct nes_vnic {
 	struct nes_hw_nic    nic;
 	struct nes_hw_nic_cq nic_cq;
 
+	struct nes_cqp_request* (*get_cqp_request)(struct nes_device *nesdev);
+	void (*post_cqp_request)(struct nes_device*, struct nes_cqp_request *, int);
 	struct net_device_stats netstats;
 	/* used to put the netdev on the adapters logical port list */
 	struct list_head list;
diff --git a/drivers/infiniband/hw/nes/nes_utils.c b/drivers/infiniband/hw/nes/nes_utils.c
index 8d2c1ee..ffd2b99 100644
--- a/drivers/infiniband/hw/nes/nes_utils.c
+++ b/drivers/infiniband/hw/nes/nes_utils.c
@@ -558,6 +558,7 @@ struct nes_cqp_request *nes_get_cqp_request(struct nes_device *nesdev)
 		init_waitqueue_head(&cqp_request->waitq);
 		cqp_request->waiting = 0;
 		cqp_request->request_done = 0;
+		cqp_request->callback = 0;
 		init_waitqueue_head(&cqp_request->waitq);
 		nes_debug(NES_DBG_CQP, "Got cqp request %p from the available list \n",
 				cqp_request);


From glenn at lists.openfabrics.org  Thu Dec 20 12:33:18 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:33:18 -0800 (PST)
Subject: [ofa-general] [PATCH 7/10] nes: process mss option
Message-ID: <20071220203319.00D14E60849@openfabrics.org>


Process the MSS option when a packet carries it.  If a SYN
packet has no MSS option, use the default value
(NES_CM_DEFAULT_MSS, 536).
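
A minimal stand-alone sketch of this option handling, assuming the
standard TCP option kinds (0 = end of list, 1 = no-op, 2 = MSS).  The
demo_* names, the -1 error return and the extra bounds checks are
assumptions for illustration and are not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_OPT_END		0
#define DEMO_OPT_NOP		1
#define DEMO_OPT_MSS		2
#define DEMO_DEFAULT_MSS	536

/*
 * Walk a TCP option block.  Returns 0 on success and -1 if an option
 * is malformed.  *mss is lowered to the peer's MSS if that is smaller,
 * or set to the default when a SYN carries no MSS option.
 */
static int demo_process_options(const uint8_t *opts, size_t len,
				int is_syn, uint16_t *mss)
{
	size_t off = 0;
	int got_mss = 0;

	while (off < len) {
		uint8_t kind = opts[off];
		uint8_t optlen;

		if (kind == DEMO_OPT_END)
			break;
		if (kind == DEMO_OPT_NOP) {
			off += 1;	/* single-byte option, no length field */
			continue;
		}
		if (off + 1 >= len)
			return -1;	/* truncated: no length byte */
		optlen = opts[off + 1];
		if (optlen < 2 || off + optlen > len)
			return -1;	/* length runs past the option block */
		if (kind == DEMO_OPT_MSS) {
			uint16_t val;

			if (optlen != 4)
				return -1;	/* MSS option is always 4 bytes */
			got_mss = 1;
			/* MSS value is in network byte order */
			val = (uint16_t)((opts[off + 2] << 8) | opts[off + 3]);
			if (val > 0 && val < *mss)
				*mss = val;
		}
		off += optlen;
	}
	if (!got_mss && is_syn)
		*mss = DEMO_DEFAULT_MSS;
	return 0;
}
```

A non-SYN packet without an MSS option leaves the negotiated value
alone; only a SYN falls back to the default.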

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index 79889a4..1777769 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -1220,11 +1220,12 @@ static int rem_ref_cm_node(struct nes_cm_core *cm_core,
 /**
  * process_options
  */
-static void process_options(struct nes_cm_node *cm_node, u8 *optionsloc, u32 optionsize)
+static int process_options(struct nes_cm_node *cm_node, u8 *optionsloc, u32 optionsize, u32 syn_packet)
 {
 	u32 tmp;
 	u32 offset = 0;
 	union all_known_options *all_options;
+	char got_mss_option = 0;
 
 	while (offset < optionsize) {
 		all_options = (union all_known_options *)(optionsloc + offset);
@@ -1236,9 +1237,17 @@ static void process_options(struct nes_cm_node *cm_node, u8 *optionsloc, u32 opt
 				offset += 1;
 				continue;
 			case OPTION_NUMBER_MSS:
-				tmp = htons(all_options->as_mss.mss);
-				if (tmp < cm_node->tcp_cntxt.mss)
-					cm_node->tcp_cntxt.mss = tmp;
+				nes_debug(NES_DBG_CM, "%s: MSS Length: %d Offset: %d Size: %d\n",
+						__FUNCTION__,
+						all_options->as_mss.length, offset, optionsize);
+				got_mss_option = 1;
+				if (all_options->as_mss.length != 4) {
+					return 1;
+				} else {
+					tmp = htons(all_options->as_mss.mss);
+					if (tmp > 0 && tmp < cm_node->tcp_cntxt.mss)
+						cm_node->tcp_cntxt.mss = tmp;
+				}
 				break;
 			case OPTION_NUMBER_WINDOW_SCALE:
 				cm_node->tcp_cntxt.snd_wscale = all_options->as_windowscale.shiftcount;
@@ -1253,6 +1262,9 @@ static void process_options(struct nes_cm_node *cm_node, u8 *optionsloc, u32 opt
 		}
 		offset += all_options->as_base.length;
 	}
+	if ((!got_mss_option) && (syn_packet))
+		cm_node->tcp_cntxt.mss = NES_CM_DEFAULT_MSS;
+	return 0;
 }
 
 
@@ -1343,6 +1355,8 @@ int process_packet(struct nes_cm_node *cm_node, struct sk_buff *skb,
 		u8 *optionsloc = (u8 *)&tcph[1];
 		process_options(cm_node, optionsloc, optionsize);
 	}
+	else if (tcph->syn)
+		cm_node->tcp_cntxt.mss = NES_CM_DEFAULT_MSS;
 
 	cm_node->tcp_cntxt.snd_wnd = htons(tcph->window) <<
 			cm_node->tcp_cntxt.snd_wscale;
diff --git a/drivers/infiniband/hw/nes/nes_cm.h b/drivers/infiniband/hw/nes/nes_cm.h
index cd8e003..c511242 100644
--- a/drivers/infiniband/hw/nes/nes_cm.h
+++ b/drivers/infiniband/hw/nes/nes_cm.h
@@ -152,6 +152,8 @@ struct nes_timer_entry {
 #define NES_CM_DEFAULT_FREE_PKTS      0x000A
 #define NES_CM_FREE_PKT_LO_WATERMARK  2
 
+#define NES_CM_DEFAULT_MSS   536
+
 #define NES_CM_DEF_SEQ       0x159bf75f
 #define NES_CM_DEF_LOCAL_ID  0x3b47
 

From glenn at lists.openfabrics.org  Thu Dec 20 12:36:13 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:36:13 -0800 (PST)
Subject: [ofa-general] [PATCH 8/10] nes: multicast performance enhancement
Message-ID: <20071220203613.8C10DE603C4@openfabrics.org>


Move multicast processing to its own QP and set up the
hardware to use it.

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_hw.c b/drivers/infiniband/hw/nes/nes_hw.c
index 06d1963..515133d 100644
--- a/drivers/infiniband/hw/nes/nes_hw.c
+++ b/drivers/infiniband/hw/nes/nes_hw.c
@@ -698,7 +698,7 @@ void nes_init_csr_ne020(struct nes_device *nesdev, u8 hw_rev, u8 port_count)
 
 	nes_write_indexed(nesdev, 0x000001E4, 0x00000007);
 	/* nes_write_indexed(nesdev, 0x000001E8, 0x000208C4); */
-	nes_write_indexed(nesdev, 0x000001E8, 0x00020844);
+	nes_write_indexed(nesdev, 0x000001E8, 0x00020874);
 	nes_write_indexed(nesdev, 0x000001D8, 0x00048002);
 	/* nes_write_indexed(nesdev, 0x000001D8, 0x0004B002); */
 	nes_write_indexed(nesdev, 0x000001FC, 0x00050005);
@@ -753,7 +753,7 @@ void nes_init_csr_ne020(struct nes_device *nesdev, u8 hw_rev, u8 port_count)
 	nes_write_indexed(nesdev, 0x000060C0, 0x0000028e);
 	nes_write_indexed(nesdev, 0x000060C8, 0x00000020);
 														//
-	nes_write_indexed(nesdev, 0x000001EC, 0x5b2625a0);
+	nes_write_indexed(nesdev, 0x000001EC, 0x7b2625a0);
 	/* nes_write_indexed(nesdev, 0x000001EC, 0x5f2625a0); */
 
 	if (hw_rev != NE020_REV) {
@@ -1377,7 +1377,7 @@ int nes_init_nic_qp(struct nes_device *nesdev, struct net_device *netdev)
 		nic_sqe = &nesvnic->nic.sq_vbase[counter];
 		nic_sqe->wqe_words[NES_NIC_SQ_WQE_MISC_IDX] =
 				cpu_to_le32(NES_NIC_SQ_WQE_DISABLE_CHKSUM |
-							NES_NIC_SQ_WQE_COMPLETION);
+				NES_NIC_SQ_WQE_COMPLETION);
 		nic_sqe->wqe_words[NES_NIC_SQ_WQE_LENGTH_0_TAG_IDX] =
 				cpu_to_le32((u32)NES_FIRST_FRAG_SIZE << 16);
 		nic_sqe->wqe_words[NES_NIC_SQ_WQE_FRAG0_LOW_IDX] =
@@ -1386,6 +1386,15 @@ int nes_init_nic_qp(struct nes_device *nesdev, struct net_device *netdev)
 				cpu_to_le32((u32)((u64)nesvnic->nic.frag_paddr[counter] >> 32));
 	}
 
+	nesvnic->mcrq_nic.sq_vbase = (void*)0;
+	nesvnic->mcrq_nic.sq_pbase = 0;
+	nesvnic->mcrq_nic.sq_head = 0;
+	nesvnic->mcrq_nic.sq_tail = 0;
+	nesvnic->mcrq_nic.sq_size = 0;
+	nesvnic->get_cqp_request = nes_get_cqp_request;
+	nesvnic->post_cqp_request = nes_post_cqp_request;
+	nesvnic->mcrq_mcast_filter = 0;
+
 	spin_lock_init(&nesvnic->nic.sq_lock);
 	spin_lock_init(&nesvnic->nic.rq_lock);
 
@@ -1404,6 +1413,17 @@ int nes_init_nic_qp(struct nes_device *nesdev, struct net_device *netdev)
 	vmem += (NES_NIC_WQ_SIZE * sizeof(struct nes_hw_nic_rq_wqe));
 	pmem += (NES_NIC_WQ_SIZE * sizeof(struct nes_hw_nic_rq_wqe));
 
+	nesvnic->mcrq_nic.rq_vbase = vmem;
+	nesvnic->mcrq_nic.rq_pbase = pmem;
+	nesvnic->mcrq_nic.rq_head = 0;
+	nesvnic->mcrq_nic.rq_tail = 0;
+	nesvnic->mcrq_nic.rq_size = NES_NIC_WQ_SIZE;
+
+	/* setup the CQ */
+	vmem += (NES_NIC_WQ_SIZE * sizeof(struct nes_hw_nic_rq_wqe));
+	pmem += (NES_NIC_WQ_SIZE * sizeof(struct nes_hw_nic_rq_wqe));
+	nesvnic->mcrq_nic.qp_id = nesvnic->nic_index + 32;
+
 	nesvnic->nic_cq.cq_vbase = vmem;
 	nesvnic->nic_cq.cq_pbase = pmem;
 	nesvnic->nic_cq.cq_head = 0;
@@ -1484,6 +1504,19 @@ int nes_init_nic_qp(struct nes_device *nesdev, struct net_device *netdev)
 	/* Ring doorbell (2 WQEs) */
 	nes_write32(nesdev->regs+NES_WQE_ALLOC, 0x02800000 | nesdev->cqp.qp_id);
 
+	/* Send CreateQP request to CQP */
+	nic_context++;
+	nic_context->context_words[NES_NIC_CTX_MISC_IDX] =
+			cpu_to_le32((u32)NES_NIC_CTX_SIZE |
+			((u32)PCI_FUNC(nesdev->pcidev->devfn) << 12) | (1 << 18));
+
+	u64temp = (u64)nesvnic->mcrq_nic.sq_pbase;
+	nic_context->context_words[NES_NIC_CTX_SQ_LOW_IDX] = cpu_to_le32((u32)u64temp);
+	nic_context->context_words[NES_NIC_CTX_SQ_HIGH_IDX] = cpu_to_le32((u32)(u64temp >> 32));
+	u64temp = (u64)nesvnic->mcrq_nic.rq_pbase;
+	nic_context->context_words[NES_NIC_CTX_RQ_LOW_IDX] = cpu_to_le32((u32)u64temp);
+	nic_context->context_words[NES_NIC_CTX_RQ_HIGH_IDX] = cpu_to_le32((u32)(u64temp >> 32));
+
 	spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
 	nes_debug(NES_DBG_INIT, "Waiting for create NIC QP%u to complete.\n",
 			nesvnic->nic.qp_id);
diff --git a/drivers/infiniband/hw/nes/nes_hw.h b/drivers/infiniband/hw/nes/nes_hw.h
index 0279d4c..2efb55e 100644
--- a/drivers/infiniband/hw/nes/nes_hw.h
+++ b/drivers/infiniband/hw/nes/nes_hw.h
@@ -1161,9 +1161,11 @@ struct nes_vnic {
 	dma_addr_t           nic_pbase;
 	struct nes_hw_nic    nic;
 	struct nes_hw_nic_cq nic_cq;
-
+	struct nes_hw_nic    mcrq_nic;
+	struct nes_ucontext *mcrq_ucontext;
 	struct nes_cqp_request* (*get_cqp_request)(struct nes_device *nesdev);
 	void (*post_cqp_request)(struct nes_device*, struct nes_cqp_request *, int);
+	int (*mcrq_mcast_filter)( struct nes_vnic* nesvnic, __u8* dmi_addr );
 	struct net_device_stats netstats;
 	/* used to put the netdev on the adapters logical port list */
 	struct list_head list;
diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index 496024a..bd3f9e8 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -901,6 +901,9 @@ void nes_netdev_set_multicast_list(struct net_device *netdev)
 		perfect_filter_register_address = NES_IDX_PERFECT_FILTER_LOW + 0x80;
 		perfect_filter_register_address += nesvnic->nic_index*0x40;
 		for (mc_index=0; mc_index < NES_MULTICAST_PF_MAX; mc_index++) {
+			while ( multicast_addr && nesvnic->mcrq_mcast_filter && (nesvnic->mcrq_mcast_filter( nesvnic, multicast_addr->dmi_addr ) == 0) ) {
+				multicast_addr = multicast_addr->next;
+			}
 			if (multicast_addr) {
 				nes_debug(NES_DBG_NIC_RX, "Assigning MC Address = %02X%02X%02X%02X%02X%02X to register 0x%04X\n",
 						  multicast_addr->dmi_addr[0], multicast_addr->dmi_addr[1],
diff --git a/drivers/infiniband/hw/nes/nes_user.h b/drivers/infiniband/hw/nes/nes_user.h
index 6ab2357..d5a91a1 100644
--- a/drivers/infiniband/hw/nes/nes_user.h
+++ b/drivers/infiniband/hw/nes/nes_user.h
@@ -72,6 +72,8 @@ struct nes_alloc_pd_resp {
 
 struct nes_create_cq_req {
 	__u64 user_cq_buffer;
+	__u8 mcrqf;
+	__u8 reserved[7];
 };
 
 struct nes_create_qp_req {
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index 97cb51e..9751d0d 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -1702,6 +1702,10 @@ static struct ib_cq *nes_create_cq(struct ib_device *ibdev, int entries,
 			kfree(nescq);
 			return ERR_PTR(-EFAULT);
 		}
+		nesvnic->mcrq_ucontext = nes_ucontext;
+		nes_ucontext->mcrqf = req.mcrqf;
+		if (nes_ucontext->mcrqf)
+			nescq->hw_cq.cq_number = nesvnic->mcrq_nic.qp_id;
 		nes_debug(NES_DBG_CQ, "CQ Virtual Address = %08lX, size = %u.\n",
 				(unsigned long)req.user_cq_buffer, entries);
 		list_for_each_entry(nespbl, &nes_ucontext->cq_reg_mem_list, list) {
diff --git a/drivers/infiniband/hw/nes/nes_verbs.h b/drivers/infiniband/hw/nes/nes_verbs.h
index 96d59ce..c5ee39d 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.h
+++ b/drivers/infiniband/hw/nes/nes_verbs.h
@@ -54,6 +54,7 @@ struct nes_ucontext {
 	u16                first_free_wq;
 	struct list_head   cq_reg_mem_list;
 	struct list_head   qp_reg_mem_list;
+	u8                 mcrqf;
 };
 
 struct nes_pd {


From glenn at lists.openfabrics.org  Thu Dec 20 12:37:49 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:37:49 -0800 (PST)
Subject: [ofa-general] [PATCH 9/10] nes: support systems with other than 4k
	page size
Message-ID: <20071220203749.D5D59E603AF@openfabrics.org>


The driver originally assumed a 4k page size.  Get the page
size from the kernel and map the memory accordingly.
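
The arithmetic involved can be sketched in plain C as below.  This
assumes the hardware keeps working in 4k units (shift of 12) while
the kernel page shift varies; the demo_* helpers are hypothetical and
only mirror the shifts used in the diff.

```c
#include <stdint.h>

#define DEMO_HW_PAGE_SHIFT 12	/* hardware works in 4k units */

/* Map a PD index allocated per kernel page to a hardware PD id. */
static uint32_t demo_pd_num_to_id(uint32_t pd_num, unsigned int page_shift,
				  uint32_t base_pd)
{
	return (pd_num << (page_shift - DEMO_HW_PAGE_SHIFT)) + base_pd;
}

/* Inverse mapping, as needed when the PD is released. */
static uint32_t demo_pd_id_to_num(uint32_t pd_id, unsigned int page_shift,
				  uint32_t base_pd)
{
	return (pd_id - base_pd) >> (page_shift - DEMO_HW_PAGE_SHIFT);
}

/* Number of 4k hardware pages needed to cover length bytes. */
static uint32_t demo_hw_pages(uint64_t length)
{
	return (uint32_t)((length >> DEMO_HW_PAGE_SHIFT) +
			  ((length & (4096 - 1)) ? 1 : 0));
}
```

With a 4k kernel page the PD mapping is just an offset by base_pd;
with a 64k page (shift 16) each PD index advances the hardware id by
16, and one 64k page spans 16 hardware pages.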

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_hw.c b/drivers/infiniband/hw/nes/nes_hw.c
index 515133d..5646e5a 100644
--- a/drivers/infiniband/hw/nes/nes_hw.c
+++ b/drivers/infiniband/hw/nes/nes_hw.c
@@ -259,7 +259,7 @@ struct nes_adapter *nes_init_adapter(struct nes_device *nesdev, u8 hw_rev) {
 	}
 
 	/* no adapter found */
-	num_pds = pci_resource_len(nesdev->pcidev, BAR_1) / 4096;
+	num_pds = pci_resource_len(nesdev->pcidev, BAR_1) >> PAGE_SHIFT;
 	if ((hw_rev != NE020_REV) && (hw_rev != NE020_REV1)) {
 		nes_debug(NES_DBG_INIT, "NE020 driver detected unknown hardware revision 0x%x\n",
 				hw_rev);
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index 9751d0d..aa53cbd 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -839,7 +839,7 @@ static struct ib_ucontext *nes_alloc_ucontext(struct ib_device *ibdev,
 		return ERR_PTR(-ENOMEM);
 
 	nes_ucontext->nesdev = nesdev;
-	nes_ucontext->mmap_wq_offset = ((uresp.max_pds * 4096) + PAGE_SIZE-1) / PAGE_SIZE;
+	nes_ucontext->mmap_wq_offset = uresp.max_pds;
 	nes_ucontext->mmap_cq_offset = nes_ucontext->mmap_wq_offset +
 			((sizeof(struct nes_hw_qp_wqe) * uresp.max_qps * 2) + PAGE_SIZE-1) /
 			PAGE_SIZE;
@@ -960,7 +960,7 @@ static struct ib_pd *nes_alloc_pd(struct ib_device *ibdev,
 	nes_debug(NES_DBG_PD, "Allocating PD (%p) for ib device %s\n",
 			nespd, nesvnic->nesibdev->ibdev.name);
 
-	nespd->pd_id = pd_num + nesadapter->base_pd;
+	nespd->pd_id = (pd_num << (PAGE_SHIFT-12)) + nesadapter->base_pd;
 
 	if (context) {
 		nesucontext = to_nesucontext(context);
@@ -1018,7 +1018,7 @@ static int nes_dealloc_pd(struct ib_pd *ibpd)
 	nes_debug(NES_DBG_PD, "Deallocating PD%u structure located @%p.\n",
 			nespd->pd_id, nespd);
 	nes_free_resource(nesadapter, nesadapter->allocated_pds,
-			nespd->pd_id-nesadapter->base_pd);
+			(nespd->pd_id-nesadapter->base_pd)>>(PAGE_SHIFT-12));
 	kfree(nespd);
 
 	return 0;
@@ -1089,8 +1089,8 @@ static int nes_setup_virt_qp(struct nes_qp *nesqp, struct nes_pbl *nespbl,
 	pbl = nespbl->pbl_vbase; /* points to first pbl entry */
 	/* now lets set the sq_vbase as well as rq_vbase addrs we will assign */
 	/* the first pbl to be fro the rq_vbase... */
-	rq_pbl_entries = (rq_size * sizeof(struct nes_hw_qp_wqe)) >> PAGE_SHIFT;
-	sq_pbl_entries = (sq_size * sizeof(struct nes_hw_qp_wqe)) >> PAGE_SHIFT;
+	rq_pbl_entries = (rq_size * sizeof(struct nes_hw_qp_wqe)) >> 12;
+	sq_pbl_entries = (sq_size * sizeof(struct nes_hw_qp_wqe)) >> 12;
 	nesqp->hwqp.sq_pbase  = (le32_to_cpu (((u32 *)pbl)[0]) ) | ((u64)((le32_to_cpu (((u32 *)pbl)[1]))) << 32);
 	if (!nespbl->page) {
 		nes_debug(NES_DBG_QP, "QP nespbl->page is NULL \n");
@@ -2433,6 +2433,7 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 	u32 driver_key;
 	u32 root_pbl_index = 0;
 	u32 cur_pbl_index = 0;
+	u32 skip_pages;
 	u16 pbl_count;
 	u8 single_page = 1;
 	u8 stag_key;
@@ -2442,8 +2443,12 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 		return (struct ib_mr *)region;
 	}
 
-	nes_debug(NES_DBG_MR, "User base = 0x%lX, Virt base = 0x%lX, length = %u\n",
-			(unsigned long int)start, (unsigned long int)virt, (u32)length);
+	nes_debug(NES_DBG_MR, "User base = 0x%lX, Virt base = 0x%lX, length = %u,"
+			" offset = %u, page size = %u.\n",
+			(unsigned long int)start, (unsigned long int)virt, (u32)length,
+			region->offset, region->page_size);
+
+	skip_pages = ((u32)region->offset) >> 12;
 
 	if (ib_copy_from_udata(&req, udata, sizeof(req)))
 		return ERR_PTR(-EFAULT);
@@ -2460,7 +2465,7 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 			get_random_bytes(&next_stag_index, sizeof(next_stag_index));
 			stag_key = (u8)next_stag_index;
 
-			driver_key = 0;
+			driver_key = next_stag_index & 0x70000000;
 
 			next_stag_index >>= 8;
 			next_stag_index %= nesadapter->max_mr;
@@ -2484,70 +2489,6 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 				nes_debug(NES_DBG_MR, "Chunk: nents = %u, nmap = %u .\n",
 						chunk->nents, chunk->nmap);
 				for (nmap_index = 0; nmap_index < chunk->nmap; ++nmap_index) {
-					if ((page_count&0x01FF) == 0) {
-						if (page_count>(1024*512)) {
-							ib_umem_release(region);
-							pci_free_consistent(nesdev->pcidev, 4096, vpbl.pbl_vbase,
-									vpbl.pbl_pbase);
-							nes_free_resource(nesadapter,
-									nesadapter->allocated_mrs, stag_index);
-							kfree(nesmr);
-							return ERR_PTR(-E2BIG);
-						}
-						if (root_pbl_index == 1) {
-							root_vpbl.pbl_vbase = pci_alloc_consistent(nesdev->pcidev,
-									8192, &root_vpbl.pbl_pbase);
-							nes_debug(NES_DBG_MR, "Allocating root PBL, va = %p, pa = 0x%08X\n",
-									root_vpbl.pbl_vbase, (unsigned int)root_vpbl.pbl_pbase);
-							if (!root_vpbl.pbl_vbase) {
-								ib_umem_release(region);
-								pci_free_consistent(nesdev->pcidev, 4096, vpbl.pbl_vbase,
-										vpbl.pbl_pbase);
-								nes_free_resource(nesadapter, nesadapter->allocated_mrs,
-										stag_index);
-								kfree(nesmr);
-								return ERR_PTR(-ENOMEM);
-							}
-							root_vpbl.leaf_vpbl = kzalloc(sizeof(*root_vpbl.leaf_vpbl)*1024,
-									GFP_KERNEL);
-							if (!root_vpbl.leaf_vpbl) {
-								ib_umem_release(region);
-								pci_free_consistent(nesdev->pcidev, 8192, root_vpbl.pbl_vbase,
-										root_vpbl.pbl_pbase);
-								pci_free_consistent(nesdev->pcidev, 4096, vpbl.pbl_vbase,
-										vpbl.pbl_pbase);
-								nes_free_resource(nesadapter, nesadapter->allocated_mrs,
-										stag_index);
-								kfree(nesmr);
-								return ERR_PTR(-ENOMEM);
-							}
-							root_vpbl.pbl_vbase[0].pa_low =
-									cpu_to_le32((u32)vpbl.pbl_pbase);
-							root_vpbl.pbl_vbase[0].pa_high =
-									cpu_to_le32((u32)((((u64)vpbl.pbl_pbase) >> 32)));
-							root_vpbl.leaf_vpbl[0] = vpbl;
-						}
-						vpbl.pbl_vbase = pci_alloc_consistent(nesdev->pcidev, 4096,
-								&vpbl.pbl_pbase);
-						nes_debug(NES_DBG_MR, "Allocating leaf PBL, va = %p, pa = 0x%08X\n",
-								vpbl.pbl_vbase, (unsigned int)vpbl.pbl_pbase);
-						if (!vpbl.pbl_vbase) {
-							ib_umem_release(region);
-							nes_free_resource(nesadapter, nesadapter->allocated_mrs, stag_index);
-							ibmr = ERR_PTR(-ENOMEM);
-							kfree(nesmr);
-							goto reg_user_mr_err;
-						}
-						if (1 <= root_pbl_index) {
-							root_vpbl.pbl_vbase[root_pbl_index].pa_low =
-									cpu_to_le32((u32)vpbl.pbl_pbase);
-							root_vpbl.pbl_vbase[root_pbl_index].pa_high =
-									cpu_to_le32((u32)((((u64)vpbl.pbl_pbase)>>32)));
-							root_vpbl.leaf_vpbl[root_pbl_index] = vpbl;
-						}
-						root_pbl_index++;
-						cur_pbl_index = 0;
-					}
 					if (sg_dma_address(&chunk->page_list[nmap_index]) & ~PAGE_MASK) {
 						ib_umem_release(region);
 						nes_free_resource(nesadapter, nesadapter->allocated_mrs, stag_index);
@@ -2569,35 +2510,106 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 					}
 
 					region_length += sg_dma_len(&chunk->page_list[nmap_index]);
-					chunk_pages = sg_dma_len(&chunk->page_list[nmap_index]) >> PAGE_SHIFT;
-					for (page_index=0; page_index < chunk_pages; page_index++) {
+					chunk_pages = sg_dma_len(&chunk->page_list[nmap_index]) >> 12;
+					region_length -= skip_pages << 12;
+					for (page_index=skip_pages; page_index < chunk_pages; page_index++) {
+						skip_pages = 0;
+						if ((page_count!=0)&&(page_count<<12)-(region->offset&(4096-1))>=region->length)
+							goto enough_pages;
+						if ((page_count&0x01FF) == 0) {
+							if (page_count>(1024*512)) {
+								ib_umem_release(region);
+								pci_free_consistent(nesdev->pcidev, 4096, vpbl.pbl_vbase,
+										vpbl.pbl_pbase);
+								nes_free_resource(nesadapter,
+										nesadapter->allocated_mrs, stag_index);
+								kfree(nesmr);
+								nesmr = ERR_PTR(-E2BIG);
+								goto reg_user_mr_err;
+							}
+							if (root_pbl_index == 1) {
+								root_vpbl.pbl_vbase = pci_alloc_consistent(nesdev->pcidev,
+										8192, &root_vpbl.pbl_pbase);
+								nes_debug(NES_DBG_MR, "Allocating root PBL, va = %p, pa = 0x%08X\n",
+										root_vpbl.pbl_vbase, (unsigned int)root_vpbl.pbl_pbase);
+								if (!root_vpbl.pbl_vbase) {
+									ib_umem_release(region);
+									pci_free_consistent(nesdev->pcidev, 4096, vpbl.pbl_vbase,
+											vpbl.pbl_pbase);
+									nes_free_resource(nesadapter, nesadapter->allocated_mrs,
+											stag_index);
+									kfree(nesmr);
+									nesmr = ERR_PTR(-ENOMEM);
+									goto reg_user_mr_err;
+								}
+								root_vpbl.leaf_vpbl = kzalloc(sizeof(*root_vpbl.leaf_vpbl)*1024,
+										GFP_KERNEL);
+								if (!root_vpbl.leaf_vpbl) {
+									ib_umem_release(region);
+									pci_free_consistent(nesdev->pcidev, 8192, root_vpbl.pbl_vbase,
+											root_vpbl.pbl_pbase);
+									pci_free_consistent(nesdev->pcidev, 4096, vpbl.pbl_vbase,
+											vpbl.pbl_pbase);
+									nes_free_resource(nesadapter, nesadapter->allocated_mrs,
+											stag_index);
+									kfree(nesmr);
+									nesmr = ERR_PTR(-ENOMEM);
+									goto reg_user_mr_err;
+								}
+								root_vpbl.pbl_vbase[0].pa_low =
+										cpu_to_le32((u32)vpbl.pbl_pbase);
+								root_vpbl.pbl_vbase[0].pa_high =
+										cpu_to_le32((u32)((((u64)vpbl.pbl_pbase) >> 32)));
+								root_vpbl.leaf_vpbl[0] = vpbl;
+							}
+							vpbl.pbl_vbase = pci_alloc_consistent(nesdev->pcidev, 4096,
+									&vpbl.pbl_pbase);
+							nes_debug(NES_DBG_MR, "Allocating leaf PBL, va = %p, pa = 0x%08X\n",
+									vpbl.pbl_vbase, (unsigned int)vpbl.pbl_pbase);
+							if (!vpbl.pbl_vbase) {
+								ib_umem_release(region);
+								nes_free_resource(nesadapter, nesadapter->allocated_mrs, stag_index);
+								ibmr = ERR_PTR(-ENOMEM);
+								kfree(nesmr);
+								goto reg_user_mr_err;
+							}
+							if (1 <= root_pbl_index) {
+								root_vpbl.pbl_vbase[root_pbl_index].pa_low =
+										cpu_to_le32((u32)vpbl.pbl_pbase);
+								root_vpbl.pbl_vbase[root_pbl_index].pa_high =
+										cpu_to_le32((u32)((((u64)vpbl.pbl_pbase)>>32)));
+								root_vpbl.leaf_vpbl[root_pbl_index] = vpbl;
+							}
+							root_pbl_index++;
+							cur_pbl_index = 0;
+						}
 						if (single_page) {
 							if (page_count != 0) {
-								if ((last_dma_addr+PAGE_SIZE) !=
+								if ((last_dma_addr+4096) !=
 										(sg_dma_address(&chunk->page_list[nmap_index])+
-										(page_index*PAGE_SIZE)))
+										(page_index*4096)))
 									single_page = 0;
 								last_dma_addr = sg_dma_address(&chunk->page_list[nmap_index])+
-										(page_index*PAGE_SIZE);
+										(page_index*4096);
 							} else {
 								first_dma_addr = sg_dma_address(&chunk->page_list[nmap_index])+
-										(page_index*PAGE_SIZE);
+										(page_index*4096);
 								last_dma_addr = first_dma_addr;
 							}
 						}
 
 						vpbl.pbl_vbase[cur_pbl_index].pa_low =
 								cpu_to_le32((u32)(sg_dma_address(&chunk->page_list[nmap_index])+
-								(page_index*PAGE_SIZE)));
+								(page_index*4096)));
 						vpbl.pbl_vbase[cur_pbl_index].pa_high =
 								cpu_to_le32((u32)((((u64)(sg_dma_address(&chunk->page_list[nmap_index])+
-								(page_index*PAGE_SIZE))) >> 32)));
+								(page_index*4096))) >> 32)));
 						cur_pbl_index++;
 						page_count++;
 					}
 				}
 			}
-
+			enough_pages:
 			nes_debug(NES_DBG_MR, "calculating stag, stag_index=0x%08x, driver_key=0x%08x,"
 					" stag_key=0x%08x\n",
 					stag_index, driver_key, stag_key);
@@ -2609,12 +2621,6 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 			}
 
 			iova_start = virt;
-			nes_debug(NES_DBG_MR, "Registering STag 0x%08X, VA = 0x%08X, length = 0x%08X,"
-					" index = 0x%08X, region->length=0x%08llx\n",
-					stag, (unsigned int)iova_start,
-					(unsigned int)region_length, stag_index,
-					(unsigned long long)region->length);
-
 			/* Make the leaf PBL the root if only one PBL */
 			if (root_pbl_index == 1) {
 				root_vpbl.pbl_pbase = vpbl.pbl_pbase;
@@ -2626,6 +2632,11 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 				pbl_count = root_pbl_index;
 				first_dma_addr = 0;
 			}
+			nes_debug(NES_DBG_MR, "Registering STag 0x%08X, VA = 0x%08X, length = 0x%08X,"
+					" index = 0x%08X, region->length=0x%08llx, pbl_count = %u\n",
+					stag, (unsigned int)iova_start,
+					(unsigned int)region_length, stag_index,
+					(unsigned long long)region->length, pbl_count);
 			ret = nes_reg_mr( nesdev, nespd, stag, region->length, &root_vpbl,
 					first_dma_addr, pbl_count, (u16)cur_pbl_index, acc, &iova_start);
 
@@ -2684,8 +2695,8 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 			}
 			nesmr->region = region;
 			nes_ucontext = to_nesucontext(pd->uobject->context);
-			pbl_depth = region->length >> PAGE_SHIFT;
-			pbl_depth += (region->length & ~PAGE_MASK) ? 1 : 0;
+			pbl_depth = region->length >> 12;
+			pbl_depth += (region->length & (4096-1)) ? 1 : 0;
 			nespbl->pbl_size = pbl_depth*sizeof(u64);
 			if (req.reg_type == IWNES_MEMREG_TYPE_QP) {
 				nes_debug(NES_DBG_MR, "Attempting to allocate QP PBL memory");
@@ -2714,15 +2725,16 @@ static struct ib_mr *nes_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 
 			list_for_each_entry(chunk, &region->chunk_list, list) {
 				for (nmap_index = 0; nmap_index < chunk->nmap; ++nmap_index) {
-					chunk_pages = sg_dma_len(&chunk->page_list[nmap_index]) >> PAGE_SHIFT;
+					chunk_pages = sg_dma_len(&chunk->page_list[nmap_index]) >> 12;
+					chunk_pages += (sg_dma_len(&chunk->page_list[nmap_index]) & (4096-1)) ? 1 : 0;
 					nespbl->page = sg_page(&chunk->page_list[0]);
 					for (page_index=0; page_index<chunk_pages; page_index++) {
 						((u32 *)pbl)[0] = cpu_to_le32((u32)
 								(sg_dma_address(&chunk->page_list[nmap_index])+
-								(page_index*PAGE_SIZE)));
+								(page_index*4096)));
 						((u32 *)pbl)[1] = cpu_to_le32(((u64)
 								(sg_dma_address(&chunk->page_list[nmap_index])+
-								(page_index*PAGE_SIZE)))>>32);
+								(page_index*4096)))>>32);
 						nes_debug(NES_DBG_MR, "pbl=%p, *pbl=0x%016llx, 0x%08x%08x\n", pbl,
 								(unsigned long long)*pbl,
 								le32_to_cpu(((u32 *)pbl)[1]), le32_to_cpu(((u32 *)pbl)[0]));


From glenn at lists.openfabrics.org  Thu Dec 20 12:39:20 2007
From: glenn at lists.openfabrics.org (Glenn Grundstrom NetEffect)
Date: Thu, 20 Dec 2007 12:39:20 -0800 (PST)
Subject: [ofa-general] [PATCH 10/10] nes: connection timeout and reset fix
Message-ID: <20071220203920.A4C4FE603AF@openfabrics.org>


Fix the timeout for connection setup and properly handle
connection reset in the event of a timeout.  Prior to this
patch, connections would time out too quickly and no reset
would occur.
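
To make the timeout change concrete, the sketch below evaluates the
old and new non-debug NES_RETRY_TIMEOUT expressions from the diff for
a given HZ.  The demo_* helpers are hypothetical; HZ is passed in as
a parameter here rather than taken from the kernel configuration.

```c
/* Old non-debug definition: (1000*HZ/10000), in jiffies. */
static long demo_old_retry_timeout(long hz)
{
	return 1000 * hz / 10000;
}

/* New non-debug definition: (3000*HZ/1000), in jiffies. */
static long demo_new_retry_timeout(long hz)
{
	return 3000 * hz / 1000;
}

/* Convert a jiffies count to milliseconds for a given HZ. */
static long demo_jiffies_to_ms(long jiffies, long hz)
{
	return jiffies * 1000 / hz;
}
```

For HZ values of 250 and 1000 the old expression works out to 100 ms
and the new one to 3000 ms.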

Signed-off-by: Glenn Grundstrom <ggrundstrom at neteffect.com>

---

diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index 1777769..bdca2dd 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -714,6 +714,7 @@ int send_reset(struct nes_cm_node *cm_node)
 {
 	int ret;
 	struct sk_buff *skb = get_free_pkt(cm_node);
+	int flags = SET_RST | SET_ACK;
 
 	if (!skb) {
 		nes_debug(NES_DBG_CM, "Failed to get a Free pkt\n");
@@ -721,7 +722,7 @@ int send_reset(struct nes_cm_node *cm_node)
 	}
 
 	add_ref_cm_node(cm_node);
-	form_cm_frame(skb, cm_node, NULL, 0, NULL, 0, SET_RST | SET_ACK);
+	form_cm_frame(skb, cm_node, NULL, 0, NULL, 0, flags);
 	ret = schedule_nes_timer(cm_node, skb, NES_TIMER_TYPE_SEND, 0, 1);
 
 	return ret;
@@ -1279,6 +1280,10 @@ int process_packet(struct nes_cm_node *cm_node, struct sk_buff *skb,
 	int ret = 0;
 	struct tcphdr *tcph = tcp_hdr(skb);
 	u32 inc_sequence;
+	if (cm_node->state == NES_CM_STATE_SYN_SENT && tcph->syn) {
+		inc_sequence = ntohl(tcph->seq);
+		cm_node->tcp_cntxt.rcv_nxt = inc_sequence;
+	}
 
 	if ((!tcph) || (cm_node->state == NES_CM_STATE_TSA)) {
 		BUG_ON(!tcph);
@@ -1337,7 +1342,8 @@ int process_packet(struct nes_cm_node *cm_node, struct sk_buff *skb,
 			cm_node->tcp_cntxt.rcv_nxt, (tcph->syn ? "SYN":""),
 			(tcph->ack ? "ACK":""));
 
-	if (!tcph->syn && (inc_sequence != cm_node->tcp_cntxt.rcv_nxt)) {
+	if (!tcph->syn && (inc_sequence != cm_node->tcp_cntxt.rcv_nxt)
+		) {
 		nes_debug(NES_DBG_CM, "dropping packet, datasize = %u, sequence = 0x%08X,"
 				" ack_seq = 0x%08X, rcv_nxt = 0x%08X Flags: %s.\n",
 				datasize, inc_sequence,ntohl(tcph->ack_seq),
@@ -1353,7 +1359,12 @@ int process_packet(struct nes_cm_node *cm_node, struct sk_buff *skb,
 
 	if (optionsize) {
 		u8 *optionsloc = (u8 *)&tcph[1];
-		process_options(cm_node, optionsloc, optionsize);
+		if (process_options(cm_node, optionsloc, optionsize, (u32)tcph->syn)) {
+			nes_debug(NES_DBG_CM, "%s: Node %p, Sending RESET\n", __FUNCTION__, cm_node);
+			send_reset(cm_node);
+			rem_ref_cm_node(cm_core, cm_node);
+			return 0;
+		}
 	}
 	else if (tcph->syn)
 		cm_node->tcp_cntxt.mss = NES_CM_DEFAULT_MSS;
@@ -1371,8 +1382,8 @@ int process_packet(struct nes_cm_node *cm_node, struct sk_buff *skb,
 			case NES_CM_STATE_SYN_RCVD:
 			case NES_CM_STATE_SYN_SENT:
 				/* read and stash current sequence number */
-				if (cm_node->tcp_cntxt.rem_ack_num > cm_node->tcp_cntxt.loc_seq_num) {
-					nes_debug(NES_DBG_CM, "ERROR - cm_node->tcp_cntxt.rem_ack_num >"
+				if (cm_node->tcp_cntxt.rem_ack_num != cm_node->tcp_cntxt.loc_seq_num) {
+					nes_debug(NES_DBG_CM, "ERROR - cm_node->tcp_cntxt.rem_ack_num !="
 							" cm_node->tcp_cntxt.loc_seq_num\n");
 					send_reset(cm_node);
 					return 0;
diff --git a/drivers/infiniband/hw/nes/nes_cm.h b/drivers/infiniband/hw/nes/nes_cm.h
index c511242..46f1dea 100644
--- a/drivers/infiniband/hw/nes/nes_cm.h
+++ b/drivers/infiniband/hw/nes/nes_cm.h
@@ -136,7 +136,7 @@ struct nes_timer_entry {
 #ifdef CONFIG_INFINIBAND_NES_DEBUG
 #define NES_RETRY_TIMEOUT   (1000*HZ/1000)
 #else
-#define NES_RETRY_TIMEOUT   (1000*HZ/10000)
+#define NES_RETRY_TIMEOUT   (3000*HZ/1000)
 #endif
 #define NES_SHORT_TIME      (10)
 #define NES_LONG_TIME       (2000*HZ/1000)
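
For reference, a small sketch of what the two non-debug NES_RETRY_TIMEOUT
expressions in the hunk above evaluate to. The helper names and the sample
HZ values are assumptions for illustration only; the expressions are copied
from the patch and use integer arithmetic, as the macro expansion would.

```c
/* Old non-debug value: (1000*HZ/10000) jiffies, i.e. HZ/10. */
static long nes_retry_timeout_old(long hz)
{
	return 1000 * hz / 10000;
}

/* New non-debug value: (3000*HZ/1000) jiffies, i.e. 3*HZ. */
static long nes_retry_timeout_new(long hz)
{
	return 3000 * hz / 1000;
}
```

Since HZ jiffies make up one second, the old expression is roughly a tenth
of a second and the new one is three seconds, whatever HZ is configured.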


From sean.hefty at intel.com  Thu Dec 20 12:25:30 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Thu, 20 Dec 2007 12:25:30 -0800
Subject: [ofa-general] lock dependency in ib_user_mad
Message-ID: <000101c84346$7766b2d0$9b37170a@amr.corp.intel.com>

I see hangs when killing opensm that look related to a bug in user_mad.c.  The
problem appears to be:

ib_umad_close()
	downgrade_write(&file->port->mutex)
	ib_unregister_mad_agent(...)
	up_read(&file->port->mutex)

ib_unregister_mad_agent() flushes any outstanding MADs, resulting in calls to
send_handler() and recv_handler(), both of which call queue_packet():

queue_packet()
	down_read(&file->port->mutex)
	...
	up_read(&file->port->mutex)

ib_umad_kill_port() has a similar issue as ib_umad_close().

Does anyone know the reasoning for holding the mutex around
ib_unregister_mad_agent()?

- Sean


From swise at opengridcomputing.com  Thu Dec 20 13:02:06 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 20 Dec 2007 15:02:06 -0600
Subject: [ofa-general] Re: iommu dma mapping alignment requirements
In-Reply-To: <1198181862.6779.3.camel@pasglop>
References: <476AA2E2.5010007@opengridcomputing.com>
	<1198181862.6779.3.camel@pasglop>
Message-ID: <476AD84E.4000507@opengridcomputing.com>

Benjamin Herrenschmidt wrote:
> Adding A few more people to the discussion. You may well be right and we
> would have to provide the same alignment, though that sucks a bit as one
> of the reason we switched to 4K for the IOMMU is that the iommu space
> available on pSeries is very small and we were running out of it with
> 64K pages and lots of networking activity.
> 

But smarter NIC drivers can resolve this too, I think, by perhaps
carving up full pages of mapped buffers instead of just assuming mapping
is free...

perhaps...


From or.gerlitz at gmail.com  Thu Dec 20 13:14:44 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Thu, 20 Dec 2007 23:14:44 +0200
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <476AB554.9080200@ichips.intel.com>
References: <4767A2CD.8030209@voltaire.com> <4768289F.6040907@ichips.intel.com>
	<4769019C.10602@voltaire.com> <47696002.4030903@ichips.intel.com>
	<476A857D.3090608@voltaire.com> <476AB554.9080200@ichips.intel.com>
Message-ID: <15ddcffd0712201314x2b064f65m3c4cbb6f0fe02a42@mail.gmail.com>

On 12/20/07, Sean Hefty <mshefty at ichips.intel.com> wrote:
>
> My thinking was that the peer to peer model would have both sides call
> connect only.  The peer to peer connection model only kicks in when both
> sides are in the REQ sent state.


Is this observation based on the wording used by the spec? If yes, can you
point to the sentence(s) that say so?

From my reading, I could not conclude that the following implementation is
against the spec: both sides listen and later set the peer to peer bit on the
REQ, such that --if-- there's a "matching" REQ for the incoming REQ, one side
sends REP and the other side ignores the incoming REQ, etc.

> > This makes the whole peer to peer model useless, since an app cannot make
> > sure that connections occur at exactly the same time!
>
> yep - (anyone can feel free to step in and set me straight on this...)
>
> > the spec is that peer to peer model has the ability to handle also
> > connections that occur at exactly the same time but not only.
>
> Peer to peer seems inherently racy to me


I understand that under TCP there's also a notion of peer to peer and
client/server connections. I'll take a look next week to see what the
foundations are over there.

Or

From or.gerlitz at gmail.com  Thu Dec 20 13:16:39 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Thu, 20 Dec 2007 23:16:39 +0200
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <C98692FD98048C41885E0B0FACD9DFB805BD355A@exnane01.hq.netapp.com>
References: <4767A2CD.8030209@voltaire.com> <4768289F.6040907@ichips.intel.com>
	<4769019C.10602@voltaire.com> <47696002.4030903@ichips.intel.com>
	<476A857D.3090608@voltaire.com>
	<C98692FD98048C41885E0B0FACD9DFB805BD355A@exnane01.hq.netapp.com>
Message-ID: <15ddcffd0712201316o1d73ebd4k296958d8fb4b0812@mail.gmail.com>

On 12/20/07, Kanevsky, Arkady <Arkady.Kanevsky at netapp.com> wrote:
>
> So in a nutshell the proposal is to add some identifier into "CM private
> data" which indicates that it is the peer-to-peer model, and unique peer IDs
> for the requested connection.
> Is this the model?
>

For the time being, I am trying to understand whether in the peer to peer
model both sides issue a listen before connecting. Without this listen,
peer-to-peer does not seem usable to me. What's your understanding of the
spec?

Or.

From swise at opengridcomputing.com  Thu Dec 20 13:22:13 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 20 Dec 2007 15:22:13 -0600
Subject: [ofa-general] iommu dma mapping alignment requirements
In-Reply-To: <1198182061.6779.7.camel@pasglop>
References: <476AA2E2.5010007@opengridcomputing.com>	
	<adalk7pz1ck.fsf@cisco.com>
	<476ABE60.9030805@opengridcomputing.com>	
	<476AC2A5.8060200@opengridcomputing.com>
	<1198182061.6779.7.camel@pasglop>
Message-ID: <476ADD05.5090801@opengridcomputing.com>

Benjamin Herrenschmidt wrote:
> On Thu, 2007-12-20 at 13:29 -0600, Steve Wise wrote:
> 
>> Or based on the alignment of vaddr actually...
> 
> The latter wouldn't be realistic. What I think might be necessary, though
> it would definitely cause us problems with running out of iommu space
> (which is the reason we did the switch down to 4K), is to provide
> alignment to the real page size, and alignment to the allocation order
> for dma_map_consistent.
> 
> It might be possible to -tweak- and only provide alignment to the page
> size for allocations that are larger than IOMMU_PAGE_SIZE. That would
> solve the problem with small network packets eating up too much iommu
> space though.
> 
> What do you think ?

That might work.

If you gimme a patch, i'll try it out!

Steve.


From Arkady.Kanevsky at netapp.com  Thu Dec 20 13:22:44 2007
From: Arkady.Kanevsky at netapp.com (Kanevsky, Arkady)
Date: Thu, 20 Dec 2007 16:22:44 -0500
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <15ddcffd0712201316o1d73ebd4k296958d8fb4b0812@mail.gmail.com>
References: <4767A2CD.8030209@voltaire.com>
	<4768289F.6040907@ichips.intel.com><4769019C.10602@voltaire.com>
	<47696002.4030903@ichips.intel.com><476A857D.3090608@voltaire.com><C98692FD98048C41885E0B0FACD9DFB805BD355A@exnane01.hq.netapp.com>
	<15ddcffd0712201316o1d73ebd4k296958d8fb4b0812@mail.gmail.com>
Message-ID: <C98692FD98048C41885E0B0FACD9DFB805BD36DB@exnane01.hq.netapp.com>

Yes.
The question is who issues it?
It can be done by the CM and not ULP.
Looking way back at VIPL, there was a peer-to-peer model with an API
similar to the one which Shane outlines.
Thanks, 
 

Arkady Kanevsky                       email: arkady at netapp.com
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.        Fax: 781-895-1195
Waltham, MA 02451                   central phone: 781-768-5300

 
________________________________

	From: Or Gerlitz [mailto:or.gerlitz at gmail.com] 
	Sent: Thursday, December 20, 2007 4:17 PM
	To: Kanevsky, Arkady
	Cc: OpenFabrics General
	Subject: Re: [ofa-general] peer to peer connections support
	
	
	On 12/20/07, Kanevsky, Arkady <Arkady.Kanevsky at netapp.com>
wrote: 
	

		SO in a nutshell the proposal is to add some identifier
into "CM private data" which indicate that it is peer-to-peer model, and
unique peers IDs for the requested connection.
		Is this the model?
		

	For the time being, I try to understand if in the peer to peer
model both sides issue a listen before connecting or not. Without this
listen the peer-to-peer does not seems usable to me, what's your
understanding of the spec?
	
	Or.
	


From benh at au1.ibm.com  Thu Dec 20 13:26:57 2007
From: benh at au1.ibm.com (Benjamin Herrenschmidt)
Date: Fri, 21 Dec 2007 08:26:57 +1100
Subject: [ofa-general] Re: iommu dma mapping alignment requirements
In-Reply-To: <476AD84E.4000507@opengridcomputing.com>
References: <476AA2E2.5010007@opengridcomputing.com>
	<1198181862.6779.3.camel@pasglop>
	<476AD84E.4000507@opengridcomputing.com>
Message-ID: <1198186017.6779.28.camel@pasglop>


On Thu, 2007-12-20 at 15:02 -0600, Steve Wise wrote:
> Benjamin Herrenschmidt wrote:
> > Adding A few more people to the discussion. You may well be right and we
> > would have to provide the same alignment, though that sucks a bit as one
> > of the reason we switched to 4K for the IOMMU is that the iommu space
> > available on pSeries is very small and we were running out of it with
> > 64K pages and lots of networking activity.
> > 
> 
> But smarter NIC drivers can resolve this too, I think, but perhaps 
> carving up full pages of mapped buffers instead of just assuming mapping 
> is free...

True, but the problem still happens today: if we switch back to 64K
iommu page size (which should be possible, I need to fix that), we
-will- run out of iommu space on typical workloads and that is not
acceptable.

So we need to find a compromise.

What I might do is something along the lines of: If size >= PAGE_SIZE,
and vaddr (page_address + offset) is PAGE_SIZE aligned, then I enforce
alignment of the resulting mapping.

That should fix your case. Anything requesting smaller than PAGE_SIZE
mappings would lose that alignment but I -think- it should be safe, and
you still always get 4K alignment anyway (+/- your offset) so at least
small alignment restrictions are still enforced (such as cache line
alignment etc...).
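
A minimal sketch of the rule described above, written as plain userspace C
rather than kernel code. The page-size constants (64K host pages over 4K
IOMMU pages) and the helper name iommu_align_order() are assumptions chosen
for illustration; the result is the alignment expressed as a power-of-two
count of IOMMU pages.

```c
#include <stddef.h>

/* Illustrative values only: 64K host pages over 4K IOMMU pages. */
#define PAGE_SHIFT        16
#define PAGE_SIZE         (1UL << PAGE_SHIFT)
#define PAGE_MASK         (~(PAGE_SIZE - 1))
#define IOMMU_PAGE_SHIFT  12

/*
 * Hypothetical helper: alignment order, in IOMMU pages, to request for a
 * mapping.  A mapping of at least one host page that starts on a host-page
 * boundary gets host-page alignment; anything else keeps the default
 * IOMMU-page alignment (order 0).
 */
static unsigned int iommu_align_order(unsigned long vaddr, size_t size)
{
	if (IOMMU_PAGE_SHIFT < PAGE_SHIFT &&
	    size >= PAGE_SIZE &&
	    (vaddr & ~PAGE_MASK) == 0)
		return PAGE_SHIFT - IOMMU_PAGE_SHIFT;
	return 0;
}
```

With these sample constants, a 64K buffer at a 64K-aligned address asks for
order 4 (16 IOMMU pages), while a 2K buffer or a buffer starting at a 4K
offset asks for order 0.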

I'll send you a test patch later today.

Ben.


From or.gerlitz at gmail.com  Thu Dec 20 13:29:08 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Thu, 20 Dec 2007 23:29:08 +0200
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <C98692FD98048C41885E0B0FACD9DFB805BD36DB@exnane01.hq.netapp.com>
References: <4767A2CD.8030209@voltaire.com> <4768289F.6040907@ichips.intel.com>
	<4769019C.10602@voltaire.com> <47696002.4030903@ichips.intel.com>
	<476A857D.3090608@voltaire.com>
	<C98692FD98048C41885E0B0FACD9DFB805BD355A@exnane01.hq.netapp.com>
	<15ddcffd0712201316o1d73ebd4k296958d8fb4b0812@mail.gmail.com>
	<C98692FD98048C41885E0B0FACD9DFB805BD36DB@exnane01.hq.netapp.com>
Message-ID: <15ddcffd0712201329h42beee3fl833eb4cb62382417@mail.gmail.com>

On 12/20/07, Kanevsky, Arkady <Arkady.Kanevsky at netapp.com> wrote:
>
>  Yes.
> The question is who issues it?
> It can be done by the CM and not ULP.
> Looking way back at VIPL there was a peer-to-peer model with the API
> similar to the
> one which Shane outlines.
>

If the CM issues the listen, it means I can connect to you only if you
try to connect to me NOW. My understanding is that this is a useless protocol,
but I will be happy to hear why I am wrong.

The IB stack co-maintainer's name is Sean.

Or.

From mshefty at ichips.intel.com  Thu Dec 20 13:32:10 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Thu, 20 Dec 2007 13:32:10 -0800
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <15ddcffd0712201314x2b064f65m3c4cbb6f0fe02a42@mail.gmail.com>
References: <4767A2CD.8030209@voltaire.com>
	<4768289F.6040907@ichips.intel.com>	 <4769019C.10602@voltaire.com>
	<47696002.4030903@ichips.intel.com>	
	<476A857D.3090608@voltaire.com> <476AB554.9080200@ichips.intel.com>
	<15ddcffd0712201314x2b064f65m3c4cbb6f0fe02a42@mail.gmail.com>
Message-ID: <476ADF5A.6080200@ichips.intel.com>

>     My thinking was that the peer to peer model would have both sides call
>     connect only.  The peer to peer connection model only kicks in when both
>     sides are in the REQ sent state.
> 
> 
> Is this observation based on the wording used by the spec? if yes, can 
> you point on the sentence/s that does it?

I was basing this on the active side state diagram.

> From my reading, I could not conclude that implementing it in a way 
> that both sides do listen and later set the peer to peer bit on the REQ 
> such that --if-- there's a "matching" REQ for the incoming REQ one side 
> sends REP and the other side ignores the incoming REQ etc - is against 
> the spec.

This leaves a window where a REQ can be received between the call to
listen and the call to connect.

> I understand that under TCP there's also a notion of peer to peer and 
> client/server connections, I'll give it a look next week to see what's 
> the foundations over there.

That would be good.  I think the difference is that with TCP the 
listener and connector would be distinguished by their local ports, so 
it's clear how to direct an incoming request.  IB doesn't assign a SID to 
a connector.  I'm not sure how TCP handles the race condition.

- Sean


From Arkady.Kanevsky at netapp.com  Thu Dec 20 13:33:13 2007
From: Arkady.Kanevsky at netapp.com (Kanevsky, Arkady)
Date: Thu, 20 Dec 2007 16:33:13 -0500
Subject: [ofa-general] peer to peer connections support
In-Reply-To: <15ddcffd0712201329h42beee3fl833eb4cb62382417@mail.gmail.com>
References: <4767A2CD.8030209@voltaire.com>
	<4768289F.6040907@ichips.intel.com><4769019C.10602@voltaire.com>
	<47696002.4030903@ichips.intel.com><476A857D.3090608@voltaire.com><C98692FD98048C41885E0B0FACD9DFB805BD355A@exnane01.hq.netapp.com><15ddcffd0712201316o1d73ebd4k296958d8fb4b0812@mail.gmail.com><C98692FD98048C41885E0B0FACD9DFB805BD36DB@exnane01.hq.netapp.com>
	<15ddcffd0712201329h42beee3fl833eb4cb62382417@mail.gmail.com>
Message-ID: <C98692FD98048C41885E0B0FACD9DFB805BD36E4@exnane01.hq.netapp.com>

What is the difference between the ULP not yet issuing a listen vs. the
ULP not yet issuing a peer-to-peer connect, which does the listen under the
covers? In either case, a connection request that arrives from the other
side first will be rejected by the CM since nobody is listening.
 
Arkady
P.S. Sean, my apologies.
 

Arkady Kanevsky                       email: arkady at netapp.com
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.        Fax: 781-895-1195
Waltham, MA 02451                   central phone: 781-768-5300

 
________________________________

	From: Or Gerlitz [mailto:or.gerlitz at gmail.com] 
	Sent: Thursday, December 20, 2007 4:29 PM
	To: Kanevsky, Arkady
	Cc: OpenFabrics General
	Subject: Re: [ofa-general] peer to peer connections support
	
	
	On 12/20/07, Kanevsky, Arkady <Arkady.Kanevsky at netapp.com>
wrote: 
	

		Yes.
		The question is who issues it?
		It can be done by the CM and not ULP.
		Looking way back at VIPL there was a peer-to-peer model
with the API similar to the
		one which Shane outlines.


	If the CM issues the listen its means I can connect to you now
only if you try to connect to me NOW, my understanding is that this is
useless protocol, but I will be happy to hear why I am wrong.
	
	The IB stack co maintainer name is Sean 
	
	Or.
	


From rdreier at cisco.com  Thu Dec 20 13:35:17 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 20 Dec 2007 13:35:17 -0800
Subject: [ofa-general] lock dependency in ib_user_mad
In-Reply-To: <000101c84346$7766b2d0$9b37170a@amr.corp.intel.com> (Sean Hefty's
	message of "Thu, 20 Dec 2007 12:25:30 -0800")
References: <000101c84346$7766b2d0$9b37170a@amr.corp.intel.com>
Message-ID: <ada8x3pyrqi.fsf@cisco.com>

 > I see hangs killing opensm related to a bug in user_mad.c.  The problem appears
 > to be:
 > 
 > ib_umad_close()
 > 	downgrade_write(&file->port->mutex)
 > 	ib_unregister_mad_agent(...)
 > 	up_read(&file->port->mutex)
 > 
 > ib_unregister_mad_agent() flushes any outstanding MADs, resulting in calls to
 > send_handler() and recv_handler(), both of which call queue_packet():
 > 
 > queue_packet()
 > 	down_read(&file->port->mutex)
 > 	...
 > 	up_read(&file->port->mutex)

This should be fine (and comes from an earlier set of changes to fix
deadlocks): ib_umad_close() does a downgrade_write() before calling
ib_unregister_mad_agent(), so it only holds the mutex with a read
lock, which means that queue_packet() should be able to take another
read lock.

Unless there's something that prevents one thread from taking a read
lock twice?  What kernel are you seeing these problems with?
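
One general hazard with nested read locks, sketched as a toy userspace
model. The toy_* names and the struct are hypothetical, and this is not
claimed to be the cause of the hang reported here. The model assumes a fair
reader-writer semaphore, where a reader arriving while a writer is queued
has to wait behind it, so a thread re-taking a read lock it already holds
can block if a writer queued in between.

```c
#include <stdbool.h>

/* Toy model of a fair reader-writer semaphore; not kernel code. */
struct toy_rwsem {
	int  readers;          /* read locks currently held */
	bool writer_active;    /* a writer holds the lock */
	int  writers_waiting;  /* writers queued behind current holders */
};

/* A new reader is admitted only if no writer holds the lock or is queued. */
static bool toy_down_read_trylock(struct toy_rwsem *s)
{
	if (s->writer_active || s->writers_waiting > 0)
		return false;  /* a blocking read lock would sleep here */
	s->readers++;
	return true;
}

/* A writer that cannot take the lock immediately joins the wait queue. */
static bool toy_down_write_or_queue(struct toy_rwsem *s)
{
	if (s->readers == 0 && !s->writer_active) {
		s->writer_active = true;
		return true;
	}
	s->writers_waiting++;
	return false;
}
```

In this model two nested read locks succeed when no writer is around, but
once a writer has queued behind the first read lock, the second read
attempt is refused, which for a blocking lock means waiting on a writer
that is itself waiting for the first read lock to be released.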

 > Does anyone know the reasoning for holding the mutex around
 > ib_unregister_mad_agent()?

It's to keep things serialized against a port disappearing because a
device is being removed.  But looking at things, I think we can
probably rejigger the locking to make things simpler, and avoid the
use of downgrade_write(), which the -rt people don't like.

 - R.


From swise at opengridcomputing.com  Thu Dec 20 14:12:25 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 20 Dec 2007 16:12:25 -0600
Subject: [ofa-general] Re: iommu dma mapping alignment requirements
In-Reply-To: <1198186017.6779.28.camel@pasglop>
References: <476AA2E2.5010007@opengridcomputing.com>	
	<1198181862.6779.3.camel@pasglop>
	<476AD84E.4000507@opengridcomputing.com>
	<1198186017.6779.28.camel@pasglop>
Message-ID: <476AE8C9.9080601@opengridcomputing.com>

Benjamin Herrenschmidt wrote:
> On Thu, 2007-12-20 at 15:02 -0600, Steve Wise wrote:
>> Benjamin Herrenschmidt wrote:
>>> Adding A few more people to the discussion. You may well be right and we
>>> would have to provide the same alignment, though that sucks a bit as one
>>> of the reason we switched to 4K for the IOMMU is that the iommu space
>>> available on pSeries is very small and we were running out of it with
>>> 64K pages and lots of networking activity.
>>>
>> But smarter NIC drivers can resolve this too, I think, but perhaps 
>> carving up full pages of mapped buffers instead of just assuming mapping 
>> is free...
> 
> True, but the problem still happens today: if we switch back to 64K
> iommu page size (which should be possible, I need to fix that), we
> -will- run out of iommu space on typical workloads and that is not
> acceptable.
> 
> So we need to find a compromise.
> 
> What I might do is something along the lines of: If size >= PAGE_SIZE,
> and vaddr (page_address + offset) is PAGE_SIZE aligned, then I enforce
> alignment of the resulting mapping.
> 
> That should fix your case. Anything requesting smaller than PAGE_SIZE
> mappings would lose that alignment but I -think- it should be safe, and
> you still always get 4K alignment anyway (+/- your offset) so at least
> small alignment restrictions are still enforced (such as cache line
> alignment etc...).
> 
> I'll send you a test patch later today.
> 
> Ben.
> 

Sounds good.  Thanks!

Note that these smaller sub-host-page-sized mappings might pollute the 
address space, causing fully aligned host-page-size maps to become 
scarce...  Maybe there's a clever way to keep those in their own segment 
of the address space?


From pradeeps at linux.vnet.ibm.com  Thu Dec 20 15:53:41 2007
From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana)
Date: Thu, 20 Dec 2007 15:53:41 -0800
Subject: [ofa-general] [PATCH] [RFC] IPOIB/CM Enable SRQ support on HCAs with
 less than 16 SG entries
Message-ID: <476B0085.2080400@linux.vnet.ibm.com>

Some HCAs like ehca2 support fewer than 16 SG entries. Currently IPoIB/CM
implicitly assumes all HCAs will support 16 SG entries of 4K pages for 64K 
MTUs. This patch removes that restriction.

This patch continues to use order 0 allocations and enables implementation of 
connected mode on such HCAs with smaller MTUs. HCAs having the capability to 
support 16 SG entries are left untouched.
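
As a rough illustration of how the connected-mode MTU scales with the number
of SG entries, here is a small userspace sketch. The helper name
cm_mtu_for_sge() and the CM_MTU_PAD macro are hypothetical; the arithmetic
(SG entries times page size, minus a 0x10 pad) mirrors the max_cm_mtu
computation in the patch below, with the page size passed in as a parameter.

```c
#include <assert.h>

#define CM_MTU_PAD 0x10u  /* same pad the patch subtracts */

/* Hypothetical helper: largest connected-mode MTU for a given SGE count. */
static unsigned int cm_mtu_for_sge(unsigned int max_sge, unsigned int page_size)
{
	assert(max_sge > 0 && page_size > CM_MTU_PAD);
	return max_sge * page_size - CM_MTU_PAD;
}
```

With 4K pages, 16 SG entries give 65520 bytes, consistent with the 64K MTU
case described above, while an HCA limited to 4 SG entries would be capped
at 16368 bytes.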

This patch addresses bug# 728:
https://bugs.openfabrics.org/show_bug.cgi?id=728

While working on this patch I discovered that mthca reports an incorrect
value of max_srq_sge. I reported this issue several weeks ago as well.
I worked around it by using a hard-coded value of 16 for max_srq_sge
(mthca only). More on that in a following mail.

Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
---

--- a/drivers/infiniband/ulp/ipoib/ipoib.h	2007-11-03 11:37:02.000000000 -0700
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h	2007-12-20 13:17:43.000000000 -0800
@@ -466,6 +466,7 @@ void ipoib_drain_cq(struct net_device *d
 #define IPOIB_CM_SUPPORTED(ha)   (ha[0] & (IPOIB_FLAGS_RC))
 
 extern int ipoib_max_conn_qp;
+extern int max_cm_mtu;
 
 static inline int ipoib_cm_admin_enabled(struct net_device *dev)
 {
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2007-11-21 07:46:35.000000000 -0800
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2007-12-20 14:47:13.000000000 -0800
@@ -74,6 +74,9 @@ static struct ib_send_wr ipoib_cm_rx_dra
 	.opcode = IB_WR_SEND,
 };
 
+static int num_of_frags;
+int max_cm_mtu;
+
 static int ipoib_cm_tx_handler(struct ib_cm_id *cm_id,
 			       struct ib_cm_event *event);
 
@@ -96,13 +99,13 @@ static int ipoib_cm_post_receive_srq(str
 
 	priv->cm.rx_wr.wr_id = id | IPOIB_OP_CM | IPOIB_OP_RECV;
 
-	for (i = 0; i < IPOIB_CM_RX_SG; ++i)
+	for (i = 0; i < num_of_frags; ++i)
 		priv->cm.rx_sge[i].addr = priv->cm.srq_ring[id].mapping[i];
 
 	ret = ib_post_srq_recv(priv->cm.srq, &priv->cm.rx_wr, &bad_wr);
 	if (unlikely(ret)) {
 		ipoib_warn(priv, "post srq failed for buf %d (%d)\n", id, ret);
-		ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1,
+		ipoib_cm_dma_unmap_rx(priv, num_of_frags - 1,
 				      priv->cm.srq_ring[id].mapping);
 		dev_kfree_skb_any(priv->cm.srq_ring[id].skb);
 		priv->cm.srq_ring[id].skb = NULL;
@@ -623,6 +626,7 @@ repost:
 			--p->recv_count;
 			ipoib_warn(priv, "ipoib_cm_post_receive_nonsrq failed "
 				   "for buf %d\n", wr_id);
+		kfree(mapping); /*** Check if this needed ***/
 		}
 	}
 }
@@ -1399,16 +1403,17 @@ int ipoib_cm_add_mode_attr(struct net_de
 	return device_create_file(&dev->dev, &dev_attr_mode);
 }
 
-static void ipoib_cm_create_srq(struct net_device *dev)
+static void ipoib_cm_create_srq(struct net_device *dev, int max_sge)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	struct ib_srq_init_attr srq_init_attr = {
 		.attr = {
 			.max_wr  = ipoib_recvq_size,
-			.max_sge = IPOIB_CM_RX_SG
 		}
 	};
 
+	srq_init_attr.attr.max_sge = max_sge;
+
 	priv->cm.srq = ib_create_srq(priv->pd, &srq_init_attr);
 	if (IS_ERR(priv->cm.srq)) {
 		if (PTR_ERR(priv->cm.srq) != -ENOSYS)
@@ -1418,6 +1423,7 @@ static void ipoib_cm_create_srq(struct n
 		return;
 	}
 
+
 	priv->cm.srq_ring = kzalloc(ipoib_recvq_size * sizeof *priv->cm.srq_ring,
 				    GFP_KERNEL);
 	if (!priv->cm.srq_ring) {
@@ -1431,7 +1437,9 @@ static void ipoib_cm_create_srq(struct n
 int ipoib_cm_dev_init(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
-	int i;
+	int i, ret;
+	struct ib_srq_attr srq_attr;
+	struct ib_device_attr attr;
 
 	INIT_LIST_HEAD(&priv->cm.passive_ids);
 	INIT_LIST_HEAD(&priv->cm.reap_list);
@@ -1448,22 +1456,46 @@ int ipoib_cm_dev_init(struct net_device 
 
 	skb_queue_head_init(&priv->cm.skb_queue);
 
-	for (i = 0; i < IPOIB_CM_RX_SG; ++i)
+	ret = ib_query_device(priv->ca, &attr);
+	if (ret) {
+		printk(KERN_WARNING "ib_query_device() failed with %d\n", ret);
+		return ret;
+	}
+
+	ipoib_dbg(priv, "max_srq_sge=%d\n", attr.max_srq_sge);
+
+	ipoib_cm_create_srq(dev, attr.max_srq_sge);
+
+	if (ipoib_cm_has_srq(dev)) {
+		ret = ib_query_srq(priv->cm.srq, &srq_attr);
+		if (ret) {
+			printk(KERN_WARNING "ib_query_srq() failed with %d\n", ret);
+			return -EINVAL;
+		}
+		/* pad similar to IPOIB_CM_MTU */
+		max_cm_mtu = srq_attr.max_sge * PAGE_SIZE - 0x10;
+		num_of_frags = srq_attr.max_sge;
+		ipoib_dbg(priv, "max_cm_mtu = 0x%x, num_of_frags=%d\n",
+			  max_cm_mtu, num_of_frags);
+	} else {
+		max_cm_mtu = IPOIB_CM_MTU;
+		num_of_frags  = IPOIB_CM_RX_SG;
+	}
+
+	for (i = 0; i < num_of_frags; ++i)
 		priv->cm.rx_sge[i].lkey	= priv->mr->lkey;
 
 	priv->cm.rx_sge[0].length = IPOIB_CM_HEAD_SIZE;
-	for (i = 1; i < IPOIB_CM_RX_SG; ++i)
+	for (i = 1; i < num_of_frags; ++i)
 		priv->cm.rx_sge[i].length = PAGE_SIZE;
 	priv->cm.rx_wr.next = NULL;
 	priv->cm.rx_wr.sg_list = priv->cm.rx_sge;
-	priv->cm.rx_wr.num_sge = IPOIB_CM_RX_SG;
-
-	ipoib_cm_create_srq(dev);
+	priv->cm.rx_wr.num_sge = num_of_frags;
 
 	if (ipoib_cm_has_srq(dev)) {
 		for (i = 0; i < ipoib_recvq_size; ++i) {
 			if (!ipoib_cm_alloc_rx_skb(dev, priv->cm.srq_ring, i,
-						   IPOIB_CM_RX_SG - 1,
+						   num_of_frags - 1,
 						   priv->cm.srq_ring[i].mapping)) {
 				ipoib_warn(priv, "failed to allocate "
 					   "receive buffer %d\n", i);
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-12-19 14:02:15.000000000 -0800
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-12-20 13:17:43.000000000 -0800
@@ -182,12 +182,15 @@ static int ipoib_change_mtu(struct net_d
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 
 	/* dev->mtu > 2K ==> connected mode */
-	if (ipoib_cm_admin_enabled(dev) && new_mtu <= IPOIB_CM_MTU) {
-		if (new_mtu > priv->mcast_mtu)
-			ipoib_warn(priv, "mtu > %d will cause multicast packet drops.\n",
+	if (ipoib_cm_admin_enabled(dev)) {
+		if (new_mtu <= max_cm_mtu) {
+			if (new_mtu > priv->mcast_mtu)
+				ipoib_warn(priv, "mtu > %d will cause multicast packet drops.\n",
 				   priv->mcast_mtu);
-		dev->mtu = new_mtu;
-		return 0;
+			dev->mtu = new_mtu;
+			return 0;
+		} else
+			return -EINVAL;
 	}
 
 	if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) {


From benh at au1.ibm.com  Thu Dec 20 15:49:45 2007
From: benh at au1.ibm.com (Benjamin Herrenschmidt)
Date: Fri, 21 Dec 2007 10:49:45 +1100
Subject: [ofa-general] Re: iommu dma mapping alignment requirements
In-Reply-To: <476AE8C9.9080601@opengridcomputing.com>
References: <476AA2E2.5010007@opengridcomputing.com>
	<1198181862.6779.3.camel@pasglop>
	<476AD84E.4000507@opengridcomputing.com>
	<1198186017.6779.28.camel@pasglop>
	<476AE8C9.9080601@opengridcomputing.com>
Message-ID: <1198194585.6779.31.camel@pasglop>


> Sounds good.  Thanks!
> 
> Note, that these smaller sub-host-page-sized mappings might pollute the 
> address space causing full aligned host-page-size maps to become 
> scarce...  Maybe there's a clever way to keep those in their own segment 
> of the address space?

We already have a large vs. small split in the iommu virtual space to
alleviate this (though it's not a hard constraint, we can still get
into the "other" side if the default one is full).

Try this patch and let me know:

Index: linux-work/arch/powerpc/kernel/iommu.c
===================================================================
--- linux-work.orig/arch/powerpc/kernel/iommu.c	2007-12-21 10:39:39.000000000 +1100
+++ linux-work/arch/powerpc/kernel/iommu.c	2007-12-21 10:46:18.000000000 +1100
@@ -278,6 +278,7 @@ int iommu_map_sg(struct iommu_table *tbl
 	unsigned long flags;
 	struct scatterlist *s, *outs, *segstart;
 	int outcount, incount, i;
+	unsigned int align;
 	unsigned long handle;
 
 	BUG_ON(direction == DMA_NONE);
@@ -309,7 +310,11 @@ int iommu_map_sg(struct iommu_table *tbl
 		/* Allocate iommu entries for that segment */
 		vaddr = (unsigned long) sg_virt(s);
 		npages = iommu_num_pages(vaddr, slen);
-		entry = iommu_range_alloc(tbl, npages, &handle, mask >> IOMMU_PAGE_SHIFT, 0);
+		align = 0;
+		if (IOMMU_PAGE_SHIFT < PAGE_SHIFT && (vaddr & ~PAGE_MASK) == 0)
+			align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
+		entry = iommu_range_alloc(tbl, npages, &handle,
+					  mask >> IOMMU_PAGE_SHIFT, align);
 
 		DBG("  - vaddr: %lx, size: %lx\n", vaddr, slen);
 
@@ -572,7 +577,7 @@ dma_addr_t iommu_map_single(struct iommu
 {
 	dma_addr_t dma_handle = DMA_ERROR_CODE;
 	unsigned long uaddr;
-	unsigned int npages;
+	unsigned int npages, align;
 
 	BUG_ON(direction == DMA_NONE);
 
@@ -580,8 +585,13 @@ dma_addr_t iommu_map_single(struct iommu
 	npages = iommu_num_pages(uaddr, size);
 
 	if (tbl) {
+		align = 0;
+		if (IOMMU_PAGE_SHIFT < PAGE_SHIFT &&
+		    ((unsigned long)vaddr & ~PAGE_MASK) == 0)
+			align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
+
 		dma_handle = iommu_alloc(tbl, vaddr, npages, direction,
-					 mask >> IOMMU_PAGE_SHIFT, 0);
+					 mask >> IOMMU_PAGE_SHIFT, align);
 		if (dma_handle == DMA_ERROR_CODE) {
 			if (printk_ratelimit())  {
 				printk(KERN_INFO "iommu_alloc failed, "


From mshefty at ichips.intel.com  Thu Dec 20 16:08:59 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Thu, 20 Dec 2007 16:08:59 -0800
Subject: [ofa-general] lock dependency in ib_user_mad
In-Reply-To: <ada8x3pyrqi.fsf@cisco.com>
References: <000101c84346$7766b2d0$9b37170a@amr.corp.intel.com>
	<ada8x3pyrqi.fsf@cisco.com>
Message-ID: <476B041B.7060909@ichips.intel.com>

> This should be fine (and comes from an earlier set of changes to fix
> deadlocks): ib_umad_close() does a downgrade_write() before calling
> ib_unregister_mad_agent(), so it only holds the mutex with a read
> lock, which means that queue_packet() should be able to take another
> read lock.

I'll see if I can reproduce and get more info.  I thought the mutex was 
contributing to the hang, but you're right.

> Unless there's something that prevents one thread from taking a read
> lock twice?  What kernel are you seeing these problems with?

I'm running 2.6.24-rc3.

I'm out on vacation through the end of the year, so I'm not sure if I'll 
be able to debug this further for a couple of weeks.

- Sean


From pradeeps at linux.vnet.ibm.com  Thu Dec 20 16:11:50 2007
From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana)
Date: Thu, 20 Dec 2007 16:11:50 -0800
Subject: [ofa-general] Oops in mthca
Message-ID: <476B04C6.1040803@linux.vnet.ibm.com>

I discovered the following Oops while developing a patch to enable SRQ on HCAs with fewer than
16 SG elements.

The root of this issue appears to be that ib_query_device(priv->ca, &attr)
reports an incorrect value for attr.max_srq_sge. The value that
ib_query_device returns is 28 (instead of 16 that I expected).


Dec 20 13:19:47 elm3b39 kernel: Oops: Kernel access of bad area, sig: 11 [#2]
Dec 20 13:19:47 elm3b39 kernel: SMP NR_CPUS=128 NUMA pSeries
Dec 20 13:19:47 elm3b39 kernel: Modules linked in: ib_ipoib autofs4 rdma_ucm rdma_cm ib_addr iw_cm ib_uverbs ib_umad ib_mthca ib_cm ib_sa ib_mad ib_core ipv6 binfmt_misc parport_pc lp parport sg e1000 dm_snapshot dm_zero dm_mirror dm_mod ipr libata firmware_class sd_mod scsi_mod ehci_hcd ohci_hcd usbcore
Dec 20 13:19:47 elm3b39 kernel: NIP: d0000000002ffb60 LR: d0000000002ffb08 CTR: c00000000043a9b0
Dec 20 13:19:47 elm3b39 kernel: REGS: c0000001d05ff2e0 TRAP: 0300   Tainted: G      D  (2.6.24-rc5)
Dec 20 13:19:47 elm3b39 kernel: MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 24024424  XER: 00000010
Dec 20 13:19:47 elm3b39 kernel: DAR: 0000000060bf0008, DSISR: 0000000040000000
Dec 20 13:19:47 elm3b39 kernel: TASK = c0000001d2e4a000[8233] 'modprobe' THREAD: c0000001d05fc000 CPU: 4
Dec 20 13:19:47 elm3b39 kernel: GPR00: 0000000000000001 c0000001d05ff560 d000000000320308 c0000001d2e54010
Dec 20 13:19:47 elm3b39 kernel: GPR04: 0000000000000000 0000000000000001 c0000001d0654000 0000000000000001
Dec 20 13:19:47 elm3b39 kernel: GPR08: 0000000000000000 000000000000001c 0000000060bf0000 0000000060bf0000
Dec 20 13:19:47 elm3b39 kernel: GPR12: d000000000301fc8 c00000000057f600 d0000000005a2090 d0000000005a20d0
Dec 20 13:19:47 elm3b39 kernel: GPR16: 0000000000000000 00000000000001e3 00000000000001e3 d00000000032eba0
Dec 20 13:19:47 elm3b39 kernel: GPR20: 0000000000000000 0000000000000034 c0000001d05ff690 0000000000000001
Dec 20 13:19:47 elm3b39 kernel: GPR24: c0000000e482b000 0000000000000000 0000000000000000 0000000000000000
Dec 20 13:19:47 elm3b39 kernel: GPR28: c0000001d2972c00 0000000000000000 d00000000031f190 c0000001d020ee78
Dec 20 13:19:47 elm3b39 kernel: NIP [d0000000002ffb60] .mthca_tavor_post_srq_recv+0xe0/0x2e0 [ib_mthca]
Dec 20 13:19:47 elm3b39 kernel: LR [d0000000002ffb08] .mthca_tavor_post_srq_recv+0x88/0x2e0 [ib_mthca]
Dec 20 13:19:47 elm3b39 kernel: Call Trace:
Dec 20 13:19:47 elm3b39 kernel: [c0000001d05ff560] [d0000000002ffad4] .mthca_tavor_post_srq_recv+0x54/0x2e0 [ib_mthca] (unreliable)
Dec 20 13:19:47 elm3b39 kernel: [c0000001d05ff620] [d0000000003239fc] .ipoib_cm_post_receive_srq+0xbc/0x150 [ib_ipoib]
Dec 20 13:19:47 elm3b39 kernel: [c0000001d05ff6d0] [d000000000325984] .ipoib_cm_dev_init+0x2f4/0x560 [ib_ipoib]
Dec 20 13:19:47 elm3b39 kernel: [c0000001d05ff870] [d000000000322c74] .ipoib_transport_dev_init+0xd4/0x330 [ib_ipoib]
Dec 20 13:19:47 elm3b39 kernel: [c0000001d05ff970] [d00000000031f90c] .ipoib_ib_dev_init+0x3c/0xc0 [ib_ipoib]
Dec 20 13:19:47 elm3b39 kernel: [c0000001d05ffa00] [d00000000031aaac] .ipoib_dev_init+0x9c/0x160 [ib_ipoib]
Dec 20 13:19:48 elm3b39 kernel: [c0000001d05ffaa0] [d00000000031ad98] .ipoib_add_one+0x228/0x3b0 [ib_ipoib]
Dec 20 13:19:48 elm3b39 kernel: [c0000001d05ffb60] [d0000000001bf6ec] .ib_register_client+0xcc/0x110 [ib_core]
Dec 20 13:19:48 elm3b39 kernel: [c0000001d05ffc00] [d000000000328484] .ipoib_init_module+0x174/0x2288 [ib_ipoib]
Dec 20 13:19:48 elm3b39 kernel: [c0000001d05ffc90] [c00000000008eeec] .sys_init_module+0x20c/0x1aa0
Dec 20 13:19:48 elm3b39 kernel: [c0000001d05ffe30] [c0000000000086ac] syscall_exit+0x0/0x40
Dec 20 13:19:48 elm3b39 kernel: Instruction dump:
Dec 20 13:19:48 elm3b39 kernel: 419c0204 2f890000 38630010 38e00000 409d0060 38e00000 39000000 60000000
Dec 20 13:19:48 elm3b39 kernel: e95f0010 38070001 7c0707b4 7d6a4214 <800b0008> 90030000 60000000 60000000


lspci -v gives me the following:

0002:d8:01.0 PCI bridge: Mellanox Technologies MT23108 PCI Bridge (rev a1) (prog-if 00 [Normal decode])
        Flags: bus master, 66MHz, medium devsel, latency 144
        Bus: primary=d8, secondary=d9, subordinate=d9, sec-latency=128
        Memory behind bridge: c0000000-c88fffff
        Capabilities: [70] PCI-X bridge device

0002:d9:00.0 InfiniBand: Mellanox Technologies MT23108 InfiniHost (rev a1)
        Subsystem: Mellanox Technologies MT23108 InfiniHost
        Flags: bus master, 66MHz, medium devsel, latency 144, IRQ 121
        Memory at 400c8800000 (64-bit, non-prefetchable) [size=1M]
        Memory at 400c8000000 (64-bit, prefetchable) [size=8M]
        Memory at 400c0000000 (64-bit, prefetchable) [size=128M]
        Capabilities: [40] MSI-X: Enable- Mask- TabSize=32
        Capabilities: [50] Vital Product Data
        Capabilities: [60] Message Signalled Interrupts: 64bit+ Queue=0/5 Enable-
        Capabilities: [70] PCI-X non-bridge device

Pradeep


From jackieict at gmail.com  Thu Dec 20 17:40:14 2007
From: jackieict at gmail.com (zhang Jackie)
Date: Fri, 21 Dec 2007 09:40:14 +0800
Subject: [ofa-general] Java invoke the verbs through JNI
Message-ID: <13432ab00712201740t10f0af8eldf13609ffa5ca680@mail.gmail.com>

Hi, all

I just wrote a JNI wrapper so that a Java program can use IB. My simple
test programs work fine, but when I integrate the wrapper into another
program, a local protection error is reported. The failure is intermittent,
though it occurs most of the time.
Can someone give me some advice? Thanks.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071221/088e40e2/attachment.html>

From rdreier at cisco.com  Thu Dec 20 17:44:30 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 20 Dec 2007 17:44:30 -0800
Subject: [ofa-general] Oops in mthca
In-Reply-To: <476B04C6.1040803@linux.vnet.ibm.com> (Pradeep Satyanarayana's
	message of "Thu, 20 Dec 2007 16:11:50 -0800")
References: <476B04C6.1040803@linux.vnet.ibm.com>
Message-ID: <adasl1wyg75.fsf@cisco.com>

 > I discovered the following Oops while developing a patch to enable SRQ on HCAs with fewer than
 > 16 SG elements.

So is this oops with some version of your patch for limited SRQ
scatter entries applied?  It's hard to know exactly what is going
wrong but I suspect that if you get a device that allows more than 16
SRQ scatter entries, your patch passes that value for num_sg without
changing the declaration of rx_sge[] to have enough entries, so when
posting the receive request, the low-level driver goes off the end of
the array.

 > The root of this issue appears to be that ib_query_device(priv->ca, &attr)
 > reports an incorrect value for attr.max_srq_sge. The value that
 > ib_query_device returns is 28 (instead of 16 that I expected).

Why do you think the value 28 is incorrect?  Unfortunately I don't
have any PCI-X systems any more, but I don't see anything obvious in
the mthca code that would make the value it returns for max_srq_sge
wrong.

 - R.


From rdreier at cisco.com  Thu Dec 20 17:50:37 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 20 Dec 2007 17:50:37 -0800
Subject: [ofa-general] [PATCH] [RFC] IPOIB/CM Enable SRQ support on HCAs
	with less than 16 SG entries
In-Reply-To: <476B0085.2080400@linux.vnet.ibm.com> (Pradeep Satyanarayana's
	message of "Thu, 20 Dec 2007 15:53:41 -0800")
References: <476B0085.2080400@linux.vnet.ibm.com>
Message-ID: <adaodckyfwy.fsf@cisco.com>

 > +static int num_of_frags;
 > +int max_cm_mtu;

I think these values need to be per-interface -- think of the case of
a system with more than one type of HCA installed, where the different
HCAs have different limits.

 > @@ -623,6 +626,7 @@ repost:
 >  			--p->recv_count;
 >  			ipoib_warn(priv, "ipoib_cm_post_receive_nonsrq failed "
 >  				   "for buf %d\n", wr_id);
 > +		kfree(mapping); /*** Check if this needed ***/

This looks really bogus -- I don't see anything in your patch that
changes mapping from being allocated on the stack.

 > +	if (ipoib_cm_has_srq(dev)) {
 > +		ret = ib_query_srq(priv->cm.srq, &srq_attr);
 > +		if (ret) {
 > +			printk(KERN_WARNING "ib_query_srq() failed with %d\n", ret);
 > +			return -EINVAL;
 > +		}
 > +		/* pad similar to IPOIB_CM_MTU */
 > +		max_cm_mtu = srq_attr.max_sge * PAGE_SIZE - 0x10;
 > +		num_of_frags = srq_attr.max_sge;
 > +		ipoib_dbg(priv, "max_cm_mtu = 0x%x, num_of_frags=%d\n",
 > +			  max_cm_mtu, num_of_frags);
 > +	} else {
 > +		max_cm_mtu = IPOIB_CM_MTU;
 > +		num_of_frags  = IPOIB_CM_RX_SG;
 > +	}

I think in the SRQ case you still want to make sure num_of_frags is no
more than IPOIB_CM_RX_SG.  And if we're going to check the SRQ scatter
capabilities, we should probably add the same thing for the non-SRQ
case to make sure we don't exceed what QP receive queues can handle.
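
The two review comments above (per-interface values, and a cap on the
number of fragments) can be sketched together as follows. This is a
userspace illustration under stated assumptions, not the IPoIB code: the
struct cm_rx_limits, the helper cm_rx_limits_init() and the EX_* constants
are hypothetical names, and the values 4096 and 16 are assumed for the
example. Only the formula max_sge * PAGE_SIZE - 0x10 is taken from the
patch under review.

```c
/* Assumed values for illustration only; a real driver would take them
 * from its own headers and the platform page size. */
#define EX_PAGE_SIZE   4096u
#define EX_RX_SG_MAX   16u      /* assumed capacity of the rx_sge[] array */
#define EX_MTU_PAD     0x10u    /* padding used in the quoted patch */

/* Hypothetical per-interface limits, in place of file-scope globals. */
struct cm_rx_limits {
	unsigned int num_frags;   /* scatter entries actually used */
	unsigned int max_cm_mtu;  /* largest MTU those entries can hold */
};

/* Clamp the device-reported scatter limit to the array capacity, then
 * derive the MTU with the formula from the patch. Returns -1 if the
 * device reports no usable scatter entries, 0 otherwise. */
static int cm_rx_limits_init(struct cm_rx_limits *lim, unsigned int dev_max_sge)
{
	if (dev_max_sge == 0)
		return -1;

	lim->num_frags = dev_max_sge < EX_RX_SG_MAX ? dev_max_sge : EX_RX_SG_MAX;
	lim->max_cm_mtu = lim->num_frags * EX_PAGE_SIZE - EX_MTU_PAD;
	return 0;
}
```

With these assumed constants, a device reporting 28 entries is capped at 16
fragments, while a device reporting 4 entries keeps all 4 and gets a
correspondingly smaller MTU. Keeping the result in a per-interface
structure means two different HCAs in one system do not overwrite each
other's limits.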


From erezs at voltaire.com  Thu Dec 20 20:38:00 2007
From: erezs at voltaire.com (Erez Strauss)
Date: Thu, 20 Dec 2007 23:38:00 -0500
Subject: [ofa-general] [PATCH] ibnetdiscover - ports report
Message-ID: <C44068BB95F2E54DB07822CE539BCF1CBD90D1@exus01.voltaire.com>

Hello IB developers and users,

I would like to get feedback on the following patch to ibnetdiscover.

The patch introduces an additional output mode for ibnetdiscover that is
focused on the ports: it prints one line per port with the relevant
port information.

The output looks like:

SW     4 18 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4 17 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4 16 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4 15 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4 14 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4 13 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4  9 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4  8 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4  7 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4  6 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4  5 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4  4 0x0008f104003f0838 4x SDR                                    'ISR9288/ISR9096 Voltaire sLB-24'
SW     4  1 0x0008f104003f0838 4x SDR - SW     6  3 0x0008f104004005f5 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
SW     4  2 0x0008f104003f0838 4x SDR - SW     7  3 0x0008f104004005f6 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
SW     4  3 0x0008f104003f0838 4x SDR - SW     1  3 0x0008f104004005f7 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
SW     4 10 0x0008f104003f0838 4x SDR - SW     8  3 0x0008f104004006f5 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
SW     4 11 0x0008f104003f0838 4x SDR - SW     9  3 0x0008f104004006f6 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
SW     4 12 0x0008f104003f0838 4x SDR - SW    10  3 0x0008f104004006f7 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
CA    14  1 0x0008f10403960091 4x SDR - SW     4 20 0x0008f104003f0838 ( 'Voltaire HCA400' - 'ISR9288/ISR9096 Voltaire sLB-24' )
CA    11  1 0x0002c90107a4e431 4x SDR - SW     4 19 0x0008f104003f0838 ( 'Voltaire HCA400' - 'ISR9288/ISR9096 Voltaire sLB-24' )
CA     2  1 0x0008f1000102d801 4x SDR - SW     1 15 0x0008f104004005f7 ( 'Voltaire IB-to-TCP/IP Router' - 'ISR9288 Voltaire

Thanks,

Erez Strauss
Voltaire.
 
-------------
Date:   Thu Dec 20 19:36:14 2007 -0500

    Added the -p(orts) option, to generate ports reports

    Signed-off-by: Erez Strauss <erezs _at_ voltaire.com>
---
 infiniband-diags/src/ibnetdiscover.c |   64 ++++++++++++++++++++++++++++++++--
 1 files changed, 61 insertions(+), 3 deletions(-)

diff --git a/infiniband-diags/src/ibnetdiscover.c b/infiniband-diags/src/ibnetdiscover.c
index 8b229c1..3c2e6b6 100644
--- a/infiniband-diags/src/ibnetdiscover.c
+++ b/infiniband-diags/src/ibnetdiscover.c
@@ -119,6 +119,17 @@ get_linkspeed_str(int linkspeed)
                return linkspeed_str[linkspeed];
 }
 
+static inline const char*
+node_type_str2(Node *node)
+{
+  switch(node->type) {
+  case SWITCH_NODE: return "SW";
+  case CA_NODE:     return "CA";
+  case ROUTER_NODE: return "RT";
+  }
+  return "??";
+}
+
 int
 get_port(Port *port, int portnum, ib_portid_t *portid)
 {
@@ -839,11 +850,50 @@ dump_topology(int listtype, int group)
        return i;
 }
 
+void dump_ports_report ()
+{
+       int b, n = 0, p;
+       Node *node;
+       Port *port;
+
+       // If switch and LID == 0, search of other switch ports with valid LID and assign it to all ports of that switch
+       for (b = 0; b <= MAXHOPS; b++)
+               for (node = nodesdist[b]; node; node = node->dnext)
+                       if (node->type == SWITCH_NODE) {
+                               int swlid = 0;
+                               for (p = 0, port = node->ports; p < node->numports && port && !swlid; port = port->next)
+                                       if (port->lid != 0)
+                                               swlid = port->lid;
+                               for (p = 0, port = node->ports; p < node->numports && port; port = port->next)
+                                       port->lid = swlid;
+                       }
+       for (b = 0; b <= MAXHOPS; b++)
+               for (node = nodesdist[b]; node; node = node->dnext) {
+                       for (p = 0, port = node->ports; p < node->numports && port; p++, port = port->next) {
+                               fprintf (stdout, "%2s %5d %2d 0x%016llx %s %s",
+                                        node_type_str2 (port->node), port->lid,  port->portnum,
+                                        (unsigned long long)port->portguid,
+                                        get_linkwidth_str(port->linkwidth), get_linkspeed_str(port->linkspeed));
+                               if (port->remoteport)
+                                       fprintf (stdout, " - %2s %5d %2d 0x%016llx ( '%s' - '%s' )\n",
+                                                node_type_str2 (port->remoteport->node),
+                                                port->remoteport->lid,
+                                                port->remoteport->portnum,
+                                                (unsigned long long)port->remoteport->portguid,
+                                                port->node->nodedesc,
+                                                port->remoteport->node->nodedesc);
+                               else
+                                       fprintf (stdout, "%36s'%s'\n", "", port->node->nodedesc);
+                       }
+                       n++;
+               }
+}
+
 void
 usage(void)
 {
        fprintf(stderr, "Usage: %s [-d(ebug)] -e(rr_show) -v(erbose) -s(how) -l(ist) -g(rouping) -H(ca_list) -S(witch_list) -R(outer_list) -V(ersion) -C ca_name -P ca_port "
-                       "-t(imeout) timeout_ms --node-name-map node-name-map] [<topology-file>]\n",
+                       "-t(imeout) timeout_ms --node-name-map node-name-map] -p(orts) [<topology-file>]\n",
                        argv0);
        fprintf(stderr, "       --node-name-map <node-name-map> specify a node name map file\n");
        exit(-1);
@@ -858,8 +908,9 @@ main(int argc, char **argv)
        char *ca = 0;
        int ca_port = 0;
        int group = 0;
+       int ports_report = 0;
 
-       static char const str_opts[] = "C:P:t:devslgHSRVhu";
+       static char const str_opts[] = "C:P:t:devslgHSRpVhu";
        static const struct option long_opts[] = {
                { "C", 1, 0, 'C'},
                { "P", 1, 0, 'P'},
@@ -874,6 +925,7 @@ main(int argc, char **argv)
                { "Router_list", 0, 0, 'R'},
                { "timeout", 1, 0, 't'},
                { "node-name-map", 1, 0, 1},
+               { "ports", 0, 0, 'p'},
                { "Version", 0, 0, 'V'},
                { "help", 0, 0, 'h'},
                { "usage", 0, 0, 'u'},
@@ -935,6 +987,9 @@ main(int argc, char **argv)
                case 'V':
                        fprintf(stderr, "%s %s\n", argv0, get_build_version() );
                        exit(-1);
+               case 'p':
+                       ports_report = 1;
+                       break;
                default:
                        usage();
                        break;
@@ -955,7 +1010,10 @@ main(int argc, char **argv)
        if (group)
                chassis = group_nodes();
 
-       dump_topology(list, group);
+       if (ports_report)
+               dump_ports_report ();
+       else
+               dump_topology(list, group);
 
        close_node_name_map(node_name_map);
        exit(0);
-------------

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071220/b30e6788/attachment.html>
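
For readers who want to try the report line format outside ibnetdiscover,
here is a minimal standalone sketch. The two format strings are copied
from the patch; everything else (struct ex_port, its field names, and
ex_format_port_line()) is a hypothetical stand-in for the real Node/Port
structures, which carry more state than shown here.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the fields the report prints. */
struct ex_port {
	const char *type;              /* "SW", "CA" or "RT" */
	int lid;
	int portnum;
	unsigned long long guid;
	const char *width;             /* e.g. "4x" */
	const char *speed;             /* e.g. "SDR" */
	const char *nodedesc;
	const struct ex_port *remote;  /* NULL if nothing is connected */
};

/* Format one report line into buf with the patch's format strings.
 * Returns the number of characters written, or a negative value if
 * snprintf fails or the local half of the line does not fit. */
static int ex_format_port_line(char *buf, size_t len, const struct ex_port *p)
{
	int n, m;

	/* Local half: type, LID, port number, GUID, link width and speed. */
	n = snprintf(buf, len, "%2s %5d %2d 0x%016llx %s %s",
		     p->type, p->lid, p->portnum, p->guid, p->width, p->speed);
	if (n < 0 || (size_t)n >= len)
		return -1;

	if (p->remote)
		/* Connected port: append the peer, then both descriptions. */
		m = snprintf(buf + n, len - (size_t)n,
			     " - %2s %5d %2d 0x%016llx ( '%s' - '%s' )\n",
			     p->remote->type, p->remote->lid, p->remote->portnum,
			     p->remote->guid, p->nodedesc, p->remote->nodedesc);
	else
		/* Unconnected port: pad so the description stays aligned. */
		m = snprintf(buf + n, len - (size_t)n, "%36s'%s'\n", "", p->nodedesc);

	return m < 0 ? m : n + m;
}
```

Filling in a CA port at LID 14 connected to switch port 20 at LID 4
should give a line of the same shape as the "CA    14  1 ..." line in the
sample output above. The fixed-width fields are what make the report easy
to post-process with sort, grep or awk.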

From pradeeps at linux.vnet.ibm.com  Thu Dec 20 20:45:31 2007
From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana)
Date: Thu, 20 Dec 2007 20:45:31 -0800
Subject: [ofa-general] [PATCH] [RFC] IPOIB/CM Enable SRQ support on HCAs
	with less than 16 SG entries
In-Reply-To: <adaodckyfwy.fsf@cisco.com>
References: <476B0085.2080400@linux.vnet.ibm.com> <adaodckyfwy.fsf@cisco.com>
Message-ID: <476B44EB.9080402@linux.vnet.ibm.com>

Good points. I will incorporate your comments.

Roland Dreier wrote:
>  > +static int num_of_frags;
>  > +int max_cm_mtu;
> 
> I think these values need to be per-interface -- think of the case of
> a system with more than one type of HCA installed, where the different
> HCAs have different limits.
> 
>  > @@ -623,6 +626,7 @@ repost:
>  >  			--p->recv_count;
>  >  			ipoib_warn(priv, "ipoib_cm_post_receive_nonsrq failed "
>  >  				   "for buf %d\n", wr_id);
>  > +		kfree(mapping); /*** Check if this needed ***/
> 
> This looks really bogus -- I don't see anything in your patch that
> changes mapping from being allocated on the stack.

Right, as the comment indicates, it is a holdover from something else
that slipped into the patch.

Pradeep


From swise at opengridcomputing.com  Thu Dec 20 20:49:45 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 20 Dec 2007 22:49:45 -0600
Subject: [ofa-general] Re: iommu dma mapping alignment requirements
In-Reply-To: <1198194585.6779.31.camel@pasglop>
References: <476AA2E2.5010007@opengridcomputing.com>	
	<1198181862.6779.3.camel@pasglop>
	<476AD84E.4000507@opengridcomputing.com>	
	<1198186017.6779.28.camel@pasglop>
	<476AE8C9.9080601@opengridcomputing.com>
	<1198194585.6779.31.camel@pasglop>
Message-ID: <476B45E9.1040909@opengridcomputing.com>


Benjamin Herrenschmidt wrote:
>> Sounds good.  Thanks!
>>
>> Note, that these smaller sub-host-page-sized mappings might pollute the 
>> address space causing full aligned host-page-size maps to become 
>> scarce...  Maybe there's a clever way to keep those in their own segment 
>> of the address space?
> 
> We already have a large vs. small split in the iommu virtual space to
> alleviate this (though it's not a hard constraint, we can still get
> into the "other" side if the default one is full).
> 
> Try that patch and let me know:

Seems to be working!

:)


> 
> Index: linux-work/arch/powerpc/kernel/iommu.c
> ===================================================================
> --- linux-work.orig/arch/powerpc/kernel/iommu.c	2007-12-21 10:39:39.000000000 +1100
> +++ linux-work/arch/powerpc/kernel/iommu.c	2007-12-21 10:46:18.000000000 +1100
> @@ -278,6 +278,7 @@ int iommu_map_sg(struct iommu_table *tbl
>  	unsigned long flags;
>  	struct scatterlist *s, *outs, *segstart;
>  	int outcount, incount, i;
> +	unsigned int align;
>  	unsigned long handle;
>  
>  	BUG_ON(direction == DMA_NONE);
> @@ -309,7 +310,11 @@ int iommu_map_sg(struct iommu_table *tbl
>  		/* Allocate iommu entries for that segment */
>  		vaddr = (unsigned long) sg_virt(s);
>  		npages = iommu_num_pages(vaddr, slen);
> -		entry = iommu_range_alloc(tbl, npages, &handle, mask >> IOMMU_PAGE_SHIFT, 0);
> +		align = 0;
> +		if (IOMMU_PAGE_SHIFT < PAGE_SHIFT && (vaddr & ~PAGE_MASK) == 0)
> +			align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
> +		entry = iommu_range_alloc(tbl, npages, &handle,
> +					  mask >> IOMMU_PAGE_SHIFT, align);
>  
>  		DBG("  - vaddr: %lx, size: %lx\n", vaddr, slen);
>  
> @@ -572,7 +577,7 @@ dma_addr_t iommu_map_single(struct iommu
>  {
>  	dma_addr_t dma_handle = DMA_ERROR_CODE;
>  	unsigned long uaddr;
> -	unsigned int npages;
> +	unsigned int npages, align;
>  
>  	BUG_ON(direction == DMA_NONE);
>  
> @@ -580,8 +585,13 @@ dma_addr_t iommu_map_single(struct iommu
>  	npages = iommu_num_pages(uaddr, size);
>  
>  	if (tbl) {
> +		align = 0;
> +		if (IOMMU_PAGE_SHIFT < PAGE_SHIFT &&
> +		    ((unsigned long)vaddr & ~PAGE_MASK) == 0)
> +			align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
> +
>  		dma_handle = iommu_alloc(tbl, vaddr, npages, direction,
> -					 mask >> IOMMU_PAGE_SHIFT, 0);
> +					 mask >> IOMMU_PAGE_SHIFT, align);
>  		if (dma_handle == DMA_ERROR_CODE) {
>  			if (printk_ratelimit())  {
>  				printk(KERN_INFO "iommu_alloc failed, "
> 
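
The alignment rule that both hunks of the quoted patch add can be tried in
isolation with the sketch below. It is a userspace illustration, not
kernel code: the EX_* shift values (4K IOMMU pages under 64K host pages)
and the helper name ex_iommu_align_order() are assumptions for the
example, and reading the result as a log2 count of IOMMU pages is my
interpretation of how the last argument to iommu_range_alloc() is used.

```c
/* Assumed example configuration: 4K IOMMU pages, 64K host pages. */
#define EX_IOMMU_PAGE_SHIFT 12
#define EX_PAGE_SHIFT       16
#define EX_PAGE_MASK        (~((1UL << EX_PAGE_SHIFT) - 1))

/* Mirror of the check in the patch: when IOMMU pages are smaller than
 * host pages and the buffer starts on a host-page boundary, ask for a
 * host-page-aligned allocation; otherwise request no extra alignment. */
static unsigned int ex_iommu_align_order(unsigned long vaddr)
{
	unsigned int align = 0;

	if (EX_IOMMU_PAGE_SHIFT < EX_PAGE_SHIFT &&
	    (vaddr & ~EX_PAGE_MASK) == 0)
		align = EX_PAGE_SHIFT - EX_IOMMU_PAGE_SHIFT;

	return align;
}
```

Under these assumed shifts, a buffer at 0x10000 yields 4 (that is, 16
IOMMU pages, or one 64K host page), while a buffer at 0x10800 yields 0
because it does not start on a host-page boundary. Buffers that were never
host-page aligned therefore do not consume aligned IOMMU ranges.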


From kliteyn at mellanox.co.il  Thu Dec 20 21:36:51 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 21 Dec 2007 07:36:51 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-21:normal completion
Message-ID: <MTLEXCH01OtprR0E6fa000015de@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-20
OpenSM git rev = Mon_Dec_17_15:20:43_2007 [9988f459cb81dd025bde8b2dd53b3c551616be0c]
ibutils git rev = Wed_Dec_19_12:06:28_2007 [9961475294fbf1d3782edb8f377a77b13fa80d70]
 
 
Total=560  Pass=559  Fail=1
 
 
Pass:
42 Stability IS1-16.topo
42 Pkey IS1-16.topo
42 OsmTest IS1-16.topo
42 OsmStress IS1-16.topo
42 Multicast IS1-16.topo
42 LidMgr IS1-16.topo
14 Stability IS3-loop.topo
14 Stability IS3-128.topo
14 Pkey IS3-128.topo
14 OsmTest IS3-loop.topo
14 OsmTest IS3-128.topo
14 OsmStress IS3-128.topo
14 Multicast IS3-loop.topo
14 Multicast IS3-128.topo
14 FatTree merge-roots-4-ary-2-tree.topo
14 FatTree merge-root-4-ary-3-tree.topo
14 FatTree gnu-stallion-64.topo
14 FatTree blend-4-ary-2-tree.topo
14 FatTree RhinoDDR.topo
14 FatTree FullGnu.topo
14 FatTree 4-ary-2-tree.topo
14 FatTree 2-ary-4-tree.topo
14 FatTree 12-node-spaced.topo
14 FTreeFail 4-ary-2-tree-missing-sw-link.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
14 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
13 LidMgr IS3-128.topo

Failures:
1 LidMgr IS3-128.topo


From benh at au1.ibm.com  Thu Dec 20 21:38:31 2007
From: benh at au1.ibm.com (Benjamin Herrenschmidt)
Date: Fri, 21 Dec 2007 16:38:31 +1100
Subject: [ofa-general] Re: iommu dma mapping alignment requirements
In-Reply-To: <476B45E9.1040909@opengridcomputing.com>
References: <476AA2E2.5010007@opengridcomputing.com>
	<1198181862.6779.3.camel@pasglop>
	<476AD84E.4000507@opengridcomputing.com>
	<1198186017.6779.28.camel@pasglop>
	<476AE8C9.9080601@opengridcomputing.com>
	<1198194585.6779.31.camel@pasglop>
	<476B45E9.1040909@opengridcomputing.com>
Message-ID: <1198215511.6779.51.camel@pasglop>

BTW, I urgently need to know what HW is broken by this.

Ben.


From jackm at dev.mellanox.co.il  Fri Dec 21 00:10:28 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Fri, 21 Dec 2007 10:10:28 +0200
Subject: [ofa-general] Re: [RFC] XRC -- make receiving XRC QP independent of
	any one user process
In-Reply-To: <adad4t1yyuf.fsf@cisco.com>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<adad4t1yyuf.fsf@cisco.com>
Message-ID: <200712211010.29304.jackm@dev.mellanox.co.il>

On Thursday 20 December 2007 21:01, Roland Dreier wrote:
> And I guess we can't combine creating the QP with allocating the XRC
> domain, because the consumer might want to open the XRC domain before
> it has connected with the remote side.

Also, each domain may have multiple QPs -- each process which is part
of a computation needs to open a remote XRC QP for an RC connection.  It can
then communicate with ALL the remote processes which belong to a job (each such
process creates an XRC SRQ to receive messages from the local process).

- Jack


From jackm at dev.mellanox.co.il  Fri Dec 21 00:31:59 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Fri, 21 Dec 2007 10:31:59 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any one user process
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FE239BCBD@G5W0278.americas.hpqcorp.net>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<476A86E8.8020308@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE239BCBD@G5W0278.americas.hpqcorp.net>
Message-ID: <200712211031.59761.jackm@dev.mellanox.co.il>

On Thursday 20 December 2007 18:24, Tang, Changqing wrote:
>        If I have an MPI server process on a node, many other MPI client processes will dynamically
> connect to and disconnect from the server. The server uses the same XRC domain.
> 
>         Will this cause "kernel" QPs to accumulate for such an application? We want the server to run 365 days
> a year.

Yes, it will.  I have no way of knowing when a given receiving XRC QP is no longer needed -- 
except when the domain it belongs to is finally closed.

I don't see that adding a userspace "destroy" verb for this QP will help:

The only one who actually knows that the XRC QP is no longer required is the userspace process which created
the QP at the remote end of the RC connection of the receiving XRC QP.

This remote process can only send a request to destroy the QP to some local process (via its own private protocol).
However, you pointed out that the process which originally created the QP may not be around any more (this was the
source of the problem which led to the RFC in this thread) -- and sending the destroy request to all the remote
processes on that node which it communicates with is REALLY ugly.

I'm not familiar with MPI, so this may be a silly question: Can the MPI server process create a 
new domain for each client process, and destroy that domain when the client process is done
(i.e., is this MPI server process a supervisor of resources for distributed computations 
(but is not a participant in these computations)?).

(Actually, what I'm asking -- is it possible to allocate a new XRC domain for a distributed computation, and destroy
that domain at the end of that computation?)


-- Jack


From vlad at lists.openfabrics.org  Fri Dec 21 03:16:15 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Fri, 21 Dec 2007 03:16:15 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071221-0200 daily build status
Message-ID: <20071221111615.E07B4E6093B@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.16
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.22
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.21.1
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.12
Passed on ia64 with linux-2.6.13
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.23
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:


From sashak at voltaire.com  Fri Dec 21 08:23:43 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 21 Dec 2007 16:23:43 +0000
Subject: [ofa-general] smpquery regression in 1.3-rc1
In-Reply-To: <20071220170822.GI412@sgi.com>
References: <20071219195839.GS412@sgi.com>
	<1198095057.6635.41.camel@hrosenstock-ws.xsigo.com>
	<476A5530.3070602@dev.mellanox.co.il>
	<1198159659.6635.164.camel@hrosenstock-ws.xsigo.com>
	<476A8DA9.8040408@dev.mellanox.co.il>
	<1198169366.6635.181.camel@hrosenstock-ws.xsigo.com>
	<20071220171318.GE15888@sashak.voltaire.com>
	<20071220170822.GI412@sgi.com>
Message-ID: <20071221162343.GF15888@sashak.voltaire.com>

On 09:08 Thu 20 Dec     , akepner at sgi.com wrote:
> On Thu, Dec 20, 2007 at 05:13:18PM +0000, Sasha Khapyorsky wrote:
> > ...
> > Yevgeny, Arthur, could you rerun smpquery with -dddd (for lot of debug
> > stuff)?
> > 
> 
> Well, just about any perturbation changes the behavior - run 
> it under strace, or gdb, link the IB libraries statically, or 
> look at the machine funny and it works fine. 
> 
> But using the debug flags reveals an apparent problem with the 
> debug code itself:
> 
> # ./smpquery_1.3_rc1 -d -G nodeinfo 0x00066a01a000737c
> ibwarn: [19328] smp_query: attr 0x15 mod 0x0 route DR path 0
> ibwarn: [19328] mad_rpc: data offs 64 sz 64
> mad data
> 0000 0000 0000 0000 fe80 0000 0000 0000
> 0002 0002 0251 0a6a 0000 0000 0103 0302
> 3452 0023 4040 0008 0804 ff40 0000 005e
> 0000 2012 1088 0000 0000 0000 0000 0000
> Segmentation fault
> 
> and gdb shows:
> 
> (gdb) bt
> #0  0x00002b0b9222ed0f in _IO_default_xsputn_internal () from /lib64/libc.so.6
> #1  0x00002b0b92207177 in vfprintf () from /lib64/libc.so.6
> #2  0x00002b0b9229577d in __vsprintf_chk () from /lib64/libc.so.6
> #3  0x00002b0b922956c0 in __sprintf_chk () from /lib64/libc.so.6
> #4  0x00002b0b91c71166 in portid2str (portid=0x7fff1905bc00) at src/portid.c:91
> #5  0x00002b0b91c72529 in sa_rpc_call (ibmad_port=0x7fff1905b680,
>     rcvbuf=0x7fff1905bb30, portid=0x7fff1905bc00, sa=0x7fff1905bac0, timeout=0)
>     at src/sa.c:58
> #6  0x00002b0b91c71791 in sa_call (rcvbuf=0x7fff1905bb30,
>     portid=0x7fff1905bc00, sa=0x7fff1905bac0, timeout=0) at src/rpc.c:395
> #7  0x00002b0b91c723bf in ib_path_query (srcgid=0x7fff1905be30 "\200",
>     destgid=0x7fff1905be30 "\200", sm_id=0x7fff1905bc00, buf=0x7fff1905bb30)
>     at ./include/infiniband/mad.h:790
> #8  0x00002b0b91c7144f in ib_resolve_guid (portid=0x7fff1905bde0,
>     guid=0x7fff1905bd20, sm_id=0x7fff1905bc00, timeout=<value optimized out>)
>     at src/resolve.c:83
> #9  0x00002b0b91c71610 in ib_resolve_portid_str (portid=0x7fff1905bde0,
>     addr_str=0x7fff1905d341 "0x00066a01a000737c", dest_type=2, sm_id=0x0)
>     at src/resolve.c:115
> #10 0x0000000000401cd1 in main (argc=2, argv=0x7fff1905bfd0)
>     at smpquery_1.3_rc1.c:522

Thanks for this great debug info. I'm not able to reproduce the segfault,
but looking at your backtrace, I think this patch could fix it:

diff --git a/libibmad/src/resolve.c b/libibmad/src/resolve.c
index 05b443d..d8365b2 100644
--- a/libibmad/src/resolve.c
+++ b/libibmad/src/resolve.c
@@ -56,6 +56,8 @@ ib_resolve_smlid(ib_portid_t *sm_id, int timeout)
 	uint8_t portinfo[64];
 	int lid;
 
+	memset(sm_id, 0, sizeof(*sm_id));
+
 	if (!smp_query(portinfo, &self, IB_ATTR_PORT_INFO, 0, 0))
 		return -1;
 

Sasha


From sashak at voltaire.com  Fri Dec 21 08:30:37 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 21 Dec 2007 16:30:37 +0000
Subject: [ofa-general] [PATCH] libibmad: initialize sm portid in
	ib_resolve_smlid()
Message-ID: <20071221163037.GG15888@sashak.voltaire.com>


Initialize the SM portid in ib_resolve_smlid(). This likely fixes the
crash seen when debug mode prints the uninitialized direct path array.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 libibmad/src/resolve.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/libibmad/src/resolve.c b/libibmad/src/resolve.c
index 05b443d..d8365b2 100644
--- a/libibmad/src/resolve.c
+++ b/libibmad/src/resolve.c
@@ -56,6 +56,8 @@ ib_resolve_smlid(ib_portid_t *sm_id, int timeout)
 	uint8_t portinfo[64];
 	int lid;
 
+	memset(sm_id, 0, sizeof(*sm_id));
+
 	if (!smp_query(portinfo, &self, IB_ATTR_PORT_INFO, 0, 0))
 		return -1;
 
-- 
1.5.3.4.206.g58ba4


From akepner at sgi.com  Fri Dec 21 08:42:10 2007
From: akepner at sgi.com (akepner at sgi.com)
Date: Fri, 21 Dec 2007 08:42:10 -0800
Subject: [ofa-general] Re: [PATCH] libibmad: initialize sm portid in
	ib_resolve_smlid()
In-Reply-To: <20071221163037.GG15888@sashak.voltaire.com>
References: <20071221163037.GG15888@sashak.voltaire.com>
Message-ID: <20071221164210.GU412@sgi.com>

On Fri, Dec 21, 2007 at 04:30:37PM +0000, Sasha Khapyorsky wrote:
> 
> Initialize sm portid in ib_resolve_smlid() - likely it fixes crash when
> in debug mode uninitialized direct path array printed.
> 
> Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>

Tested by: Arthur Kepner <akepner at sgi.com>

Sasha - seems to fix things up when using the debug flags, 
furthermore, it seems to make the original problem (the 
failure of "smpquery -G ...") go away:

Before, I'd get:

# ./smpquery_1.3_rc1 -G nodeinfo  0x00066a01a000737c
ibwarn: [7459] _do_madrpc: recv failed: Connection timed out
ibwarn: [7459] ib_path_query: sa call path_query failed
smpquery_1.3_rc1: iberror: failed: can't resolve destination port 0x00066a01a000737c

But if I link against the patched library, I get:
# ./smpquery_1.3_rc1 -G nodeinfo  0x00066a01a000737c
# Node info: Lid 3
BaseVers:........................1
ClassVers:.......................1
NodeType:........................Channel Adapter
NumPorts:........................2
SystemGuid:......................0x00066a009800737c
Guid:............................0x00066a009800737c
PortGuid:........................0x00066a01a000737c
PartCap:.........................64
DevId:...........................0x6278
Revision:........................0x000000a0
LocalPort:.......................2
VendorId:........................0x00066a

Thanks!

-- 
Arthur


From changquing.tang at hp.com  Fri Dec 21 09:13:26 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Fri, 21 Dec 2007 17:13:26 +0000
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any	one user process
In-Reply-To: <200712211031.59761.jackm@dev.mellanox.co.il>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<476A86E8.8020308@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE239BCBD@G5W0278.americas.hpqcorp.net>
	<200712211031.59761.jackm@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FE239C7D1@G5W0278.americas.hpqcorp.net>


> -----Original Message-----
> From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il]
> Sent: Friday, December 21, 2007 2:32 AM
> To: Tang, Changqing
> Cc: pasha at dev.mellanox.co.il;
> mvapich-discuss at cse.ohio-state.edu;
> general at lists.openfabrics.org; Open MPI Developers
> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> independent of any one user process
>
> On Thursday 20 December 2007 18:24, Tang, Changqing wrote:
> >        If I have a MPI server processes on a node, many other MPI
> > client processes will dynamically connect/disconnect with
> the server. The server use same XRC domain.
> >
> >         Will this cause accumulating the "kernel" QP for such
> > application ? we want the server to run 365 days a year.
>
> Yes, it will.  I have no way of knowing when a given
> receiving XRC QP is no longer needed -- except when the
> domain it belongs to is finally closed.
>
> I don't see that adding a userspace "destroy" verb for this
> QP will help:

This kernel QP is for receiving only, so when there is no activity on this QP,
can the kernel send a heart-beat message to check whether the remote sending QP
is still there (still connected)? If not, it is safe for the kernel to clean up
this QP.

So whenever the RC connection is broken, the kernel can destroy this QP.


>
> The only one who actually knows that the XRC QP is no longer
> required is the userspace process which created the QP at the
> remote end of the RC connection of the receiving XRC QP.
>
> This remote process can only send a request to destroy the QP
> to some local process (via its own private protocol).
> However, you pointed out that the process which originally
> created the QP may not be around any more (this was the
> source of the problem which led to the RFC in this thread) --
> and sending the destroy request to all the remote processes
> on that node which it communicates with is REALLY ugly.
>
> I'm not familiar with MPI, so this may be a silly question:
> Can the MPI server process create a new domain for each
> client process, and destroy that domain when the client
> process is done (i.e., is this MPI server process a
> supervisor of resources for distributed computations (but is
> not a participant in these computations)?).

The server could be a process group spanning multiple nodes; a parallel
database search engine is one example.


>
> (Actually, what I'm asking -- is it possible to allocate a
> new XRC domain for a distributed computation, and destroy
> that domain at the end of that computation?)

Yes, it could, but it makes the MPI code harder to manage. We also
have a concern about connect/accept speed.

We hope not to do it this way.


--CQ


>
>
> -- Jack
>


From jackm at dev.mellanox.co.il  Fri Dec 21 10:09:26 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Fri, 21 Dec 2007 20:09:26 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	=?iso-8859-1?q?any=09one_user?= process
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FE239C7D1@G5W0278.americas.hpqcorp.net>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<200712211031.59761.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE239C7D1@G5W0278.americas.hpqcorp.net>
Message-ID: <200712212009.26816.jackm@dev.mellanox.co.il>

On Friday 21 December 2007 19:13, Tang, Changqing wrote:
> This kernel QP is for receiving only, so when there is no activity on this QP,
> can the kernel sends a heart-beat message to check if the remote sending QP
> is still there (still connected) ? if not, the kernel is safe to cleanup
> this qp.
> 
> So whenever the RC connection is broken, kernel can destroy this QP.
> 
This increases the XRC complexity considerably:

1. Need to have a separate kernel thread which will scan ALL xrc domains on this host for XRC receive QPs.
   This thread will need to do some form of RDMA_READ/WRITE, because otherwise it will interfere with
   the remote (sending side) operation.  Furthermore, the sending-side XRC QP may not have anyone listening
   on an associated XRC SRQ qp -- it is not meant to be set up to receive.  We only need an operation that
   will yield a RETRY_EXCEEDED error completion if the connection has broken.

2. This opens the door for all sorts of nasty race conditions, since we will now have a bi-directional
   protocol. For example, what if this feature is being combined with APM (valid for RC QPs), and we
   are simply in the middle of a migration, and maybe communication is temporarily interrupted.
   We will be killing off the QP without allowing any error recovery mechanism to work.

3. The application complexity goes up -- we now need the sending-side QP to declare a memory region and send
   this region's address to the receiving side so that the receiving side (the kernel thread mentioned above)
   can periodically try to read from this region.

Still, I'll give this some thought.  For example, maybe we can rdma_read some random (illegal) address --
if the connection is alive, we'll get a "remote access error" completion, while if it's dead, we'll get
retry exceeded (need to check that the bad rdma read request does not cause the QPs to enter an error state).
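
The decision rule described above can be sketched as a small classifier
over the probe's completion status. This is only an illustration: the
enum names below are hypothetical stand-ins defined for this sketch, not
the verbs completion codes, and the sketch does not address the open
question of whether the failed probe leaves the QPs in an error state.

```c
/* Hypothetical completion statuses for this sketch; not the verbs enum. */
enum demo_wc_status {
	DEMO_WC_SUCCESS,
	DEMO_WC_REM_ACCESS_ERR,   /* peer answered, but rejected the address */
	DEMO_WC_RETRY_EXC_ERR,    /* no answer after all transport retries */
	DEMO_WC_OTHER_ERR
};

/* Hypothetical result of one liveness probe. */
enum demo_peer_state {
	DEMO_PEER_ALIVE,
	DEMO_PEER_DEAD,
	DEMO_PEER_UNKNOWN
};

/* Map the completion status of a probe to a liveness verdict. */
static enum demo_peer_state demo_classify_probe(enum demo_wc_status status)
{
	switch (status) {
	case DEMO_WC_SUCCESS:
	case DEMO_WC_REM_ACCESS_ERR:
		/* Any response, even an error response, means the peer QP exists. */
		return DEMO_PEER_ALIVE;
	case DEMO_WC_RETRY_EXC_ERR:
		/* Retries exhausted: treat the connection as broken. */
		return DEMO_PEER_DEAD;
	default:
		/* Anything else is not conclusive; do not destroy the QP. */
		return DEMO_PEER_UNKNOWN;
	}
}
```

Returning "unknown" for every other status keeps the cleanup path
conservative, which matters for cases such as an APM migration in
progress.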

- Jack


From changquing.tang at hp.com  Fri Dec 21 10:22:29 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Fri, 21 Dec 2007 18:22:29 +0000
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any	one user process
In-Reply-To: <200712212009.26816.jackm@dev.mellanox.co.il>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<200712211031.59761.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE239C7D1@G5W0278.americas.hpqcorp.net>
	<200712212009.26816.jackm@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FE241E196@G5W0278.americas.hpqcorp.net>


For our heart-beat we use a zero-byte rdma_write. The message goes to the peer QP only, so there is no need to post anything
on the remote side and no need for pinned memory.


--CQ


> -----Original Message-----
> From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il]
> Sent: Friday, December 21, 2007 12:09 PM
> To: Tang, Changqing
> Cc: pasha at dev.mellanox.co.il;
> mvapich-discuss at cse.ohio-state.edu;
> general at lists.openfabrics.org; Open MPI Developers
> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> independent of any one user process
>
> On Friday 21 December 2007 19:13, Tang, Changqing wrote:
> > This kernel QP is for receiving only, so when there is no
> activity on
> > this QP, can the kernel sends a heart-beat message to check if the
> > remote sending QP is still there (still connected) ? if not, the
> > kernel is safe to cleanup this qp.
> >
> > So whenever the RC connection is broken, kernel can destroy this QP.
> >
> This increases the XRC complexity considerably:
>
> 1. Need to have a separate kernel thread which will scan ALL
> xrc domains on this host for XRC receive QPs.
>    This thread will need to do some form of RDMA_READ/WRITE,
> because otherwise it will interfere with
>    the remote (sending side) operation.  Furthermore, the
> sending-side XRC QP may not have anyone listening
>    on an associated XRC SRQ qp -- it is not meant to be set
> up to receive.  We only need an operation that
>    will yield a RETRY_EXCEEDED error completion if the
> connection has broken.
>
> 2. This opens the door for all sorts of nasty race
> conditions, since we will now have a bi-directional
>    protocol. For example, what if this feature is being
> combined with APM (valid for RC QPs), and we
>    are simply in the middle of a migration, and maybe
> communication is temporarily interrupted.
>    We will be killing off the QP without allowing any error
> recovery mechanism to work.
>
> 3. The application complexity goes up -- we now need the
> sending-side QP to declare a memory region and send
>    this region's address to the receiving side so that the
> receiving side (the kernel thread mentioned above)
>    can periodically try to read from this region.
>
> Still, I'll give this some thought.  For example, maybe we
> can rdma_read some random (illegal) address -- If the
> connection is alive, we'll get a "remote access error"
> completion, while if its dead, we'll get retry exceeded (need
> to check that the bad rdma read request does not cause the
> QPs to enter an error state).
>
> - Jack
>


From sashak at voltaire.com  Fri Dec 21 11:45:30 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 21 Dec 2007 19:45:30 +0000
Subject: [ofa-general] Re: [PATCH] libibmad: initialize sm portid in
	ib_resolve_smlid()
In-Reply-To: <20071221164210.GU412@sgi.com>
References: <20071221163037.GG15888@sashak.voltaire.com>
	<20071221164210.GU412@sgi.com>
Message-ID: <20071221194530.GI15888@sashak.voltaire.com>

On 08:42 Fri 21 Dec     , akepner at sgi.com wrote:
> On Fri, Dec 21, 2007 at 04:30:37PM +0000, Sasha Khapyorsky wrote:
> > 
> > Initialize sm portid in ib_resolve_smlid() - likely it fixes crash when
> > in debug mode uninitialized direct path array printed.
> > 
> > Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> 
> Tested by: Arthur Kepner <akepner at sgi.com>

Thanks!

> Sasha - seems to fix things up when using the debug flags, 
> furthermore, it seems to make the original problem (the 
> failure of "smpquery -G ...") go away:

After looking more at the code I see that this is quite possible: when
smpquery asks the SA for a PathRecord it uses the PKey index value from
sm_id, which was uninitialized before the patch. Assuming the PKey index
value was garbage, the PathRecord query would have been rejected by the
CA even before it reached the SM.

Seems we fixed both bugs in this patch (by initializing sm_id) - cool.
Thanks for the report and the great help with debugging this.
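
To illustrate why one memset() can cure both symptoms, here is a minimal
sketch. The structure and function names are hypothetical and the layout
is not the real ib_portid_t; the point is only that zeroing the caller's
structure gives every field the resolver does not set (here a direct
path count and a PKey index) a defined value.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a port identifier; not the real ib_portid_t. */
struct demo_portid {
	int lid;
	int drpath_cnt;              /* number of valid hops in drpath_p[] */
	unsigned char drpath_p[64];
	unsigned pkey_idx;
};

/* Format the identifier, bounding every write by the buffer size. */
static int demo_portid2str(const struct demo_portid *id, char *buf, size_t size)
{
	int n = snprintf(buf, size, "Lid %d pkey_idx %u", id->lid, id->pkey_idx);
	int i;

	for (i = 0; i < id->drpath_cnt && i < 64; i++) {
		if (n < 0 || (size_t)n >= size)
			break;       /* buffer full or error: stop appending */
		n += snprintf(buf + n, size - n, " %d", id->drpath_p[i]);
	}
	return n;
}

/* Zero the caller's structure first, then fill in only what is known. */
static void demo_resolve(struct demo_portid *id, int lid)
{
	memset(id, 0, sizeof(*id));
	id->lid = lid;
}
```

Without the memset(), a garbage drpath_cnt drives the print loop and a
garbage pkey_idx goes out on the wire, which matches the two failures
seen in this thread.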

Sasha


From sashak at voltaire.com  Fri Dec 21 11:50:15 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 21 Dec 2007 19:50:15 +0000
Subject: [ofa-general] [PATCH] libibcommon: fix overflow in debug/log prints
In-Reply-To: <20071221163037.GG15888@sashak.voltaire.com>
References: <20071221163037.GG15888@sashak.voltaire.com>
Message-ID: <20071221195015.GJ15888@sashak.voltaire.com>


When long strings are passed to the libibcommon debug/log functions they
overflow local buffers. Use vsnprintf() instead of vsprintf() to prevent
this.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 libibcommon/src/util.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/libibcommon/src/util.c b/libibcommon/src/util.c
index 7da967e..5ba164c 100644
--- a/libibcommon/src/util.c
+++ b/libibcommon/src/util.c
@@ -66,7 +66,7 @@ ibwarn(const char * const fn, char *msg, ...)
 	int n;
 
 	va_start(va, msg);
-	n = vsprintf(buf, msg, va);
+	n = vsnprintf(buf, sizeof(buf), msg, va);
 	va_end(va);
 	buf[n] = 0;
 
@@ -81,7 +81,7 @@ ibpanic(const char * const fn, char *msg, ...)
 	int n;
 
 	va_start(va, msg);
-	n = vsprintf(buf, msg, va);
+	n = vsnprintf(buf, sizeof(buf), msg, va);
 	va_end(va);
 	buf[n] = 0;
 
@@ -99,7 +99,7 @@ logmsg(const char * const fn, char *msg, ...)
 	int n;
 
 	va_start(va, msg);
-	n = vsprintf(buf, msg, va);
+	n = vsnprintf(buf, sizeof(buf), msg, va);
 	va_end(va);
 	buf[n] = 0;
 
-- 
1.5.3.4.206.g58ba4
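
One caveat when switching to vsnprintf(): under C99 it returns the length
the full output would have had, so a following "buf[n] = 0" may still
index past the buffer when the message was truncated. A minimal sketch of
a clamped wrapper follows, assuming a C99-conforming vsnprintf();
demo_format() is a hypothetical helper, not part of libibcommon.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: format into buf and return the number of characters
 * actually stored (excluding the terminating NUL), even when truncated.
 */
static size_t demo_format(char *buf, size_t size, const char *fmt, ...)
{
	va_list va;
	int n;

	if (size == 0)
		return 0;

	va_start(va, fmt);
	n = vsnprintf(buf, size, fmt, va);
	va_end(va);

	if (n < 0) {                 /* output error: leave an empty string */
		buf[0] = '\0';
		return 0;
	}
	if ((size_t)n >= size)       /* truncated to size - 1 characters */
		return size - 1;
	return (size_t)n;
}
```

vsnprintf() already NUL-terminates whenever size is non-zero, so the
clamped return value is needed only by callers that use the length
afterwards.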


From sashak at voltaire.com  Fri Dec 21 11:52:07 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 21 Dec 2007 19:52:07 +0000
Subject: [ofa-general] [PATCH] opensm: rename __osm_epi_plugin_t to
	osm_event_plugin_t
Message-ID: <20071221195207.GK15888@sashak.voltaire.com>


Rename __osm_epi_plugin_t to osm_event_plugin_t.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/include/opensm/osm_event_plugin.h   |    6 +++---
 opensm/opensm/osm_event_plugin.c           |    2 +-
 opensm/osmeventplugin/src/osmeventplugin.c |    2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/opensm/include/opensm/osm_event_plugin.h b/opensm/include/opensm/osm_event_plugin.h
index 0b69d48..b71f6f7 100644
--- a/opensm/include/opensm/osm_event_plugin.h
+++ b/opensm/include/opensm/osm_event_plugin.h
@@ -149,7 +149,7 @@ typedef struct {
  */
 #define OSM_EVENT_PLUGIN_IMPL_NAME "osm_event_plugin"
 #define OSM_EVENT_PLUGIN_INTERFACE_VER (1)
-typedef struct {
+typedef struct osm_event_plugin {
 	int interface_version;
 	void *(*construct) (osm_log_t * osm_log);
 	void (*destroy) (void *plugin_data);
@@ -157,14 +157,14 @@ typedef struct {
 	void (*report) (void *plugin_data,
 			osm_epi_event_id_t event_id, void *event_data);
 
-} __osm_epi_plugin_t;
+} osm_event_plugin_t;
 
 /** =========================================================================
  * The plugin structure should be considered opaque
  */
 typedef struct {
 	void *handle;
-	__osm_epi_plugin_t *impl;
+	osm_event_plugin_t *impl;
 	void *plugin_data;
 	osm_log_t *p_log;
 	char *plugin_name;
diff --git a/opensm/opensm/osm_event_plugin.c b/opensm/opensm/osm_event_plugin.c
index 5f062cf..5ffb255 100644
--- a/opensm/opensm/osm_event_plugin.c
+++ b/opensm/opensm/osm_event_plugin.c
@@ -83,7 +83,7 @@ osm_epi_plugin_t *osm_epi_construct(osm_log_t * p_log, char *plugin_name)
 	}
 
 	rc->impl =
-	    (__osm_epi_plugin_t *) dlsym(rc->handle,
+	    (osm_event_plugin_t *) dlsym(rc->handle,
 					 OSM_EVENT_PLUGIN_IMPL_NAME);
 	if (!rc->impl) {
 		osm_log(p_log, OSM_LOG_ERROR,
diff --git a/opensm/osmeventplugin/src/osmeventplugin.c b/opensm/osmeventplugin/src/osmeventplugin.c
index 5adf87d..6cc4c70 100644
--- a/opensm/osmeventplugin/src/osmeventplugin.c
+++ b/opensm/osmeventplugin/src/osmeventplugin.c
@@ -171,7 +171,7 @@ static void report(void *_log, osm_epi_event_id_t event_id, void *event_data)
 /** =========================================================================
  * Define the object symbol for loading
  */
-__osm_epi_plugin_t osm_event_plugin = {
+osm_event_plugin_t osm_event_plugin = {
       interface_version:OSM_EVENT_PLUGIN_INTERFACE_VER,
       construct:construct,
       destroy:destroy,
-- 
1.5.3.4.206.g58ba4


From dillowda at ornl.gov  Fri Dec 21 12:39:04 2007
From: dillowda at ornl.gov (David Dillow)
Date: Fri, 21 Dec 2007 15:39:04 -0500
Subject: [ofa-general] [PATCH] IB/srp: add identifying information to log
	messages
Message-ID: <1198269544.9979.26.camel@lap75545.ornl.gov>

When you have multiple targets, it gets really confusing when you try to
track down who did a reset when there is no identifying information in
the log message, especially when the same extension ID is mapped through
two different local IB ports. So, add an identifier that can be used to
track back to which local IB port/remote target pair is the one having
problems.

Signed-off-by: David Dillow <dillowda at ornl.gov>
---
This is against the previous three patches to respect the credit limit
and allow scatter/gather. It may apply with offsets without those.

 ib_srp.c |   79 +++++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 52 insertions(+), 27 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 4f58f94..717f186 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -272,7 +272,8 @@ static void srp_path_rec_completion(int status,
 
 	target->status = status;
 	if (status)
-		printk(KERN_ERR PFX "Got failed path rec status %d\n", status);
+		printk(KERN_ERR PFX "scsi%d: Got failed path rec status %d\n",
+		       target->scsi_host->host_no, status);
 	else
 		target->path = *pathrec;
 	complete(&target->done);
@@ -303,7 +304,8 @@ static int srp_lookup_path(struct srp_target_port *target)
 	wait_for_completion(&target->done);
 
 	if (target->status < 0)
-		printk(KERN_WARNING PFX "Path record query failed\n");
+		printk(KERN_WARNING PFX "scsi%d: Path record query failed\n",
+		       target->scsi_host->host_no);
 
 	return target->status;
 }
@@ -400,7 +402,8 @@ static void srp_disconnect_target(struct srp_target_port *target)
 
 	init_completion(&target->done);
 	if (ib_send_cm_dreq(target->cm_id, NULL, 0)) {
-		printk(KERN_DEBUG PFX "Sending CM DREQ failed\n");
+		printk(KERN_DEBUG PFX "scsi%d: Sending CM DREQ failed\n",
+		       target->scsi_host->host_no);
 		return;
 	}
 	wait_for_completion(&target->done);
@@ -568,7 +571,8 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	return ret;
 
 err:
-	printk(KERN_ERR PFX "reconnect failed (%d), removing target port.\n", ret);
+	printk(KERN_ERR PFX "scsi%d: reconnect failed (%d), removing target "
+	       "port.\n", target->scsi_host->host_no, ret);
 
 	/*
 	 * We couldn't reconnect, so kill our target port off.
@@ -683,8 +687,8 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 
 	if (scmnd->sc_data_direction != DMA_FROM_DEVICE &&
 	    scmnd->sc_data_direction != DMA_TO_DEVICE) {
-		printk(KERN_WARNING PFX "Unhandled data direction %d\n",
-		       scmnd->sc_data_direction);
+		printk(KERN_WARNING PFX "scsi%d: Unhandled data direction %d\n",
+		       target->scsi_host->host_no, scmnd->sc_data_direction);
 		return -EINVAL;
 	}
 
@@ -786,7 +790,8 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 	} else {
 		scmnd = req->scmnd;
 		if (!scmnd)
-			printk(KERN_ERR "Null scmnd for RSP w/tag %016llx\n",
+			printk(KERN_ERR "scsi%d: Null scmnd for RSP w/tag "
+			       "%016llx\n", target->scsi_host->host_no,
 			       (unsigned long long) rsp->tag);
 		scmnd->result = rsp->status;
 
@@ -831,7 +836,8 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc)
 	if (0) {
 		int i;
 
-		printk(KERN_ERR PFX "recv completion, opcode 0x%02x\n", opcode);
+		printk(KERN_ERR PFX "scsi%d: recv completion, opcode 0x%02x\n",
+		       target->scsi_host->host_no, opcode);
 
 		for (i = 0; i < wc->byte_len; ++i) {
 			if (i % 8 == 0)
@@ -852,11 +858,13 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc)
 
 	case SRP_T_LOGOUT:
 		/* XXX Handle target logout */
-		printk(KERN_WARNING PFX "Got target logout request\n");
+		printk(KERN_WARNING PFX "scsi%d: Got target logout request\n",
+		       target->scsi_host->host_no);
 		break;
 
 	default:
-		printk(KERN_WARNING PFX "Unhandled SRP opcode 0x%02x\n", opcode);
+		printk(KERN_WARNING PFX "scsi%d: Unhandled SRP opcode 0x%02x\n",
+		       target->scsi_host->host_no, opcode);
 		break;
 	}
 
@@ -872,7 +880,8 @@ static void srp_completion(struct ib_cq *cq, void *target_ptr)
 	ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);
 	while (ib_poll_cq(cq, 1, &wc) > 0) {
 		if (wc.status) {
-			printk(KERN_ERR PFX "failed %s status %d\n",
+			printk(KERN_ERR PFX "scsi%d: failed %s status %d\n",
+			       target->scsi_host->host_no,
 			       wc.wr_id & SRP_OP_RECV ? "receive" : "send",
 			       wc.status);
 			target->qp_in_error = 1;
@@ -1027,12 +1036,14 @@ static int srp_queuecommand(struct scsi_cmnd *scmnd,
 
 	len = srp_map_data(scmnd, target, req);
 	if (len < 0) {
-		printk(KERN_ERR PFX "Failed to map data\n");
+		printk(KERN_ERR PFX "scsi%d: Failed to map data\n",
+		       target->scsi_host->host_no);
 		goto err;
 	}
 
 	if (__srp_post_recv(target)) {
-		printk(KERN_ERR PFX "Recv failed\n");
+		printk(KERN_ERR PFX "scsi%d: Recv failed\n",
+		       target->scsi_host->host_no);
 		goto err_unmap;
 	}
 
@@ -1040,7 +1051,8 @@ static int srp_queuecommand(struct scsi_cmnd *scmnd,
 				      DMA_TO_DEVICE);
 
 	if (__srp_post_send(target, iu, len)) {
-		printk(KERN_ERR PFX "Send failed\n");
+		printk(KERN_ERR PFX "scsi%d: Send failed\n",
+		       target->scsi_host->host_no);
 		goto err_unmap;
 	}
 
@@ -1171,7 +1183,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 
 	switch (event->event) {
 	case IB_CM_REQ_ERROR:
-		printk(KERN_DEBUG PFX "Sending CM REQ failed\n");
+		printk(KERN_DEBUG PFX "scsi%d: Sending CM REQ failed\n",
+		       target->scsi_host->host_no);
 		comp = 1;
 		target->status = -ECONNRESET;
 		break;
@@ -1186,7 +1199,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 			target->max_ti_iu_len = be32_to_cpu(rsp->max_ti_iu_len);
 			target->req_lim       = be32_to_cpu(rsp->req_lim_delta);
 		} else {
-			printk(KERN_WARNING PFX "Unhandled RSP opcode %#x\n", opcode);
+			printk(KERN_WARNING PFX "scsi%d: Unhandled RSP opcode "
+			       "%#x\n", target->scsi_host->host_no, opcode);
 			target->status = -ECONNRESET;
 			break;
 		}
@@ -1232,20 +1246,24 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 		break;
 
 	case IB_CM_REJ_RECEIVED:
-		printk(KERN_DEBUG PFX "REJ received\n");
+		printk(KERN_DEBUG PFX "scsi%d: REJ received\n",
+		       target->scsi_host->host_no);
 		comp = 1;
 
 		srp_cm_rej_handler(cm_id, event, target);
 		break;
 
 	case IB_CM_DREQ_RECEIVED:
-		printk(KERN_WARNING PFX "DREQ received - connection closed\n");
+		printk(KERN_WARNING PFX "scsi%d: DREQ received - connection "
+		       "closed\n", target->scsi_host->host_no);
 		if (ib_send_cm_drep(cm_id, NULL, 0))
-			printk(KERN_ERR PFX "Sending CM DREP failed\n");
+			printk(KERN_ERR PFX "scsi%d: Sending CM DREP failed\n",
+			       target->scsi_host->host_no);
 		break;
 
 	case IB_CM_TIMEWAIT_EXIT:
-		printk(KERN_ERR PFX "connection closed\n");
+		printk(KERN_ERR PFX "scsi%d: connection closed\n",
+		       target->scsi_host->host_no);
 
 		comp = 1;
 		target->status = 0;
@@ -1257,7 +1275,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 		break;
 
 	default:
-		printk(KERN_WARNING PFX "Unhandled CM event %d\n", event->event);
+		printk(KERN_WARNING PFX "scsi%d: Unhandled CM event %d\n",
+		       target->scsi_host->host_no, event->event);
 		break;
 	}
 
@@ -1334,7 +1353,8 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 	struct srp_request *req;
 	int ret = SUCCESS;
 
-	printk(KERN_ERR "SRP abort called\n");
+	printk(KERN_ERR "scsi%d: SRP abort called\n",
+	       target->scsi_host->host_no);
 
 	if (target->qp_in_error)
 		return FAILED;
@@ -1364,7 +1384,8 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 	struct srp_target_port *target = host_to_target(scmnd->device->host);
 	struct srp_request *req, *tmp;
 
-	printk(KERN_ERR "SRP reset_device called\n");
+	printk(KERN_ERR "scsi%d: SRP reset_device called\n",
+	       target->scsi_host->host_no);
 
 	if (target->qp_in_error)
 		return FAILED;
@@ -1391,7 +1412,8 @@ static int srp_reset_host(struct scsi_cmnd *scmnd)
 	struct srp_target_port *target = host_to_target(scmnd->device->host);
 	int ret = FAILED;
 
-	printk(KERN_ERR PFX "SRP reset_host called\n");
+	printk(KERN_ERR PFX "scsi%d: SRP reset_host called\n",
+	       target->scsi_host->host_no);
 
 	if (!srp_reconnect_target(target))
 		ret = SUCCESS;
@@ -1827,8 +1849,10 @@ static ssize_t srp_create_target(struct class_device *class_dev,
 
 	ib_get_cached_gid(host->dev->dev, host->port, 0, &target->path.sgid);
 
-	printk(KERN_DEBUG PFX "new target: id_ext %016llx ioc_guid %016llx pkey %04x "
-	       "service_id %016llx dgid %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
+	printk(KERN_DEBUG PFX "scsi%d: new target: id_ext %016llx "
+	       "ioc_guid %016llx pkey %04x service_id %016llx "
+	       "dgid %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
+	       target->scsi_host->host_no,
 	       (unsigned long long) be64_to_cpu(target->id_ext),
 	       (unsigned long long) be64_to_cpu(target->ioc_guid),
 	       be16_to_cpu(target->path.pkey),
@@ -1855,7 +1879,8 @@ static ssize_t srp_create_target(struct class_device *class_dev,
 	target->qp_in_error = 0;
 	ret = srp_connect_target(target);
 	if (ret) {
-		printk(KERN_ERR PFX "Connection failed\n");
+		printk(KERN_ERR PFX "scsi%d: Connection failed\n",
+		       target->scsi_host->host_no);
 		goto err_cm_id;
 	}
 

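The patch above repeats the same "scsi%d: " prefix and host_no argument
at every call site. As a userspace illustration only, the pattern could
be folded into one helper. Everything below is hypothetical: the demo_*
types, the helper name, and the assumed "ib_srp: " driver prefix are not
taken from the driver source.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical minimal stand-ins for the driver structures. */
struct demo_host { int host_no; };
struct demo_target { struct demo_host *scsi_host; };

/*
 * Format one log line as "ib_srp: scsi<N>: <message>" into buf.
 * Returns the length the full line would have, as snprintf() does,
 * or a negative value on an output error.
 */
static int demo_target_log(char *buf, size_t size,
			   const struct demo_target *target,
			   const char *fmt, ...)
{
	va_list va;
	int n, m;

	n = snprintf(buf, size, "ib_srp: scsi%d: ", target->scsi_host->host_no);
	if (n < 0 || (size_t)n >= size)
		return n;            /* error, or no room left for the message */

	va_start(va, fmt);
	m = vsnprintf(buf + n, size - n, fmt, va);
	va_end(va);

	return m < 0 ? m : n + m;
}
```

A call such as demo_target_log(buf, sizeof(buf), target, "reconnect
failed (%d)", ret) then carries the identifying host number without
repeating it in each format string.
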
From jackm at dev.mellanox.co.il  Fri Dec 21 12:51:23 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Fri, 21 Dec 2007 22:51:23 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	=?iso-8859-1?q?any=09one_user?= process
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FE241E196@G5W0278.americas.hpqcorp.net>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<200712212009.26816.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE241E196@G5W0278.americas.hpqcorp.net>
Message-ID: <200712212251.24330.jackm@dev.mellanox.co.il>

On Friday 21 December 2007 20:22, Tang, Changqing wrote:
> What we do for heart-beat is using zero-byte rdma_write, the message goes to the peer QP only, there is no need to post anything
> on remote side, no need for pinned memory.
> 
I'll look into this solution on Sunday (I've not used 0-byte rdma_writes myself yet).
(Question -- does the 0-byte rdma_write need to access a valid address
(i.e., region) on the remote side, even though it is zero-byte? Or are the
remote address and rkey fields "don't care" in the
post_send work request in this case?)

If you can, please send me a coding example at the userspace ibv (verbs) layer.
It will save time.
(jackm at dev.mellanox.co.il).

Thanks!

- Jack


From pradeeps at linux.vnet.ibm.com  Fri Dec 21 13:08:23 2007
From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana)
Date: Fri, 21 Dec 2007 13:08:23 -0800
Subject: [ofa-general] [PATCH] IPOIB/CM Enable SRQ support on HCAs with less
 than 16 SG entries
Message-ID: <476C2B47.5060507@linux.vnet.ibm.com>

Some HCAs like ehca2 support fewer than 16 SG entries. Currently IPoIB/CM
implicitly assumes all HCAs will support 16 SG entries of 4K pages for 64K 
MTUs. This patch removes that restriction.

This patch continues to use order 0 allocations and enables implementation of 
connected mode on such HCAs with smaller MTUs. HCAs having the capability to 
support 16 SG entries are left untouched.

This patch addresses bug #728:
https://bugs.openfabrics.org/show_bug.cgi?id=728

The main change from the previous version is that max_cm_mtu and
num_of_frags are now per interface, as Roland suggested. This version
also removes a bogus kfree() left over from a previous patch.

Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
---
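
For reference, the MTU arithmetic described above can be sketched in
userspace as follows. This is only an illustration and not part of the
patch: cm_mtu_for_sge() and the SKETCH_* constants are hypothetical names,
a 4K page size is assumed, and the cap of 16 stands in for IPOIB_CM_RX_SG.

```c
/* Assumed values for illustration only: 4K pages, the 16-entry cap
 * that IPOIB_CM_RX_SG imposes, and the 0x10 pad used for IPOIB_CM_MTU. */
#define SKETCH_PAGE_SIZE 4096
#define SKETCH_MAX_RX_SG 16
#define SKETCH_MTU_PAD   0x10

/* Hypothetical helper: clamp the SG limit reported for the SRQ and
 * derive the connected-mode MTU as max_sge * PAGE_SIZE - 0x10.
 * The clamped SG count is returned through num_of_frags. */
static int cm_mtu_for_sge(int max_sge, int *num_of_frags)
{
	if (max_sge > SKETCH_MAX_RX_SG)
		max_sge = SKETCH_MAX_RX_SG;
	*num_of_frags = max_sge;
	return max_sge * SKETCH_PAGE_SIZE - SKETCH_MTU_PAD;
}
```

Under these assumptions an HCA reporting 16 or more SG entries keeps the
existing limit of 16 * 4096 - 16 = 65520 bytes, while one reporting 4
entries is limited to 4 * 4096 - 16 = 16368 bytes.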

--- a/drivers/infiniband/ulp/ipoib/ipoib.h	2007-11-03 14:37:02.000000000 -0400
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h	2007-12-21 13:29:06.000000000 -0500
@@ -238,6 +238,8 @@ struct ipoib_cm_dev_priv {
 	struct ib_sge		rx_sge[IPOIB_CM_RX_SG];
 	struct ib_recv_wr       rx_wr;
 	int			nonsrq_conn_qp;
+	int			max_cm_mtu;
+	int			num_of_frags;
 };
 
 /*
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2007-11-21 10:46:35.000000000 -0500
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2007-12-21 15:45:30.000000000 -0500
@@ -96,13 +96,13 @@ static int ipoib_cm_post_receive_srq(str
 
 	priv->cm.rx_wr.wr_id = id | IPOIB_OP_CM | IPOIB_OP_RECV;
 
-	for (i = 0; i < IPOIB_CM_RX_SG; ++i)
+	for (i = 0; i < priv->cm.num_of_frags; ++i)
 		priv->cm.rx_sge[i].addr = priv->cm.srq_ring[id].mapping[i];
 
 	ret = ib_post_srq_recv(priv->cm.srq, &priv->cm.rx_wr, &bad_wr);
 	if (unlikely(ret)) {
 		ipoib_warn(priv, "post srq failed for buf %d (%d)\n", id, ret);
-		ipoib_cm_dma_unmap_rx(priv, IPOIB_CM_RX_SG - 1,
+		ipoib_cm_dma_unmap_rx(priv, priv->cm.num_of_frags - 1,
 				      priv->cm.srq_ring[id].mapping);
 		dev_kfree_skb_any(priv->cm.srq_ring[id].skb);
 		priv->cm.srq_ring[id].skb = NULL;
@@ -1399,16 +1399,16 @@ int ipoib_cm_add_mode_attr(struct net_de
 	return device_create_file(&dev->dev, &dev_attr_mode);
 }
 
-static void ipoib_cm_create_srq(struct net_device *dev)
+static void ipoib_cm_create_srq(struct net_device *dev, int max_sge)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	struct ib_srq_init_attr srq_init_attr = {
 		.attr = {
 			.max_wr  = ipoib_recvq_size,
-			.max_sge = IPOIB_CM_RX_SG
 		}
 	};
 
+	srq_init_attr.attr.max_sge = max_sge;
 	priv->cm.srq = ib_create_srq(priv->pd, &srq_init_attr);
 	if (IS_ERR(priv->cm.srq)) {
 		if (PTR_ERR(priv->cm.srq) != -ENOSYS)
@@ -1431,7 +1431,9 @@ static void ipoib_cm_create_srq(struct n
 int ipoib_cm_dev_init(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
-	int i;
+	int i, ret, num_of_frags;
+	struct ib_srq_attr srq_attr;
+	struct ib_device_attr attr;
 
 	INIT_LIST_HEAD(&priv->cm.passive_ids);
 	INIT_LIST_HEAD(&priv->cm.reap_list);
@@ -1448,22 +1450,52 @@ int ipoib_cm_dev_init(struct net_device 
 
 	skb_queue_head_init(&priv->cm.skb_queue);
 
-	for (i = 0; i < IPOIB_CM_RX_SG; ++i)
+	ret = ib_query_device(priv->ca, &attr);
+	if (ret) {
+		printk(KERN_WARNING "ib_query_device() failed with %d\n", ret);
+		return ret;
+	}
+
+	ipoib_dbg(priv, "max_srq_sge=%d\n", attr.max_srq_sge);
+
+	ipoib_cm_create_srq(dev, attr.max_srq_sge);
+
+	if (ipoib_cm_has_srq(dev)) {
+		ret = ib_query_srq(priv->cm.srq, &srq_attr);
+		if (ret) {
+			printk(KERN_WARNING "ib_query_srq() failed with %d\n", ret);
+			return -EINVAL;
+		}
+		if (srq_attr.max_sge > IPOIB_CM_RX_SG)
+			srq_attr.max_sge = IPOIB_CM_RX_SG;
+
+		/* pad similar to IPOIB_CM_MTU */
+		priv->cm.max_cm_mtu = srq_attr.max_sge * PAGE_SIZE - 0x10;
+		priv->cm.num_of_frags = srq_attr.max_sge;
+		ipoib_dbg(priv, "max_cm_mtu = 0x%x, num_of_frags=%d\n",
+			  priv->cm.max_cm_mtu, priv->cm.num_of_frags);
+	} else {
+		/* In the nonsrq case the num of SG elements is set at qp creation */
+		priv->cm.max_cm_mtu = IPOIB_CM_MTU;
+		priv->cm.num_of_frags  = IPOIB_CM_RX_SG;
+	}
+
+	num_of_frags = priv->cm.num_of_frags;
+
+	for (i = 0; i < num_of_frags; ++i)
 		priv->cm.rx_sge[i].lkey	= priv->mr->lkey;
 
 	priv->cm.rx_sge[0].length = IPOIB_CM_HEAD_SIZE;
-	for (i = 1; i < IPOIB_CM_RX_SG; ++i)
+	for (i = 1; i < num_of_frags; ++i)
 		priv->cm.rx_sge[i].length = PAGE_SIZE;
 	priv->cm.rx_wr.next = NULL;
 	priv->cm.rx_wr.sg_list = priv->cm.rx_sge;
-	priv->cm.rx_wr.num_sge = IPOIB_CM_RX_SG;
-
-	ipoib_cm_create_srq(dev);
+	priv->cm.rx_wr.num_sge = num_of_frags;
 
 	if (ipoib_cm_has_srq(dev)) {
 		for (i = 0; i < ipoib_recvq_size; ++i) {
 			if (!ipoib_cm_alloc_rx_skb(dev, priv->cm.srq_ring, i,
-						   IPOIB_CM_RX_SG - 1,
+						   num_of_frags - 1,
 						   priv->cm.srq_ring[i].mapping)) {
 				ipoib_warn(priv, "failed to allocate "
 					   "receive buffer %d\n", i);
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-12-19 17:02:15.000000000 -0500
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-12-21 00:08:04.000000000 -0500
@@ -182,12 +182,15 @@ static int ipoib_change_mtu(struct net_d
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 
 	/* dev->mtu > 2K ==> connected mode */
-	if (ipoib_cm_admin_enabled(dev) && new_mtu <= IPOIB_CM_MTU) {
-		if (new_mtu > priv->mcast_mtu)
-			ipoib_warn(priv, "mtu > %d will cause multicast packet drops.\n",
+	if (ipoib_cm_admin_enabled(dev)) {
+		if (new_mtu <= priv->cm.max_cm_mtu) {
+			if (new_mtu > priv->mcast_mtu)
+				ipoib_warn(priv, "mtu > %d will cause multicast packet drops.\n",
 				   priv->mcast_mtu);
-		dev->mtu = new_mtu;
-		return 0;
+			dev->mtu = new_mtu;
+			return 0;
+		} else
+			return -EINVAL;
 	}
 
 	if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) {


From pradeeps at linux.vnet.ibm.com  Fri Dec 21 13:25:54 2007
From: pradeeps at linux.vnet.ibm.com (Pradeep Satyanarayana)
Date: Fri, 21 Dec 2007 13:25:54 -0800
Subject: [ofa-general] [PATCH][RFC] IPOIB/CM increase retry counts
Message-ID: <476C2F62.2020900@linux.vnet.ibm.com>

I have seen sporadic errors while running the HCAs in connected mode.
These errors appear to be related to the speeds of the different HCAs.
Increasing the retry counts solves the problem.

I looked at the RFC's warnings about retries. Their purpose is to make
sure that the IB timeouts do not interfere with TCP timeouts. The TCP
timeouts are so much larger than the IB timeouts (even with non-zero
retry values) that we are nowhere close to interfering with them.

Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
---
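
A rough back-of-the-envelope sketch of the timing argument above, not part
of the patch. It relies on the IB encoding in which a local ACK timeout
field value n stands for 4.096 us * 2^n. The helper names are hypothetical,
and the exponent used in the note below is an assumed example, not a value
taken from this driver. It covers retry_count only; rnr_retry_count is
governed by the separate RNR NAK timer.

```c
#include <stdint.h>

/* IB encodes the local ACK timeout as 4.096 us * 2^exp (exp in 1..31;
 * 0 means "no timeout"). Nanoseconds keep this in integer arithmetic. */
static uint64_t ib_ack_timeout_ns(unsigned int exp)
{
	return 4096ULL << exp;
}

/* Hypothetical helper: time spent if the first attempt and each of the
 * retry_count retries wait one full ACK timeout before giving up.
 * Real HCAs may wait longer per attempt, so this is only an estimate. */
static uint64_t retry_budget_ns(unsigned int exp, unsigned int retry_count)
{
	return ib_ack_timeout_ns(exp) * (retry_count + 1ULL);
}
```

For example, with an assumed exponent of 14 each timeout is about 67 ms,
so retry_count = 3 gives up after roughly 268 ms.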

--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2007-12-21 16:06:49.000000000 -0500
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2007-12-21 16:07:28.000000000 -0500
@@ -990,8 +990,8 @@ static int ipoib_cm_send_req(struct net_
 	req.responder_resources		= 4;
 	req.remote_cm_response_timeout	= 20;
 	req.local_cm_response_timeout	= 20;
-	req.retry_count			= 0; /* RFC draft warns against retries */
-	req.rnr_retry_count		= 0; /* RFC draft warns against retries */
+	req.retry_count			= 3;
+	req.rnr_retry_count		= 3;
 	req.max_cm_retries		= 15;
 	req.srq				= ipoib_cm_has_srq(dev);
 	return ib_send_cm_req(id, &req);


From dillowda at ornl.gov  Fri Dec 21 13:52:53 2007
From: dillowda at ornl.gov (David Dillow)
Date: Fri, 21 Dec 2007 16:52:53 -0500
Subject: [ofa-general] list corruption on ib_srp load in v2.6.24-rc5
Message-ID: <1198273973.9979.34.camel@lap75545.ornl.gov>

I'm getting the oops below when running the following commands:

modprobe ib_srp
<add target(s) to ib_srp using sysfs>
rmmod ib_srp
modprobe ib_srp
<OOPS>

I'm going to try and track down how the list is getting corrupted; it
looks like attribute_container_list in
drivers/base/attribute_container.c is the one getting corrupted.

Before I get too far into this, has anyone seen this one before? I
looked at 'git diff v2.6.24-rc4..' and didn't see any changes that would
stand out as fixing it.


list_add corruption. prev->next should be next (ffffffff81423ad0), but was 0000000000000000. (prev=ffff8108464cddd8).
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:33!
invalid opcode: 0000 [1] SMP 
CPU 3 
Modules linked in: ib_srp sg sd_mod ib_iser libiscsi scsi_transport_iscsi rdma_ucm ib_ucm rdma_cm iw_cm ib_addr scsi_transport_srp scsi_mod ib_cm ib_ipoib ib_sa ib_uverbs ib_umad ib_mthca ib_mad ib_core ehci_hcd ohci_hcd nfs lockd nfs_acl sunrpc unionfs forcedeth
Pid: 3640, comm: modprobe Not tainted 2.6.24-rc5 #2
RIP: 0010:[<ffffffff810f53aa>]  [<ffffffff810f53aa>] __list_add+0x42/0x56
RSP: 0018:ffff810844077f18  EFLAGS: 00010292
RAX: 0000000000000079 RBX: ffff810846619000 RCX: ffffffff812b7308
RDX: ffffffff812b7308 RSI: 0000000000000046 RDI: ffffffff812b7300
RBP: ffffffff881432c0 R08: ffffffff812b72f0 R09: 0000000000000000
R10: 0000000100000000 R11: 0000000000000000 R12: 00002b54c0f72010
R13: 0000000000000000 R14: 00000000006180c8 R15: 00002b54c0f72010
FS:  00002b54c14bb6e0(0000) GS:ffff810846531840(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b54c0fcd00f CR3: 0000000843ddc000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 3640, threadinfo ffff810844076000, task ffff8108402444c0)
Stack:  0000000000000000 ffffffff8115170f ffff810846619000 ffffffff88138448
 0000000000000000 ffffffff88142080 000000000005b1cb ffffffff8809e00d
 ffffffff88142080 ffffffff81056af1 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff8115170f>] attribute_container_register+0x44/0x54
 [<ffffffff88138448>] :scsi_transport_srp:srp_attach_transport+0x74/0x115
 [<ffffffff8809e00d>] :ib_srp:srp_init_module+0xd/0xc1
 [<ffffffff81056af1>] sys_init_module+0xad/0x14b
 [<ffffffff8100be6e>] system_call+0x7e/0x83


Code: 0f 0b eb fe 48 89 7e 08 48 89 37 48 89 47 08 48 89 38 59 c3 
RIP  [<ffffffff810f53aa>] __list_add+0x42/0x56
 RSP <ffff810844077f18>
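
To make the first line of the oops easier to read: it comes from the
sanity checks in lib/list_debug.c that wrap __list_add(). The following is
a userspace sketch of that check, written from memory rather than copied
from the kernel source; checked_list_add() and its return convention are
hypothetical (the kernel calls BUG() instead of returning).

```c
#include <stdio.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Insert new_node between prev and next, but first verify that prev and
 * next still point at each other. Returns 0 on success, -1 if the list
 * is already corrupted (the kernel would BUG() here instead). */
static int checked_list_add(struct list_head *new_node,
			    struct list_head *prev,
			    struct list_head *next)
{
	if (next->prev != prev) {
		fprintf(stderr, "list_add corruption. next->prev should be "
			"prev (%p), but was %p. (next=%p).\n",
			(void *)prev, (void *)next->prev, (void *)next);
		return -1;
	}
	if (prev->next != next) {
		fprintf(stderr, "list_add corruption. prev->next should be "
			"next (%p), but was %p. (prev=%p).\n",
			(void *)next, (void *)prev->next, (void *)prev);
		return -1;
	}
	next->prev = new_node;
	new_node->next = next;
	new_node->prev = prev;
	prev->next = new_node;
	return 0;
}
```

In the trace above, prev->next was 0, so the second branch fired: the entry
at prev=ffff8108464cddd8 no longer links forward to next. That is
consistent with, but does not prove, an entry left on the list after rmmod
whose memory was later freed or reused.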


#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.24-rc5
# Fri Dec 21 14:46:03 2007
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
# CONFIG_QUICKLIST is not set
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_SUPPORTS_OPROFILE=y
CONFIG_ZONE_DMA32=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
# CONFIG_KTIME_SCALAR is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_TREE=y
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CGROUPS is not set
# CONFIG_FAIR_GROUP_SCHED is not set
CONFIG_SYSFS_DEPRECATED=y
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_BLK_DEV_BSG=y
CONFIG_BLOCK_COMPAT=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_X86_VSMP is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
CONFIG_MK8=y
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_L1_CACHE_BYTES=64
CONFIG_X86_INTERNODE_CACHE_BYTES=64
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_GART_IOMMU=y
CONFIG_CALGARY_IOMMU=y
CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT=y
CONFIG_SWIOTLB=y
CONFIG_NR_CPUS=64
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_INTEL is not set
CONFIG_X86_MCE_AMD=y
CONFIG_MICROCODE=m
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_NUMA=y
# CONFIG_K8_NUMA is not set
CONFIG_X86_64_ACPI_NUMA=y
# CONFIG_NUMA_EMU is not set
CONFIG_NODES_SHIFT=6
CONFIG_ARCH_DISCONTIGMEM_ENABLE=y
CONFIG_ARCH_DISCONTIGMEM_DEFAULT=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_SELECT_MEMORY_MODEL=y
# CONFIG_FLATMEM_MANUAL is not set
# CONFIG_DISCONTIGMEM_MANUAL is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
# CONFIG_SPARSEMEM_STATIC is not set
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_MIGRATION=y
CONFIG_RESOURCES_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_MTRR=y
# CONFIG_SECCOMP is not set
# CONFIG_CC_STACKPROTECTOR is not set
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_PHYSICAL_ALIGN=0x200000
CONFIG_HOTPLUG_CPU=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y

#
# Power management options
#
CONFIG_PM=y
# CONFIG_PM_LEGACY is not set
# CONFIG_PM_DEBUG is not set
CONFIG_SUSPEND_SMP_POSSIBLE=y
# CONFIG_SUSPEND is not set
CONFIG_HIBERNATION_SMP_POSSIBLE=y
# CONFIG_HIBERNATION is not set
CONFIG_ACPI=y
CONFIG_ACPI_PROCFS=y
CONFIG_ACPI_PROCFS_POWER=y
CONFIG_ACPI_PROC_EVENT=y
# CONFIG_ACPI_AC is not set
# CONFIG_ACPI_BATTERY is not set
# CONFIG_ACPI_BUTTON is not set
# CONFIG_ACPI_FAN is not set
# CONFIG_ACPI_DOCK is not set
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_NUMA=y
# CONFIG_ACPI_ASUS is not set
# CONFIG_ACPI_TOSHIBA is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
# CONFIG_ACPI_SBS is not set

#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
# CONFIG_DMAR is not set
CONFIG_PCIEPORTBUS=y
CONFIG_PCIEAER=y
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y
# CONFIG_PCI_DEBUG is not set
CONFIG_HT_IRQ=y
CONFIG_ISA_DMA_API=y
CONFIG_K8_NB=y
# CONFIG_PCCARD is not set
# CONFIG_HOTPLUG_PCI is not set

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_IA32_EMULATION=y
# CONFIG_IA32_AOUT is not set
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y

#
# Networking
#
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_ASK_IP_FIB_HASH=y
# CONFIG_IP_FIB_TRIE is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
# CONFIG_IP_ROUTE_VERBOSE is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_XFRM_TUNNEL is not set
# CONFIG_INET_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
CONFIG_INET_LRO=m
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
CONFIG_TCP_CONG_ADVANCED=y
CONFIG_TCP_CONG_BIC=m
CONFIG_TCP_CONG_CUBIC=y
CONFIG_TCP_CONG_WESTWOOD=m
CONFIG_TCP_CONG_HTCP=m
CONFIG_TCP_CONG_HSTCP=m
CONFIG_TCP_CONG_HYBLA=m
CONFIG_TCP_CONG_VEGAS=m
CONFIG_TCP_CONG_SCALABLE=m
CONFIG_TCP_CONG_LP=m
CONFIG_TCP_CONG_VENO=m
CONFIG_TCP_CONG_YEAH=m
CONFIG_TCP_CONG_ILLINOIS=m
# CONFIG_DEFAULT_BIC is not set
CONFIG_DEFAULT_CUBIC=y
# CONFIG_DEFAULT_HTCP is not set
# CONFIG_DEFAULT_VEGAS is not set
# CONFIG_DEFAULT_WESTWOOD is not set
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
# CONFIG_IPV6 is not set
# CONFIG_INET6_XFRM_TUNNEL is not set
# CONFIG_INET6_TUNNEL is not set
CONFIG_NETWORK_SECMARK=y
# CONFIG_NETFILTER is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_SCHED is not set

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_NET_TCPPROBE is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
CONFIG_FIB_RULES=y

#
# Wireless
#
# CONFIG_CFG80211 is not set
CONFIG_WIRELESS_EXT=y
# CONFIG_MAC80211 is not set
# CONFIG_IEEE80211 is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_DEBUG_DRIVER is not set
CONFIG_DEBUG_DEVRES=y
# CONFIG_SYS_HYPERVISOR is not set
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
# CONFIG_MTD is not set
# CONFIG_PARPORT is not set
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
# CONFIG_BLK_DEV_LOOP is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_UB is not set
# CONFIG_BLK_DEV_RAM is not set
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set
CONFIG_MISC_DEVICES=y
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_SGI_IOC4 is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_FUJITSU_LAPTOP is not set
# CONFIG_MSI_LAPTOP is not set
# CONFIG_SONY_LAPTOP is not set
# CONFIG_THINKPAD_ACPI is not set
# CONFIG_IDE is not set

#
# SCSI device support
#
CONFIG_RAID_ATTRS=m
CONFIG_SCSI=m
CONFIG_SCSI_DMA=y
CONFIG_SCSI_TGT=m
# CONFIG_SCSI_NETLINK is not set
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
CONFIG_CHR_DEV_SG=m
# CONFIG_CHR_DEV_SCH is not set

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y
CONFIG_SCSI_WAIT_SCAN=m

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=m
# CONFIG_SCSI_FC_ATTRS is not set
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m
CONFIG_SCSI_SAS_LIBSAS=m
# CONFIG_SCSI_SAS_LIBSAS_DEBUG is not set
CONFIG_SCSI_SRP_ATTRS=m
# CONFIG_SCSI_SRP_TGT_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_FC is not set
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_DEBUG is not set
CONFIG_SCSI_SRP=m
# CONFIG_ATA is not set
# CONFIG_MD is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# CONFIG_IEEE1394 is not set
# CONFIG_I2O is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
# CONFIG_NETDEVICES_MULTIQUEUE is not set
# CONFIG_DUMMY is not set
# CONFIG_BONDING is not set
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_VETH is not set
# CONFIG_NET_SB1000 is not set
# CONFIG_IP1000 is not set
# CONFIG_ARCNET is not set
# CONFIG_PHYLIB is not set
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
# CONFIG_IBM_NEW_EMAC_ZMII is not set
# CONFIG_IBM_NEW_EMAC_RGMII is not set
# CONFIG_IBM_NEW_EMAC_TAH is not set
# CONFIG_IBM_NEW_EMAC_EMAC4 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
CONFIG_FORCEDETH=m
# CONFIG_FORCEDETH_NAPI is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_SC92031 is not set
# CONFIG_NETDEV_1000 is not set
# CONFIG_NETDEV_10000 is not set
CONFIG_MLX4_CORE=m
# CONFIG_TR is not set

#
# Wireless LAN
#
# CONFIG_WLAN_PRE80211 is not set
# CONFIG_WLAN_80211 is not set

#
# USB Network Adapters
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_FF_MEMLESS=y
CONFIG_INPUT_POLLDEV=m

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
CONFIG_INPUT_MOUSE=y
# CONFIG_MOUSE_PS2 is not set
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_APPLETOUCH is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_COMPUTONE is not set
# CONFIG_ROCKETPORT is not set
# CONFIG_CYCLADES is not set
# CONFIG_DIGIEPCA is not set
# CONFIG_MOXA_INTELLIO is not set
# CONFIG_MOXA_SMARTIO is not set
# CONFIG_MOXA_SMARTIO_NEW is not set
# CONFIG_ISI is not set
# CONFIG_SYNCLINK is not set
# CONFIG_SYNCLINKMP is not set
# CONFIG_SYNCLINK_GT is not set
# CONFIG_N_HDLC is not set
# CONFIG_SPECIALIX is not set
# CONFIG_SX is not set
# CONFIG_RIO is not set
# CONFIG_STALDRV is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
CONFIG_SERIAL_8250_DETECT_IRQ=y
CONFIG_SERIAL_8250_RSA=y

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m
CONFIG_HW_RANDOM=y
# CONFIG_HW_RANDOM_INTEL is not set
CONFIG_HW_RANDOM_AMD=m
CONFIG_NVRAM=y
CONFIG_RTC=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_MWAVE is not set
# CONFIG_PC8736x_GPIO is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
# CONFIG_HPET_MMAP is not set
CONFIG_HANGCHECK_TIMER=m
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
# CONFIG_I2C is not set

#
# SPI support
#
# CONFIG_SPI is not set
# CONFIG_SPI_MASTER is not set
# CONFIG_W1 is not set
# CONFIG_POWER_SUPPLY is not set
# CONFIG_HWMON is not set
# CONFIG_WATCHDOG is not set

#
# Sonics Silicon Backplane
#
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_SM501 is not set

#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
# CONFIG_DVB_CORE is not set
# CONFIG_DAB is not set

#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
# CONFIG_AGP_INTEL is not set
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_DRM is not set
# CONFIG_VGASTATE is not set
# CONFIG_VIDEO_OUTPUT_CONTROL is not set
# CONFIG_FB is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
# CONFIG_LCD_CLASS_DEVICE is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_CORGI is not set
# CONFIG_BACKLIGHT_PROGEAR is not set

#
# Display device support
#
# CONFIG_DISPLAY_SUPPORT is not set

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_VIDEO_SELECT=y
CONFIG_DUMMY_CONSOLE=y

#
# Sound
#
# CONFIG_SOUND is not set
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
CONFIG_HID_DEBUG=y
# CONFIG_HIDRAW is not set

#
# USB Input Devices
#
CONFIG_USB_HID=y
CONFIG_USB_HIDINPUT_POWERBOOK=y
# CONFIG_HID_FF is not set
CONFIG_USB_HIDDEV=y
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_DEVICE_CLASS is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
CONFIG_USB_SUSPEND=y
# CONFIG_USB_PERSIST is not set
# CONFIG_USB_OTG is not set

#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=m
CONFIG_USB_EHCI_SPLIT_ISO=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
# CONFIG_USB_ISP116X_HCD is not set
CONFIG_USB_OHCI_HCD=m
# CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set
# CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_UHCI_HCD=m
# CONFIG_USB_SL811_HCD is not set
# CONFIG_USB_R8A66597_HCD is not set

#
# USB Device Class drivers
#
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set

#
# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support'
#

#
# may also be needed; see USB_STORAGE Help for more information
#
# CONFIG_USB_STORAGE is not set
# CONFIG_USB_LIBUSUAL is not set

#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USB_MON is not set

#
# USB port drivers
#

#
# USB Serial Converter support
#
# CONFIG_USB_SERIAL is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
# CONFIG_USB_AUERSWALD is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_BERRY_CHARGE is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYPRESS_CY7C63 is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_PHIDGET is not set
# CONFIG_USB_IDMOUSE is not set
# CONFIG_USB_FTDI_ELAN is not set
# CONFIG_USB_APPLEDISPLAY is not set
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_LD is not set
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
# CONFIG_USB_TEST is not set

#
# USB DSL modem support
#

#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
# CONFIG_MMC is not set
# CONFIG_NEW_LEDS is not set
CONFIG_INFINIBAND=m
CONFIG_INFINIBAND_USER_MAD=m
CONFIG_INFINIBAND_USER_ACCESS=m
CONFIG_INFINIBAND_USER_MEM=y
CONFIG_INFINIBAND_ADDR_TRANS=y
CONFIG_INFINIBAND_MTHCA=m
CONFIG_INFINIBAND_MTHCA_DEBUG=y
CONFIG_INFINIBAND_IPATH=m
CONFIG_INFINIBAND_AMSO1100=m
# CONFIG_INFINIBAND_AMSO1100_DEBUG is not set
CONFIG_MLX4_INFINIBAND=m
CONFIG_INFINIBAND_IPOIB=m
# CONFIG_INFINIBAND_IPOIB_CM is not set
CONFIG_INFINIBAND_IPOIB_DEBUG=y
CONFIG_INFINIBAND_IPOIB_DEBUG_DATA=y
CONFIG_INFINIBAND_SRP=m
CONFIG_INFINIBAND_ISER=m
CONFIG_EDAC=y

#
# Reporting subsystems
#
# CONFIG_EDAC_DEBUG is not set
CONFIG_EDAC_MM_EDAC=m
CONFIG_EDAC_E752X=m
CONFIG_EDAC_I82975X=m
CONFIG_EDAC_I5000=m
# CONFIG_RTC_CLASS is not set
# CONFIG_DMADEVICES is not set
# CONFIG_VIRTUALIZATION is not set

#
# Userspace I/O
#
# CONFIG_UIO is not set

#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_DELL_RBU is not set
# CONFIG_DCDBAS is not set
CONFIG_DMIID=y

#
# File systems
#
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
# CONFIG_EXT4DEV_FS is not set
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
# CONFIG_QUOTA is not set
CONFIG_DNOTIFY=y
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set
# CONFIG_FUSE_FS is not set
CONFIG_GENERIC_ACL=y

#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set

#
# DOS/FAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=y

#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFS_DIRECTIO=y
# CONFIG_NFSD is not set
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
CONFIG_SUNRPC_XPRT_RDMA=m
CONFIG_SUNRPC_BIND34=y
CONFIG_RPCSEC_GSS_KRB5=m
# CONFIG_RPCSEC_GSS_SPKM3 is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
# CONFIG_NLS is not set
# CONFIG_DLM is not set
CONFIG_INSTRUMENTATION=y
CONFIG_PROFILING=y
CONFIG_OPROFILE=m
CONFIG_KPROBES=y
# CONFIG_MARKERS is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
# CONFIG_PRINTK_TIME is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_UNUSED_SYMBOLS is not set
CONFIG_DEBUG_FS=y
# CONFIG_HEADERS_CHECK is not set
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SHIRQ=y
CONFIG_DETECT_SOFTLOCKUP=y
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
CONFIG_TIMER_STATS=y
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_RT_MUTEX_TESTER is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_VM is not set
CONFIG_DEBUG_LIST=y
# CONFIG_DEBUG_SG is not set
# CONFIG_FRAME_POINTER is not set
# CONFIG_FORCED_INLINING is not set
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_LKDTM is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_SAMPLES is not set
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_STACKOVERFLOW=y
# CONFIG_DEBUG_STACK_USAGE is not set
CONFIG_DEBUG_RODATA=y
# CONFIG_IOMMU_DEBUG is not set

#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITY_FILE_CAPABILITIES is not set
CONFIG_CRYPTO=y
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_BLKCIPHER=m
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_XCBC is not set
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_MD4 is not set
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=y
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_TGR192 is not set
CONFIG_CRYPTO_GF128MUL=m
# CONFIG_CRYPTO_ECB is not set
CONFIG_CRYPTO_CBC=m
# CONFIG_CRYPTO_PCBC is not set
# CONFIG_CRYPTO_LRW is not set
# CONFIG_CRYPTO_XTS is not set
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_DES=y
# CONFIG_CRYPTO_FCRYPT is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_TWOFISH_X86_64 is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES is not set
CONFIG_CRYPTO_AES_X86_64=y
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_ANUBIS is not set
# CONFIG_CRYPTO_SEED is not set
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_CAMELLIA is not set
# CONFIG_CRYPTO_TEST is not set
# CONFIG_CRYPTO_AUTHENC is not set
# CONFIG_CRYPTO_HW is not set

#
# Library routines
#
CONFIG_BITREVERSE=y
# CONFIG_CRC_CCITT is not set
# CONFIG_CRC16 is not set
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=y
CONFIG_PLIST=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y


From dillowda at ornl.gov  Fri Dec 21 13:54:49 2007
From: dillowda at ornl.gov (David Dillow)
Date: Fri, 21 Dec 2007 16:54:49 -0500
Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
In-Reply-To: <1198273973.9979.34.camel@lap75545.ornl.gov>
References: <1198273973.9979.34.camel@lap75545.ornl.gov>
Message-ID: <1198274089.9979.35.camel@lap75545.ornl.gov>


On Fri, 2007-12-21 at 16:52 -0500, David Dillow wrote:
> Before I get too far into this, has anyone seen this one before? I
> looked at 'git diff v2.6.24-rc4..' and didn't see any changes that would
> stand out as fixing it.

This should read v2.6.24-rc5...


From sashak at voltaire.com  Fri Dec 21 14:29:50 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Fri, 21 Dec 2007 22:29:50 +0000
Subject: [ofa-general] [PATCH] opensm: remove useless
	osm_node_get_remote_type()
Message-ID: <20071221222950.GL15888@sashak.voltaire.com>


Remove useless osm_node_get_remote_type() function.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/include/opensm/osm_node.h |   40 --------------------------------------
 opensm/opensm/osm_dump.c         |    2 +-
 2 files changed, 1 insertions(+), 41 deletions(-)

diff --git a/opensm/include/opensm/osm_node.h b/opensm/include/opensm/osm_node.h
index 8af5418..a900e03 100644
--- a/opensm/include/opensm/osm_node.h
+++ b/opensm/include/opensm/osm_node.h
@@ -478,46 +478,6 @@ osm_node_get_remote_base_lid(IN const osm_node_t * const p_node,
 *	Node object
 *********/
 
-/****f* OpenSM: Node/osm_node_get_remote_type
-* NAME
-*	osm_node_get_remote_type
-*
-* DESCRIPTION
-*	Returns the type of the node on the other side
-*	of the wire from the specified port on this node.
-*	The remote node must exist.
-*
-* SYNOPSIS
-*/
-static inline uint8_t
-osm_node_get_remote_type(IN const osm_node_t * const p_node,
-			 IN const uint8_t port_num)
-{
-	osm_node_t *p_remote_node;
-
-	p_remote_node = osm_node_get_remote_node(p_node, port_num, NULL);
-	CL_ASSERT(p_remote_node);
-	return (osm_node_get_type(p_remote_node));
-}
-
-/*
-* PARAMETERS
-*	p_node
-*		[in] Pointer to an osm_node_t object.
-*
-*	port_num
-*		[in] Local port number.
-*
-* RETURN VALUES
-*	Returns the type of the node on the other side
-*	of the wire from the specified port on this node.
-*
-* NOTES
-*
-* SEE ALSO
-*	Node object
-*********/
-
 /****f* OpenSM: Node/osm_node_get_lmc
 * NAME
 *	osm_node_get_lmc
diff --git a/opensm/opensm/osm_dump.c b/opensm/opensm/osm_dump.c
index fa07f83..0cf3d44 100644
--- a/opensm/opensm/osm_dump.c
+++ b/opensm/opensm/osm_dump.c
@@ -99,7 +99,7 @@ static void dump_ucast_path_distribution(cl_map_item_t * p_map_item, void *cxt)
 		remote_guid_ho =
 		    cl_ntoh64(osm_node_get_node_guid(p_remote_node));
 
-		switch (osm_node_get_remote_type(p_node, i)) {
+		switch (osm_node_get_type(p_remote_node)) {
 		case IB_NODE_TYPE_SWITCH:
 			osm_log_printf(&p_osm->log, OSM_LOG_DEBUG,
 				       " (link to switch");
-- 
1.5.3.4.206.g58ba4


From dave at thedillows.org  Fri Dec 21 14:18:52 2007
From: dave at thedillows.org (David Dillow)
Date: Fri, 21 Dec 2007 17:18:52 -0500
Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
In-Reply-To: <1198273973.9979.34.camel@lap75545.ornl.gov>
References: <1198273973.9979.34.camel@lap75545.ornl.gov>
Message-ID: <1198275532.9979.43.camel@lap75545.ornl.gov>


On Fri, 2007-12-21 at 16:52 -0500, David Dillow wrote:
> I'm getting the following oops when doing the following commands:
> 
> modprobe ib_srp
> <add targets(s) to ib_srp using sysfs>
> rmmod ib_srp
> modprobe ib_srp
> <OOPS>
> 
> I'm going to try and track down how the list is getting corrupted; it
> looks like attribute_container_list in
> drivers/base/attribute_container.c is the one getting corrupted.

Ok, found the culprit, now to figure out the motive and fix it.

ib_srp's srp_cleanup_module calls srp_release_transport(), which calls
transport_container_unregister() for the rport_attr_cont member of
struct srp_internal.

That last unregister call is returning -EBUSY, but it gets ignored, and
the list node gets erased (or just reused) when the module's text/memory
is freed.

Now, to see if ib_srp should be waiting for everything to be destroyed
before calling srp_release_transport(), or if it is just not removing
some attributes properly.

That's a next week thing if no one beats me to it.
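
A minimal user-space sketch of the failure mode described above, not the kernel code. All names here (struct container, container_unregister(), module_cleanup()) are hypothetical stand-ins. The only behaviour carried over from this thread is that the unregister call returns -EBUSY while something is still attached, and that the node then stays on the global list. It shows a cleanup path that checks the return value instead of discarding it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a container kept on a global registry list. */
struct container {
	struct container *next;  /* link in the global registry */
	int attached;            /* number of objects still attached */
};

static struct container *registry;  /* head of the global list */

static void container_register(struct container *c)
{
	c->next = registry;
	registry = c;
}

/* Refuses to unlink while objects are attached (the -EBUSY case). */
static int container_unregister(struct container *c)
{
	struct container **pp;

	if (c->attached > 0)
		return -EBUSY;
	for (pp = &registry; *pp; pp = &(*pp)->next) {
		if (*pp == c) {
			*pp = c->next;
			return 0;
		}
	}
	return -ENOENT;
}

/* Returns 1 if c is still reachable from the registry, 0 otherwise. */
static int registry_contains(const struct container *c)
{
	const struct container *it;

	for (it = registry; it; it = it->next)
		if (it == c)
			return 1;
	return 0;
}

/* Cleanup that reports the failure rather than silently ignoring it. */
static int module_cleanup(struct container *c)
{
	int err = container_unregister(c);

	if (err)
		fprintf(stderr, "unregister failed: %d, node still listed\n",
			err);
	return err;
}
```

With attached set to 1, module_cleanup() returns -EBUSY and registry_contains() still reports the node. If the memory holding that node were released at this point, the next walk of the list would touch freed memory, which matches the symptom seen on the second modprobe.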


From rdreier at cisco.com  Fri Dec 21 14:21:09 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 21 Dec 2007 14:21:09 -0800
Subject: [ofa-general] list corruption on ib_srp load in v2.6.24-rc5
In-Reply-To: <1198273973.9979.34.camel@lap75545.ornl.gov> (David Dillow's
	message of "Fri, 21 Dec 2007 16:52:53 -0500")
References: <1198273973.9979.34.camel@lap75545.ornl.gov>
Message-ID: <adabq8jy9ii.fsf@cisco.com>

 > I'm getting the following oops when doing the following commands:
 > 
 > modprobe ib_srp
 > <add targets(s) to ib_srp using sysfs>
 > rmmod ib_srp
 > modprobe ib_srp
 > <OOPS>
 > 
 > I'm going to try and track down how the list is getting corrupted; it
 > looks like attribute_container_list in
 > drivers/base/attribute_container.c is the one getting corrupted.
 > 
 > Before I get too far into this, has anyone seen this one before? I
 > looked at 'git diff v2.6.24-rc4..' and didn't see any changes that would
 > stand out as fixing it.

I haven't seen this, but I actually haven't done much srp testing with
post-2.6.23 kernels.  From a quick skim through the code, I would
guess that one of the calls to transport_container_unregister() in
srp_release_transport() (which is called on module unload) is
failing and leaving something bogus on the attribute_container_list.

This could be because the underlying call to attribute_container_unregister()
fails because the k_list is not empty.  I don't know if this is some
sort of leak in the ib_srp driver, or something broken elsewhere...

 - R.


From changquing.tang at hp.com  Fri Dec 21 18:51:38 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Sat, 22 Dec 2007 02:51:38 +0000
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any	one user process
In-Reply-To: <200712212251.24330.jackm@dev.mellanox.co.il>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<200712212009.26816.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE241E196@G5W0278.americas.hpqcorp.net>
	<200712212251.24330.jackm@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FE241E42A@G5W0278.americas.hpqcorp.net>


> -----Original Message-----
> From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il]
> Sent: Friday, December 21, 2007 2:51 PM
> To: Tang, Changqing
> Cc: pasha at dev.mellanox.co.il; general at lists.openfabrics.org
> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> independent of any one user process
>
> On Friday 21 December 2007 20:22, Tang, Changqing wrote:
> > What we do for heart-beat is using zero-byte rdma_write,
> the message
> > goes to the peer QP only, there is no need to post anything
> on remote side, no need for pinned memory.
> >
> I'll look into this solution on Sunday (I've not used 0-byte
> rdma_reads myself yet).
> (Question -- does the 0-byte rdma-read need to access a valid
> address (i.e., region) on the remote side, even if it is
> zero-byte? or are the remote address and rkey fields "don't
> care" in the post_send work request in this case?)

We need to ask Roland to confirm this.


>
> If you can, please send me a coding example at the userspace
> ibv (verbs) layer.
> It will save time.

I did not use zero byte rdma_read, I only use zero byte rdma_write.

Here is our code:

                sr.next = NULL;
                sr.wr_id = (uint64_t)(AULONG)rdmahdr;

                sr.sg_list = &ssg;
                sr.num_sge = 0;
                sr.opcode = IBV_WR_RDMA_WRITE;
                sr.send_flags = IBV_SEND_INLINE|IBV_SEND_SIGNALED;

                err = ibv_post_send(ibvproc->connection[i].qp_hndl, &sr, &bad_sr);
                if (err != 0) {
                        hpmp_printf("ibv_post_send() failed");
                        return (-1);
                }

Note, ssg is not initialized (Maybe we can set sr.sg_list = NULL ?)


--CQ


> (jackm at dev.mellanox.co.il).
>
> Thanks!
>
> - Jack
>


From kliteyn at mellanox.co.il  Fri Dec 21 21:43:11 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 22 Dec 2007 07:43:11 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-22:normal completion
Message-ID: <MTLEXCH01WySvX5w0Yg0000170b@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-21
OpenSM git rev = Mon_Dec_17_15:20:43_2007 [9988f459cb81dd025bde8b2dd53b3c551616be0c]
ibutils git rev = Wed_Dec_19_12:06:28_2007 [9961475294fbf1d3782edb8f377a77b13fa80d70]
 
 
Total=560  Pass=559  Fail=1
 
 
Pass:
42 Stability IS1-16.topo
42 Pkey IS1-16.topo
42 OsmTest IS1-16.topo
42 OsmStress IS1-16.topo
42 Multicast IS1-16.topo
42 LidMgr IS1-16.topo
14 Stability IS3-loop.topo
14 Stability IS3-128.topo
14 Pkey IS3-128.topo
14 OsmTest IS3-loop.topo
14 OsmTest IS3-128.topo
14 OsmStress IS3-128.topo
14 Multicast IS3-loop.topo
14 LidMgr IS3-128.topo
14 FatTree merge-roots-4-ary-2-tree.topo
14 FatTree merge-root-4-ary-3-tree.topo
14 FatTree gnu-stallion-64.topo
14 FatTree blend-4-ary-2-tree.topo
14 FatTree RhinoDDR.topo
14 FatTree FullGnu.topo
14 FatTree 4-ary-2-tree.topo
14 FatTree 2-ary-4-tree.topo
14 FatTree 12-node-spaced.topo
14 FTreeFail 4-ary-2-tree-missing-sw-link.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
14 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
13 Multicast IS3-128.topo

Failures:
1 Multicast IS3-128.topo


From jackm at dev.mellanox.co.il  Sat Dec 22 00:05:28 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Sat, 22 Dec 2007 10:05:28 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any one user process
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FE241E42A@G5W0278.americas.hpqcorp.net>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<200712212251.24330.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE241E42A@G5W0278.americas.hpqcorp.net>
Message-ID: <200712221005.29126.jackm@dev.mellanox.co.il>

On Saturday 22 December 2007 04:51, Tang, Changqing wrote:
> We need to ask Roland to confirm this.
> 
I'll speak to the firmware guys here for clarification.
> 
> 
> I did not use zero byte rdma_read, I only use zero byte rdma_write.
> 
> Here is our code:
> 
>                 sr.next = NULL;
>                 sr.wr_id = (uint64_t)(AULONG)rdmahdr;
> 
>                 sr.sg_list = &ssg;
>                 sr.num_sge = 0;
>                 sr.opcode = IBV_WR_RDMA_WRITE;
>                 sr.send_flags = IBV_SEND_INLINE|IBV_SEND_SIGNALED;
> 
>                 err = ibv_post_send(ibvproc->connection[i].qp_hndl, &sr, &bad_sr);
>                 if (err != 0) {
>                         hpmp_printf("ibv_post_send() failed");
>                         return (-1);
>                 }
> 
> Note, ssg is not initialized (Maybe we can set sr.sg_list = NULL ?)
You don't need to initialize ssg -- you have set num_sge to zero, so sg_list is not relevant.
> 
I notice that the rdma fields are also not initialized. 
That implies that no validity checking is done on the remote host side once
the length is zero.

I just looked at the ConnectX PRM, revision 0.35, section 8.4.1.11.  It specifically addresses the issue of
sending 0-byte RDMA reads/writes: send a wqe with NO data segments, and you will get a 0-length RDMA-write.
However, it does not mention whether the remote address and rkey provided in the WQE must be valid or not.

I'll check with F/W regarding this issue, too.

- Jack


From sashak at voltaire.com  Sat Dec 22 03:21:03 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 11:21:03 +0000
Subject: [ofa-general] [PATCH] opensm/osm_helper: make some functions static
Message-ID: <20071222112103.GO15888@sashak.voltaire.com>


Make some functions static.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/opensm/libopensm.map |    2 -
 opensm/opensm/osm_helper.c  |  220 +++++++++++++++++++++---------------------
 2 files changed, 110 insertions(+), 112 deletions(-)

diff --git a/opensm/opensm/libopensm.map b/opensm/opensm/libopensm.map
index 909b641..b3d4fe0 100644
--- a/opensm/opensm/libopensm.map
+++ b/opensm/opensm/libopensm.map
@@ -17,8 +17,6 @@ OPENSM_1.5 {
 		ib_get_sm_method_str;
 		ib_get_sm_attr_str;
 		ib_get_sa_attr_str;
-		osm_dbg_do_line;
-		osm_dbg_get_capabilities_str;
 		osm_dump_port_info;
 		osm_dump_portinfo_record;
 		osm_dump_guidinfo_record;
diff --git a/opensm/opensm/osm_helper.c b/opensm/opensm/osm_helper.c
index 5b85dc8..95c702c 100644
--- a/opensm/opensm/osm_helper.c
+++ b/opensm/opensm/osm_helper.c
@@ -495,11 +495,11 @@ const char *ib_get_sa_attr_str(IN ib_net16_t attr)
 
 /**********************************************************************
  **********************************************************************/
-ib_api_status_t
-osm_dbg_do_line(IN char **pp_local,
-		IN const uint32_t buf_size,
-		IN const char *const p_prefix_str,
-		IN const char *const p_new_str, IN uint32_t * const p_total_len)
+static ib_api_status_t
+dbg_do_line(IN char **pp_local,
+	    IN const uint32_t buf_size,
+	    IN const char *const p_prefix_str,
+	    IN const char *const p_new_str, IN uint32_t * const p_total_len)
 {
 	char line[LINE_LENGTH];
 	uint32_t len;
@@ -517,11 +517,11 @@ osm_dbg_do_line(IN char **pp_local,
 
 /**********************************************************************
  **********************************************************************/
-void
-osm_dbg_get_capabilities_str(IN char *p_buf,
-			     IN const uint32_t buf_size,
-			     IN const char *const p_prefix_str,
-			     IN const ib_port_info_t * const p_pi)
+static void
+dbg_get_capabilities_str(IN char *p_buf,
+			 IN const uint32_t buf_size,
+			 IN const char *const p_prefix_str,
+			 IN const ib_port_info_t * const p_pi)
 {
 	uint32_t total_len = 0;
 	char *p_local = p_buf;
@@ -530,195 +530,195 @@ osm_dbg_get_capabilities_str(IN char *p_buf,
 	p_local += strlen(p_local);
 
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV0) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV0\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV0\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_IS_SM) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_IS_SM\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_IS_SM\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_NOTICE) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_NOTICE\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_NOTICE\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_TRAP) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_TRAP\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_TRAP\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_IPD) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_IPD\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_IPD\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_AUTO_MIG) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_AUTO_MIG\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_AUTO_MIG\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_SL_MAP) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_SL_MAP\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_SL_MAP\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_NV_MKEY) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_NV_MKEY\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_NV_MKEY\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_NV_PKEY) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_NV_PKEY\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_NV_PKEY\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_LED_INFO) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_LED_INFO\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_LED_INFO\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_SM_DISAB) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_SM_DISAB\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_SM_DISAB\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_SYS_IMG_GUID) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_SYS_IMG_GUID\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_SYS_IMG_GUID\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_PKEY_SW_EXT_PORT_TRAP) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_PKEY_SW_EXT_PORT_TRAP\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_PKEY_SW_EXT_PORT_TRAP\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV13) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV13\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV13\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV14) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV14\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV14\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV15) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV15\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV15\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_COM_MGT) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_COM_MGT\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_COM_MGT\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_SNMP) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_SNMP\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_SNMP\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_REINIT) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_REINIT\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_REINIT\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_DEV_MGT) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_DEV_MGT\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_DEV_MGT\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_VEND_CLS) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_VEND_CLS\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_VEND_CLS\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_DR_NTC) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_DR_NTC\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_DR_NTC\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_CAP_NTC) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_CAP_NTC\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_CAP_NTC\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_BM) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_BM\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_BM\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_LINK_RT_LATENCY) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_LINK_RT_LATENCY\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_LINK_RT_LATENCY\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_HAS_CLIENT_REREG) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_HAS_CLIENT_REREG\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_HAS_CLIENT_REREG\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV26) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV26\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV26\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV27) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV27\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV27\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV28) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV28\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV28\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV29) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV29\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV29\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV30) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV30\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV30\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 	if (p_pi->capability_mask & IB_PORT_CAP_RESV31) {
-		if (osm_dbg_do_line(&p_local, buf_size, p_prefix_str,
-				    "IB_PORT_CAP_RESV31\n",
-				    &total_len) != IB_SUCCESS)
+		if (dbg_do_line(&p_local, buf_size, p_prefix_str,
+				"IB_PORT_CAP_RESV31\n",
+				&total_len) != IB_SUCCESS)
 			return;
 	}
 }
@@ -806,8 +806,8 @@ osm_dump_port_info(IN osm_log_t * const p_log,
 
 		/*  show the capabilities mask */
 		if (p_pi->capability_mask) {
-			osm_dbg_get_capabilities_str(buf, BUF_SIZE, "\t\t\t\t",
-						     p_pi);
+			dbg_get_capabilities_str(buf, BUF_SIZE, "\t\t\t\t",
+						 p_pi);
 			osm_log(p_log, log_level, "%s", buf);
 		}
 	}
@@ -894,8 +894,8 @@ osm_dump_portinfo_record(IN osm_log_t * const p_log,
 
 		/*  show the capabilities mask */
 		if (p_pi->capability_mask) {
-			osm_dbg_get_capabilities_str(buf, BUF_SIZE, "\t\t\t\t",
-						     p_pi);
+			dbg_get_capabilities_str(buf, BUF_SIZE, "\t\t\t\t",
+						 p_pi);
 			osm_log(p_log, log_level, "%s", buf);
 		}
 	}
-- 
1.5.3.4.206.g58ba4


From vlad at lists.openfabrics.org  Sat Dec 22 03:16:25 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sat, 22 Dec 2007 03:16:25 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071222-0200 daily build status
Message-ID: <20071222111625.4E2ABE6017C@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.18
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.19
Passed on x86_64 with linux-2.6.12
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.17
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.19
Passed on ppc64 with linux-2.6.15
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.15
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.14
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:


From sashak at voltaire.com  Sat Dec 22 03:17:39 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 11:17:39 +0000
Subject: [ofa-general] [PATCH] complib: make __cl_thread_wrapper() static
Message-ID: <20071222111739.GN15888@sashak.voltaire.com>


Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/complib/cl_thread.c    |    2 +-
 opensm/complib/libosmcomp.map |    1 -
 2 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/opensm/complib/cl_thread.c b/opensm/complib/cl_thread.c
index 004b118..ab1a329 100644
--- a/opensm/complib/cl_thread.c
+++ b/opensm/complib/cl_thread.c
@@ -47,7 +47,7 @@
  * This function is always run as a result of creation a new user mode thread.
  * Its main job is to synchronize the creation and running of the new thread.
  */
-void *__cl_thread_wrapper(void *arg)
+static void *__cl_thread_wrapper(void *arg)
 {
 	cl_thread_t *p_thread = (cl_thread_t *) arg;
 
diff --git a/opensm/complib/libosmcomp.map b/opensm/complib/libosmcomp.map
index 7ee845d..435c2fe 100644
--- a/opensm/complib/libosmcomp.map
+++ b/opensm/complib/libosmcomp.map
@@ -109,7 +109,6 @@ OSMCOMP_2.3 {
 		cl_spinlock_acquire;
 		cl_spinlock_release;
 		cl_status_text;
-		__cl_thread_wrapper;
 		cl_thread_construct;
 		cl_thread_init;
 		cl_thread_destroy;
-- 
1.5.3.4.206.g58ba4


From sashak at voltaire.com  Sat Dec 22 03:16:27 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 11:16:27 +0000
Subject: [ofa-general] [PATCH] management: kill __WORDSIZE macro checks
Message-ID: <20071222111627.GM15888@sashak.voltaire.com>


Kill #if __WORDSIZE == 64 checks. Use the PRIu64/PRIx64 format macros and
strtoull() unconditionally instead.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 infiniband-diags/src/ibping.c           |   23 ++----------
 libibcommon/include/infiniband/common.h |    5 ---
 libibcommon/src/sysfs.c                 |    5 ---
 libibmad/src/dump.c                     |   59 ++++++-------------------------
 libibmad/src/portid.c                   |    7 +---
 opensm/include/opensm/st.h              |    4 --
 opensm/opensm/osm_db_pack.c             |    6 +---
 opensm/opensm/osm_prtn_config.c         |   10 +-----
 opensm/opensm/osm_qos_parser.y          |    4 --
 opensm/opensm/osm_subnet.c              |    4 --
 10 files changed, 18 insertions(+), 109 deletions(-)

diff --git a/infiniband-diags/src/ibping.c b/infiniband-diags/src/ibping.c
index ea46002..ba32508 100644
--- a/infiniband-diags/src/ibping.c
+++ b/infiniband-diags/src/ibping.c
@@ -139,13 +139,8 @@ ibping(ib_portid_t *portid, int quiet)
 		memcpy(last_host, data, sizeof last_host);
 
 	if (!quiet)
-#if __WORDSIZE == 64
-		printf("Pong from %s (%s): time %lu.%03lu ms\n",
+		printf("Pong from %s (%s): time %" PRIu64 ".%03" PRIu64 " ms\n",
 			data, portid2str(portid), rtt/1000, rtt%1000);
-#else
-		printf("Pong from %s (%s): time %llu.%03llu ms\n",
-			data, portid2str(portid), rtt/1000, rtt%1000);
-#endif
 
 	return rtt;
 }
@@ -178,27 +173,15 @@ report(int sig)
 	DEBUG("out due signal %d", sig);
 
 	printf("\n--- %s (%s) ibping statistics ---\n", last_host, portid2str(&portid));
-#if __WORDSIZE == 64
-	printf("%lu packets transmitted, %lu received, %lu%% packet loss, time %lu ms\n",
+	printf("%" PRIu64 " packets transmitted, %" PRIu64 " received, %" PRIu64 "%% packet loss, time %" PRIu64 " ms\n",
 		ntrans, replied,
-		(lost != 0) ?  lost * 100ul / ntrans : 0ul, total_time / 1000ul);
-	printf("rtt min/avg/max = %lu.%03lu/%lu.%03lu/%lu.%03lu ms\n",
-		minrtt == ~0ull ? 0 : minrtt/1000,
-		minrtt == ~0ull ? 0 : minrtt%1000,
-		replied ? total_rtt/replied/1000 : 0,
-		replied ? (total_rtt/replied)%1000 : 0,
-		maxrtt/1000, maxrtt%1000);
-#else
-	printf("%llu packets transmitted, %llu received, %llu%% packet loss, time %llu ms\n",
-		(unsigned long long)ntrans, (unsigned long long)replied,
 		(lost != 0) ?  lost * 100ull / ntrans : 0ull, total_time / 1000ull);
-	printf("rtt min/avg/max = %llu.%03llu/%llu.%03llu/%llu.%03llu ms\n",
+	printf("rtt min/avg/max = %" PRIu64 ".%03" PRIu64 "/%" PRIu64 ".%03" PRIu64 "/%" PRIu64 ".%03" PRIu64 " ms\n",
 		minrtt == ~0ull ? 0 : minrtt/1000,
 		minrtt == ~0ull ? 0 : minrtt%1000,
 		replied ? total_rtt/replied/1000 : 0,
 		replied ? (total_rtt/replied)%1000 : 0,
 		maxrtt/1000, maxrtt%1000);
-#endif
 
 	exit(0);
 }
diff --git a/libibcommon/include/infiniband/common.h b/libibcommon/include/infiniband/common.h
index 4eb3872..01fc796 100644
--- a/libibcommon/include/infiniband/common.h
+++ b/libibcommon/include/infiniband/common.h
@@ -49,11 +49,6 @@
 
 BEGIN_C_DECLS
 
-/**
- * Byte alignments for structures
- */
-#define BUFALIGN __WORDSIZE
-
 #if __BYTE_ORDER == __LITTLE_ENDIAN
 #ifndef ntohll
 static inline uint64_t ntohll(uint64_t x) {
diff --git a/libibcommon/src/sysfs.c b/libibcommon/src/sysfs.c
index 49fc79c..f3f6711 100644
--- a/libibcommon/src/sysfs.c
+++ b/libibcommon/src/sysfs.c
@@ -59,11 +59,6 @@
 
 #include "common.h"
 
-#if __WORDSIZE == 64
-#define strtoll		strtol
-#define strtoull	strtoul
-#endif
-
 static int
 ret_code(void)
 {
diff --git a/libibmad/src/dump.c b/libibmad/src/dump.c
index 3d8593d..254106b 100644
--- a/libibmad/src/dump.c
+++ b/libibmad/src/dump.c
@@ -40,6 +40,7 @@
 #include <stdlib.h>
 #include <unistd.h>
 #include <string.h>
+#include <inttypes.h>
 #include <netinet/in.h>
 
 #include <mad.h>
@@ -63,11 +64,7 @@ mad_dump_int(char *buf, int bufsz, void *val, int valsz)
 	case 6:
 	case 7:
 	case 8:
-#if __WORDSIZE == 64
-		snprintf(buf, bufsz, "%lu", *(uint64_t *)val);
-#else
-		snprintf(buf, bufsz, "%llu", *(uint64_t *)val);
-#endif
+		snprintf(buf, bufsz, "%" PRIu64, *(uint64_t *)val);
 		break;
 	default:
 		IBWARN("bad int sz %d", valsz);
@@ -93,11 +90,7 @@ mad_dump_uint(char *buf, int bufsz, void *val, int valsz)
 	case 6:
 	case 7:
 	case 8:
-#if __WORDSIZE == 64
-		snprintf(buf, bufsz, "%lu", *(uint64_t *)val);
-#else
-		snprintf(buf, bufsz, "%llu", *(uint64_t *)val);
-#endif
+		snprintf(buf, bufsz, "%" PRIu64, *(uint64_t *)val);
 		break;
 	default:
 		IBWARN("bad int sz %u", valsz);
@@ -121,33 +114,18 @@ mad_dump_hex(char *buf, int bufsz, void *val, int valsz)
 	case 4:
 		snprintf(buf, bufsz, "0x%08x", *(uint32_t *)val);
 		break;
-#if __WORDSIZE == 64
 	case 5:
-		snprintf(buf, bufsz, "0x%010lx", *(uint64_t *)val & 0xfffffffffflu);
+		snprintf(buf, bufsz, "0x%010" PRIx64, *(uint64_t *)val & 0xffffffffffllu);
 		break;
 	case 6:
-		snprintf(buf, bufsz, "0x%012lx", *(uint64_t *)val & 0xfffffffffffflu);
+		snprintf(buf, bufsz, "0x%012" PRIx64, *(uint64_t *)val & 0xffffffffffffllu);
 		break;
 	case 7:
-		snprintf(buf, bufsz, "0x%014lx", *(uint64_t *)val & 0xfffffffffffffflu);
+		snprintf(buf, bufsz, "0x%014" PRIx64, *(uint64_t *)val & 0xffffffffffffffllu);
 		break;
 	case 8:
-		snprintf(buf, bufsz, "0x%016lx", *(uint64_t *)val);
+		snprintf(buf, bufsz, "0x%016" PRIx64, *(uint64_t *)val);
 		break;
-#else
-	case 5:
-		snprintf(buf, bufsz, "0x%010llx", *(uint64_t *)val & 0xffffffffffllu);
-		break;
-	case 6:
-		snprintf(buf, bufsz, "0x%012llx", *(uint64_t *)val & 0xffffffffffffllu);
-		break;
-	case 7:
-		snprintf(buf, bufsz, "0x%014llx", *(uint64_t *)val & 0xffffffffffffffllu);
-		break;
-	case 8:
-		snprintf(buf, bufsz, "0x%016llx", *(uint64_t *)val);
-		break;
-#endif
 	default:
 		IBWARN("bad int sz %d", valsz);
 		buf[0] = 0;
@@ -170,33 +148,18 @@ mad_dump_rhex(char *buf, int bufsz, void *val, int valsz)
 	case 4:
 		snprintf(buf, bufsz, "%08x", *(uint32_t *)val);
 		break;
-#if __WORDSIZE == 64
-	case 5:
-		snprintf(buf, bufsz, "%010lx", *(uint64_t *)val & 0xfffffffffflu);
-		break;
-	case 6:
-		snprintf(buf, bufsz, "%012lx", *(uint64_t *)val & 0xfffffffffffflu);
-		break;
-	case 7:
-		snprintf(buf, bufsz, "%014lx", *(uint64_t *)val & 0xfffffffffffffflu);
-		break;
-	case 8:
-		snprintf(buf, bufsz, "%016lx", *(uint64_t *)val);
-		break;
-#else
 	case 5:
-		snprintf(buf, bufsz, "%010llx", *(uint64_t *)val & 0xffffffffffllu);
+		snprintf(buf, bufsz, "%010" PRIx64, *(uint64_t *)val & 0xffffffffffllu);
 		break;
 	case 6:
-		snprintf(buf, bufsz, "%012llx", *(uint64_t *)val & 0xffffffffffffllu);
+		snprintf(buf, bufsz, "%012" PRIx64, *(uint64_t *)val & 0xffffffffffffllu);
 		break;
 	case 7:
-		snprintf(buf, bufsz, "%014llx", *(uint64_t *)val & 0xffffffffffffffllu);
+		snprintf(buf, bufsz, "%014" PRIx64, *(uint64_t *)val & 0xffffffffffffffllu);
 		break;
 	case 8:
-		snprintf(buf, bufsz, "%016llx", *(uint64_t *)val);
+		snprintf(buf, bufsz, "%016" PRIx64, *(uint64_t *)val);
 		break;
-#endif
 	default:
 		IBWARN("bad int sz %d", valsz);
 		buf[0] = 0;
diff --git a/libibmad/src/portid.c b/libibmad/src/portid.c
index dde1161..056b03d 100644
--- a/libibmad/src/portid.c
+++ b/libibmad/src/portid.c
@@ -41,6 +41,7 @@
 #include <pthread.h>
 #include <sys/time.h>
 #include <string.h>
+#include <inttypes.h>
 
 #include <mad.h>
 #include <infiniband/common.h>
@@ -70,11 +71,7 @@ portid2str(ib_portid_t *portid)
 	if (portid->lid > 0) {
 		s += sprintf(s, "Lid %d", portid->lid);
 		if (portid->grh_present) {
-#if __WORDSIZE == 64
-			s += sprintf(s, " Gid 0x%lx%lx",
-#else
-			s += sprintf(s, " Gid 0x%Lx%Lx",
-#endif
+			s += sprintf(s, " Gid 0x%" PRIx64 "%" PRIx64,
 					ntohll(*(uint64_t *)portid->gid),
 					ntohll(*(uint64_t *)(portid->gid+8)));
 		}
diff --git a/opensm/include/opensm/st.h b/opensm/include/opensm/st.h
index f32d3bb..30cc308 100644
--- a/opensm/include/opensm/st.h
+++ b/opensm/include/opensm/st.h
@@ -49,11 +49,7 @@
 #endif				/* __cplusplus */
 
 BEGIN_C_DECLS
-#if (__WORDSIZE == 64) || defined (_WIN64)
-#define st_ptr_t unsigned long long
-#else
 #define st_ptr_t unsigned long
-#endif
 typedef st_ptr_t st_data_t;
 
 #define ST_DATA_T_DEFINED
diff --git a/opensm/opensm/osm_db_pack.c b/opensm/opensm/osm_db_pack.c
index de5550f..bf56169 100644
--- a/opensm/opensm/osm_db_pack.c
+++ b/opensm/opensm/osm_db_pack.c
@@ -48,11 +48,7 @@ static inline void __osm_pack_guid(uint64_t guid, char *p_guid_str)
 
 static inline uint64_t __osm_unpack_guid(char *p_guid_str)
 {
-#if __WORDSIZE == 64
-	return (strtoul(p_guid_str, NULL, 0));
-#else
-	return (strtoull(p_guid_str, NULL, 0));
-#endif
+	return strtoull(p_guid_str, NULL, 0);
 }
 
 static inline void
diff --git a/opensm/opensm/osm_prtn_config.c b/opensm/opensm/osm_prtn_config.c
index 811b9eb..7889e35 100644
--- a/opensm/opensm/osm_prtn_config.c
+++ b/opensm/opensm/osm_prtn_config.c
@@ -56,14 +56,6 @@
 #include <opensm/osm_subnet.h>
 #include <opensm/osm_log.h>
 
-#if __WORDSIZE == 64
-#define STRTO_IB_NET64(str, end, base) strtoul(str, end, base)
-#else
-#define STRTO_IB_NET64(str, end, base) strtoull(str, end, base)
-#endif
-
-/*
- */
 struct part_conf {
 	osm_log_t *p_log;
 	osm_subn_t *p_subn;
@@ -230,7 +222,7 @@ static int partition_add_port(unsigned lineno, struct part_conf *conf,
 		guid = cl_ntoh64(conf->p_subn->sm_port_guid);
 	} else {
 		char *end;
-		guid = STRTO_IB_NET64(name, &end, 0);
+		guid = strtoull(name, &end, 0);
 		if (!guid || *end)
 			return -1;
 	}
diff --git a/opensm/opensm/osm_qos_parser.y b/opensm/opensm/osm_qos_parser.y
index edfa25c..8cae5f3 100644
--- a/opensm/opensm/osm_qos_parser.y
+++ b/opensm/opensm/osm_qos_parser.y
@@ -2420,11 +2420,7 @@ static char * __parser_strip_white(char * str)
 
 static void __parser_str2uint64(uint64_t * p_val, char * str)
 {
-#if __WORDSIZE == 64
-   *p_val = strtoul(str, NULL, 0);
-#else
    *p_val = strtoull(str, NULL, 0);
-#endif
 }
 
 /***************************************************
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index 2fe21bc..f9eb714 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -497,11 +497,7 @@ opts_unpack_net64(IN char *p_req_key,
 	uint64_t val;
 
 	if (!strcmp(p_req_key, p_key)) {
-#if __WORDSIZE == 64
-		val = strtoul(p_val_str, NULL, 0);
-#else
 		val = strtoull(p_val_str, NULL, 0);
-#endif
 		if (cl_hton64(val) != *p_val) {
 			char buff[128];
 			sprintf(buff,
-- 
1.5.3.4.206.g58ba4
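
For readers following the patch above, a minimal sketch of the two idioms it
standardizes on: the PRIu64 format macro from <inttypes.h> for printing
uint64_t values, and strtoull() for parsing them on both 32-bit and 64-bit
builds. This is not code from the tree; format_rtt() and parse_guid() are
hypothetical helpers written only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

/* Format a round-trip time given in microseconds as "<ms>.<usec> ms".
 * PRIu64 expands to the right conversion for uint64_t on any word size. */
static int format_rtt(char *buf, size_t bufsz, uint64_t rtt_us)
{
	assert(buf != NULL);
	return snprintf(buf, bufsz, "%" PRIu64 ".%03" PRIu64 " ms",
			rtt_us / 1000, rtt_us % 1000);
}

/* Parse a 64-bit GUID (hypothetical helper).
 * Returns 0 on success, -1 on empty, trailing-garbage or out-of-range input.
 * unsigned long long is at least 64 bits wide, so strtoull() is sufficient
 * without any word-size check. */
static int parse_guid(const char *str, uint64_t *guid)
{
	char *end;
	unsigned long long v;

	errno = 0;
	v = strtoull(str, &end, 0);
	if (end == str || *end != '\0' || errno == ERANGE)
		return -1;
	*guid = (uint64_t)v;
	return 0;
}
```

With these helpers, format_rtt(buf, sizeof buf, 1234567) writes "1234.567 ms"
into buf, and parse_guid("0x0002c90200001234", &g) stores the value in g with
no #if __WORDSIZE branch in either path.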


From sashak at voltaire.com  Sat Dec 22 03:35:32 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 11:35:32 +0000
Subject: [ofa-general] Re: [PATCH] opensm: Add "perfmgr print_counters node"
	to the console to print individual values
In-Reply-To: <20071217173014.037d4ae9.weiny2@llnl.gov>
References: <20071217173014.037d4ae9.weiny2@llnl.gov>
Message-ID: <20071222113532.GQ15888@sashak.voltaire.com>

On 17:30 Mon 17 Dec     , Ira Weiny wrote:
> From 14671d63a4315a98a7f8ed17ece2bd833aed39f2 Mon Sep 17 00:00:00 2001
> From: Ira K. Weiny <weiny2 at llnl.gov>
> Date: Fri, 14 Dec 2007 15:57:30 -0800
> Subject: [PATCH] Add "perfmgr print_counters node" to the console to print individual values
> 
> directly on the console.
> 
> Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>

Applied. Thanks.

Sasha


From sashak at voltaire.com  Sat Dec 22 04:42:28 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 12:42:28 +0000
Subject: [ofa-general] Re: [PATCH 3/3] OpenSM: Fix incorrect identification
	of routing engine used
In-Reply-To: <1198024898.28105.66.camel@cardanus.llnl.gov>
References: <1198024898.28105.66.camel@cardanus.llnl.gov>
Message-ID: <20071222124228.GR15888@sashak.voltaire.com>

Hi Al,

Some notes below...

On 16:41 Tue 18 Dec     , Al Chu wrote:
> Hey Sasha,
> 
> And like my previous series of patches, this patch 3/3 fixes several
> locations in the code that incorrectly determined what routing algorithm
> was used to route the subnet.
> 
> Thanks,
> Al
> -- 
> Albert Chu
> chu11 at llnl.gov
> 925-422-5311
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory

> From 4691381e56879de190c4968bd1e90c92995e3ac0 Mon Sep 17 00:00:00 2001
> From: Albert L. Chu <chu11 at llnl.gov>
> Date: Tue, 18 Dec 2007 10:48:02 -0800
> Subject: [PATCH] fix incorrect identification of routing engine
> 
> 
> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>
> ---
>  opensm/opensm/osm_dump.c           |    6 ++++--
>  opensm/opensm/osm_sa_path_record.c |   21 ++++++++++++---------
>  opensm/opensm/osm_ucast_lash.c     |    6 +++++-
>  3 files changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/opensm/opensm/osm_dump.c b/opensm/opensm/osm_dump.c
> index fa07f83..ed948d3 100644
> --- a/opensm/opensm/osm_dump.c
> +++ b/opensm/opensm/osm_dump.c
> @@ -49,6 +49,7 @@
>  #include <iba/ib_types.h>
>  #include <complib/cl_qmap.h>
>  #include <complib/cl_debug.h>
> +#include <complib/cl_passivelock.h>
>  #include <opensm/osm_opensm.h>
>  #include <opensm/osm_log.h>
>  #include <opensm/osm_node.h>
> @@ -150,8 +151,9 @@ static void dump_ucast_routes(cl_map_item_t * p_map_item, void *cxt)
>  		"LID    : Port : Hops : Optimal\n",
>  		cl_ntoh64(osm_node_get_node_guid(p_node)));
>  
> -	dor = (p_osm->routing_engine.name &&
> -	       (strcmp(p_osm->routing_engine.name, "dor") == 0));
> +        cl_plock_acquire(&p_osm->lock);
> +	dor = (p_osm->routing_engine_used == OSM_ROUTING_ENGINE_TYPE_DOR);
> +        cl_plock_release(&p_osm->lock);
>  
>  	for (lid_ho = 1; lid_ho <= max_lid_ho; lid_ho++) {
>  		fprintf(file, "0x%04X : ", lid_ho);
> diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
> index 4f20d8e..efd7c41 100644
> --- a/opensm/opensm/osm_sa_path_record.c
> +++ b/opensm/opensm/osm_sa_path_record.c
> @@ -240,6 +240,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
>  	const osm_physp_t *p_src_physp;
>  	const osm_physp_t *p_dest_physp;
>  	const osm_prtn_t *p_prtn = NULL;
> +	osm_opensm_t *p_osm;
>  	const ib_port_info_t *p_pi;
>  	ib_api_status_t status = IB_SUCCESS;
>  	ib_net16_t pkey;
> @@ -256,6 +257,7 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
>  	ib_slvl_table_t *p_slvl_tbl = NULL;
>  	osm_qos_level_t *p_qos_level = NULL;
>  	uint16_t valid_sl_mask = 0xffff;
> +	int is_lash;
>  
>  	OSM_LOG_ENTER(p_rcv->p_log, __osm_pr_rcv_get_path_parms);
>  
> @@ -266,6 +268,8 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
>  	p_src_physp = p_physp;
>  	p_pi = &p_physp->port_info;
>  
> +	p_osm = p_rcv->p_subn->p_osm;
> +
>  	mtu = ib_port_info_get_mtu_cap(p_pi);
>  	rate = ib_port_info_compute_rate(p_pi);
>  
> @@ -733,6 +737,10 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
>  	 * Set PathRecord SL.
>  	 */
>  
> +	cl_plock_acquire(&p_osm->lock);
> +	is_lash = (p_osm->routing_engine_used == OSM_ROUTING_ENGINE_TYPE_LASH);	
> +	cl_plock_release(&p_osm->lock);
> +

Here p_osm->lock is already held: look at the caller,
osm_pr_rcv_process(), where p_rcv->p_lock refers to &p_sm->lock. (Again,
I must admit that in OpenSM a lock and other pointers are copied across
various structures multiple times. This mess makes it very hard to track
where and which lock is actually used, and we need to improve this.)

On Linux OpenSM doesn't deadlock here because cl_plock_acquire() is
implemented via pthread_rwlock_rdlock() (a read lock, which can be
acquired multiple times). On other systems, or with another
cl_plock_acquire() implementation, this may not hold.

So I'm removing this locking here and in the other "already locked"
sections.

>  	if (comp_mask & IB_PR_COMPMASK_SL) {
>  		/*
>  		 * Specific SL was requested
> @@ -750,10 +758,8 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
>  			goto Exit;
>  		}
>  
> -		if (p_rcv->p_subn->opt.routing_engine_name &&
> -		    strcmp(p_rcv->p_subn->opt.routing_engine_name, "lash") == 0
> -		    && osm_get_lash_sl(p_rcv->p_subn->p_osm, p_src_port,
> -				       p_dest_port) != sl) {
> +		if (is_lash
> +		    && osm_get_lash_sl(p_osm, p_src_port, p_dest_port) != sl) {
>  			osm_log(p_rcv->p_log, OSM_LOG_ERROR,
>  				"__osm_pr_rcv_get_path_parms: ERR 1F23: "
>  				"Required PathRecord SL (%u) doesn't "
> @@ -762,16 +768,13 @@ __osm_pr_rcv_get_path_parms(IN osm_pr_rcv_t * const p_rcv,
>  			goto Exit;
>  		}
>  
> -	} else if (p_rcv->p_subn->opt.routing_engine_name &&
> -		   strcmp(p_rcv->p_subn->opt.routing_engine_name,
> -			  "lash") == 0) {
> +	} else if (is_lash) {
>  		/*
>  		 * No specific SL in PathRecord request.
>  		 * If it's LASH routing - use its SL.
>  		 * slid and dest_lid are stored in network in lash.
>  		 */
> -		sl = osm_get_lash_sl(p_rcv->p_subn->p_osm,
> -				     p_src_port, p_dest_port);
> +		sl = osm_get_lash_sl(p_osm, p_src_port, p_dest_port);
>  
>  	} else if (p_qos_level && p_qos_level->sl_set) {
>  		/*
> diff --git a/opensm/opensm/osm_ucast_lash.c b/opensm/opensm/osm_ucast_lash.c
> index 10dda3a..71e8b56 100644
> --- a/opensm/opensm/osm_ucast_lash.c
> +++ b/opensm/opensm/osm_ucast_lash.c
> @@ -1425,8 +1425,12 @@ uint8_t osm_get_lash_sl(osm_opensm_t * p_osm, osm_port_t * p_src_port,
>  	unsigned src_id;
>  	osm_switch_t *p_sw;
>  
> -	if (p_osm->routing_engine.ucast_build_fwd_tables != lash_process)
> +	cl_plock_acquire(&p_osm->lock);
> +	if (p_osm->routing_engine_used != OSM_ROUTING_ENGINE_TYPE_LASH) {
> +		cl_plock_release(&p_osm->lock);
>  		return OSM_DEFAULT_SL;
> +	}
> +	cl_plock_release(&p_osm->lock);

The same story is here too.

Sasha

>  
>  	p_sw = get_osm_switch_from_port(p_dst_port);
>  	if (!p_sw)
> -- 
> 1.5.1
> 


From sashak at voltaire.com  Sat Dec 22 05:28:04 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 13:28:04 +0000
Subject: [ofa-general] Re: [PATCH 1/3] OpenSM: osm routing engine type
In-Reply-To: <1198024892.28105.64.camel@cardanus.llnl.gov>
References: <1198024892.28105.64.camel@cardanus.llnl.gov>
Message-ID: <20071222132804.GS15888@sashak.voltaire.com>

On 16:41 Tue 18 Dec     , Al Chu wrote:
> Hey Sasha,
> 
> Here's the first patch in my set of patches to fix the incorrect routing
> engine reporting problem described in the earlier thread (thread: [PATCH
> 2/3] OpenSM: Fix incorrect reporting of routing engine/algorithm used).
> It's been redone as discussed in the thread.
> 
> This patch 1/3 just defines the enumeration of routing engine types, a
> few functions for mapping between enums and strings, and sticks a value
> into osm_opensm_t for tracking the routing engine.  The only interesting
> thing to note is due to current implementation, the function
> osm_routing_engine_type() considers a NULL pointer and the string "null"
> to mean that the minhop algorithm was specified.
> 
> Thanks,
> Al
> 
> -- 
> Albert Chu
> chu11 at llnl.gov
> 925-422-5311
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory

> From 2b8ffc06d734d3da9e5e5abba766548052aac923 Mon Sep 17 00:00:00 2001
> From: Albert L. Chu <chu11 at llnl.gov>
> Date: Tue, 18 Dec 2007 10:47:31 -0800
> Subject: [PATCH] support osm_routing_engine_type_t enumeration
> 
> 
> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>

Applied. Thanks.

Sasha


From sashak at voltaire.com  Sat Dec 22 05:28:40 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 13:28:40 +0000
Subject: [ofa-general] Re: [PATCH 2/3] OpenSM: Fix incorrect reporting of
	routing engine used
In-Reply-To: <1198024894.28105.65.camel@cardanus.llnl.gov>
References: <1198024894.28105.65.camel@cardanus.llnl.gov>
Message-ID: <20071222132840.GT15888@sashak.voltaire.com>

On 16:41 Tue 18 Dec     , Al Chu wrote:
> Hey Sasha,
> 
> This patch 2/3 fixes the incorrect reporting of what routing engine was
> used in the logs and in the console.  Based on which routing engine
> succeeded in osm_ucast_mgr_process(), the result is stored in the
> 'routing_engine_used' and that result is used for the eventual output.
> The lock in print_status() is across all p_osm data now and now uses
> p_osm->lock.  The logic has been reverted in osm_ucast_mgr_process().
> 
> Some "special case" handling had to be done in osm_ucast_mgr_process()
> to determine if a routing engine succeeded or failed.  It's not pretty.
> I figure when routing engine chains are supported later on some re-org
> in the routing engine code will have to be done, so this could be fixed
> more properly at that time.
> 
> Thanks,
> Al
> 
> -- 
> Albert Chu
> chu11 at llnl.gov
> 925-422-5311
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory

> From 84571863cfe2ca2f427ee900b83c109afaca1897 Mon Sep 17 00:00:00 2001
> From: Albert L. Chu <chu11 at llnl.gov>
> Date: Tue, 18 Dec 2007 10:47:47 -0800
> Subject: [PATCH] fix incorrect reporting of routing engine
> 
> 
> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>

Applied. Thanks.

Sasha


From sashak at voltaire.com  Sat Dec 22 05:29:20 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 13:29:20 +0000
Subject: [ofa-general] Re: [PATCH 3/3] OpenSM: Fix incorrect identification
	of routing engine used
In-Reply-To: <1198024898.28105.66.camel@cardanus.llnl.gov>
References: <1198024898.28105.66.camel@cardanus.llnl.gov>
Message-ID: <20071222132920.GU15888@sashak.voltaire.com>

On 16:41 Tue 18 Dec     , Al Chu wrote:
> Hey Sasha,
> 
> And like my previous series of patches, this patch 3/3 fixes several
> locations in the code that incorrectly determined what routing algorithm
> was used to route the subnet.
> 
> Thanks,
> Al
> -- 
> Albert Chu
> chu11 at llnl.gov
> 925-422-5311
> Computer Scientist
> High Performance Systems Division
> Lawrence Livermore National Laboratory

> From 4691381e56879de190c4968bd1e90c92995e3ac0 Mon Sep 17 00:00:00 2001
> From: Albert L. Chu <chu11 at llnl.gov>
> Date: Tue, 18 Dec 2007 10:48:02 -0800
> Subject: [PATCH] fix incorrect identification of routing engine
> 
> 
> Signed-off-by: Albert L. Chu <chu11 at llnl.gov>

Applied (with noted changes). Thanks.

Sasha


From sashak at voltaire.com  Sat Dec 22 05:33:50 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 13:33:50 +0000
Subject: [ofa-general] [PATCH] opensm: some micro-optimizations
In-Reply-To: <1198024892.28105.64.camel@cardanus.llnl.gov>
References: <1198024892.28105.64.camel@cardanus.llnl.gov>
Message-ID: <20071222133350.GV15888@sashak.voltaire.com>


- switch() instead of if/else if/else in osm_routing_engine_type_str()
- flow simplifications in osm_ucast_mgr_process()

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/opensm/osm_opensm.c    |   21 ++++++++++++---------
 opensm/opensm/osm_ucast_mgr.c |   31 ++++++++++---------------------
 2 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/opensm/opensm/osm_opensm.c b/opensm/opensm/osm_opensm.c
index 7a73040..a78307c 100644
--- a/opensm/opensm/osm_opensm.c
+++ b/opensm/opensm/osm_opensm.c
@@ -90,22 +90,25 @@ const static struct routing_engine_module routing_modules[] = {
  **********************************************************************/
 const char *osm_routing_engine_type_str(IN osm_routing_engine_type_t type)
 {
-	if (type == OSM_ROUTING_ENGINE_TYPE_NONE)
+	switch (type) {
+	case OSM_ROUTING_ENGINE_TYPE_NONE:
 		return "none";
-	else if (type == OSM_ROUTING_ENGINE_TYPE_MINHOP)
+	case OSM_ROUTING_ENGINE_TYPE_MINHOP:
 		return "minhop";
-	else if (type == OSM_ROUTING_ENGINE_TYPE_UPDN)
+	case OSM_ROUTING_ENGINE_TYPE_UPDN:
 		return "updn";
-	else if (type == OSM_ROUTING_ENGINE_TYPE_FILE)
+	case OSM_ROUTING_ENGINE_TYPE_FILE:
 		return "file";
-	else if (type == OSM_ROUTING_ENGINE_TYPE_FTREE)
+	case OSM_ROUTING_ENGINE_TYPE_FTREE:
 		return "ftree";
-	else if (type == OSM_ROUTING_ENGINE_TYPE_LASH)
+	case OSM_ROUTING_ENGINE_TYPE_LASH:
 		return "lash";
-	else if (type == OSM_ROUTING_ENGINE_TYPE_DOR)
+	case OSM_ROUTING_ENGINE_TYPE_DOR:
 		return "dor";
-	else
-		return "unknown";
+	default:
+		break;
+	}
+	return "unknown";
 }
 
 /**********************************************************************
diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
index dd914e0..1841219 100644
--- a/opensm/opensm/osm_ucast_mgr.c
+++ b/opensm/opensm/osm_ucast_mgr.c
@@ -793,25 +793,17 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
 
 	p_mgr->any_change = FALSE;
 
-	if (p_routing_eng->build_lid_matrices) {
-		blm = p_routing_eng->build_lid_matrices(p_routing_eng->context);
-		if (blm)
-			osm_ucast_mgr_build_lid_matrices(p_mgr);
-	} else
+	if (!p_routing_eng->build_lid_matrices ||
+	    (blm = p_routing_eng->build_lid_matrices(p_routing_eng->context)))
 		osm_ucast_mgr_build_lid_matrices(p_mgr);
 
 	/*
 	   Now that the lid matrices have been built, we can
 	   build and download the switch forwarding tables.
 	 */
-	if (p_routing_eng->ucast_build_fwd_tables) {
-		ubft =
-		    p_routing_eng->ucast_build_fwd_tables(p_routing_eng->
-							  context);
-		if (ubft)
-			cl_qmap_apply_func(p_sw_guid_tbl,
-					   __osm_ucast_mgr_process_tbl, p_mgr);
-	} else
+	if (!p_routing_eng->ucast_build_fwd_tables ||
+	    (ubft =
+	     p_routing_eng->ucast_build_fwd_tables(p_routing_eng->context)))
 		cl_qmap_apply_func(p_sw_guid_tbl, __osm_ucast_mgr_process_tbl,
 				   p_mgr);
 
@@ -819,14 +811,11 @@ osm_signal_t osm_ucast_mgr_process(IN osm_ucast_mgr_t * const p_mgr)
 	if (p_routing_eng->name && (strcmp(p_routing_eng->name, "file") == 0)
 	    && (!blm || !ubft))
 		p_osm->routing_engine_used = OSM_ROUTING_ENGINE_TYPE_FILE;
-	else {
-		if (!blm && !ubft)
-			p_osm->routing_engine_used =
-			    osm_routing_engine_type(p_routing_eng->name);
-		else
-			p_osm->routing_engine_used =
-			    OSM_ROUTING_ENGINE_TYPE_MINHOP;
-	}
+	else if (!blm && !ubft)
+		p_osm->routing_engine_used =
+		    osm_routing_engine_type(p_routing_eng->name);
+	else
+		p_osm->routing_engine_used = OSM_ROUTING_ENGINE_TYPE_MINHOP;
 
 	osm_log(p_mgr->p_log, OSM_LOG_INFO,
 		"osm_ucast_mgr_process: "
-- 
1.5.3.4.206.g58ba4
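
For reference, a self-contained sketch of the enum/string mapping this series
relies on. It is modelled on osm_routing_engine_type_str() from the patch
above and on the behaviour Al described for osm_routing_engine_type() (a NULL
pointer and the string "null" both mean minhop). The type and function names
here are hypothetical, and so is the ENGINE_UNKNOWN value returned for
unrecognized names.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enumeration modelled on osm_routing_engine_type_t. */
typedef enum {
	ENGINE_NONE,
	ENGINE_MINHOP,
	ENGINE_UPDN,
	ENGINE_FILE,
	ENGINE_FTREE,
	ENGINE_LASH,
	ENGINE_DOR,
	ENGINE_UNKNOWN		/* assumption: catch-all for bad names */
} engine_type_t;

/* Enum to string, using switch() with a single fall-through default. */
static const char *engine_type_str(engine_type_t type)
{
	switch (type) {
	case ENGINE_NONE:
		return "none";
	case ENGINE_MINHOP:
		return "minhop";
	case ENGINE_UPDN:
		return "updn";
	case ENGINE_FILE:
		return "file";
	case ENGINE_FTREE:
		return "ftree";
	case ENGINE_LASH:
		return "lash";
	case ENGINE_DOR:
		return "dor";
	default:
		break;
	}
	return "unknown";
}

/* String to enum; NULL and "null" are treated as minhop. */
static engine_type_t engine_type(const char *name)
{
	int i;

	if (!name || strcmp(name, "null") == 0)
		return ENGINE_MINHOP;

	for (i = ENGINE_NONE; i < ENGINE_UNKNOWN; i++) {
		const char *candidate = engine_type_str((engine_type_t)i);

		assert(candidate != NULL);	/* never NULL by construction */
		if (strcmp(name, candidate) == 0)
			return (engine_type_t)i;
	}
	return ENGINE_UNKNOWN;
}
```

Reusing engine_type_str() for the reverse lookup keeps the two directions in
sync, at the cost of a short linear scan that is negligible for seven entries.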


From sashak at voltaire.com  Sat Dec 22 05:35:49 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 22 Dec 2007 13:35:49 +0000
Subject: [ofa-general] [PATCH] opensm/updn: report fallback properly
In-Reply-To: <1198024892.28105.64.camel@cardanus.llnl.gov>
References: <1198024892.28105.64.camel@cardanus.llnl.gov>
Message-ID: <20071222133549.GW15888@sashak.voltaire.com>


Report fallback to upper layer (osm_ucast_mgr_process()) on up/down algo
failures or when root nodes were not found.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---
 opensm/opensm/osm_ucast_updn.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_ucast_updn.c
index 836fa62..5dd4705 100644
--- a/opensm/opensm/osm_ucast_updn.c
+++ b/opensm/opensm/osm_ucast_updn.c
@@ -570,6 +570,7 @@ static int __osm_updn_call(void *ctx)
 	updn_t *p_updn = ctx;
 	cl_map_item_t *p_item;
 	osm_switch_t *p_sw;
+	int ret = 0;
 
 	OSM_LOG_ENTER(&p_updn->p_osm->log, __osm_updn_call);
 
@@ -600,16 +601,18 @@ static int __osm_updn_call(void *ctx)
 	if (p_updn->updn_ucast_reg_inputs.num_guids > 0) {
 		osm_log(&(p_updn->p_osm->log), OSM_LOG_DEBUG,
 			"__osm_updn_call: " "activating UPDN algorithm\n");
-		__osm_subn_calc_up_down_min_hop_table(p_updn->
+		ret = __osm_subn_calc_up_down_min_hop_table(p_updn->
 						      updn_ucast_reg_inputs.
 						      num_guids,
 						      p_updn->
 						      updn_ucast_reg_inputs.
 						      guid_list, p_updn);
-	} else
+	} else {
 		osm_log(&p_updn->p_osm->log, OSM_LOG_INFO,
 			"__osm_updn_call: "
 			"disabling UPDN algorithm, no root nodes were found\n");
+		ret = 1;
+	}
 
 	p_item = cl_qmap_head(&p_updn->p_osm->subn.sw_guid_tbl);
 	while (p_item != cl_qmap_end(&p_updn->p_osm->subn.sw_guid_tbl)) {
@@ -619,7 +622,7 @@ static int __osm_updn_call(void *ctx)
 	}
 
 	OSM_LOG_EXIT(&p_updn->p_osm->log);
-	return 0;
+	return ret;
 }
 
 /**********************************************************************
-- 
1.5.3.4.206.g58ba4
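
The convention this patch relies on can be sketched in userspace roughly
as follows: a routing callback returns nonzero to say it produced no
tables, and the caller then falls back to its default path. This is a
deliberately simplified illustration; `build_tables_fn`, `pick_engine`,
`updn_like_build` and the enum values are hypothetical names, not
OpenSM's API.

```c
#include <stddef.h>

/* Hypothetical engine identifiers, for illustration only. */
enum engine_used { ENGINE_MINHOP, ENGINE_CUSTOM };

/* A callback returns 0 on success and nonzero to request fallback. */
typedef int (*build_tables_fn)(void *ctx);

/* Example callback: requests fallback when no root nodes are known. */
static int updn_like_build(void *ctx)
{
	const unsigned *num_roots = ctx;

	return (*num_roots > 0) ? 0 : 1;
}

/*
 * Run the optional callback and report which engine ended up in charge.
 * In this sketch a missing callback and a nonzero return both mean
 * "use the default".
 */
static enum engine_used pick_engine(build_tables_fn build, void *ctx)
{
	if (build == NULL || build(ctx) != 0)
		return ENGINE_MINHOP;
	return ENGINE_CUSTOM;
}
```

Returning a status instead of always returning 0 is what lets the caller
record the engine that was actually used, rather than the one that was
merely configured.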


From pw at osc.edu  Sat Dec 22 06:56:12 2007
From: pw at osc.edu (Pete Wyckoff)
Date: Sat, 22 Dec 2007 09:56:12 -0500
Subject: [ofa-general] [PATCH] IB/srp: add identifying information to
	log messages
In-Reply-To: <1198269544.9979.26.camel@lap75545.ornl.gov>
References: <1198269544.9979.26.camel@lap75545.ornl.gov>
Message-ID: <20071222145612.GA10085@osc.edu>

dillowda at ornl.gov wrote on Fri, 21 Dec 2007 15:39 -0500:
> When you have multiple targets, it gets really confusing when you try to
> track down who did a reset when there is no identifying information in
> the log message, especially when the same extension ID is mapped through
> two different local IB ports. So, add an identifier that can be used to
> track back to which local IB port/remote target pair is the one having
> problems.
> 
> Signed-off-by: David Dillow <dillowda at ornl.gov>
> ---
> This is against the previous three patches to respect the credit limit
> and allow scatter/gather. It may apply with offsets without those.
> 
>  ib_srp.c |   79 +++++++++++++++++++++++++++++++++++++++++----------------------
>  1 file changed, 52 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 4f58f94..717f186 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -272,7 +272,8 @@ static void srp_path_rec_completion(int status,
>  
>  	target->status = status;
>  	if (status)
> -		printk(KERN_ERR PFX "Got failed path rec status %d\n", status);
> +		printk(KERN_ERR PFX "scsi%d: Got failed path rec status %d\n",
> +		       target->scsi_host->host_no, status);

Good idea to fix these.

Could you use the standard dev_err(), dev_printk() and friends here
instead?  dev = &target->scsi_host->shost_gendev.  In fact, for
struct Scsi_host, you can do one better and use shost_printk().

		-- Pete
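
The idea behind a per-host logging helper can be sketched in userspace
as below: the helper adds the host prefix itself, so call sites no
longer have to format it by hand. This is only an illustration of the
pattern, not the kernel macro; `struct fake_host` and `host_log` are
hypothetical names.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the one field we need from a SCSI host. */
struct fake_host {
	unsigned int host_no;
};

/*
 * Format "scsi<N>: <message>" into buf. Returns a negative value on a
 * formatting error, otherwise a non-negative length. When len > 0 the
 * output is NUL-terminated and truncated to fit.
 */
static int host_log(char *buf, size_t len, const struct fake_host *host,
		    const char *fmt, ...)
{
	va_list ap;
	int n, m;

	n = snprintf(buf, len, "scsi%u: ", host->host_no);
	if (n < 0)
		return n;
	if ((size_t)n >= len)
		return n;	/* the prefix alone did not fit */

	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);

	return (m < 0) ? m : n + m;
}
```

A call such as host_log(buf, sizeof buf, &h, "Got failed path rec
status %d", status) then carries the host number without repeating it
in every format string.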


From pw at osc.edu  Sat Dec 22 07:15:59 2007
From: pw at osc.edu (Pete Wyckoff)
Date: Sat, 22 Dec 2007 10:15:59 -0500
Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
In-Reply-To: <1198275532.9979.43.camel@lap75545.ornl.gov>
References: <1198273973.9979.34.camel@lap75545.ornl.gov>
	<1198275532.9979.43.camel@lap75545.ornl.gov>
Message-ID: <20071222151559.GB10085@osc.edu>

dave at thedillows.org wrote on Fri, 21 Dec 2007 17:18 -0500:
> On Fri, 2007-12-21 at 16:52 -0500, David Dillow wrote:
> > I'm getting the following oops when doing the following commands:
> > 
> > modprobe ib_srp
> > <add targets(s) to ib_srp using sysfs>
> > rmmod ib_srp
> > modprobe ib_srp
> > <OOPS>
> > 
> > I'm going to try and track down how the list is getting corrupted; it
> > looks like attribute_container_list in
> > drivers/base/attribute_container.c is the one getting corrupted.
> 
> Ok, found the culprit, now to figure out the motive and fix it.
> 
> ib_srp's srp_cleanup_module calls srp_release_transport(), which calls
> transport_container_unregister() for the rport_attr_cont member of
> struct srp_internal.
> 
> That last unregister call is returning -EBUSY, but it gets ignored, and
> the list node gets erased (or just reused) when the module's text/memory
> is free'd.
> 
> Now, to see if ib_srp should be waiting for everything to be destroyed
> before calling srp_release_transport(), or if it is just not removing
> some attributes properly.

I don't see where srp_cleanup_module() is calling srp_remove_host().
That is the likely way that transport devices should be made to go
away.  Something on the order of srp_remove_work().

Or something like srp_remove_one(), but with an added call to
srp_remove_host(), may be necessary.  In fact, maybe just adding that
call will fix it, as
ib_unregister_client should drive the remove function.  Guesses, all
this.

		-- Pete
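
The failure mode discussed in this thread can be sketched in userspace
as follows: an unregister call refuses with -EBUSY while dependents
remain, so teardown has to remove the dependents first and check the
result. All names here (`struct container`, `child_add`, `teardown`,
and so on) are hypothetical; this is not the attribute_container or
SRP transport API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical container that tracks how many children still use it. */
struct container {
	int registered;
	int nr_children;
};

static void container_register(struct container *c)
{
	c->registered = 1;
	c->nr_children = 0;
}

static void child_add(struct container *c)
{
	c->nr_children++;
}

static void child_remove(struct container *c)
{
	if (c->nr_children > 0)
		c->nr_children--;
}

/* Refuse to unregister while children remain, and tell the caller. */
static int container_unregister(struct container *c)
{
	if (c->nr_children > 0)
		return -EBUSY;
	c->registered = 0;
	return 0;
}

/*
 * Teardown that removes every child first and then returns the result
 * of the unregister call instead of ignoring it.
 */
static int teardown(struct container *c)
{
	while (c->nr_children > 0)
		child_remove(c);
	return container_unregister(c);
}
```

If the -EBUSY result is dropped and the memory holding the container is
then freed, whatever list still links to it is left pointing at freed
memory, which matches the corruption described above.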


From tomof at acm.org  Sat Dec 22 08:41:03 2007
From: tomof at acm.org (FUJITA Tomonori)
Date: Sun, 23 Dec 2007 01:41:03 +0900
Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
In-Reply-To: <1198275532.9979.43.camel@lap75545.ornl.gov>
References: <1198273973.9979.34.camel@lap75545.ornl.gov>
	<1198275532.9979.43.camel@lap75545.ornl.gov>
Message-ID: <20071223014407L.tomof@acm.org>

On Fri, 21 Dec 2007 17:18:52 -0500
David Dillow <dave at thedillows.org> wrote:

> 
> On Fri, 2007-12-21 at 16:52 -0500, David Dillow wrote:
> > I'm getting the following oops when doing the following commands:
> > 
> > modprobe ib_srp
> > <add targets(s) to ib_srp using sysfs>
> > rmmod ib_srp
> > modprobe ib_srp
> > <OOPS>
> > 
> > I'm going to try and track down how the list is getting corrupted; it
> > looks like attribute_container_list in
> > drivers/base/attribute_container.c is the one getting corrupted.
> 
> Ok, found the culprit, now to figure out the motive and fix it.
> 
> ib_srp's srp_cleanup_module calls srp_release_transport(), which calls
> transport_container_unregister() for the rport_attr_cont member of
> struct srp_internal.
> 
> That last unregister call is returning -EBUSY, but it gets ignored, and
> the list node gets erased (or just reused) when the module's text/memory
> is free'd.

transport_container_unregister(&i->rport_attr_cont) should not fail here.

It fails because there is still a srp rport.

I think that as Pete pointed out, srp_remove_one needs to call
srp_remove_host.

Can you try this?

Thanks,

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 950228f..bdb6f85 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2053,6 +2053,7 @@ static void srp_remove_one(struct ib_device *device)
 
 		list_for_each_entry_safe(target, tmp_target,
 					 &host->target_list, list) {
+			srp_remove_host(target->scsi_host);
 			scsi_remove_host(target->scsi_host);
 			srp_disconnect_target(target);
 			ib_destroy_cm_id(target->cm_id);


From dave at thedillows.org  Sat Dec 22 09:32:48 2007
From: dave at thedillows.org (Dave Dillow)
Date: Sat, 22 Dec 2007 12:32:48 -0500
Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
In-Reply-To: <20071223014407L.tomof@acm.org>
References: <1198273973.9979.34.camel@lap75545.ornl.gov>
	<1198275532.9979.43.camel@lap75545.ornl.gov>
	<20071223014407L.tomof@acm.org>
Message-ID: <1198344768.10564.1.camel@obelisk.thedillows.org>

On Sun, 2007-12-23 at 01:41 +0900, FUJITA Tomonori wrote:
> I think that as Pete pointed out, srp_remove_one needs to call
> srp_remove_host.
> 
> Can you try this?

If I need to escape family during the holidays, I'll test it from home.
Otherwise I'll be able to test on Wednesday.

Thanks for the patch,
Dave

> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 950228f..bdb6f85 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -2053,6 +2053,7 @@ static void srp_remove_one(struct ib_device *device)
>  
>  		list_for_each_entry_safe(target, tmp_target,
>  					 &host->target_list, list) {
> +			srp_remove_host(target->scsi_host);
>  			scsi_remove_host(target->scsi_host);
>  			srp_disconnect_target(target);
>  			ib_destroy_cm_id(target->cm_id);


From dillowda at ornl.gov  Sat Dec 22 10:14:12 2007
From: dillowda at ornl.gov (Dave Dillow)
Date: Sat, 22 Dec 2007 13:14:12 -0500
Subject: [ofa-general] [PATCH] IB/srp: add identifying information to
	log messages
In-Reply-To: <20071222145612.GA10085@osc.edu>
References: <1198269544.9979.26.camel@lap75545.ornl.gov>
	<20071222145612.GA10085@osc.edu>
Message-ID: <1198347252.10564.5.camel@obelisk.thedillows.org>

On Sat, 2007-12-22 at 09:56 -0500, Pete Wyckoff wrote:
> dillowda at ornl.gov wrote on Fri, 21 Dec 2007 15:39 -0500:
> > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> > index 4f58f94..717f186 100644
> > --- a/drivers/infiniband/ulp/srp/ib_srp.c
> > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> > @@ -272,7 +272,8 @@ static void srp_path_rec_completion(int status,
> >  
> >  	target->status = status;
> >  	if (status)
> > -		printk(KERN_ERR PFX "Got failed path rec status %d\n", status);
> > +		printk(KERN_ERR PFX "scsi%d: Got failed path rec status %d\n",
> > +		       target->scsi_host->host_no, status);
> 
> Good idea to fix these.
> 
> Could you use the standard dev_err(), dev_printk() and friends here
> instead?  dev = &target->scsi_host->shost_gendev.  In fact, for

Did you mean to use those just for srp_path_rec_completion() or all of
them?

> struct Scsi_host, you can do one better and use shost_printk().

I'll look into using this. It'll be nice if it cleans things up.

The big thing I'm looking to implement is a stable identifier for a
given HBA port -> target port mapping, as the probing can be done
asynchronously, as can aborts/resets and such.

Dave


From arthur.jones at qlogic.com  Sat Dec 22 14:24:34 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Sat, 22 Dec 2007 14:24:34 -0800
Subject: [ofa-general] [PATCH] IB/ipath - more cleanups for 2.6.25
Message-ID: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>

hi roland,  here are some cleanups for 2.6.25...

these changes are avail for git pull from:

git://git.qlogic.com/ipath-linux-2.6 for-roland

arthur


From arthur.jones at qlogic.com  Sat Dec 22 14:24:39 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Sat, 22 Dec 2007 14:24:39 -0800
Subject: [ofa-general] [PATCH 1/5] IB/ipath - fix RNR NAK handling
In-Reply-To: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
References: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071222222439.17599.12330.stgit@eng-46.internal.keyresearch.com>

From: Ralph Campbell <ralph.campbell at qlogic.com>

This patch fixes a couple of minor problems with RNR NAK handling.
The insertion sort was causing extra delay when inserting ahead
vs. behind an existing entry on the list.
A resend of the first packet of a message for which the receiver is
still not ready needs another RNR NAK (i.e., it was being suppressed
when it shouldn't have been).
Also, the resend tasklet doesn't need to be woken up unless the
ACK/NAK actually indicates progress has been made.

Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_rc.c  |   18 +++++++-----------
 drivers/infiniband/hw/ipath/ipath_ruc.c |    6 +++++-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_rc.c b/drivers/infiniband/hw/ipath/ipath_rc.c
index 120a61b..459e46e 100644
--- a/drivers/infiniband/hw/ipath/ipath_rc.c
+++ b/drivers/infiniband/hw/ipath/ipath_rc.c
@@ -647,6 +647,7 @@ static void send_rc_ack(struct ipath_qp *qp)
 
 queue_ack:
 	spin_lock_irqsave(&qp->s_lock, flags);
+	dev->n_rc_qacks++;
 	qp->s_flags |= IPATH_S_ACK_PENDING;
 	qp->s_nak_state = qp->r_nak_state;
 	qp->s_ack_psn = qp->r_ack_psn;
@@ -798,11 +799,13 @@ bail:
 
 static inline void update_last_psn(struct ipath_qp *qp, u32 psn)
 {
-	if (qp->s_wait_credit) {
-		qp->s_wait_credit = 0;
-		tasklet_hi_schedule(&qp->s_task);
+	if (qp->s_last_psn != psn) {
+		qp->s_last_psn = psn;
+		if (qp->s_wait_credit) {
+			qp->s_wait_credit = 0;
+			tasklet_hi_schedule(&qp->s_task);
+		}
 	}
-	qp->s_last_psn = psn;
 }
 
 /**
@@ -1653,13 +1656,6 @@ void ipath_rc_rcv(struct ipath_ibdev *dev, struct ipath_ib_header *hdr,
 	case OP(SEND_FIRST):
 		if (!ipath_get_rwqe(qp, 0)) {
 		rnr_nak:
-			/*
-			 * A RNR NAK will ACK earlier sends and RDMA writes.
-			 * Don't queue the NAK if a RDMA read or atomic
-			 * is pending though.
-			 */
-			if (qp->r_nak_state)
-				goto done;
 			qp->r_nak_state = IB_RNR_NAK | qp->r_min_rnr_timer;
 			qp->r_ack_psn = qp->r_psn;
 			goto send_ack;
diff --git a/drivers/infiniband/hw/ipath/ipath_ruc.c b/drivers/infiniband/hw/ipath/ipath_ruc.c
index 1b4f7e1..a59bdbd 100644
--- a/drivers/infiniband/hw/ipath/ipath_ruc.c
+++ b/drivers/infiniband/hw/ipath/ipath_ruc.c
@@ -98,11 +98,15 @@ void ipath_insert_rnr_queue(struct ipath_qp *qp)
 		while (qp->s_rnr_timeout >= nqp->s_rnr_timeout) {
 			qp->s_rnr_timeout -= nqp->s_rnr_timeout;
 			l = l->next;
-			if (l->next == &dev->rnrwait)
+			if (l->next == &dev->rnrwait) {
+				nqp = NULL;
 				break;
+			}
 			nqp = list_entry(l->next, struct ipath_qp,
 					 timerwait);
 		}
+		if (nqp)
+			nqp->s_rnr_timeout -= qp->s_rnr_timeout;
 		list_add(&qp->timerwait, l);
 	}
 	spin_unlock_irqrestore(&dev->pending_lock, flags);
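
The last hunk touches a timeout queue in which each entry appears to
store its timeout relative to the entry before it. A minimal userspace
sketch of inserting into such a delta-encoded list, including the
adjustment of the successor that the hunk adds, might look like this.
`struct waiter` and `delta_insert` are hypothetical names, and a singly
linked list stands in for the kernel's list_head.

```c
#include <stddef.h>

/* Each node stores its timeout relative to the node before it. */
struct waiter {
	unsigned int delta;
	struct waiter *next;
};

/*
 * Insert w, whose delta field initially holds an absolute timeout,
 * into the delta-encoded list at *head. Entries with an equal expiry
 * keep FIFO order.
 */
static void delta_insert(struct waiter **head, struct waiter *w)
{
	struct waiter **pp = head;

	/* Skip entries that expire no later than w, consuming their deltas. */
	while (*pp != NULL && w->delta >= (*pp)->delta) {
		w->delta -= (*pp)->delta;
		pp = &(*pp)->next;
	}

	/* The successor, if any, now waits relative to w. */
	if (*pp != NULL)
		(*pp)->delta -= w->delta;

	w->next = *pp;
	*pp = w;
}
```

Without the successor adjustment, every entry behind the new one would
expire later than intended by the new entry's remaining delta, which is
consistent with the "extra delay" the description mentions.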


From arthur.jones at qlogic.com  Sat Dec 22 14:24:44 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Sat, 22 Dec 2007 14:24:44 -0800
Subject: [ofa-general] [PATCH 2/5] IB/ipath - cleanup ipath_get_egrbuf
In-Reply-To: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
References: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071222222444.17599.87814.stgit@eng-46.internal.keyresearch.com>

From: Ralph Campbell <ralph.campbell at qlogic.com>

Remove an unused parameter and fix up the comment.

Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_driver.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c
index 5b84be1..247403f 100644
--- a/drivers/infiniband/hw/ipath/ipath_driver.c
+++ b/drivers/infiniband/hw/ipath/ipath_driver.c
@@ -1006,12 +1006,10 @@ static void get_rhf_errstring(u32 err, char *msg, size_t len)
  * ipath_get_egrbuf - get an eager buffer
  * @dd: the infinipath device
  * @bufnum: the eager buffer to get
- * @err: unused
  *
  * must only be called if ipath_pd[port] is known to be allocated
  */
-static inline void *ipath_get_egrbuf(struct ipath_devdata *dd, u32 bufnum,
-				     int err)
+static inline void *ipath_get_egrbuf(struct ipath_devdata *dd, u32 bufnum)
 {
 	return dd->ipath_port0_skbinfo ?
 		(void *) dd->ipath_port0_skbinfo[bufnum].skb->data : NULL;
@@ -1159,7 +1157,7 @@ reloop:
 			etail = ipath_hdrget_index((__le32 *) rc);
 			if (tlen > sizeof(*hdr) ||
 			    etype == RCVHQ_RCV_TYPE_NON_KD)
-				ebuf = ipath_get_egrbuf(dd, etail, 0);
+				ebuf = ipath_get_egrbuf(dd, etail);
 		}
 
 		/*


From arthur.jones at qlogic.com  Sat Dec 22 14:24:49 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Sat, 22 Dec 2007 14:24:49 -0800
Subject: [ofa-general] [PATCH 3/5] IB/ipath - kreceive uses portdata rather
	than devdata
In-Reply-To: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
References: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071222222449.17599.41146.stgit@eng-46.internal.keyresearch.com>

From: Ralph Campbell <ralph.campbell at qlogic.com>

ipath_kreceive() now takes a struct ipath_portdata * instead of a
struct ipath_devdata *, plus other kreceive-related cleanups.

Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_driver.c    |   18 ++++++++++--------
 drivers/infiniband/hw/ipath/ipath_file_ops.c  |   11 ++++++++---
 drivers/infiniband/hw/ipath/ipath_init_chip.c |    7 +++----
 drivers/infiniband/hw/ipath/ipath_intr.c      |    8 +++++---
 drivers/infiniband/hw/ipath/ipath_kernel.h    |   17 ++++++++++++++---
 drivers/infiniband/hw/ipath/ipath_stats.c     |   11 ++++++-----
 6 files changed, 46 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c
index 247403f..4b37e46 100644
--- a/drivers/infiniband/hw/ipath/ipath_driver.c
+++ b/drivers/infiniband/hw/ipath/ipath_driver.c
@@ -1101,13 +1101,14 @@ static void ipath_rcv_hdrerr(struct ipath_devdata *dd,
 
 /*
  * ipath_kreceive - receive a packet
- * @dd: the infinipath device
+ * @pd: the infinipath port
  *
  * called from interrupt handler for errors or receive interrupt
  */
-void ipath_kreceive(struct ipath_devdata *dd)
+void ipath_kreceive(struct ipath_portdata *pd)
 {
 	u64 *rc;
+	struct ipath_devdata *dd = pd->port_dd;
 	void *ebuf;
 	const u32 rsize = dd->ipath_rcvhdrentsize;	/* words */
 	const u32 maxcnt = dd->ipath_rcvhdrcnt * rsize;	/* words */
@@ -1122,8 +1123,8 @@ void ipath_kreceive(struct ipath_devdata *dd)
 		goto bail;
 	}
 
-	l = dd->ipath_port0head;
-	hdrqtail = (u32) le64_to_cpu(*dd->ipath_hdrqtailptr);
+	l = pd->port_head;
+	hdrqtail = ipath_get_rcvhdrtail(pd);
 	if (l == hdrqtail)
 		goto bail;
 
@@ -1132,7 +1133,7 @@ reloop:
 		u32 qp;
 		u8 *bthbytes;
 
-		rc = (u64 *) (dd->ipath_pd[0]->port_rcvhdrq + (l << 2));
+		rc = (u64 *) (pd->port_rcvhdrq + (l << 2));
 		hdr = (struct ipath_message_header *)&rc[1];
 		/*
 		 * could make a network order version of IPATH_KD_QP, and
@@ -1242,7 +1243,7 @@ reloop:
 		 * earlier packets, we "almost" guarantee we have covered
 		 * that case.
 		 */
-		u32 hqtail = (u32)le64_to_cpu(*dd->ipath_hdrqtailptr);
+		u32 hqtail = ipath_get_rcvhdrtail(pd);
 		if (hqtail != hdrqtail) {
 			hdrqtail = hqtail;
 			reloop = 1; /* loop 1 extra time at most */
@@ -1252,7 +1253,7 @@ reloop:
 
 	pkttot += i;
 
-	dd->ipath_port0head = l;
+	pd->port_head = l;
 
 	if (pkttot > ipath_stats.sps_maxpkts_call)
 		ipath_stats.sps_maxpkts_call = pkttot;
@@ -1602,7 +1603,8 @@ int ipath_create_rcvhdrq(struct ipath_devdata *dd,
 
 	/* clear for security and sanity on each use */
 	memset(pd->port_rcvhdrq, 0, pd->port_rcvhdrq_size);
-	memset(pd->port_rcvhdrtail_kvaddr, 0, PAGE_SIZE);
+	if (pd->port_rcvhdrtail_kvaddr)
+		memset(pd->port_rcvhdrtail_kvaddr, 0, PAGE_SIZE);
 
 	/*
 	 * tell chip each time we init it, even if we are re-using previous
diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c
index 92dae6f..0624a60 100644
--- a/drivers/infiniband/hw/ipath/ipath_file_ops.c
+++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c
@@ -742,7 +742,8 @@ static int ipath_manage_rcvq(struct ipath_portdata *pd, unsigned subport,
 		 * updated and correct itself, even in the face of software
 		 * bugs.
 		 */
-		*(volatile u64 *)pd->port_rcvhdrtail_kvaddr = 0;
+		if (pd->port_rcvhdrtail_kvaddr)
+			ipath_clear_rcvhdrtail(pd);
 		set_bit(INFINIPATH_R_PORTENABLE_SHIFT + pd->port_port,
 			&dd->ipath_rcvctrl);
 	} else
@@ -1400,7 +1401,10 @@ static unsigned int ipath_poll_next(struct ipath_portdata *pd,
 	pollflag = ipath_poll_hdrqfull(pd);
 
 	head = ipath_read_ureg32(dd, ur_rcvhdrhead, pd->port_port);
-	tail = *(volatile u64 *)pd->port_rcvhdrtail_kvaddr;
+	if (pd->port_rcvhdrtail_kvaddr)
+		tail = ipath_get_rcvhdrtail(pd);
+	else
+		tail = ipath_read_ureg32(dd, ur_rcvhdrtail, pd->port_port);
 
 	if (head != tail)
 		pollflag |= POLLIN | POLLRDNORM;
@@ -1941,7 +1945,8 @@ static int ipath_do_user_init(struct file *fp,
 	 * We explictly set the in-memory copy to 0 beforehand, so we don't
 	 * have to wait to be sure the DMA update has happened.
 	 */
-	*(volatile u64 *)pd->port_rcvhdrtail_kvaddr = 0ULL;
+	if (pd->port_rcvhdrtail_kvaddr)
+		ipath_clear_rcvhdrtail(pd);
 	set_bit(INFINIPATH_R_PORTENABLE_SHIFT + pd->port_port,
 		&dd->ipath_rcvctrl);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_rcvctrl,
diff --git a/drivers/infiniband/hw/ipath/ipath_init_chip.c b/drivers/infiniband/hw/ipath/ipath_init_chip.c
index 1c65ab9..e161cad 100644
--- a/drivers/infiniband/hw/ipath/ipath_init_chip.c
+++ b/drivers/infiniband/hw/ipath/ipath_init_chip.c
@@ -526,12 +526,11 @@ static void enable_chip(struct ipath_devdata *dd,
 	 */
 	val = ipath_read_ureg32(dd, ur_rcvegrindextail, 0);
 	(void)ipath_write_ureg(dd, ur_rcvegrindexhead, val, 0);
-	dd->ipath_port0head = ipath_read_ureg32(dd, ur_rcvhdrtail, 0);
 
 	/* Initialize so we interrupt on next packet received */
 	(void)ipath_write_ureg(dd, ur_rcvhdrhead,
 			       dd->ipath_rhdrhead_intr_off |
-			       dd->ipath_port0head, 0);
+			       dd->ipath_pd[0]->port_head, 0);
 
 	/*
 	 * by now pioavail updates to memory should have occurred, so
@@ -693,7 +692,7 @@ done:
  */
 int ipath_init_chip(struct ipath_devdata *dd, int reinit)
 {
-	int ret = 0, i;
+	int ret = 0;
 	u32 val32, kpiobufs;
 	u32 piobufs, uports;
 	u64 val;
@@ -750,7 +749,7 @@ int ipath_init_chip(struct ipath_devdata *dd, int reinit)
 		kpiobufs = ipath_kpiobufs;
 
 	if (kpiobufs + (uports * IPATH_MIN_USER_PORT_BUFCNT) > piobufs) {
-		i = (int) piobufs -
+		int i = (int) piobufs -
 			(int) (uports * IPATH_MIN_USER_PORT_BUFCNT);
 		if (i < 0)
 			i = 0;
diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c
index 4795cb8..ec18b9b 100644
--- a/drivers/infiniband/hw/ipath/ipath_intr.c
+++ b/drivers/infiniband/hw/ipath/ipath_intr.c
@@ -683,7 +683,7 @@ static int handle_errors(struct ipath_devdata *dd, ipath_err_t errs)
 		for (i = 0; i < dd->ipath_cfgports; i++) {
 			struct ipath_portdata *pd = dd->ipath_pd[i];
 			if (i == 0) {
-				hd = dd->ipath_port0head;
+				hd = pd->port_head;
 				tl = (u32) le64_to_cpu(
 					*dd->ipath_hdrqtailptr);
 			} else if (pd && pd->port_cnt &&
@@ -712,6 +712,8 @@ static int handle_errors(struct ipath_devdata *dd, ipath_err_t errs)
 		}
 	}
 	if (errs & INFINIPATH_E_RRCVEGRFULL) {
+		struct ipath_portdata *pd = dd->ipath_pd[0];
+
 		/*
 		 * since this is of less importance and not likely to
 		 * happen without also getting hdrfull, only count
@@ -719,7 +721,7 @@ static int handle_errors(struct ipath_devdata *dd, ipath_err_t errs)
 		 * vs user)
 		 */
 		ipath_stats.sps_etidfull++;
-		if (dd->ipath_port0head !=
+		if (pd->port_head !=
 		    (u32) le64_to_cpu(*dd->ipath_hdrqtailptr))
 			chkerrpkts = 1;
 	}
@@ -1173,7 +1175,7 @@ irqreturn_t ipath_intr(int irq, void *data)
 	 * for receive are at the bottom.
 	 */
 	if (chk0rcv) {
-		ipath_kreceive(dd);
+		ipath_kreceive(dd->ipath_pd[0]);
 		istat &= ~port0rbits;
 	}
 
diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h
index 96553f4..e06dec1 100644
--- a/drivers/infiniband/hw/ipath/ipath_kernel.h
+++ b/drivers/infiniband/hw/ipath/ipath_kernel.h
@@ -167,6 +167,8 @@ struct ipath_portdata {
 	u32 active_slaves;
 	/* Type of packets or conditions we want to poll for */
 	u16 poll_type;
+	/* port rcvhdrq head offset */
+	u32 port_head;
 };
 
 struct sk_buff;
@@ -314,8 +316,6 @@ struct ipath_devdata {
 	 * supports, less gives more pio bufs/port, etc.
 	 */
 	u32 ipath_cfgports;
-	/* port0 rcvhdrq head offset */
-	u32 ipath_port0head;
 	/* count of port 0 hdrqfull errors */
 	u32 ipath_p0_hdrqfull;
 
@@ -690,7 +690,7 @@ void ipath_free_pddata(struct ipath_devdata *, struct ipath_portdata *);
 
 int ipath_parse_ushort(const char *str, unsigned short *valp);
 
-void ipath_kreceive(struct ipath_devdata *);
+void ipath_kreceive(struct ipath_portdata *);
 int ipath_setrcvhdrsize(struct ipath_devdata *, unsigned);
 int ipath_reset_device(int);
 void ipath_get_faststats(unsigned long);
@@ -928,6 +928,17 @@ static inline u32 ipath_read_creg32(const struct ipath_devdata *dd,
 		      (char __iomem *)dd->ipath_kregbase));
 }
 
+static inline void ipath_clear_rcvhdrtail(const struct ipath_portdata *pd)
+{
+	*((u64 *) pd->port_rcvhdrtail_kvaddr) = 0ULL;
+}
+
+static inline u32 ipath_get_rcvhdrtail(const struct ipath_portdata *pd)
+{
+	return (u32) le64_to_cpu(*((volatile __le64 *)
+				pd->port_rcvhdrtail_kvaddr));
+}
+
 /*
  * sysfs interface.
  */
diff --git a/drivers/infiniband/hw/ipath/ipath_stats.c b/drivers/infiniband/hw/ipath/ipath_stats.c
index f027141..fd89765 100644
--- a/drivers/infiniband/hw/ipath/ipath_stats.c
+++ b/drivers/infiniband/hw/ipath/ipath_stats.c
@@ -133,15 +133,16 @@ bail:
 static void ipath_qcheck(struct ipath_devdata *dd)
 {
 	static u64 last_tot_hdrqfull;
+	struct ipath_portdata *pd = dd->ipath_pd[0];
 	size_t blen = 0;
 	char buf[128];
 
 	*buf = 0;
-	if (dd->ipath_pd[0]->port_hdrqfull != dd->ipath_p0_hdrqfull) {
+	if (pd->port_hdrqfull != dd->ipath_p0_hdrqfull) {
 		blen = snprintf(buf, sizeof buf, "port 0 hdrqfull %u",
-				dd->ipath_pd[0]->port_hdrqfull -
+				pd->port_hdrqfull -
 				dd->ipath_p0_hdrqfull);
-		dd->ipath_p0_hdrqfull = dd->ipath_pd[0]->port_hdrqfull;
+		dd->ipath_p0_hdrqfull = pd->port_hdrqfull;
 	}
 	if (ipath_stats.sps_etidfull != dd->ipath_last_tidfull) {
 		blen += snprintf(buf + blen, sizeof buf - blen,
@@ -173,7 +174,7 @@ static void ipath_qcheck(struct ipath_devdata *dd)
 	if (blen)
 		ipath_dbg("%s\n", buf);
 
-	if (dd->ipath_port0head != (u32)
+	if (pd->port_head != (u32)
 	    le64_to_cpu(*dd->ipath_hdrqtailptr)) {
 		if (dd->ipath_lastport0rcv_cnt ==
 		    ipath_stats.sps_port0pkts) {
@@ -181,7 +182,7 @@ static void ipath_qcheck(struct ipath_devdata *dd)
 				   "port0 hd=%llx tl=%x; port0pkts %llx\n",
 				   (unsigned long long)
 				   le64_to_cpu(*dd->ipath_hdrqtailptr),
-				   dd->ipath_port0head,
+				   pd->port_head,
 				   (unsigned long long)
 				   ipath_stats.sps_port0pkts);
 		}
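
The accessor pattern this patch introduces can be sketched in userspace
as follows: the receive tail is read through one helper that decodes a
little-endian 64-bit value, with a fallback when no in-memory copy
exists, and "work pending" is simply head != tail. The struct and
function names below (`port_state`, `load_le64`, `port_tail`,
`port_has_work`) are hypothetical, and `tail_reg` merely stands in for
a register read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port state, loosely modelled on the patch. */
struct port_state {
	const volatile uint8_t *tail_mem; /* 8-byte LE tail, or NULL */
	uint32_t tail_reg;                /* stand-in for a register read */
	uint32_t head;                    /* software's consumer index */
};

/* Decode a little-endian 64-bit value on any host byte order. */
static uint64_t load_le64(const volatile uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Prefer the in-memory tail; without one, use the register copy. */
static uint32_t port_tail(const struct port_state *pd)
{
	if (pd->tail_mem != NULL)
		return (uint32_t)load_le64(pd->tail_mem);
	return pd->tail_reg;
}

/* Work is pending whenever the tail differs from our head. */
static int port_has_work(const struct port_state *pd)
{
	return pd->head != port_tail(pd);
}
```

Keeping the byte-order conversion and the volatile access in one helper
means callers cannot forget either, which is presumably the point of
replacing the open-coded reads in the patch.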


From arthur.jones at qlogic.com  Sat Dec 22 14:24:54 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Sat, 22 Dec 2007 14:24:54 -0800
Subject: [ofa-general] [PATCH 4/5] IB/ipath - generalize some macros_SHIFT
In-Reply-To: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
References: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071222222454.17599.82530.stgit@eng-46.internal.keyresearch.com>

From: Dave Olson <dave.olson at qlogic.com>

In preparation for upcoming chips that have different values of
INFINIPATH_R_PORTENABLE_SHIFT, INFINIPATH_R_INTRAVAIL_SHIFT,
INFINIPATH_R_TAILUPD_SHIFT, and portcfg_shift, remove the shared
#defines and use device-specific variables instead.
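
A minimal userspace sketch of the approach, assuming a per-chip
structure that carries its own bit positions: the register helpers take
the shift from the device rather than from a shared constant.
`struct chip` and the helper names are hypothetical, and the plain
read-modify-write here is not atomic, unlike the kernel's
set_bit()/clear_bit().

```c
#include <stdint.h>

/* Hypothetical per-chip description; field names echo the patch. */
struct chip {
	unsigned int r_portenable_shift;
	unsigned int r_tailupd_shift;
	uint64_t rcvctrl;	/* shadow copy of the control register */
};

/* Set the enable bit for one port at this chip's own bit position. */
static void port_enable(struct chip *c, unsigned int port)
{
	c->rcvctrl |= 1ULL << (c->r_portenable_shift + port);
}

/* Clear the enable bit for one port. */
static void port_disable(struct chip *c, unsigned int port)
{
	c->rcvctrl &= ~(1ULL << (c->r_portenable_shift + port));
}

/* Value to write while tail updates are temporarily masked off. */
static uint64_t rcvctrl_without_tailupd(const struct chip *c)
{
	return c->rcvctrl & ~(1ULL << c->r_tailupd_shift);
}
```

Each chip's init routine fills in the shifts once, so the shared code
paths stay the same when a later chip moves the bits.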

Signed-off-by: Dave Olson <dave.olson at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_file_ops.c  |   15 ++++++++-------
 drivers/infiniband/hw/ipath/ipath_iba6110.c   |   12 ++++++++++++
 drivers/infiniband/hw/ipath/ipath_iba6120.c   |    9 +++++++++
 drivers/infiniband/hw/ipath/ipath_init_chip.c |    6 +++---
 drivers/infiniband/hw/ipath/ipath_intr.c      |    2 +-
 drivers/infiniband/hw/ipath/ipath_kernel.h    |    6 ++++++
 drivers/infiniband/hw/ipath/ipath_registers.h |    3 +--
 7 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c
index 0624a60..22c2521 100644
--- a/drivers/infiniband/hw/ipath/ipath_file_ops.c
+++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c
@@ -744,10 +744,10 @@ static int ipath_manage_rcvq(struct ipath_portdata *pd, unsigned subport,
 		 */
 		if (pd->port_rcvhdrtail_kvaddr)
 			ipath_clear_rcvhdrtail(pd);
-		set_bit(INFINIPATH_R_PORTENABLE_SHIFT + pd->port_port,
+		set_bit(dd->ipath_r_portenable_shift + pd->port_port,
 			&dd->ipath_rcvctrl);
 	} else
-		clear_bit(INFINIPATH_R_PORTENABLE_SHIFT + pd->port_port,
+		clear_bit(dd->ipath_r_portenable_shift + pd->port_port,
 			  &dd->ipath_rcvctrl);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_rcvctrl,
 			 dd->ipath_rcvctrl);
@@ -1414,7 +1414,7 @@ static unsigned int ipath_poll_next(struct ipath_portdata *pd,
 		/* flush waiting flag so we don't miss an event */
 		wmb();
 
-		set_bit(pd->port_port + INFINIPATH_R_INTRAVAIL_SHIFT,
+		set_bit(pd->port_port + dd->ipath_r_intravail_shift,
 			&dd->ipath_rcvctrl);
 
 		ipath_write_kreg(dd, dd->ipath_kregs->kr_rcvctrl,
@@ -1947,10 +1947,11 @@ static int ipath_do_user_init(struct file *fp,
 	 */
 	if (pd->port_rcvhdrtail_kvaddr)
 		ipath_clear_rcvhdrtail(pd);
-	set_bit(INFINIPATH_R_PORTENABLE_SHIFT + pd->port_port,
+	set_bit(dd->ipath_r_portenable_shift + pd->port_port,
 		&dd->ipath_rcvctrl);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_rcvctrl,
-			 dd->ipath_rcvctrl & ~INFINIPATH_R_TAILUPD);
+			dd->ipath_rcvctrl &
+			~(1ULL << dd->ipath_r_tailupd_shift));
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_rcvctrl,
 			 dd->ipath_rcvctrl);
 	/* Notify any waiting slaves */
@@ -2059,9 +2060,9 @@ static int ipath_close(struct inode *in, struct file *fp)
 	if (dd->ipath_kregbase) {
 		int i;
 		/* atomically clear receive enable port and intr avail. */
-		clear_bit(INFINIPATH_R_PORTENABLE_SHIFT + port,
+		clear_bit(dd->ipath_r_portenable_shift + port,
 			  &dd->ipath_rcvctrl);
-		clear_bit(pd->port_port + INFINIPATH_R_INTRAVAIL_SHIFT,
+		clear_bit(pd->port_port + dd->ipath_r_intravail_shift,
 			  &dd->ipath_rcvctrl);
 		ipath_write_kreg( dd, dd->ipath_kregs->kr_rcvctrl,
 			dd->ipath_rcvctrl);
diff --git a/drivers/infiniband/hw/ipath/ipath_iba6110.c b/drivers/infiniband/hw/ipath/ipath_iba6110.c
index ddbebe4..c272a73 100644
--- a/drivers/infiniband/hw/ipath/ipath_iba6110.c
+++ b/drivers/infiniband/hw/ipath/ipath_iba6110.c
@@ -296,6 +296,12 @@ static const struct ipath_cregs ipath_ht_cregs = {
 #define INFINIPATH_RT_BUFSIZE_MASK 0x3FFFULL
 #define INFINIPATH_RT_BUFSIZE_SHIFT 48
 
+#define INFINIPATH_R_INTRAVAIL_SHIFT 16
+#define INFINIPATH_R_TAILUPD_SHIFT 31
+
+/* kr_xgxsconfig bits */
+#define INFINIPATH_XGXS_RESET          0x7ULL
+
 /*
  * masks and bits that are different in different chips, or present only
  * in one
@@ -1079,6 +1085,12 @@ static void ipath_init_ht_variables(struct ipath_devdata *dd)
 	dd->ipath_gpio_sda = IPATH_GPIO_SDA;
 	dd->ipath_gpio_scl = IPATH_GPIO_SCL;
 
+	/* Fill in shifts for RcvCtrl. */
+	dd->ipath_r_portenable_shift = INFINIPATH_R_PORTENABLE_SHIFT;
+	dd->ipath_r_intravail_shift = INFINIPATH_R_INTRAVAIL_SHIFT;
+	dd->ipath_r_tailupd_shift = INFINIPATH_R_TAILUPD_SHIFT;
+	dd->ipath_r_portcfg_shift = 0; /* Not on IBA6110 */
+
 	dd->ipath_i_bitsextant =
 		(INFINIPATH_I_RCVURG_MASK << INFINIPATH_I_RCVURG_SHIFT) |
 		(INFINIPATH_I_RCVAVAIL_MASK <<
diff --git a/drivers/infiniband/hw/ipath/ipath_iba6120.c b/drivers/infiniband/hw/ipath/ipath_iba6120.c
index 0103d6f..e6893eb 100644
--- a/drivers/infiniband/hw/ipath/ipath_iba6120.c
+++ b/drivers/infiniband/hw/ipath/ipath_iba6120.c
@@ -296,6 +296,9 @@ static const struct ipath_cregs ipath_pe_cregs = {
 #define IPATH_GPIO_SCL (1ULL << \
 	(_IPATH_GPIO_SCL_NUM+INFINIPATH_EXTC_GPIOOE_SHIFT))
 
+#define INFINIPATH_R_INTRAVAIL_SHIFT 16
+#define INFINIPATH_R_TAILUPD_SHIFT 31
+
 /* 6120 specific hardware errors... */
 static const struct ipath_hwerror_msgs ipath_6120_hwerror_msgs[] = {
 	INFINIPATH_HWE_MSG(PCIEPOISONEDTLP, "PCIe Poisoned TLP"),
@@ -916,6 +919,12 @@ static void ipath_init_pe_variables(struct ipath_devdata *dd)
 	dd->ipath_gpio_sda = IPATH_GPIO_SDA;
 	dd->ipath_gpio_scl = IPATH_GPIO_SCL;
 
+	/* Fill in shifts for RcvCtrl. */
+	dd->ipath_r_portenable_shift = INFINIPATH_R_PORTENABLE_SHIFT;
+	dd->ipath_r_intravail_shift = INFINIPATH_R_INTRAVAIL_SHIFT;
+	dd->ipath_r_tailupd_shift = INFINIPATH_R_TAILUPD_SHIFT;
+	dd->ipath_r_portcfg_shift = 0; /* Not on IBA6120 */
+
 	/* variables for sanity checking interrupt and errors */
 	dd->ipath_hwe_bitsextant =
 		(INFINIPATH_HWE_RXEMEMPARITYERR_MASK <<
diff --git a/drivers/infiniband/hw/ipath/ipath_init_chip.c b/drivers/infiniband/hw/ipath/ipath_init_chip.c
index e161cad..cf64d38 100644
--- a/drivers/infiniband/hw/ipath/ipath_init_chip.c
+++ b/drivers/infiniband/hw/ipath/ipath_init_chip.c
@@ -508,9 +508,9 @@ static void enable_chip(struct ipath_devdata *dd,
 	 * enable port 0 receive, and receive interrupt.  other ports
 	 * done as user opens and inits them.
 	 */
-	dd->ipath_rcvctrl = INFINIPATH_R_TAILUPD |
-		(1ULL << INFINIPATH_R_PORTENABLE_SHIFT) |
-		(1ULL << INFINIPATH_R_INTRAVAIL_SHIFT);
+	dd->ipath_rcvctrl = (1ULL << dd->ipath_r_tailupd_shift) |
+		(1ULL << dd->ipath_r_portenable_shift) |
+		(1ULL << dd->ipath_r_intravail_shift);
 	ipath_write_kreg(dd, dd->ipath_kregs->kr_rcvctrl,
 			 dd->ipath_rcvctrl);
 
diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c
index ec18b9b..d9f8342 100644
--- a/drivers/infiniband/hw/ipath/ipath_intr.c
+++ b/drivers/infiniband/hw/ipath/ipath_intr.c
@@ -975,7 +975,7 @@ static void handle_urcv(struct ipath_devdata *dd, u32 istat)
 		if (portr & (1 << i) && pd && pd->port_cnt) {
 			if (test_and_clear_bit(IPATH_PORT_WAITING_RCV,
 					       &pd->port_flag)) {
-				clear_bit(i + INFINIPATH_R_INTRAVAIL_SHIFT,
+				clear_bit(i + dd->ipath_r_intravail_shift,
 					  &dd->ipath_rcvctrl);
 				wake_up_interruptible(&pd->port_wait);
 				rcvdint = 1;
diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h
index e06dec1..b35dd33 100644
--- a/drivers/infiniband/hw/ipath/ipath_kernel.h
+++ b/drivers/infiniband/hw/ipath/ipath_kernel.h
@@ -550,6 +550,12 @@ struct ipath_devdata {
 	u8 ipath_minrev;
 	/* board rev, from ipath_revision */
 	u8 ipath_boardrev;
+
+	u8 ipath_r_portenable_shift;
+	u8 ipath_r_intravail_shift;
+	u8 ipath_r_tailupd_shift;
+	u8 ipath_r_portcfg_shift;
+
 	/* unit # of this chip, if present */
 	int ipath_unit;
 	/* saved for restore after reset */
diff --git a/drivers/infiniband/hw/ipath/ipath_registers.h b/drivers/infiniband/hw/ipath/ipath_registers.h
index 708eba3..d7181d4 100644
--- a/drivers/infiniband/hw/ipath/ipath_registers.h
+++ b/drivers/infiniband/hw/ipath/ipath_registers.h
@@ -82,8 +82,7 @@
 
 /* kr_rcvctrl bits */
 #define INFINIPATH_R_PORTENABLE_SHIFT 0
-#define INFINIPATH_R_INTRAVAIL_SHIFT 16
-#define INFINIPATH_R_TAILUPD   0x80000000
+#define INFINIPATH_R_QPMAP_ENABLE (1ULL << 38)
 
 /* kr_intstatus, kr_intclear, kr_intmask bits */
 #define INFINIPATH_I_RCVURG_SHIFT 0
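
Note on the diff above: the RcvCtrl bit positions move from compile-time
constants into per-device fields that each chip's init routine fills in.
The following is a minimal userspace sketch of that idea, not driver code.
The struct and function names are hypothetical; only the shift values
(0, 16, 31) are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cut-down stand-in for the shift fields added to ipath_devdata. */
struct rcvctrl_layout {
	uint8_t portenable_shift;
	uint8_t intravail_shift;
	uint8_t tailupd_shift;
};

/* Values the patch assigns for both IBA6110 and IBA6120. */
static void layout_init_6120(struct rcvctrl_layout *l)
{
	l->portenable_shift = 0;
	l->intravail_shift = 16;
	l->tailupd_shift = 31;
}

/* Initial RcvCtrl: tail update on, port 0 enabled, port 0 interrupt set. */
static uint64_t rcvctrl_initial(const struct rcvctrl_layout *l)
{
	return (1ULL << l->tailupd_shift) |
	       (1ULL << l->portenable_shift) |
	       (1ULL << l->intravail_shift);
}

/* Set the receive-enable bit for one port, using the per-chip shift. */
static uint64_t rcvctrl_enable_port(const struct rcvctrl_layout *l,
				    uint64_t rcvctrl, unsigned port)
{
	return rcvctrl | (1ULL << (l->portenable_shift + port));
}
```

With these shift values rcvctrl_initial() gives 0x80010001, the same value
the old constants produced. A chip with a different register layout would
only need a different init routine.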


From arthur.jones at qlogic.com  Sat Dec 22 14:25:00 2007
From: arthur.jones at qlogic.com (Arthur Jones)
Date: Sat, 22 Dec 2007 14:25:00 -0800
Subject: [ofa-general] [PATCH 5/5] IB/ipath - Changes for fields moving from
	devdata to portdata.
In-Reply-To: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
References: <20071222222434.17599.95501.stgit@eng-46.internal.keyresearch.com>
Message-ID: <20071222222459.17599.72860.stgit@eng-46.internal.keyresearch.com>

From: Dave Olson <dave.olson at qlogic.com>

This patch replaces arrays that were allocated per device with
variables in the per-context (port) data structure, thus avoiding
the extra kzalloc() calls.

Signed-off-by: Dave Olson <dave.olson at qlogic.com>
---

 drivers/infiniband/hw/ipath/ipath_file_ops.c  |    5 +++--
 drivers/infiniband/hw/ipath/ipath_init_chip.c |   15 ---------------
 drivers/infiniband/hw/ipath/ipath_intr.c      |    4 ++--
 drivers/infiniband/hw/ipath/ipath_kernel.h    |   16 ++++++----------
 drivers/infiniband/hw/ipath/ipath_stats.c     |   13 ++++++-------
 5 files changed, 17 insertions(+), 36 deletions(-)
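
Note: the "last tail" bookkeeping that this patch moves into the port
structure can be sketched as follows. This is a simplified userspace
illustration under stated assumptions, not the driver code: the names
port_state, note_hdrq_full and port_state_rearm are hypothetical, and the
head/tail comparison that the driver does before counting is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit pattern as storing -1 in a u32 field. */
#define TAIL_NONE UINT32_MAX

/* Hypothetical per-port state; mirrors two fields of ipath_portdata. */
struct port_state {
	uint32_t hdrqfull;		/* total rcvhdrqfull events counted */
	uint32_t lastrcvhdrqtail;	/* tail seen at the last counted event */
};

static void port_state_init(struct port_state *pd)
{
	pd->hdrqfull = 0;
	pd->lastrcvhdrqtail = TAIL_NONE;
}

/*
 * Count a header-queue-full event unless the port is still stuck at the
 * tail we already counted.  Returns true if the event was counted.
 */
static bool note_hdrq_full(struct port_state *pd, uint32_t tail)
{
	if (tail == pd->lastrcvhdrqtail)
		return false;
	pd->lastrcvhdrqtail = tail;
	pd->hdrqfull++;
	return true;
}

/* Periodic re-arm, so a port that stays stuck is reported again later. */
static void port_state_rearm(struct port_state *pd)
{
	pd->lastrcvhdrqtail = TAIL_NONE;
}
```

Keeping the field inside the port structure means it exists exactly as
long as the port does, so no separate per-device allocation or failure
path is needed.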

diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c
index 22c2521..7295ffd 100644
--- a/drivers/infiniband/hw/ipath/ipath_file_ops.c
+++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c
@@ -1794,6 +1794,7 @@ static int find_shared_port(struct file *fp,
 			}
 			port_fp(fp) = pd;
 			subport_fp(fp) = pd->port_cnt++;
+			pd->port_subpid[subport_fp(fp)] = current->pid;
 			tidcursor_fp(fp) = 0;
 			pd->active_slaves |= 1 << subport_fp(fp);
 			ipath_cdbg(PROC,
@@ -1924,8 +1925,7 @@ static int ipath_do_user_init(struct file *fp,
 	 */
 	head32 = ipath_read_ureg32(dd, ur_rcvegrindextail, pd->port_port);
 	ipath_write_ureg(dd, ur_rcvegrindexhead, head32, pd->port_port);
-	dd->ipath_lastegrheads[pd->port_port] = -1;
-	dd->ipath_lastrcvhdrqtails[pd->port_port] = -1;
+	pd->port_lastrcvhdrqtail = -1;
 	ipath_cdbg(VERBOSE, "Wrote port%d egrhead %x from tail regs\n",
 		pd->port_port, head32);
 	pd->port_tidcursor = 0;	/* start at beginning after open */
@@ -2028,6 +2028,7 @@ static int ipath_close(struct inode *in, struct file *fp)
 		 * the slave(s) don't wait for receive data forever.
 		 */
 		pd->active_slaves &= ~(1 << fd->subport);
+		pd->port_subpid[fd->subport] = 0;
 		mutex_unlock(&ipath_mutex);
 		goto bail;
 	}
diff --git a/drivers/infiniband/hw/ipath/ipath_init_chip.c b/drivers/infiniband/hw/ipath/ipath_init_chip.c
index cf64d38..98b5146 100644
--- a/drivers/infiniband/hw/ipath/ipath_init_chip.c
+++ b/drivers/infiniband/hw/ipath/ipath_init_chip.c
@@ -272,22 +272,7 @@ static int init_chip_first(struct ipath_devdata *dd,
 		goto done;
 	}
 
-	dd->ipath_lastegrheads = kzalloc(sizeof(*dd->ipath_lastegrheads)
-					 * dd->ipath_cfgports,
-					 GFP_KERNEL);
-	dd->ipath_lastrcvhdrqtails =
-		kzalloc(sizeof(*dd->ipath_lastrcvhdrqtails)
-			* dd->ipath_cfgports, GFP_KERNEL);
-
-	if (!dd->ipath_lastegrheads || !dd->ipath_lastrcvhdrqtails) {
-		ipath_dev_err(dd, "Unable to allocate head arrays, "
-			      "failing\n");
-		ret = -ENOMEM;
-		goto done;
-	}
-
 	pd = create_portdata0(dd);
-
 	if (!pd) {
 		ipath_dev_err(dd, "Unable to allocate portdata for port "
 			      "0, failing\n");
diff --git a/drivers/infiniband/hw/ipath/ipath_intr.c b/drivers/infiniband/hw/ipath/ipath_intr.c
index d9f8342..e2ce531 100644
--- a/drivers/infiniband/hw/ipath/ipath_intr.c
+++ b/drivers/infiniband/hw/ipath/ipath_intr.c
@@ -693,7 +693,7 @@ static int handle_errors(struct ipath_devdata *dd, ipath_err_t errs)
 				 * except kernel
 				 */
 				tl = *(u64 *) pd->port_rcvhdrtail_kvaddr;
-				if (tl == dd->ipath_lastrcvhdrqtails[i])
+				if (tl == pd->port_lastrcvhdrqtail)
 					continue;
 				hd = ipath_read_ureg32(dd, ur_rcvhdrhead,
 						       i);
@@ -703,7 +703,7 @@ static int handle_errors(struct ipath_devdata *dd, ipath_err_t errs)
 			    (!hd && tl == dd->ipath_hdrqlast)) {
 				if (i == 0)
 					chkerrpkts = 1;
-				dd->ipath_lastrcvhdrqtails[i] = tl;
+				pd->port_lastrcvhdrqtail = tl;
 				pd->port_hdrqfull++;
 				/* flush hdrqfull so that poll() sees it */
 				wmb();
diff --git a/drivers/infiniband/hw/ipath/ipath_kernel.h b/drivers/infiniband/hw/ipath/ipath_kernel.h
index b35dd33..977e88a 100644
--- a/drivers/infiniband/hw/ipath/ipath_kernel.h
+++ b/drivers/infiniband/hw/ipath/ipath_kernel.h
@@ -141,6 +141,11 @@ struct ipath_portdata {
 	u32 port_pionowait;
 	/* total number of rcvhdrqfull errors */
 	u32 port_hdrqfull;
+	/*
+	 * Used to suppress multiple instances of same
+	 * port staying stuck at same point.
+	 */
+	u32 port_lastrcvhdrqtail;
 	/* saved total number of rcvhdrqfull errors for poll edge trigger */
 	u32 port_hdrqfull_poll;
 	/* total number of polled urgent packets */
@@ -149,6 +154,7 @@ struct ipath_portdata {
 	u32 port_urgent_poll;
 	/* pid of process using this port */
 	pid_t port_pid;
+	pid_t port_subpid[INFINIPATH_MAX_SUBPORT];
 	/* same size as task_struct .comm[] */
 	char port_comm[16];
 	/* pkeys set by this use of this port */
@@ -320,16 +326,6 @@ struct ipath_devdata {
 	u32 ipath_p0_hdrqfull;
 
 	/*
-	 * (*cfgports) used to suppress multiple instances of same
-	 * port staying stuck at same point
-	 */
-	u32 *ipath_lastrcvhdrqtails;
-	/*
-	 * (*cfgports) used to suppress multiple instances of same
-	 * port staying stuck at same point
-	 */
-	u32 *ipath_lastegrheads;
-	/*
 	 * index of last piobuffer we used.  Speeds up searching, by
 	 * starting at this point.  Doesn't matter if multiple cpu's use and
 	 * update, last updater is only write that matters.  Whenever it
diff --git a/drivers/infiniband/hw/ipath/ipath_stats.c b/drivers/infiniband/hw/ipath/ipath_stats.c
index fd89765..d2725cd 100644
--- a/drivers/infiniband/hw/ipath/ipath_stats.c
+++ b/drivers/infiniband/hw/ipath/ipath_stats.c
@@ -238,7 +238,7 @@ static void ipath_chk_errormask(struct ipath_devdata *dd)
 void ipath_get_faststats(unsigned long opaque)
 {
 	struct ipath_devdata *dd = (struct ipath_devdata *) opaque;
-	u32 val;
+	int i;
 	static unsigned cnt;
 	unsigned long flags;
 	u64 traffic_wds;
@@ -322,12 +322,11 @@ void ipath_get_faststats(unsigned long opaque)
 
 	/* limit qfull messages to ~one per minute per port */
 	if ((++cnt & 0x10)) {
-		for (val = dd->ipath_cfgports - 1; ((int)val) >= 0;
-		     val--) {
-			if (dd->ipath_lastegrheads[val] != -1)
-				dd->ipath_lastegrheads[val] = -1;
-			if (dd->ipath_lastrcvhdrqtails[val] != -1)
-				dd->ipath_lastrcvhdrqtails[val] = -1;
+		for (i = (int) dd->ipath_cfgports; --i >= 0; ) {
+			struct ipath_portdata *pd = dd->ipath_pd[i];
+
+			if (pd && pd->port_lastrcvhdrqtail != -1)
+				pd->port_lastrcvhdrqtail = -1;
 		}
 	}
 

From akpm at linux-foundation.org  Sat Dec 22 16:51:14 2007
From: akpm at linux-foundation.org (Andrew Morton)
Date: Sat, 22 Dec 2007 16:51:14 -0800
Subject: [ofa-general] current infiniband git tree
Message-ID: <20071222165114.07dbe376.akpm@linux-foundation.org>


Could someone please do an i386 allmodconfig build and reduce some of this?

Thanks.

drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_init_cqp':
drivers/infiniband/hw/nes/nes_hw.c:834: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:835: warning: cast to pointer from integer of different size
drivers/infiniband/hw/nes/nes_hw.c:927: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:929: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:936: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:950: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:952: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:966: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:968: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:983: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:985: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_init_nic_qp':
drivers/infiniband/hw/nes/nes_hw.c:1340: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:1341: warning: cast to pointer from integer of different size
drivers/infiniband/hw/nes/nes_hw.c:1413: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:1414: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:1421: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:1453: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:1455: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_destroy_nic_qp':
drivers/infiniband/hw/nes/nes_hw.c:1572: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:1574: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:1589: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:1591: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_cqp_ce_handler':
drivers/infiniband/hw/nes/nes_hw.c:2458: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:2459: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_process_iwarp_aeqe':
drivers/infiniband/hw/nes/nes_hw.c:2503: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:2723: warning: cast to pointer from integer of different size
drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_manage_apbvt':
drivers/infiniband/hw/nes/nes_hw.c:2852: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:2854: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_manage_arp_cache':
drivers/infiniband/hw/nes/nes_hw.c:2920: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:2922: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c: In function 'flush_wqes':
drivers/infiniband/hw/nes/nes_hw.c:2973: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_hw.c:2974: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes.c: In function 'nes_rem_ref':
drivers/infiniband/hw/nes/nes.c:331: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes.c:333: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_utils.c: In function 'nes_post_cqp_request':
drivers/infiniband/hw/nes/nes_utils.c:592: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_utils.c:593: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_alloc_mw':
drivers/infiniband/hw/nes/nes_verbs.c:117: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:119: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_dealloc_mw':
drivers/infiniband/hw/nes/nes_verbs.c:207: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:209: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_bind_mw':
drivers/infiniband/hw/nes/nes_verbs.c:300: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:301: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_alloc_fmr':
drivers/infiniband/hw/nes/nes_verbs.c:541: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:543: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_create_qp':
drivers/infiniband/hw/nes/nes_verbs.c:1330: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:1334: warning: cast to pointer from integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:1472: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:1473: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:1507: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:1509: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_create_cq':
drivers/infiniband/hw/nes/nes_verbs.c:1824: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:1826: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:1841: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_destroy_cq':
drivers/infiniband/hw/nes/nes_verbs.c:1963: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:1965: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_reg_mr':
drivers/infiniband/hw/nes/nes_verbs.c:2119: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:2121: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_dereg_mr':
drivers/infiniband/hw/nes/nes_verbs.c:2802: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:2804: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_hw_modify_qp':
drivers/infiniband/hw/nes/nes_verbs.c:2984: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:2985: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_post_send':
drivers/infiniband/hw/nes/nes_verbs.c:3416: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:3418: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:3438: warning: cast to pointer from integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:3438: warning: cast to pointer from integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:3483: warning: cast to pointer from integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:3483: warning: cast to pointer from integer of different size
drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_post_recv':
drivers/infiniband/hw/nes/nes_verbs.c:3614: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_verbs.c:3615: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_cm.c: In function 'nes_accept':
drivers/infiniband/hw/nes/nes_cm.c:2351: warning: cast from pointer to integer of different size
drivers/infiniband/hw/nes/nes_cm.c:2423: warning: format '%lu' expects type 'long unsigned int', but argument 11 has type 'unsigned int'
drivers/infiniband/hw/nes/nes_cm.c: In function 'cm_event_connected':
drivers/infiniband/hw/nes/nes_cm.c:2770: warning: cast from pointer to integer of different size
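
Note: on a 32-bit build, "cast from pointer to integer of different size"
is what gcc typically reports when a pointer is cast directly to a 64-bit
integer, and the "cast to pointer" variant is the reverse. The sketch
below shows the usual way to silence it, by going through a pointer-sized
integer first. It is a userspace illustration with hypothetical helper
names and has not been checked against the nes source lines listed above.
In kernel code the equivalent idiom is (u64)(unsigned long)ptr.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; the names are not taken from the nes driver. */

/* Widen a pointer to a 64-bit value without a size-mismatch warning. */
static uint64_t ptr_to_u64(const void *p)
{
	/*
	 * (uint64_t)p would cast a 32-bit pointer straight to a 64-bit
	 * integer on i386.  uintptr_t is pointer-sized, so the first cast
	 * is size-matched and the second is a plain integer widening.
	 */
	return (uint64_t)(uintptr_t)p;
}

/* Recover a pointer that was stored with ptr_to_u64(). */
static void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

The '%lu' warning in nes_cm.c is a separate issue: the format specifier
has to match the argument's actual type (for example '%u' for an
unsigned int).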


From changquing.tang at hp.com  Sat Dec 22 19:10:05 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Sun, 23 Dec 2007 03:10:05 +0000
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any	one user process
In-Reply-To: <200712221005.29126.jackm@dev.mellanox.co.il>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<200712212251.24330.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE241E42A@G5W0278.americas.hpqcorp.net>
	<200712221005.29126.jackm@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FE241E48F@G5W0278.americas.hpqcorp.net>


You mentioned you need a kernel thread to scan all the XRC domains to clean up the QP. Is there any existing timeout event or other event to trigger such a scan?


--CQ

> -----Original Message-----
> From: Jack Morgenstein [mailto:jackm at dev.mellanox.co.il]
> Sent: Saturday, December 22, 2007 2:05 AM
> To: Tang, Changqing
> Cc: pasha at dev.mellanox.co.il; general at lists.openfabrics.org
> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> independent of any one user process
>
> On Saturday 22 December 2007 04:51, Tang, Changqing wrote:
> > We need to ask Roland to confirm this.
> >
> I'll speak to the firmware guys here for clarification.
> >
> >
> > I did not use zero byte rdma_read, I only use zero byte rdma_write.
> >
> > Here is our code:
> >
> >                 sr.next = NULL;
> >                 sr.wr_id = (uint64_t)(AULONG)rdmahdr;
> >
> >                 sr.sg_list = &ssg;
> >                 sr.num_sge = 0;
> >                 sr.opcode = IBV_WR_RDMA_WRITE;
> >                 sr.send_flags = IBV_SEND_INLINE|IBV_SEND_SIGNALED;
> >
> >                 err = ibv_post_send(ibvproc->connection[i].qp_hndl, &sr, &bad_sr);
> >                 if (err != 0) {
> >                         hpmp_printf("ibv_post_send() failed");
> >                         return (-1);
> >                 }
> >
> > Note, ssg is not initialized (Maybe we can set sr.sg_list = NULL ?)
> You don't need to initialize ssg -- you have set num_sge to
> zero, so sg_list is not relevant.
> >
> I notice that the rdma fields are also not initialized.
> That implies that no validity checking is done on the remote
> host side once the length is zero.
>
> I just looked at the ConnectX PRM, revision 0.35, section
> 8.4.1.11.  It specifically addresses the issue of sending
> 0-byte RDMA reads/writes: send a wqe with NO data segments,
> and you will get a 0-length RDMA-write.
> However, it does not mention whether the remote address and
> rkey provided in the WQE must be valid or not.
>
> I'll check with F/W regarding this issue, too.
>
> - Jack
>


From kliteyn at mellanox.co.il  Sat Dec 22 21:42:04 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 23 Dec 2007 07:42:04 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-23:normal completion
Message-ID: <MTLEXCH01xidGVwPGxg0000184f@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-22
OpenSM git rev = Sat_Dec_22_16:15:50_2007 [56399c9945bae3c11433e558a76c2b7c4ef67da6]
ibutils git rev = Wed_Dec_19_12:06:28_2007 [9961475294fbf1d3782edb8f377a77b13fa80d70]
 
 
Total=560  Pass=560  Fail=0
 
 
Pass:
42 Stability IS1-16.topo
42 Pkey IS1-16.topo
42 OsmTest IS1-16.topo
42 OsmStress IS1-16.topo
42 Multicast IS1-16.topo
42 LidMgr IS1-16.topo
14 Stability IS3-loop.topo
14 Stability IS3-128.topo
14 Pkey IS3-128.topo
14 OsmTest IS3-loop.topo
14 OsmTest IS3-128.topo
14 OsmStress IS3-128.topo
14 Multicast IS3-loop.topo
14 Multicast IS3-128.topo
14 LidMgr IS3-128.topo
14 FatTree merge-roots-4-ary-2-tree.topo
14 FatTree merge-root-4-ary-3-tree.topo
14 FatTree gnu-stallion-64.topo
14 FatTree blend-4-ary-2-tree.topo
14 FatTree RhinoDDR.topo
14 FatTree FullGnu.topo
14 FatTree 4-ary-2-tree.topo
14 FatTree 2-ary-4-tree.topo
14 FatTree 12-node-spaced.topo
14 FTreeFail 4-ary-2-tree-missing-sw-link.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
14 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From jackm at dev.mellanox.co.il  Sat Dec 22 22:15:25 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Sun, 23 Dec 2007 08:15:25 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any one user process
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FE241E48F@G5W0278.americas.hpqcorp.net>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<200712221005.29126.jackm@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE241E48F@G5W0278.americas.hpqcorp.net>
Message-ID: <200712230815.25493.jackm@dev.mellanox.co.il>

On Sunday 23 December 2007 05:10, Tang, Changqing wrote:
> 
> You mentioned you need a kernel thread to scan all the XRC domains to clean up the QP.
> Is there any existing timeout event or other event to trigger such a scan?
> 
No timeout event is needed.  This can be handled by a Linux workqueue.
When the heartbeat fails, the QP can be scheduled for deletion.

The workqueue (a kernel thread) sleeps until work is scheduled for it. After
the work is performed, it goes to sleep again.

- Jack
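
Note: the sleep-until-scheduled behaviour described above can be sketched
in userspace with POSIX threads. This is only an analogue under stated
assumptions, not the kernel workqueue API (where INIT_WORK() and
queue_work() would play these roles) and not the proposed XRC
implementation. fake_qp, cleanup_queue and all function names are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a receiving XRC QP; not a real verbs type. */
struct fake_qp {
	int qpn;
	bool destroyed;
};

#define MAX_PENDING 8

/* A tiny single-worker queue: a userspace analogue of a workqueue. */
struct cleanup_queue {
	pthread_mutex_t lock;
	pthread_cond_t wake;
	struct fake_qp *pending[MAX_PENDING];
	size_t count;
	bool stopping;
	int processed;
	pthread_t worker;
};

static void *cleanup_worker(void *arg)
{
	struct cleanup_queue *q = arg;

	pthread_mutex_lock(&q->lock);
	for (;;) {
		/* Sleep until work is queued or shutdown is requested. */
		while (q->count == 0 && !q->stopping)
			pthread_cond_wait(&q->wake, &q->lock);
		if (q->count == 0)
			break;		/* stopping, nothing left to do */
		struct fake_qp *qp = q->pending[--q->count];
		qp->destroyed = true;	/* the "work": tear down the QP */
		q->processed++;
	}
	pthread_mutex_unlock(&q->lock);
	return NULL;
}

static int cleanup_queue_start(struct cleanup_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->wake, NULL);
	q->count = 0;
	q->stopping = false;
	q->processed = 0;
	return pthread_create(&q->worker, NULL, cleanup_worker, q);
}

/* Called when the heartbeat for qp fails: schedule it for deletion. */
static bool schedule_qp_cleanup(struct cleanup_queue *q, struct fake_qp *qp)
{
	bool queued = false;

	pthread_mutex_lock(&q->lock);
	if (q->count < MAX_PENDING) {
		q->pending[q->count++] = qp;
		queued = true;
		pthread_cond_signal(&q->wake);
	}
	pthread_mutex_unlock(&q->lock);
	return queued;
}

/* Let the worker drain pending work, stop it, return QPs cleaned up. */
static int cleanup_queue_stop(struct cleanup_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->stopping = true;
	pthread_cond_signal(&q->wake);
	pthread_mutex_unlock(&q->lock);
	pthread_join(q->worker, NULL);
	pthread_mutex_destroy(&q->lock);
	pthread_cond_destroy(&q->wake);
	return q->processed;
}
```

A caller would run cleanup_queue_start() once, call schedule_qp_cleanup()
from the heartbeat-failure path, and call cleanup_queue_stop() at
teardown. No timer is involved: the worker consumes no CPU between
scheduled items.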


From dotanb at dev.mellanox.co.il  Sat Dec 22 23:12:06 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Sun, 23 Dec 2007 09:12:06 +0200
Subject: [ofa-general] Java invoke the verbs through JNI
In-Reply-To: <13432ab00712201740t10f0af8eldf13609ffa5ca680@mail.gmail.com>
References: <13432ab00712201740t10f0af8eldf13609ffa5ca680@mail.gmail.com>
Message-ID: <476E0A46.2060500@dev.mellanox.co.il>

Hi.

zhang Jackie wrote:
> Hi, all
>
> I just wrote a JNI program to use IB from a Java program. I wrote some 
> simple test programs, and they work. But when I try to integrate it with 
> another program, a local protection error is reported. It is unstable 
> and fails most of the time.
> Can someone give me some advice? Thanks.

More info is needed:
What are you trying to do?
What is the completion status?

thanks
Dotan


From vlad at lists.openfabrics.org  Sun Dec 23 03:15:35 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sun, 23 Dec 2007 03:15:35 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071223-0200 daily build status
Message-ID: <20071223111536.03805E609CF@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.18-8.el5
Passed on powerpc with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.18
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.14
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.14
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on ppc64 with linux-2.6.18-8.el5

Failed:


From rdreier at cisco.com  Sun Dec 23 06:39:04 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Sun, 23 Dec 2007 06:39:04 -0800
Subject: [ofa-general] Re: current infiniband git tree
References: <20071222165114.07dbe376.akpm@linux-foundation.org>
Message-ID: <adatzm9wk53.fsf@cisco.com>

 > Could someone please do an i386 allmodconfig build and reduce some of this?

You mean people still use 32-bit systems?

Anyway, Glenn, this is all you... (unless your pending patches fix
this, but I don't see anything promising)

 > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_init_cqp':
 > drivers/infiniband/hw/nes/nes_hw.c:834: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:835: warning: cast to pointer from integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:927: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:929: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:936: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:950: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:952: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:966: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:968: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:983: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:985: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_init_nic_qp':
 > drivers/infiniband/hw/nes/nes_hw.c:1340: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:1341: warning: cast to pointer from integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:1413: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:1414: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:1421: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:1453: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:1455: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_destroy_nic_qp':
 > drivers/infiniband/hw/nes/nes_hw.c:1572: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:1574: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:1589: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:1591: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_cqp_ce_handler':
 > drivers/infiniband/hw/nes/nes_hw.c:2458: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:2459: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_process_iwarp_aeqe':
 > drivers/infiniband/hw/nes/nes_hw.c:2503: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:2723: warning: cast to pointer from integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_manage_apbvt':
 > drivers/infiniband/hw/nes/nes_hw.c:2852: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:2854: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_manage_arp_cache':
 > drivers/infiniband/hw/nes/nes_hw.c:2920: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:2922: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c: In function 'flush_wqes':
 > drivers/infiniband/hw/nes/nes_hw.c:2973: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_hw.c:2974: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes.c: In function 'nes_rem_ref':
 > drivers/infiniband/hw/nes/nes.c:331: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes.c:333: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_utils.c: In function 'nes_post_cqp_request':
 > drivers/infiniband/hw/nes/nes_utils.c:592: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_utils.c:593: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_alloc_mw':
 > drivers/infiniband/hw/nes/nes_verbs.c:117: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:119: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_dealloc_mw':
 > drivers/infiniband/hw/nes/nes_verbs.c:207: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:209: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_bind_mw':
 > drivers/infiniband/hw/nes/nes_verbs.c:300: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:301: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_alloc_fmr':
 > drivers/infiniband/hw/nes/nes_verbs.c:541: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:543: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_create_qp':
 > drivers/infiniband/hw/nes/nes_verbs.c:1330: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:1334: warning: cast to pointer from integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:1472: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:1473: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:1507: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:1509: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_create_cq':
 > drivers/infiniband/hw/nes/nes_verbs.c:1824: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:1826: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:1841: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_destroy_cq':
 > drivers/infiniband/hw/nes/nes_verbs.c:1963: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:1965: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_reg_mr':
 > drivers/infiniband/hw/nes/nes_verbs.c:2119: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:2121: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_dereg_mr':
 > drivers/infiniband/hw/nes/nes_verbs.c:2802: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:2804: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_hw_modify_qp':
 > drivers/infiniband/hw/nes/nes_verbs.c:2984: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:2985: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_post_send':
 > drivers/infiniband/hw/nes/nes_verbs.c:3416: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:3418: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:3438: warning: cast to pointer from integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:3438: warning: cast to pointer from integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:3483: warning: cast to pointer from integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:3483: warning: cast to pointer from integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_post_recv':
 > drivers/infiniband/hw/nes/nes_verbs.c:3614: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_verbs.c:3615: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_cm.c: In function 'nes_accept':
 > drivers/infiniband/hw/nes/nes_cm.c:2351: warning: cast from pointer to integer of different size
 > drivers/infiniband/hw/nes/nes_cm.c:2423: warning: format '%lu' expects type 'long unsigned int', but argument 11 has type 'unsigned int'
 > drivers/infiniband/hw/nes/nes_cm.c: In function 'cm_event_connected':
 > drivers/infiniband/hw/nes/nes_cm.c:2770: warning: cast from pointer to integer of different size
 > 
 > 
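For context: on i386, gcc emits "cast from pointer to integer of different size" when a 32-bit pointer is cast directly to a 64-bit integer type, and the "cast to pointer" variant for the reverse direction. The nes source is not quoted in this thread, so the following is only a minimal user-space sketch under that assumption. The helper names are hypothetical, and in kernel code the intermediate type would normally be unsigned long rather than uintptr_t.

```c
#include <stdint.h>

/* Hypothetical helpers, not taken from the nes driver. */

/* Store a pointer in a 64-bit field: convert to the pointer-sized
 * integer type first, then widen, so no size-mismatch warning fires. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Recover the pointer: narrow to pointer size first, then cast. */
static void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

A round trip such as u64_to_ptr(ptr_to_u64(&obj)) yields &obj again on both 32-bit and 64-bit builds.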


From laqigchyojqm at boptart.com  Sun Dec 23 07:06:14 2007
From: laqigchyojqm at boptart.com (Bernie Ritter)
Date: Sun, 23 Dec 2007 03:06:14 -1200
Subject: [ofa-general] Re: BOOK2WORD
Message-ID: <01c84510$c6f81700$47de093a@laqigchyojqm>

All Med*s onsale
100% safe and fast shipping
http://geocities.com/woodrowxgy93


From dwsaecircuitsm at saecircuits.com  Sun Dec 23 09:38:04 2007
From: dwsaecircuitsm at saecircuits.com (Annette Maiocco)
Date: Sun, 23 Dec 2007 18:38:04 +0100
Subject: [ofa-general] CanadianPharmacy makes quality medications affordable.
Message-ID: <01c84592$f3ed0e00$bc5c1c53@dwsaecircuitsm>

    «CanadianPharmacy» offers medications from the leading world famous manufacturers which stand for their quality. Fast worldwide shipping! Utmost care about each customer! Flexible discount system allows you to save on your meds. 

http://geocities.com/RyanKramer18/

 Don't hesitate to purchase with «CanadianPharmacy»!

Annette Maiocco


From udyaxjlqdf at borlandracing.com  Sun Dec 23 11:19:17 2007
From: udyaxjlqdf at borlandracing.com (Mayra Barron)
Date: Mon, 24 Dec 2007 03:19:17 +0800
Subject: [ofa-general] Want to be a hero in bed? 
Message-ID: <01c845db$c4162880$28bbc0dc@udyaxjlqdf>

Are U Tired with erectile dysfunction? 
Enhance your sexual life now! 
Want to be ready for sex in few minutes? 
Reproductive and ED problems solution 

http://geocities.com/GlenKnight20/

We are verified by VISA. Confidential purchase. 


From dwschwentim at schwenti.com  Sun Dec 23 20:55:13 2007
From: dwschwentim at schwenti.com (Jarred Melton)
Date: Mon, 24 Dec 2007 12:55:13 +0800
Subject: [ofa-general] NO WORK! ALL PLAY... and you don't pay!
Message-ID: <01c8462c$3911eac0$0130a9da@dwschwentim>

   
1 - REGISTER
2 - PLAY
3 - WIN!

EVERYONE including players from the USA are invited 
to join the fun where you'll: 

-	Get a $500 bonus immediately!!!

-	Get a chance to win our huge progressive jackpot - see it climb

-	Participate in tournaments in all your favorite games! 

-	Make deposits and collect your winnings quickly, safely & securely!
-	Get dedicated online support 
-	Enjoy a respected, award-winning establishment and join thousands of happy patrons 
Download Casino HERE 

http://geocities.com/MalloryCalderon92/ 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071224/1a6df7cf/attachment.html>

From kliteyn at mellanox.co.il  Sun Dec 23 21:32:03 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 24 Dec 2007 07:32:03 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-24:normal completion
Message-ID: <MTLEXCH01GtXsL1c00g00001ade@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-23
OpenSM git rev = Sat_Dec_22_16:15:50_2007 [56399c9945bae3c11433e558a76c2b7c4ef67da6]
ibutils git rev = Wed_Dec_19_12:06:28_2007 [9961475294fbf1d3782edb8f377a77b13fa80d70]
 
 
Total=520  Pass=517  Fail=3
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
10 LidMgr IS3-128.topo

Failures:
3 LidMgr IS3-128.topo


From jrypm at bnf.com  Sun Dec 23 21:49:03 2007
From: jrypm at bnf.com (Tamera Triplett)
Date: Mon, 24 Dec 2007 12:49:03 +0700
Subject: [ofa-general] Purchase software at surprisingly low prices!
Message-ID: <01c8462b$5c87c980$3946fede@jrypm>

  Brilliant opportunity to get software right at the same time you need it without waiting for a CD to be delivered. Just pay money and download your soft. Low prices, discounts and special offers! Most popular localized software in German, French, Italian, Spanish, English and many other languages of the world!

 Free of charge professional installation consultations could be of great help. Prompt reply on all your requests. Money back guarantee ensures the quality of product.

http://geocities.com/GoldieSimpson88/

   Purchase perfectly functioning software.


From jackm at dev.mellanox.co.il  Sun Dec 23 22:27:30 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 24 Dec 2007 08:27:30 +0200
Subject: [ofa-general] Oops in mthca
In-Reply-To: <476B04C6.1040803@linux.vnet.ibm.com>
References: <476B04C6.1040803@linux.vnet.ibm.com>
Message-ID: <200712240827.30845.jackm@dev.mellanox.co.il>

On Friday 21 December 2007 02:11, Pradeep Satyanarayana wrote:
> I discovered the following Oops while developing a patch to enable SRQ on HCAs with fewer than
> 16 SG elements.
> 
> The root of this issue appears to be that ib_query_device(priv->ca, &attr)
> reports an incorrect value for attr.max_srq_sge. The value that
> ib_query_device returns is 28 (instead of 16 that I expected).
> 
Where did you get a value of 16 for max srq scatter-gather entries?
The value reported by ib_query_device is correct (based on a wqe stride of 512 bytes).

- Jack


From a-abboud at adbutler.de  Sun Dec 23 23:36:02 2007
From: a-abboud at adbutler.de (Shelton Heath)
Date: Mon, 24 Dec 2007 15:36:02 +0800
Subject: [ofa-general] Chatting online
Message-ID: <01c84642$b0c7d050$9403727c@a-abboud>

Hello! I am tired today. I am nice girl that would like to chat with you. Email me at Anneli at ShineBal.info only, because I am using my friend's email to write this. If you would like to see my pictures.


From a-aaronu at 4-direct.com  Sun Dec 23 23:39:01 2007
From: a-aaronu at 4-direct.com (Miriam Grace)
Date: Mon, 24 Dec 2007 15:39:01 +0800
Subject: [ofa-general] Can we talk?
Message-ID: <01c84643$1b031880$2f23b13c@a-aaronu>

Hello! I am bored this evening. I am nice girl that would like to chat with you. Email me at Karin at ShineBal.info only, because I am using my friend's email to write this. I would like to share some of my pics.


From kliteyn at dev.mellanox.co.il  Sun Dec 23 23:42:20 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Mon, 24 Dec 2007 09:42:20 +0200
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting full member
 pkey on switch ports
Message-ID: <476F62DC.2000804@dev.mellanox.co.il>

OpenSM was failing to set pkeys with full membership on external
switch ports - fixing the wrong condition.

Signed-off-by:  Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_pkey_mgr.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c
index 58eed04..f5c277f 100644
--- a/opensm/opensm/osm_pkey_mgr.c
+++ b/opensm/opensm/osm_pkey_mgr.c
@@ -212,7 +212,7 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,

 	p_pi = &p_physp->port_info;

-	if ((p_pi->vl_enforce & 0xc) == (0xc) * (enforce == TRUE)) {
+	if (!enforce || ((p_pi->vl_enforce & 0xc) != 0xc)) {
 		osm_log(p_log, OSM_LOG_DEBUG,
 			"pkey_mgr_enforce_partition: "
 			"No need to update PortInfo for "
-- 
1.5.1.4
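To make the one-line change easier to review, the two conditions can be evaluated side by side. This is a standalone sketch, not OpenSM code: the function names are hypothetical, plain bool stands in for OpenSM's boolean type, and 0xc is assumed to mask the two partition-enforcement bits of vl_enforce.

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition on the '-' line: true when the two bits under mask 0xc
 * are both set (enforce) or both clear (!enforce). */
static bool old_condition(uint8_t vl_enforce, bool enforce)
{
	return (vl_enforce & 0xc) == 0xc * (enforce == true);
}

/* Condition on the '+' line, copied from the patch. */
static bool new_condition(uint8_t vl_enforce, bool enforce)
{
	return !enforce || ((vl_enforce & 0xc) != 0xc);
}
```

For example, with enforce set and both bits already set (vl_enforce & 0xc == 0xc), old_condition() returns true and new_condition() returns false; with enforce clear, new_condition() is true regardless of the bits.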


From erezz at Voltaire.COM  Mon Dec 24 00:42:03 2007
From: erezz at Voltaire.COM (Erez Zilber)
Date: Mon, 24 Dec 2007 10:42:03 +0200
Subject: [ofa-general] Re: [ewg] New features for OFED 1.4
In-Reply-To: <47308DF2.70409@mellanox.co.il>
References: <47308DF2.70409@mellanox.co.il>
Message-ID: <476F70DB.1000701@Voltaire.COM>

Tziporet Koren wrote:
>
> I wish to collect requirements for new features for OFED 1.4
> Please reply with any request you have (features of existing modules,
> new modules etc.)
>
> Thanks,
> Tziporet
>
Tziporet,

Voltaire will add support for stgt (SCSI target). It includes support
for iSCSI over iSER/TCP. For more info about stgt with iSER support:

http://stgt.berlios.de/

https://wiki.openfabrics.org/tiki-index.php?page=ISER-target

Erez


From ingo.gottschalk at chicagolandavenue.com  Mon Dec 24 01:27:58 2007
From: ingo.gottschalk at chicagolandavenue.com (Kieth Finley)
Date: Mon, 24 Dec 2007 11:27:58 +0200
Subject: [ofa-general] Start using your software right after purchase
Message-ID: <452665694.95446122216922@chicagolandavenue.com>

I recently found an interesting site with various programs. Its prices are 5 to 10 times lower than anywhere else. Many of the programs are in German. There are also lots of programs for Macintosh and PC.

Everything can be downloaded right from the site; there is no need to order a CD by mail and wait a long time. I have already got a few programs and saved 250 euros.

Here is the link:
http://geocities.com/roxie.holder/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071224/e06446b0/attachment.html>

From gfq at bostonreservations.com  Mon Dec 24 02:28:09 2007
From: gfq at bostonreservations.com (Rocco Carlson)
Date: Mon, 24 Dec 2007 12:28:09 +0200
Subject: [ofa-general] Start using your software right after purchase
Message-ID: <355772332.29726783021362@bostonreservations.com>

We are pleased to offer you localized versions of popular programs: English, German, French, Italian, Spanish and many other languages!

You can download and install any program immediately after purchase.

http://geocities.com/eliza.garner/

Our prices:
* Windows XP Professional With SP2 Full Version: $59.95
* Microsoft Office Enterprise 2007: $79.95
* Office 2003 Professional (including Publisher 2003): $59.95
* Windows Vista Ultimate 32-bit: $79.95
* Adobe Creative Suite 3 Design Premium: $229.95
* AutoCAD 2008: $129.95

http://geocities.com/eliza.garner/

We have more than 300 different programs for PC and Macintosh! Buy now, don't wait!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071224/a2973043/attachment.html>

From a-amaury.pci at acutech.com  Mon Dec 24 03:26:15 2007
From: a-amaury.pci at acutech.com (Stacy Huber)
Date: Mon, 24 Dec 2007 19:26:15 +0800
Subject: [ofa-general] Where have you been?
Message-ID: <01c84662$d9ed5c50$949a293a@a-amaury.pci>

Hello! I am bored tonight. I am nice girl that would like to chat with you. Email me at Katarina at ShineBal.info only, because I am using my friend's email to write this. I want to show you some pictures.


From vlad at lists.openfabrics.org  Mon Dec 24 03:23:07 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Mon, 24 Dec 2007 03:23:07 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071224-0200 daily build status
Message-ID: <20071224112307.AE600E609F5@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18
Passed on ppc64 with linux-2.6.15
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.12
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.20
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.14
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.15
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.12
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.13
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.22
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-8.el5

Failed:


From charles at hairkutters.com  Mon Dec 24 03:49:42 2007
From: charles at hairkutters.com (Susan Burt)
Date: Mon, 24 Dec 2007 12:49:42 +0100
Subject: [ofa-general] Start using your software right after purchase
Message-ID: <503879335.76386713937780@hairkutters.com>

We are pleased to offer you localized versions of popular programs: English, German, French, Italian, Spanish and many other languages!

You can download and install any program immediately after purchase.

http://geocities.com/stark.tommie/

Our prices:
* Adobe Acrobat 8.0 Professional: $69.95
* Windows XP Professional With SP2 Full Version: $59.95
* Windows Vista Ultimate 32-bit: $79.95
* Adobe Photoshop CS3 Extended: $79.95
* Adobe Photoshop CS2 with ImageReady CS2: $79.95
* Frontpage 2003 Pro: $29.95

http://geocities.com/stark.tommie/

We have more than 300 different programs for PC and Macintosh! Buy now, don't wait!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071224/c903b40b/attachment.html>

From jackm at dev.mellanox.co.il  Mon Dec 24 03:54:01 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 24 Dec 2007 13:54:01 +0200
Subject: [ofa-general] [PATCH] libmlx4: typo in mlx4_poll_one()
Message-ID: <200712241354.01590.jackm@dev.mellanox.co.il>

Typo fix in mlx4_poll_one.

Found by: Reuven Amitai of Mellanox.
Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>

diff --git a/src/cq.c b/src/cq.c
index 5ce9d0d..749841d 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -333,7 +333,7 @@ static int mlx4_poll_one(struct mlx4_cq *cq,
 		wc->src_qp	   = ntohl(cqe->g_mlpath_rqpn) & 0xffffff;
 		wc->dlid_path_bits = (ntohl(cqe->g_mlpath_rqpn) >> 24) & 0x7f;
 		wc->pkey_index     = (uint16_t) ntohl(cqe->immed_rss_invalid);
-		wc->wc_flags      |= ntohs(cqe->g_mlpath_rqpn) & 0x80000000 ?
+		wc->wc_flags      |= ntohl(cqe->g_mlpath_rqpn) & 0x80000000 ?
 			IBV_WC_GRH : 0;
 	}
 
L
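The effect of the typo can be shown outside libmlx4. ntohs() operates on a 16-bit value, so a 32-bit field passed to it is truncated to its low 16 bits and the result can never have bit 31 set; the 0x80000000 test is then always false. The sketch below is a standalone illustration with hypothetical names (GRH_BIT and both helpers), not code from the library.

```c
#include <arpa/inet.h>
#include <stdint.h>

#define GRH_BIT 0x80000000u /* hypothetical name for the mask in the patch */

/* Form before the fix: the field is truncated to 16 bits (made explicit
 * here with a cast), so the masked result is always zero. */
static int grh_flag_with_ntohs(uint32_t be_field)
{
	return (ntohs((uint16_t)be_field) & GRH_BIT) ? 1 : 0;
}

/* Form after the fix: the full 32-bit field is converted to host order
 * before bit 31 is tested. */
static int grh_flag_with_ntohl(uint32_t be_field)
{
	return (ntohl(be_field) & GRH_BIT) ? 1 : 0;
}
```

With a field of htonl(0x80000001), grh_flag_with_ntohl() returns 1 while grh_flag_with_ntohs() returns 0, which is consistent with the patch: before the fix the IBV_WC_GRH flag could not be set by this statement.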


From tziporet at dev.mellanox.co.il  Mon Dec 24 04:36:21 2007
From: tziporet at dev.mellanox.co.il (Tziporet Koren)
Date: Mon, 24 Dec 2007 14:36:21 +0200
Subject: [ofa-general] Re: [ewg] New features for OFED 1.4
In-Reply-To: <476F70DB.1000701@Voltaire.COM>
References: <47308DF2.70409@mellanox.co.il> <476F70DB.1000701@Voltaire.COM>
Message-ID: <476FA7C5.5080301@mellanox.co.il>

Erez Zilber wrote:
> Voltaire will add support for stgt (SCSI target). It includes support
> for iSCSI over iSER/TCP. 
>
>
>   
great :-)

Tziporet


From dwsoylum at soylu.net  Mon Dec 24 05:56:26 2007
From: dwsoylum at soylu.net (Jeffry Katz)
Date: Mon, 24 Dec 2007 15:56:26 +0200
Subject: [ofa-general] Best medications, best prices! 
Message-ID: <01c84645$89e17100$7e522cd4@dwsoylum>

Want to be the top all night long? 
Buy top products at Canadian Pharmacy store. 
Here you can find brands that you trust. 
Buy high-quality Viagra at discount pharmacy. 

http://geocities.com/RandyHarrell54/

Only Confidential purchase. Verified by VISA! 


From pasha at dev.mellanox.co.il  Mon Dec 24 06:03:09 2007
From: pasha at dev.mellanox.co.il (Pavel Shamis (Pasha))
Date: Mon, 24 Dec 2007 16:03:09 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent
	of any	one user process
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FE239BCBD@G5W0278.americas.hpqcorp.net>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<476A86E8.8020308@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE239BCBD@G5W0278.americas.hpqcorp.net>
Message-ID: <476FBC1D.6050900@dev.mellanox.co.il>

Hi CQ,
Tang, Changqing wrote:
>         If I have a MPI server processes on a node, many other MPI client processes will dynamically
> connect/disconnect with the server. The server use same XRC domain.
>
>         Will this cause accumulating the "kernel" QP for such application ? we want the server to run 365 days
> a year.
>   
I have a question about the scenario above. Do you call the MPI
disconnect on both ends (server and client) before the client exits?
(Must we do this?)

Regards,
Pasha.
>
> Thanks.
> --CQ
>
>
>
>
>   
>> -----Original Message-----
>> From: Pavel Shamis (Pasha) [mailto:pasha at dev.mellanox.co.il]
>> Sent: Thursday, December 20, 2007 9:15 AM
>> To: Jack Morgenstein
>> Cc: Tang, Changqing; Roland Dreier;
>> general at lists.openfabrics.org; Open MPI Developers;
>> mvapich-discuss at cse.ohio-state.edu
>> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
>> independent of any one user process
>>
>> Adding Open MPI and MVAPICH community to the thread.
>>
>> Pasha (Pavel Shamis)
>>
>> Jack Morgenstein wrote:
>>     
>>> background:  see "XRC Cleanup order issue thread" at
>>>
>>> http://lists.openfabrics.org/pipermail/general/2007-December/043935.html
>>>
>>> (userspace process which created the receiving XRC qp on a given host
>>> dies before other processes which still need to receive XRC messages
>>> on their SRQs which are "paired" with the now-destroyed receiving XRC
>>> QP.)
>>>
>>> Solution: Add a userspace verb (as part of the XRC suite) which
>>> enables the user process to create an XRC QP owned by the kernel -- which belongs to the required XRC domain.
>>> This QP will be destroyed when the XRC domain is closed (i.e., as part
>>> of a ibv_close_xrc_domain call, but only when the domain's reference count goes to zero).
>>> Below, I give the new userspace API for this function.  Any feedback will be appreciated.
>>> This API will be implemented in the upcoming OFED 1.3 release, so we need feedback ASAP.
>>> Notes:
>>> 1. There is no query or destroy verb for this QP. There is also no userspace object for the
>>>    QP. Userspace has ONLY the raw qp number to use when creating the (X)RC connection.
>>> 2. Since the QP is "owned" by kernel space, async events for this QP are also handled in kernel
>>>    space (i.e., reported in /var/log/messages). There are no completion events for the QP, since
>>>    it does not send, and all receives completions are reported in the XRC SRQ's cq.
>>>    If this QP enters the error state, the remote QP which sends will start receiving RETRY_EXCEEDED
>>>    errors, so the application will be aware of the failure.
>>>
>>> - Jack
>>>
>>>       
>>> ======================================================================================
>>> /**
>>>  * ibv_alloc_xrc_rcv_qp - creates an XRC QP for serving as a receive-side only QP,
>>>  *    and moves the created qp through the RESET->INIT and INIT->RTR transitions.
>>>  *      (The RTR->RTS transition is not needed, since this QP does no sending).
>>>  *    The sending XRC QP uses this QP as destination, while specifying an XRC SRQ
>>>  *    for actually receiving the transmissions and generating all completions on the
>>>  *    receiving side.
>>>  *
>>>  *    This QP is created in kernel space, and persists until the XRC domain is closed.
>>>  *    (i.e., its reference count goes to zero).
>>>  *
>>>  * @pd: protection domain to use.  At lower layer, this provides access to userspace obj
>>>  * @xrc_domain: xrc domain to use for the QP.
>>>  * @attr: modify-qp attributes needed to bring the QP to RTR.
>>>  * @attr_mask:  bitmap indicating which attributes are provided in the attr struct.
>>>  *    used for validity checking.
>>>  * @xrc_rcv_qpn: qp_num of created QP (if success). To be passed to the remote node. The
>>>  *               remote node will use xrc_rcv_qpn in ibv_post_send when sending to
>>>  *             XRC SRQ's on this host in the same xrc domain.
>>>  *
>>>  * RETURNS: success (0), or a (negative) error value.
>>>  */
>>>
>>> int ibv_alloc_xrc_rcv_qp(struct ibv_pd *pd,
>>>                        struct ibv_xrc_domain *xrc_domain,
>>>                        struct ibv_qp_attr *attr,
>>>                        enum ibv_qp_attr_mask attr_mask,
>>>                        uint32_t *xrc_rcv_qpn);
>>>
>>> Notes:
>>>
>>> 1. Although the kernel creates the qp in the kernel's own PD, we still need the PD
>>>    parameter to determine the device.
>>>
>>> 2. I chose to use struct ibv_qp_attr, which is used in modify QP, rather than create
>>>    a new structure for this purpose.  This also guards against API changes in the event
>>>    that during development I notice that more modify-qp parameters must be specified
>>>    for this operation to work.
>>>
>>> 3. Table of the ibv_qp_attr parameters showing what values to set:
>>>
>>> struct ibv_qp_attr {
>>>       enum ibv_qp_state       qp_state;               Not needed
>>>       enum ibv_qp_state       cur_qp_state;           Not needed
>>>               -- Driver starts from RESET and takes qp to RTR.
>>>       enum ibv_mtu            path_mtu;               Yes
>>>       enum ibv_mig_state      path_mig_state;         Yes
>>>       uint32_t                qkey;                   Yes
>>>       uint32_t                rq_psn;                 Yes
>>>       uint32_t                sq_psn;                 Not needed
>>>       uint32_t                dest_qp_num;            Yes -- this is the remote side QP for the RC conn.
>>>       int                     qp_access_flags;        Yes
>>>       struct ibv_qp_cap       cap;                    Need only XRC domain.
>>>                                                       Other caps will use hard-coded values:
>>>                                                           max_send_wr = 1;
>>>                                                           max_recv_wr = 0;
>>>                                                           max_send_sge = 1;
>>>                                                           max_recv_sge = 0;
>>>                                                           max_inline_data = 0;
>>>       struct ibv_ah_attr      ah_attr;                Yes
>>>       struct ibv_ah_attr      alt_ah_attr;            Optional
>>>       uint16_t                pkey_index;             Yes
>>>       uint16_t                alt_pkey_index;         Optional
>>>       uint8_t                 en_sqd_async_notify;    Not needed (No sq)
>>>       uint8_t                 sq_draining;            Not needed (No sq)
>>>       uint8_t                 max_rd_atomic;          Not needed (No sq)
>>>       uint8_t                 max_dest_rd_atomic;     Yes -- Total max outstanding RDMAs expected
>>>                                                       for ALL srq destinations using this receive QP.
>>>                                                       (if you are only using SENDs, this value can be 0).
>>>       uint8_t                 min_rnr_timer;          default - 0
>>>       uint8_t                 port_num;               Yes
>>>       uint8_t                 timeout;                Yes
>>>       uint8_t                 retry_cnt;              Yes
>>>       uint8_t                 rnr_retry;              Yes
>>>       uint8_t                 alt_port_num;           Optional
>>>       uint8_t                 alt_timeout;            Optional
>>> };
>>>
>>> 4. Attribute mask bits to set:
>>>       For RESET_to_INIT transition:
>>>               IB_QP_ACCESS_FLAGS | IB_QP_PKEY_INDEX | IB_QP_PORT
>>>
>>>       For INIT_to_RTR transition:
>>>               IB_QP_AV | IB_QP_PATH_MTU |
>>>               IB_QP_DEST_QPN | IB_QP_RQ_PSN | IB_QP_MIN_RNR_TIMER
>>>          If you are using RDMA or atomics, also set:
>>>               IB_QP_MAX_DEST_RD_ATOMIC
>>>
>>>
>>> _______________________________________________
>>> general mailing list
>>> general at lists.openfabrics.org
>>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>>>
>>> To unsubscribe, please visit
>>> http://openib.org/mailman/listinfo/openib-general
>>>
>>>
>>>       
>> --
>> Pavel Shamis (Pasha)
>> Mellanox Technologies
>>
>>
>>     
>
>   


-- 
Pavel Shamis (Pasha)
Mellanox Technologies
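
As an illustration of note 4 in the quoted RFC, the mask for the
receive-only QP can be composed from the RESET->INIT bits and the
INIT->RTR bits, plus one extra bit when RDMA or atomics are used. This is
only a minimal sketch: the XRC_ATTR_* names, their bit values, and the
helper functions are hypothetical stand-ins. The RFC writes kernel-style
IB_QP_* names; user-space libibverbs spells the same flags IBV_QP_*, and
real code would take them from <infiniband/verbs.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values for illustration only (hypothetical names). */
enum xrc_attr_bit {
    XRC_ATTR_ACCESS_FLAGS       = 1u << 0,
    XRC_ATTR_PKEY_INDEX         = 1u << 1,
    XRC_ATTR_PORT               = 1u << 2,
    XRC_ATTR_AV                 = 1u << 3,
    XRC_ATTR_PATH_MTU           = 1u << 4,
    XRC_ATTR_DEST_QPN           = 1u << 5,
    XRC_ATTR_RQ_PSN             = 1u << 6,
    XRC_ATTR_MIN_RNR_TIMER      = 1u << 7,
    XRC_ATTR_MAX_DEST_RD_ATOMIC = 1u << 8
};

/* Bits the RFC lists for the RESET->INIT transition. */
static uint32_t xrc_reset_to_init_bits(void)
{
    return XRC_ATTR_ACCESS_FLAGS | XRC_ATTR_PKEY_INDEX | XRC_ATTR_PORT;
}

/* Bits the RFC lists for INIT->RTR; the last one only with RDMA/atomics. */
static uint32_t xrc_init_to_rtr_bits(bool rdma_or_atomics)
{
    uint32_t bits = XRC_ATTR_AV | XRC_ATTR_PATH_MTU | XRC_ATTR_DEST_QPN |
                    XRC_ATTR_RQ_PSN | XRC_ATTR_MIN_RNR_TIMER;

    if (rdma_or_atomics)
        bits |= XRC_ATTR_MAX_DEST_RD_ATOMIC;
    return bits;
}

/* Combined mask covering both transitions performed by the single call. */
static uint32_t xrc_rcv_qp_attr_mask(bool rdma_or_atomics)
{
    uint32_t reset = xrc_reset_to_init_bits();
    uint32_t rtr = xrc_init_to_rtr_bits(rdma_or_atomics);

    assert((reset & rtr) == 0); /* the two transitions use disjoint bits */
    return reset | rtr;
}
```

A caller that only uses SENDs would pass false and, per the table in
note 3, could leave max_dest_rd_atomic at 0.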


From YJia at tmriusa.com  Mon Dec 24 15:48:45 2007
From: YJia at tmriusa.com (Yicheng Jia)
Date: Mon, 24 Dec 2007 17:48:45 -0600
Subject: [ofa-general] synchronize commands issued to MTHCA
Message-ID: <OF61927CCB.4FC42CF1-ON862573BB.00816A4A-862573BB.0082E294@medical.local>

Hi Folks,

I'm using a Mellanox HCA and I'm new to the IB community. I've run into
a problem with the HCA board and am hoping to get some help here.

I'm using OFED-1.0, and I believe the problem is related to command
synchronization on the HCA. The host first issues a MAD_IFC command and
then a SW2HW_MPT command without waiting for the first command to
complete. Both commands return with a bad-parameter error.

Why is there no synchronization mechanism for command execution on the
HCA? Can I use "spin_lock" or "sem_wait" to serialize the commands?

Thanks!
Yicheng
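
A minimal user-space sketch of the serialization being asked about,
assuming a single command register guarded by a busy ("go") flag. All
names here (cmd_reg, cmd_post, cmd_tick, cmd_exec, OP_FIRST, OP_SECOND)
are hypothetical and are not the mthca driver's API; the sketch only
shows that a second command posted before the first completes is
rejected, while a wrapper that waits for completion lets both succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { CMD_OK = 0, CMD_BUSY = -1 };
enum { OP_FIRST = 1, OP_SECOND = 2 };   /* illustrative opcodes only */

/* Hypothetical model of a command register that holds one command. */
struct cmd_reg {
    bool     go;          /* true while the device owns the register */
    uint16_t opcode;      /* opcode of the most recently posted command */
    int      ticks_left;  /* simulated time until the command completes */
};

/* Post a command; refuse if the previous one is still in flight. */
static int cmd_post(struct cmd_reg *r, uint16_t opcode)
{
    assert(r != 0);
    if (r->go)
        return CMD_BUSY;
    r->opcode = opcode;
    r->ticks_left = 3;
    r->go = true;
    return CMD_OK;
}

/* Advance the simulated device by one step; clear "go" on completion. */
static void cmd_tick(struct cmd_reg *r)
{
    if (r->go && --r->ticks_left == 0)
        r->go = false;
}

/* Serialized execution: wait until idle, post, then poll to completion. */
static int cmd_exec(struct cmd_reg *r, uint16_t opcode)
{
    int rc;

    while (r->go)
        cmd_tick(r);
    rc = cmd_post(r, opcode);
    if (rc != CMD_OK)
        return rc;
    while (r->go)
        cmd_tick(r);
    return CMD_OK;
}
```

With several callers, the wait-post-poll sequence in cmd_exec would
additionally need a lock around it so that only one caller at a time
owns the register.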

From changquing.tang at hp.com  Mon Dec 24 15:49:37 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Mon, 24 Dec 2007 23:49:37 +0000
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any	one user process
In-Reply-To: <476FBC1D.6050900@dev.mellanox.co.il>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<476A86E8.8020308@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE239BCBD@G5W0278.americas.hpqcorp.net>
	<476FBC1D.6050900@dev.mellanox.co.il>
Message-ID: <D89C2C212795564B837FA1665CAE02990FE241E577@G5W0278.americas.hpqcorp.net>


> -----Original Message-----
> From: Pavel Shamis (Pasha) [mailto:pasha at dev.mellanox.co.il]
> Sent: Monday, December 24, 2007 8:03 AM
> To: Tang, Changqing
> Cc: Jack Morgenstein; Roland Dreier;
> general at lists.openfabrics.org; Open MPI Developers;
> mvapich-discuss at cse.ohio-state.edu
> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> independent of any one user process
>
> Hi CQ,
> Tang, Changqing wrote:
> >         If I have an MPI server process on a node, many other MPI
> > client processes will dynamically connect to and disconnect from the
> > server. The server uses the same XRC domain.
> >
> >         Will this cause "kernel" QPs to accumulate for such an
> > application? We want the server to run 365 days a year.
> >
> I have a question about the scenario above. Do you call MPI disconnect
> on both ends (server and client) before the client exits? (Must we do
> that?)

Yes, both ends will call disconnect. But for us, MPI_Comm_disconnect() is
not a collective call; it is just a local operation.

--CQ
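
The accumulation concern raised in this thread can be pictured with a
toy model. This is a sketch under stated assumptions, not the kernel
implementation: toy_xrc_domain and the toy_* functions are hypothetical,
and the only rules modelled are the ones in the RFC text, namely that
there is no destroy verb for the receive QP and that kernel-owned QPs
are released only when the domain's reference count reaches zero.

```c
#include <assert.h>

/* Hypothetical stand-in for an XRC domain and its kernel-owned QPs. */
struct toy_xrc_domain {
    int refcount;    /* processes that currently have the domain open */
    int kernel_qps;  /* receive-only QPs owned by the kernel */
};

/* A process opens the domain. */
static void toy_open(struct toy_xrc_domain *d)
{
    d->refcount++;
}

/* Create one kernel-owned receive QP; there is no matching destroy. */
static void toy_alloc_rcv_qp(struct toy_xrc_domain *d)
{
    assert(d->refcount > 0);
    d->kernel_qps++;
}

/* A process closes the domain; QPs go away only at refcount zero. */
static void toy_close(struct toy_xrc_domain *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0)
        d->kernel_qps = 0;
}

/* A long-running server: one receive QP per client that ever connects. */
static int toy_server_run(struct toy_xrc_domain *d, int n_clients)
{
    int i;

    toy_open(d);                 /* the server keeps the domain open */
    for (i = 0; i < n_clients; i++)
        toy_alloc_rcv_qp(d);     /* client connects; its later disconnect
                                    frees nothing in this model */
    return d->kernel_qps;
}
```

In this model the QP count grows with every client for as long as the
server holds the domain open, which is the behaviour the question is
about.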


>
> Regards,
> Pasha.
> >
> > Thanks.
> > --CQ
> >
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: Pavel Shamis (Pasha) [mailto:pasha at dev.mellanox.co.il]
> >> Sent: Thursday, December 20, 2007 9:15 AM
> >> To: Jack Morgenstein
> >> Cc: Tang, Changqing; Roland Dreier;
> >> general at lists.openfabrics.org; Open MPI Developers;
> >> mvapich-discuss at cse.ohio-state.edu
> >> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> >> independent of any one user process
> >>
> >> Adding Open MPI and MVAPICH community to the thread.
> >>
> >> Pasha (Pavel Shamis)
> >>
> >> Jack Morgenstein wrote:
> >>
> >>> background:  see "XRC Cleanup order issue thread" at
> >>>
> >>> http://lists.openfabrics.org/pipermail/general/2007-December/043935.html
> >>>
> >>> (userspace process which created the receiving XRC qp on a given host
> >>> dies before other processes which still need to receive XRC messages
> >>> on their SRQs which are "paired" with the now-destroyed receiving XRC
> >>> QP.)
> >>>
> >>> Solution: Add a userspace verb (as part of the XRC suite) which
> >>> enables the user process to create an XRC QP owned by the kernel --
> >>> which belongs to the required XRC domain.
> >>>
> >>> This QP will be destroyed when the XRC domain is closed (i.e., as part
> >>> of a ibv_close_xrc_domain call, but only when the domain's reference
> >>> count goes to zero).
> >>>
> >>> Below, I give the new userspace API for this function.  Any feedback
> >>> will be appreciated.
> >>> This API will be implemented in the upcoming OFED 1.3 release, so we
> >>> need feedback ASAP.
> >>>
> >>> Notes:
> >>> 1. There is no query or destroy verb for this QP. There is also no
> >>>    userspace object for the QP. Userspace has ONLY the raw qp number
> >>>    to use when creating the (X)RC connection.
> >>>
> >>> 2. Since the QP is "owned" by kernel space, async events for this QP
> >>>    are also handled in kernel space (i.e., reported in
> >>>    /var/log/messages). There are no completion events for the QP,
> >>>    since it does not send, and all receives completions are reported
> >>>    in the XRC SRQ's cq.
> >>>
> >>>    If this QP enters the error state, the remote QP which sends will
> >>>    start receiving RETRY_EXCEEDED errors, so the application will be
> >>>    aware of the failure.
> >>>
> >>> - Jack
> >>>
> >>> ======================================================================
> >>> /**
> >>>  * ibv_alloc_xrc_rcv_qp - creates an XRC QP for serving as a
> >>>  *    receive-side only QP, and moves the created qp through the
> >>>  *    RESET->INIT and INIT->RTR transitions.
> >>>  *      (The RTR->RTS transition is not needed, since this QP does no
> >>>  *      sending).
> >>>  *    The sending XRC QP uses this QP as destination, while specifying
> >>>  *    an XRC SRQ for actually receiving the transmissions and
> >>>  *    generating all completions on the receiving side.
> >>>  *
> >>>  *    This QP is created in kernel space, and persists until the XRC
> >>>  *    domain is closed.
> >>>  *    (i.e., its reference count goes to zero).
> >>>  *
> >>>  * @pd: protection domain to use.  At lower layer, this provides
> >>>  *    access to userspace obj
> >>>  * @xrc_domain: xrc domain to use for the QP.
> >>>  * @attr: modify-qp attributes needed to bring the QP to RTR.
> >>>  * @attr_mask:  bitmap indicating which attributes are provided in the
> >>>  *    attr struct.
> >>>  *    used for validity checking.
> >>>  * @xrc_rcv_qpn: qp_num of created QP (if success). To be passed to the
> >>>  *    remote node. The remote node will use xrc_rcv_qpn in
> >>>  *    ibv_post_send when sending to XRC SRQ's on this host in the same
> >>>  *    xrc domain.
> >>>  *
> >>>  * RETURNS: success (0), or a (negative) error value.
> >>>  */
> >>>
> >>> int ibv_alloc_xrc_rcv_qp(struct ibv_pd *pd,
> >>>                        struct ibv_xrc_domain *xrc_domain,
> >>>                        struct ibv_qp_attr *attr,
> >>>                        enum ibv_qp_attr_mask attr_mask,
> >>>                        uint32_t *xrc_rcv_qpn);
> >>>
> >>> Notes:
> >>>
> >>> 1. Although the kernel creates the qp in the kernel's own PD, we still
> >>>    need the PD parameter to determine the device.
> >>>
> >>> 2. I chose to use struct ibv_qp_attr, which is used in modify QP,
> >>>    rather than create a new structure for this purpose.  This also
> >>>    guards against API changes in the event that during development I
> >>>    notice that more modify-qp parameters must be specified for this
> >>>    operation to work.
> >>>
> >>> 3. Table of the ibv_qp_attr parameters showing what values to set:
> >>>
> >>> struct ibv_qp_attr {
> >>>       enum ibv_qp_state       qp_state;               Not needed
> >>>       enum ibv_qp_state       cur_qp_state;           Not needed
> >>>               -- Driver starts from RESET and takes qp to RTR.
> >>>       enum ibv_mtu            path_mtu;               Yes
> >>>       enum ibv_mig_state      path_mig_state;         Yes
> >>>       uint32_t                qkey;                   Yes
> >>>       uint32_t                rq_psn;                 Yes
> >>>       uint32_t                sq_psn;                 Not needed
> >>>       uint32_t                dest_qp_num;            Yes
> >>>               -- this is the remote side QP for the RC conn.
> >>>       int                     qp_access_flags;        Yes
> >>>       struct ibv_qp_cap       cap;                    Need only XRC domain.
> >>>               -- Other caps will use hard-coded values:
> >>>                       max_send_wr = 1;
> >>>                       max_recv_wr = 0;
> >>>                       max_send_sge = 1;
> >>>                       max_recv_sge = 0;
> >>>                       max_inline_data = 0;
> >>>       struct ibv_ah_attr      ah_attr;                Yes
> >>>       struct ibv_ah_attr      alt_ah_attr;            Optional
> >>>       uint16_t                pkey_index;             Yes
> >>>       uint16_t                alt_pkey_index;         Optional
> >>>       uint8_t                 en_sqd_async_notify;    Not needed (No sq)
> >>>       uint8_t                 sq_draining;            Not needed (No sq)
> >>>       uint8_t                 max_rd_atomic;          Not needed (No sq)
> >>>       uint8_t                 max_dest_rd_atomic;     Yes
> >>>               -- Total max outstanding RDMAs expected for ALL srq
> >>>                  destinations using this receive QP.
> >>>                  (if you are only using SENDs, this value can be 0).
> >>>       uint8_t                 min_rnr_timer;          default - 0
> >>>       uint8_t                 port_num;               Yes
> >>>       uint8_t                 timeout;                Yes
> >>>       uint8_t                 retry_cnt;              Yes
> >>>       uint8_t                 rnr_retry;              Yes
> >>>       uint8_t                 alt_port_num;           Optional
> >>>       uint8_t                 alt_timeout;            Optional
> >>> };
> >>>
> >>> 4. Attribute mask bits to set:
> >>>       For RESET_to_INIT transition:
> >>>               IB_QP_ACCESS_FLAGS | IB_QP_PKEY_INDEX | IB_QP_PORT
> >>>
> >>>       For INIT_to_RTR transition:
> >>>               IB_QP_AV | IB_QP_PATH_MTU |
> >>>               IB_QP_DEST_QPN | IB_QP_RQ_PSN | IB_QP_MIN_RNR_TIMER
> >>>          If you are using RDMA or atomics, also set:
> >>>               IB_QP_MAX_DEST_RD_ATOMIC
> >>>
> >>>
> >>> _______________________________________________
> >>> general mailing list
> >>> general at lists.openfabrics.org
> >>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >>>
> >>> To unsubscribe, please visit
> >>> http://openib.org/mailman/listinfo/openib-general
> >>>
> >> --
> >> Pavel Shamis (Pasha)
> >> Mellanox Technologies
> >>
> >>
> >>
> >
> >
>
>
> --
> Pavel Shamis (Pasha)
> Mellanox Technologies
>
>


From kliteyn at mellanox.co.il  Mon Dec 24 21:06:23 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 25 Dec 2007 07:06:23 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-25:normal completion
Message-ID: <MTLEXCH01FnW0kKIBXE00001cde@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-24
OpenSM git rev = Sat_Dec_22_16:15:50_2007 [56399c9945bae3c11433e558a76c2b7c4ef67da6]
ibutils git rev = Wed_Dec_19_12:06:28_2007 [9961475294fbf1d3782edb8f377a77b13fa80d70]
 
 
Total=520  Pass=519  Fail=1
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
12 LidMgr IS3-128.topo

Failures:
1 LidMgr IS3-128.topo


From sashak at voltaire.com  Mon Dec 24 22:47:16 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 25 Dec 2007 06:47:16 +0000
Subject: [ofa-general] [PATCH] ibnetdiscover - ports report
In-Reply-To: <C44068BB95F2E54DB07822CE539BCF1CBD90D1@exus01.voltaire.com>
References: <C44068BB95F2E54DB07822CE539BCF1CBD90D1@exus01.voltaire.com>
Message-ID: <20071225064716.GD7012@sashak.voltaire.com>

On 23:38 Thu 20 Dec     , Erez Strauss wrote:
> Hello IB developers and users,
>
> I would like to get feedback on the following patch to ibnetdiscover.
>
> The patch introduces an additional output mode for ibnetdiscover that
> focuses on the ports and prints one line per port with the needed
> port information.
>
> The output looks like:
>
> SW     4 18 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4 17 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4 16 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4 15 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4 14 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4 13 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4  9 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4  8 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4  7 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4  6 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4  5 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4  4 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> SW     4  1 0x0008f104003f0838 4x SDR - SW     6  3 0x0008f104004005f5 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> SW     4  2 0x0008f104003f0838 4x SDR - SW     7  3 0x0008f104004005f6 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> SW     4  3 0x0008f104003f0838 4x SDR - SW     1  3 0x0008f104004005f7 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> SW     4 10 0x0008f104003f0838 4x SDR - SW     8  3 0x0008f104004006f5 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> SW     4 11 0x0008f104003f0838 4x SDR - SW     9  3 0x0008f104004006f6 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> SW     4 12 0x0008f104003f0838 4x SDR - SW    10  3 0x0008f104004006f7 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> CA    14  1 0x0008f10403960091 4x SDR - SW     4 20 0x0008f104003f0838 ( 'Voltaire HCA400' - 'ISR9288/ISR9096 Voltaire sLB-24' )
> CA    11  1 0x0002c90107a4e431 4x SDR - SW     4 19 0x0008f104003f0838 ( 'Voltaire HCA400' - 'ISR9288/ISR9096 Voltaire sLB-24' )
> CA     2  1 0x0008f1000102d801 4x SDR - SW     1 15 0x0008f104004005f7 ( 'Voltaire IB-to-TCP/IP Router' - 'ISR9288 Voltaire 
>
> Thanks,
>
> Erez Strauss
> Voltaire.
>
> -------------
> Date:   Thu Dec 20 19:36:14 2007 -0500
>
>      Added the -p(orts) option, to generate ports reports
>
>     Signed-off-by: Erez Strauss <erezs _at_ voltaire.com>

Applied. Thanks.

Please note that the originally posted patch was completely mangled by
your mailer (so the attached version was useful).

Sasha
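
For anyone scripting against the per-port output quoted above, here is a
minimal parsing sketch. It assumes the leading columns are node type,
LID, port number, GUID, link width and link speed; the patch description
does not name the columns, so that reading, the struct port_line layout
and the parse_port_line helper are all assumptions for illustration and
are not part of ibnetdiscover.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record for the leading fields of one port line. */
struct port_line {
    char     type[3];   /* "SW" or "CA" in the sample output */
    int      lid;       /* assumed: LID of the node */
    int      port;      /* assumed: port number */
    uint64_t guid;      /* 64-bit GUID printed in hex */
    int      width;     /* link width, the "4" in "4x" */
    char     speed[4];  /* link speed, e.g. "SDR" */
};

/* Parse the leading fields of one line; returns 0 on success, -1 otherwise.
 * Anything after the speed (peer port, quoted names) is ignored here. */
static int parse_port_line(const char *line, struct port_line *out)
{
    int n;

    assert(line != 0 && out != 0);
    n = sscanf(line, "%2s %d %d %" SCNx64 " %dx %3s",
               out->type, &out->lid, &out->port, &out->guid,
               &out->width, out->speed);
    return n == 6 ? 0 : -1;
}
```

The fixed field widths in the format string keep the two small string
buffers from overflowing on unexpected input.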


From glebn at voltaire.com  Mon Dec 24 22:43:06 2007
From: glebn at voltaire.com (Gleb Natapov)
Date: Tue, 25 Dec 2007 08:43:06 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent
	of any one user process
In-Reply-To: <D89C2C212795564B837FA1665CAE02990FE241E577@G5W0278.americas.hpqcorp.net>
References: <200712201535.37527.jackm@dev.mellanox.co.il>
	<476A86E8.8020308@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE239BCBD@G5W0278.americas.hpqcorp.net>
	<476FBC1D.6050900@dev.mellanox.co.il>
	<D89C2C212795564B837FA1665CAE02990FE241E577@G5W0278.americas.hpqcorp.net>
Message-ID: <20071225064306.GE26019@minantech.com>

On Mon, Dec 24, 2007 at 11:49:37PM +0000, Tang, Changqing wrote:
> 
> 
> > -----Original Message-----
> > From: Pavel Shamis (Pasha) [mailto:pasha at dev.mellanox.co.il]
> > Sent: Monday, December 24, 2007 8:03 AM
> > To: Tang, Changqing
> > Cc: Jack Morgenstein; Roland Dreier;
> > general at lists.openfabrics.org; Open MPI Developers;
> > mvapich-discuss at cse.ohio-state.edu
> > Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> > independent of any one user process
> >
> > Hi CQ,
> > Tang, Changqing wrote:
> > >         If I have an MPI server process on a node, many other MPI
> > > client processes will dynamically connect to and disconnect from the
> > > server. The server uses the same XRC domain.
> > >
> > >         Will this cause "kernel" QPs to accumulate for such an
> > > application? We want the server to run 365 days a year.
> > >
> > I have a question about the scenario above. Do you call MPI disconnect
> > on both ends (server and client) before the client exits? (Must we do
> > that?)
> 
> Yes, both ends will call disconnect. But for us, MPI_Comm_disconnect() is
> not a collective call; it is just a local operation.
But the spec says that MPI_Comm_disconnect() is a collective call:
http://www.mpi-forum.org/docs/mpi-20-html/node114.htm#Node114

> 
> --CQ
> 
> 
> >
> > Regards,
> > Pasha.
> > >
> > > Thanks.
> > > --CQ
> > >
> > >
> > >
> > >
> > >
> > >> -----Original Message-----
> > >> From: Pavel Shamis (Pasha) [mailto:pasha at dev.mellanox.co.il]
> > >> Sent: Thursday, December 20, 2007 9:15 AM
> > >> To: Jack Morgenstein
> > >> Cc: Tang, Changqing; Roland Dreier;
> > >> general at lists.openfabrics.org; Open MPI Developers;
> > >> mvapich-discuss at cse.ohio-state.edu
> > >> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> > >> independent of any one user process
> > >>
> > >> Adding Open MPI and MVAPICH community to the thread.
> > >>
> > >> Pasha (Pavel Shamis)
> > >>
> > >> Jack Morgenstein wrote:
> > >>
> > >>> background:  see "XRC Cleanup order issue thread" at
> > >>>
> > >>>
> > >>>
> > >>>
> > >>
> > http://lists.openfabrics.org/pipermail/general/2007-December/043935.h
> > >> t
> > >>
> > >>> ml
> > >>>
> > >>> (userspace process which created the receiving XRC qp on a
> > >>>
> > >> given host
> > >>
> > >>> dies before other processes which still need to receive XRC
> > >>>
> > >> messages
> > >>
> > >>> on their SRQs which are "paired" with the now-destroyed
> > >>>
> > >> receiving XRC
> > >>
> > >>> QP.)
> > >>>
> > >>> Solution: Add a userspace verb (as part of the XRC suite) which
> > >>> enables the user process to create an XRC QP owned by the
> > >>>
> > >> kernel -- which belongs to the required XRC domain.
> > >>
> > >>> This QP will be destroyed when the XRC domain is closed
> > >>>
> > >> (i.e., as part
> > >>
> > >>> of a ibv_close_xrc_domain call, but only when the domain's
> > >>>
> > >> reference count goes to zero).
> > >>
> > >>> Below, I give the new userspace API for this function.  Any
> > >>>
> > >> feedback will be appreciated.
> > >>
> > >>> This API will be implemented in the upcoming OFED 1.3
> > >>>
> > >> release, so we need feedback ASAP.
> > >>
> > >>> Notes:
> > >>> 1. There is no query or destroy verb for this QP. There is
> > >>>
> > >> also no userspace object for the
> > >>
> > >>>    QP. Userspace has ONLY the raw qp number to use when
> > >>>
> > >> creating the (X)RC connection.
> > >>
> > >>> 2. Since the QP is "owned" by kernel space, async events
> > >>>
> > >> for this QP are also handled in kernel
> > >>
> > >>>    space (i.e., reported in /var/log/messages). There are
> > >>>
> > >> no completion events for the QP, since
> > >>
> > >>>    it does not send, and all receives completions are
> > >>>
> > >> reported in the XRC SRQ's cq.
> > >>
> > >>>    If this QP enters the error state, the remote QP which
> > >>>
> > >> sends will start receiving RETRY_EXCEEDED
> > >>
> > >>>    errors, so the application will be aware of the failure.
> > >>>
> > >>> - Jack
> > >>>
> > >>>
> > >>
> > =====================================================================
> > >> =
> > >>
> > >>> ================
> > >>> /**
> > >>>  * ibv_alloc_xrc_rcv_qp - creates an XRC QP for serving as
> > >>>
> > >> a receive-side only QP,
> > >>
> > >>>  *    and moves the created qp through the RESET->INIT and
> > >>>
> > >> INIT->RTR transitions.
> > >>
> > >>>  *      (The RTR->RTS transition is not needed, since this
> > >>>
> > >> QP does no sending).
> > >>
> > >>>  *    The sending XRC QP uses this QP as destination, while
> > >>>
> > >> specifying an XRC SRQ
> > >>
> > >>>  *    for actually receiving the transmissions and
> > >>>
> > >> generating all completions on the
> > >>
> > >>>  *    receiving side.
> > >>>  *
> > >>>  *    This QP is created in kernel space, and persists
> > >>>
> > >> until the XRC domain is closed.
> > >>
> > >>>  *    (i.e., its reference count goes to zero).
> > >>>  *
> > >>>  * @pd: protection domain to use.  At lower layer, this provides
> > >>> access to userspace obj
> > >>>  * @xrc_domain: xrc domain to use for the QP.
> > >>>  * @attr: modify-qp attributes needed to bring the QP to RTR.
> > >>>  * @attr_mask:  bitmap indicating which attributes are
> > >>>
> > >> provided in the attr struct.
> > >>
> > >>>  *    used for validity checking.
> > >>>  * @xrc_rcv_qpn: qp_num of created QP (if success). To be
> > >>>
> > >> passed to the remote node. The
> > >>
> > >>>  *               remote node will use xrc_rcv_qpn in
> > >>>
> > >> ibv_post_send when sending to
> > >>
> > >>>  *             XRC SRQ's on this host in the same xrc domain.
> > >>>  *
> > >>>  * RETURNS: success (0), or a (negative) error value.
> > >>>  */
> > >>>
> > >>> int ibv_alloc_xrc_rcv_qp(struct ibv_pd *pd,
> > >>>                        struct ibv_xrc_domain *xrc_domain,
> > >>>                        struct ibv_qp_attr *attr,
> > >>>                        enum ibv_qp_attr_mask attr_mask,
> > >>>                        uint32_t *xrc_rcv_qpn);
> > >>>
> > >>> Notes:
> > >>>
> > >>> 1. Although the kernel creates the qp in the kernel's own
> > >>>
> > >> PD, we still need the PD
> > >>
> > >>>    parameter to determine the device.
> > >>>
> > >>> 2. I chose to use struct ibv_qp_attr, which is used in
> > >>>
> > >> modify QP, rather than create
> > >>
> > >>>    a new structure for this purpose.  This also guards
> > >>>
> > >> against API changes in the event
> > >>
> > >>>    that during development I notice that more modify-qp
> > >>>
> > >> parameters must be specified
> > >>
> > >>>    for this operation to work.
> > >>>
> > >>> 3. Table of the ibv_qp_attr parameters showing what values to set:
> > >>>
> > >>> struct ibv_qp_attr {
> > >>>       enum ibv_qp_state       qp_state;               Not needed
> > >>>       enum ibv_qp_state       cur_qp_state;           Not needed
> > >>>               -- Driver starts from RESET and takes qp to RTR.
> > >>>       enum ibv_mtu            path_mtu;               Yes
> > >>>       enum ibv_mig_state      path_mig_state;         Yes
> > >>>       uint32_t                qkey;                   Yes
> > >>>       uint32_t                rq_psn;                 Yes
> > >>>       uint32_t                sq_psn;                 Not needed
> > >>>       uint32_t                dest_qp_num;            Yes
> > >>>               -- this is the remote side QP for the RC conn.
> > >>>       int                     qp_access_flags;        Yes
> > >>>       struct ibv_qp_cap       cap;                    Need only XRC domain.
> > >>>               -- Other caps will use hard-coded values:
> > >>>                    max_send_wr = 1;
> > >>>                    max_recv_wr = 0;
> > >>>                    max_send_sge = 1;
> > >>>                    max_recv_sge = 0;
> > >>>                    max_inline_data = 0;
> > >>>       struct ibv_ah_attr      ah_attr;                Yes
> > >>>       struct ibv_ah_attr      alt_ah_attr;            Optional
> > >>>       uint16_t                pkey_index;             Yes
> > >>>       uint16_t                alt_pkey_index;         Optional
> > >>>       uint8_t                 en_sqd_async_notify;    Not needed (No sq)
> > >>>       uint8_t                 sq_draining;            Not needed (No sq)
> > >>>       uint8_t                 max_rd_atomic;          Not needed (No sq)
> > >>>       uint8_t                 max_dest_rd_atomic;     Yes
> > >>>               -- Total max outstanding RDMAs expected for ALL srq
> > >>>                  destinations using this receive QP (if you are only
> > >>>                  using SENDs, this value can be 0).
> > >>>       uint8_t                 min_rnr_timer;          default - 0
> > >>>       uint8_t                 port_num;               Yes
> > >>>       uint8_t                 timeout;                Yes
> > >>>       uint8_t                 retry_cnt;              Yes
> > >>>       uint8_t                 rnr_retry;              Yes
> > >>>       uint8_t                 alt_port_num;           Optional
> > >>>       uint8_t                 alt_timeout;            Optional
> > >>> };
> > >>>
> > >>> 4. Attribute mask bits to set:
> > >>>       For RESET_to_INIT transition:
> > >>>               IB_QP_ACCESS_FLAGS | IB_QP_PKEY_INDEX | IB_QP_PORT
> > >>>
> > >>>       For INIT_to_RTR transition:
> > >>>               IB_QP_AV | IB_QP_PATH_MTU |
> > >>>               IB_QP_DEST_QPN | IB_QP_RQ_PSN | IB_QP_MIN_RNR_TIMER
> > >>>          If you are using RDMA or atomics, also set:
> > >>>               IB_QP_MAX_DEST_RD_ATOMIC
> > >>>
> > >>>
> > >>> _______________________________________________
> > >>> general mailing list
> > >>> general at lists.openfabrics.org
> > >>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > >>>
> > >>> To unsubscribe, please visit
> > >>> http://openib.org/mailman/listinfo/openib-general
> > >>>
> > >>>
> > >>>
> > >> --
> > >> Pavel Shamis (Pasha)
> > >> Mellanox Technologies
> > >>
> > >>
> > >>
> > >
> > >
> >
> >
> > --
> > Pavel Shamis (Pasha)
> > Mellanox Technologies
> >
> >

--
			Gleb.


From kliteyn at dev.mellanox.co.il  Tue Dec 25 01:30:54 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 25 Dec 2007 11:30:54 +0200
Subject: [ofa-general] [PATCH v3] opensm: osm_state_mgr.c - purge idle queue
 if heavy sweep requested
Message-ID: <4770CDCE.8040200@dev.mellanox.co.il>

If a heavy sweep is requested during idle queue processing, OSM continues
to process the queue to the end and only then notices the heavy sweep
request. In some cases this might leave a topology change unhandled for
several minutes. Instead, OSM should purge the idle queue, which causes it
to start the heavy sweep immediately.
Once the heavy sweep completes, OSM recalculates multicast routing.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_state_mgr.c |   30 +++++++++++++++++++++++++++++-
 1 files changed, 29 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 5c39f11..2172ea0 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -1062,6 +1062,28 @@ static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t *
 }

 /**********************************************************************
+ **********************************************************************/
+static void __process_idle_time_queue_purge(IN osm_state_mgr_t * const p_mgr)
+{
+	cl_qlist_t *p_list = &p_mgr->idle_time_list;
+	cl_list_item_t *p_list_item;
+	osm_idle_item_t *p_process_item;
+
+	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
+
+	cl_spinlock_acquire(&p_mgr->idle_lock);
+	p_list_item = cl_qlist_remove_head(p_list);
+	while (p_list_item != cl_qlist_end(p_list)) {
+		p_process_item = (osm_idle_item_t *) p_list_item;
+		free(p_process_item);
+	}
+	cl_spinlock_release(&p_mgr->idle_lock);
+
+	OSM_LOG_EXIT(p_mgr->p_log);
+	return;
+}
+
+/**********************************************************************
  * Go over all the remote SMs (as updated in the sm_guid_tbl).
  * Find if there is a remote sm that is a master SM.
  * If there is a remote master SM - return a pointer to it,
@@ -1607,11 +1629,17 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
 				/* CALL the done function */
 				__process_idle_time_queue_done(p_mgr);

+				if (p_mgr->p_subn->force_immediate_heavy_sweep)
+					/*
+					 * Immediate heavy sweep is requested, so it's
+					 * more important than processing idle queue, and
+					 * new heavy sweep makes rest of idle queue obsolete.
+					 */
+					__process_idle_time_queue_purge(p_mgr);
 				/*
 				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
 				 * so that the next element in the queue gets processed
 				 */
-
 				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
 				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
 				break;
-- 
1.5.1.4
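
For readers following the purge logic, here is a minimal, self-contained
sketch of draining a queue while holding its lock. It is not the OpenSM
code: struct idle_item, struct idle_queue, queue_remove_head() and
queue_purge() are hypothetical stand-ins for the complib list and the idle
queue, and a pthread mutex stands in for the spinlock. The point it
illustrates is that a drain loop has to fetch the head again on every pass,
otherwise the same item is freed repeatedly and the loop never ends.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the complib list item and the idle queue. */
struct idle_item {
	struct idle_item *next;
};

struct idle_queue {
	pthread_mutex_t lock;
	struct idle_item *head;
};

/* Pop the first item, or return NULL when the queue is empty. */
static struct idle_item *queue_remove_head(struct idle_queue *q)
{
	struct idle_item *item = q->head;

	if (item)
		q->head = item->next;
	return item;
}

/* Free every queued item; returns the number of items dropped. */
static unsigned queue_purge(struct idle_queue *q)
{
	struct idle_item *item;
	unsigned dropped = 0;

	pthread_mutex_lock(&q->lock);
	/* Re-fetch the head on every pass so the loop ends on an empty queue. */
	while ((item = queue_remove_head(q)) != NULL) {
		free(item);
		dropped++;
	}
	pthread_mutex_unlock(&q->lock);
	return dropped;
}
```

Returning the number of dropped items is only a convenience for logging or
testing; the function in the patch returns void.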


From vlad at lists.openfabrics.org  Tue Dec 25 03:15:09 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Tue, 25 Dec 2007 03:15:09 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071225-0200 daily build status
Message-ID: <20071225111509.92237E6022D@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on ia64 with linux-2.6.18
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.13
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.12
Passed on ppc64 with linux-2.6.15
Passed on x86_64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on powerpc with linux-2.6.12
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ppc64 with linux-2.6.12
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.22
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ppc64 with linux-2.6.18-8.el5

Failed:


From kliteyn at dev.mellanox.co.il  Tue Dec 25 04:29:47 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 25 Dec 2007 14:29:47 +0200
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only outbound
 partition enforcement on switch
Message-ID: <4770F7BB.6040502@dev.mellanox.co.il>

Fix the incorrect setting of partition enforcement bits on switch ports.
When an HCA port is configured with a certain pkey, the peer port
on the switch should turn on the outbound partition enforcement bit only.
Turning on inbound enforcement causes the switch to drop
valid packets if the HCA is a partial member.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_pkey_mgr.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c
index 58eed04..209fa71 100644
--- a/opensm/opensm/osm_pkey_mgr.c
+++ b/opensm/opensm/osm_pkey_mgr.c
@@ -212,7 +212,8 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,

 	p_pi = &p_physp->port_info;

-	if ((p_pi->vl_enforce & 0xc) == (0xc) * (enforce == TRUE)) {
+	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
+	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {
 		osm_log(p_log, OSM_LOG_DEBUG,
 			"pkey_mgr_enforce_partition: "
 			"No need to update PortInfo for "
@@ -227,10 +228,13 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,
 	memcpy(payload, p_pi, sizeof(ib_port_info_t));

 	p_pi = (ib_port_info_t *) payload;
+
+	/* clearing enforcement in both directions */
+	p_pi->vl_enforce &= ~0xc;
 	if (enforce == TRUE)
-		p_pi->vl_enforce |= 0xc;
-	else
-		p_pi->vl_enforce &= ~0xc;
+		/* enforcing only in outbound direction */
+		p_pi->vl_enforce |= 0x4;
+
 	p_pi->state_info2 = 0;
 	ib_port_info_set_port_state(p_pi, IB_LINK_NO_CHANGE);

-- 
1.5.1.4
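
The bit handling in this patch can be sketched in isolation as below. This
is an illustrative sketch, not the OpenSM code: the macro and function
names are hypothetical, 0x4 is the outbound bit as used in the patch, and
treating 0x8 (the other bit of the 0xc mask) as the inbound bit is an
assumption. The sketch clears both enforcement bits, sets only the outbound
one when enforcement is wanted, and reports whether the port already holds
the wanted value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as used in the patch; the inbound value is an assumption. */
#define PART_ENFORCE_INBOUND  0x8	/* assumed: the other bit of 0xc */
#define PART_ENFORCE_OUTBOUND 0x4	/* per the patch */
#define PART_ENFORCE_MASK     (PART_ENFORCE_INBOUND | PART_ENFORCE_OUTBOUND)

/* Return the vl_enforce value the switch port should end up with. */
static uint8_t wanted_vl_enforce(uint8_t vl_enforce, bool enforce)
{
	/* Clear enforcement in both directions, keep all other bits. */
	vl_enforce &= (uint8_t)~PART_ENFORCE_MASK;
	if (enforce)
		vl_enforce |= PART_ENFORCE_OUTBOUND;
	return vl_enforce;
}

/* True when the port already has the wanted setting (no update needed). */
static bool vl_enforce_up_to_date(uint8_t vl_enforce, bool enforce)
{
	return wanted_vl_enforce(vl_enforce, enforce) == vl_enforce;
}
```

Deriving the "no update needed" test from the same helper that computes the
new value keeps the two in step, which is the property the two-clause
condition in the patch has to maintain by hand.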


From sashak at voltaire.com  Tue Dec 25 09:19:59 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 25 Dec 2007 17:19:59 +0000
Subject: [ofa-general] Re: [PATCH v3] opensm: osm_state_mgr.c - purge idle
	queue if heavy sweep requested
In-Reply-To: <4770CDCE.8040200@dev.mellanox.co.il>
References: <4770CDCE.8040200@dev.mellanox.co.il>
Message-ID: <20071225171959.GC369@sashak.voltaire.com>

Hi Yevgeny,

On 11:30 Tue 25 Dec     , Yevgeny Kliteynik wrote:
> If a heavy sweep requested during idle queue processing, OSM continues
> to process it till the end and only then notices the heavy sweep request.
> In some cases this might leave a topology change unhandled for several
> minutes. Instead, OSM should purge the idle queue, which will cause it
> to start the heavy sweep immediately.
> When the heavy sweep will be completed, OSM will recalculate multicast
> routing.
> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> ---
>  opensm/opensm/osm_state_mgr.c |   30 +++++++++++++++++++++++++++++-
>  1 files changed, 29 insertions(+), 1 deletions(-)
> 
> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
> index 5c39f11..2172ea0 100644
> --- a/opensm/opensm/osm_state_mgr.c
> +++ b/opensm/opensm/osm_state_mgr.c
> @@ -1062,6 +1062,28 @@ static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t *
>  }
> 
>  /**********************************************************************
> + **********************************************************************/
> +static void __process_idle_time_queue_purge(IN osm_state_mgr_t * const p_mgr)
> +{
> +	cl_qlist_t *p_list = &p_mgr->idle_time_list;
> +	cl_list_item_t *p_list_item;
> +	osm_idle_item_t *p_process_item;
> +
> +	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
> +
> +	cl_spinlock_acquire(&p_mgr->idle_lock);
> +	p_list_item = cl_qlist_remove_head(p_list);
> +	while (p_list_item != cl_qlist_end(p_list)) {
> +		p_process_item = (osm_idle_item_t *) p_list_item;
> +		free(p_process_item);
> +	}
> +	cl_spinlock_release(&p_mgr->idle_lock);
> +
> +	OSM_LOG_EXIT(p_mgr->p_log);
> +	return;
> +}
> +
> +/**********************************************************************
>   * Go over all the remote SMs (as updated in the sm_guid_tbl).
>   * Find if there is a remote sm that is a master SM.
>   * If there is a remote master SM - return a pointer to it,
> @@ -1607,11 +1629,17 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>  				/* CALL the done function */
>  				__process_idle_time_queue_done(p_mgr);
> 
> +				if (p_mgr->p_subn->force_immediate_heavy_sweep)
> +					/*
> +					 * Immediate heavy sweep is requested, so it's
> +					 * more important than processing idle queue, and
> +					 * new heavy sweep makes rest of idle queue obsolete.
> +					 */
> +					__process_idle_time_queue_purge(p_mgr);

I think here

	signal = OSM_SIGNAL_NONE;
	p_mgr->state = OSM_SM_STATE_IDLE;

should be added in order to not spend one more idle queue cycle. Right?

Sasha

>  				/*
>  				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>  				 * so that the next element in the queue gets processed
>  				 */
> -
>  				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>  				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>  				break;
> -- 
> 1.5.1.4
> 
> 


From sashak at voltaire.com  Tue Dec 25 09:24:47 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Tue, 25 Dec 2007 17:24:47 +0000
Subject: [ofa-general] Re: [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <4770F7BB.6040502@dev.mellanox.co.il>
References: <4770F7BB.6040502@dev.mellanox.co.il>
Message-ID: <20071225172447.GD369@sashak.voltaire.com>

On 14:29 Tue 25 Dec     , Yevgeny Kliteynik wrote:
> Fixing wrong setting of partition enforcement bits on switch ports.
> When an HCA port is configured with a certain pkey, the peer port
> on the switch should turn on outbound partition enforcement bit only.
> Turning on the inbound enforcement will cause the switch to drop
> valid packets if the HCA is partial member.

Nice catch!

I wonder how the "inbound enforcement" switch service could be useful at
all.

> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied. Thanks.

Sasha


From kliteyn at dev.mellanox.co.il  Tue Dec 25 13:17:11 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 25 Dec 2007 23:17:11 +0200
Subject: [ofa-general] Re: [PATCH v3] opensm: osm_state_mgr.c - purge idle
 queue if heavy sweep requested
In-Reply-To: <20071225171959.GC369@sashak.voltaire.com>
References: <4770CDCE.8040200@dev.mellanox.co.il>
	<20071225171959.GC369@sashak.voltaire.com>
Message-ID: <47717357.204@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> Hi Yevgeny,
> 
> On 11:30 Tue 25 Dec     , Yevgeny Kliteynik wrote:
>> If a heavy sweep requested during idle queue processing, OSM continues
>> to process it till the end and only then notices the heavy sweep request.
>> In some cases this might leave a topology change unhandled for several
>> minutes. Instead, OSM should purge the idle queue, which will cause it
>> to start the heavy sweep immediately.
>> When the heavy sweep will be completed, OSM will recalculate multicast
>> routing.
>>
>> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>> ---
>>  opensm/opensm/osm_state_mgr.c |   30 +++++++++++++++++++++++++++++-
>>  1 files changed, 29 insertions(+), 1 deletions(-)
>>
>> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
>> index 5c39f11..2172ea0 100644
>> --- a/opensm/opensm/osm_state_mgr.c
>> +++ b/opensm/opensm/osm_state_mgr.c
>> @@ -1062,6 +1062,28 @@ static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t *
>>  }
>>
>>  /**********************************************************************
>> + **********************************************************************/
>> +static void __process_idle_time_queue_purge(IN osm_state_mgr_t * const p_mgr)
>> +{
>> +	cl_qlist_t *p_list = &p_mgr->idle_time_list;
>> +	cl_list_item_t *p_list_item;
>> +	osm_idle_item_t *p_process_item;
>> +
>> +	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
>> +
>> +	cl_spinlock_acquire(&p_mgr->idle_lock);
>> +	p_list_item = cl_qlist_remove_head(p_list);
>> +	while (p_list_item != cl_qlist_end(p_list)) {
>> +		p_process_item = (osm_idle_item_t *) p_list_item;
>> +		free(p_process_item);
>> +	}
>> +	cl_spinlock_release(&p_mgr->idle_lock);
>> +
>> +	OSM_LOG_EXIT(p_mgr->p_log);
>> +	return;
>> +}
>> +
>> +/**********************************************************************
>>   * Go over all the remote SMs (as updated in the sm_guid_tbl).
>>   * Find if there is a remote sm that is a master SM.
>>   * If there is a remote master SM - return a pointer to it,
>> @@ -1607,11 +1629,17 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>>  				/* CALL the done function */
>>  				__process_idle_time_queue_done(p_mgr);
>>
>> +				if (p_mgr->p_subn->force_immediate_heavy_sweep)
>> +					/*
>> +					 * Immediate heavy sweep is requested, so it's
>> +					 * more important than processing idle queue, and
>> +					 * new heavy sweep makes rest of idle queue obsolete.
>> +					 */
>> +					__process_idle_time_queue_purge(p_mgr);
> 
> I think here
> 
> 	signal = OSM_SIGNAL_NONE;
> 	p_mgr->state = OSM_SM_STATE_IDLE;
> 
> should be added in order to not spend one more idle queue cycle. Right?

Basically - yes. That is what I did in v2 of the patch.
But here I wanted to leave the flow as intact as I could.

I don't have a real example to show that changing the state
and signal would be bad. It's just a general feeling that doing
it would be kind of a hack, while w/o these two additional lines
the state manager continues its usual flow after purging the queue.

The price of one additional iteration doesn't look like something
we should worry about, especially given the fact that this whole
flow is relevant only in extreme cases.

-- Yevgeny


> Sasha
> 
>>  				/*
>>  				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>>  				 * so that the next element in the queue gets processed
>>  				 */
>> -
>>  				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>>  				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>>  				break;
>> -- 
>> 1.5.1.4
>>
>>
> 


From kliteyn at dev.mellanox.co.il  Tue Dec 25 13:40:58 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Tue, 25 Dec 2007 23:40:58 +0200
Subject: [ofa-general] Re: [PATCH] opensm/osm_pkey_mgr.c: setting only
 outbound partition enforcement on switch
In-Reply-To: <20071225172447.GD369@sashak.voltaire.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<20071225172447.GD369@sashak.voltaire.com>
Message-ID: <477178EA.6070204@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> On 14:29 Tue 25 Dec     , Yevgeny Kliteynik wrote:
>> Fixing wrong setting of partition enforcement bits on switch ports.
>> When an HCA port is configured with a certain pkey, the peer port
>> on the switch should turn on outbound partition enforcement bit only.
>> Turning on the inbound enforcement will cause the switch to drop
>> valid packets if the HCA is partial member.
> 
> Nice catch!
> 
> Interesting how "inbound enforcement" switch service could be useful at
> all?

Beats me...

-- Yevgeny

>> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> 
> Applied. Thanks.
> 
> Sasha
> 


From kliteyn at mellanox.co.il  Tue Dec 25 21:32:26 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 26 Dec 2007 07:32:26 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-26:normal completion
Message-ID: <MTLEXCH01r0fVwcdZ0600001e4d@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-25
OpenSM git rev = Tue_Dec_25_09:06:00_2007 [f9870eeac88e224f51e69bf9e5710b825e8d9ccd]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]
 
 
Total=560  Pass=560  Fail=0
 
 
Pass:
42 Stability IS1-16.topo
42 Pkey IS1-16.topo
42 OsmTest IS1-16.topo
42 OsmStress IS1-16.topo
42 Multicast IS1-16.topo
42 LidMgr IS1-16.topo
14 Stability IS3-loop.topo
14 Stability IS3-128.topo
14 Pkey IS3-128.topo
14 OsmTest IS3-loop.topo
14 OsmTest IS3-128.topo
14 OsmStress IS3-128.topo
14 Multicast IS3-loop.topo
14 Multicast IS3-128.topo
14 LidMgr IS3-128.topo
14 FatTree merge-roots-4-ary-2-tree.topo
14 FatTree merge-root-4-ary-3-tree.topo
14 FatTree gnu-stallion-64.topo
14 FatTree blend-4-ary-2-tree.topo
14 FatTree RhinoDDR.topo
14 FatTree FullGnu.topo
14 FatTree 4-ary-2-tree.topo
14 FatTree 2-ary-4-tree.topo
14 FatTree 12-node-spaced.topo
14 FTreeFail 4-ary-2-tree-missing-sw-link.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
14 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From vlad at lists.openfabrics.org  Wed Dec 26 03:15:07 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Wed, 26 Dec 2007 03:15:07 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071226-0200 daily build status
Message-ID: <20071226111507.54C2FE60051@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.14
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.15
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.16
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.13
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.12
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.14
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.18-8.el5

Failed:


From kliteyn at dev.mellanox.co.il  Wed Dec 26 03:51:36 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Wed, 26 Dec 2007 13:51:36 +0200
Subject: [ofa-general] [PATCH] osm/osm_mcast_mgr.c: coredump in ofed_1_2
Message-ID: <47724048.2000302@dev.mellanox.co.il>

Hi Sasha,

This patch protects against a possible NULL returned by
osm_node_get_remote_node().

Please apply this fix to branch ofed_1_2 only.
It appears that this coredump has already been fixed for ofed_1_3.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 osm/opensm/osm_mcast_mgr.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/osm/opensm/osm_mcast_mgr.c b/osm/opensm/osm_mcast_mgr.c
index 28c4dcd..51d948f 100644
--- a/osm/opensm/osm_mcast_mgr.c
+++ b/osm/opensm/osm_mcast_mgr.c
@@ -814,6 +814,8 @@ __osm_mcast_mgr_branch(

     p_node = p_sw->p_node;
     p_remote_node = osm_node_get_remote_node( p_node, i, NULL );
+    if (!p_remote_node)
+      continue;

     if( osm_node_get_type( p_remote_node ) == IB_NODE_TYPE_SWITCH )
     {
-- 
1.5.1.4
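
The guard added above follows a common pattern: when a lookup can return
NULL for an unconnected port, skip that port before dereferencing the
result. A minimal sketch of the pattern, with a hypothetical struct node
and count_remote_switches() that are not part of OpenSM:

```c
#include <stddef.h>

#define NUM_PORTS 4

/* Hypothetical node type: remote[i] is NULL when port i has no link. */
struct node {
	int is_switch;
	struct node *remote[NUM_PORTS];
};

/* Count switches attached to the ports of 'n', skipping empty ports. */
static unsigned count_remote_switches(const struct node *n)
{
	unsigned count = 0;

	for (size_t i = 0; i < NUM_PORTS; i++) {
		const struct node *remote = n->remote[i];

		if (!remote)
			continue;	/* nothing attached: do not dereference */
		if (remote->is_switch)
			count++;
	}
	return count;
}
```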


From sashak at voltaire.com  Wed Dec 26 07:42:18 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Wed, 26 Dec 2007 15:42:18 +0000
Subject: [ofa-general] Re: [PATCH] osm/osm_mcast_mgr.c: coredump in ofed_1_2
In-Reply-To: <47724048.2000302@dev.mellanox.co.il>
References: <47724048.2000302@dev.mellanox.co.il>
Message-ID: <20071226154218.GN7012@sashak.voltaire.com>

On 13:51 Wed 26 Dec     , Yevgeny Kliteynik wrote:
> Hi Sasha,
> 
> Protecting against possible NULL returned by osm_node_get_remote_node().
> 
> Please apply this fix to branch ofed_1_2 only.
> It appears that this coredump has already been fixed for ofed_1_3.
> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>

Applied to ofed_1_2. Thanks.

Sasha


From dillowda at ornl.gov  Wed Dec 26 09:14:11 2007
From: dillowda at ornl.gov (David Dillow)
Date: Wed, 26 Dec 2007 12:14:11 -0500
Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
In-Reply-To: <20071223014407L.tomof@acm.org>
References: <1198273973.9979.34.camel@lap75545.ornl.gov>
	<1198275532.9979.43.camel@lap75545.ornl.gov>
	<20071223014407L.tomof@acm.org>
Message-ID: <1198689251.25003.2.camel@lap75545.ornl.gov>


On Sun, 2007-12-23 at 01:41 +0900, FUJITA Tomonori wrote:
> transport_container_unregister(&i->rport_attr_cont) should not fail here.
> 
> It fails because there is still a srp rport.
> 
> I think that as Pete pointed out, srp_remove_one needs to call
> srp_remove_host.
> 
> Can you try this?

That patch oopsed in scsi_remove_host(), but reversing the order has
survived over 500 insert/probe/remove cycles.

Tested-by: David Dillow <dillowda at ornl.gov>
---
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 950228f..77e8b90 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device)
 		list_for_each_entry_safe(target, tmp_target,
 					 &host->target_list, list) {
 			scsi_remove_host(target->scsi_host);
+			srp_remove_host(target->scsi_host);
 			srp_disconnect_target(target);
 			ib_destroy_cm_id(target->cm_id);
 			srp_free_target_ib(target);


From fujita.tomonori at lab.ntt.co.jp  Wed Dec 26 18:58:17 2007
From: fujita.tomonori at lab.ntt.co.jp (FUJITA Tomonori)
Date: Thu, 27 Dec 2007 11:58:17 +0900
Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
In-Reply-To: <1198689251.25003.2.camel@lap75545.ornl.gov>
References: <1198275532.9979.43.camel@lap75545.ornl.gov>
	<20071223014407L.tomof@acm.org>
	<1198689251.25003.2.camel@lap75545.ornl.gov>
Message-ID: <20071227115817I.fujita.tomonori@lab.ntt.co.jp>

On Wed, 26 Dec 2007 12:14:11 -0500
David Dillow <dillowda at ornl.gov> wrote:

> 
> On Sun, 2007-12-23 at 01:41 +0900, FUJITA Tomonori wrote:
> > transport_container_unregister(&i->rport_attr_cont) should not fail here.
> > 
> > It fails because there is still a srp rport.
> > 
> > I think that as Pete pointed out, srp_remove_one needs to call
> > srp_remove_host.
> > 
> > Can you try this?
> 
> That patch oopsed in scsi_remove_host(), but reversing the order has
> survived over 500 insert/probe/remove cycles.

Thanks,

Can you post the oops message? The srp class might have bugs related
to it.


> Tested-by: David Dillow <dillowda at ornl.gov>
> ---
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 950228f..77e8b90 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device)
>  		list_for_each_entry_safe(target, tmp_target,
>  					 &host->target_list, list) {
>  			scsi_remove_host(target->scsi_host);
> +			srp_remove_host(target->scsi_host);
>  			srp_disconnect_target(target);
>  			ib_destroy_cm_id(target->cm_id);
>  			srp_free_target_ib(target);
> 
> 


From fujita.tomonori at lab.ntt.co.jp  Wed Dec 26 20:39:46 2007
From: fujita.tomonori at lab.ntt.co.jp (FUJITA Tomonori)
Date: Thu, 27 Dec 2007 13:39:46 +0900
Subject: [ofa-general] Re: [Stgt-devel] [ewg] New features for OFED 1.4
In-Reply-To: <476F70DB.1000701@Voltaire.COM>
References: <47308DF2.70409@mellanox.co.il>
	<476F70DB.1000701@Voltaire.COM>
Message-ID: <20071227133946F.fujita.tomonori@lab.ntt.co.jp>

On Mon, 24 Dec 2007 10:42:03 +0200
Erez Zilber <erezz at Voltaire.COM> wrote:

> Tziporet Koren wrote:
> >
> > I wish to collect requirements for new features for OFED 1.4
> > Please reply with any request you have (features of existing modules,
> > new modules etc.)
> >
> > Thanks,
> > Tziporet
> >
> Tziporet,
> 
> Voltaire will add support for stgt (SCSI target). It includes support
> for iSCSI over iSER/TCP. For more info about stgt with iSER support:
> 
> http://stgt.berlios.de/
> 
> https://wiki.openfabrics.org/tiki-index.php?page=ISER-target

The latest snapshot includes iSER support:

http://stgt.berlios.de/releases/tgt-20071227.tar.bz2

Maybe it's time to update the wiki page.


From kliteyn at mellanox.co.il  Wed Dec 26 21:40:18 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 27 Dec 2007 07:40:18 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-27:normal completion
Message-ID: <MTLEXCH01i67EKRuYDl00002008@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-26
OpenSM git rev = Tue_Dec_25_14:29:47_2007 [eea8ce4965c401a091d1196bd38d001f92260ede]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]
 
 
Total=560  Pass=559  Fail=1
 
 
Pass:
42 Stability IS1-16.topo
42 Pkey IS1-16.topo
42 OsmTest IS1-16.topo
42 OsmStress IS1-16.topo
42 Multicast IS1-16.topo
42 LidMgr IS1-16.topo
14 Stability IS3-loop.topo
14 Stability IS3-128.topo
14 Pkey IS3-128.topo
14 OsmTest IS3-loop.topo
14 OsmTest IS3-128.topo
14 OsmStress IS3-128.topo
14 Multicast IS3-loop.topo
14 Multicast IS3-128.topo
14 FatTree merge-roots-4-ary-2-tree.topo
14 FatTree merge-root-4-ary-3-tree.topo
14 FatTree gnu-stallion-64.topo
14 FatTree blend-4-ary-2-tree.topo
14 FatTree RhinoDDR.topo
14 FatTree FullGnu.topo
14 FatTree 4-ary-2-tree.topo
14 FatTree 2-ary-4-tree.topo
14 FatTree 12-node-spaced.topo
14 FTreeFail 4-ary-2-tree-missing-sw-link.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
14 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
13 LidMgr IS3-128.topo

Failures:
1 LidMgr IS3-128.topo


From vlad at lists.openfabrics.org  Thu Dec 27 03:13:57 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Thu, 27 Dec 2007 03:13:57 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071227-0200 daily build status
Message-ID: <20071227111357.1B2FAE60062@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.14
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.16
Passed on powerpc with linux-2.6.13
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.18
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.12
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.15
Passed on x86_64 with linux-2.6.13
Passed on ppc64 with linux-2.6.15
Passed on ia64 with linux-2.6.14
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.12
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.14
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.22
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.18-53.el5

Failed:


From kliteyn at mellanox.co.il  Thu Dec 27 05:07:38 2007
From: kliteyn at mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 27 Dec 2007 15:07:38 +0200
Subject: [ofa-general] Re: [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <477178EA.6070204@dev.mellanox.co.il>
References: <4770F7BB.6040502@dev.mellanox.co.il>	<20071225172447.GD369@sashak.voltaire.com>
	<477178EA.6070204@dev.mellanox.co.il>
Message-ID: <4773A39A.9070803@mellanox.co.il>

Yevgeny Kliteynik wrote:
> Sasha Khapyorsky wrote:
>> On 14:29 Tue 25 Dec     , Yevgeny Kliteynik wrote:
>>> Fixing wrong setting of partition enforcement bits on switch ports.
>>> When an HCA port is configured with a certain pkey, the peer port
>>> on the switch should turn on outbound partition enforcement bit only.
>>> Turning on the inbound enforcement will cause the switch to drop
>>> valid packets if the HCA is partial member.
>>
>> Nice catch!
>>
>> Interesting how "inbound enforcement" switch service could be useful at
>> all?
It appears that inbound enforcement is actually a useful thing.
Basically, it all boils down to the question of where you would drop
a packet with a wrong pkey.
With only outbound enforcement, the packet is dropped by the
*last* port before the target. With inbound enforcement, the packet is
dropped by the *first* port - the one that is connected to the source HCA.

Anyway, I'm reviewing the spec again in this area, and I'll probably
post a patch that turns on the inbound enforcement too later today.

-- Yevgeny


> Beats me...
>
> -- Yevgeny
>
>>> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>>
>> Applied. Thanks.
>>
>> Sasha
>>
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit 
> http://openib.org/mailman/listinfo/openib-general
>


From hrosenstock at xsigo.com  Thu Dec 27 06:57:47 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 06:57:47 -0800
Subject: [ofa-general] [PATCH] ibnetdiscover - ports report
In-Reply-To: <20071225064716.GD7012@sashak.voltaire.com>
References: <C44068BB95F2E54DB07822CE539BCF1CBD90D1@exus01.voltaire.com>
	<20071225064716.GD7012@sashak.voltaire.com>
Message-ID: <1198767467.23289.194.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-25 at 06:47 +0000, Sasha Khapyorsky wrote:
> On 23:38 Thu 20 Dec     , Erez Strauss wrote:
> > Hello IB developers and users,
> > 
> > I would like to get feedback on the following patch to ibnetdiscover.
> > 
> > The patch introduces an additional output mode for ibnetdiscover which is
> > focused on the ports, and prints one line for each port with the needed
> > port information.
> > 
> > The output looks like:
> > 
> > SW     4 18 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4 17 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4 16 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4 15 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4 14 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4 13 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4  9 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4  8 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4  7 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4  6 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4  5 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4  4 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > SW     4  1 0x0008f104003f0838 4x SDR - SW     6  3 0x0008f104004005f5 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > SW     4  2 0x0008f104003f0838 4x SDR - SW     7  3 0x0008f104004005f6 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > SW     4  3 0x0008f104003f0838 4x SDR - SW     1  3 0x0008f104004005f7 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > SW     4 10 0x0008f104003f0838 4x SDR - SW     8  3 0x0008f104004006f5 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > SW     4 11 0x0008f104003f0838 4x SDR - SW     9  3 0x0008f104004006f6 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > SW     4 12 0x0008f104003f0838 4x SDR - SW    10  3 0x0008f104004006f7 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > CA    14  1 0x0008f10403960091 4x SDR - SW     4 20 0x0008f104003f0838 ( 'Voltaire HCA400' - 'ISR9288/ISR9096 Voltaire sLB-24' )
> > CA    11  1 0x0002c90107a4e431 4x SDR - SW     4 19 0x0008f104003f0838 ( 'Voltaire HCA400' - 'ISR9288/ISR9096 Voltaire sLB-24' )
> > CA     2  1 0x0008f1000102d801 4x SDR - SW     1 15 0x0008f104004005f7 ( 'Voltaire IB-to-TCP/IP Router' - 'ISR9288 Voltaire 
> > 
> > Thanks,
> > 
> > Erez Strauss
> > Voltaire.
> > 
> > -------------
> > Date:   Thu Dec 20 19:36:14 2007 -0500
> > 
> >      Added the -p(orts) option, to generate ports reports
> > 
> >     Signed-off-by: Erez Strauss <erezs _at_ voltaire.com>
> 
> Applied. Thanks.
> 
> Please note, the originally posted patch was completely mangled by your
> mailer (so attached version was useful).

What about updating the ibnetdiscover man page for this too ?

-- Hal
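
For anyone scripting against the ports report quoted above, here is a
minimal C sketch that reads the leading fields of one report line. It
assumes the columns are node type, LID, port number, GUID, link width
and link speed; that reading is inferred from the sample output only,
not from the ibnetdiscover source. The struct and function names are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Leading fields of one line of the ports report (assumed layout). */
struct port_line {
    char type[3];             /* "SW" or "CA" in the sample output */
    int lid;                  /* assumed: second column is the LID */
    int port;                 /* assumed: third column is the port number */
    unsigned long long guid;  /* 64-bit GUID printed in hex */
    char width[4];            /* e.g. "4x" */
    char speed[4];            /* e.g. "SDR" */
};

/* Returns 1 if the six leading fields were parsed, 0 otherwise.
 * Anything after the speed field (peer port, descriptions) is ignored. */
int parse_port_line(const char *line, struct port_line *out)
{
    assert(line != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    int n = sscanf(line, "%2s %d %d %llx %3s %3s",
                   out->type, &out->lid, &out->port, &out->guid,
                   out->width, out->speed);
    return n == 6;
}
```

Calling parse_port_line() on a line such as
"SW     4 18 0x0008f104003f0838 4x SDR" fills in type "SW", lid 4 and
port 18. Lines that do not start with these six fields are rejected.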

> 
> Sasha


From hrosenstock at xsigo.com  Thu Dec 27 06:57:53 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 06:57:53 -0800
Subject: [ofa-general] Re: [PATCH] osm/osm_mcast_mgr.c: coredump in
	ofed_1_2
In-Reply-To: <20071226154218.GN7012@sashak.voltaire.com>
References: <47724048.2000302@dev.mellanox.co.il>
	<20071226154218.GN7012@sashak.voltaire.com>
Message-ID: <1198767473.23289.195.camel@hrosenstock-ws.xsigo.com>

On Wed, 2007-12-26 at 15:42 +0000, Sasha Khapyorsky wrote:
> On 13:51 Wed 26 Dec     , Yevgeny Kliteynik wrote:
> > Hi Sasha,
> > 
> > Protecting against possible NULL returned by osm_node_get_remote_node().
> > 
> > Please apply this fix to branch ofed_1_2 only.
> > It appears that this coredump has already been fixed for ofed_1_3.
> > 
> > Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> 
> Applied to ofed_1_2. Thanks.

Which ofed_1_2 tree ?

BTW, up to now, such patches have been rejected saying OFED 1.2 (for
OpenSM) was not being maintained. Is there now a change of policy on
this ?

-- Hal

> Sasha


From hrosenstock at xsigo.com  Thu Dec 27 07:00:08 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 07:00:08 -0800
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <4770F7BB.6040502@dev.mellanox.co.il>
References: <4770F7BB.6040502@dev.mellanox.co.il>
Message-ID: <1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
> Fixing wrong setting of partition enforcement bits on switch ports.
> When an HCA port is configured with a certain pkey, the peer port
> on the switch should turn on outbound partition enforcement bit only.
> Turning on the inbound enforcement will cause the switch to drop
> valid packets if the HCA is partial member.

Inbound enforcement is actually the more useful case. If there is
inbound enforcement, outbound enforcement doesn't add much.

In the case of partial only (not both partial and full) membership, the
peer switch physical port would need to be set to full membership.

-- Hal

> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> ---
>  opensm/opensm/osm_pkey_mgr.c |   12 ++++++++----
>  1 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c
> index 58eed04..209fa71 100644
> --- a/opensm/opensm/osm_pkey_mgr.c
> +++ b/opensm/opensm/osm_pkey_mgr.c
> @@ -212,7 +212,8 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,
> 
>  	p_pi = &p_physp->port_info;
> 
> -	if ((p_pi->vl_enforce & 0xc) == (0xc) * (enforce == TRUE)) {
> +	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
> +	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {
>  		osm_log(p_log, OSM_LOG_DEBUG,
>  			"pkey_mgr_enforce_partition: "
>  			"No need to update PortInfo for "
> @@ -227,10 +228,13 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,
>  	memcpy(payload, p_pi, sizeof(ib_port_info_t));
> 
>  	p_pi = (ib_port_info_t *) payload;
> +
> +	/* clearing enforcement in both directions */
> +	p_pi->vl_enforce &= ~0xc;
>  	if (enforce == TRUE)
> -		p_pi->vl_enforce |= 0xc;
> -	else
> -		p_pi->vl_enforce &= ~0xc;
> +		/* enforcing only in outbound direction */
> +		p_pi->vl_enforce |= 0x4;
> +
>  	p_pi->state_info2 = 0;
>  	ib_port_info_set_port_state(p_pi, IB_LINK_NO_CHANGE);
> 


From hrosenstock at xsigo.com  Thu Dec 27 07:01:02 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 07:01:02 -0800
Subject: [ofa-general] Re: [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <20071225172447.GD369@sashak.voltaire.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<20071225172447.GD369@sashak.voltaire.com>
Message-ID: <1198767662.23289.201.camel@hrosenstock-ws.xsigo.com>

On Tue, 2007-12-25 at 17:24 +0000, Sasha Khapyorsky wrote:
> On 14:29 Tue 25 Dec     , Yevgeny Kliteynik wrote:
> > Fixing wrong setting of partition enforcement bits on switch ports.
> > When an HCA port is configured with a certain pkey, the peer port
> > on the switch should turn on outbound partition enforcement bit only.
> > Turning on the inbound enforcement will cause the switch to drop
> > valid packets if the HCA is partial member.
> 
> Nice catch!
> 
> Interesting how "inbound enforcement" switch service could be useful at
> all?

It's way more interesting than outbound enforcement IMO. It should be
made to work as indicated in previous email.

-- Hal

> 
> > Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> 
> Applied. Thanks.
> 
> Sasha


From hrosenstock at xsigo.com  Thu Dec 27 07:04:59 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 07:04:59 -0800
Subject: [ofa-general] Re: [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <4773A39A.9070803@mellanox.co.il>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<20071225172447.GD369@sashak.voltaire.com>
	<477178EA.6070204@dev.mellanox.co.il> <4773A39A.9070803@mellanox.co.il>
Message-ID: <1198767899.23289.206.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-12-27 at 15:07 +0200, Yevgeny Kliteynik wrote:
> Yevgeny Kliteynik wrote:
> > Sasha Khapyorsky wrote:
> >> On 14:29 Tue 25 Dec     , Yevgeny Kliteynik wrote:
> >>> Fixing wrong setting of partition enforcement bits on switch ports.
> >>> When an HCA port is configured with a certain pkey, the peer port
> >>> on the switch should turn on outbound partition enforcement bit only.
> >>> Turning on the inbound enforcement will cause the switch to drop
> >>> valid packets if the HCA is partial member.
> >>
> >> Nice catch!
> >>
> >> Interesting how "inbound enforcement" switch service could be useful at
> >> all?
> It appears that the inbound enforcement is actually a useful thing.

Indeed :-)

> Basically, it all boils down to the question where would you drop
> the packet with a wrong pkey.

Exactly.

> With only outbound enforcement the packet is dropped by the
> *last* port before target.

(and it would be dropped at the endport anyhow).

>  With inbound the packet is dropped
> by the *first* port - the one that is connected to the source HCA.

Way more useful as this saves the network bandwidth.

> Anyway, I'm reviewing the spec again in this area, and I'll probably
> post a patch that turns on the inbound enforcement too later today.

Good to hear.

-- Hal

> -- Yevgeny
> 
> 
> > Beats me...
> >
> > -- Yevgeny
> >
> >>> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> >>
> >> Applied. Thanks.
> >>
> >> Sasha
> >>


From kliteyn at dev.mellanox.co.il  Thu Dec 27 07:39:32 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 27 Dec 2007 17:39:32 +0200
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting/clearing both
 inbound and outbound partition enforcements
Message-ID: <4773C734.6000609@dev.mellanox.co.il>

Turning on inbound partition enforcement bit on switch ports -
undoing part of the changes that were done by the previous pkey patch.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/opensm/osm_pkey_mgr.c |   11 +++++------
 1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c
index 209fa71..61bd076 100644
--- a/opensm/opensm/osm_pkey_mgr.c
+++ b/opensm/opensm/osm_pkey_mgr.c
@@ -212,7 +212,7 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,

 	p_pi = &p_physp->port_info;

-	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
+	if (((p_pi->vl_enforce & 0xc) == 0xc && enforce) ||
 	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {
 		osm_log(p_log, OSM_LOG_DEBUG,
 			"pkey_mgr_enforce_partition: "
@@ -229,11 +229,10 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,

 	p_pi = (ib_port_info_t *) payload;

-	/* clearing enforcement in both directions */
-	p_pi->vl_enforce &= ~0xc;
-	if (enforce == TRUE)
-		/* enforcing only in outbound direction */
-		p_pi->vl_enforce |= 0x4;
+	if (enforce)
+		p_pi->vl_enforce |= 0xc;
+	else
+		p_pi->vl_enforce &= ~0xc;

 	p_pi->state_info2 = 0;
 	ib_port_info_set_port_state(p_pi, IB_LINK_NO_CHANGE);
-- 
1.5.1.4
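
The mask arithmetic in this patch can be tried outside OpenSM with the
standalone C sketch below. It is an illustration only: the 0x4 outbound
bit is taken from the comment in the earlier patch, and treating 0x8 as
the inbound bit is an inference from 0xc covering both directions. The
macro and function names are hypothetical and are not OpenSM
identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define PART_ENFORCE_OUT  0x4u  /* outbound, per the earlier patch's comment */
#define PART_ENFORCE_IN   0x8u  /* inbound: inferred as the other bit of 0xc */
#define PART_ENFORCE_MASK (PART_ENFORCE_IN | PART_ENFORCE_OUT)

/* Set or clear both partition enforcement bits, as the patch does with
 * "|= 0xc" and "&= ~0xc". All other bits of the byte are preserved. */
uint8_t set_partition_enforcement(uint8_t vl_enforce, int enforce)
{
    uint8_t out = enforce ? (uint8_t)(vl_enforce | PART_ENFORCE_MASK)
                          : (uint8_t)(vl_enforce & ~PART_ENFORCE_MASK);
    assert((out & ~PART_ENFORCE_MASK) == (vl_enforce & ~PART_ENFORCE_MASK));
    return out;
}

/* Mirrors the "No need to update PortInfo" test: true when both bits
 * already agree with the requested state. */
int enforcement_up_to_date(uint8_t vl_enforce, int enforce)
{
    uint8_t bits = vl_enforce & PART_ENFORCE_MASK;
    return enforce ? bits == PART_ENFORCE_MASK : bits == 0;
}
```

One consequence worth noting: a port left in the outbound-only state
(only 0x4 set) by the previous patch is not considered up to date when
enforcement is requested, so it would be rewritten with both bits set.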


From sashak at voltaire.com  Thu Dec 27 08:13:25 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 27 Dec 2007 16:13:25 +0000
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071227161325.GB13378@sashak.voltaire.com>

Hi Hal,

On 07:00 Thu 27 Dec     , Hal Rosenstock wrote:
> On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
> > Fixing wrong setting of partition enforcement bits on switch ports.
> > When an HCA port is configured with a certain pkey, the peer port
> > on the switch should turn on outbound partition enforcement bit only.
> > Turning on the inbound enforcement will cause the switch to drop
> > valid packets if the HCA is partial member.
> 
> Inbound enforcement is actually the more useful case. If there is
> inbound enforcement, outbound enforcement doesn't add much.
> 
> In the case of partial only (not both partial and full) membership, the
> peer switch physical port would need to be set to full membership.

Then it could break outbound enforcement, couldn't it?

Sasha


From hrosenstock at xsigo.com  Thu Dec 27 08:20:37 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 08:20:37 -0800
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <20071227161325.GB13378@sashak.voltaire.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
	<20071227161325.GB13378@sashak.voltaire.com>
Message-ID: <1198772437.23289.209.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-12-27 at 16:13 +0000, Sasha Khapyorsky wrote:
> Hi Hal,
> 
> On 07:00 Thu 27 Dec     , Hal Rosenstock wrote:
> > On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
> > > Fixing wrong setting of partition enforcement bits on switch ports.
> > > When an HCA port is configured with a certain pkey, the peer port
> > > on the switch should turn on outbound partition enforcement bit only.
> > > Turning on the inbound enforcement will cause the switch to drop
> > > valid packets if the HCA is partial member.
> > 
> > Inbound enforcement is actually the more useful case. If there is
> > inbound enforcement, outbound enforcement doesn't add much.
> > 
> > In the case of partial only (not both partial and full) membership, the
> > peer switch physical port would need to be set to full membership.
> 
> Then it could break outbound enforcement. Isn't it?

What I wrote was wrong. Limited pkey is sufficient. See o18-14

> 
> Sasha


From sashak at voltaire.com  Thu Dec 27 08:35:39 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 27 Dec 2007 16:35:39 +0000
Subject: [ofa-general] Re: [PATCH] opensm/osm_pkey_mgr.c: setting/clearing
	both inbound and outbound partition enforcements
In-Reply-To: <4773C734.6000609@dev.mellanox.co.il>
References: <4773C734.6000609@dev.mellanox.co.il>
Message-ID: <20071227163539.GD13378@sashak.voltaire.com>

Hi Yevgeny,

On 17:39 Thu 27 Dec     , Yevgeny Kliteynik wrote:
> Turning on inbound partition enforcement bit on switch ports -
> undoing part of the changes that were done by the previous pkey patch.

It looks like a full revert; is that what you meant?

As far as I understand, Hal's point was different: to set *full* P_Key
values on the switch's ports (and revert the original patch). I'm not
quite sure it is the best option. IMO it doesn't enforce limited-to-limited
communication at the nearest switch port, and it breaks limited-to-limited
enforcement on outbound ports. However, I don't see an ideal solution
here. In the end we will need some sort of convention.

Sasha
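
To make the limited/full distinction concrete, here is a small C sketch
of the end-port style P_Key comparison. It assumes the usual convention
that bit 15 of a P_Key is the membership bit (1 = full, 0 = limited) and
the low 15 bits are the partition base. It does not model the
switch-port rule in o18-14 that Hal refers to elsewhere in this thread.
All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKEY_FULL_BIT  0x8000u  /* assumed: bit 15 set means full member */
#define PKEY_BASE_MASK 0x7fffu  /* low 15 bits identify the partition */

int pkey_is_full(uint16_t pkey)
{
    return (pkey & PKEY_FULL_BIT) != 0;
}

/* Two P_Keys match when they name the same partition and at least one
 * side is a full member, so limited-to-limited never matches. */
int pkeys_match(uint16_t a, uint16_t b)
{
    if ((a & PKEY_BASE_MASK) != (b & PKEY_BASE_MASK))
        return 0;
    return pkey_is_full(a) || pkey_is_full(b);
}

/* Would a table checked with the end-port rule accept this packet P_Key? */
int table_admits(const uint16_t *table, size_t n, uint16_t pkey)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (pkeys_match(table[i], pkey))
            return 1;
    return 0;
}
```

Under this rule a table holding only the limited key 0x7fff admits a
packet carrying 0xffff but rejects one carrying 0x7fff. That is the
situation the original patch description was worried about when the HCA
is a limited member.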

> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> ---
>  opensm/opensm/osm_pkey_mgr.c |   11 +++++------
>  1 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c
> index 209fa71..61bd076 100644
> --- a/opensm/opensm/osm_pkey_mgr.c
> +++ b/opensm/opensm/osm_pkey_mgr.c
> @@ -212,7 +212,7 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,
> 
>  	p_pi = &p_physp->port_info;
> 
> -	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
> +	if (((p_pi->vl_enforce & 0xc) == 0xc && enforce) ||
>  	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {
>  		osm_log(p_log, OSM_LOG_DEBUG,
>  			"pkey_mgr_enforce_partition: "
> @@ -229,11 +229,10 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,
> 
>  	p_pi = (ib_port_info_t *) payload;
> 
> -	/* clearing enforcement in both directions */
> -	p_pi->vl_enforce &= ~0xc;
> -	if (enforce == TRUE)
> -		/* enforcing only in outbound direction */
> -		p_pi->vl_enforce |= 0x4;
> +	if (enforce)
> +		p_pi->vl_enforce |= 0xc;
> +	else
> +		p_pi->vl_enforce &= ~0xc;
> 
>  	p_pi->state_info2 = 0;
>  	ib_port_info_set_port_state(p_pi, IB_LINK_NO_CHANGE);
> -- 
> 1.5.1.4
> 


From sashak at voltaire.com  Thu Dec 27 08:40:28 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 27 Dec 2007 16:40:28 +0000
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <1198772437.23289.209.camel@hrosenstock-ws.xsigo.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
	<20071227161325.GB13378@sashak.voltaire.com>
	<1198772437.23289.209.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071227164028.GE13378@sashak.voltaire.com>

On 08:20 Thu 27 Dec     , Hal Rosenstock wrote:
> On Thu, 2007-12-27 at 16:13 +0000, Sasha Khapyorsky wrote:
> > Hi Hal,
> > 
> > On 07:00 Thu 27 Dec     , Hal Rosenstock wrote:
> > > On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
> > > > Fixing wrong setting of partition enforcement bits on switch ports.
> > > > When an HCA port is configured with a certain pkey, the peer port
> > > > on the switch should turn on outbound partition enforcement bit only.
> > > > Turning on the inbound enforcement will cause the switch to drop
> > > > valid packets if the HCA is partial member.
> > > 
> > > Inbound enforcement is actually the more useful case. If there is
> > > inbound enforcement, outbound enforcement doesn't add much.
> > > 
> > > In the case of partial only (not both partial and full) membership, the
> > > peer switch physical port would need to be set to full membership.
> > 
> > Then it could break outbound enforcement. Isn't it?
> 
> What I wrote was wrong. Limited pkey is sufficient. See o18-14

Do you mean the last paragraph of o18-14? Assuming so, it makes sense. So
we just need to revert the original patch.

Sasha


From sashak at voltaire.com  Thu Dec 27 08:47:37 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 27 Dec 2007 16:47:37 +0000
Subject: [ofa-general] Re: [PATCH] osm/osm_mcast_mgr.c: coredump in
	ofed_1_2
In-Reply-To: <1198767473.23289.195.camel@hrosenstock-ws.xsigo.com>
References: <47724048.2000302@dev.mellanox.co.il>
	<20071226154218.GN7012@sashak.voltaire.com>
	<1198767473.23289.195.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071227164737.GF13378@sashak.voltaire.com>

On 06:57 Thu 27 Dec     , Hal Rosenstock wrote:
> On Wed, 2007-12-26 at 15:42 +0000, Sasha Khapyorsky wrote:
> > On 13:51 Wed 26 Dec     , Yevgeny Kliteynik wrote:
> > > Hi Sasha,
> > > 
> > > Protecting against possible NULL returned by osm_node_get_remote_node().
> > > 
> > > Please apply this fix to branch ofed_1_2 only.
> > > It appears that this coredump has already been fixed for ofed_1_3.
> > > 
> > > Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> > 
> > Applied to ofed_1_2. Thanks.
> 
> Which ofed_1_2 tree ?

  ofed_1_2 branch of git://git.openfabrics.org/~sashak/management

> BTW, up to now, such patches have been rejected saying OFED 1.2 (for
> OpenSM) was not being maintained. Is there now a change of policy on
> this ?

Not really. As far as I remember, it was the first explicit patch for 1.2.

I don't backport non-critical fixes and improvements from master to 1.2
- that is true.

Sasha


From hrosenstock at xsigo.com  Thu Dec 27 08:37:37 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 08:37:37 -0800
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <20071227164028.GE13378@sashak.voltaire.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
	<20071227161325.GB13378@sashak.voltaire.com>
	<1198772437.23289.209.camel@hrosenstock-ws.xsigo.com>
	<20071227164028.GE13378@sashak.voltaire.com>
Message-ID: <1198773457.23289.212.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-12-27 at 16:40 +0000, Sasha Khapyorsky wrote:
> On 08:20 Thu 27 Dec     , Hal Rosenstock wrote:
> > On Thu, 2007-12-27 at 16:13 +0000, Sasha Khapyorsky wrote:
> > > Hi Hal,
> > > 
> > > On 07:00 Thu 27 Dec     , Hal Rosenstock wrote:
> > > > On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
> > > > > Fixing wrong setting of partition enforcement bits on switch ports.
> > > > > When an HCA port is configured with a certain pkey, the peer port
> > > > > on the switch should turn on outbound partition enforcement bit only.
> > > > > Turning on the inbound enforcement will cause the switch to drop
> > > > > valid packets if the HCA is partial member.
> > > > 
> > > > Inbound enforcement is actually the more useful case. If there is
> > > > inbound enforcement, outbound enforcement doesn't add much.
> > > > 
> > > > In the case of partial only (not both partial and full) membership, the
> > > > peer switch physical port would need to be set to full membership.
> > > 
> > > Then it could break outbound enforcement. Isn't it?
> > 
> > What I wrote was wrong. Limited pkey is sufficient. See o18-14
> 
> Do you mean last paragraph of o18-14?

Last bullet (for limited). Switch (external/physical) port enforcement
is different than end port enforcement.

-- Hal

>  Assuming so - it makes sense. So
> we need just revert the original patch.
> 
> Sasha
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


From hrosenstock at xsigo.com  Thu Dec 27 08:39:15 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 08:39:15 -0800
Subject: [ofa-general] Re: [PATCH] opensm/osm_pkey_mgr.c:
	setting/clearing both inbound and outbound partition enforcements
In-Reply-To: <20071227163539.GD13378@sashak.voltaire.com>
References: <4773C734.6000609@dev.mellanox.co.il>
	<20071227163539.GD13378@sashak.voltaire.com>
Message-ID: <1198773555.23289.215.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-12-27 at 16:35 +0000, Sasha Khapyorsky wrote:
> Hi Yevgeny,
> 
> On 17:39 Thu 27 Dec     , Yevgeny Kliteynik wrote:
> > Turning on inbound partition enforcement bit on switch ports -
> > undoing part of the changes that were done by the previous pkey patch.
> 
> It looks like a full revert; is that what you meant?
> 
> As far as I understand Hal's point was different - to set *full* P_Key
> values on switch's ports

That was before I looked up o18-14.

>  (and revert original patch). Not quite sure it
> is the best option - IMO it doesn't enforce limited-to-limited
> communication at nearest switch port and breaks limited-to-limited
> enforcement on outbound ports. However I don't see an ideal solution
> here.

After reading o18-14, do you still think so ?

>  Finally we will need sort of convention.

Not sure what you mean by this.

-- Hal

> 
> Sasha
> 
> > 
> > Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> > ---
> >  opensm/opensm/osm_pkey_mgr.c |   11 +++++------
> >  1 files changed, 5 insertions(+), 6 deletions(-)
> > 
> > diff --git a/opensm/opensm/osm_pkey_mgr.c b/opensm/opensm/osm_pkey_mgr.c
> > index 209fa71..61bd076 100644
> > --- a/opensm/opensm/osm_pkey_mgr.c
> > +++ b/opensm/opensm/osm_pkey_mgr.c
> > @@ -212,7 +212,7 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,
> > 
> >  	p_pi = &p_physp->port_info;
> > 
> > -	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
> > +	if (((p_pi->vl_enforce & 0xc) == 0xc && enforce) ||
> >  	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {
> >  		osm_log(p_log, OSM_LOG_DEBUG,
> >  			"pkey_mgr_enforce_partition: "
> > @@ -229,11 +229,10 @@ pkey_mgr_enforce_partition(IN osm_log_t * p_log,
> > 
> >  	p_pi = (ib_port_info_t *) payload;
> > 
> > -	/* clearing enforcement in both directions */
> > -	p_pi->vl_enforce &= ~0xc;
> > -	if (enforce == TRUE)
> > -		/* enforcing only in outbound direction */
> > -		p_pi->vl_enforce |= 0x4;
> > +	if (enforce)
> > +		p_pi->vl_enforce |= 0xc;
> > +	else
> > +		p_pi->vl_enforce &= ~0xc;
> > 
> >  	p_pi->state_info2 = 0;
> >  	ib_port_info_set_port_state(p_pi, IB_LINK_NO_CHANGE);
> > -- 
> > 1.5.1.4
> > 


From sashak at voltaire.com  Thu Dec 27 08:50:01 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 27 Dec 2007 16:50:01 +0000
Subject: [ofa-general] [PATCH] ibnetdiscover - ports report
In-Reply-To: <1198767467.23289.194.camel@hrosenstock-ws.xsigo.com>
References: <C44068BB95F2E54DB07822CE539BCF1CBD90D1@exus01.voltaire.com>
	<20071225064716.GD7012@sashak.voltaire.com>
	<1198767467.23289.194.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071227165001.GG13378@sashak.voltaire.com>

On 06:57 Thu 27 Dec     , Hal Rosenstock wrote:
> On Tue, 2007-12-25 at 06:47 +0000, Sasha Khapyorsky wrote:
> > On 23:38 Thu 20 Dec     , Erez Strauss wrote:
> > > Hello IB developers and users,
> > > 
> > > I would like to get feedback on the following patch to ibnetdiscover.
> > > 
> > > The patch introduces an additional output mode for ibnetdiscover which is
> > > focused on the ports, and prints one line for each port with the needed
> > > port information.
> > > 
> > > The output looks like:
> > > 
> > > SW     4 18 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4 17 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4 16 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4 15 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4 14 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4 13 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4  9 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4  8 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4  7 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4  6 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4  5 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4  4 0x0008f104003f0838 4x SDR 'ISR9288/ISR9096 Voltaire sLB-24'
> > > SW     4  1 0x0008f104003f0838 4x SDR - SW     6  3 0x0008f104004005f5 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > > SW     4  2 0x0008f104003f0838 4x SDR - SW     7  3 0x0008f104004005f6 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > > SW     4  3 0x0008f104003f0838 4x SDR - SW     1  3 0x0008f104004005f7 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > > SW     4 10 0x0008f104003f0838 4x SDR - SW     8  3 0x0008f104004006f5 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > > SW     4 11 0x0008f104003f0838 4x SDR - SW     9  3 0x0008f104004006f6 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > > SW     4 12 0x0008f104003f0838 4x SDR - SW    10  3 0x0008f104004006f7 ( 'ISR9288/ISR9096 Voltaire sLB-24' - 'ISR9288 Voltaire sFB-12' )
> > > CA    14  1 0x0008f10403960091 4x SDR - SW     4 20 0x0008f104003f0838 ( 'Voltaire HCA400' - 'ISR9288/ISR9096 Voltaire sLB-24' )
> > > CA    11  1 0x0002c90107a4e431 4x SDR - SW     4 19 0x0008f104003f0838 ( 'Voltaire HCA400' - 'ISR9288/ISR9096 Voltaire sLB-24' )
> > > CA     2  1 0x0008f1000102d801 4x SDR - SW     1 15 0x0008f104004005f7 ( 'Voltaire IB-to-TCP/IP Router' - 'ISR9288 Voltaire
> > > 
> > > Thanks,
> > > 
> > > Erez Strauss
> > > Voltaire.
> > > 
> > > -------------
> > > Date:   Thu Dec 20 19:36:14 2007 -0500
> > > 
> > >      Added the -p(orts) option, to generate ports reports
> > > 
> > >     Signed-off-by: Erez Strauss <erezs _at_ voltaire.com>
> > 
> > Applied. Thanks.
> > 
> > Please note, the originally posted patch was completely mangled by your
> > mailer (so attached version was useful).
> 
> What about updating the ibnetdiscover man page for this too ?

Sure, already asked Erez for the man page update.

Sasha
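
For readers who want to produce or parse the per-port lines quoted above,
here is a minimal sketch of a formatter for an unconnected port. The
column widths are inferred from the sample output only, not from the
ibnetdiscover source, and the meaning of the two numeric columns (assumed
here to be LID and port number) is also an assumption. The struct and
function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one port line; not an ibnetdiscover type. */
struct port_rec {
	const char *type;		/* "SW" or "CA" in the sample output */
	int lid;			/* assumed meaning of the first number */
	int portnum;			/* assumed meaning of the second number */
	unsigned long long guid;
	const char *width;		/* e.g. "4x" */
	const char *speed;		/* e.g. "SDR" */
	const char *desc;		/* node description, printed in quotes */
};

/* Format one unconnected-port line into buf; returns snprintf's result.
 * Widths (5 for the first number, 2 for the second, 16 hex digits for
 * the GUID) are read off the sample lines in the mail. */
static int format_port_line(char *buf, size_t len, const struct port_rec *p)
{
	return snprintf(buf, len, "%-2s %5d %2d 0x%016llx %s %s '%s'",
			p->type, p->lid, p->portnum, p->guid,
			p->width, p->speed, p->desc);
}
```

Connected ports in the sample carry a second endpoint after " - " and
both descriptions in parentheses; that variant is left out to keep the
sketch short.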


From sashak at voltaire.com  Thu Dec 27 08:51:52 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 27 Dec 2007 16:51:52 +0000
Subject: [ofa-general] Re: [PATCH] opensm/osm_pkey_mgr.c:
	setting/clearing both inbound and outbound partition enforcements
In-Reply-To: <1198773555.23289.215.camel@hrosenstock-ws.xsigo.com>
References: <4773C734.6000609@dev.mellanox.co.il>
	<20071227163539.GD13378@sashak.voltaire.com>
	<1198773555.23289.215.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071227165152.GH13378@sashak.voltaire.com>

On 08:39 Thu 27 Dec     , Hal Rosenstock wrote:
> On Thu, 2007-12-27 at 16:35 +0000, Sasha Khapyorsky wrote:
> > Hi Yevgeny,
> > 
> > On 17:39 Thu 27 Dec     , Yevgeny Kliteynik wrote:
> > > Turning on inbound partition enforcement bit on switch ports -
> > > undoing part of the changes that were done by the previous pkey patch.
> > 
> > It looks like full revert, is it what did you mean?
> > 
> > As far as I understand Hal's point was different - to set *full* P_Key
> > values on switch's ports
> 
> That was before I looked up o18-14.
> 
> >  (and revert original patch). Not quite sure it
> > is the best option - IMO it doesn't enforce limited-to-limited
> > communication at nearest switch port and breaks limited-to-limited
> > enforcement on outbound ports. However I don't see an ideal solution
> > here.
> 
> After reading o18-14, do you still think so ?

No (after reading this up to the last line :))

Sasha
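
As background for the limited versus full membership discussion, the
general end-port P_Key matching rule from the InfiniBand specification
can be sketched as below: the low 15 bits must be equal and at least one
side must be a full member. This is only the end-port rule. As noted in
this thread, switch external port enforcement under o18-14 is different,
and that rule is not reproduced here. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_FULL_MEMBER 0x8000u	/* high bit: full membership */
#define PKEY_BASE_MASK   0x7fffu	/* low 15 bits: partition number */

static int pkey_is_full(uint16_t pkey)
{
	return (pkey & PKEY_FULL_MEMBER) != 0;
}

/* End-port matching: same partition number, and not limited-to-limited. */
static int pkeys_match_end_port(uint16_t a, uint16_t b)
{
	if ((a & PKEY_BASE_MASK) != (b & PKEY_BASE_MASK))
		return 0;
	return pkey_is_full(a) || pkey_is_full(b);
}
```

Under this rule a limited member (0x0001) can talk to a full member
(0x8001) of the same partition, while two limited members cannot talk to
each other.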


From hrosenstock at xsigo.com  Thu Dec 27 08:43:24 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 08:43:24 -0800
Subject: [ofa-general] Re: [PATCH] osm/osm_mcast_mgr.c: coredump in
	ofed_1_2
In-Reply-To: <20071227164737.GF13378@sashak.voltaire.com>
References: <47724048.2000302@dev.mellanox.co.il>
	<20071226154218.GN7012@sashak.voltaire.com>
	<1198767473.23289.195.camel@hrosenstock-ws.xsigo.com>
	<20071227164737.GF13378@sashak.voltaire.com>
Message-ID: <1198773804.23289.220.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-12-27 at 16:47 +0000, Sasha Khapyorsky wrote:
> On 06:57 Thu 27 Dec     , Hal Rosenstock wrote:
> > On Wed, 2007-12-26 at 15:42 +0000, Sasha Khapyorsky wrote:
> > > On 13:51 Wed 26 Dec     , Yevgeny Kliteynik wrote:
> > > > Hi Sasha,
> > > > 
> > > > Protecting against possible NULL returned by osm_node_get_remote_node().
> > > > 
> > > > Please apply this fix to branch ofed_1_2 only.
> > > > It appears that this coredump has already been fixed for ofed_1_3.
> > > > 
> > > > Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> > > 
> > > Applied to ofed_1_2. Thanks.
> > 
> > Which ofed_1_2 tree ?
> 
>   ofed_1_2 branch of git://git.openfabrics.org/~sashak/management
> 
> > BTW, up to now, such patches have been rejected saying OFED 1.2 (for
> > OpenSM) was not being maintained. Is there now a change of policy on
> > this ?
> 
> Not really. As far as I remember it was first explicit patch for 1.2.

I don't think that's the case. It has been this way for quite a while
now: a number of 1.2 patches were "rejected" as being 1.2, and in fact
there was some explicit email on whether this was to be done or not, and
the answer was a resounding NO. I can dig out the emails if this is
really needed.

> I don't backport non critical fixes and improvements from master to 1.2
> - that is true.

There were a number of fixes originally supplied for 1.2 up ported to
1.3. Guess you could always consider them non critical although I would
beg to differ on some of those.

-- Hal

> Sasha


From sashak at voltaire.com  Thu Dec 27 09:07:58 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Thu, 27 Dec 2007 17:07:58 +0000
Subject: [ofa-general] Re: [PATCH] osm/osm_mcast_mgr.c: coredump in
	ofed_1_2
In-Reply-To: <1198773804.23289.220.camel@hrosenstock-ws.xsigo.com>
References: <47724048.2000302@dev.mellanox.co.il>
	<20071226154218.GN7012@sashak.voltaire.com>
	<1198767473.23289.195.camel@hrosenstock-ws.xsigo.com>
	<20071227164737.GF13378@sashak.voltaire.com>
	<1198773804.23289.220.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071227170758.GI13378@sashak.voltaire.com>

On 08:43 Thu 27 Dec     , Hal Rosenstock wrote:
> On Thu, 2007-12-27 at 16:47 +0000, Sasha Khapyorsky wrote:
> > On 06:57 Thu 27 Dec     , Hal Rosenstock wrote:
> > > On Wed, 2007-12-26 at 15:42 +0000, Sasha Khapyorsky wrote:
> > > > On 13:51 Wed 26 Dec     , Yevgeny Kliteynik wrote:
> > > > > Hi Sasha,
> > > > > 
> > > > > Protecting against possible NULL returned by osm_node_get_remote_node().
> > > > > 
> > > > > Please apply this fix to branch ofed_1_2 only.
> > > > > It appears that this coredump has already been fixed for ofed_1_3.
> > > > > 
> > > > > Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> > > > 
> > > > Applied to ofed_1_2. Thanks.
> > > 
> > > Which ofed_1_2 tree ?
> > 
> >   ofed_1_2 branch of git://git.openfabrics.org/~sashak/management
> > 
> > > BTW, up to now, such patches have been rejected saying OFED 1.2 (for
> > > OpenSM) was not being maintained. Is there now a change of policy on
> > > this ?
> > 
> > Not really. As far as I remember it was first explicit patch for 1.2.
> 
> I don't think that's the case but has been for quite a while now since a
> number of 1.2 patches were "rejected" as being 1.2 and in fact there was
> some explicit email on whether this was to be done or not and the answer
> was a resounding NO.

I may be wrong about it, but IIRC it was all about the outstanding counter
update fix in the error path in the vendor layer. The patch was for
master, and yes, I didn't want to backport it to the 1.2 branch, for the
"use master, not 1.2" reason.

> I can dig out the emails if this is really needed.
> 
> > I don't backport non critical fixes and improvements from master to 1.2
> > - that is true.
> 
> There were a number of fixes originally supplied for 1.2 up ported to
> 1.3. Guess you could always consider them non critical although I would
> beg to differ on some of those.

Always - no, but in general I prefer to run master in the field.

Sasha


From hrosenstock at xsigo.com  Thu Dec 27 09:15:15 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Thu, 27 Dec 2007 09:15:15 -0800
Subject: [ofa-general] Re: [PATCH] osm/osm_mcast_mgr.c: coredump in
	ofed_1_2
In-Reply-To: <20071227170758.GI13378@sashak.voltaire.com>
References: <47724048.2000302@dev.mellanox.co.il>
	<20071226154218.GN7012@sashak.voltaire.com>
	<1198767473.23289.195.camel@hrosenstock-ws.xsigo.com>
	<20071227164737.GF13378@sashak.voltaire.com>
	<1198773804.23289.220.camel@hrosenstock-ws.xsigo.com>
	<20071227170758.GI13378@sashak.voltaire.com>
Message-ID: <1198775715.23289.225.camel@hrosenstock-ws.xsigo.com>

On Thu, 2007-12-27 at 17:07 +0000, Sasha Khapyorsky wrote:
> On 08:43 Thu 27 Dec     , Hal Rosenstock wrote:
> > On Thu, 2007-12-27 at 16:47 +0000, Sasha Khapyorsky wrote:
> > > On 06:57 Thu 27 Dec     , Hal Rosenstock wrote:
> > > > On Wed, 2007-12-26 at 15:42 +0000, Sasha Khapyorsky wrote:
> > > > > On 13:51 Wed 26 Dec     , Yevgeny Kliteynik wrote:
> > > > > > Hi Sasha,
> > > > > > 
> > > > > > Protecting against possible NULL returned by osm_node_get_remote_node().
> > > > > > 
> > > > > > Please apply this fix to branch ofed_1_2 only.
> > > > > > It appears that this coredump has already been fixed for ofed_1_3.
> > > > > > 
> > > > > > Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> > > > > 
> > > > > Applied to ofed_1_2. Thanks.
> > > > 
> > > > Which ofed_1_2 tree ?
> > > 
> > >   ofed_1_2 branch of git://git.openfabrics.org/~sashak/management
> > > 
> > > > BTW, up to now, such patches have been rejected saying OFED 1.2 (for
> > > > OpenSM) was not being maintained. Is there now a change of policy on
> > > > this ?
> > > 
> > > Not really. As far as I remember it was first explicit patch for 1.2.
> > 
> > I don't think that's the case but has been for quite a while now since a
> > number of 1.2 patches were "rejected" as being 1.2 and in fact there was
> > some explicit email on whether this was to be done or not and the answer
> > was a resounding NO.
> 
> I may be wrong about it, but IIRC it is all about outstanding counter
> update fix in error path in the vendor layer, the patch was for the
> master, and yes - I didn't want to backport it to 1.2 branch for "use
> master, not 1.2" reason.

There were other patches I supplied, etc. I don't think it was just the
outstanding counter fix but others too.

> > I can dig out the emails if this is really needed.
> > 
> > > I don't backport non critical fixes and improvements from master to 1.2
> > > - that is true.
> > 
> > There were a number of fixes originally supplied for 1.2 up ported to
> > 1.3. Guess you could always consider them non critical although I would
> > beg to differ on some of those.
> 
> Always - no, but in general I prefer to run master in the field.

Not everyone has that luxury (and master has not even shipped as 1.3
yet). I tried to get this changed, but it seems no one else shares this
concern. As I said before, the implication of this policy is that 1.2 is
not officially supported (although there are fixes going into 1.2, as
1.2.5 is still "live").

-- Hal

> Sasha


From dillowda at ornl.gov  Thu Dec 27 09:53:38 2007
From: dillowda at ornl.gov (David Dillow)
Date: Thu, 27 Dec 2007 12:53:38 -0500
Subject: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
In-Reply-To: <20071227115817I.fujita.tomonori@lab.ntt.co.jp>
References: <1198275532.9979.43.camel@lap75545.ornl.gov>
	<20071223014407L.tomof@acm.org>
	<1198689251.25003.2.camel@lap75545.ornl.gov>
	<20071227115817I.fujita.tomonori@lab.ntt.co.jp>
Message-ID: <1198778018.9960.1.camel@lap75545.ornl.gov>


On Thu, 2007-12-27 at 11:58 +0900, FUJITA Tomonori wrote:
> On Wed, 26 Dec 2007 12:14:11 -0500
> David Dillow <dillowda at ornl.gov> wrote:
> 
> > 
> > On Sun, 2007-12-23 at 01:41 +0900, FUJITA Tomonori wrote:
> > > transport_container_unregister(&i->rport_attr_cont) should not fail here.
> > > 
> > > It fails because there is still a srp rport.
> > > 
> > > I think that as Pete pointed out, srp_remove_one needs to call
> > > srp_remove_host.
> > > 
> > > Can you try this?
> > 
> > That patch oopsed in scsi_remove_host(), but reversing the order has
> > survived over 500 insert/probe/remove cycles.
> 
> Thanks,
> 
> Can you post the oops message? The srp class might have bugs related
> to it.

This is the oops generated by doing srp_remove_host() prior to
scsi_remove_host() in 2.6.24-rc5:

Unable to handle kernel NULL pointer dereference at 0000000000000020 RIP: 
 [<ffffffff811d058d>] klist_del+0xa/0x46
PGD 8450d8067 PUD 843cbd067 PMD 0 
Oops: 0000 [1] SMP 
CPU 3 
Modules linked in: sg sd_mod ib_iser libiscsi scsi_transport_iscsi rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp scsi_mod ib_cm ib_ipoib ib_sa ib_uverbs ib_umad ib_mthca ib_mad ib_core ehci_hcd ohci_hcd nfs lockd nfs_acl sunrpc unionfs forcedeth
Pid: 2450, comm: rmmod Not tainted 2.6.24-rc5 #2
RIP: 0010:[<ffffffff811d058d>]  [<ffffffff811d058d>] klist_del+0xa/0x46
RSP: 0018:ffff81084192bd28  EFLAGS: 00010282
RAX: ffff81084600b000 RBX: 0000000000000000 RCX: ffffe2001ce562c8
RDX: 0000000000000000 RSI: ffff810447c1d000 RDI: ffff81084657f050
RBP: ffff81084657f028 R08: ffff810447c1d000 R09: ffff8108455a1800
R10: ffff8108455a1800 R11: ffff810846730808 R12: ffff81084657f050
R13: ffff810844c4a170 R14: ffff81084657f028 R15: 0000000000000880
FS:  00002afbf1b0b6e0(0000) GS:ffff810846531840(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000843c56000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 2450, threadinfo ffff81084192a000, task ffff810844d47620)
Stack:  ffff810844c4a000 ffff81084657f028 ffff81084657f000 ffffffff8114cbd6
 ffff810846730808 ffff810844c4a000 ffff81084657f028 ffff81084657f000
 0000000000000246 ffffffff88118322 ffff8108455a1800 ffff81084657f000
Call Trace:
 [<ffffffff8114cbd6>] device_del+0x20/0x2f0
 [<ffffffff88118322>] :scsi_mod:scsi_target_reap_usercontext+0x53/0xbd
 [<ffffffff810455ce>] execute_in_process_context+0x20/0x47
 [<ffffffff8811a4da>] :scsi_mod:scsi_device_dev_release_usercontext+0xd3/0x105
 [<ffffffff810455ce>] execute_in_process_context+0x20/0x47
 [<ffffffff810ed9b8>] kobject_cleanup+0x2f/0x51
 [<ffffffff810ed9da>] kobject_release+0x0/0x9
 [<ffffffff810ee692>] kref_put+0x74/0x82
 [<ffffffff88119f02>] :scsi_mod:scsi_forget_host+0x53/0x55
 [<ffffffff88112018>] :scsi_mod:scsi_remove_host+0x76/0xf7
 [<ffffffff8813d161>] :ib_srp:srp_remove_one+0x102/0x19d
 [<ffffffff880ac2bc>] :ib_core:ib_unregister_client+0x40/0xb3
 [<ffffffff8813d20a>] :ib_srp:srp_cleanup_module+0xe/0x34
 [<ffffffff810551f1>] sys_delete_module+0x18d/0x1bc
 [<ffffffff811d3879>] error_exit+0x0/0x51
 [<ffffffff8100be6e>] system_call+0x7e/0x83


Code: 48 8b 6b 20 48 89 df e8 b7 2f 00 00 4c 89 e7 e8 d2 ff ff ff 
RIP  [<ffffffff811d058d>] klist_del+0xa/0x46
 RSP <ffff81084192bd28>
CR2: 0000000000000020


From danderson at lnxi.com  Thu Dec 27 10:44:02 2007
From: danderson at lnxi.com (David B. Anderson)
Date: Thu, 27 Dec 2007 11:44:02 -0700
Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to select
	sp4 patches for SLES9 kernel with minor versions equal or
	greater	than 305
In-Reply-To: <476A15BD.1050505@dev.mellanox.co.il>
References: <39C75744D164D948A170E9792AF8E7CA4D2CE1@exil.voltaire.com>
	<47670337.6080607@lnxi.com> <476A15BD.1050505@dev.mellanox.co.il>
Message-ID: <4773F272.8070309@lnxi.com>

Hi Vladimir,

The four patches named below are what I'm using to get the OFED 1.2.5 
kernel to build for SLES9 SP4.

commit 3db835ee0edb792b120ba10c8066e3d4409de2d7

git://git.openfabrics.org/ofed_1_2/linux-2.6.git


The patches are:

[PATCH 1/4] LNXI changed ofed_scripts configure to select sp4 patches

[PATCH 2/4] LNXI created backport patch addr_8802_to_2_6_5-7_308

[PATCH 3/4] LNXI fixed backport/2.6.5_sles9_sp4/rds_to_2_6_9.patch

[PATCH 4/4] LNXI fixed backport/2.6.5_sles9_sp4/cxg3_to_2_6_20.patch


I've tested these on my cluster.

Note: I changed your patch to the ofed_scripts/configure script, so that
even if the SLES9 kernel is greater than 309 it will not revert to using
SP3 patches.


David


Vladimir Sokolovsky wrote:
> David B. Anderson wrote:
>> I have all of these patches plus the following patch
>>
>>    kernel_patches/backport/2.6.5_sles9_sp4/cxgb3_remove_eeh.patch
>>
>> My current git repo is
>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git
>> commit: 6974c285e6fb06264f570f9cf919865bab66c9e6
>>
>> My patch that I posted before fixes the kernel configure script so 
>> that it applies 2.6.5_sles9_sp4 patches for the SP4 release kernel of 
>> 2.6.5-7.308 and above. The configure patch from 
>> OFED_1.2.5_sles9_sp4_configure.diff has 2.6.5-7.305* as the only valid 
>> SP4 kernel, which is incorrect. I get the same compiler error as before.
>>
>>
>>
>> Moshe Kazir wrote:
>>>  See patches in the attached message.
>>>
>>> It was applied by Vlad.
>>>
>>> Moshe
>>>
>>> ____________________________________________________________
>>> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>>>  
>>> Voltaire - The Grid Backbone
>>>  
>>>  www.voltaire.com
>>>
>>>  
>>>
>>> -----Original Message-----
>>> From: general-bounces at lists.openfabrics.org
>>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of David B.
>>> Anderson
>>> Sent: Saturday, December 15, 2007 3:31 AM
>>> To: general at lists.openfabrics.org; vlad at mellanox.co.il
>>> Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to
>>> select sp4 patches for SLES9 kernel with minor versions equal or greater
>>> than 305
>>>
>>>
>>> Hi,
>>>
>>> I've created the following patch for OFED 1.2.5.4 to have the kernel for
>>> SLES9 SP4 recognized (2.6.5-7.308).
>>>
>>> Even with the patch I then had two backport patches not apply 
>>> cleanly (cxg3_to_2_6_20.patch, rds_to_2_6_9.patch). I hand patched 
>>> them but now I'm getting the following compiler errors:
>>>
>>> In file included from /usr/src/linux-2.6.5-7.308/include/linux/module.h:10,
>>>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/module.h:4,
>>>                  from /usr/src/linux-2.6.5-7.308/include/linux/device.h:21,
>>>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:4,
>>>                  from /usr/src/linux-2.6.5-7.308/include/linux/netdevice.h:38,
>>>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/netdevice.h:4,
>>>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:32:
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/sched.h:8: warning: static declaration for `wait_for_completion_timeout' follows non-static
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:67: warning: initialization from incompatible pointer type
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c: In function `addr_resolve_remote':
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:192: error: structure has no member named `idev'
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:193: error: structure has no member named `idev'
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:197: error: structure has no member named `idev'
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c: At top level:
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:48: warning: `class_create' defined but not used
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:82: warning: `class_destroy' defined but not used
>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:108: warning: `class_device_create' defined but not used
>>> make[6]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.o] Error 1
>>> make[5]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core] Error 2
>>> make[4]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband] Error 2
>>> make[3]: *** [_module_/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default] Error 2
>>> make[2]: *** [modules] Error 2
>>> make[1]: *** [modules] Error 2
>>> make[1]: Leaving directory `/usr/src/linux-2.6.5-7.308-obj/x86_64/default'
>>> make: *** [kernel] Error 2
>>>
>>> Does anyone have OFED 1.2.5.4 building for SLES 9 SP4?
>>>
>>> Thanks
>>>
>>>  
>>> ------------------------------------------------------------------------ 
>>>
>>>
>>> Subject: [ofa-general] OFED-1.2.5 backport patches for SLES9 SP4
>>> From: "Moshe Kazir" <moshek at voltaire.com>
>>> Date: Sun, 25 Nov 2007 09:59:26 +0200
>>> To: "Vladimir Sokolovsky" <vlad at mellanox.co.il>, <general at lists.openfabrics.org>
>>>
>>>
>>> The attached files do the work.
>>>
>>> OFED_1.2.5_sles9_sp4_configure.diff includes the changes in the
>>> configure file.
>>> OFED_1.2.5_sles9_sp4_backport.diff includes the changes required in the
>>> kernel_patches and kernel_addons directories.
>>>
>>> Moshe
>>> ____________________________________________________________
>>> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>>>  
>>> Voltaire - The Grid Backbone
>>>  
>>>  www.voltaire.com
>>>
>
> Hi David,
> Please try the latest OFED-1.2.5.4-20071219-0824.tgz build on your 
> SLES9SP4.
>
> http://www.openfabrics.org/builds/connectx/OFED-1.2.5.4-20071219-0824.tgz
>
>
> Thanks,
> Vladimir
>


-- 
David B. Anderson 
Linux Networx
Sr. Software Engineer
Email: danderson at lnxi.com
Phone: (801) 649-1311


From kliteyn at dev.mellanox.co.il  Thu Dec 27 13:04:30 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 27 Dec 2007 23:04:30 +0200
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only	outbound
	partition enforcement on switch
In-Reply-To: <20071227164028.GE13378@sashak.voltaire.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
	<20071227161325.GB13378@sashak.voltaire.com>
	<1198772437.23289.209.camel@hrosenstock-ws.xsigo.com>
	<20071227164028.GE13378@sashak.voltaire.com>
Message-ID: <4774135E.6060601@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> On 08:20 Thu 27 Dec     , Hal Rosenstock wrote:
>> On Thu, 2007-12-27 at 16:13 +0000, Sasha Khapyorsky wrote:
>>> Hi Hal,
>>>
>>> On 07:00 Thu 27 Dec     , Hal Rosenstock wrote:
>>>> On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
>>>>> Fixing wrong setting of partition enforcement bits on switch ports.
>>>>> When an HCA port is configured with a certain pkey, the peer port
>>>>> on the switch should turn on outbound partition enforcement bit only.
>>>>> Turning on the inbound enforcement will cause the switch to drop
>>>>> valid packets if the HCA is a partial member.
>>>> Inbound enforcement is actually the more useful case. If there is
>>>> inbound enforcement, outbound enforcement doesn't add much.
>>>>
>>>> In the case of partial only (not both partial and full) membership, the
>>>> peer switch physical port would need to be set to full membership.
>>> Then it could break outbound enforcement. Wouldn't it?
>> What I wrote was wrong. Limited pkey is sufficient. See o18-14
> 
> Do you mean the last paragraph of o18-14? Assuming so, it makes sense. So
> we just need to revert the original patch.

Almost true. It would be nice to keep the new condition:

-	if ((p_pi->vl_enforce & 0xc) == (0xc) * (enforce == TRUE)) {
+	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
+	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {
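
To make the difference between the two conditions concrete, here is a minimal standalone sketch, not the OpenSM code itself. It assumes that within the 0xc mask of vl_enforce, 0x8 is the inbound partition enforcement bit and 0x4 is the outbound one, and that the condition guards a "nothing to update" early exit. The macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout inside the 0xc mask of vl_enforce (not taken from
 * the thread): 0x8 = inbound enforcement, 0x4 = outbound enforcement. */
#define PART_ENF_INBOUND  0x8u
#define PART_ENF_OUTBOUND 0x4u
#define PART_ENF_MASK     (PART_ENF_INBOUND | PART_ENF_OUTBOUND)

/* Removed condition: the port counts as up to date when both bits are
 * set (enforce) or both are clear (no enforce). */
bool old_up_to_date(uint8_t vl_enforce, bool enforce)
{
    return (vl_enforce & PART_ENF_MASK) == (enforce ? PART_ENF_MASK : 0u);
}

/* Proposed condition: the port counts as up to date when only the
 * outbound bit is set (enforce) or both are clear (no enforce). */
bool new_up_to_date(uint8_t vl_enforce, bool enforce)
{
    uint8_t bits = (uint8_t)(vl_enforce & PART_ENF_MASK);

    return (bits == PART_ENF_OUTBOUND && enforce) ||
           (bits == 0 && !enforce);
}
```

Under these assumptions, a port with both bits set satisfies the old check but not the new one, and a port with only the outbound bit set does the opposite. The new form also avoids multiplying a mask by a boolean.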

-- Yevgeny


> Sasha
> 


From kliteyn at mellanox.co.il  Thu Dec 27 21:16:15 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 28 Dec 2007 07:16:15 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-28:normal completion
Message-ID: <MTLEXCH01luLCTaogxL00002206@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-27
OpenSM git rev = Tue_Dec_25_14:29:47_2007 [eea8ce4965c401a091d1196bd38d001f92260ede]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]
 
 
Total=520  Pass=518  Fail=2
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
11 LidMgr IS3-128.topo

Failures:
2 LidMgr IS3-128.topo


From vlad at lists.openfabrics.org  Fri Dec 28 03:05:40 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Fri, 28 Dec 2007 03:05:40 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071228-0200 daily build status
Message-ID: <20071228110540.D6B65E6002E@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on x86_64 with linux-2.6.12
Passed on ppc64 with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.12
Passed on powerpc with linux-2.6.13
Passed on x86_64 with linux-2.6.18-8.el5
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.14
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.12
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ia64 with linux-2.6.21.1
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-53.el5

Failed:


From ggrundstrom at NetEffect.com  Fri Dec 28 08:10:28 2007
From: ggrundstrom at NetEffect.com (Glenn Grundstrom)
Date: Fri, 28 Dec 2007 10:10:28 -0600
Subject: [ofa-general] RE: current infiniband git tree
In-Reply-To: <adatzm9wk53.fsf@cisco.com>
References: <20071222165114.07dbe376.akpm@linux-foundation.org>
	<adatzm9wk53.fsf@cisco.com>
Message-ID: <5E701717F2B2ED4EA60F87C8AA57B7CC07B929D5@venom2>

Even after the patches I suspect several warnings will still
be around.  I'll clean them up.

Glenn.

> -----Original Message-----
> From: Roland Dreier [mailto:rdreier at cisco.com] 
> Sent: Sunday, December 23, 2007 8:39 AM
> To: Glenn Grundstrom
> Cc: Andrew Morton; general at lists.openfabrics.org
> Subject: Re: current infiniband git tree
> 
>  > Could someone please do an i386 allmodconfig build and 
> reduce some of this?
> 
> You mean people still use 32-bit systems?
> 
> Anyway, Glenn, this is all you... (unless your pending patches fix
> this, but I don't see anything promising)
> 
>  > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_init_cqp':
>  > drivers/infiniband/hw/nes/nes_hw.c:834: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:835: warning: cast to 
> pointer from integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:927: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:929: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:936: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:950: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:952: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:966: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:968: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:983: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:985: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_init_nic_qp':
>  > drivers/infiniband/hw/nes/nes_hw.c:1340: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:1341: warning: cast to 
> pointer from integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:1413: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:1414: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:1421: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:1453: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:1455: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c: In function 
> 'nes_destroy_nic_qp':
>  > drivers/infiniband/hw/nes/nes_hw.c:1572: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:1574: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:1589: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:1591: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c: In function 
> 'nes_cqp_ce_handler':
>  > drivers/infiniband/hw/nes/nes_hw.c:2458: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:2459: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c: In function 
> 'nes_process_iwarp_aeqe':
>  > drivers/infiniband/hw/nes/nes_hw.c:2503: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:2723: warning: cast to 
> pointer from integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c: In function 'nes_manage_apbvt':
>  > drivers/infiniband/hw/nes/nes_hw.c:2852: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:2854: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c: In function 
> 'nes_manage_arp_cache':
>  > drivers/infiniband/hw/nes/nes_hw.c:2920: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:2922: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c: In function 'flush_wqes':
>  > drivers/infiniband/hw/nes/nes_hw.c:2973: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_hw.c:2974: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes.c: In function 'nes_rem_ref':
>  > drivers/infiniband/hw/nes/nes.c:331: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes.c:333: warning: cast from 
> pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_utils.c: In function 
> 'nes_post_cqp_request':
>  > drivers/infiniband/hw/nes/nes_utils.c:592: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_utils.c:593: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_alloc_mw':
>  > drivers/infiniband/hw/nes/nes_verbs.c:117: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:119: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 
> 'nes_dealloc_mw':
>  > drivers/infiniband/hw/nes/nes_verbs.c:207: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:209: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_bind_mw':
>  > drivers/infiniband/hw/nes/nes_verbs.c:300: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:301: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_alloc_fmr':
>  > drivers/infiniband/hw/nes/nes_verbs.c:541: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:543: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_create_qp':
>  > drivers/infiniband/hw/nes/nes_verbs.c:1330: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:1334: warning: cast 
> to pointer from integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:1472: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:1473: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:1507: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:1509: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_create_cq':
>  > drivers/infiniband/hw/nes/nes_verbs.c:1824: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:1826: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:1841: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 
> 'nes_destroy_cq':
>  > drivers/infiniband/hw/nes/nes_verbs.c:1963: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:1965: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_reg_mr':
>  > drivers/infiniband/hw/nes/nes_verbs.c:2119: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:2121: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_dereg_mr':
>  > drivers/infiniband/hw/nes/nes_verbs.c:2802: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:2804: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 
> 'nes_hw_modify_qp':
>  > drivers/infiniband/hw/nes/nes_verbs.c:2984: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:2985: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_post_send':
>  > drivers/infiniband/hw/nes/nes_verbs.c:3416: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:3418: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:3438: warning: cast 
> to pointer from integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:3438: warning: cast 
> to pointer from integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:3483: warning: cast 
> to pointer from integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:3483: warning: cast 
> to pointer from integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_post_recv':
>  > drivers/infiniband/hw/nes/nes_verbs.c:3614: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_verbs.c:3615: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_cm.c: In function 'nes_accept':
>  > drivers/infiniband/hw/nes/nes_cm.c:2351: warning: cast 
> from pointer to integer of different size
>  > drivers/infiniband/hw/nes/nes_cm.c:2423: warning: format 
> '%lu' expects type 'long unsigned int', but argument 11 has 
> type 'unsigned int'
>  > drivers/infiniband/hw/nes/nes_cm.c: In function 
> 'cm_event_connected':
>  > drivers/infiniband/hw/nes/nes_cm.c:2770: warning: cast 
> from pointer to integer of different size
>  > 
>  > 
> 


From rdreier at cisco.com  Fri Dec 28 14:46:51 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 28 Dec 2007 14:46:51 -0800
Subject: [ofa-general] RE: current infiniband git tree
In-Reply-To: <5E701717F2B2ED4EA60F87C8AA57B7CC07B929D5@venom2> (Glenn
	Grundstrom's message of "Fri, 28 Dec 2007 10:10:28 -0600")
References: <20071222165114.07dbe376.akpm@linux-foundation.org>
	<adatzm9wk53.fsf@cisco.com>
	<5E701717F2B2ED4EA60F87C8AA57B7CC07B929D5@venom2>
Message-ID: <adair2iwi78.fsf@cisco.com>

 > Even after the patches I suspect several warnings will still
 > be around.  I'll clean them up.

Thanks... I got bored in the between-holidays lull so I merged all
your changes and also cleaned up all the 32-bit compiler warnings and
added a few other cleanups, including one minor bug fix.  Since I
don't have hardware (yet?) you should check my tree and make sure I
didn't break anything.
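
For readers wondering what the "cast from pointer to integer of different size" warnings quoted earlier in the thread amount to, here is a minimal userspace sketch. It is not the actual nes cleanup, which is not shown in this message. It assumes the driver stores pointers in 64-bit fields; the function names are hypothetical. Casting through a pointer-sized integer first keeps the compiler quiet on 32-bit builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a pointer in a 64-bit field. The first cast goes to an integer
 * of the same width as the pointer, so no "different size" warning is
 * emitted; widening to 64 bits is then a plain integer conversion. */
uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Recover the pointer. The value must have come from ptr_to_u64() on
 * the same machine, so it is expected to fit in uintptr_t. */
void *u64_to_ptr(uint64_t v)
{
    assert(v <= UINTPTR_MAX);
    return (void *)(uintptr_t)v;
}
```

Kernel code usually spells the intermediate type as unsigned long rather than uintptr_t, but the idea is the same.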

By the way, running sparse to check endianness (build with "C=2 CF=-D__CHECK_ENDIAN__")
emits a rather large number of warnings.  It's probably worth fixing
them up, since some of them are probably real endianness bugs.
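
As a rough illustration of what the endianness check guards against, here is a userspace sketch. It is not kernel code: be32_t and the *_sketch helpers are hypothetical stand-ins for the kernel's __be32, cpu_to_be32() and be32_to_cpu(). Wrapping the value in a struct makes a plain C compiler reject accidental mixing with host-order integers, which is loosely similar to what sparse reports for annotated types.

```c
#include <stdint.h>

/* Hypothetical big-endian wrapper type. Because it is a struct, code
 * like "be32_t w = host_value;" does not compile. */
typedef struct {
    uint32_t raw;   /* bytes are stored in big-endian order */
} be32_t;

/* Convert a host-order value to big-endian, byte by byte, so the result
 * is the same on little-endian and big-endian machines. */
be32_t cpu_to_be32_sketch(uint32_t host)
{
    be32_t out;
    uint8_t *b = (uint8_t *)&out.raw;

    b[0] = (uint8_t)(host >> 24);
    b[1] = (uint8_t)(host >> 16);
    b[2] = (uint8_t)(host >> 8);
    b[3] = (uint8_t)host;
    return out;
}

/* Convert a big-endian value back to host order. */
uint32_t be32_to_cpu_sketch(be32_t wire)
{
    const uint8_t *b = (const uint8_t *)&wire.raw;

    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

A real endianness bug is typically a missing or doubled conversion. It goes unnoticed on hosts whose byte order happens to match the wire format and breaks on the others, which is why the check is worth running even when the driver appears to work.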

 - R.


From kliteyn at mellanox.co.il  Fri Dec 28 21:34:01 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 29 Dec 2007 07:34:01 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-29:normal completion
Message-ID: <MTLEXCH019ny7BXx4XF00002364@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-28
OpenSM git rev = Tue_Dec_25_14:29:47_2007 [eea8ce4965c401a091d1196bd38d001f92260ede]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]
 
 
Total=560  Pass=560  Fail=0
 
 
Pass:
42 Stability IS1-16.topo
42 Pkey IS1-16.topo
42 OsmTest IS1-16.topo
42 OsmStress IS1-16.topo
42 Multicast IS1-16.topo
42 LidMgr IS1-16.topo
14 Stability IS3-loop.topo
14 Stability IS3-128.topo
14 Pkey IS3-128.topo
14 OsmTest IS3-loop.topo
14 OsmTest IS3-128.topo
14 OsmStress IS3-128.topo
14 Multicast IS3-loop.topo
14 Multicast IS3-128.topo
14 LidMgr IS3-128.topo
14 FatTree merge-roots-4-ary-2-tree.topo
14 FatTree merge-root-4-ary-3-tree.topo
14 FatTree gnu-stallion-64.topo
14 FatTree blend-4-ary-2-tree.topo
14 FatTree RhinoDDR.topo
14 FatTree FullGnu.topo
14 FatTree 4-ary-2-tree.topo
14 FatTree 2-ary-4-tree.topo
14 FatTree 12-node-spaced.topo
14 FTreeFail 4-ary-2-tree-missing-sw-link.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
14 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
14 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo

Failures:


From rdreier at cisco.com  Fri Dec 28 21:41:09 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 28 Dec 2007 21:41:09 -0800
Subject: [ofa-general] synchronize commands issued to MTHCA
In-Reply-To: <OF61927CCB.4FC42CF1-ON862573BB.00816A4A-862573BB.0082E294@medical.local>
	(Yicheng Jia's message of "Mon, 24 Dec 2007 17:48:45 -0600")
References: <OF61927CCB.4FC42CF1-ON862573BB.00816A4A-862573BB.0082E294@medical.local>
Message-ID: <adaejd6vz0q.fsf@cisco.com>

 > I'm using OFED-1.0 and the problem I believe is related to command 
 > synchronization of HCA. The host issues a MAD_INF command at first and 
 > then a SW2HW_MTP command without waiting for the completion of the first 
 > command. Both commands return with a bad-parameters error.

I guess you mean the MAD_IFC and SW2HW_MPT commands?  I've never heard
of a problem like that -- more details about your hardware/software
config and the exact symptoms you see would be helpful in debugging.

Anyway OFED 1.0 is ancient by now -- you are much better off just
using drivers from the standard kernel.  If you must use OFED, then
OFED 1.2 or even a 1.3 prerelease would be better.

 > My question is why there's no synchronization mechanism for the command 
 > execution on HCA, can I use "spin_lock" or "sem_wait" to synchronize 
 > between every command?

The HCA firmware allows multiple commands to be queued.  The
dev->cmd.event_sem semaphore is used to limit the number of
outstanding commands to the HCA's capabilities, and the
dev->cmd.hcr_mutex mutex is used to serialize the actual writing of
commands to the HCA.
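
The two-level scheme described above can be sketched in userspace with a POSIX counting semaphore and a mutex. This is only an analogy under stated assumptions, not the mthca code: the struct, field and function names are hypothetical, the "command register" is a plain variable, and the sketch uses sem_trywait() so that a single-threaded example stays deterministic, whereas a driver would normally block until a slot is free.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <stdint.h>

struct cmd_sketch {
    sem_t event_sem;            /* counts free command slots */
    pthread_mutex_t hcr_mutex;  /* serializes writes to the register */
    uint32_t hcr;               /* stand-in for the command register */
    unsigned posted;            /* number of commands written so far */
};

/* Initialize with the maximum number of outstanding commands. */
int cmd_sketch_init(struct cmd_sketch *c, unsigned max_outstanding)
{
    assert(c != NULL);
    c->hcr = 0;
    c->posted = 0;
    if (sem_init(&c->event_sem, 0, max_outstanding) != 0)
        return -1;
    return pthread_mutex_init(&c->hcr_mutex, NULL) == 0 ? 0 : -1;
}

/* Post a command without blocking. Returns 0 on success, or -1 when
 * every slot is already taken by an outstanding command. */
int cmd_sketch_try_post(struct cmd_sketch *c, uint32_t opcode)
{
    if (sem_trywait(&c->event_sem) != 0)
        return -1;
    pthread_mutex_lock(&c->hcr_mutex);   /* one writer at a time */
    c->hcr = opcode;
    c->posted++;
    pthread_mutex_unlock(&c->hcr_mutex);
    return 0;
}

/* Called when a command completes: frees its slot. */
void cmd_sketch_complete(struct cmd_sketch *c)
{
    sem_post(&c->event_sem);
}
```

The point of splitting the two is that the semaphore bounds how many commands are in flight, while the mutex is held only for the short time it takes to write one command.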

There was a mmiowb() added to mthca_cmd_post() fairly recently that
might fix your problems if you are running on a large SGI Altix system.

 - R.


From vlad at lists.openfabrics.org  Sat Dec 29 03:06:42 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sat, 29 Dec 2007 03:06:42 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071229-0200 daily build status
Message-ID: <20071229110642.B5C33E60399@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.12
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.17
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.13
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.12
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.13
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.17
Passed on ppc64 with linux-2.6.14
Passed on x86_64 with linux-2.6.12
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.23
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-53.el5
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.18-8.el5
Passed on ppc64 with linux-2.6.18-8.el5

Failed:


From sashak at voltaire.com  Sat Dec 29 10:27:18 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 29 Dec 2007 18:27:18 +0000
Subject: [ofa-general] [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <4770CDCE.8040200@dev.mellanox.co.il>
References: <4770CDCE.8040200@dev.mellanox.co.il>
Message-ID: <20071229182718.GA19160@sashak.voltaire.com>


This improves handling of storms of mcast join/leave requests. Mcast
routing is now recalculated in one pass for all mcast groups where
changes occurred, rather than group by group. To do this, the patch
queues mcast groups instead of mcast rerouting requests, which also
makes the state_mgr idle queue obsolete.

Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
---

Hi Yevgeny,

It looks to me like this should solve the original problem (the mcast
group list is purged in osm_mcast_mgr_process()). Could you review and,
ideally, test it? Thanks.

Sasha

---
 opensm/include/opensm/osm_mcast_mgr.h |   14 +--
 opensm/include/opensm/osm_multicast.h |    2 +
 opensm/include/opensm/osm_sm.h        |    2 +
 opensm/include/opensm/osm_state_mgr.h |   95 -----------------
 opensm/opensm/osm_mcast_mgr.c         |  187 +++++++++++++++------------------
 opensm/opensm/osm_sm.c                |   70 ++++++-------
 opensm/opensm/osm_state_mgr.c         |  138 +------------------------
 7 files changed, 130 insertions(+), 378 deletions(-)

diff --git a/opensm/include/opensm/osm_mcast_mgr.h b/opensm/include/opensm/osm_mcast_mgr.h
index 3e0b761..47b67ed 100644
--- a/opensm/include/opensm/osm_mcast_mgr.h
+++ b/opensm/include/opensm/osm_mcast_mgr.h
@@ -100,7 +100,6 @@ typedef struct _osm_mcast_mgr {
 	osm_req_t *p_req;
 	osm_log_t *p_log;
 	cl_plock_t *p_lock;
-
 } osm_mcast_mgr_t;
 /*
 * FIELDS
@@ -253,25 +252,22 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr);
 *	Multicast Manager, Node Info Response Controller
 *********/
 
-/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgrp_cb
+/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgroups
 * NAME
-*	osm_mcast_mgr_process_mgrp_cb
+*	osm_mcast_mgr_process_mgroups
 *
 * DESCRIPTION
-*	Callback entry point for the osm_mcast_mgr_process_mgrp function.
+*	Process only requested mcast groups.
 *
 * SYNOPSIS
 */
 osm_signal_t
-osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2);
+osm_mcast_mgr_process_mgroups(IN osm_mcast_mgr_t *p_mgr);
 /*
 * PARAMETERS
-*	(Context1) p_mgr
+*	p_mgr
 *		[in] Pointer to an osm_mcast_mgr_t object.
 *
-*	(Context2) p_mgrp
-*		[in] Pointer to the multicast group to process.
-*
 * RETURN VALUES
 *	IB_SUCCESS
 *
diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h
index 729a2ea..f442a45 100644
--- a/opensm/include/opensm/osm_multicast.h
+++ b/opensm/include/opensm/osm_multicast.h
@@ -50,6 +50,7 @@
 
 #include <iba/ib_types.h>
 #include <complib/cl_qmap.h>
+#include <complib/cl_qlist.h>
 #include <complib/cl_spinlock.h>
 #include <opensm/osm_base.h>
 #include <opensm/osm_mtree.h>
@@ -121,6 +122,7 @@ const char *osm_get_mcast_req_type_str(IN osm_mcast_req_type_t req_type);
 * SYNOPSIS
 */
 typedef struct osm_mcast_mgr_ctxt {
+	cl_list_item_t list_item;
 	ib_net16_t mlid;
 	osm_mcast_req_type_t req_type;
 	ib_net64_t port_guid;
diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h
index 4c6ce27..a676cd6 100644
--- a/opensm/include/opensm/osm_sm.h
+++ b/opensm/include/opensm/osm_sm.h
@@ -140,6 +140,8 @@ typedef struct osm_sm {
 	cl_dispatcher_t *p_disp;
 	cl_plock_t *p_lock;
 	atomic32_t sm_trans_id;
+	cl_spinlock_t mgrp_lock;
+	cl_qlist_t mgrp_list;
 	osm_req_t req;
 	osm_resp_t resp;
 	osm_ni_rcv_t ni_rcv;
diff --git a/opensm/include/opensm/osm_state_mgr.h b/opensm/include/opensm/osm_state_mgr.h
index dada097..f51593a 100644
--- a/opensm/include/opensm/osm_state_mgr.h
+++ b/opensm/include/opensm/osm_state_mgr.h
@@ -109,8 +109,6 @@ typedef struct _osm_state_mgr {
 	osm_stats_t *p_stats;
 	struct _osm_sm_state_mgr *p_sm_state_mgr;
 	const osm_sm_mad_ctrl_t *p_mad_ctrl;
-	cl_spinlock_t idle_lock;
-	cl_qlist_t idle_time_list;
 	cl_plock_t *p_lock;
 	cl_event_t *p_subnet_up_event;
 	osm_sm_state_t state;
@@ -172,99 +170,6 @@ typedef struct _osm_state_mgr {
 *	State Manager object
 *********/
 
-/****s* OpenSM: State Manager/_osm_idle_item
-* NAME
-*	_osm_idle_item
-*
-* DESCRIPTION
-*	Idle item.
-*
-* SYNOPSIS
-*/
-
-typedef osm_signal_t(*osm_pfn_start_t) (IN void *context1, IN void *context2);
-
-typedef void
- (*osm_pfn_done_t) (IN void *context1, IN void *context2);
-
-typedef struct _osm_idle_item {
-	cl_list_item_t list_item;
-	void *context1;
-	void *context2;
-	osm_pfn_start_t pfn_start;
-	osm_pfn_done_t pfn_done;
-} osm_idle_item_t;
-
-/*
-* FIELDS
-*	list_item
-*		list item.
-*
-*	context1
-*		Context pointer
-*
-*	context2
-*		Context pointer
-*
-*	pfn_start
-*		Pointer to the start function.
-*
-*	pfn_done
-*		Pointer to the dine function.
-* SEE ALSO
-*	State Manager object
-*********/
-
-/****f* OpenSM: State Manager/osm_state_mgr_process_idle
-* NAME
-*	osm_state_mgr_process_idle
-*
-* DESCRIPTION
-*	Formulates the osm_idle_item and inserts it into the queue and
-*	signals the state manager.
-*
-* SYNOPSIS
-*/
-
-ib_api_status_t
-osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
-			   IN osm_pfn_start_t pfn_start,
-			   IN osm_pfn_done_t pfn_done,
-			   void *context1, void *context2);
-
-/*
-* PARAMETERS
-*	p_mgr
-*		[in] Pointer to a State Manager object to construct.
-*
-*	pfn_start
-*		[in] Pointer the start function which will be called at
-*			idle time.
-*
-*	pfn_done
-*		[in] pointer the done function which will be called
-*			when outstanding smps is zero
-*
-*	context1
-*		[in] Pointer to void
-*
-*	context2
-*		[in] Pointer to void
-*
-* RETURN VALUE
-*	IB_SUCCESS or IB_ERROR
-*
-* NOTES
-*	Allows osm_state_mgr_destroy
-*
-*	Calling osm_state_mgr_construct is a prerequisite to calling any other
-*	method except osm_state_mgr_init.
-*
-* SEE ALSO
-*	State Manager object, osm_state_mgr_init,
-*	osm_state_mgr_destroy
-*********/
-
 /****f* OpenSM: State Manager/osm_state_mgr_construct
 * NAME
 *	osm_state_mgr_construct
diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
index 50b95fd..f51a45a 100644
--- a/opensm/opensm/osm_mcast_mgr.c
+++ b/opensm/opensm/osm_mcast_mgr.c
@@ -815,7 +815,7 @@ static osm_mtree_node_t *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr,
 	}
 
 	free(list_array);
-      Exit:
+Exit:
 	OSM_LOG_EXIT(p_mgr->p_log);
 	return (p_mtn);
 }
@@ -932,7 +932,7 @@ __osm_mcast_mgr_build_spanning_tree(osm_mcast_mgr_t * const p_mgr,
 		"Configured MLID 0x%X for %u ports, max tree depth = %u\n",
 		cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth);
 
-      Exit:
+Exit:
 	OSM_LOG_EXIT(p_mgr->p_log);
 	return (status);
 }
@@ -1171,7 +1171,7 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t * const p_mgr,
 		}
 	}
 
-      Exit:
+Exit:
 	OSM_LOG_EXIT(p_mgr->p_log);
 	return (status);
 }
@@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t * const p_mgr,
 							   port_guid);
 	}
 
-      Exit:
+Exit:
 	OSM_LOG_EXIT(p_mgr->p_log);
 	return (status);
 }
 
 /**********************************************************************
  Process the entire group.
-
  NOTE : The lock should be held externally!
  **********************************************************************/
-static osm_signal_t
-osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
-			   IN osm_mgrp_t * const p_mgrp,
-			   IN osm_mcast_req_type_t req_type,
-			   IN ib_net64_t port_guid)
+static ib_api_status_t
+mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
+		       IN osm_mgrp_t * const p_mgrp,
+		       IN osm_mcast_req_type_t req_type,
+		       IN ib_net64_t port_guid)
 {
-	osm_signal_t signal = OSM_SIGNAL_DONE;
 	ib_api_status_t status;
-	osm_switch_t *p_sw;
-	cl_qmap_t *p_sw_tbl;
-	boolean_t pending_transactions = FALSE;
 
 	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp);
 
-	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
-
 	status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type, port_guid);
 	if (status != IB_SUCCESS) {
 		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
-			"osm_mcast_mgr_process_mgrp: ERR 0A19: "
+			"mcast_mgr_process_mgrp: ERR 0A19: "
 			"Unable to create spanning tree (%s)\n",
 			ib_get_err_str(status));
-
 		goto Exit;
 	}
+	p_mgrp->last_tree_id = p_mgrp->last_change_id;
 
-	/*
-	   Walk the switches and download the tables for each.
+	/* Remove MGRP only if osm_mcm_port_t count is 0 and
+	 * Not a well known group
 	 */
-	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
-	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
-		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
-		if (signal == OSM_SIGNAL_DONE_PENDING)
-			pending_transactions = TRUE;
-
-		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
+	if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known) {
+		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
+			"mcast_mgr_process_mgrp: "
+			"Destroying mgrp with lid:0x%X\n",
+			cl_ntoh16(p_mgrp->mlid));
+		/* Send a Report to any InformInfo registered for
+		   Trap 67 : MCGroup delete */
+		osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
+					    p_mgrp);
+		cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
+				    (cl_map_item_t *) p_mgrp);
+		osm_mgrp_delete(p_mgrp);
 	}
 
-	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
-
-      Exit:
+Exit:
 	OSM_LOG_EXIT(p_mgr->p_log);
-
-	if (pending_transactions == TRUE)
-		return (OSM_SIGNAL_DONE_PENDING);
-	else
-		return (OSM_SIGNAL_DONE);
+	return status;
 }
 
 /**********************************************************************
@@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
 	osm_switch_t *p_sw;
 	cl_qmap_t *p_sw_tbl;
 	cl_qmap_t *p_mcast_tbl;
+	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
 	osm_mgrp_t *p_mgrp;
-	ib_api_status_t status;
 	boolean_t pending_transactions = FALSE;
 
 	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process);
 
 	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
-
 	p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl;
 	/*
 	   While holding the lock, iterate over all the established
@@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
 		/* We reached here due to some change that caused a heavy sweep
 		   of the subnet. Not due to a specific multicast request.
 		   So the request type is subnet_change and the port guid is 0. */
-		status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp,
-						    OSM_MCAST_REQ_TYPE_SUBNET_CHANGE,
-						    0);
-		if (status != IB_SUCCESS) {
-			osm_log(p_mgr->p_log, OSM_LOG_ERROR,
-				"osm_mcast_mgr_process: ERR 0A20: "
-				"Unable to create spanning tree (%s)\n",
-				ib_get_err_str(status));
-		}
-
+		mcast_mgr_process_mgrp(p_mgr, p_mgrp,
+				       OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0);
 		p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item);
 	}
 
@@ -1364,10 +1347,14 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
 		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
 		if (signal == OSM_SIGNAL_DONE_PENDING)
 			pending_transactions = TRUE;
-
 		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
 	}
 
+	while (!cl_is_qlist_empty(p_list)) {
+		cl_list_item_t *p = cl_qlist_remove_head(p_list);
+		free(p);
+	}
+
 	CL_PLOCK_RELEASE(p_mgr->p_lock);
 
 	OSM_LOG_EXIT(p_mgr->p_log);
@@ -1395,79 +1382,79 @@ osm_mgrp_t *__get_mgrp_by_mlid(IN osm_mcast_mgr_t * const p_mgr,
 
 /**********************************************************************
   This is the function that is invoked during idle time to handle the
-  process request. Context1 is simply the osm_mcast_mgr_t*, Context2
-  hold the mlid, port guid and action (join/leave/delete) required.
+  process request for mcast groups where join/leave/delete was required.
  **********************************************************************/
-osm_signal_t
-osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2)
+osm_signal_t osm_mcast_mgr_process_mgroups(osm_mcast_mgr_t * p_mgr)
 {
-	osm_mcast_mgr_t *p_mgr = (osm_mcast_mgr_t *) Context1;
+	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
+	osm_switch_t *p_sw;
+	cl_qmap_t *p_sw_tbl;
 	osm_mgrp_t *p_mgrp;
 	ib_net16_t mlid;
-	osm_signal_t signal = OSM_SIGNAL_DONE;
-	osm_mcast_mgr_ctxt_t *p_ctxt = (osm_mcast_mgr_ctxt_t *) Context2;
-	osm_mcast_req_type_t req_type = p_ctxt->req_type;
-	ib_net64_t port_guid = p_ctxt->port_guid;
-
-	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp_cb);
-
-	/* nice copy no warning on size diff */
-	memcpy(&mlid, &p_ctxt->mlid, sizeof(mlid));
+	osm_signal_t ret, signal = OSM_SIGNAL_DONE;
+	osm_mcast_mgr_ctxt_t *ctx;
+	osm_mcast_req_type_t req_type;
+	ib_net64_t port_guid;
 
-	/* we can destroy the context now */
-	free(p_ctxt);
+	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgroups);
 
 	/* we need a lock to make sure the p_mgrp is not change other ways */
 	CL_PLOCK_EXCL_ACQUIRE(p_mgr->p_lock);
-	p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
 
-	/* since we delayed the execution we prefer to pass the
-	   mlid as the mgrp identifier and then find it or abort */
+	if (cl_is_qlist_empty(p_list)) {
+		CL_PLOCK_RELEASE(p_mgr->p_lock);
+		return OSM_SIGNAL_NONE;
+	}
+
+	while (!cl_is_qlist_empty(p_list)) {
+		ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list);
+		req_type = ctx->req_type;
+		port_guid = ctx->port_guid;
+
+		/* nice copy no warning on size diff */
+		memcpy(&mlid, &ctx->mlid, sizeof(mlid));
 
-	if (p_mgrp) {
+		/* we can destroy the context now */
+		free(ctx);
+
+		/* since we delayed the execution we prefer to pass the
+		   mlid as the mgrp identifier and then find it or abort */
+		p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
+		if (!p_mgrp)
+			continue;
 
-		/* if there was no change from the last time we processed the group
-		   we can skip doing anything
+		/* if there was no change from the last time
+		 * we processed the group we can skip doing anything
 		 */
 		if (p_mgrp->last_change_id == p_mgrp->last_tree_id) {
 			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
-				"osm_mcast_mgr_process_mgrp_cb: "
+				"osm_mcast_mgr_process_mgroups: "
 				"Skip processing mgrp with lid:0x%X change id:%u\n",
 				cl_ntoh16(mlid), p_mgrp->last_change_id);
-		} else {
-			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
-				"osm_mcast_mgr_process_mgrp_cb: "
-				"Processing mgrp with lid:0x%X change id:%u\n",
-				cl_ntoh16(mlid), p_mgrp->last_change_id);
-
-			signal =
-			    osm_mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type,
-						       port_guid);
-			p_mgrp->last_tree_id = p_mgrp->last_change_id;
+			continue;
 		}
 
-		/* Remove MGRP only if osm_mcm_port_t count is 0 and
-		 * Not a well known group
-		 */
-		if ((0x0 == cl_qmap_count(&p_mgrp->mcm_port_tbl)) &&
-		    (p_mgrp->well_known == FALSE)) {
-			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
-				"osm_mcast_mgr_process_mgrp_cb: "
-				"Destroying mgrp with lid:0x%X\n",
-				cl_ntoh16(mlid));
-
-			/* Send a Report to any InformInfo registered for
-			   Trap 67 : MCGroup delete */
-			osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
-						    p_mgrp);
-
-			cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
-					    (cl_map_item_t *) p_mgrp);
+		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
+			"osm_mcast_mgr_process_mgroups: "
+			"Processing mgrp with lid:0x%X change id:%u\n",
+			cl_ntoh16(mlid), p_mgrp->last_change_id);
+		mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, port_guid);
+	}
 
-			osm_mgrp_delete(p_mgrp);
-		}
+	/*
+	   Walk the switches and download the tables for each.
+	 */
+	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
+	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
+	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
+		ret = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
+		if (ret == OSM_SIGNAL_DONE_PENDING)
+			signal = ret;
+		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
 	}
 
+	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
+
 	CL_PLOCK_RELEASE(p_mgr->p_lock);
 	OSM_LOG_EXIT(p_mgr->p_log);
 	return signal;
diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c
index 88e6d4a..b295a77 100644
--- a/opensm/opensm/osm_sm.c
+++ b/opensm/opensm/osm_sm.c
@@ -144,6 +144,7 @@ void osm_sm_construct(IN osm_sm_t * const p_sm)
 	cl_event_construct(&p_sm->signal_event);
 	cl_event_construct(&p_sm->subnet_up_event);
 	cl_thread_construct(&p_sm->sweeper);
+	cl_spinlock_construct(&p_sm->mgrp_lock);
 	osm_req_construct(&p_sm->req);
 	osm_resp_construct(&p_sm->resp);
 	osm_ni_rcv_construct(&p_sm->ni_rcv);
@@ -245,6 +246,7 @@ void osm_sm_destroy(IN osm_sm_t * const p_sm)
 	cl_event_destroy(&p_sm->signal_event);
 	cl_event_destroy(&p_sm->subnet_up_event);
 	cl_spinlock_destroy(&p_sm->signal_lock);
+	cl_spinlock_destroy(&p_sm->mgrp_lock);
 
 	osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n");	/* Format Waived */
 	OSM_LOG_EXIT(p_sm->p_log);
@@ -292,6 +294,12 @@ osm_sm_init(IN osm_sm_t * const p_sm,
 	if (status != CL_SUCCESS)
 		goto Exit;
 
+	cl_qlist_init(&p_sm->mgrp_list);
+
+	status = cl_spinlock_init(&p_sm->mgrp_lock);
+	if (status != CL_SUCCESS)
+		goto Exit;
+
 	status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl,
 				      p_sm->p_subn,
 				      p_sm->p_mad_pool,
@@ -551,32 +559,43 @@ osm_sm_bind(IN osm_sm_t * const p_sm, IN const ib_net64_t port_guid)
 /**********************************************************************
  **********************************************************************/
 static ib_api_status_t
-__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
+__osm_sm_mgrp_process(IN osm_sm_t * const p_sm,
 		      IN osm_mgrp_t * const p_mgrp,
 		      IN const ib_net64_t port_guid,
 		      IN osm_mcast_req_type_t req_type)
 {
-	ib_api_status_t status;
 	osm_mcast_mgr_ctxt_t *ctx2;
 
-	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_connect);
-
 	/*
 	 * 'Schedule' all the QP0 traffic for when the state manager
 	 * isn't busy trying to do something else.
 	 */
 	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
+	if (!ctx2)
+		return IB_ERROR;
+	memset(ctx2, 0, sizeof(*ctx2));
 	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
 	ctx2->req_type = req_type;
 	ctx2->port_guid = port_guid;
 
-	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
-					    osm_mcast_mgr_process_mgrp_cb,
-					    NULL, &p_sm->mcast_mgr,
-					    (void *)ctx2);
+	cl_spinlock_acquire(&p_sm->mgrp_lock);
+	cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx2->list_item);
+	cl_spinlock_release(&p_sm->mgrp_lock);
 
-	OSM_LOG_EXIT(p_sm->p_log);
-	return (status);
+	osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
+
+	return IB_SUCCESS;
+}
+
+/**********************************************************************
+ **********************************************************************/
+static ib_api_status_t
+__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
+		      IN osm_mgrp_t * const p_mgrp,
+		      IN const ib_net64_t port_guid,
+		      IN osm_mcast_req_type_t req_type)
+{
+	return __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, req_type);
 }
 
 /**********************************************************************
@@ -586,31 +605,7 @@ __osm_sm_mgrp_disconnect(IN osm_sm_t * const p_sm,
 			 IN osm_mgrp_t * const p_mgrp,
 			 IN const ib_net64_t port_guid)
 {
-	ib_api_status_t status;
-	osm_mcast_mgr_ctxt_t *ctx2;
-
-	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_disconnect);
-
-	/*
-	 * 'Schedule' all the QP0 traffic for when the state manager
-	 * isn't busy trying to do something else.
-	 */
-	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
-	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
-	ctx2->req_type = OSM_MCAST_REQ_TYPE_LEAVE;
-	ctx2->port_guid = port_guid;
-
-	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
-					    osm_mcast_mgr_process_mgrp_cb,
-					    NULL, &p_sm->mcast_mgr, ctx2);
-	if (status != IB_SUCCESS) {
-		osm_log(p_sm->p_log, OSM_LOG_ERROR,
-			"__osm_sm_mgrp_disconnect: ERR 2E11: "
-			"Failure processing multicast group (%s)\n",
-			ib_get_err_str(status));
-	}
-
-	OSM_LOG_EXIT(p_sm->p_log);
+	__osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, OSM_MCAST_REQ_TYPE_LEAVE);
 }
 
 /**********************************************************************
@@ -719,8 +714,8 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm,
 		goto Exit;
 	}
 
-	CL_PLOCK_RELEASE(p_sm->p_lock);
 	status = __osm_sm_mgrp_connect(p_sm, p_mgrp, port_guid, req_type);
+	CL_PLOCK_RELEASE(p_sm->p_lock);
 
       Exit:
 	OSM_LOG_EXIT(p_sm->p_log);
@@ -782,9 +777,8 @@ osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm,
 
 	osm_port_remove_mgrp(p_port, mlid);
 
-	CL_PLOCK_RELEASE(p_sm->p_lock);
-
 	__osm_sm_mgrp_disconnect(p_sm, p_mgrp, port_guid);
+	CL_PLOCK_RELEASE(p_sm->p_lock);
 
       Exit:
 	OSM_LOG_EXIT(p_sm->p_log);
diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
index 5c39f11..d4dd782 100644
--- a/opensm/opensm/osm_state_mgr.c
+++ b/opensm/opensm/osm_state_mgr.c
@@ -76,7 +76,6 @@ osm_signal_t osm_qos_setup(IN osm_opensm_t * p_osm);
 void osm_state_mgr_construct(IN osm_state_mgr_t * const p_mgr)
 {
 	memset(p_mgr, 0, sizeof(*p_mgr));
-	cl_spinlock_construct(&p_mgr->idle_lock);
 	p_mgr->state = OSM_SM_STATE_INIT;
 }
 
@@ -88,9 +87,6 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const p_mgr)
 
 	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_destroy);
 
-	/* destroy the locks */
-	cl_spinlock_destroy(&p_mgr->idle_lock);
-
 	OSM_LOG_EXIT(p_mgr->p_log);
 }
 
@@ -112,8 +108,6 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
 		   IN cl_event_t * const p_subnet_up_event,
 		   IN osm_log_t * const p_log)
 {
-	cl_status_t status;
-
 	OSM_LOG_ENTER(p_log, osm_state_mgr_init);
 
 	CL_ASSERT(p_subn);
@@ -145,17 +139,8 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
 	p_mgr->p_lock = p_lock;
 	p_mgr->p_subnet_up_event = p_subnet_up_event;
 
-	cl_qlist_init(&p_mgr->idle_time_list);
-
-	status = cl_spinlock_init(&p_mgr->idle_lock);
-	if (status != CL_SUCCESS) {
-		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
-			"osm_state_mgr_init: ERR 3302: "
-			"Spinlock init failed (%s)\n", CL_STATUS_MSG(status));
-	}
-
 	OSM_LOG_EXIT(p_mgr->p_log);
-	return (status);
+	return IB_SUCCESS;
 }
 
 /**********************************************************************
@@ -989,79 +974,6 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_state_mgr_t *
 }
 
 /**********************************************************************
- **********************************************************************/
-static void __process_idle_time_queue_done(IN osm_state_mgr_t * const p_mgr)
-{
-	cl_qlist_t *p_list = &p_mgr->idle_time_list;
-	cl_list_item_t *p_list_item;
-	osm_idle_item_t *p_process_item;
-
-	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
-
-	cl_spinlock_acquire(&p_mgr->idle_lock);
-	p_list_item = cl_qlist_remove_head(p_list);
-
-	if (p_list_item == cl_qlist_end(p_list)) {
-		cl_spinlock_release(&p_mgr->idle_lock);
-		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
-			"__process_idle_time_queue_done: ERR 3314: "
-			"Idle time queue is empty\n");
-		return;
-	}
-	cl_spinlock_release(&p_mgr->idle_lock);
-
-	p_process_item = (osm_idle_item_t *) p_list_item;
-
-	if (p_process_item->pfn_done) {
-
-		p_process_item->pfn_done(p_process_item->context1,
-					 p_process_item->context2);
-	}
-
-	free(p_process_item);
-
-	OSM_LOG_EXIT(p_mgr->p_log);
-	return;
-}
-
-/**********************************************************************
- **********************************************************************/
-static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t *
-						    const p_mgr)
-{
-	cl_qlist_t *p_list = &p_mgr->idle_time_list;
-	cl_list_item_t *p_list_item;
-	osm_idle_item_t *p_process_item;
-	osm_signal_t signal;
-
-	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_start);
-
-	cl_spinlock_acquire(&p_mgr->idle_lock);
-
-	p_list_item = cl_qlist_head(p_list);
-	if (p_list_item == cl_qlist_end(p_list)) {
-		cl_spinlock_release(&p_mgr->idle_lock);
-		OSM_LOG_EXIT(p_mgr->p_log);
-		return OSM_SIGNAL_NONE;
-	}
-
-	cl_spinlock_release(&p_mgr->idle_lock);
-
-	p_process_item = (osm_idle_item_t *) p_list_item;
-
-	CL_ASSERT(p_process_item->pfn_start);
-
-	signal =
-	    p_process_item->pfn_start(p_process_item->context1,
-				      p_process_item->context2);
-
-	CL_ASSERT(signal != OSM_SIGNAL_NONE);
-
-	OSM_LOG_EXIT(p_mgr->p_log);
-	return signal;
-}
-
-/**********************************************************************
  * Go over all the remote SMs (as updated in the sm_guid_tbl).
  * Find if there is a remote sm that is a master SM.
  * If there is a remote master SM - return a pointer to it,
@@ -1558,7 +1470,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
 		case OSM_SM_STATE_PROCESS_REQUEST:
 			switch (signal) {
 			case OSM_SIGNAL_IDLE_TIME_PROCESS:
-				signal = __process_idle_time_queue_start(p_mgr);
+				signal = osm_mcast_mgr_process_mgroups(p_mgr->p_mcast_mgr);
 				switch (signal) {
 				case OSM_SIGNAL_NONE:
 					p_mgr->state = OSM_SM_STATE_IDLE;
@@ -1604,14 +1516,6 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
 			switch (signal) {
 			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
 			case OSM_SIGNAL_DONE:
-				/* CALL the done function */
-				__process_idle_time_queue_done(p_mgr);
-
-				/*
-				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
-				 * so that the next element in the queue gets processed
-				 */
-
 				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
 				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
 				break;
@@ -2424,41 +2328,3 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
 
 	OSM_LOG_EXIT(p_mgr->p_log);
 }
-
-/**********************************************************************
- **********************************************************************/
-ib_api_status_t
-osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
-			   IN osm_pfn_start_t pfn_start,
-			   IN osm_pfn_done_t pfn_done, void *context1,
-			   void *context2)
-{
-	osm_idle_item_t *p_idle_item;
-
-	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process_idle);
-
-	p_idle_item = malloc(sizeof(osm_idle_item_t));
-	if (p_idle_item == NULL) {
-		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
-			"osm_state_mgr_process_idle: ERR 3321: "
-			"insufficient memory\n");
-		return IB_ERROR;
-	}
-
-	memset(p_idle_item, 0, sizeof(osm_idle_item_t));
-	p_idle_item->pfn_start = pfn_start;
-	p_idle_item->pfn_done = pfn_done;
-	p_idle_item->context1 = context1;
-	p_idle_item->context2 = context2;
-
-	cl_spinlock_acquire(&p_mgr->idle_lock);
-	cl_qlist_insert_tail(&p_mgr->idle_time_list, &p_idle_item->list_item);
-	cl_spinlock_release(&p_mgr->idle_lock);
-
-	osm_sm_signal(&p_mgr->p_subn->p_osm->sm,
-		      OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
-
-	OSM_LOG_EXIT(p_mgr->p_log);
-
-	return IB_SUCCESS;
-}
-- 
1.5.3.4.206.g58ba4


From sashak at voltaire.com  Sat Dec 29 10:34:59 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 29 Dec 2007 18:34:59 +0000
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <4774135E.6060601@dev.mellanox.co.il>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
	<20071227161325.GB13378@sashak.voltaire.com>
	<1198772437.23289.209.camel@hrosenstock-ws.xsigo.com>
	<20071227164028.GE13378@sashak.voltaire.com>
	<4774135E.6060601@dev.mellanox.co.il>
Message-ID: <20071229183459.GB19160@sashak.voltaire.com>

On 23:04 Thu 27 Dec     , Yevgeny Kliteynik wrote:
>  Sasha Khapyorsky wrote:
> > On 08:20 Thu 27 Dec     , Hal Rosenstock wrote:
> >> On Thu, 2007-12-27 at 16:13 +0000, Sasha Khapyorsky wrote:
> >>> Hi Hal,
> >>>
> >>> On 07:00 Thu 27 Dec     , Hal Rosenstock wrote:
> >>>> On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
> >>>>> Fixing wrong setting of partition enforcement bits on switch ports.
> >>>>> When an HCA port is configured with a certain pkey, the peer port
> >>>>> on the switch should turn on outbound partition enforcement bit only.
> >>>>> Turning on the inbound enforcement will cause the switch to drop
> >>>>> valid packets if the HCA is partial member.
> >>>> Inbound enforcement is actually the more useful case. If there is
> >>>> inbound enforcement, outbound enforcement doesn't add much.
> >>>>
> >>>> In the case of partial only (not both partial and full) membership, the
> >>>> peer switch physical port would need to be set to full membership.
> >>> Then it could break outbound enforcement. Isn't it?
> >> What I wrote was wrong. Limited pkey is sufficient. See o18-14
> > Do you mean last paragraph of o18-14? Assuming so - it makes sense. So
> > we need just revert the original patch.
> 
>  Almost true. It would be nice to keep the new condition:
> 
>  -	if ((p_pi->vl_enforce & 0xc) == (0xc) * (enforce == TRUE)) {
>  +	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
>  +	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {

I prefer the original version: it is shorter and looks cleaner to me.

I'm reverting the entire patch.

Sasha


From sashak at voltaire.com  Sat Dec 29 10:50:57 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sat, 29 Dec 2007 18:50:57 +0000
Subject: [ofa-general] Re: [PATCH] osm/osm_mcast_mgr.c: coredump in
	ofed_1_2
In-Reply-To: <1198775715.23289.225.camel@hrosenstock-ws.xsigo.com>
References: <47724048.2000302@dev.mellanox.co.il>
	<20071226154218.GN7012@sashak.voltaire.com>
	<1198767473.23289.195.camel@hrosenstock-ws.xsigo.com>
	<20071227164737.GF13378@sashak.voltaire.com>
	<1198773804.23289.220.camel@hrosenstock-ws.xsigo.com>
	<20071227170758.GI13378@sashak.voltaire.com>
	<1198775715.23289.225.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071229185057.GC19160@sashak.voltaire.com>

Hi Hal,

On 09:15 Thu 27 Dec     , Hal Rosenstock wrote:
> 
> There were other patches I supplied, etc. I don't think it was just the
> outstanding counter fix but others too.

I may be wrong about this, but I don't remember them now. If you would
like me to review those patches again, let's do so.

> > > I can dig out the emails if this is really needed.
> > > 
> > > > I don't backport non critical fixes and improvements from master to 1.2
> > > > - that is true.
> > > 
> > > There were a number of fixes originally supplied for 1.2 up ported to
> > > 1.3. Guess you could always consider them non critical although I would
> > > beg to differ on some of those.
> > 
> > Always - no, but in general I prefer to run master in the field.
> 
> Not everyone has that luxury (and master has not even shipped as 1.3
> yet).

The recent OpenSM (and management) versions are available as tarballs
and/or from the git repo.

> I tried to get this changed but seems no one else shares this
> issue. As I said before, the implication of this is that 1.2 is not
> officially supported based on this policy (although there are fixes
> going into 1.2 as 1.2.5 is still "live").

There should be no connection between "this policy" and the official
OFED 1.2 support, which is provided by software/hardware vendors.

Sasha


From kliteyn at mellanox.co.il  Sat Dec 29 21:04:54 2007
From: kliteyn at mellanox.co.il (kliteyn at mellanox.co.il)
Date: 30 Dec 2007 07:04:54 +0200
Subject: [ofa-general] nightly osm_sim report 2007-12-30:normal completion
Message-ID: <MTLEXCH01Jjt8gsVWDK00002477@mtlexch01.mtl.com>

OSM Simulation Regression Summary
 
[Generated mail - please do NOT reply]
 
 
OpenSM binary date = 2007-12-29
OpenSM git rev = Sat_Dec_29_21:02:49_2007 [f7b47c635d291a4aef38c609028791b4ad1f1259]
ibutils git rev = Mon_Dec_24_10:42:01_2007 [675bec82306d6920555dd0b5e2f664983e27e60f]
 
 
Total=520  Pass=519  Fail=1
 
 
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13 Stability IS3-128.topo
13 Pkey IS3-128.topo
13 OsmTest IS3-loop.topo
13 OsmTest IS3-128.topo
13 OsmStress IS3-128.topo
13 Multicast IS3-loop.topo
13 Multicast IS3-128.topo
13 FatTree merge-roots-4-ary-2-tree.topo
13 FatTree merge-root-4-ary-3-tree.topo
13 FatTree gnu-stallion-64.topo
13 FatTree blend-4-ary-2-tree.topo
13 FatTree RhinoDDR.topo
13 FatTree FullGnu.topo
13 FatTree 4-ary-2-tree.topo
13 FatTree 2-ary-4-tree.topo
13 FatTree 12-node-spaced.topo
13 FTreeFail 4-ary-2-tree-missing-sw-link.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-2.topo
13 FTreeFail 4-ary-2-tree-links-at-same-rank-1.topo
13 FTreeFail 4-ary-2-tree-diff-num-pgroups.topo
12 LidMgr IS3-128.topo

Failures:
1 LidMgr IS3-128.topo


From kliteyn at dev.mellanox.co.il  Sat Dec 29 23:31:37 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Sun, 30 Dec 2007 09:31:37 +0200
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only	outbound
	partition enforcement on switch
In-Reply-To: <20071229183459.GB19160@sashak.voltaire.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
	<20071227161325.GB13378@sashak.voltaire.com>
	<1198772437.23289.209.camel@hrosenstock-ws.xsigo.com>
	<20071227164028.GE13378@sashak.voltaire.com>
	<4774135E.6060601@dev.mellanox.co.il>
	<20071229183459.GB19160@sashak.voltaire.com>
Message-ID: <47774959.1080208@dev.mellanox.co.il>

Sasha Khapyorsky wrote:
> On 23:04 Thu 27 Dec     , Yevgeny Kliteynik wrote:
>>  Sasha Khapyorsky wrote:
>>> On 08:20 Thu 27 Dec     , Hal Rosenstock wrote:
>>>> On Thu, 2007-12-27 at 16:13 +0000, Sasha Khapyorsky wrote:
>>>>> Hi Hal,
>>>>>
>>>>> On 07:00 Thu 27 Dec     , Hal Rosenstock wrote:
>>>>>> On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
>>>>>>> Fixing wrong setting of partition enforcement bits on switch ports.
>>>>>>> When an HCA port is configured with a certain pkey, the peer port
>>>>>>> on the switch should turn on outbound partition enforcement bit only.
>>>>>>> Turning on the inbound enforcement will cause the switch to drop
>>>>>>> valid packets if the HCA is partial member.
>>>>>> Inbound enforcement is actually the more useful case. If there is
>>>>>> inbound enforcement, outbound enforcement doesn't add much.
>>>>>>
>>>>>> In the case of partial only (not both partial and full) membership, the
>>>>>> peer switch physical port would need to be set to full membership.
>>>>> Then it could break outbound enforcement. Isn't it?
>>>> What I wrote was wrong. Limited pkey is sufficient. See o18-14
>>> Do you mean last paragraph of o18-14? Assuming so - it makes sense. So
>>> we need just revert the original patch.
>>  Almost true. It would be nice to keep the new condition:
>>
>>  -	if ((p_pi->vl_enforce & 0xc) == (0xc) * (enforce == TRUE)) {
>>  +	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
>>  +	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {
> 
> I liked the original version - it is shorter and looks cleaner to me.

OK, fine by me.

-- Yevgeny

> I'm reverting entire patch.
> 
> Sasha
> 
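
For anyone comparing the two conditions quoted above, here is a minimal C
sketch of what each one treats as "already up to date". It assumes that,
within the PortInfo byte OpenSM calls vl_enforce, 0x8 is the inbound and 0x4
the outbound partition enforcement bit (consistent with the 0xc mask and the
0x4 outbound-only value in the patch). The macro and function names are
hypothetical and are not taken from osm_pkey_mgr.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of the partition enforcement bits in vl_enforce. */
#define PART_ENF_INBOUND  0x8
#define PART_ENF_OUTBOUND 0x4
#define PART_ENF_MASK     (PART_ENF_INBOUND | PART_ENF_OUTBOUND)

/*
 * Original check: the port needs no update when both enforcement bits
 * are set and enforcement is wanted, or both are clear and it is not.
 */
static bool up_to_date_original(uint8_t vl_enforce, bool enforce)
{
    return (vl_enforce & PART_ENF_MASK) == PART_ENF_MASK * (enforce == true);
}

/*
 * Proposed check: the port needs no update when only the outbound bit
 * is set and enforcement is wanted, or both are clear and it is not.
 */
static bool up_to_date_proposed(uint8_t vl_enforce, bool enforce)
{
    return ((vl_enforce & PART_ENF_MASK) == PART_ENF_OUTBOUND && enforce) ||
           ((vl_enforce & PART_ENF_MASK) == 0 && !enforce);
}
```

The two checks agree when enforcement is off and differ only in which bit
pattern counts as "enforced": 0xc for the original, 0x4 for the proposed one.
Bits outside the 0xc mask are ignored by both.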


From vlad at dev.mellanox.co.il  Sat Dec 29 23:37:07 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Sun, 30 Dec 2007 09:37:07 +0200
Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to select
	sp4 patches for SLES9 kernel with minor versions equal
	or	greater	than 305
In-Reply-To: <4773F272.8070309@lnxi.com>
References: <39C75744D164D948A170E9792AF8E7CA4D2CE1@exil.voltaire.com>	<47670337.6080607@lnxi.com>
	<476A15BD.1050505@dev.mellanox.co.il> <4773F272.8070309@lnxi.com>
Message-ID: <47774AA3.90502@dev.mellanox.co.il>

Hi David,
Where can I get your patches?

Regards,
Vladimir

David B. Anderson wrote:
> Hi Vladimir,
> 
> The four patches named below are what I'm using to get the OFED 1.2.5 
> kernel to build for SLES9 SP4.
> 
> commit 3db835ee0edb792b120ba10c8066e3d4409de2d7
> 
> git://git.openfabrics.org/ofed_1_2/linux-2.6.git
> 
> 
> The patches are:
> 
> [PATCH 1/4] LNXI changed ofed_scripts configure to select sp4 patches
> 
> [PATCH 2/4] LNXI created backport patch addr_8802_to_2_6_5-7_308
> 
> [PATCH 3/4] LNXI fixed backport/2.6.5_sles9_sp4/rds_to_2_6_9.patch
> 
> [PATCH 4/4] LNXI fixed backport/2.6.5_sles9_sp4/cxg3_to_2_6_20.patch
> 
> 
> I've tested these on my cluster.
> 
> Note: I changed your patch to the ofed_scripts/configure script, so that
> even if the SLES9 kernel is greater than 309 it will not revert to using
> SP3 patches.
> 
> 
> David
> 
> 
> Vladimir Sokolovsky wrote:
>> David B. Anderson wrote:
>>> I've all of these patches plus the following patch
>>>
>>>    kernel_patches/backport/2.6.5_sles9_sp4/cxgb3_remove_eeh.patch
>>>
>>> My current git repo is
>>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git
>>> commit: 6974c285e6fb06264f570f9cf919865bab66c9e6
>>>
>>> My patch that I posted before fixes the kernel configure script so 
>>> that it applies 2.6.5_sles9_sp4 patches for the SP4 release kernel of 
>>> 2.6.5-7.308 and above. The configure patch from 
>>> OFED_1.2.5_sles9_sp4_configure.diff has 2.6.5-7.305* as the only valid 
>>> SP4 kernel, which is incorrect. I get the same compiler error as before.
>>>
>>>
>>>
>>> Moshe Kazir wrote:
>>>>  See patches in the attached message.
>>>>
>>>> It was applied by Vlad.
>>>>
>>>> Moshe
>>>>
>>>> ____________________________________________________________
>>>> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>>>>  
>>>> Voltaire - The Grid Backbone
>>>>  
>>>>  www.voltaire.com
>>>>
>>>>  
>>>>
>>>> -----Original Message-----
>>>> From: general-bounces at lists.openfabrics.org
>>>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of David B.
>>>> Anderson
>>>> Sent: Saturday, December 15, 2007 3:31 AM
>>>> To: general at lists.openfabrics.org; vlad at mellanox.co.il
>>>> Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to
>>>> select sp4 patches for SLES9 kernel with minor versions equal or 
>>>> greater
>>>> than 305
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I've created the following patch for OFED 1.2.5.4 so that the kernel
>>>> for SLES9 SP4 (2.6.5-7.308) is recognized.
>>>>
>>>> Even with the patch, two backport patches did not apply cleanly
>>>> (cxg3_to_2_6_20.patch, rds_to_2_6_9.patch). I hand-patched them, but
>>>> now I'm getting the following compiler errors:
>>>>
>>>> In file included from /usr/src/linux-2.6.5-7.308/include/linux/module.h:10,
>>>>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/module.h:4,
>>>>                  from /usr/src/linux-2.6.5-7.308/include/linux/device.h:21,
>>>>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:4,
>>>>                  from /usr/src/linux-2.6.5-7.308/include/linux/netdevice.h:38,
>>>>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/netdevice.h:4,
>>>>                  from /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:32:
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/sched.h:8: warning: static declaration for `wait_for_completion_timeout' follows non-static
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:67: warning: initialization from incompatible pointer type
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c: In function `addr_resolve_remote':
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:192: error: structure has no member named `idev'
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:193: error: structure has no member named `idev'
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c:197: error: structure has no member named `idev'
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.c: At top level:
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:48: warning: `class_create' defined but not used
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:82: warning: `class_destroy' defined but not used
>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons/backport/2.6.5_sles9_sp4/include/linux/device.h:108: warning: `class_device_create' defined but not used
>>>> make[6]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core/addr.o] Error 1
>>>> make[5]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband/core] Error 2
>>>> make[4]: *** [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infiniband] Error 2
>>>> make[3]: *** [_module_/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default] Error 2
>>>> make[2]: *** [modules] Error 2
>>>> make[1]: *** [modules] Error 2
>>>> make[1]: Leaving directory `/usr/src/linux-2.6.5-7.308-obj/x86_64/default'
>>>> make: *** [kernel] Error 2
>>>>
>>>> Does anyone have OFED 1.2.5.4 building for SLES 9 SP4?
>>>>
>>>> Thanks
>>>>
>>>>  
>>>> ------------------------------------------------------------------------ 
>>>>
>>>>
>>>> Subject:
>>>> [ofa-general] OFED-1.2.5 backport patches for SLES9 SP4
>>>> From:
>>>> "Moshe Kazir" <moshek at voltaire.com>
>>>> Date:
>>>> Sun, 25 Nov 2007 09:59:26 +0200
>>>> To:
>>>> "Vladimir Sokolovsky" <vlad at mellanox.co.il>, 
>>>> <general at lists.openfabrics.org>
>>>>
>>>>
>>>> The attached files do the work.
>>>>
>>>> OFED_1.2.5_sles9_sp4_configure.diff includes the changes to the
>>>> configure file.
>>>> OFED_1.2.5_sles9_sp4_backport.diff includes the changes required in the
>>>> kernel_patches and kernel_addons directories.
>>>>
>>>> Moshe
>>>> ____________________________________________________________
>>>> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>>>>  
>>>> Voltaire - The Grid Backbone
>>>>  
>>>>  www.voltaire.com
>>>>
>>
>> Hi David,
>> Please try the latest OFED-1.2.5.4-20071219-0824.tgz build on your 
>> SLES9SP4.
>>
>> http://www.openfabrics.org/builds/connectx/OFED-1.2.5.4-20071219-0824.tgz
>>
>>
>> Thanks,
>> Vladimir
>>
> 
> 


From moshek at voltaire.com  Sun Dec 30 01:00:12 2007
From: moshek at voltaire.com (Moshe Kazir)
Date: Sun, 30 Dec 2007 11:00:12 +0200
Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to
	selectsp4 patches for SLES9 kernel with minor versions
	equalor	greater	than 305
In-Reply-To: <47774AA3.90502@dev.mellanox.co.il>
References: <39C75744D164D948A170E9792AF8E7CA4D2CE1@exil.voltaire.com>	<47670337.6080607@lnxi.com><476A15BD.1050505@dev.mellanox.co.il>
	<4773F272.8070309@lnxi.com> <47774AA3.90502@dev.mellanox.co.il>
Message-ID: <39C75744D164D948A170E9792AF8E7CAC5AC9A@exil.voltaire.com>

He wrote ->

> commit 3db835ee0edb792b120ba10c8066e3d4409de2d7
> 
> git://git.openfabrics.org/ofed_1_2/linux-2.6.git 

Moshe
____________________________________________________________
Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
 
Voltaire - The Grid Backbone
 
 www.voltaire.com

  


From vlad at dev.mellanox.co.il  Sun Dec 30 01:07:47 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Sun, 30 Dec 2007 11:07:47 +0200
Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to
	selectsp4
	patches for SLES9 kernel with minor versions equalor	greater	than 305
In-Reply-To: <39C75744D164D948A170E9792AF8E7CAC5AC9A@exil.voltaire.com>
References: <39C75744D164D948A170E9792AF8E7CA4D2CE1@exil.voltaire.com>	<47670337.6080607@lnxi.com><476A15BD.1050505@dev.mellanox.co.il>
	<4773F272.8070309@lnxi.com> <47774AA3.90502@dev.mellanox.co.il>
	<39C75744D164D948A170E9792AF8E7CAC5AC9A@exil.voltaire.com>
Message-ID: <47775FE3.400@dev.mellanox.co.il>

git://git.openfabrics.org/ofed_1_2/linux-2.6.git
is my git tree.

David can't commit his patches to this tree (he does not have permission),
so he probably has a clone of my tree somewhere.

Regards,
Vladimir

Moshe Kazir wrote:
> He wrote ->
> 
>> commit 3db835ee0edb792b120ba10c8066e3d4409de2d7
>>
>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git 
> 
> Moshe
> ____________________________________________________________
> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>  
> Voltaire - The Grid Backbone
>  
>  www.voltaire.com
> 
>   
> 
> -----Original Message-----
> From: general-bounces at lists.openfabrics.org
> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Vladimir
> Sokolovsky
> Sent: Sunday, December 30, 2007 9:37 AM
> To: David B. Anderson
> Cc: general at lists.openfabrics.org
> Subject: Re: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to
> selectsp4 patches for SLES9 kernel with minor versions equalor greater
> than 305
> 
> Hi David,
> Where can I get your patches?
> 
> Regards,
> Vladimir
> 
> David B. Anderson wrote:
>> Hi Vladimir,
>>
>> The four patches named below are what I'm using to get the OFED 1.2.5 
>> kernel to build for SLES9 SP4.
>>
>> commit 3db835ee0edb792b120ba10c8066e3d4409de2d7
>>
>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git
>>
>>
>> The patches are:
>>
>> [PATCH 1/4] LNXI changed ofed_scripts configure to select sp4 patches
>>
>> [PATCH 2/4] LNXI created backport patch addr_8802_to_2_6_5-7_308
>>
>> [PATCH 3/4] LNXI fixed backport/2.6.5_sles9_sp4/rds_to_2_6_9.patch
>>
>> [PATCH 4/4] LNXI fixed backport/2.6.5_sles9_sp4/cxg3_to_2_6_20.patch
>>
>>
>> I've tested these on my cluster.
>>
>> Note: I changed your patch to the ofed_scripts/configure script, so 
>> that even if the SLES9
>>
>> kernel is greater than 309 it will not revert to using SP3 patches.
>>
>>
>> David
>>
>>
>> Vladimir Sokolovsky wrote:
>>> David B. Anderson wrote:
>>>> I've all of these patches plus the following patch
>>>>
>>>>    kernel_patches/backport/2.6.5_sles9_sp4/cxgb3_remove_eeh.patch
>>>>
>>>> My current git repo is
>>>> git://git.openfabrics.org/ofed_1_2/linux-2.6.git
>>>> commit: 6974c285e6fb06264f570f9cf919865bab66c9e6
>>>>
>>>> My patch that I posted before fixes the kernel configure script so 
>>>> that it applies 2.6.5_sles9_sp4 patches for the SP4 release kernel 
>>>> of
>>>> 2.6.5-7.308 and above. The configure patch from 
>>>> FED_1.2.5_sles9_sp4_configure.diff has 2.6.5-7.305* as the only 
>>>> valid
>>>> SP4 kernel which is incorrect. I get the same compiler error as
> before.
>>>>
>>>>
>>>> Moshe Kazir wrote:
>>>>>  See patches in the attached message.
>>>>>
>>>>> It was applied by Vlad.
>>>>>
>>>>> Moshe
>>>>>
>>>>> ____________________________________________________________
>>>>> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>>>>>  
>>>>> Voltaire - The Grid Backbone
>>>>>  
>>>>>  www.voltaire.com
>>>>>
>>>>>  
>>>>>
>>>>> -----Original Message-----
>>>>> From: general-bounces at lists.openfabrics.org
>>>>> [mailto:general-bounces at lists.openfabrics.org] On Behalf Of David
> B.
>>>>> Anderson
>>>>> Sent: Saturday, December 15, 2007 3:31 AM
>>>>> To: general at lists.openfabrics.org; vlad at mellanox.co.il
>>>>> Subject: [ofa-general] [PATCH] LNXI Fixed ofed_scripts configure to
> 
>>>>> select sp4 patches for SLES9 kernel with minor versions equal or 
>>>>> greater than 305
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I've created the following patch for OFED 1.2.5.4 to have the 
>>>>> kernel for
>>>>>
>>>>> SLES9 SP4 recognized (2.6.5-7.308).
>>>>>
>>>>> Even with the patch I then had two back port patches not apply 
>>>>> cleanly (cxg3_to_2_6_20.patch, rds_to_2_6_9.patch). I hand patched 
>>>>> them but now I'm getting the following compiler errors:
>>>>>
>>>>> In file included from
>>>>> /usr/src/linux-2.6.5-7.308/include/linux/module.h:10,
>>>>>                  from
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons
>>>>> /back
>>>>>
>>>>> port/2.6.5_sles9_sp4/include/linux/module.h:4,
>>>>>                  from
>>>>> /usr/src/linux-2.6.5-7.308/include/linux/device.h:21,
>>>>>                  from
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons
>>>>> /back
>>>>>
>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:4,
>>>>>                  from
>>>>> /usr/src/linux-2.6.5-7.308/include/linux/netdevice.h:38,
>>>>>                  from
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons
>>>>> /back
>>>>>
>>>>> port/2.6.5_sles9_sp4/include/linux/netdevice.h:4,
>>>>>                  from
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin
>>>>> iband
>>>>>
>>>>> /core/addr.c:32:
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons
>>>>> /back
>>>>>
>>>>> port/2.6.5_sles9_sp4/include/linux/sched.h:8: warning: static 
>>>>> declaration for `wait_for_completion_timeout' follows non-static 
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin
>>>>> iband
>>>>>
>>>>> /core/addr.c:67: warning: initialization from incompatible pointer 
>>>>> type 
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin
>>>>> iband
>>>>>
>>>>> /core/addr.c: In function `addr_resolve_remote':
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin
>>>>> iband
>>>>>
>>>>> /core/addr.c:192: error: structure has no member named `idev'
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin
>>>>> iband
>>>>>
>>>>> /core/addr.c:193: error: structure has no member named `idev'
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin
>>>>> iband
>>>>>
>>>>> /core/addr.c:197: error: structure has no member named `idev'
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infin
>>>>> iband
>>>>>
>>>>> /core/addr.c: At top level:
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons
>>>>> /back
>>>>>
>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:48: warning: 
>>>>> `class_create' defined but not used 
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons
>>>>> /back
>>>>>
>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:82: warning: 
>>>>> `class_destroy' defined but not used 
>>>>> /home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/kernel_addons
>>>>> /back
>>>>>
>>>>> port/2.6.5_sles9_sp4/include/linux/device.h:108: warning: 
>>>>> `class_device_create' defined but not used
>>>>> make[6]: ***
>>>>> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infi
>>>>> niban
>>>>>
>>>>> d/core/addr.o] Error 1
>>>>> make[5]: ***
>>>>> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infi
>>>>> niban
>>>>>
>>>>> d/core] Error 2
>>>>> make[4]: ***
>>>>> [/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default/drivers/infi
>>>>> niban
>>>>>
>>>>> d] Error 2
>>>>> make[3]: ***
>>>>> [_module_/home/danderson/rpmbuild/BUILD/ofa_kernel/obj/default] 
>>>>> Error 2
>>>>> make[2]: *** [modules] Error 2
>>>>> make[1]: *** [modules] Error 2
>>>>> make[1]: Leaving directory
>>>>> `/usr/src/linux-2.6.5-7.308-obj/x86_64/default'
>>>>> make: *** [kernel] Error 2
>>>>>
>>>>> Does anyone have OFED 1.2.5.4 building for SLES 9 SP4?
>>>>>
>>>>> Thanks
>>>>>
>>>>>  
>>>>> -------------------------------------------------------------------
>>>>> -----
>>>>>
>>>>>
>>>>> Subject:
>>>>> [ofa-general] OFED-1.2.5 backport patches for SLES9 SP4
>>>>> From:
>>>>> "Moshe Kazir" <moshek at voltaire.com>
>>>>> Date:
>>>>> Sun, 25 Nov 2007 09:59:26 +0200
>>>>> To:
>>>>> "Vladimir Sokolovsky" <vlad at mellanox.co.il>, 
>>>>> <general at lists.openfabrics.org>
>>>>>
>>>>> To:
>>>>> "Vladimir Sokolovsky" <vlad at mellanox.co.il>, 
>>>>> <general at lists.openfabrics.org>
>>>>>
>>>>>
>>>>> The attached files do the work.
>>>>>
>>>>> OFED_1.2.5_sles9_sp4_configure.diff  include the changes in the 
>>>>> configure file.
>>>>> OFED_1.2.5_sles9_sp4_backport.diff  include the canges requiered in
> 
>>>>> the kernel_patche and kernel_addons directories.
>>>>>
>>>>> Moshe
>>>>> ____________________________________________________________
>>>>> Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)
>>>>>  
>>>>> Voltaire - The Grid Backbone
>>>>>  
>>>>>  www.voltaire.com
>>>>>
>>> Hi David,
>>> Please try the latest OFED-1.2.5.4-20071219-0824.tgz build on your 
>>> SLES9SP4.
>>>
>>> http://www.openfabrics.org/builds/connectx/OFED-1.2.5.4-20071219-0824.tgz
>>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>
> 
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit
> http://openib.org/mailman/listinfo/openib-general
> 


From dotanb at dev.mellanox.co.il  Sun Dec 30 01:46:36 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Sun, 30 Dec 2007 11:46:36 +0200
Subject: [ofa-general] Registration of Shared memory with read only
	permission fails
Message-ID: <477768FC.9060100@dev.mellanox.co.il>

Hi.

I allocated a shared memory segment with read-only permission and tried to
register it (as read-only memory), but the registration failed.
(At the kernel level, the call to get_user_pages in
drivers/infiniband/core/umem.c failed.)

Attached is a short program that reproduces this error.

Is this expected behavior?

thanks
Dotan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: reg_mr.c
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20071230/ccff87c6/attachment.c>
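
[Editorial sketch] The attachment was scrubbed from the archive, so the code
below is not Dotan's reg_mr.c. It is only a minimal guess at the setup the
message describes, assuming the segment is a System V one attached with
SHM_RDONLY. The helper names attach_readonly_shm and reg_readonly and the
HAVE_VERBS build flag are hypothetical. The verbs part is compiled only when
HAVE_VERBS is defined, since it needs libibverbs and an HCA.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper (not taken from reg_mr.c): create a private
 * System V segment and attach it read-only. Returns the mapped address,
 * or NULL on failure. The segment is marked for removal right after the
 * attach attempt, so it goes away once it is detached.
 */
static void *attach_readonly_shm(size_t len)
{
	int shmid;
	void *addr;

	assert(len > 0);

	shmid = shmget(IPC_PRIVATE, len, IPC_CREAT | 0600);
	if (shmid < 0)
		return NULL;

	addr = shmat(shmid, NULL, SHM_RDONLY);
	shmctl(shmid, IPC_RMID, NULL);
	if (addr == (void *)-1)
		return NULL;
	return addr;
}

#ifdef HAVE_VERBS		/* assumed flag; needs libibverbs and an HCA */
#include <infiniband/verbs.h>

/* Register the buffer with no remote or local-write access requested. */
static struct ibv_mr *reg_readonly(struct ibv_pd *pd, void *buf, size_t len)
{
	return ibv_reg_mr(pd, buf, len, 0);
}
#endif
```

With a protection domain in hand, the reported failure would correspond to
reg_readonly() returning NULL for a buffer obtained from
attach_readonly_shm().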

From vlad at lists.openfabrics.org  Sun Dec 30 03:07:53 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Sun, 30 Dec 2007 03:07:53 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071230-0200 daily build status
Message-ID: <20071230110753.E3E8FE6009D@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on x86_64 with linux-2.6.22
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18
Passed on ppc64 with linux-2.6.17
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.13
Passed on powerpc with linux-2.6.13
Passed on ia64 with linux-2.6.14
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.12
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.16
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.12
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.14
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.18-53.el5
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ppc64 with linux-2.6.18-8.el5
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:


From hrosenstock at xsigo.com  Sun Dec 30 08:21:32 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sun, 30 Dec 2007 08:21:32 -0800
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <20071229183459.GB19160@sashak.voltaire.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
	<20071227161325.GB13378@sashak.voltaire.com>
	<1198772437.23289.209.camel@hrosenstock-ws.xsigo.com>
	<20071227164028.GE13378@sashak.voltaire.com>
	<4774135E.6060601@dev.mellanox.co.il>
	<20071229183459.GB19160@sashak.voltaire.com>
Message-ID: <1199031692.23289.322.camel@hrosenstock-ws.xsigo.com>

On Sat, 2007-12-29 at 18:34 +0000, Sasha Khapyorsky wrote:
> On 23:04 Thu 27 Dec     , Yevgeny Kliteynik wrote:
> >  Sasha Khapyorsky wrote:
> > > On 08:20 Thu 27 Dec     , Hal Rosenstock wrote:
> > >> On Thu, 2007-12-27 at 16:13 +0000, Sasha Khapyorsky wrote:
> > >>> Hi Hal,
> > >>>
> > >>> On 07:00 Thu 27 Dec     , Hal Rosenstock wrote:
> > >>>> On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
> > >>>>> Fixing wrong setting of partition enforcement bits on switch ports.
> > >>>>> When an HCA port is configured with a certain pkey, the peer port
> > >>>>> on the switch should turn on outbound partition enforcement bit only.
> > >>>>> Turning on the inbound enforcement will cause the switch to drop
> > >>>>> valid packets if the HCA is partial member.
> > >>>> Inbound enforcement is actually the more useful case. If there is
> > >>>> inbound enforcement, outbound enforcement doesn't add much.
> > >>>>
> > >>>> In the case of partial only (not both partial and full) membership, the
> > >>>> peer switch physical port would need to be set to full membership.
> > >>> Then it could break outbound enforcement. Isn't it?
> > >> What I wrote was wrong. Limited pkey is sufficient. See o18-14
> > > Do you mean last paragraph of o18-14? Assuming so - it makes sense. So
> > > we need just revert the original patch.
> > 
> >  Almost true. It would be nice to keep the new condition:
> > 
> >  -	if ((p_pi->vl_enforce & 0xc) == (0xc) * (enforce == TRUE)) {
> >  +	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
> >  +	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {
> 
> I liked the original version - it is shorter and looks cleaner for me.
> 
> I'm reverting entire patch.

Is all the pkey handling back to being identical to before Yevgeny's 
change(s) now ?

Also, which branch(es) ? master and ofed_1_3 ?

BTW, are master and ofed_1_3 different right now ?

-- Hal

> Sasha
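
[Editorial sketch] The two conditions quoted above can be compared side by
side. The sketch assumes the usual PortInfo layout in which bit 3 of
vl_enforce is inbound partition enforcement and bit 2 is outbound, and it
reads each condition as a test of whether the port's bits already match the
desired state. The macro and function names are hypothetical, not OpenSM
identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions inside the vl_enforce byte. */
#define PART_ENFORCE_IN   0x8
#define PART_ENFORCE_OUT  0x4
#define PART_ENFORCE_MASK (PART_ENFORCE_IN | PART_ENFORCE_OUT)

/* Original check: both bits set when enforcing, both clear otherwise. */
static bool bits_match_original(uint8_t vl_enforce, bool enforce)
{
	return (vl_enforce & PART_ENFORCE_MASK) ==
	       (enforce ? PART_ENFORCE_MASK : 0);
}

/* Proposed check: only the outbound bit set when enforcing. */
static bool bits_match_proposed(uint8_t vl_enforce, bool enforce)
{
	uint8_t bits = vl_enforce & PART_ENFORCE_MASK;

	return (bits == PART_ENFORCE_OUT && enforce) ||
	       (bits == 0 && !enforce);
}

/* Replace the two enforcement bits and leave the other six untouched. */
static uint8_t with_enforcement(uint8_t vl_enforce, uint8_t bits)
{
	assert((bits & ~PART_ENFORCE_MASK) == 0);
	return (uint8_t)((vl_enforce & ~PART_ENFORCE_MASK) | bits);
}
```

The two checks agree when enforcement is off and differ only in which bit
pattern counts as "already enforcing": 0xc for the original, 0x4 for the
proposed version.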


From hrosenstock at xsigo.com  Sun Dec 30 08:34:05 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sun, 30 Dec 2007 08:34:05 -0800
Subject: [ofa-general] Re: [PATCH] osm/osm_mcast_mgr.c: coredump in
	ofed_1_2
In-Reply-To: <20071229185057.GC19160@sashak.voltaire.com>
References: <47724048.2000302@dev.mellanox.co.il>
	<20071226154218.GN7012@sashak.voltaire.com>
	<1198767473.23289.195.camel@hrosenstock-ws.xsigo.com>
	<20071227164737.GF13378@sashak.voltaire.com>
	<1198773804.23289.220.camel@hrosenstock-ws.xsigo.com>
	<20071227170758.GI13378@sashak.voltaire.com>
	<1198775715.23289.225.camel@hrosenstock-ws.xsigo.com>
	<20071229185057.GC19160@sashak.voltaire.com>
Message-ID: <1199032445.23289.334.camel@hrosenstock-ws.xsigo.com>

Hi Sasha,

On Sat, 2007-12-29 at 18:50 +0000, Sasha Khapyorsky wrote:
> Hi Hal,
> 
> On 09:15 Thu 27 Dec     , Hal Rosenstock wrote:
> > 
> > There were other patches I supplied, etc. I don't think it was just the
> > outstanding counter fix but others too.
> 
> I may be wrong about it, but I don't remember this now. If you like me
> to review those patches again, let's review.

There were multiple patches supplied for OFED 1.2 management, but the
time for them has long (several months) come and gone, and at this point
I'm not sure I want to go back. My comment was more about the change of
incorporating an OFED 1.2 patch now, when such patches were previously
effectively rejected.

> > > > I can dig out the emails if this is really needed.
> > > > 
> > > > > I don't backport non critical fixes and improvements from master to 1.2
> > > > > - that is true.
> > > > 
> > > > There were a number of fixes originally supplied for 1.2 up ported to
> > > > 1.3. Guess you could always consider them non critical although I would
> > > > beg to differ on some of those.
> > > 
> > > Always - no, but in general I prefer to run master in the field.
> > 
> > Not everyone has that luxury (and master has not even shipped as 1.3
> > yet).
> 
> The recent OpenSM (and management) versions are available as tarballs
> and/or from the git repo.

Yes, but that's largely a non sequitur. It does not address the issue
that not everyone can update to the latest and greatest just because a
maintainer makes a new release (or because it meets the criteria for OFED
approval, which has not yet happened for these). In fact, there
still appear to be reasonably large ongoing changes in the OpenSM area.

I believe this to be an ongoing issue (in the sense it will shortly
exist for OFED 1.3 as soon as it is in the "can"). I think the EWG needs
to address this.

> > I tried to get this changed but seems no one else shares this
> > issue. As I said before, the implication of this is that 1.2 is not
> > officially supported based on this policy (although there are fixes
> > going into 1.2 as 1.2.5 is still "live").
> 
> There should not be connection between "this policy" and the official
> OFED 1.2 support which is provided by software/hardware vendors.

Maybe support is the wrong word; not sure what the right word is. Sure;
support comes from the vendor (which in the case of OpenSM is only
perhaps Mellanox and those using it in their platforms).

In any case, it is a fact that there is some ongoing maintainer support
for OFED 1.2 (which is not the case for management, with the exception of
the point patch you incorporated into your ofed_1_2 tree).

-- Hal

> Sasha


From hrosenstock at xsigo.com  Sun Dec 30 08:38:30 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Sun, 30 Dec 2007 08:38:30 -0800
Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <20071229182718.GA19160@sashak.voltaire.com>
References: <4770CDCE.8040200@dev.mellanox.co.il>
	<20071229182718.GA19160@sashak.voltaire.com>
Message-ID: <1199032710.23289.340.camel@hrosenstock-ws.xsigo.com>

On Sat, 2007-12-29 at 18:27 +0000, Sasha Khapyorsky wrote:
> This improves handling of mcast join/leave requests storming. Now mcast
> routing will be recalculated for all mcast groups where changes occurred
> and not one by one. For this it queues mcast groups instead of mcast
> rerouting requests, this also makes state_mgr idle queue obsolete.

Looks like a nice improvement.

What testing has been done with this change ? Can you comment on any
results ?

For which branches is this change being proposed ?

-- Hal
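
[Editorial sketch] The batching idea described in the quoted commit message
can be illustrated in isolation. This is not the OpenSM code from the patch
below: every type and function name here is hypothetical, a pthread mutex
stands in for the complib spinlock, and the callbacks only count calls. It
shows one queue of pending groups that is drained in a single pass, with the
per-switch table download done once per batch rather than once per request.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the OpenSM request context and its list. */
struct mcast_req {
	struct mcast_req *next;
	uint16_t mlid;
};

struct mcast_queue {
	pthread_mutex_t lock;
	struct mcast_req *head, *tail;
};

#define MCAST_QUEUE_INIT { PTHREAD_MUTEX_INITIALIZER, NULL, NULL }

/* Producer side: called once per join/leave request. */
static int mcast_queue_push(struct mcast_queue *q, uint16_t mlid)
{
	struct mcast_req *r = calloc(1, sizeof(*r));

	if (!r)
		return -1;
	r->mlid = mlid;
	pthread_mutex_lock(&q->lock);
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
	pthread_mutex_unlock(&q->lock);
	return 0;
}

/*
 * Consumer side: detach the whole list under the lock, rebuild the tree
 * for each queued group, then program the switches once for the batch.
 * Returns the number of requests drained.
 */
static unsigned mcast_queue_drain(struct mcast_queue *q,
				  void (*rebuild_tree)(uint16_t mlid),
				  void (*program_switches)(void))
{
	struct mcast_req *r, *next;
	unsigned n = 0;

	assert(rebuild_tree && program_switches);

	pthread_mutex_lock(&q->lock);
	r = q->head;
	q->head = q->tail = NULL;
	pthread_mutex_unlock(&q->lock);

	for (; r; r = next) {
		next = r->next;
		rebuild_tree(r->mlid);
		free(r);
		n++;
	}
	if (n)
		program_switches();
	return n;
}

/* Example callbacks that only count how often they are invoked. */
static unsigned trees_rebuilt, table_downloads;
static void count_tree(uint16_t mlid) { (void)mlid; trees_rebuilt++; }
static void count_download(void) { table_downloads++; }
```

Pushing three requests and draining once calls count_tree three times and
count_download once. The patch itself goes further and skips a group whose
last_change_id already equals its last_tree_id, which this sketch leaves out.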

> Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> ---
> 
> Hi Yevgeny,
> 
> For me it looks that it should solve the original problem (mcast group
> list is purged in osm_mcast_mgr_process()). Could you review and ideally
> test it? Thanks.
> 
> Sasha
> 
> ---
>  opensm/include/opensm/osm_mcast_mgr.h |   14 +--
>  opensm/include/opensm/osm_multicast.h |    2 +
>  opensm/include/opensm/osm_sm.h        |    2 +
>  opensm/include/opensm/osm_state_mgr.h |   95 -----------------
>  opensm/opensm/osm_mcast_mgr.c         |  187 +++++++++++++++------------------
>  opensm/opensm/osm_sm.c                |   70 ++++++-------
>  opensm/opensm/osm_state_mgr.c         |  138 +------------------------
>  7 files changed, 130 insertions(+), 378 deletions(-)
> 
> diff --git a/opensm/include/opensm/osm_mcast_mgr.h b/opensm/include/opensm/osm_mcast_mgr.h
> index 3e0b761..47b67ed 100644
> --- a/opensm/include/opensm/osm_mcast_mgr.h
> +++ b/opensm/include/opensm/osm_mcast_mgr.h
> @@ -100,7 +100,6 @@ typedef struct _osm_mcast_mgr {
>  	osm_req_t *p_req;
>  	osm_log_t *p_log;
>  	cl_plock_t *p_lock;
> -
>  } osm_mcast_mgr_t;
>  /*
>  * FIELDS
> @@ -253,25 +252,22 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr);
>  *	Multicast Manager, Node Info Response Controller
>  *********/
>  
> -/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgrp_cb
> +/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgroups
>  * NAME
> -*	osm_mcast_mgr_process_mgrp_cb
> +*	osm_mcast_mgr_process_mgroups
>  *
>  * DESCRIPTION
> -*	Callback entry point for the osm_mcast_mgr_process_mgrp function.
> +*	Process only requested mcast groups.
>  *
>  * SYNOPSIS
>  */
>  osm_signal_t
> -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2);
> +osm_mcast_mgr_process_mgroups(IN osm_mcast_mgr_t *p_mgr);
>  /*
>  * PARAMETERS
> -*	(Context1) p_mgr
> +*	p_mgr
>  *		[in] Pointer to an osm_mcast_mgr_t object.
>  *
> -*	(Context2) p_mgrp
> -*		[in] Pointer to the multicast group to process.
> -*
>  * RETURN VALUES
>  *	IB_SUCCESS
>  *
> diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h
> index 729a2ea..f442a45 100644
> --- a/opensm/include/opensm/osm_multicast.h
> +++ b/opensm/include/opensm/osm_multicast.h
> @@ -50,6 +50,7 @@
>  
>  #include <iba/ib_types.h>
>  #include <complib/cl_qmap.h>
> +#include <complib/cl_qlist.h>
>  #include <complib/cl_spinlock.h>
>  #include <opensm/osm_base.h>
>  #include <opensm/osm_mtree.h>
> @@ -121,6 +122,7 @@ const char *osm_get_mcast_req_type_str(IN osm_mcast_req_type_t req_type);
>  * SYNOPSIS
>  */
>  typedef struct osm_mcast_mgr_ctxt {
> +	cl_list_item_t list_item;
>  	ib_net16_t mlid;
>  	osm_mcast_req_type_t req_type;
>  	ib_net64_t port_guid;
> diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h
> index 4c6ce27..a676cd6 100644
> --- a/opensm/include/opensm/osm_sm.h
> +++ b/opensm/include/opensm/osm_sm.h
> @@ -140,6 +140,8 @@ typedef struct osm_sm {
>  	cl_dispatcher_t *p_disp;
>  	cl_plock_t *p_lock;
>  	atomic32_t sm_trans_id;
> +	cl_spinlock_t mgrp_lock;
> +	cl_qlist_t mgrp_list;
>  	osm_req_t req;
>  	osm_resp_t resp;
>  	osm_ni_rcv_t ni_rcv;
> diff --git a/opensm/include/opensm/osm_state_mgr.h b/opensm/include/opensm/osm_state_mgr.h
> index dada097..f51593a 100644
> --- a/opensm/include/opensm/osm_state_mgr.h
> +++ b/opensm/include/opensm/osm_state_mgr.h
> @@ -109,8 +109,6 @@ typedef struct _osm_state_mgr {
>  	osm_stats_t *p_stats;
>  	struct _osm_sm_state_mgr *p_sm_state_mgr;
>  	const osm_sm_mad_ctrl_t *p_mad_ctrl;
> -	cl_spinlock_t idle_lock;
> -	cl_qlist_t idle_time_list;
>  	cl_plock_t *p_lock;
>  	cl_event_t *p_subnet_up_event;
>  	osm_sm_state_t state;
> @@ -172,99 +170,6 @@ typedef struct _osm_state_mgr {
>  *	State Manager object
>  *********/
>  
> -/****s* OpenSM: State Manager/_osm_idle_item
> -* NAME
> -*	_osm_idle_item
> -*
> -* DESCRIPTION
> -*	Idle item.
> -*
> -* SYNOPSIS
> -*/
> -
> -typedef osm_signal_t(*osm_pfn_start_t) (IN void *context1, IN void *context2);
> -
> -typedef void
> - (*osm_pfn_done_t) (IN void *context1, IN void *context2);
> -
> -typedef struct _osm_idle_item {
> -	cl_list_item_t list_item;
> -	void *context1;
> -	void *context2;
> -	osm_pfn_start_t pfn_start;
> -	osm_pfn_done_t pfn_done;
> -} osm_idle_item_t;
> -
> -/*
> -* FIELDS
> -*	list_item
> -*		list item.
> -*
> -*	context1
> -*		Context pointer
> -*
> -*	context2
> -*		Context pointer
> -*
> -*	pfn_start
> -*		Pointer to the start function.
> -*
> -*	pfn_done
> -*		Pointer to the dine function.
> -* SEE ALSO
> -*	State Manager object
> -*********/
> -
> -/****f* OpenSM: State Manager/osm_state_mgr_process_idle
> -* NAME
> -*	osm_state_mgr_process_idle
> -*
> -* DESCRIPTION
> -*	Formulates the osm_idle_item and inserts it into the queue and
> -*	signals the state manager.
> -*
> -* SYNOPSIS
> -*/
> -
> -ib_api_status_t
> -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
> -			   IN osm_pfn_start_t pfn_start,
> -			   IN osm_pfn_done_t pfn_done,
> -			   void *context1, void *context2);
> -
> -/*
> -* PARAMETERS
> -*	p_mgr
> -*		[in] Pointer to a State Manager object to construct.
> -*
> -*	pfn_start
> -*		[in] Pointer the start function which will be called at
> -*			idle time.
> -*
> -*	pfn_done
> -*		[in] pointer the done function which will be called
> -*			when outstanding smps is zero
> -*
> -*	context1
> -*		[in] Pointer to void
> -*
> -*	context2
> -*		[in] Pointer to void
> -*
> -* RETURN VALUE
> -*	IB_SUCCESS or IB_ERROR
> -*
> -* NOTES
> -*	Allows osm_state_mgr_destroy
> -*
> -*	Calling osm_state_mgr_construct is a prerequisite to calling any other
> -*	method except osm_state_mgr_init.
> -*
> -* SEE ALSO
> -*	State Manager object, osm_state_mgr_init,
> -*	osm_state_mgr_destroy
> -*********/
> -
>  /****f* OpenSM: State Manager/osm_state_mgr_construct
>  * NAME
>  *	osm_state_mgr_construct
> diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
> index 50b95fd..f51a45a 100644
> --- a/opensm/opensm/osm_mcast_mgr.c
> +++ b/opensm/opensm/osm_mcast_mgr.c
> @@ -815,7 +815,7 @@ static osm_mtree_node_t *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr,
>  	}
>  
>  	free(list_array);
> -      Exit:
> +Exit:
>  	OSM_LOG_EXIT(p_mgr->p_log);
>  	return (p_mtn);
>  }
> @@ -932,7 +932,7 @@ __osm_mcast_mgr_build_spanning_tree(osm_mcast_mgr_t * const p_mgr,
>  		"Configured MLID 0x%X for %u ports, max tree depth = %u\n",
>  		cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth);
>  
> -      Exit:
> +Exit:
>  	OSM_LOG_EXIT(p_mgr->p_log);
>  	return (status);
>  }
> @@ -1171,7 +1171,7 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t * const p_mgr,
>  		}
>  	}
>  
> -      Exit:
> +Exit:
>  	OSM_LOG_EXIT(p_mgr->p_log);
>  	return (status);
>  }
> @@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t * const p_mgr,
>  							   port_guid);
>  	}
>  
> -      Exit:
> +Exit:
>  	OSM_LOG_EXIT(p_mgr->p_log);
>  	return (status);
>  }
>  
>  /**********************************************************************
>   Process the entire group.
> -
>   NOTE : The lock should be held externally!
>   **********************************************************************/
> -static osm_signal_t
> -osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
> -			   IN osm_mgrp_t * const p_mgrp,
> -			   IN osm_mcast_req_type_t req_type,
> -			   IN ib_net64_t port_guid)
> +static ib_api_status_t
> +mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
> +		       IN osm_mgrp_t * const p_mgrp,
> +		       IN osm_mcast_req_type_t req_type,
> +		       IN ib_net64_t port_guid)
>  {
> -	osm_signal_t signal = OSM_SIGNAL_DONE;
>  	ib_api_status_t status;
> -	osm_switch_t *p_sw;
> -	cl_qmap_t *p_sw_tbl;
> -	boolean_t pending_transactions = FALSE;
>  
>  	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp);
>  
> -	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> -
>  	status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type, port_guid);
>  	if (status != IB_SUCCESS) {
>  		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> -			"osm_mcast_mgr_process_mgrp: ERR 0A19: "
> +			"mcast_mgr_process_mgrp: ERR 0A19: "
>  			"Unable to create spanning tree (%s)\n",
>  			ib_get_err_str(status));
> -
>  		goto Exit;
>  	}
> +	p_mgrp->last_tree_id = p_mgrp->last_change_id;
>  
> -	/*
> -	   Walk the switches and download the tables for each.
> +	/* Remove MGRP only if osm_mcm_port_t count is 0 and
> +	 * Not a well known group
>  	 */
> -	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
> -	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
> -		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> -		if (signal == OSM_SIGNAL_DONE_PENDING)
> -			pending_transactions = TRUE;
> -
> -		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> +	if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known) {
> +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> +			"mcast_mgr_process_mgrp: "
> +			"Destroying mgrp with lid:0x%X\n",
> +			cl_ntoh16(p_mgrp->mlid));
> +		/* Send a Report to any InformInfo registered for
> +		   Trap 67 : MCGroup delete */
> +		osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
> +					    p_mgrp);
> +		cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
> +				    (cl_map_item_t *) p_mgrp);
> +		osm_mgrp_delete(p_mgrp);
>  	}
>  
> -	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
> -
> -      Exit:
> +Exit:
>  	OSM_LOG_EXIT(p_mgr->p_log);
> -
> -	if (pending_transactions == TRUE)
> -		return (OSM_SIGNAL_DONE_PENDING);
> -	else
> -		return (OSM_SIGNAL_DONE);
> +	return status;
>  }
>  
>  /**********************************************************************
> @@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
>  	osm_switch_t *p_sw;
>  	cl_qmap_t *p_sw_tbl;
>  	cl_qmap_t *p_mcast_tbl;
> +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
>  	osm_mgrp_t *p_mgrp;
> -	ib_api_status_t status;
>  	boolean_t pending_transactions = FALSE;
>  
>  	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process);
>  
>  	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> -
>  	p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl;
>  	/*
>  	   While holding the lock, iterate over all the established
> @@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
>  		/* We reached here due to some change that caused a heavy sweep
>  		   of the subnet. Not due to a specific multicast request.
>  		   So the request type is subnet_change and the port guid is 0. */
> -		status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp,
> -						    OSM_MCAST_REQ_TYPE_SUBNET_CHANGE,
> -						    0);
> -		if (status != IB_SUCCESS) {
> -			osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> -				"osm_mcast_mgr_process: ERR 0A20: "
> -				"Unable to create spanning tree (%s)\n",
> -				ib_get_err_str(status));
> -		}
> -
> +		mcast_mgr_process_mgrp(p_mgr, p_mgrp,
> +				       OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0);
>  		p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item);
>  	}
>  
> @@ -1364,10 +1347,14 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
>  		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
>  		if (signal == OSM_SIGNAL_DONE_PENDING)
>  			pending_transactions = TRUE;
> -
>  		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
>  	}
>  
> +	while (!cl_is_qlist_empty(p_list)) {
> +		cl_list_item_t *p = cl_qlist_remove_head(p_list);
> +		free(p);
> +	}
> +
>  	CL_PLOCK_RELEASE(p_mgr->p_lock);
>  
>  	OSM_LOG_EXIT(p_mgr->p_log);
> @@ -1395,79 +1382,79 @@ osm_mgrp_t *__get_mgrp_by_mlid(IN osm_mcast_mgr_t * const p_mgr,
>  
>  /**********************************************************************
>    This is the function that is invoked during idle time to handle the
> -  process request. Context1 is simply the osm_mcast_mgr_t*, Context2
> -  hold the mlid, port guid and action (join/leave/delete) required.
> +  process request for mcast groups where join/leave/delete was required.
>   **********************************************************************/
> -osm_signal_t
> -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2)
> +osm_signal_t osm_mcast_mgr_process_mgroups(osm_mcast_mgr_t * p_mgr)
>  {
> -	osm_mcast_mgr_t *p_mgr = (osm_mcast_mgr_t *) Context1;
> +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
> +	osm_switch_t *p_sw;
> +	cl_qmap_t *p_sw_tbl;
>  	osm_mgrp_t *p_mgrp;
>  	ib_net16_t mlid;
> -	osm_signal_t signal = OSM_SIGNAL_DONE;
> -	osm_mcast_mgr_ctxt_t *p_ctxt = (osm_mcast_mgr_ctxt_t *) Context2;
> -	osm_mcast_req_type_t req_type = p_ctxt->req_type;
> -	ib_net64_t port_guid = p_ctxt->port_guid;
> -
> -	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp_cb);
> -
> -	/* nice copy no warning on size diff */
> -	memcpy(&mlid, &p_ctxt->mlid, sizeof(mlid));
> +	osm_signal_t ret, signal = OSM_SIGNAL_DONE;
> +	osm_mcast_mgr_ctxt_t *ctx;
> +	osm_mcast_req_type_t req_type;
> +	ib_net64_t port_guid;
>  
> -	/* we can destroy the context now */
> -	free(p_ctxt);
> +	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgroups);
>  
>  	/* we need a lock to make sure the p_mgrp is not change other ways */
>  	CL_PLOCK_EXCL_ACQUIRE(p_mgr->p_lock);
> -	p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
>  
> -	/* since we delayed the execution we prefer to pass the
> -	   mlid as the mgrp identifier and then find it or abort */
> +	if (cl_is_qlist_empty(p_list)) {
> +		CL_PLOCK_RELEASE(p_mgr->p_lock);
> +		return OSM_SIGNAL_NONE;
> +	}
> +
> +	while (!cl_is_qlist_empty(p_list)) {
> +		ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list);
> +		req_type = ctx->req_type;
> +		port_guid = ctx->port_guid;
> +
> +		/* nice copy no warning on size diff */
> +		memcpy(&mlid, &ctx->mlid, sizeof(mlid));
>  
> -	if (p_mgrp) {
> +		/* we can destroy the context now */
> +		free(ctx);
> +
> +		/* since we delayed the execution we prefer to pass the
> +		   mlid as the mgrp identifier and then find it or abort */
> +		p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
> +		if (!p_mgrp)
> +			continue;
>  
> -		/* if there was no change from the last time we processed the group
> -		   we can skip doing anything
> +		/* if there was no change from the last time
> +		 * we processed the group we can skip doing anything
>  		 */
>  		if (p_mgrp->last_change_id == p_mgrp->last_tree_id) {
>  			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> -				"osm_mcast_mgr_process_mgrp_cb: "
> +				"osm_mcast_mgr_process_mgroups: "
>  				"Skip processing mgrp with lid:0x%X change id:%u\n",
>  				cl_ntoh16(mlid), p_mgrp->last_change_id);
> -		} else {
> -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> -				"osm_mcast_mgr_process_mgrp_cb: "
> -				"Processing mgrp with lid:0x%X change id:%u\n",
> -				cl_ntoh16(mlid), p_mgrp->last_change_id);
> -
> -			signal =
> -			    osm_mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type,
> -						       port_guid);
> -			p_mgrp->last_tree_id = p_mgrp->last_change_id;
> +			continue;
>  		}
>  
> -		/* Remove MGRP only if osm_mcm_port_t count is 0 and
> -		 * Not a well known group
> -		 */
> -		if ((0x0 == cl_qmap_count(&p_mgrp->mcm_port_tbl)) &&
> -		    (p_mgrp->well_known == FALSE)) {
> -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> -				"osm_mcast_mgr_process_mgrp_cb: "
> -				"Destroying mgrp with lid:0x%X\n",
> -				cl_ntoh16(mlid));
> -
> -			/* Send a Report to any InformInfo registered for
> -			   Trap 67 : MCGroup delete */
> -			osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
> -						    p_mgrp);
> -
> -			cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
> -					    (cl_map_item_t *) p_mgrp);
> +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> +			"osm_mcast_mgr_process_mgroups: "
> +			"Processing mgrp with lid:0x%X change id:%u\n",
> +			cl_ntoh16(mlid), p_mgrp->last_change_id);
> +		mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, port_guid);
> +	}
>  
> -			osm_mgrp_delete(p_mgrp);
> -		}
> +	/*
> +	   Walk the switches and download the tables for each.
> +	 */
> +	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> +	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
> +	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
> +		ret = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> +		if (ret == OSM_SIGNAL_DONE_PENDING)
> +			signal = ret;
> +		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
>  	}
>  
> +	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
> +
>  	CL_PLOCK_RELEASE(p_mgr->p_lock);
>  	OSM_LOG_EXIT(p_mgr->p_log);
>  	return signal;
> diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c
> index 88e6d4a..b295a77 100644
> --- a/opensm/opensm/osm_sm.c
> +++ b/opensm/opensm/osm_sm.c
> @@ -144,6 +144,7 @@ void osm_sm_construct(IN osm_sm_t * const p_sm)
>  	cl_event_construct(&p_sm->signal_event);
>  	cl_event_construct(&p_sm->subnet_up_event);
>  	cl_thread_construct(&p_sm->sweeper);
> +	cl_spinlock_construct(&p_sm->mgrp_lock);
>  	osm_req_construct(&p_sm->req);
>  	osm_resp_construct(&p_sm->resp);
>  	osm_ni_rcv_construct(&p_sm->ni_rcv);
> @@ -245,6 +246,7 @@ void osm_sm_destroy(IN osm_sm_t * const p_sm)
>  	cl_event_destroy(&p_sm->signal_event);
>  	cl_event_destroy(&p_sm->subnet_up_event);
>  	cl_spinlock_destroy(&p_sm->signal_lock);
> +	cl_spinlock_destroy(&p_sm->mgrp_lock);
>  
>  	osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n");	/* Format Waived */
>  	OSM_LOG_EXIT(p_sm->p_log);
> @@ -292,6 +294,12 @@ osm_sm_init(IN osm_sm_t * const p_sm,
>  	if (status != CL_SUCCESS)
>  		goto Exit;
>  
> +	cl_qlist_init(&p_sm->mgrp_list);
> +
> +	status = cl_spinlock_init(&p_sm->mgrp_lock);
> +	if (status != CL_SUCCESS)
> +		goto Exit;
> +
>  	status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl,
>  				      p_sm->p_subn,
>  				      p_sm->p_mad_pool,
> @@ -551,32 +559,43 @@ osm_sm_bind(IN osm_sm_t * const p_sm, IN const ib_net64_t port_guid)
>  /**********************************************************************
>   **********************************************************************/
>  static ib_api_status_t
> -__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
> +__osm_sm_mgrp_process(IN osm_sm_t * const p_sm,
>  		      IN osm_mgrp_t * const p_mgrp,
>  		      IN const ib_net64_t port_guid,
>  		      IN osm_mcast_req_type_t req_type)
>  {
> -	ib_api_status_t status;
>  	osm_mcast_mgr_ctxt_t *ctx2;
>  
> -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_connect);
> -
>  	/*
>  	 * 'Schedule' all the QP0 traffic for when the state manager
>  	 * isn't busy trying to do something else.
>  	 */
>  	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
> +	if (!ctx2)
> +		return IB_ERROR;
> +	memset(ctx2, 0, sizeof(*ctx2));
>  	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
>  	ctx2->req_type = req_type;
>  	ctx2->port_guid = port_guid;
>  
> -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
> -					    osm_mcast_mgr_process_mgrp_cb,
> -					    NULL, &p_sm->mcast_mgr,
> -					    (void *)ctx2);
> +	cl_spinlock_acquire(&p_sm->mgrp_lock);
> +	cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx2->list_item);
> +	cl_spinlock_release(&p_sm->mgrp_lock);
>  
> -	OSM_LOG_EXIT(p_sm->p_log);
> -	return (status);
> +	osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
> +
> +	return IB_SUCCESS;
> +}
> +
> +/**********************************************************************
> + **********************************************************************/
> +static ib_api_status_t
> +__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
> +		      IN osm_mgrp_t * const p_mgrp,
> +		      IN const ib_net64_t port_guid,
> +		      IN osm_mcast_req_type_t req_type)
> +{
> +	return __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, req_type);
>  }
>  
>  /**********************************************************************
> @@ -586,31 +605,7 @@ __osm_sm_mgrp_disconnect(IN osm_sm_t * const p_sm,
>  			 IN osm_mgrp_t * const p_mgrp,
>  			 IN const ib_net64_t port_guid)
>  {
> -	ib_api_status_t status;
> -	osm_mcast_mgr_ctxt_t *ctx2;
> -
> -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_disconnect);
> -
> -	/*
> -	 * 'Schedule' all the QP0 traffic for when the state manager
> -	 * isn't busy trying to do something else.
> -	 */
> -	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
> -	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
> -	ctx2->req_type = OSM_MCAST_REQ_TYPE_LEAVE;
> -	ctx2->port_guid = port_guid;
> -
> -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
> -					    osm_mcast_mgr_process_mgrp_cb,
> -					    NULL, &p_sm->mcast_mgr, ctx2);
> -	if (status != IB_SUCCESS) {
> -		osm_log(p_sm->p_log, OSM_LOG_ERROR,
> -			"__osm_sm_mgrp_disconnect: ERR 2E11: "
> -			"Failure processing multicast group (%s)\n",
> -			ib_get_err_str(status));
> -	}
> -
> -	OSM_LOG_EXIT(p_sm->p_log);
> +	__osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, OSM_MCAST_REQ_TYPE_LEAVE);
>  }
>  
>  /**********************************************************************
> @@ -719,8 +714,8 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm,
>  		goto Exit;
>  	}
>  
> -	CL_PLOCK_RELEASE(p_sm->p_lock);
>  	status = __osm_sm_mgrp_connect(p_sm, p_mgrp, port_guid, req_type);
> +	CL_PLOCK_RELEASE(p_sm->p_lock);
>  
>        Exit:
>  	OSM_LOG_EXIT(p_sm->p_log);
> @@ -782,9 +777,8 @@ osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm,
>  
>  	osm_port_remove_mgrp(p_port, mlid);
>  
> -	CL_PLOCK_RELEASE(p_sm->p_lock);
> -
>  	__osm_sm_mgrp_disconnect(p_sm, p_mgrp, port_guid);
> +	CL_PLOCK_RELEASE(p_sm->p_lock);
>  
>        Exit:
>  	OSM_LOG_EXIT(p_sm->p_log);
> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
> index 5c39f11..d4dd782 100644
> --- a/opensm/opensm/osm_state_mgr.c
> +++ b/opensm/opensm/osm_state_mgr.c
> @@ -76,7 +76,6 @@ osm_signal_t osm_qos_setup(IN osm_opensm_t * p_osm);
>  void osm_state_mgr_construct(IN osm_state_mgr_t * const p_mgr)
>  {
>  	memset(p_mgr, 0, sizeof(*p_mgr));
> -	cl_spinlock_construct(&p_mgr->idle_lock);
>  	p_mgr->state = OSM_SM_STATE_INIT;
>  }
>  
> @@ -88,9 +87,6 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const p_mgr)
>  
>  	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_destroy);
>  
> -	/* destroy the locks */
> -	cl_spinlock_destroy(&p_mgr->idle_lock);
> -
>  	OSM_LOG_EXIT(p_mgr->p_log);
>  }
>  
> @@ -112,8 +108,6 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
>  		   IN cl_event_t * const p_subnet_up_event,
>  		   IN osm_log_t * const p_log)
>  {
> -	cl_status_t status;
> -
>  	OSM_LOG_ENTER(p_log, osm_state_mgr_init);
>  
>  	CL_ASSERT(p_subn);
> @@ -145,17 +139,8 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
>  	p_mgr->p_lock = p_lock;
>  	p_mgr->p_subnet_up_event = p_subnet_up_event;
>  
> -	cl_qlist_init(&p_mgr->idle_time_list);
> -
> -	status = cl_spinlock_init(&p_mgr->idle_lock);
> -	if (status != CL_SUCCESS) {
> -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> -			"osm_state_mgr_init: ERR 3302: "
> -			"Spinlock init failed (%s)\n", CL_STATUS_MSG(status));
> -	}
> -
>  	OSM_LOG_EXIT(p_mgr->p_log);
> -	return (status);
> +	return IB_SUCCESS;
>  }
>  
>  /**********************************************************************
> @@ -989,79 +974,6 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_state_mgr_t *
>  }
>  
>  /**********************************************************************
> - **********************************************************************/
> -static void __process_idle_time_queue_done(IN osm_state_mgr_t * const p_mgr)
> -{
> -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
> -	cl_list_item_t *p_list_item;
> -	osm_idle_item_t *p_process_item;
> -
> -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
> -
> -	cl_spinlock_acquire(&p_mgr->idle_lock);
> -	p_list_item = cl_qlist_remove_head(p_list);
> -
> -	if (p_list_item == cl_qlist_end(p_list)) {
> -		cl_spinlock_release(&p_mgr->idle_lock);
> -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> -			"__process_idle_time_queue_done: ERR 3314: "
> -			"Idle time queue is empty\n");
> -		return;
> -	}
> -	cl_spinlock_release(&p_mgr->idle_lock);
> -
> -	p_process_item = (osm_idle_item_t *) p_list_item;
> -
> -	if (p_process_item->pfn_done) {
> -
> -		p_process_item->pfn_done(p_process_item->context1,
> -					 p_process_item->context2);
> -	}
> -
> -	free(p_process_item);
> -
> -	OSM_LOG_EXIT(p_mgr->p_log);
> -	return;
> -}
> -
> -/**********************************************************************
> - **********************************************************************/
> -static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t *
> -						    const p_mgr)
> -{
> -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
> -	cl_list_item_t *p_list_item;
> -	osm_idle_item_t *p_process_item;
> -	osm_signal_t signal;
> -
> -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_start);
> -
> -	cl_spinlock_acquire(&p_mgr->idle_lock);
> -
> -	p_list_item = cl_qlist_head(p_list);
> -	if (p_list_item == cl_qlist_end(p_list)) {
> -		cl_spinlock_release(&p_mgr->idle_lock);
> -		OSM_LOG_EXIT(p_mgr->p_log);
> -		return OSM_SIGNAL_NONE;
> -	}
> -
> -	cl_spinlock_release(&p_mgr->idle_lock);
> -
> -	p_process_item = (osm_idle_item_t *) p_list_item;
> -
> -	CL_ASSERT(p_process_item->pfn_start);
> -
> -	signal =
> -	    p_process_item->pfn_start(p_process_item->context1,
> -				      p_process_item->context2);
> -
> -	CL_ASSERT(signal != OSM_SIGNAL_NONE);
> -
> -	OSM_LOG_EXIT(p_mgr->p_log);
> -	return signal;
> -}
> -
> -/**********************************************************************
>   * Go over all the remote SMs (as updated in the sm_guid_tbl).
>   * Find if there is a remote sm that is a master SM.
>   * If there is a remote master SM - return a pointer to it,
> @@ -1558,7 +1470,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>  		case OSM_SM_STATE_PROCESS_REQUEST:
>  			switch (signal) {
>  			case OSM_SIGNAL_IDLE_TIME_PROCESS:
> -				signal = __process_idle_time_queue_start(p_mgr);
> +				signal = osm_mcast_mgr_process_mgroups(p_mgr->p_mcast_mgr);
>  				switch (signal) {
>  				case OSM_SIGNAL_NONE:
>  					p_mgr->state = OSM_SM_STATE_IDLE;
> @@ -1604,14 +1516,6 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>  			switch (signal) {
>  			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
>  			case OSM_SIGNAL_DONE:
> -				/* CALL the done function */
> -				__process_idle_time_queue_done(p_mgr);
> -
> -				/*
> -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
> -				 * so that the next element in the queue gets processed
> -				 */
> -
>  				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>  				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>  				break;
> @@ -2424,41 +2328,3 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>  
>  	OSM_LOG_EXIT(p_mgr->p_log);
>  }
> -
> -/**********************************************************************
> - **********************************************************************/
> -ib_api_status_t
> -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
> -			   IN osm_pfn_start_t pfn_start,
> -			   IN osm_pfn_done_t pfn_done, void *context1,
> -			   void *context2)
> -{
> -	osm_idle_item_t *p_idle_item;
> -
> -	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process_idle);
> -
> -	p_idle_item = malloc(sizeof(osm_idle_item_t));
> -	if (p_idle_item == NULL) {
> -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> -			"osm_state_mgr_process_idle: ERR 3321: "
> -			"insufficient memory\n");
> -		return IB_ERROR;
> -	}
> -
> -	memset(p_idle_item, 0, sizeof(osm_idle_item_t));
> -	p_idle_item->pfn_start = pfn_start;
> -	p_idle_item->pfn_done = pfn_done;
> -	p_idle_item->context1 = context1;
> -	p_idle_item->context2 = context2;
> -
> -	cl_spinlock_acquire(&p_mgr->idle_lock);
> -	cl_qlist_insert_tail(&p_mgr->idle_time_list, &p_idle_item->list_item);
> -	cl_spinlock_release(&p_mgr->idle_lock);
> -
> -	osm_sm_signal(&p_mgr->p_subn->p_osm->sm,
> -		      OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
> -
> -	OSM_LOG_EXIT(p_mgr->p_log);
> -
> -	return IB_SUCCESS;
> -}



From sashak at voltaire.com  Sun Dec 30 10:16:10 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 30 Dec 2007 18:16:10 +0000
Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <1199032710.23289.340.camel@hrosenstock-ws.xsigo.com>
References: <4770CDCE.8040200@dev.mellanox.co.il>
	<20071229182718.GA19160@sashak.voltaire.com>
	<1199032710.23289.340.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071230181610.GC10650@sashak.voltaire.com>

On 08:38 Sun 30 Dec     , Hal Rosenstock wrote:
> On Sat, 2007-12-29 at 18:27 +0000, Sasha Khapyorsky wrote:
> > This improves handling of storms of mcast join/leave requests. Mcast
> > routing is now recalculated in one pass for all mcast groups where
> > changes occurred, rather than one group at a time. To do this it queues
> > mcast groups instead of mcast rerouting requests, which also makes the
> > state_mgr idle queue obsolete.
> 
> Looks like a nice improvement.
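
For readers following the thread, the batching idea in the description
above can be sketched in a few lines of C. This is only an illustration
under stated assumptions: the types mcast_req, mcast_group and req_queue
and the functions queue_request() and process_queued_groups() are
hypothetical stand-ins, not the OpenSM structures, and the locking that
the patch does with p_lock and mgrp_lock is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the OpenSM types. */
struct mcast_req {
	struct mcast_req *next;
	uint16_t mlid;			/* group is looked up again by MLID later */
};

struct mcast_group {
	uint16_t mlid;
	unsigned last_change_id;	/* bumped on every join/leave */
	unsigned last_tree_id;		/* change id the current tree was built for */
	unsigned rebuilds;		/* counts spanning tree recalculations */
};

struct req_queue {
	struct mcast_req *head, *tail;
};

/* Queue one join/leave request; returns 0 on success, -1 on allocation failure. */
static int queue_request(struct req_queue *q, uint16_t mlid)
{
	struct mcast_req *r = calloc(1, sizeof(*r));

	if (!r)
		return -1;
	r->mlid = mlid;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
	return 0;
}

static struct mcast_group *find_group(struct mcast_group *groups, size_t n,
				      uint16_t mlid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (groups[i].mlid == mlid)
			return &groups[i];
	return NULL;
}

/*
 * Drain the whole queue in one pass. A group is rebuilt at most once per
 * batch because later requests for it see last_tree_id == last_change_id.
 * Returns the number of switch table downloads this batch needs: 0 when
 * nothing was queued, otherwise 1 for the entire batch.
 */
static unsigned process_queued_groups(struct req_queue *q,
				      struct mcast_group *groups, size_t n)
{
	assert(q && groups);

	if (!q->head)
		return 0;

	while (q->head) {
		struct mcast_req *r = q->head;
		struct mcast_group *g;
		uint16_t mlid = r->mlid;

		q->head = r->next;
		free(r);

		g = find_group(groups, n, mlid);
		if (!g || g->last_change_id == g->last_tree_id)
			continue;	/* group is gone or already up to date */

		g->rebuilds++;		/* stands in for the tree recalculation */
		g->last_tree_id = g->last_change_id;
	}
	q->tail = NULL;
	return 1;
}
```

With three queued requests for one group and one for another, a single
call to process_queued_groups() rebuilds each group once and asks for
one table download, where a one-request-at-a-time scheme would do the
download step for every request.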
> 
> What testing has been done with this change? Can you comment on any
> results?

osmtest, basic IPoIB, and diffs of the SA db and MFT dumps. I didn't
find any problems.

> For which branches is this change being proposed ?

I think it should go to OFED 1.3.

Sasha

> 
> -- Hal
> 
> > Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> > ---
> > 
> > Hi Yevgeny,
> > 
> > To me it looks like this should solve the original problem (the mcast
> > group list is purged in osm_mcast_mgr_process()). Could you review and
> > ideally test it? Thanks.
> > 
> > Sasha
> > 
> > ---
> >  opensm/include/opensm/osm_mcast_mgr.h |   14 +--
> >  opensm/include/opensm/osm_multicast.h |    2 +
> >  opensm/include/opensm/osm_sm.h        |    2 +
> >  opensm/include/opensm/osm_state_mgr.h |   95 -----------------
> >  opensm/opensm/osm_mcast_mgr.c         |  187 +++++++++++++++------------------
> >  opensm/opensm/osm_sm.c                |   70 ++++++-------
> >  opensm/opensm/osm_state_mgr.c         |  138 +------------------------
> >  7 files changed, 130 insertions(+), 378 deletions(-)
> > 
> > diff --git a/opensm/include/opensm/osm_mcast_mgr.h b/opensm/include/opensm/osm_mcast_mgr.h
> > index 3e0b761..47b67ed 100644
> > --- a/opensm/include/opensm/osm_mcast_mgr.h
> > +++ b/opensm/include/opensm/osm_mcast_mgr.h
> > @@ -100,7 +100,6 @@ typedef struct _osm_mcast_mgr {
> >  	osm_req_t *p_req;
> >  	osm_log_t *p_log;
> >  	cl_plock_t *p_lock;
> > -
> >  } osm_mcast_mgr_t;
> >  /*
> >  * FIELDS
> > @@ -253,25 +252,22 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr);
> >  *	Multicast Manager, Node Info Response Controller
> >  *********/
> >  
> > -/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgrp_cb
> > +/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgroups
> >  * NAME
> > -*	osm_mcast_mgr_process_mgrp_cb
> > +*	osm_mcast_mgr_process_mgroups
> >  *
> >  * DESCRIPTION
> > -*	Callback entry point for the osm_mcast_mgr_process_mgrp function.
> > +*	Process only requested mcast groups.
> >  *
> >  * SYNOPSIS
> >  */
> >  osm_signal_t
> > -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2);
> > +osm_mcast_mgr_process_mgroups(IN osm_mcast_mgr_t *p_mgr);
> >  /*
> >  * PARAMETERS
> > -*	(Context1) p_mgr
> > +*	p_mgr
> >  *		[in] Pointer to an osm_mcast_mgr_t object.
> >  *
> > -*	(Context2) p_mgrp
> > -*		[in] Pointer to the multicast group to process.
> > -*
> >  * RETURN VALUES
> >  *	IB_SUCCESS
> >  *
> > diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h
> > index 729a2ea..f442a45 100644
> > --- a/opensm/include/opensm/osm_multicast.h
> > +++ b/opensm/include/opensm/osm_multicast.h
> > @@ -50,6 +50,7 @@
> >  
> >  #include <iba/ib_types.h>
> >  #include <complib/cl_qmap.h>
> > +#include <complib/cl_qlist.h>
> >  #include <complib/cl_spinlock.h>
> >  #include <opensm/osm_base.h>
> >  #include <opensm/osm_mtree.h>
> > @@ -121,6 +122,7 @@ const char *osm_get_mcast_req_type_str(IN osm_mcast_req_type_t req_type);
> >  * SYNOPSIS
> >  */
> >  typedef struct osm_mcast_mgr_ctxt {
> > +	cl_list_item_t list_item;
> >  	ib_net16_t mlid;
> >  	osm_mcast_req_type_t req_type;
> >  	ib_net64_t port_guid;
> > diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h
> > index 4c6ce27..a676cd6 100644
> > --- a/opensm/include/opensm/osm_sm.h
> > +++ b/opensm/include/opensm/osm_sm.h
> > @@ -140,6 +140,8 @@ typedef struct osm_sm {
> >  	cl_dispatcher_t *p_disp;
> >  	cl_plock_t *p_lock;
> >  	atomic32_t sm_trans_id;
> > +	cl_spinlock_t mgrp_lock;
> > +	cl_qlist_t mgrp_list;
> >  	osm_req_t req;
> >  	osm_resp_t resp;
> >  	osm_ni_rcv_t ni_rcv;
> > diff --git a/opensm/include/opensm/osm_state_mgr.h b/opensm/include/opensm/osm_state_mgr.h
> > index dada097..f51593a 100644
> > --- a/opensm/include/opensm/osm_state_mgr.h
> > +++ b/opensm/include/opensm/osm_state_mgr.h
> > @@ -109,8 +109,6 @@ typedef struct _osm_state_mgr {
> >  	osm_stats_t *p_stats;
> >  	struct _osm_sm_state_mgr *p_sm_state_mgr;
> >  	const osm_sm_mad_ctrl_t *p_mad_ctrl;
> > -	cl_spinlock_t idle_lock;
> > -	cl_qlist_t idle_time_list;
> >  	cl_plock_t *p_lock;
> >  	cl_event_t *p_subnet_up_event;
> >  	osm_sm_state_t state;
> > @@ -172,99 +170,6 @@ typedef struct _osm_state_mgr {
> >  *	State Manager object
> >  *********/
> >  
> > -/****s* OpenSM: State Manager/_osm_idle_item
> > -* NAME
> > -*	_osm_idle_item
> > -*
> > -* DESCRIPTION
> > -*	Idle item.
> > -*
> > -* SYNOPSIS
> > -*/
> > -
> > -typedef osm_signal_t(*osm_pfn_start_t) (IN void *context1, IN void *context2);
> > -
> > -typedef void
> > - (*osm_pfn_done_t) (IN void *context1, IN void *context2);
> > -
> > -typedef struct _osm_idle_item {
> > -	cl_list_item_t list_item;
> > -	void *context1;
> > -	void *context2;
> > -	osm_pfn_start_t pfn_start;
> > -	osm_pfn_done_t pfn_done;
> > -} osm_idle_item_t;
> > -
> > -/*
> > -* FIELDS
> > -*	list_item
> > -*		list item.
> > -*
> > -*	context1
> > -*		Context pointer
> > -*
> > -*	context2
> > -*		Context pointer
> > -*
> > -*	pfn_start
> > -*		Pointer to the start function.
> > -*
> > -*	pfn_done
> > -*		Pointer to the dine function.
> > -* SEE ALSO
> > -*	State Manager object
> > -*********/
> > -
> > -/****f* OpenSM: State Manager/osm_state_mgr_process_idle
> > -* NAME
> > -*	osm_state_mgr_process_idle
> > -*
> > -* DESCRIPTION
> > -*	Formulates the osm_idle_item and inserts it into the queue and
> > -*	signals the state manager.
> > -*
> > -* SYNOPSIS
> > -*/
> > -
> > -ib_api_status_t
> > -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
> > -			   IN osm_pfn_start_t pfn_start,
> > -			   IN osm_pfn_done_t pfn_done,
> > -			   void *context1, void *context2);
> > -
> > -/*
> > -* PARAMETERS
> > -*	p_mgr
> > -*		[in] Pointer to a State Manager object to construct.
> > -*
> > -*	pfn_start
> > -*		[in] Pointer the start function which will be called at
> > -*			idle time.
> > -*
> > -*	pfn_done
> > -*		[in] pointer the done function which will be called
> > -*			when outstanding smps is zero
> > -*
> > -*	context1
> > -*		[in] Pointer to void
> > -*
> > -*	context2
> > -*		[in] Pointer to void
> > -*
> > -* RETURN VALUE
> > -*	IB_SUCCESS or IB_ERROR
> > -*
> > -* NOTES
> > -*	Allows osm_state_mgr_destroy
> > -*
> > -*	Calling osm_state_mgr_construct is a prerequisite to calling any other
> > -*	method except osm_state_mgr_init.
> > -*
> > -* SEE ALSO
> > -*	State Manager object, osm_state_mgr_init,
> > -*	osm_state_mgr_destroy
> > -*********/
> > -
> >  /****f* OpenSM: State Manager/osm_state_mgr_construct
> >  * NAME
> >  *	osm_state_mgr_construct
> > diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
> > index 50b95fd..f51a45a 100644
> > --- a/opensm/opensm/osm_mcast_mgr.c
> > +++ b/opensm/opensm/osm_mcast_mgr.c
> > @@ -815,7 +815,7 @@ static osm_mtree_node_t *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr,
> >  	}
> >  
> >  	free(list_array);
> > -      Exit:
> > +Exit:
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> >  	return (p_mtn);
> >  }
> > @@ -932,7 +932,7 @@ __osm_mcast_mgr_build_spanning_tree(osm_mcast_mgr_t * const p_mgr,
> >  		"Configured MLID 0x%X for %u ports, max tree depth = %u\n",
> >  		cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth);
> >  
> > -      Exit:
> > +Exit:
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> >  	return (status);
> >  }
> > @@ -1171,7 +1171,7 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t * const p_mgr,
> >  		}
> >  	}
> >  
> > -      Exit:
> > +Exit:
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> >  	return (status);
> >  }
> > @@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t * const p_mgr,
> >  							   port_guid);
> >  	}
> >  
> > -      Exit:
> > +Exit:
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> >  	return (status);
> >  }
> >  
> >  /**********************************************************************
> >   Process the entire group.
> > -
> >   NOTE : The lock should be held externally!
> >   **********************************************************************/
> > -static osm_signal_t
> > -osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
> > -			   IN osm_mgrp_t * const p_mgrp,
> > -			   IN osm_mcast_req_type_t req_type,
> > -			   IN ib_net64_t port_guid)
> > +static ib_api_status_t
> > +mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
> > +		       IN osm_mgrp_t * const p_mgrp,
> > +		       IN osm_mcast_req_type_t req_type,
> > +		       IN ib_net64_t port_guid)
> >  {
> > -	osm_signal_t signal = OSM_SIGNAL_DONE;
> >  	ib_api_status_t status;
> > -	osm_switch_t *p_sw;
> > -	cl_qmap_t *p_sw_tbl;
> > -	boolean_t pending_transactions = FALSE;
> >  
> >  	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp);
> >  
> > -	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> > -
> >  	status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type, port_guid);
> >  	if (status != IB_SUCCESS) {
> >  		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > -			"osm_mcast_mgr_process_mgrp: ERR 0A19: "
> > +			"mcast_mgr_process_mgrp: ERR 0A19: "
> >  			"Unable to create spanning tree (%s)\n",
> >  			ib_get_err_str(status));
> > -
> >  		goto Exit;
> >  	}
> > +	p_mgrp->last_tree_id = p_mgrp->last_change_id;
> >  
> > -	/*
> > -	   Walk the switches and download the tables for each.
> > +	/* Remove MGRP only if osm_mcm_port_t count is 0 and
> > +	 * Not a well known group
> >  	 */
> > -	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
> > -	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
> > -		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> > -		if (signal == OSM_SIGNAL_DONE_PENDING)
> > -			pending_transactions = TRUE;
> > -
> > -		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> > +	if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known) {
> > +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > +			"mcast_mgr_process_mgrp: "
> > +			"Destroying mgrp with lid:0x%X\n",
> > +			cl_ntoh16(p_mgrp->mlid));
> > +		/* Send a Report to any InformInfo registered for
> > +		   Trap 67 : MCGroup delete */
> > +		osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
> > +					    p_mgrp);
> > +		cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
> > +				    (cl_map_item_t *) p_mgrp);
> > +		osm_mgrp_delete(p_mgrp);
> >  	}
> >  
> > -	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
> > -
> > -      Exit:
> > +Exit:
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> > -
> > -	if (pending_transactions == TRUE)
> > -		return (OSM_SIGNAL_DONE_PENDING);
> > -	else
> > -		return (OSM_SIGNAL_DONE);
> > +	return status;
> >  }
> >  
> >  /**********************************************************************
> > @@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
> >  	osm_switch_t *p_sw;
> >  	cl_qmap_t *p_sw_tbl;
> >  	cl_qmap_t *p_mcast_tbl;
> > +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
> >  	osm_mgrp_t *p_mgrp;
> > -	ib_api_status_t status;
> >  	boolean_t pending_transactions = FALSE;
> >  
> >  	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process);
> >  
> >  	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> > -
> >  	p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl;
> >  	/*
> >  	   While holding the lock, iterate over all the established
> > @@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
> >  		/* We reached here due to some change that caused a heavy sweep
> >  		   of the subnet. Not due to a specific multicast request.
> >  		   So the request type is subnet_change and the port guid is 0. */
> > -		status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp,
> > -						    OSM_MCAST_REQ_TYPE_SUBNET_CHANGE,
> > -						    0);
> > -		if (status != IB_SUCCESS) {
> > -			osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > -				"osm_mcast_mgr_process: ERR 0A20: "
> > -				"Unable to create spanning tree (%s)\n",
> > -				ib_get_err_str(status));
> > -		}
> > -
> > +		mcast_mgr_process_mgrp(p_mgr, p_mgrp,
> > +				       OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0);
> >  		p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item);
> >  	}
> >  
> > @@ -1364,10 +1347,14 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
> >  		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> >  		if (signal == OSM_SIGNAL_DONE_PENDING)
> >  			pending_transactions = TRUE;
> > -
> >  		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> >  	}
> >  
> > +	while (!cl_is_qlist_empty(p_list)) {
> > +		cl_list_item_t *p = cl_qlist_remove_head(p_list);
> > +		free(p);
> > +	}
> > +
> >  	CL_PLOCK_RELEASE(p_mgr->p_lock);
> >  
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> > @@ -1395,79 +1382,79 @@ osm_mgrp_t *__get_mgrp_by_mlid(IN osm_mcast_mgr_t * const p_mgr,
> >  
> >  /**********************************************************************
> >    This is the function that is invoked during idle time to handle the
> > -  process request. Context1 is simply the osm_mcast_mgr_t*, Context2
> > -  hold the mlid, port guid and action (join/leave/delete) required.
> > +  process request for mcast groups where join/leave/delete was required.
> >   **********************************************************************/
> > -osm_signal_t
> > -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2)
> > +osm_signal_t osm_mcast_mgr_process_mgroups(osm_mcast_mgr_t * p_mgr)
> >  {
> > -	osm_mcast_mgr_t *p_mgr = (osm_mcast_mgr_t *) Context1;
> > +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
> > +	osm_switch_t *p_sw;
> > +	cl_qmap_t *p_sw_tbl;
> >  	osm_mgrp_t *p_mgrp;
> >  	ib_net16_t mlid;
> > -	osm_signal_t signal = OSM_SIGNAL_DONE;
> > -	osm_mcast_mgr_ctxt_t *p_ctxt = (osm_mcast_mgr_ctxt_t *) Context2;
> > -	osm_mcast_req_type_t req_type = p_ctxt->req_type;
> > -	ib_net64_t port_guid = p_ctxt->port_guid;
> > -
> > -	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp_cb);
> > -
> > -	/* nice copy no warning on size diff */
> > -	memcpy(&mlid, &p_ctxt->mlid, sizeof(mlid));
> > +	osm_signal_t ret, signal = OSM_SIGNAL_DONE;
> > +	osm_mcast_mgr_ctxt_t *ctx;
> > +	osm_mcast_req_type_t req_type;
> > +	ib_net64_t port_guid;
> >  
> > -	/* we can destroy the context now */
> > -	free(p_ctxt);
> > +	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgroups);
> >  
> >  	/* we need a lock to make sure the p_mgrp is not change other ways */
> >  	CL_PLOCK_EXCL_ACQUIRE(p_mgr->p_lock);
> > -	p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
> >  
> > -	/* since we delayed the execution we prefer to pass the
> > -	   mlid as the mgrp identifier and then find it or abort */
> > +	if (cl_is_qlist_empty(p_list)) {
> > +		CL_PLOCK_RELEASE(p_mgr->p_lock);
> > +		return OSM_SIGNAL_NONE;
> > +	}
> > +
> > +	while (!cl_is_qlist_empty(p_list)) {
> > +		ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list);
> > +		req_type = ctx->req_type;
> > +		port_guid = ctx->port_guid;
> > +
> > +		/* nice copy no warning on size diff */
> > +		memcpy(&mlid, &ctx->mlid, sizeof(mlid));
> >  
> > -	if (p_mgrp) {
> > +		/* we can destroy the context now */
> > +		free(ctx);
> > +
> > +		/* since we delayed the execution we prefer to pass the
> > +		   mlid as the mgrp identifier and then find it or abort */
> > +		p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
> > +		if (!p_mgrp)
> > +			continue;
> >  
> > -		/* if there was no change from the last time we processed the group
> > -		   we can skip doing anything
> > +		/* if there was no change from the last time
> > +		 * we processed the group we can skip doing anything
> >  		 */
> >  		if (p_mgrp->last_change_id == p_mgrp->last_tree_id) {
> >  			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > -				"osm_mcast_mgr_process_mgrp_cb: "
> > +				"osm_mcast_mgr_process_mgroups: "
> >  				"Skip processing mgrp with lid:0x%X change id:%u\n",
> >  				cl_ntoh16(mlid), p_mgrp->last_change_id);
> > -		} else {
> > -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > -				"osm_mcast_mgr_process_mgrp_cb: "
> > -				"Processing mgrp with lid:0x%X change id:%u\n",
> > -				cl_ntoh16(mlid), p_mgrp->last_change_id);
> > -
> > -			signal =
> > -			    osm_mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type,
> > -						       port_guid);
> > -			p_mgrp->last_tree_id = p_mgrp->last_change_id;
> > +			continue;
> >  		}
> >  
> > -		/* Remove MGRP only if osm_mcm_port_t count is 0 and
> > -		 * Not a well known group
> > -		 */
> > -		if ((0x0 == cl_qmap_count(&p_mgrp->mcm_port_tbl)) &&
> > -		    (p_mgrp->well_known == FALSE)) {
> > -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > -				"osm_mcast_mgr_process_mgrp_cb: "
> > -				"Destroying mgrp with lid:0x%X\n",
> > -				cl_ntoh16(mlid));
> > -
> > -			/* Send a Report to any InformInfo registered for
> > -			   Trap 67 : MCGroup delete */
> > -			osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
> > -						    p_mgrp);
> > -
> > -			cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
> > -					    (cl_map_item_t *) p_mgrp);
> > +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > +			"osm_mcast_mgr_process_mgroups: "
> > +			"Processing mgrp with lid:0x%X change id:%u\n",
> > +			cl_ntoh16(mlid), p_mgrp->last_change_id);
> > +		mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, port_guid);
> > +	}
> >  
> > -			osm_mgrp_delete(p_mgrp);
> > -		}
> > +	/*
> > +	   Walk the switches and download the tables for each.
> > +	 */
> > +	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> > +	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
> > +	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
> > +		ret = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> > +		if (ret == OSM_SIGNAL_DONE_PENDING)
> > +			signal = ret;
> > +		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> >  	}
> >  
> > +	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
> > +
> >  	CL_PLOCK_RELEASE(p_mgr->p_lock);
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> >  	return signal;
> > diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c
> > index 88e6d4a..b295a77 100644
> > --- a/opensm/opensm/osm_sm.c
> > +++ b/opensm/opensm/osm_sm.c
> > @@ -144,6 +144,7 @@ void osm_sm_construct(IN osm_sm_t * const p_sm)
> >  	cl_event_construct(&p_sm->signal_event);
> >  	cl_event_construct(&p_sm->subnet_up_event);
> >  	cl_thread_construct(&p_sm->sweeper);
> > +	cl_spinlock_construct(&p_sm->mgrp_lock);
> >  	osm_req_construct(&p_sm->req);
> >  	osm_resp_construct(&p_sm->resp);
> >  	osm_ni_rcv_construct(&p_sm->ni_rcv);
> > @@ -245,6 +246,7 @@ void osm_sm_destroy(IN osm_sm_t * const p_sm)
> >  	cl_event_destroy(&p_sm->signal_event);
> >  	cl_event_destroy(&p_sm->subnet_up_event);
> >  	cl_spinlock_destroy(&p_sm->signal_lock);
> > +	cl_spinlock_destroy(&p_sm->mgrp_lock);
> >  
> >  	osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n");	/* Format Waived */
> >  	OSM_LOG_EXIT(p_sm->p_log);
> > @@ -292,6 +294,12 @@ osm_sm_init(IN osm_sm_t * const p_sm,
> >  	if (status != CL_SUCCESS)
> >  		goto Exit;
> >  
> > +	cl_qlist_init(&p_sm->mgrp_list);
> > +
> > +	status = cl_spinlock_init(&p_sm->mgrp_lock);
> > +	if (status != CL_SUCCESS)
> > +		goto Exit;
> > +
> >  	status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl,
> >  				      p_sm->p_subn,
> >  				      p_sm->p_mad_pool,
> > @@ -551,32 +559,43 @@ osm_sm_bind(IN osm_sm_t * const p_sm, IN const ib_net64_t port_guid)
> >  /**********************************************************************
> >   **********************************************************************/
> >  static ib_api_status_t
> > -__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
> > +__osm_sm_mgrp_process(IN osm_sm_t * const p_sm,
> >  		      IN osm_mgrp_t * const p_mgrp,
> >  		      IN const ib_net64_t port_guid,
> >  		      IN osm_mcast_req_type_t req_type)
> >  {
> > -	ib_api_status_t status;
> >  	osm_mcast_mgr_ctxt_t *ctx2;
> >  
> > -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_connect);
> > -
> >  	/*
> >  	 * 'Schedule' all the QP0 traffic for when the state manager
> >  	 * isn't busy trying to do something else.
> >  	 */
> >  	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
> > +	if (!ctx2)
> > +		return IB_ERROR;
> > +	memset(ctx2, 0, sizeof(*ctx2));
> >  	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
> >  	ctx2->req_type = req_type;
> >  	ctx2->port_guid = port_guid;
> >  
> > -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
> > -					    osm_mcast_mgr_process_mgrp_cb,
> > -					    NULL, &p_sm->mcast_mgr,
> > -					    (void *)ctx2);
> > +	cl_spinlock_acquire(&p_sm->mgrp_lock);
> > +	cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx2->list_item);
> > +	cl_spinlock_release(&p_sm->mgrp_lock);
> >  
> > -	OSM_LOG_EXIT(p_sm->p_log);
> > -	return (status);
> > +	osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
> > +
> > +	return IB_SUCCESS;
> > +}
> > +
> > +/**********************************************************************
> > + **********************************************************************/
> > +static ib_api_status_t
> > +__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
> > +		      IN osm_mgrp_t * const p_mgrp,
> > +		      IN const ib_net64_t port_guid,
> > +		      IN osm_mcast_req_type_t req_type)
> > +{
> > +	return __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, req_type);
> >  }
> >  
> >  /**********************************************************************
> > @@ -586,31 +605,7 @@ __osm_sm_mgrp_disconnect(IN osm_sm_t * const p_sm,
> >  			 IN osm_mgrp_t * const p_mgrp,
> >  			 IN const ib_net64_t port_guid)
> >  {
> > -	ib_api_status_t status;
> > -	osm_mcast_mgr_ctxt_t *ctx2;
> > -
> > -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_disconnect);
> > -
> > -	/*
> > -	 * 'Schedule' all the QP0 traffic for when the state manager
> > -	 * isn't busy trying to do something else.
> > -	 */
> > -	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
> > -	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
> > -	ctx2->req_type = OSM_MCAST_REQ_TYPE_LEAVE;
> > -	ctx2->port_guid = port_guid;
> > -
> > -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
> > -					    osm_mcast_mgr_process_mgrp_cb,
> > -					    NULL, &p_sm->mcast_mgr, ctx2);
> > -	if (status != IB_SUCCESS) {
> > -		osm_log(p_sm->p_log, OSM_LOG_ERROR,
> > -			"__osm_sm_mgrp_disconnect: ERR 2E11: "
> > -			"Failure processing multicast group (%s)\n",
> > -			ib_get_err_str(status));
> > -	}
> > -
> > -	OSM_LOG_EXIT(p_sm->p_log);
> > +	__osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, OSM_MCAST_REQ_TYPE_LEAVE);
> >  }
> >  
> >  /**********************************************************************
> > @@ -719,8 +714,8 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm,
> >  		goto Exit;
> >  	}
> >  
> > -	CL_PLOCK_RELEASE(p_sm->p_lock);
> >  	status = __osm_sm_mgrp_connect(p_sm, p_mgrp, port_guid, req_type);
> > +	CL_PLOCK_RELEASE(p_sm->p_lock);
> >  
> >        Exit:
> >  	OSM_LOG_EXIT(p_sm->p_log);
> > @@ -782,9 +777,8 @@ osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm,
> >  
> >  	osm_port_remove_mgrp(p_port, mlid);
> >  
> > -	CL_PLOCK_RELEASE(p_sm->p_lock);
> > -
> >  	__osm_sm_mgrp_disconnect(p_sm, p_mgrp, port_guid);
> > +	CL_PLOCK_RELEASE(p_sm->p_lock);
> >  
> >        Exit:
> >  	OSM_LOG_EXIT(p_sm->p_log);
> > diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
> > index 5c39f11..d4dd782 100644
> > --- a/opensm/opensm/osm_state_mgr.c
> > +++ b/opensm/opensm/osm_state_mgr.c
> > @@ -76,7 +76,6 @@ osm_signal_t osm_qos_setup(IN osm_opensm_t * p_osm);
> >  void osm_state_mgr_construct(IN osm_state_mgr_t * const p_mgr)
> >  {
> >  	memset(p_mgr, 0, sizeof(*p_mgr));
> > -	cl_spinlock_construct(&p_mgr->idle_lock);
> >  	p_mgr->state = OSM_SM_STATE_INIT;
> >  }
> >  
> > @@ -88,9 +87,6 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const p_mgr)
> >  
> >  	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_destroy);
> >  
> > -	/* destroy the locks */
> > -	cl_spinlock_destroy(&p_mgr->idle_lock);
> > -
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> >  }
> >  
> > @@ -112,8 +108,6 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
> >  		   IN cl_event_t * const p_subnet_up_event,
> >  		   IN osm_log_t * const p_log)
> >  {
> > -	cl_status_t status;
> > -
> >  	OSM_LOG_ENTER(p_log, osm_state_mgr_init);
> >  
> >  	CL_ASSERT(p_subn);
> > @@ -145,17 +139,8 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
> >  	p_mgr->p_lock = p_lock;
> >  	p_mgr->p_subnet_up_event = p_subnet_up_event;
> >  
> > -	cl_qlist_init(&p_mgr->idle_time_list);
> > -
> > -	status = cl_spinlock_init(&p_mgr->idle_lock);
> > -	if (status != CL_SUCCESS) {
> > -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > -			"osm_state_mgr_init: ERR 3302: "
> > -			"Spinlock init failed (%s)\n", CL_STATUS_MSG(status));
> > -	}
> > -
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> > -	return (status);
> > +	return IB_SUCCESS;
> >  }
> >  
> >  /**********************************************************************
> > @@ -989,79 +974,6 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_state_mgr_t *
> >  }
> >  
> >  /**********************************************************************
> > - **********************************************************************/
> > -static void __process_idle_time_queue_done(IN osm_state_mgr_t * const p_mgr)
> > -{
> > -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
> > -	cl_list_item_t *p_list_item;
> > -	osm_idle_item_t *p_process_item;
> > -
> > -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
> > -
> > -	cl_spinlock_acquire(&p_mgr->idle_lock);
> > -	p_list_item = cl_qlist_remove_head(p_list);
> > -
> > -	if (p_list_item == cl_qlist_end(p_list)) {
> > -		cl_spinlock_release(&p_mgr->idle_lock);
> > -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > -			"__process_idle_time_queue_done: ERR 3314: "
> > -			"Idle time queue is empty\n");
> > -		return;
> > -	}
> > -	cl_spinlock_release(&p_mgr->idle_lock);
> > -
> > -	p_process_item = (osm_idle_item_t *) p_list_item;
> > -
> > -	if (p_process_item->pfn_done) {
> > -
> > -		p_process_item->pfn_done(p_process_item->context1,
> > -					 p_process_item->context2);
> > -	}
> > -
> > -	free(p_process_item);
> > -
> > -	OSM_LOG_EXIT(p_mgr->p_log);
> > -	return;
> > -}
> > -
> > -/**********************************************************************
> > - **********************************************************************/
> > -static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t *
> > -						    const p_mgr)
> > -{
> > -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
> > -	cl_list_item_t *p_list_item;
> > -	osm_idle_item_t *p_process_item;
> > -	osm_signal_t signal;
> > -
> > -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_start);
> > -
> > -	cl_spinlock_acquire(&p_mgr->idle_lock);
> > -
> > -	p_list_item = cl_qlist_head(p_list);
> > -	if (p_list_item == cl_qlist_end(p_list)) {
> > -		cl_spinlock_release(&p_mgr->idle_lock);
> > -		OSM_LOG_EXIT(p_mgr->p_log);
> > -		return OSM_SIGNAL_NONE;
> > -	}
> > -
> > -	cl_spinlock_release(&p_mgr->idle_lock);
> > -
> > -	p_process_item = (osm_idle_item_t *) p_list_item;
> > -
> > -	CL_ASSERT(p_process_item->pfn_start);
> > -
> > -	signal =
> > -	    p_process_item->pfn_start(p_process_item->context1,
> > -				      p_process_item->context2);
> > -
> > -	CL_ASSERT(signal != OSM_SIGNAL_NONE);
> > -
> > -	OSM_LOG_EXIT(p_mgr->p_log);
> > -	return signal;
> > -}
> > -
> > -/**********************************************************************
> >   * Go over all the remote SMs (as updated in the sm_guid_tbl).
> >   * Find if there is a remote sm that is a master SM.
> >   * If there is a remote master SM - return a pointer to it,
> > @@ -1558,7 +1470,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
> >  		case OSM_SM_STATE_PROCESS_REQUEST:
> >  			switch (signal) {
> >  			case OSM_SIGNAL_IDLE_TIME_PROCESS:
> > -				signal = __process_idle_time_queue_start(p_mgr);
> > +				signal = osm_mcast_mgr_process_mgroups(p_mgr->p_mcast_mgr);
> >  				switch (signal) {
> >  				case OSM_SIGNAL_NONE:
> >  					p_mgr->state = OSM_SM_STATE_IDLE;
> > @@ -1604,14 +1516,6 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
> >  			switch (signal) {
> >  			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
> >  			case OSM_SIGNAL_DONE:
> > -				/* CALL the done function */
> > -				__process_idle_time_queue_done(p_mgr);
> > -
> > -				/*
> > -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
> > -				 * so that the next element in the queue gets processed
> > -				 */
> > -
> >  				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
> >  				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
> >  				break;
> > @@ -2424,41 +2328,3 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
> >  
> >  	OSM_LOG_EXIT(p_mgr->p_log);
> >  }
> > -
> > -/**********************************************************************
> > - **********************************************************************/
> > -ib_api_status_t
> > -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
> > -			   IN osm_pfn_start_t pfn_start,
> > -			   IN osm_pfn_done_t pfn_done, void *context1,
> > -			   void *context2)
> > -{
> > -	osm_idle_item_t *p_idle_item;
> > -
> > -	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process_idle);
> > -
> > -	p_idle_item = malloc(sizeof(osm_idle_item_t));
> > -	if (p_idle_item == NULL) {
> > -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > -			"osm_state_mgr_process_idle: ERR 3321: "
> > -			"insufficient memory\n");
> > -		return IB_ERROR;
> > -	}
> > -
> > -	memset(p_idle_item, 0, sizeof(osm_idle_item_t));
> > -	p_idle_item->pfn_start = pfn_start;
> > -	p_idle_item->pfn_done = pfn_done;
> > -	p_idle_item->context1 = context1;
> > -	p_idle_item->context2 = context2;
> > -
> > -	cl_spinlock_acquire(&p_mgr->idle_lock);
> > -	cl_qlist_insert_tail(&p_mgr->idle_time_list, &p_idle_item->list_item);
> > -	cl_spinlock_release(&p_mgr->idle_lock);
> > -
> > -	osm_sm_signal(&p_mgr->p_subn->p_osm->sm,
> > -		      OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
> > -
> > -	OSM_LOG_EXIT(p_mgr->p_log);
> > -
> > -	return IB_SUCCESS;
> > -}
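
The queueing pattern in the quoted patch (allocate a request, append it to a list under a spinlock, then signal the state manager) can be sketched in isolation as below. This is only a minimal illustration: `struct mcast_req`, `struct req_queue` and the `req_queue_*` functions are hypothetical stand-ins, a pthread mutex replaces `cl_spinlock_t`, and the drain step is an assumption, since the consumer side is not shown in this part of the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one queued multicast request. */
struct mcast_req {
	struct mcast_req *next;
	uint16_t mlid;
	uint64_t port_guid;
};

/* Hypothetical stand-in for the list/lock pair kept in the SM object. */
struct req_queue {
	pthread_mutex_t lock;
	struct mcast_req *head;
	struct mcast_req **tail;	/* points at the last 'next' slot */
};

static void req_queue_init(struct req_queue *q)
{
	assert(q != NULL);
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
	q->tail = &q->head;
}

/* Producer: allocate, fill in, then append while holding the lock. */
static int req_queue_push(struct req_queue *q, uint16_t mlid,
			  uint64_t port_guid)
{
	struct mcast_req *r = calloc(1, sizeof(*r));

	if (!r)
		return -1;
	r->mlid = mlid;
	r->port_guid = port_guid;

	pthread_mutex_lock(&q->lock);
	*q->tail = r;
	q->tail = &r->next;
	pthread_mutex_unlock(&q->lock);
	return 0;
}

/*
 * Consumer (assumed behaviour): detach the whole list under the lock,
 * then handle every entry outside it. Returns the number of entries.
 */
static unsigned req_queue_drain(struct req_queue *q,
				void (*fn)(const struct mcast_req *, void *),
				void *arg)
{
	struct mcast_req *r, *next;
	unsigned n = 0;

	pthread_mutex_lock(&q->lock);
	r = q->head;
	q->head = NULL;
	q->tail = &q->head;
	pthread_mutex_unlock(&q->lock);

	for (; r; r = next) {
		next = r->next;
		if (fn)
			fn(r, arg);
		free(r);
		n++;
	}
	return n;
}
```

Detaching the list in one step keeps the lock hold time short and lets a burst of join/leave requests be handled in a single pass, which seems to be the intent described for this change.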


From sashak at voltaire.com  Sun Dec 30 10:20:31 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Sun, 30 Dec 2007 18:20:31 +0000
Subject: [ofa-general] [PATCH] opensm/osm_pkey_mgr.c: setting only
	outbound partition enforcement on switch
In-Reply-To: <1199031692.23289.322.camel@hrosenstock-ws.xsigo.com>
References: <4770F7BB.6040502@dev.mellanox.co.il>
	<1198767608.23289.198.camel@hrosenstock-ws.xsigo.com>
	<20071227161325.GB13378@sashak.voltaire.com>
	<1198772437.23289.209.camel@hrosenstock-ws.xsigo.com>
	<20071227164028.GE13378@sashak.voltaire.com>
	<4774135E.6060601@dev.mellanox.co.il>
	<20071229183459.GB19160@sashak.voltaire.com>
	<1199031692.23289.322.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071230182031.GD10650@sashak.voltaire.com>

On 08:21 Sun 30 Dec     , Hal Rosenstock wrote:
> On Sat, 2007-12-29 at 18:34 +0000, Sasha Khapyorsky wrote:
> > On 23:04 Thu 27 Dec     , Yevgeny Kliteynik wrote:
> > >  Sasha Khapyorsky wrote:
> > > > On 08:20 Thu 27 Dec     , Hal Rosenstock wrote:
> > > >> On Thu, 2007-12-27 at 16:13 +0000, Sasha Khapyorsky wrote:
> > > >>> Hi Hal,
> > > >>>
> > > >>> On 07:00 Thu 27 Dec     , Hal Rosenstock wrote:
> > > >>>> On Tue, 2007-12-25 at 14:29 +0200, Yevgeny Kliteynik wrote:
> > > >>>>> Fixing wrong setting of partition enforcement bits on switch ports.
> > > >>>>> When an HCA port is configured with a certain pkey, the peer port
> > > >>>>> on the switch should turn on outbound partition enforcement bit only.
> > > >>>>> Turning on the inbound enforcement will cause the switch to drop
> > > >>>>> valid packets if the HCA is partial member.
> > > >>>> Inbound enforcement is actually the more useful case. If there is
> > > >>>> inbound enforcement, outbound enforcement doesn't add much.
> > > >>>>
> > > >>>> In the case of partial only (not both partial and full) membership, the
> > > >>>> peer switch physical port would need to be set to full membership.
> > > >>> Then it could break outbound enforcement. Isn't it?
> > > >> What I wrote was wrong. Limited pkey is sufficient. See o18-14
> > > > Do you mean last paragraph of o18-14? Assuming so - it makes sense. So
> > > > we need just revert the original patch.
> > > 
> > >  Almost true. It would be nice to keep the new condition:
> > > 
> > >  -	if ((p_pi->vl_enforce & 0xc) == (0xc) * (enforce == TRUE)) {
> > >  +	if (((p_pi->vl_enforce & 0xc) == 0x4 && enforce) ||
> > >  +	    ((p_pi->vl_enforce & 0xc) == 0 && !enforce)) {
> > 
> > I liked the original version - it is shorter and looks cleaner for me.
> > 
> > I'm reverting entire patch.
> 
> Is all the pkey handling back to being identical to before Yevgeny's 
> change(s) now ?

Yes.

> Also, which branch(es) ? master and ofed_1_3 ?

The original was not pulled to ofed_1_3 yet.

> BTW, are master and ofed_1_3 different right now ?

The code is identical now. The only differences are the original pkey patch
and its revert; 'git-shortlog ofed_1_3..master' shows:

Sasha Khapyorsky (1):
      opensm: Revert "opensm/osm_pkey_mgr.c: setting only outbound partition enforcement on switch"

Yevgeny Kliteynik (1):
      opensm/osm_pkey_mgr.c: setting only outbound partition enforcement on switch


Sasha
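
For reference, the two forms of the enforcement check quoted earlier in this thread can be compared side by side. The sketch below is standalone and makes assumptions: the bit names are hypothetical, and the reading of 0x8 as the inbound bit and 0x4 as the outbound bit is inferred from the "outbound only" condition in the quoted patch. What the surrounding code does when the condition holds is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed meaning of the two bits selected by the 0xc mask. */
#define PART_ENF_INBOUND  0x8
#define PART_ENF_OUTBOUND 0x4
#define PART_ENF_MASK     (PART_ENF_INBOUND | PART_ENF_OUTBOUND)

/*
 * Original form: (vl_enforce & 0xc) == 0xc * (enforce == TRUE).
 * True when both bits are set and enforce is on, or both are clear
 * and enforce is off.
 */
static bool matches_both_bits(uint8_t vl_enforce, bool enforce)
{
	return (vl_enforce & PART_ENF_MASK) == (enforce ? PART_ENF_MASK : 0);
}

/*
 * Form from the reverted patch: true when only the outbound bit is set
 * and enforce is on, or both bits are clear and enforce is off.
 */
static bool matches_outbound_only(uint8_t vl_enforce, bool enforce)
{
	return ((vl_enforce & PART_ENF_MASK) == PART_ENF_OUTBOUND && enforce) ||
	       ((vl_enforce & PART_ENF_MASK) == 0 && !enforce);
}
```

The two predicates agree when enforcement is off and differ only in which bit pattern they accept when it is on.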


From eli at mellanox.co.il  Sun Dec 30 23:15:39 2007
From: eli at mellanox.co.il (Eli Cohen)
Date: Mon, 31 Dec 2007 09:15:39 +0200
Subject: [ofa-general] Re: Registration of Shared memory with read only
	permission fails
In-Reply-To: <477768FC.9060100@dev.mellanox.co.il>
References: <477768FC.9060100@dev.mellanox.co.il>
Message-ID: <1199085339.21275.221.camel@mtls03>

I don't know why it fails but have the following thoughts:
1. Does the key you use represent an existing shared memory region? If
not, then you have a read-only memory region that no one has ever touched or
can touch.
2. Can you try and see what happens if you try to access the shared
memory region? Read it?

On Sun, 2007-12-30 at 11:46 +0200, Dotan Barak wrote:
> key


From eli at mellanox.co.il  Mon Dec 31 02:44:44 2007
From: eli at mellanox.co.il (Eli Cohen)
Date: Mon, 31 Dec 2007 12:44:44 +0200
Subject: [ofa-general] [PATCH] ib/ipoib: Reduce comparison size in data path
Message-ID: <1199097884.21275.242.camel@mtls03>

In the majority of cases, a change of neighbour is reflected in the
GUID part of the GID (bytes 8-15). If the GID prefix (bytes 0-7)
changes as well, it is because the master SM has changed, in which
case we get an SM change event that results in all paths being
flushed.

Signed-off-by: Eli Cohen <eli at mellanox.co.il>
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index c9f6077..e9a4f96 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -693,10 +693,9 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 				goto out;
 			}
 		} else if (neigh->ah) {
-			if (unlikely((memcmp(&neigh->dgid.raw,
-					    skb->dst->neighbour->ha + 4,
-					    sizeof(union ib_gid))) ||
-					 (neigh->dev != dev))) {
+			if (unlikely((memcmp(&neigh->dgid.raw[8],
+					     skb->dst->neighbour->ha + 12, 8)) ||
+					     (neigh->dev != dev))) {
 				spin_lock(&priv->lock);
 				/*
 				 * It's safe to call ipoib_put_ah() inside
-- 
1.5.3.6
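
The effect of the narrower comparison can be illustrated outside the driver. In this sketch, `gid_differs` and `gid_guid_differs` are hypothetical helpers rather than kernel functions; a GID is modelled as a plain 16-byte array, with bytes 0-7 as the prefix and bytes 8-15 as the GUID, as described in the commit message above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN         16
#define GID_GUID_OFFSET 8	/* GUID occupies bytes 8-15 of the GID */

/* Full 16-byte comparison, as in the code being replaced. */
static int gid_differs(const uint8_t a[GID_LEN], const uint8_t b[GID_LEN])
{
	return memcmp(a, b, GID_LEN) != 0;
}

/* Compare only the 8-byte GUID part, as in the patched code. */
static int gid_guid_differs(const uint8_t a[GID_LEN],
			    const uint8_t b[GID_LEN])
{
	return memcmp(a + GID_GUID_OFFSET, b + GID_GUID_OFFSET,
		      GID_LEN - GID_GUID_OFFSET) != 0;
}
```

Two GIDs that differ only in the prefix compare as different with the first helper and as equal with the second. That is the case the commit message expects to be covered by the SM change event.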


From vlad at lists.openfabrics.org  Mon Dec 31 03:08:57 2007
From: vlad at lists.openfabrics.org (Vladimir Sokolovsky Mellanox)
Date: Mon, 31 Dec 2007 03:08:57 -0800 (PST)
Subject: [ofa-general] ofa_1_3_kernel 20071231-0200 daily build status
Message-ID: <20071231110857.D498DE60289@openfabrics.org>

This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel

Common build parameters:   --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod  --with-rds-mod --with-cxgb3-mod --with-nes-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on x86_64 with linux-2.6.19
Passed on ppc64 with linux-2.6.15
Passed on x86_64 with linux-2.6.18
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.18
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.16
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.19
Passed on powerpc with linux-2.6.14
Passed on ia64 with linux-2.6.14
Passed on ppc64 with linux-2.6.19
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.13
Passed on powerpc with linux-2.6.15
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.13
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.15
Passed on ia64 with linux-2.6.22
Passed on x86_64 with linux-2.6.16.43-0.3-smp
Passed on x86_64 with linux-2.6.9-55.ELsmp
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.23
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on x86_64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on x86_64 with linux-2.6.22.5-31-default
Passed on ppc64 with linux-2.6.18-8.el5
Passed on x86_64 with linux-2.6.22
Passed on ia64 with linux-2.6.16.21-0.8-default
Passed on x86_64 with linux-2.6.18-53.el5

Failed:


From jackm at dev.mellanox.co.il  Mon Dec 31 03:39:40 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Mon, 31 Dec 2007 13:39:40 +0200
Subject: [ofa-general] [RFC] XRC -- make receiving XRC QP independent of
	any one user process
Message-ID: <200712311339.41166.jackm@dev.mellanox.co.il>

> Tang, Changqing wrote:
> >         If I have a MPI server processes on a node, many other MPI
> > client processes will dynamically connect/disconnect with
> > the server. The server use same XRC domain.
> >
> >         Will this cause accumulating the "kernel" QP for such
> > application ? we want the server to run 365 days a year.
> >
> > I have some question about the scenario above. Did you call
> > for the mpi disconnect on the both ends (server/client)
> > before the client exit (did we must to do it?)
> 
> Yes, both ends will call disconnect. But for us, MPI_Comm_disconnect() call
> is not a collective call, it is just a local operation.
> 
> --CQ
>
Possible solution (internal review as yet):

  Each user process registers with the XRC QP:
    a. each process registers ONCE. If it registers multiple times, there is no reference increment --
       rather the registration succeeds, but only one PID entry is kept per QP.
    b. Can have cleanup in the event of a process dying suddenly.
    c. QP cannot be destroyed while there are any user processes still registered with it.

libibverbs API is as follows:

======================================================================================
/**
 * ibv_xrc_rcv_qp_alloc - creates an XRC QP for serving as a receive-side only QP,
 *	and moves the created qp through the RESET->INIT and INIT->RTR transitions.
 *      (The RTR->RTS transition is not needed, since this QP does no sending).
 * 	The sending XRC QP uses this QP as destination, while specifying an XRC SRQ
 * 	for actually receiving the transmissions and generating all completions on the
 *	receiving side.
 *
 * 	This QP is created in kernel space, and persists until the last process registered
 *      for the QP calls ibv_xrc_rcv_qp_unregister() (at which time the QP is destroyed).
 *
 * @pd: protection domain to use.  At lower layer, this provides access to userspace obj
 * @xrc_domain: xrc domain to use for the QP.
 * @attr: modify-qp attributes needed to bring the QP to RTR.
 * @attr_mask:  bitmap indicating which attributes are provided in the attr struct.
 * 	used for validity checking.
 * @xrc_rcv_qpn: qp_num of created QP (if success). To be passed to the remote node (sender).
 *		 The remote node will use xrc_rcv_qpn in ibv_post_send when sending to
 *		 XRC SRQ's on this host in the same xrc domain.
 *
 * RETURNS: success (0), or a (negative) error value.
 *
 * NOTE: this verb also registers the calling user-process with the QP at its creation time
 *       (implicit call to ibv_xrc_rcv_qp_register), to avoid race conditions.
 *       The creating process will need to call ibv_xrc_rcv_qp_unregister() for the QP to release it from
 *       this process.
 */

int ibv_xrc_rcv_qp_alloc(struct ibv_pd *pd,
			 struct ibv_xrc_domain *xrc_domain,
			 struct ibv_qp_attr *attr,
			 enum ibv_qp_attr_mask attr_mask,
			 uint32_t *xrc_rcv_qpn);

=====================================================================

/**
 * ibv_xrc_rcv_qp_register: registers a user process with an XRC QP which serves as
 *         a receive-side only QP.
 *
 * @xrc_domain: xrc domain the QP belongs to (for verification).
 * @xrc_qp_num: The (24 bit) number of the XRC QP.
 *
 * RETURNS: success (0),
 *          or error (-EINVAL), if:
 *            1. There is no such QP_num allocated.
 *            2. The QP is allocated, but is not a receive XRC QP
 *            3. The XRC QP does not belong to the given domain.
 */
int ibv_xrc_rcv_qp_register(struct ibv_xrc_domain *xrc_domain, uint32_t xrc_qp_num);

=====================================================================
/**
 * ibv_xrc_rcv_qp_unregister: detaches a user process from an XRC QP serving as
 *         a receive-side only QP. If as a result, there are no remaining userspace processes
 *	   registered for this XRC QP, it is destroyed.
 *
 * @xrc_domain: xrc domain the QP belongs to (for verification).
 * @xrc_qp_num: The (24 bit) number of the XRC QP.
 *
 * RETURNS: success (0), 
 *          or error (-EINVAL), if:
 *            1. There is no such QP_num allocated.
 *            2. The QP is allocated, but is not an XRC QP
 *            3. The XRC QP does not belong to the given domain.
 * NOTE: I don't see any reason to return a special code if the QP is destroyed -- the unregister simply
 *       succeeds.
 */
int ibv_xrc_rcv_qp_unregister(struct ibv_xrc_domain *xrc_domain, uint32_t xrc_qp_num);
=============================================================================================

Usage:

1. Sender creates an XRC QP (sending QP)
2. Sender sends some receiving process on a remote node (say R1) a request to provide an XRC QP and XRC SRQ for
   receiving messages (the request includes the sending QP number).
3. R1 calls ibv_xrc_rcv_qp_alloc() to create a receiving XRC QP in kernel space, and move
   that QP up to RTR state. This function also registers process R1 with the XRC QP.
4. R1 calls ibv_create_xrc_srq() to create an SRQ for receive messages via the just created XRC QP.
5. R1 responds to request, providing the XRC qp number, and XRC SRQ number to be used in communication.
6. Sender then may wish to communicate with another receiving process on the remote host (say R2).
   It sends a request to R2 containing the remote XRC QP number (obtained from R1)
   which it will use to send messages.
7. R2 creates an XRC SRQ (if one does not already exist for the domain), and also
   calls ibv_xrc_rcv_qp_register() to register the process R2 with the XRC QP created by R1.
8. If R1 no longer needs to communicate with the sender, it calls ibv_xrc_rcv_qp_unregister() for the QP.
   The QP will not yet be destroyed, since R2 is still registered with it.
9. If R2 no longer needs to communicate with the sender, it calls ibv_xrc_rcv_qp_unregister() for the QP.
   At this point, the QP is destroyed, since no processes remain registered with it.
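
The registration rules in steps 3 and 7-9 can be modelled with a small userspace sketch. It is not the proposed kernel implementation: `struct rcv_qp_model` and the `model_*` functions are hypothetical, the table size is arbitrary, and returning -1 when unregistering a PID that was never registered is an assumption, since the RFC does not specify that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REG 8	/* arbitrary table size for this sketch */

/* Hypothetical model of one receive-side XRC QP and its registered PIDs. */
struct rcv_qp_model {
	bool alive;
	int pids[MAX_REG];
	size_t npids;
};

/* Allocation implicitly registers the creating process. */
static void model_alloc(struct rcv_qp_model *qp, int creator_pid)
{
	qp->alive = true;
	qp->pids[0] = creator_pid;
	qp->npids = 1;
}

/* Idempotent: a second registration succeeds but adds no entry. */
static int model_register(struct rcv_qp_model *qp, int pid)
{
	size_t i;

	if (!qp->alive)
		return -1;
	for (i = 0; i < qp->npids; i++)
		if (qp->pids[i] == pid)
			return 0;
	if (qp->npids == MAX_REG)
		return -1;
	qp->pids[qp->npids++] = pid;
	return 0;
}

/* Removing the last registered PID destroys the QP. */
static int model_unregister(struct rcv_qp_model *qp, int pid)
{
	size_t i;

	if (!qp->alive)
		return -1;
	for (i = 0; i < qp->npids; i++) {
		if (qp->pids[i] == pid) {
			qp->pids[i] = qp->pids[--qp->npids];
			if (qp->npids == 0)
				qp->alive = false;
			return 0;
		}
	}
	return -1;	/* assumed behaviour for an unknown PID */
}
```

Walking through the usage above with this model: R1 allocates, R2 registers, R1 unregisters and the QP survives, then R2 unregisters and the QP is gone.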

NOTES:
1. The problem of the QP being destroyed and quickly re-allocated does not exist -- the upper bits of the
   QP number are incremented at each allocation (except for the MSB which is always 1 for XRC QPs).  Thus,
   even if the same QP is re-allocated, its QP number (stored in the QP object) will be different than
   expected (unless it is re-destroyed/re-allocated several hundred times).

2. With this model, we do not need a heartbeat: if a receiving process dies, all XRC QPs it has registered for will
   be unregistered as part of process cleanup in kernel space.
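
Note 1 can be illustrated with a sketch of a QP number that carries a generation counter. The field split is an assumption made only for this example (the text above says only that the upper bits are incremented and the MSB is always 1): a 24-bit number with bit 23 set, 8 generation bits, and 15 index bits. `make_qpn` and `realloc_qpn` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define QPN_BITS   24
#define XRC_MSB    (1u << (QPN_BITS - 1))	/* always set for XRC QPs */
#define INDEX_BITS 15				/* assumed split */
#define GEN_BITS   (QPN_BITS - 1 - INDEX_BITS)	/* 8 bits in this sketch */
#define INDEX_MASK ((1u << INDEX_BITS) - 1)
#define GEN_MASK   ((1u << GEN_BITS) - 1)

/* Build a QP number from a slot index and a generation counter. */
static uint32_t make_qpn(uint32_t index, uint32_t generation)
{
	return XRC_MSB | ((generation & GEN_MASK) << INDEX_BITS) |
	       (index & INDEX_MASK);
}

/* Re-allocating the same slot bumps the generation, so the number changes. */
static uint32_t realloc_qpn(uint32_t old_qpn)
{
	uint32_t gen = (old_qpn >> INDEX_BITS) & GEN_MASK;

	return make_qpn(old_qpn & INDEX_MASK, gen + 1);
}
```

With 8 generation bits the same number reappears only after 256 re-allocations of the same slot, in line with the "several hundred times" remark in note 1.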

- Jack


From erezz at Voltaire.COM  Mon Dec 31 05:52:19 2007
From: erezz at Voltaire.COM (Erez Zilber)
Date: Mon, 31 Dec 2007 15:52:19 +0200
Subject: [ofa-general] [PATCH] IB/iser: update url of iSER docs
In-Reply-To: <4767D0B6.1030708@voltaire.com>
References: <4767D0B6.1030708@voltaire.com>
Message-ID: <4778F413.6080400@Voltaire.COM>

Erez Zilber wrote:
> The url for iSER docs in Kconfig has changed.
>
> Signed-off-by: Erez Zilber <erezz at voltaire.com>
> ---
>  drivers/infiniband/ulp/iser/Kconfig |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/iser/Kconfig b/drivers/infiniband/ulp/iser/Kconfig
> index fe604c8..77dedba 100644
> --- a/drivers/infiniband/ulp/iser/Kconfig
> +++ b/drivers/infiniband/ulp/iser/Kconfig
> @@ -8,5 +8,5 @@ config INFINIBAND_ISER
>            that speak iSCSI over iSER over InfiniBand.
>  
>  	  The iSER protocol is defined by IETF.
> -	  See <http://www.ietf.org/internet-drafts/draft-ietf-ips-iser-05.txt>
> -	  and <http://www.infinibandta.org/members/spec/iser_annex_060418.pdf>
> +	  See <http://www.ietf.org/rfc/rfc5046.txt>
> +	  and <http://www.infinibandta.org/members/spec/Annex_iSER.PDF>
>   

Roland,

Did you get this patch? I sent it ~2 weeks ago, and it should go into
2.6.25.

Erez


From hrosenstock at xsigo.com  Mon Dec 31 07:31:23 2007
From: hrosenstock at xsigo.com (Hal Rosenstock)
Date: Mon, 31 Dec 2007 07:31:23 -0800
Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <20071230181610.GC10650@sashak.voltaire.com>
References: <4770CDCE.8040200@dev.mellanox.co.il>
	<20071229182718.GA19160@sashak.voltaire.com>
	<1199032710.23289.340.camel@hrosenstock-ws.xsigo.com>
	<20071230181610.GC10650@sashak.voltaire.com>
Message-ID: <1199115083.23289.359.camel@hrosenstock-ws.xsigo.com>

On Sun, 2007-12-30 at 18:16 +0000, Sasha Khapyorsky wrote:
> On 08:38 Sun 30 Dec     , Hal Rosenstock wrote:
> > On Sat, 2007-12-29 at 18:27 +0000, Sasha Khapyorsky wrote:
> > > This improves handling of mcast join/leave requests storming. Now mcast
> > > routing will be recalculated for all mcast groups where changes occurred
> > > and not one by one. For this it queues mcast groups instead of mcast
> > > rerouting requests, this also makes state_mgr idle queue obsolete.
> > 
> > Looks like a nice improvement.
> > 
> > What testing has been done with this change ? Can you comment on any
> > results ?
> 
> osmtest, basic ipoib, SA db and MFTs dump diffs. Didn't find any
> problem.

What size topologies ? real and/or simulated ?

> > For which branches is this change being proposed ?
> 
> I think it should go to OFED 1.3.

Perhaps if there is sufficient soak time on real life topologies and
other torture tests for this.

-- Hal

> Sasha
> 
> > 
> > -- Hal
> > 
> > > Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> > > ---
> > > 
> > > Hi Yevgeny,
> > > 
> > > For me it looks that it should solve the original problem (mcast group
> > > list is purged in osm_mcast_mgr_process()). Could you review and ideally
> > > test it? Thanks.
> > > 
> > > Sasha
> > > 
> > > ---
> > >  opensm/include/opensm/osm_mcast_mgr.h |   14 +--
> > >  opensm/include/opensm/osm_multicast.h |    2 +
> > >  opensm/include/opensm/osm_sm.h        |    2 +
> > >  opensm/include/opensm/osm_state_mgr.h |   95 -----------------
> > >  opensm/opensm/osm_mcast_mgr.c         |  187 +++++++++++++++------------------
> > >  opensm/opensm/osm_sm.c                |   70 ++++++-------
> > >  opensm/opensm/osm_state_mgr.c         |  138 +------------------------
> > >  7 files changed, 130 insertions(+), 378 deletions(-)
> > > 
> > > diff --git a/opensm/include/opensm/osm_mcast_mgr.h b/opensm/include/opensm/osm_mcast_mgr.h
> > > index 3e0b761..47b67ed 100644
> > > --- a/opensm/include/opensm/osm_mcast_mgr.h
> > > +++ b/opensm/include/opensm/osm_mcast_mgr.h
> > > @@ -100,7 +100,6 @@ typedef struct _osm_mcast_mgr {
> > >  	osm_req_t *p_req;
> > >  	osm_log_t *p_log;
> > >  	cl_plock_t *p_lock;
> > > -
> > >  } osm_mcast_mgr_t;
> > >  /*
> > >  * FIELDS
> > > @@ -253,25 +252,22 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr);
> > >  *	Multicast Manager, Node Info Response Controller
> > >  *********/
> > >  
> > > -/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgrp_cb
> > > +/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgroups
> > >  * NAME
> > > -*	osm_mcast_mgr_process_mgrp_cb
> > > +*	osm_mcast_mgr_process_mgroups
> > >  *
> > >  * DESCRIPTION
> > > -*	Callback entry point for the osm_mcast_mgr_process_mgrp function.
> > > +*	Process only requested mcast groups.
> > >  *
> > >  * SYNOPSIS
> > >  */
> > >  osm_signal_t
> > > -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2);
> > > +osm_mcast_mgr_process_mgroups(IN osm_mcast_mgr_t *p_mgr);
> > >  /*
> > >  * PARAMETERS
> > > -*	(Context1) p_mgr
> > > +*	p_mgr
> > >  *		[in] Pointer to an osm_mcast_mgr_t object.
> > >  *
> > > -*	(Context2) p_mgrp
> > > -*		[in] Pointer to the multicast group to process.
> > > -*
> > >  * RETURN VALUES
> > >  *	IB_SUCCESS
> > >  *
> > > diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h
> > > index 729a2ea..f442a45 100644
> > > --- a/opensm/include/opensm/osm_multicast.h
> > > +++ b/opensm/include/opensm/osm_multicast.h
> > > @@ -50,6 +50,7 @@
> > >  
> > >  #include <iba/ib_types.h>
> > >  #include <complib/cl_qmap.h>
> > > +#include <complib/cl_qlist.h>
> > >  #include <complib/cl_spinlock.h>
> > >  #include <opensm/osm_base.h>
> > >  #include <opensm/osm_mtree.h>
> > > @@ -121,6 +122,7 @@ const char *osm_get_mcast_req_type_str(IN osm_mcast_req_type_t req_type);
> > >  * SYNOPSIS
> > >  */
> > >  typedef struct osm_mcast_mgr_ctxt {
> > > +	cl_list_item_t list_item;
> > >  	ib_net16_t mlid;
> > >  	osm_mcast_req_type_t req_type;
> > >  	ib_net64_t port_guid;
> > > diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h
> > > index 4c6ce27..a676cd6 100644
> > > --- a/opensm/include/opensm/osm_sm.h
> > > +++ b/opensm/include/opensm/osm_sm.h
> > > @@ -140,6 +140,8 @@ typedef struct osm_sm {
> > >  	cl_dispatcher_t *p_disp;
> > >  	cl_plock_t *p_lock;
> > >  	atomic32_t sm_trans_id;
> > > +	cl_spinlock_t mgrp_lock;
> > > +	cl_qlist_t mgrp_list;
> > >  	osm_req_t req;
> > >  	osm_resp_t resp;
> > >  	osm_ni_rcv_t ni_rcv;
> > > diff --git a/opensm/include/opensm/osm_state_mgr.h b/opensm/include/opensm/osm_state_mgr.h
> > > index dada097..f51593a 100644
> > > --- a/opensm/include/opensm/osm_state_mgr.h
> > > +++ b/opensm/include/opensm/osm_state_mgr.h
> > > @@ -109,8 +109,6 @@ typedef struct _osm_state_mgr {
> > >  	osm_stats_t *p_stats;
> > >  	struct _osm_sm_state_mgr *p_sm_state_mgr;
> > >  	const osm_sm_mad_ctrl_t *p_mad_ctrl;
> > > -	cl_spinlock_t idle_lock;
> > > -	cl_qlist_t idle_time_list;
> > >  	cl_plock_t *p_lock;
> > >  	cl_event_t *p_subnet_up_event;
> > >  	osm_sm_state_t state;
> > > @@ -172,99 +170,6 @@ typedef struct _osm_state_mgr {
> > >  *	State Manager object
> > >  *********/
> > >  
> > > -/****s* OpenSM: State Manager/_osm_idle_item
> > > -* NAME
> > > -*	_osm_idle_item
> > > -*
> > > -* DESCRIPTION
> > > -*	Idle item.
> > > -*
> > > -* SYNOPSIS
> > > -*/
> > > -
> > > -typedef osm_signal_t(*osm_pfn_start_t) (IN void *context1, IN void *context2);
> > > -
> > > -typedef void
> > > - (*osm_pfn_done_t) (IN void *context1, IN void *context2);
> > > -
> > > -typedef struct _osm_idle_item {
> > > -	cl_list_item_t list_item;
> > > -	void *context1;
> > > -	void *context2;
> > > -	osm_pfn_start_t pfn_start;
> > > -	osm_pfn_done_t pfn_done;
> > > -} osm_idle_item_t;
> > > -
> > > -/*
> > > -* FIELDS
> > > -*	list_item
> > > -*		list item.
> > > -*
> > > -*	context1
> > > -*		Context pointer
> > > -*
> > > -*	context2
> > > -*		Context pointer
> > > -*
> > > -*	pfn_start
> > > -*		Pointer to the start function.
> > > -*
> > > -*	pfn_done
> > > -*		Pointer to the dine function.
> > > -* SEE ALSO
> > > -*	State Manager object
> > > -*********/
> > > -
> > > -/****f* OpenSM: State Manager/osm_state_mgr_process_idle
> > > -* NAME
> > > -*	osm_state_mgr_process_idle
> > > -*
> > > -* DESCRIPTION
> > > -*	Formulates the osm_idle_item and inserts it into the queue and
> > > -*	signals the state manager.
> > > -*
> > > -* SYNOPSIS
> > > -*/
> > > -
> > > -ib_api_status_t
> > > -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
> > > -			   IN osm_pfn_start_t pfn_start,
> > > -			   IN osm_pfn_done_t pfn_done,
> > > -			   void *context1, void *context2);
> > > -
> > > -/*
> > > -* PARAMETERS
> > > -*	p_mgr
> > > -*		[in] Pointer to a State Manager object to construct.
> > > -*
> > > -*	pfn_start
> > > -*		[in] Pointer the start function which will be called at
> > > -*			idle time.
> > > -*
> > > -*	pfn_done
> > > -*		[in] pointer the done function which will be called
> > > -*			when outstanding smps is zero
> > > -*
> > > -*	context1
> > > -*		[in] Pointer to void
> > > -*
> > > -*	context2
> > > -*		[in] Pointer to void
> > > -*
> > > -* RETURN VALUE
> > > -*	IB_SUCCESS or IB_ERROR
> > > -*
> > > -* NOTES
> > > -*	Allows osm_state_mgr_destroy
> > > -*
> > > -*	Calling osm_state_mgr_construct is a prerequisite to calling any other
> > > -*	method except osm_state_mgr_init.
> > > -*
> > > -* SEE ALSO
> > > -*	State Manager object, osm_state_mgr_init,
> > > -*	osm_state_mgr_destroy
> > > -*********/
> > > -
> > >  /****f* OpenSM: State Manager/osm_state_mgr_construct
> > >  * NAME
> > >  *	osm_state_mgr_construct
> > > diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
> > > index 50b95fd..f51a45a 100644
> > > --- a/opensm/opensm/osm_mcast_mgr.c
> > > +++ b/opensm/opensm/osm_mcast_mgr.c
> > > @@ -815,7 +815,7 @@ static osm_mtree_node_t *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr,
> > >  	}
> > >  
> > >  	free(list_array);
> > > -      Exit:
> > > +Exit:
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > >  	return (p_mtn);
> > >  }
> > > @@ -932,7 +932,7 @@ __osm_mcast_mgr_build_spanning_tree(osm_mcast_mgr_t * const p_mgr,
> > >  		"Configured MLID 0x%X for %u ports, max tree depth = %u\n",
> > >  		cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth);
> > >  
> > > -      Exit:
> > > +Exit:
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > >  	return (status);
> > >  }
> > > @@ -1171,7 +1171,7 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t * const p_mgr,
> > >  		}
> > >  	}
> > >  
> > > -      Exit:
> > > +Exit:
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > >  	return (status);
> > >  }
> > > @@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t * const p_mgr,
> > >  							   port_guid);
> > >  	}
> > >  
> > > -      Exit:
> > > +Exit:
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > >  	return (status);
> > >  }
> > >  
> > >  /**********************************************************************
> > >   Process the entire group.
> > > -
> > >   NOTE : The lock should be held externally!
> > >   **********************************************************************/
> > > -static osm_signal_t
> > > -osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
> > > -			   IN osm_mgrp_t * const p_mgrp,
> > > -			   IN osm_mcast_req_type_t req_type,
> > > -			   IN ib_net64_t port_guid)
> > > +static ib_api_status_t
> > > +mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
> > > +		       IN osm_mgrp_t * const p_mgrp,
> > > +		       IN osm_mcast_req_type_t req_type,
> > > +		       IN ib_net64_t port_guid)
> > >  {
> > > -	osm_signal_t signal = OSM_SIGNAL_DONE;
> > >  	ib_api_status_t status;
> > > -	osm_switch_t *p_sw;
> > > -	cl_qmap_t *p_sw_tbl;
> > > -	boolean_t pending_transactions = FALSE;
> > >  
> > >  	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp);
> > >  
> > > -	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> > > -
> > >  	status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type, port_guid);
> > >  	if (status != IB_SUCCESS) {
> > >  		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > > -			"osm_mcast_mgr_process_mgrp: ERR 0A19: "
> > > +			"mcast_mgr_process_mgrp: ERR 0A19: "
> > >  			"Unable to create spanning tree (%s)\n",
> > >  			ib_get_err_str(status));
> > > -
> > >  		goto Exit;
> > >  	}
> > > +	p_mgrp->last_tree_id = p_mgrp->last_change_id;
> > >  
> > > -	/*
> > > -	   Walk the switches and download the tables for each.
> > > +	/* Remove MGRP only if osm_mcm_port_t count is 0 and
> > > +	 * Not a well known group
> > >  	 */
> > > -	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
> > > -	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
> > > -		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> > > -		if (signal == OSM_SIGNAL_DONE_PENDING)
> > > -			pending_transactions = TRUE;
> > > -
> > > -		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> > > +	if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known) {
> > > +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > > +			"mcast_mgr_process_mgrp: "
> > > +			"Destroying mgrp with lid:0x%X\n",
> > > +			cl_ntoh16(p_mgrp->mlid));
> > > +		/* Send a Report to any InformInfo registered for
> > > +		   Trap 67 : MCGroup delete */
> > > +		osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
> > > +					    p_mgrp);
> > > +		cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
> > > +				    (cl_map_item_t *) p_mgrp);
> > > +		osm_mgrp_delete(p_mgrp);
> > >  	}
> > >  
> > > -	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
> > > -
> > > -      Exit:
> > > +Exit:
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > > -
> > > -	if (pending_transactions == TRUE)
> > > -		return (OSM_SIGNAL_DONE_PENDING);
> > > -	else
> > > -		return (OSM_SIGNAL_DONE);
> > > +	return status;
> > >  }
> > >  
> > >  /**********************************************************************
> > > @@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
> > >  	osm_switch_t *p_sw;
> > >  	cl_qmap_t *p_sw_tbl;
> > >  	cl_qmap_t *p_mcast_tbl;
> > > +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
> > >  	osm_mgrp_t *p_mgrp;
> > > -	ib_api_status_t status;
> > >  	boolean_t pending_transactions = FALSE;
> > >  
> > >  	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process);
> > >  
> > >  	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> > > -
> > >  	p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl;
> > >  	/*
> > >  	   While holding the lock, iterate over all the established
> > > @@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
> > >  		/* We reached here due to some change that caused a heavy sweep
> > >  		   of the subnet. Not due to a specific multicast request.
> > >  		   So the request type is subnet_change and the port guid is 0. */
> > > -		status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp,
> > > -						    OSM_MCAST_REQ_TYPE_SUBNET_CHANGE,
> > > -						    0);
> > > -		if (status != IB_SUCCESS) {
> > > -			osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > > -				"osm_mcast_mgr_process: ERR 0A20: "
> > > -				"Unable to create spanning tree (%s)\n",
> > > -				ib_get_err_str(status));
> > > -		}
> > > -
> > > +		mcast_mgr_process_mgrp(p_mgr, p_mgrp,
> > > +				       OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0);
> > >  		p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item);
> > >  	}
> > >  
> > > @@ -1364,10 +1347,14 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
> > >  		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> > >  		if (signal == OSM_SIGNAL_DONE_PENDING)
> > >  			pending_transactions = TRUE;
> > > -
> > >  		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> > >  	}
> > >  
> > > +	while (!cl_is_qlist_empty(p_list)) {
> > > +		cl_list_item_t *p = cl_qlist_remove_head(p_list);
> > > +		free(p);
> > > +	}
> > > +
> > >  	CL_PLOCK_RELEASE(p_mgr->p_lock);
> > >  
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > > @@ -1395,79 +1382,79 @@ osm_mgrp_t *__get_mgrp_by_mlid(IN osm_mcast_mgr_t * const p_mgr,
> > >  
> > >  /**********************************************************************
> > >    This is the function that is invoked during idle time to handle the
> > > -  process request. Context1 is simply the osm_mcast_mgr_t*, Context2
> > > -  hold the mlid, port guid and action (join/leave/delete) required.
> > > +  process request for mcast groups where join/leave/delete was required.
> > >   **********************************************************************/
> > > -osm_signal_t
> > > -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2)
> > > +osm_signal_t osm_mcast_mgr_process_mgroups(osm_mcast_mgr_t * p_mgr)
> > >  {
> > > -	osm_mcast_mgr_t *p_mgr = (osm_mcast_mgr_t *) Context1;
> > > +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
> > > +	osm_switch_t *p_sw;
> > > +	cl_qmap_t *p_sw_tbl;
> > >  	osm_mgrp_t *p_mgrp;
> > >  	ib_net16_t mlid;
> > > -	osm_signal_t signal = OSM_SIGNAL_DONE;
> > > -	osm_mcast_mgr_ctxt_t *p_ctxt = (osm_mcast_mgr_ctxt_t *) Context2;
> > > -	osm_mcast_req_type_t req_type = p_ctxt->req_type;
> > > -	ib_net64_t port_guid = p_ctxt->port_guid;
> > > -
> > > -	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp_cb);
> > > -
> > > -	/* nice copy no warning on size diff */
> > > -	memcpy(&mlid, &p_ctxt->mlid, sizeof(mlid));
> > > +	osm_signal_t ret, signal = OSM_SIGNAL_DONE;
> > > +	osm_mcast_mgr_ctxt_t *ctx;
> > > +	osm_mcast_req_type_t req_type;
> > > +	ib_net64_t port_guid;
> > >  
> > > -	/* we can destroy the context now */
> > > -	free(p_ctxt);
> > > +	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgroups);
> > >  
> > >  	/* we need a lock to make sure the p_mgrp is not change other ways */
> > >  	CL_PLOCK_EXCL_ACQUIRE(p_mgr->p_lock);
> > > -	p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
> > >  
> > > -	/* since we delayed the execution we prefer to pass the
> > > -	   mlid as the mgrp identifier and then find it or abort */
> > > +	if (cl_is_qlist_empty(p_list)) {
> > > +		CL_PLOCK_RELEASE(p_mgr->p_lock);
> > > +		return OSM_SIGNAL_NONE;
> > > +	}
> > > +
> > > +	while (!cl_is_qlist_empty(p_list)) {
> > > +		ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list);
> > > +		req_type = ctx->req_type;
> > > +		port_guid = ctx->port_guid;
> > > +
> > > +		/* nice copy no warning on size diff */
> > > +		memcpy(&mlid, &ctx->mlid, sizeof(mlid));
> > >  
> > > -	if (p_mgrp) {
> > > +		/* we can destroy the context now */
> > > +		free(ctx);
> > > +
> > > +		/* since we delayed the execution we prefer to pass the
> > > +		   mlid as the mgrp identifier and then find it or abort */
> > > +		p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
> > > +		if (!p_mgrp)
> > > +			continue;
> > >  
> > > -		/* if there was no change from the last time we processed the group
> > > -		   we can skip doing anything
> > > +		/* if there was no change from the last time
> > > +		 * we processed the group we can skip doing anything
> > >  		 */
> > >  		if (p_mgrp->last_change_id == p_mgrp->last_tree_id) {
> > >  			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > > -				"osm_mcast_mgr_process_mgrp_cb: "
> > > +				"osm_mcast_mgr_process_mgroups: "
> > >  				"Skip processing mgrp with lid:0x%X change id:%u\n",
> > >  				cl_ntoh16(mlid), p_mgrp->last_change_id);
> > > -		} else {
> > > -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > > -				"osm_mcast_mgr_process_mgrp_cb: "
> > > -				"Processing mgrp with lid:0x%X change id:%u\n",
> > > -				cl_ntoh16(mlid), p_mgrp->last_change_id);
> > > -
> > > -			signal =
> > > -			    osm_mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type,
> > > -						       port_guid);
> > > -			p_mgrp->last_tree_id = p_mgrp->last_change_id;
> > > +			continue;
> > >  		}
> > >  
> > > -		/* Remove MGRP only if osm_mcm_port_t count is 0 and
> > > -		 * Not a well known group
> > > -		 */
> > > -		if ((0x0 == cl_qmap_count(&p_mgrp->mcm_port_tbl)) &&
> > > -		    (p_mgrp->well_known == FALSE)) {
> > > -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > > -				"osm_mcast_mgr_process_mgrp_cb: "
> > > -				"Destroying mgrp with lid:0x%X\n",
> > > -				cl_ntoh16(mlid));
> > > -
> > > -			/* Send a Report to any InformInfo registered for
> > > -			   Trap 67 : MCGroup delete */
> > > -			osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
> > > -						    p_mgrp);
> > > -
> > > -			cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
> > > -					    (cl_map_item_t *) p_mgrp);
> > > +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> > > +			"osm_mcast_mgr_process_mgroups: "
> > > +			"Processing mgrp with lid:0x%X change id:%u\n",
> > > +			cl_ntoh16(mlid), p_mgrp->last_change_id);
> > > +		mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, port_guid);
> > > +	}
> > >  
> > > -			osm_mgrp_delete(p_mgrp);
> > > -		}
> > > +	/*
> > > +	   Walk the switches and download the tables for each.
> > > +	 */
> > > +	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> > > +	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
> > > +	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
> > > +		ret = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> > > +		if (ret == OSM_SIGNAL_DONE_PENDING)
> > > +			signal = ret;
> > > +		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> > >  	}
> > >  
> > > +	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
> > > +
> > >  	CL_PLOCK_RELEASE(p_mgr->p_lock);
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > >  	return signal;
> > > diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c
> > > index 88e6d4a..b295a77 100644
> > > --- a/opensm/opensm/osm_sm.c
> > > +++ b/opensm/opensm/osm_sm.c
> > > @@ -144,6 +144,7 @@ void osm_sm_construct(IN osm_sm_t * const p_sm)
> > >  	cl_event_construct(&p_sm->signal_event);
> > >  	cl_event_construct(&p_sm->subnet_up_event);
> > >  	cl_thread_construct(&p_sm->sweeper);
> > > +	cl_spinlock_construct(&p_sm->mgrp_lock);
> > >  	osm_req_construct(&p_sm->req);
> > >  	osm_resp_construct(&p_sm->resp);
> > >  	osm_ni_rcv_construct(&p_sm->ni_rcv);
> > > @@ -245,6 +246,7 @@ void osm_sm_destroy(IN osm_sm_t * const p_sm)
> > >  	cl_event_destroy(&p_sm->signal_event);
> > >  	cl_event_destroy(&p_sm->subnet_up_event);
> > >  	cl_spinlock_destroy(&p_sm->signal_lock);
> > > +	cl_spinlock_destroy(&p_sm->mgrp_lock);
> > >  
> > >  	osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n");	/* Format Waived */
> > >  	OSM_LOG_EXIT(p_sm->p_log);
> > > @@ -292,6 +294,12 @@ osm_sm_init(IN osm_sm_t * const p_sm,
> > >  	if (status != CL_SUCCESS)
> > >  		goto Exit;
> > >  
> > > +	cl_qlist_init(&p_sm->mgrp_list);
> > > +
> > > +	status = cl_spinlock_init(&p_sm->mgrp_lock);
> > > +	if (status != CL_SUCCESS)
> > > +		goto Exit;
> > > +
> > >  	status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl,
> > >  				      p_sm->p_subn,
> > >  				      p_sm->p_mad_pool,
> > > @@ -551,32 +559,43 @@ osm_sm_bind(IN osm_sm_t * const p_sm, IN const ib_net64_t port_guid)
> > >  /**********************************************************************
> > >   **********************************************************************/
> > >  static ib_api_status_t
> > > -__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
> > > +__osm_sm_mgrp_process(IN osm_sm_t * const p_sm,
> > >  		      IN osm_mgrp_t * const p_mgrp,
> > >  		      IN const ib_net64_t port_guid,
> > >  		      IN osm_mcast_req_type_t req_type)
> > >  {
> > > -	ib_api_status_t status;
> > >  	osm_mcast_mgr_ctxt_t *ctx2;
> > >  
> > > -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_connect);
> > > -
> > >  	/*
> > >  	 * 'Schedule' all the QP0 traffic for when the state manager
> > >  	 * isn't busy trying to do something else.
> > >  	 */
> > >  	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
> > > +	if (!ctx2)
> > > +		return IB_ERROR;
> > > +	memset(ctx2, 0, sizeof(*ctx2));
> > >  	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
> > >  	ctx2->req_type = req_type;
> > >  	ctx2->port_guid = port_guid;
> > >  
> > > -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
> > > -					    osm_mcast_mgr_process_mgrp_cb,
> > > -					    NULL, &p_sm->mcast_mgr,
> > > -					    (void *)ctx2);
> > > +	cl_spinlock_acquire(&p_sm->mgrp_lock);
> > > +	cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx2->list_item);
> > > +	cl_spinlock_release(&p_sm->mgrp_lock);
> > >  
> > > -	OSM_LOG_EXIT(p_sm->p_log);
> > > -	return (status);
> > > +	osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
> > > +
> > > +	return IB_SUCCESS;
> > > +}
> > > +
> > > +/**********************************************************************
> > > + **********************************************************************/
> > > +static ib_api_status_t
> > > +__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
> > > +		      IN osm_mgrp_t * const p_mgrp,
> > > +		      IN const ib_net64_t port_guid,
> > > +		      IN osm_mcast_req_type_t req_type)
> > > +{
> > > +	return __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, req_type);
> > >  }
> > >  
> > >  /**********************************************************************
> > > @@ -586,31 +605,7 @@ __osm_sm_mgrp_disconnect(IN osm_sm_t * const p_sm,
> > >  			 IN osm_mgrp_t * const p_mgrp,
> > >  			 IN const ib_net64_t port_guid)
> > >  {
> > > -	ib_api_status_t status;
> > > -	osm_mcast_mgr_ctxt_t *ctx2;
> > > -
> > > -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_disconnect);
> > > -
> > > -	/*
> > > -	 * 'Schedule' all the QP0 traffic for when the state manager
> > > -	 * isn't busy trying to do something else.
> > > -	 */
> > > -	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
> > > -	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
> > > -	ctx2->req_type = OSM_MCAST_REQ_TYPE_LEAVE;
> > > -	ctx2->port_guid = port_guid;
> > > -
> > > -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
> > > -					    osm_mcast_mgr_process_mgrp_cb,
> > > -					    NULL, &p_sm->mcast_mgr, ctx2);
> > > -	if (status != IB_SUCCESS) {
> > > -		osm_log(p_sm->p_log, OSM_LOG_ERROR,
> > > -			"__osm_sm_mgrp_disconnect: ERR 2E11: "
> > > -			"Failure processing multicast group (%s)\n",
> > > -			ib_get_err_str(status));
> > > -	}
> > > -
> > > -	OSM_LOG_EXIT(p_sm->p_log);
> > > +	__osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, OSM_MCAST_REQ_TYPE_LEAVE);
> > >  }
> > >  
> > >  /**********************************************************************
> > > @@ -719,8 +714,8 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm,
> > >  		goto Exit;
> > >  	}
> > >  
> > > -	CL_PLOCK_RELEASE(p_sm->p_lock);
> > >  	status = __osm_sm_mgrp_connect(p_sm, p_mgrp, port_guid, req_type);
> > > +	CL_PLOCK_RELEASE(p_sm->p_lock);
> > >  
> > >        Exit:
> > >  	OSM_LOG_EXIT(p_sm->p_log);
> > > @@ -782,9 +777,8 @@ osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm,
> > >  
> > >  	osm_port_remove_mgrp(p_port, mlid);
> > >  
> > > -	CL_PLOCK_RELEASE(p_sm->p_lock);
> > > -
> > >  	__osm_sm_mgrp_disconnect(p_sm, p_mgrp, port_guid);
> > > +	CL_PLOCK_RELEASE(p_sm->p_lock);
> > >  
> > >        Exit:
> > >  	OSM_LOG_EXIT(p_sm->p_log);
> > > diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
> > > index 5c39f11..d4dd782 100644
> > > --- a/opensm/opensm/osm_state_mgr.c
> > > +++ b/opensm/opensm/osm_state_mgr.c
> > > @@ -76,7 +76,6 @@ osm_signal_t osm_qos_setup(IN osm_opensm_t * p_osm);
> > >  void osm_state_mgr_construct(IN osm_state_mgr_t * const p_mgr)
> > >  {
> > >  	memset(p_mgr, 0, sizeof(*p_mgr));
> > > -	cl_spinlock_construct(&p_mgr->idle_lock);
> > >  	p_mgr->state = OSM_SM_STATE_INIT;
> > >  }
> > >  
> > > @@ -88,9 +87,6 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const p_mgr)
> > >  
> > >  	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_destroy);
> > >  
> > > -	/* destroy the locks */
> > > -	cl_spinlock_destroy(&p_mgr->idle_lock);
> > > -
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > >  }
> > >  
> > > @@ -112,8 +108,6 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
> > >  		   IN cl_event_t * const p_subnet_up_event,
> > >  		   IN osm_log_t * const p_log)
> > >  {
> > > -	cl_status_t status;
> > > -
> > >  	OSM_LOG_ENTER(p_log, osm_state_mgr_init);
> > >  
> > >  	CL_ASSERT(p_subn);
> > > @@ -145,17 +139,8 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
> > >  	p_mgr->p_lock = p_lock;
> > >  	p_mgr->p_subnet_up_event = p_subnet_up_event;
> > >  
> > > -	cl_qlist_init(&p_mgr->idle_time_list);
> > > -
> > > -	status = cl_spinlock_init(&p_mgr->idle_lock);
> > > -	if (status != CL_SUCCESS) {
> > > -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > > -			"osm_state_mgr_init: ERR 3302: "
> > > -			"Spinlock init failed (%s)\n", CL_STATUS_MSG(status));
> > > -	}
> > > -
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > > -	return (status);
> > > +	return IB_SUCCESS;
> > >  }
> > >  
> > >  /**********************************************************************
> > > @@ -989,79 +974,6 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_state_mgr_t *
> > >  }
> > >  
> > >  /**********************************************************************
> > > - **********************************************************************/
> > > -static void __process_idle_time_queue_done(IN osm_state_mgr_t * const p_mgr)
> > > -{
> > > -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
> > > -	cl_list_item_t *p_list_item;
> > > -	osm_idle_item_t *p_process_item;
> > > -
> > > -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
> > > -
> > > -	cl_spinlock_acquire(&p_mgr->idle_lock);
> > > -	p_list_item = cl_qlist_remove_head(p_list);
> > > -
> > > -	if (p_list_item == cl_qlist_end(p_list)) {
> > > -		cl_spinlock_release(&p_mgr->idle_lock);
> > > -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > > -			"__process_idle_time_queue_done: ERR 3314: "
> > > -			"Idle time queue is empty\n");
> > > -		return;
> > > -	}
> > > -	cl_spinlock_release(&p_mgr->idle_lock);
> > > -
> > > -	p_process_item = (osm_idle_item_t *) p_list_item;
> > > -
> > > -	if (p_process_item->pfn_done) {
> > > -
> > > -		p_process_item->pfn_done(p_process_item->context1,
> > > -					 p_process_item->context2);
> > > -	}
> > > -
> > > -	free(p_process_item);
> > > -
> > > -	OSM_LOG_EXIT(p_mgr->p_log);
> > > -	return;
> > > -}
> > > -
> > > -/**********************************************************************
> > > - **********************************************************************/
> > > -static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t *
> > > -						    const p_mgr)
> > > -{
> > > -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
> > > -	cl_list_item_t *p_list_item;
> > > -	osm_idle_item_t *p_process_item;
> > > -	osm_signal_t signal;
> > > -
> > > -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_start);
> > > -
> > > -	cl_spinlock_acquire(&p_mgr->idle_lock);
> > > -
> > > -	p_list_item = cl_qlist_head(p_list);
> > > -	if (p_list_item == cl_qlist_end(p_list)) {
> > > -		cl_spinlock_release(&p_mgr->idle_lock);
> > > -		OSM_LOG_EXIT(p_mgr->p_log);
> > > -		return OSM_SIGNAL_NONE;
> > > -	}
> > > -
> > > -	cl_spinlock_release(&p_mgr->idle_lock);
> > > -
> > > -	p_process_item = (osm_idle_item_t *) p_list_item;
> > > -
> > > -	CL_ASSERT(p_process_item->pfn_start);
> > > -
> > > -	signal =
> > > -	    p_process_item->pfn_start(p_process_item->context1,
> > > -				      p_process_item->context2);
> > > -
> > > -	CL_ASSERT(signal != OSM_SIGNAL_NONE);
> > > -
> > > -	OSM_LOG_EXIT(p_mgr->p_log);
> > > -	return signal;
> > > -}
> > > -
> > > -/**********************************************************************
> > >   * Go over all the remote SMs (as updated in the sm_guid_tbl).
> > >   * Find if there is a remote sm that is a master SM.
> > >   * If there is a remote master SM - return a pointer to it,
> > > @@ -1558,7 +1470,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
> > >  		case OSM_SM_STATE_PROCESS_REQUEST:
> > >  			switch (signal) {
> > >  			case OSM_SIGNAL_IDLE_TIME_PROCESS:
> > > -				signal = __process_idle_time_queue_start(p_mgr);
> > > +				signal = osm_mcast_mgr_process_mgroups(p_mgr->p_mcast_mgr);
> > >  				switch (signal) {
> > >  				case OSM_SIGNAL_NONE:
> > >  					p_mgr->state = OSM_SM_STATE_IDLE;
> > > @@ -1604,14 +1516,6 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
> > >  			switch (signal) {
> > >  			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
> > >  			case OSM_SIGNAL_DONE:
> > > -				/* CALL the done function */
> > > -				__process_idle_time_queue_done(p_mgr);
> > > -
> > > -				/*
> > > -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
> > > -				 * so that the next element in the queue gets processed
> > > -				 */
> > > -
> > >  				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
> > >  				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
> > >  				break;
> > > @@ -2424,41 +2328,3 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
> > >  
> > >  	OSM_LOG_EXIT(p_mgr->p_log);
> > >  }
> > > -
> > > -/**********************************************************************
> > > - **********************************************************************/
> > > -ib_api_status_t
> > > -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
> > > -			   IN osm_pfn_start_t pfn_start,
> > > -			   IN osm_pfn_done_t pfn_done, void *context1,
> > > -			   void *context2)
> > > -{
> > > -	osm_idle_item_t *p_idle_item;
> > > -
> > > -	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process_idle);
> > > -
> > > -	p_idle_item = malloc(sizeof(osm_idle_item_t));
> > > -	if (p_idle_item == NULL) {
> > > -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> > > -			"osm_state_mgr_process_idle: ERR 3321: "
> > > -			"insufficient memory\n");
> > > -		return IB_ERROR;
> > > -	}
> > > -
> > > -	memset(p_idle_item, 0, sizeof(osm_idle_item_t));
> > > -	p_idle_item->pfn_start = pfn_start;
> > > -	p_idle_item->pfn_done = pfn_done;
> > > -	p_idle_item->context1 = context1;
> > > -	p_idle_item->context2 = context2;
> > > -
> > > -	cl_spinlock_acquire(&p_mgr->idle_lock);
> > > -	cl_qlist_insert_tail(&p_mgr->idle_time_list, &p_idle_item->list_item);
> > > -	cl_spinlock_release(&p_mgr->idle_lock);
> > > -
> > > -	osm_sm_signal(&p_mgr->p_subn->p_osm->sm,
> > > -		      OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
> > > -
> > > -	OSM_LOG_EXIT(p_mgr->p_log);
> > > -
> > > -	return IB_SUCCESS;
> > > -}
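
For readers following the thread, the batching idea in the patch description
(queue the changed mcast groups, then recalculate routing for all of them in
one pass and program the switch tables once) can be sketched roughly as below.
This is only an illustration under stated assumptions, not the OpenSM code:
struct group, struct request, struct sm, sm_queue_change() and
sm_process_groups() are hypothetical stand-ins, and locking is left out
(the patch guards its list with mgrp_lock and the group state with p_lock).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names exist in OpenSM. */
#define MAX_GROUPS 8

struct group {
	uint32_t last_change_id;	/* bumped on every join/leave */
	uint32_t last_tree_id;		/* change id the current tree matches */
	unsigned rebuilds;		/* times the spanning tree was rebuilt */
};

struct request {
	struct request *next;
	unsigned group_idx;		/* which group changed */
};

struct sm {
	struct request *head, *tail;	/* plays the role of mgrp_list */
	struct group groups[MAX_GROUPS];
	unsigned table_downloads;	/* passes over the switch tables */
};

static void sm_init(struct sm *sm)
{
	assert(sm != NULL);
	memset(sm, 0, sizeof(*sm));
}

/* Producer side: note that a group changed and queue it for later. */
static int sm_queue_change(struct sm *sm, unsigned group_idx)
{
	struct request *req;

	if (group_idx >= MAX_GROUPS)
		return -1;
	req = calloc(1, sizeof(*req));
	if (!req)
		return -1;
	req->group_idx = group_idx;

	sm->groups[group_idx].last_change_id++;
	if (sm->tail)
		sm->tail->next = req;
	else
		sm->head = req;
	sm->tail = req;
	return 0;
}

/*
 * Consumer side: drain the whole queue, rebuild each group at most once,
 * then do a single table download pass. Returns the number of groups
 * whose tree was rebuilt.
 */
static unsigned sm_process_groups(struct sm *sm)
{
	struct request *req = sm->head, *next;
	unsigned rebuilt = 0;

	if (!req)
		return 0;	/* nothing queued, nothing to download */
	sm->head = sm->tail = NULL;

	for (; req; req = next) {
		struct group *g = &sm->groups[req->group_idx];

		next = req->next;
		free(req);

		/* Already rebuilt for this change id: skip the duplicate. */
		if (g->last_change_id == g->last_tree_id)
			continue;
		g->rebuilds++;
		g->last_tree_id = g->last_change_id;
		rebuilt++;
	}

	/* One pass over the switches for the whole batch. */
	sm->table_downloads++;
	return rebuilt;
}
```

In this sketch, a burst of requests against the same group costs one tree
rebuild and one table download pass rather than one of each per request,
which is the effect the patch description aims for during join/leave storms.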


From kliteyn at dev.mellanox.co.il  Mon Dec 31 07:35:41 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Mon, 31 Dec 2007 17:35:41 +0200
Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <1199115083.23289.359.camel@hrosenstock-ws.xsigo.com>
References: <4770CDCE.8040200@dev.mellanox.co.il>	<20071229182718.GA19160@sashak.voltaire.com>	<1199032710.23289.340.camel@hrosenstock-ws.xsigo.com>	<20071230181610.GC10650@sashak.voltaire.com>
	<1199115083.23289.359.camel@hrosenstock-ws.xsigo.com>
Message-ID: <47790C4D.7080405@dev.mellanox.co.il>

Hal Rosenstock wrote:
> On Sun, 2007-12-30 at 18:16 +0000, Sasha Khapyorsky wrote:
>> On 08:38 Sun 30 Dec     , Hal Rosenstock wrote:
>>> On Sat, 2007-12-29 at 18:27 +0000, Sasha Khapyorsky wrote:
>>>> This improves handling of storms of mcast join/leave requests. Mcast
>>>> routing is now recalculated in one pass for all mcast groups where
>>>> changes occurred, rather than one by one. To do this, it queues mcast
>>>> groups instead of mcast rerouting requests, which also makes the
>>>> state_mgr idle queue obsolete.
>>> Looks like a nice improvement.
>>>
>>> What testing has been done with this change ? Can you comment on any
>>> results ?
>> osmtest, basic ipoib, SA db and MFTs dump diffs. Didn't find any
>> problem.
> 
> What size topologies ? real and/or simulated ?
> 
>>> For which branches is this change being proposed ?
>> I think it should go to OFED 1.3.
> 
> Perhaps if there is sufficient soak time on real life topologies and
> other torture tests for this.

I will include this patch in the nightly simulation today,
but currently I don't have access to any real cluster.

-- Yevgeny


> -- Hal
> 
>> Sasha
>>
>>> -- Hal
>>>
>>>> Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
>>>> ---
>>>>
>>>> Hi Yevgeny,
>>>>
>>>> To me it looks like this should solve the original problem (the mcast
>>>> group list is purged in osm_mcast_mgr_process()). Could you review and,
>>>> ideally, test it? Thanks.
>>>>
>>>> Sasha
>>>>
>>>> ---
>>>>  opensm/include/opensm/osm_mcast_mgr.h |   14 +--
>>>>  opensm/include/opensm/osm_multicast.h |    2 +
>>>>  opensm/include/opensm/osm_sm.h        |    2 +
>>>>  opensm/include/opensm/osm_state_mgr.h |   95 -----------------
>>>>  opensm/opensm/osm_mcast_mgr.c         |  187 +++++++++++++++------------------
>>>>  opensm/opensm/osm_sm.c                |   70 ++++++-------
>>>>  opensm/opensm/osm_state_mgr.c         |  138 +------------------------
>>>>  7 files changed, 130 insertions(+), 378 deletions(-)
>>>>
>>>> diff --git a/opensm/include/opensm/osm_mcast_mgr.h b/opensm/include/opensm/osm_mcast_mgr.h
>>>> index 3e0b761..47b67ed 100644
>>>> --- a/opensm/include/opensm/osm_mcast_mgr.h
>>>> +++ b/opensm/include/opensm/osm_mcast_mgr.h
>>>> @@ -100,7 +100,6 @@ typedef struct _osm_mcast_mgr {
>>>>  	osm_req_t *p_req;
>>>>  	osm_log_t *p_log;
>>>>  	cl_plock_t *p_lock;
>>>> -
>>>>  } osm_mcast_mgr_t;
>>>>  /*
>>>>  * FIELDS
>>>> @@ -253,25 +252,22 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr);
>>>>  *	Multicast Manager, Node Info Response Controller
>>>>  *********/
>>>>  
>>>> -/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgrp_cb
>>>> +/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgroups
>>>>  * NAME
>>>> -*	osm_mcast_mgr_process_mgrp_cb
>>>> +*	osm_mcast_mgr_process_mgroups
>>>>  *
>>>>  * DESCRIPTION
>>>> -*	Callback entry point for the osm_mcast_mgr_process_mgrp function.
>>>> +*	Process only requested mcast groups.
>>>>  *
>>>>  * SYNOPSIS
>>>>  */
>>>>  osm_signal_t
>>>> -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2);
>>>> +osm_mcast_mgr_process_mgroups(IN osm_mcast_mgr_t *p_mgr);
>>>>  /*
>>>>  * PARAMETERS
>>>> -*	(Context1) p_mgr
>>>> +*	p_mgr
>>>>  *		[in] Pointer to an osm_mcast_mgr_t object.
>>>>  *
>>>> -*	(Context2) p_mgrp
>>>> -*		[in] Pointer to the multicast group to process.
>>>> -*
>>>>  * RETURN VALUES
>>>>  *	IB_SUCCESS
>>>>  *
>>>> diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h
>>>> index 729a2ea..f442a45 100644
>>>> --- a/opensm/include/opensm/osm_multicast.h
>>>> +++ b/opensm/include/opensm/osm_multicast.h
>>>> @@ -50,6 +50,7 @@
>>>>  
>>>>  #include <iba/ib_types.h>
>>>>  #include <complib/cl_qmap.h>
>>>> +#include <complib/cl_qlist.h>
>>>>  #include <complib/cl_spinlock.h>
>>>>  #include <opensm/osm_base.h>
>>>>  #include <opensm/osm_mtree.h>
>>>> @@ -121,6 +122,7 @@ const char *osm_get_mcast_req_type_str(IN osm_mcast_req_type_t req_type);
>>>>  * SYNOPSIS
>>>>  */
>>>>  typedef struct osm_mcast_mgr_ctxt {
>>>> +	cl_list_item_t list_item;
>>>>  	ib_net16_t mlid;
>>>>  	osm_mcast_req_type_t req_type;
>>>>  	ib_net64_t port_guid;
>>>> diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h
>>>> index 4c6ce27..a676cd6 100644
>>>> --- a/opensm/include/opensm/osm_sm.h
>>>> +++ b/opensm/include/opensm/osm_sm.h
>>>> @@ -140,6 +140,8 @@ typedef struct osm_sm {
>>>>  	cl_dispatcher_t *p_disp;
>>>>  	cl_plock_t *p_lock;
>>>>  	atomic32_t sm_trans_id;
>>>> +	cl_spinlock_t mgrp_lock;
>>>> +	cl_qlist_t mgrp_list;
>>>>  	osm_req_t req;
>>>>  	osm_resp_t resp;
>>>>  	osm_ni_rcv_t ni_rcv;
>>>> diff --git a/opensm/include/opensm/osm_state_mgr.h b/opensm/include/opensm/osm_state_mgr.h
>>>> index dada097..f51593a 100644
>>>> --- a/opensm/include/opensm/osm_state_mgr.h
>>>> +++ b/opensm/include/opensm/osm_state_mgr.h
>>>> @@ -109,8 +109,6 @@ typedef struct _osm_state_mgr {
>>>>  	osm_stats_t *p_stats;
>>>>  	struct _osm_sm_state_mgr *p_sm_state_mgr;
>>>>  	const osm_sm_mad_ctrl_t *p_mad_ctrl;
>>>> -	cl_spinlock_t idle_lock;
>>>> -	cl_qlist_t idle_time_list;
>>>>  	cl_plock_t *p_lock;
>>>>  	cl_event_t *p_subnet_up_event;
>>>>  	osm_sm_state_t state;
>>>> @@ -172,99 +170,6 @@ typedef struct _osm_state_mgr {
>>>>  *	State Manager object
>>>>  *********/
>>>>  
>>>> -/****s* OpenSM: State Manager/_osm_idle_item
>>>> -* NAME
>>>> -*	_osm_idle_item
>>>> -*
>>>> -* DESCRIPTION
>>>> -*	Idle item.
>>>> -*
>>>> -* SYNOPSIS
>>>> -*/
>>>> -
>>>> -typedef osm_signal_t(*osm_pfn_start_t) (IN void *context1, IN void *context2);
>>>> -
>>>> -typedef void
>>>> - (*osm_pfn_done_t) (IN void *context1, IN void *context2);
>>>> -
>>>> -typedef struct _osm_idle_item {
>>>> -	cl_list_item_t list_item;
>>>> -	void *context1;
>>>> -	void *context2;
>>>> -	osm_pfn_start_t pfn_start;
>>>> -	osm_pfn_done_t pfn_done;
>>>> -} osm_idle_item_t;
>>>> -
>>>> -/*
>>>> -* FIELDS
>>>> -*	list_item
>>>> -*		list item.
>>>> -*
>>>> -*	context1
>>>> -*		Context pointer
>>>> -*
>>>> -*	context2
>>>> -*		Context pointer
>>>> -*
>>>> -*	pfn_start
>>>> -*		Pointer to the start function.
>>>> -*
>>>> -*	pfn_done
>>>> -*		Pointer to the dine function.
>>>> -* SEE ALSO
>>>> -*	State Manager object
>>>> -*********/
>>>> -
>>>> -/****f* OpenSM: State Manager/osm_state_mgr_process_idle
>>>> -* NAME
>>>> -*	osm_state_mgr_process_idle
>>>> -*
>>>> -* DESCRIPTION
>>>> -*	Formulates the osm_idle_item and inserts it into the queue and
>>>> -*	signals the state manager.
>>>> -*
>>>> -* SYNOPSIS
>>>> -*/
>>>> -
>>>> -ib_api_status_t
>>>> -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
>>>> -			   IN osm_pfn_start_t pfn_start,
>>>> -			   IN osm_pfn_done_t pfn_done,
>>>> -			   void *context1, void *context2);
>>>> -
>>>> -/*
>>>> -* PARAMETERS
>>>> -*	p_mgr
>>>> -*		[in] Pointer to a State Manager object to construct.
>>>> -*
>>>> -*	pfn_start
>>>> -*		[in] Pointer the start function which will be called at
>>>> -*			idle time.
>>>> -*
>>>> -*	pfn_done
>>>> -*		[in] pointer the done function which will be called
>>>> -*			when outstanding smps is zero
>>>> -*
>>>> -*	context1
>>>> -*		[in] Pointer to void
>>>> -*
>>>> -*	context2
>>>> -*		[in] Pointer to void
>>>> -*
>>>> -* RETURN VALUE
>>>> -*	IB_SUCCESS or IB_ERROR
>>>> -*
>>>> -* NOTES
>>>> -*	Allows osm_state_mgr_destroy
>>>> -*
>>>> -*	Calling osm_state_mgr_construct is a prerequisite to calling any other
>>>> -*	method except osm_state_mgr_init.
>>>> -*
>>>> -* SEE ALSO
>>>> -*	State Manager object, osm_state_mgr_init,
>>>> -*	osm_state_mgr_destroy
>>>> -*********/
>>>> -
>>>>  /****f* OpenSM: State Manager/osm_state_mgr_construct
>>>>  * NAME
>>>>  *	osm_state_mgr_construct
>>>> diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
>>>> index 50b95fd..f51a45a 100644
>>>> --- a/opensm/opensm/osm_mcast_mgr.c
>>>> +++ b/opensm/opensm/osm_mcast_mgr.c
>>>> @@ -815,7 +815,7 @@ static osm_mtree_node_t *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr,
>>>>  	}
>>>>  
>>>>  	free(list_array);
>>>> -      Exit:
>>>> +Exit:
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>>  	return (p_mtn);
>>>>  }
>>>> @@ -932,7 +932,7 @@ __osm_mcast_mgr_build_spanning_tree(osm_mcast_mgr_t * const p_mgr,
>>>>  		"Configured MLID 0x%X for %u ports, max tree depth = %u\n",
>>>>  		cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth);
>>>>  
>>>> -      Exit:
>>>> +Exit:
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>>  	return (status);
>>>>  }
>>>> @@ -1171,7 +1171,7 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t * const p_mgr,
>>>>  		}
>>>>  	}
>>>>  
>>>> -      Exit:
>>>> +Exit:
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>>  	return (status);
>>>>  }
>>>> @@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t * const p_mgr,
>>>>  							   port_guid);
>>>>  	}
>>>>  
>>>> -      Exit:
>>>> +Exit:
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>>  	return (status);
>>>>  }
>>>>  
>>>>  /**********************************************************************
>>>>   Process the entire group.
>>>> -
>>>>   NOTE : The lock should be held externally!
>>>>   **********************************************************************/
>>>> -static osm_signal_t
>>>> -osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
>>>> -			   IN osm_mgrp_t * const p_mgrp,
>>>> -			   IN osm_mcast_req_type_t req_type,
>>>> -			   IN ib_net64_t port_guid)
>>>> +static ib_api_status_t
>>>> +mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
>>>> +		       IN osm_mgrp_t * const p_mgrp,
>>>> +		       IN osm_mcast_req_type_t req_type,
>>>> +		       IN ib_net64_t port_guid)
>>>>  {
>>>> -	osm_signal_t signal = OSM_SIGNAL_DONE;
>>>>  	ib_api_status_t status;
>>>> -	osm_switch_t *p_sw;
>>>> -	cl_qmap_t *p_sw_tbl;
>>>> -	boolean_t pending_transactions = FALSE;
>>>>  
>>>>  	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp);
>>>>  
>>>> -	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
>>>> -
>>>>  	status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type, port_guid);
>>>>  	if (status != IB_SUCCESS) {
>>>>  		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>> -			"osm_mcast_mgr_process_mgrp: ERR 0A19: "
>>>> +			"mcast_mgr_process_mgrp: ERR 0A19: "
>>>>  			"Unable to create spanning tree (%s)\n",
>>>>  			ib_get_err_str(status));
>>>> -
>>>>  		goto Exit;
>>>>  	}
>>>> +	p_mgrp->last_tree_id = p_mgrp->last_change_id;
>>>>  
>>>> -	/*
>>>> -	   Walk the switches and download the tables for each.
>>>> +	/* Remove MGRP only if osm_mcm_port_t count is 0 and
>>>> +	 * Not a well known group
>>>>  	 */
>>>> -	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
>>>> -	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
>>>> -		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
>>>> -		if (signal == OSM_SIGNAL_DONE_PENDING)
>>>> -			pending_transactions = TRUE;
>>>> -
>>>> -		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
>>>> +	if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known) {
>>>> +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>> +			"mcast_mgr_process_mgrp: "
>>>> +			"Destroying mgrp with lid:0x%X\n",
>>>> +			cl_ntoh16(p_mgrp->mlid));
>>>> +		/* Send a Report to any InformInfo registered for
>>>> +		   Trap 67 : MCGroup delete */
>>>> +		osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
>>>> +					    p_mgrp);
>>>> +		cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
>>>> +				    (cl_map_item_t *) p_mgrp);
>>>> +		osm_mgrp_delete(p_mgrp);
>>>>  	}
>>>>  
>>>> -	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
>>>> -
>>>> -      Exit:
>>>> +Exit:
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>> -
>>>> -	if (pending_transactions == TRUE)
>>>> -		return (OSM_SIGNAL_DONE_PENDING);
>>>> -	else
>>>> -		return (OSM_SIGNAL_DONE);
>>>> +	return status;
>>>>  }
>>>>  
>>>>  /**********************************************************************
>>>> @@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
>>>>  	osm_switch_t *p_sw;
>>>>  	cl_qmap_t *p_sw_tbl;
>>>>  	cl_qmap_t *p_mcast_tbl;
>>>> +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
>>>>  	osm_mgrp_t *p_mgrp;
>>>> -	ib_api_status_t status;
>>>>  	boolean_t pending_transactions = FALSE;
>>>>  
>>>>  	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process);
>>>>  
>>>>  	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
>>>> -
>>>>  	p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl;
>>>>  	/*
>>>>  	   While holding the lock, iterate over all the established
>>>> @@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
>>>>  		/* We reached here due to some change that caused a heavy sweep
>>>>  		   of the subnet. Not due to a specific multicast request.
>>>>  		   So the request type is subnet_change and the port guid is 0. */
>>>> -		status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp,
>>>> -						    OSM_MCAST_REQ_TYPE_SUBNET_CHANGE,
>>>> -						    0);
>>>> -		if (status != IB_SUCCESS) {
>>>> -			osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>> -				"osm_mcast_mgr_process: ERR 0A20: "
>>>> -				"Unable to create spanning tree (%s)\n",
>>>> -				ib_get_err_str(status));
>>>> -		}
>>>> -
>>>> +		mcast_mgr_process_mgrp(p_mgr, p_mgrp,
>>>> +				       OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0);
>>>>  		p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item);
>>>>  	}
>>>>  
>>>> @@ -1364,10 +1347,14 @@ osm_signal_t osm_mcast_mgr_process(IN osm_mcast_mgr_t * const p_mgr)
>>>>  		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
>>>>  		if (signal == OSM_SIGNAL_DONE_PENDING)
>>>>  			pending_transactions = TRUE;
>>>> -
>>>>  		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
>>>>  	}
>>>>  
>>>> +	while (!cl_is_qlist_empty(p_list)) {
>>>> +		cl_list_item_t *p = cl_qlist_remove_head(p_list);
>>>> +		free(p);
>>>> +	}
>>>> +
>>>>  	CL_PLOCK_RELEASE(p_mgr->p_lock);
>>>>  
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>> @@ -1395,79 +1382,79 @@ osm_mgrp_t *__get_mgrp_by_mlid(IN osm_mcast_mgr_t * const p_mgr,
>>>>  
>>>>  /**********************************************************************
>>>>    This is the function that is invoked during idle time to handle the
>>>> -  process request. Context1 is simply the osm_mcast_mgr_t*, Context2
>>>> -  hold the mlid, port guid and action (join/leave/delete) required.
>>>> +  process request for mcast groups where join/leave/delete was required.
>>>>   **********************************************************************/
>>>> -osm_signal_t
>>>> -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const Context2)
>>>> +osm_signal_t osm_mcast_mgr_process_mgroups(osm_mcast_mgr_t * p_mgr)
>>>>  {
>>>> -	osm_mcast_mgr_t *p_mgr = (osm_mcast_mgr_t *) Context1;
>>>> +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
>>>> +	osm_switch_t *p_sw;
>>>> +	cl_qmap_t *p_sw_tbl;
>>>>  	osm_mgrp_t *p_mgrp;
>>>>  	ib_net16_t mlid;
>>>> -	osm_signal_t signal = OSM_SIGNAL_DONE;
>>>> -	osm_mcast_mgr_ctxt_t *p_ctxt = (osm_mcast_mgr_ctxt_t *) Context2;
>>>> -	osm_mcast_req_type_t req_type = p_ctxt->req_type;
>>>> -	ib_net64_t port_guid = p_ctxt->port_guid;
>>>> -
>>>> -	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp_cb);
>>>> -
>>>> -	/* nice copy no warning on size diff */
>>>> -	memcpy(&mlid, &p_ctxt->mlid, sizeof(mlid));
>>>> +	osm_signal_t ret, signal = OSM_SIGNAL_DONE;
>>>> +	osm_mcast_mgr_ctxt_t *ctx;
>>>> +	osm_mcast_req_type_t req_type;
>>>> +	ib_net64_t port_guid;
>>>>  
>>>> -	/* we can destroy the context now */
>>>> -	free(p_ctxt);
>>>> +	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgroups);
>>>>  
>>>>  	/* we need a lock to make sure the p_mgrp is not change other ways */
>>>>  	CL_PLOCK_EXCL_ACQUIRE(p_mgr->p_lock);
>>>> -	p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
>>>>  
>>>> -	/* since we delayed the execution we prefer to pass the
>>>> -	   mlid as the mgrp identifier and then find it or abort */
>>>> +	if (cl_is_qlist_empty(p_list)) {
>>>> +		CL_PLOCK_RELEASE(p_mgr->p_lock);
>>>> +		return OSM_SIGNAL_NONE;
>>>> +	}
>>>> +
>>>> +	while (!cl_is_qlist_empty(p_list)) {
>>>> +		ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list);
>>>> +		req_type = ctx->req_type;
>>>> +		port_guid = ctx->port_guid;
>>>> +
>>>> +		/* nice copy no warning on size diff */
>>>> +		memcpy(&mlid, &ctx->mlid, sizeof(mlid));
>>>>  
>>>> -	if (p_mgrp) {
>>>> +		/* we can destroy the context now */
>>>> +		free(ctx);
>>>> +
>>>> +		/* since we delayed the execution we prefer to pass the
>>>> +		   mlid as the mgrp identifier and then find it or abort */
>>>> +		p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
>>>> +		if (!p_mgrp)
>>>> +			continue;
>>>>  
>>>> -		/* if there was no change from the last time we processed the group
>>>> -		   we can skip doing anything
>>>> +		/* if there was no change from the last time
>>>> +		 * we processed the group we can skip doing anything
>>>>  		 */
>>>>  		if (p_mgrp->last_change_id == p_mgrp->last_tree_id) {
>>>>  			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>> -				"osm_mcast_mgr_process_mgrp_cb: "
>>>> +				"osm_mcast_mgr_process_mgroups: "
>>>>  				"Skip processing mgrp with lid:0x%X change id:%u\n",
>>>>  				cl_ntoh16(mlid), p_mgrp->last_change_id);
>>>> -		} else {
>>>> -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>> -				"osm_mcast_mgr_process_mgrp_cb: "
>>>> -				"Processing mgrp with lid:0x%X change id:%u\n",
>>>> -				cl_ntoh16(mlid), p_mgrp->last_change_id);
>>>> -
>>>> -			signal =
>>>> -			    osm_mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type,
>>>> -						       port_guid);
>>>> -			p_mgrp->last_tree_id = p_mgrp->last_change_id;
>>>> +			continue;
>>>>  		}
>>>>  
>>>> -		/* Remove MGRP only if osm_mcm_port_t count is 0 and
>>>> -		 * Not a well known group
>>>> -		 */
>>>> -		if ((0x0 == cl_qmap_count(&p_mgrp->mcm_port_tbl)) &&
>>>> -		    (p_mgrp->well_known == FALSE)) {
>>>> -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>> -				"osm_mcast_mgr_process_mgrp_cb: "
>>>> -				"Destroying mgrp with lid:0x%X\n",
>>>> -				cl_ntoh16(mlid));
>>>> -
>>>> -			/* Send a Report to any InformInfo registered for
>>>> -			   Trap 67 : MCGroup delete */
>>>> -			osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
>>>> -						    p_mgrp);
>>>> -
>>>> -			cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
>>>> -					    (cl_map_item_t *) p_mgrp);
>>>> +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
>>>> +			"osm_mcast_mgr_process_mgroups: "
>>>> +			"Processing mgrp with lid:0x%X change id:%u\n",
>>>> +			cl_ntoh16(mlid), p_mgrp->last_change_id);
>>>> +		mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, port_guid);
>>>> +	}
>>>>  
>>>> -			osm_mgrp_delete(p_mgrp);
>>>> -		}
>>>> +	/*
>>>> +	   Walk the switches and download the tables for each.
>>>> +	 */
>>>> +	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
>>>> +	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
>>>> +	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
>>>> +		ret = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
>>>> +		if (ret == OSM_SIGNAL_DONE_PENDING)
>>>> +			signal = ret;
>>>> +		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
>>>>  	}
>>>>  
>>>> +	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
>>>> +
>>>>  	CL_PLOCK_RELEASE(p_mgr->p_lock);
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>>  	return signal;
>>>> diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c
>>>> index 88e6d4a..b295a77 100644
>>>> --- a/opensm/opensm/osm_sm.c
>>>> +++ b/opensm/opensm/osm_sm.c
>>>> @@ -144,6 +144,7 @@ void osm_sm_construct(IN osm_sm_t * const p_sm)
>>>>  	cl_event_construct(&p_sm->signal_event);
>>>>  	cl_event_construct(&p_sm->subnet_up_event);
>>>>  	cl_thread_construct(&p_sm->sweeper);
>>>> +	cl_spinlock_construct(&p_sm->mgrp_lock);
>>>>  	osm_req_construct(&p_sm->req);
>>>>  	osm_resp_construct(&p_sm->resp);
>>>>  	osm_ni_rcv_construct(&p_sm->ni_rcv);
>>>> @@ -245,6 +246,7 @@ void osm_sm_destroy(IN osm_sm_t * const p_sm)
>>>>  	cl_event_destroy(&p_sm->signal_event);
>>>>  	cl_event_destroy(&p_sm->subnet_up_event);
>>>>  	cl_spinlock_destroy(&p_sm->signal_lock);
>>>> +	cl_spinlock_destroy(&p_sm->mgrp_lock);
>>>>  
>>>>  	osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n");	/* Format Waived */
>>>>  	OSM_LOG_EXIT(p_sm->p_log);
>>>> @@ -292,6 +294,12 @@ osm_sm_init(IN osm_sm_t * const p_sm,
>>>>  	if (status != CL_SUCCESS)
>>>>  		goto Exit;
>>>>  
>>>> +	cl_qlist_init(&p_sm->mgrp_list);
>>>> +
>>>> +	status = cl_spinlock_init(&p_sm->mgrp_lock);
>>>> +	if (status != CL_SUCCESS)
>>>> +		goto Exit;
>>>> +
>>>>  	status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl,
>>>>  				      p_sm->p_subn,
>>>>  				      p_sm->p_mad_pool,
>>>> @@ -551,32 +559,43 @@ osm_sm_bind(IN osm_sm_t * const p_sm, IN const ib_net64_t port_guid)
>>>>  /**********************************************************************
>>>>   **********************************************************************/
>>>>  static ib_api_status_t
>>>> -__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
>>>> +__osm_sm_mgrp_process(IN osm_sm_t * const p_sm,
>>>>  		      IN osm_mgrp_t * const p_mgrp,
>>>>  		      IN const ib_net64_t port_guid,
>>>>  		      IN osm_mcast_req_type_t req_type)
>>>>  {
>>>> -	ib_api_status_t status;
>>>>  	osm_mcast_mgr_ctxt_t *ctx2;
>>>>  
>>>> -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_connect);
>>>> -
>>>>  	/*
>>>>  	 * 'Schedule' all the QP0 traffic for when the state manager
>>>>  	 * isn't busy trying to do something else.
>>>>  	 */
>>>>  	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
>>>> +	if (!ctx2)
>>>> +		return IB_ERROR;
>>>> +	memset(ctx2, 0, sizeof(*ctx2));
>>>>  	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
>>>>  	ctx2->req_type = req_type;
>>>>  	ctx2->port_guid = port_guid;
>>>>  
>>>> -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
>>>> -					    osm_mcast_mgr_process_mgrp_cb,
>>>> -					    NULL, &p_sm->mcast_mgr,
>>>> -					    (void *)ctx2);
>>>> +	cl_spinlock_acquire(&p_sm->mgrp_lock);
>>>> +	cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx2->list_item);
>>>> +	cl_spinlock_release(&p_sm->mgrp_lock);
>>>>  
>>>> -	OSM_LOG_EXIT(p_sm->p_log);
>>>> -	return (status);
>>>> +	osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
>>>> +
>>>> +	return IB_SUCCESS;
>>>> +}
>>>> +
>>>> +/**********************************************************************
>>>> + **********************************************************************/
>>>> +static ib_api_status_t
>>>> +__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
>>>> +		      IN osm_mgrp_t * const p_mgrp,
>>>> +		      IN const ib_net64_t port_guid,
>>>> +		      IN osm_mcast_req_type_t req_type)
>>>> +{
>>>> +	return __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, req_type);
>>>>  }
>>>>  
>>>>  /**********************************************************************
>>>> @@ -586,31 +605,7 @@ __osm_sm_mgrp_disconnect(IN osm_sm_t * const p_sm,
>>>>  			 IN osm_mgrp_t * const p_mgrp,
>>>>  			 IN const ib_net64_t port_guid)
>>>>  {
>>>> -	ib_api_status_t status;
>>>> -	osm_mcast_mgr_ctxt_t *ctx2;
>>>> -
>>>> -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_disconnect);
>>>> -
>>>> -	/*
>>>> -	 * 'Schedule' all the QP0 traffic for when the state manager
>>>> -	 * isn't busy trying to do something else.
>>>> -	 */
>>>> -	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
>>>> -	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
>>>> -	ctx2->req_type = OSM_MCAST_REQ_TYPE_LEAVE;
>>>> -	ctx2->port_guid = port_guid;
>>>> -
>>>> -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
>>>> -					    osm_mcast_mgr_process_mgrp_cb,
>>>> -					    NULL, &p_sm->mcast_mgr, ctx2);
>>>> -	if (status != IB_SUCCESS) {
>>>> -		osm_log(p_sm->p_log, OSM_LOG_ERROR,
>>>> -			"__osm_sm_mgrp_disconnect: ERR 2E11: "
>>>> -			"Failure processing multicast group (%s)\n",
>>>> -			ib_get_err_str(status));
>>>> -	}
>>>> -
>>>> -	OSM_LOG_EXIT(p_sm->p_log);
>>>> +	__osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, OSM_MCAST_REQ_TYPE_LEAVE);
>>>>  }
>>>>  
>>>>  /**********************************************************************
>>>> @@ -719,8 +714,8 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm,
>>>>  		goto Exit;
>>>>  	}
>>>>  
>>>> -	CL_PLOCK_RELEASE(p_sm->p_lock);
>>>>  	status = __osm_sm_mgrp_connect(p_sm, p_mgrp, port_guid, req_type);
>>>> +	CL_PLOCK_RELEASE(p_sm->p_lock);
>>>>  
>>>>        Exit:
>>>>  	OSM_LOG_EXIT(p_sm->p_log);
>>>> @@ -782,9 +777,8 @@ osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm,
>>>>  
>>>>  	osm_port_remove_mgrp(p_port, mlid);
>>>>  
>>>> -	CL_PLOCK_RELEASE(p_sm->p_lock);
>>>> -
>>>>  	__osm_sm_mgrp_disconnect(p_sm, p_mgrp, port_guid);
>>>> +	CL_PLOCK_RELEASE(p_sm->p_lock);
>>>>  
>>>>        Exit:
>>>>  	OSM_LOG_EXIT(p_sm->p_log);
>>>> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
>>>> index 5c39f11..d4dd782 100644
>>>> --- a/opensm/opensm/osm_state_mgr.c
>>>> +++ b/opensm/opensm/osm_state_mgr.c
>>>> @@ -76,7 +76,6 @@ osm_signal_t osm_qos_setup(IN osm_opensm_t * p_osm);
>>>>  void osm_state_mgr_construct(IN osm_state_mgr_t * const p_mgr)
>>>>  {
>>>>  	memset(p_mgr, 0, sizeof(*p_mgr));
>>>> -	cl_spinlock_construct(&p_mgr->idle_lock);
>>>>  	p_mgr->state = OSM_SM_STATE_INIT;
>>>>  }
>>>>  
>>>> @@ -88,9 +87,6 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const p_mgr)
>>>>  
>>>>  	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_destroy);
>>>>  
>>>> -	/* destroy the locks */
>>>> -	cl_spinlock_destroy(&p_mgr->idle_lock);
>>>> -
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>>  }
>>>>  
>>>> @@ -112,8 +108,6 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
>>>>  		   IN cl_event_t * const p_subnet_up_event,
>>>>  		   IN osm_log_t * const p_log)
>>>>  {
>>>> -	cl_status_t status;
>>>> -
>>>>  	OSM_LOG_ENTER(p_log, osm_state_mgr_init);
>>>>  
>>>>  	CL_ASSERT(p_subn);
>>>> @@ -145,17 +139,8 @@ osm_state_mgr_init(IN osm_state_mgr_t * const p_mgr,
>>>>  	p_mgr->p_lock = p_lock;
>>>>  	p_mgr->p_subnet_up_event = p_subnet_up_event;
>>>>  
>>>> -	cl_qlist_init(&p_mgr->idle_time_list);
>>>> -
>>>> -	status = cl_spinlock_init(&p_mgr->idle_lock);
>>>> -	if (status != CL_SUCCESS) {
>>>> -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>> -			"osm_state_mgr_init: ERR 3302: "
>>>> -			"Spinlock init failed (%s)\n", CL_STATUS_MSG(status));
>>>> -	}
>>>> -
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>> -	return (status);
>>>> +	return IB_SUCCESS;
>>>>  }
>>>>  
>>>>  /**********************************************************************
>>>> @@ -989,79 +974,6 @@ static ib_api_status_t __osm_state_mgr_light_sweep_start(IN osm_state_mgr_t *
>>>>  }
>>>>  
>>>>  /**********************************************************************
>>>> - **********************************************************************/
>>>> -static void __process_idle_time_queue_done(IN osm_state_mgr_t * const p_mgr)
>>>> -{
>>>> -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
>>>> -	cl_list_item_t *p_list_item;
>>>> -	osm_idle_item_t *p_process_item;
>>>> -
>>>> -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
>>>> -
>>>> -	cl_spinlock_acquire(&p_mgr->idle_lock);
>>>> -	p_list_item = cl_qlist_remove_head(p_list);
>>>> -
>>>> -	if (p_list_item == cl_qlist_end(p_list)) {
>>>> -		cl_spinlock_release(&p_mgr->idle_lock);
>>>> -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>> -			"__process_idle_time_queue_done: ERR 3314: "
>>>> -			"Idle time queue is empty\n");
>>>> -		return;
>>>> -	}
>>>> -	cl_spinlock_release(&p_mgr->idle_lock);
>>>> -
>>>> -	p_process_item = (osm_idle_item_t *) p_list_item;
>>>> -
>>>> -	if (p_process_item->pfn_done) {
>>>> -
>>>> -		p_process_item->pfn_done(p_process_item->context1,
>>>> -					 p_process_item->context2);
>>>> -	}
>>>> -
>>>> -	free(p_process_item);
>>>> -
>>>> -	OSM_LOG_EXIT(p_mgr->p_log);
>>>> -	return;
>>>> -}
>>>> -
>>>> -/**********************************************************************
>>>> - **********************************************************************/
>>>> -static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t *
>>>> -						    const p_mgr)
>>>> -{
>>>> -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
>>>> -	cl_list_item_t *p_list_item;
>>>> -	osm_idle_item_t *p_process_item;
>>>> -	osm_signal_t signal;
>>>> -
>>>> -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_start);
>>>> -
>>>> -	cl_spinlock_acquire(&p_mgr->idle_lock);
>>>> -
>>>> -	p_list_item = cl_qlist_head(p_list);
>>>> -	if (p_list_item == cl_qlist_end(p_list)) {
>>>> -		cl_spinlock_release(&p_mgr->idle_lock);
>>>> -		OSM_LOG_EXIT(p_mgr->p_log);
>>>> -		return OSM_SIGNAL_NONE;
>>>> -	}
>>>> -
>>>> -	cl_spinlock_release(&p_mgr->idle_lock);
>>>> -
>>>> -	p_process_item = (osm_idle_item_t *) p_list_item;
>>>> -
>>>> -	CL_ASSERT(p_process_item->pfn_start);
>>>> -
>>>> -	signal =
>>>> -	    p_process_item->pfn_start(p_process_item->context1,
>>>> -				      p_process_item->context2);
>>>> -
>>>> -	CL_ASSERT(signal != OSM_SIGNAL_NONE);
>>>> -
>>>> -	OSM_LOG_EXIT(p_mgr->p_log);
>>>> -	return signal;
>>>> -}
>>>> -
>>>> -/**********************************************************************
>>>>   * Go over all the remote SMs (as updated in the sm_guid_tbl).
>>>>   * Find if there is a remote sm that is a master SM.
>>>>   * If there is a remote master SM - return a pointer to it,
>>>> @@ -1558,7 +1470,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>>>>  		case OSM_SM_STATE_PROCESS_REQUEST:
>>>>  			switch (signal) {
>>>>  			case OSM_SIGNAL_IDLE_TIME_PROCESS:
>>>> -				signal = __process_idle_time_queue_start(p_mgr);
>>>> +				signal = osm_mcast_mgr_process_mgroups(p_mgr->p_mcast_mgr);
>>>>  				switch (signal) {
>>>>  				case OSM_SIGNAL_NONE:
>>>>  					p_mgr->state = OSM_SM_STATE_IDLE;
>>>> @@ -1604,14 +1516,6 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>>>>  			switch (signal) {
>>>>  			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
>>>>  			case OSM_SIGNAL_DONE:
>>>> -				/* CALL the done function */
>>>> -				__process_idle_time_queue_done(p_mgr);
>>>> -
>>>> -				/*
>>>> -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
>>>> -				 * so that the next element in the queue gets processed
>>>> -				 */
>>>> -
>>>>  				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
>>>>  				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
>>>>  				break;
>>>> @@ -2424,41 +2328,3 @@ void osm_state_mgr_process(IN osm_state_mgr_t * const p_mgr,
>>>>  
>>>>  	OSM_LOG_EXIT(p_mgr->p_log);
>>>>  }
>>>> -
>>>> -/**********************************************************************
>>>> - **********************************************************************/
>>>> -ib_api_status_t
>>>> -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
>>>> -			   IN osm_pfn_start_t pfn_start,
>>>> -			   IN osm_pfn_done_t pfn_done, void *context1,
>>>> -			   void *context2)
>>>> -{
>>>> -	osm_idle_item_t *p_idle_item;
>>>> -
>>>> -	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process_idle);
>>>> -
>>>> -	p_idle_item = malloc(sizeof(osm_idle_item_t));
>>>> -	if (p_idle_item == NULL) {
>>>> -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
>>>> -			"osm_state_mgr_process_idle: ERR 3321: "
>>>> -			"insufficient memory\n");
>>>> -		return IB_ERROR;
>>>> -	}
>>>> -
>>>> -	memset(p_idle_item, 0, sizeof(osm_idle_item_t));
>>>> -	p_idle_item->pfn_start = pfn_start;
>>>> -	p_idle_item->pfn_done = pfn_done;
>>>> -	p_idle_item->context1 = context1;
>>>> -	p_idle_item->context2 = context2;
>>>> -
>>>> -	cl_spinlock_acquire(&p_mgr->idle_lock);
>>>> -	cl_qlist_insert_tail(&p_mgr->idle_time_list, &p_idle_item->list_item);
>>>> -	cl_spinlock_release(&p_mgr->idle_lock);
>>>> -
>>>> -	osm_sm_signal(&p_mgr->p_subn->p_osm->sm,
>>>> -		      OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
>>>> -
>>>> -	OSM_LOG_EXIT(p_mgr->p_log);
>>>> -
>>>> -	return IB_SUCCESS;
>>>> -}


From sashak at voltaire.com  Mon Dec 31 07:47:11 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 31 Dec 2007 15:47:11 +0000
Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <1199115083.23289.359.camel@hrosenstock-ws.xsigo.com>
References: <4770CDCE.8040200@dev.mellanox.co.il>
	<20071229182718.GA19160@sashak.voltaire.com>
	<1199032710.23289.340.camel@hrosenstock-ws.xsigo.com>
	<20071230181610.GC10650@sashak.voltaire.com>
	<1199115083.23289.359.camel@hrosenstock-ws.xsigo.com>
Message-ID: <20071231154711.GC11591@sashak.voltaire.com>

On 07:31 Mon 31 Dec     , Hal Rosenstock wrote:
> On Sun, 2007-12-30 at 18:16 +0000, Sasha Khapyorsky wrote:
> > On 08:38 Sun 30 Dec     , Hal Rosenstock wrote:
> > > On Sat, 2007-12-29 at 18:27 +0000, Sasha Khapyorsky wrote:
> > > > This improves handling of mcast join/leave requests storming. Now mcast
> > > > routing will be recalculated for all mcast groups where changes occurred
> > > > and not one by one. For this it queues mcast groups instead of mcast
> > > > rerouting requests, this also makes state_mgr idle queue obsolete.
> > > 
> > > Looks like a nice improvement.
> > > 
> > > What testing has been done with this change ? Can you comment on any
> > > results ?
> > 
> > osmtest, basic ipoib, SA db and MFTs dump diffs. Didn't find any
> > problem.
> 
> What size topologies ? real and/or simulated ?

Both real and simulated. The real fabric was small.

> > > For which branches is this change being proposed ?
> > 
> > I think it should go to OFED 1.3.
> 
> Perhaps if there is sufficient soak time on real life topologies and
> other torture tests for this.

It now looks like we have at least one month until 1.3 is finalized.
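
A minimal sketch of the batching idea from the patch description: queue the
affected groups, rebuild the tree once for each group that changed, then push
the switch tables in a single pass. This is not the OpenSM code. The names
`struct group`, `struct request`, `queue_request()` and `process_queue()` are
hypothetical stand-ins, the change id is bumped at enqueue time only to keep
the example short, and each request holds a group pointer where the patch
stores the MLID and looks the group up again.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for osm_mgrp_t. */
struct group {
	uint16_t mlid;
	unsigned last_change_id;	/* bumped on every join/leave */
	unsigned last_tree_id;		/* change id the current tree was built for */
};

/* Hypothetical stand-in for osm_mcast_mgr_ctxt_t. */
struct request {
	struct request *next;
	struct group *grp;
};

struct queue {
	struct request *head, *tail;
};

/* Counters that make the effect of batching observable. */
static unsigned trees_built;
static unsigned table_downloads;

/* Queue one join/leave request. Returns 0, or -1 if allocation fails. */
static int queue_request(struct queue *q, struct group *grp)
{
	struct request *r = calloc(1, sizeof(*r));

	if (!r)
		return -1;
	grp->last_change_id++;	/* assumption: done elsewhere in real code */
	r->grp = grp;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
	return 0;
}

/*
 * Drain the whole queue. Each group whose change id differs from its
 * tree id is rerouted once; the switch tables are then downloaded in
 * one pass. Returns the number of groups rerouted.
 */
static unsigned process_queue(struct queue *q)
{
	unsigned rerouted = 0;
	struct request *r;

	if (!q->head)
		return 0;	/* nothing queued, no table download */

	while ((r = q->head) != NULL) {
		struct group *g = r->grp;

		q->head = r->next;
		free(r);
		if (g->last_change_id == g->last_tree_id)
			continue;	/* already handled by an earlier entry */
		trees_built++;		/* stands in for the spanning tree rebuild */
		g->last_tree_id = g->last_change_id;
		rerouted++;
	}
	q->tail = NULL;
	table_downloads++;	/* stands in for the walk over all switches */
	return rerouted;
}
```

With three requests queued for one group and one for another before the
drain, this builds two trees and does one table download pass.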

Sasha


From sashak at voltaire.com  Mon Dec 31 07:48:15 2007
From: sashak at voltaire.com (Sasha Khapyorsky)
Date: Mon, 31 Dec 2007 15:48:15 +0000
Subject: [ofa-general] Re: [PATCH RFC] opensm: mcast mgr improvements
In-Reply-To: <47790C4D.7080405@dev.mellanox.co.il>
References: <4770CDCE.8040200@dev.mellanox.co.il>
	<20071229182718.GA19160@sashak.voltaire.com>
	<1199032710.23289.340.camel@hrosenstock-ws.xsigo.com>
	<20071230181610.GC10650@sashak.voltaire.com>
	<1199115083.23289.359.camel@hrosenstock-ws.xsigo.com>
	<47790C4D.7080405@dev.mellanox.co.il>
Message-ID: <20071231154815.GD11591@sashak.voltaire.com>

On 17:35 Mon 31 Dec     , Yevgeny Kliteynik wrote:
>  Hal Rosenstock wrote:
> > On Sun, 2007-12-30 at 18:16 +0000, Sasha Khapyorsky wrote:
> >> On 08:38 Sun 30 Dec     , Hal Rosenstock wrote:
> >>> On Sat, 2007-12-29 at 18:27 +0000, Sasha Khapyorsky wrote:
> >>>> This improves handling of mcast join/leave requests storming. Now mcast
> >>>> routing will be recalculated for all mcast groups where changes occurred
> >>>> and not one by one. For this it queues mcast groups instead of mcast
> >>>> rerouting requests, this also makes state_mgr idle queue obsolete.
> >>> Looks like a nice improvement.
> >>>
> >>> What testing has been done with this change ? Can you comment on any
> >>> results ?
> >> osmtest, basic ipoib, SA db and MFTs dump diffs. Didn't find any
> >> problem.
> > What size topologies ? real and/or simulated ?
> >>> For which branches is this change being proposed ?
> >> I think it should go to OFED 1.3.
> > Perhaps if there is sufficient soak time on real life topologies and
> > other torture tests for this.
> 
>  I will include this patch in the nightly simulation today,

Thanks!

Sasha

>  but currently I don't have access to any real cluster.
> 
>  -- Yevgeny
> 
> 
> > -- Hal
> >> Sasha
> >>
> >>> -- Hal
> >>>
> >>>> Signed-off-by: Sasha Khapyorsky <sashak at voltaire.com>
> >>>> ---
> >>>>
> >>>> Hi Yevgeny,
> >>>>
> >>>> For me it looks that it should solve the original problem (mcast group
> >>>> list is purged in osm_mcast_mgr_process()). Could you review and ideally
> >>>> test it? Thanks.
> >>>>
> >>>> Sasha
> >>>>
> >>>> ---
> >>>>  opensm/include/opensm/osm_mcast_mgr.h |   14 +--
> >>>>  opensm/include/opensm/osm_multicast.h |    2 +
> >>>>  opensm/include/opensm/osm_sm.h        |    2 +
> >>>>  opensm/include/opensm/osm_state_mgr.h |   95 -----------------
> >>>>  opensm/opensm/osm_mcast_mgr.c         |  187 +++++++++++++++------------------
> >>>>  opensm/opensm/osm_sm.c                |   70 ++++++-------
> >>>>  opensm/opensm/osm_state_mgr.c         |  138 +------------------------
> >>>>  7 files changed, 130 insertions(+), 378 deletions(-)
> >>>>
> >>>> diff --git a/opensm/include/opensm/osm_mcast_mgr.h b/opensm/include/opensm/osm_mcast_mgr.h
> >>>> index 3e0b761..47b67ed 100644
> >>>> --- a/opensm/include/opensm/osm_mcast_mgr.h
> >>>> +++ b/opensm/include/opensm/osm_mcast_mgr.h
> >>>> @@ -100,7 +100,6 @@ typedef struct _osm_mcast_mgr {
> >>>>  	osm_req_t *p_req;
> >>>>  	osm_log_t *p_log;
> >>>>  	cl_plock_t *p_lock;
> >>>> -
> >>>>  } osm_mcast_mgr_t;
> >>>>  /*
> >>>>  * FIELDS
> >>>> @@ -253,25 +252,22 @@ osm_signal_t osm_mcast_mgr_process(IN 
> >>>> osm_mcast_mgr_t * const p_mgr);
> >>>>  *	Multicast Manager, Node Info Response Controller
> >>>>  *********/
> >>>>  -/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgrp_cb
> >>>> +/****f* OpenSM: Multicast Manager/osm_mcast_mgr_process_mgroups
> >>>>  * NAME
> >>>> -*	osm_mcast_mgr_process_mgrp_cb
> >>>> +*	osm_mcast_mgr_process_mgroups
> >>>>  *
> >>>>  * DESCRIPTION
> >>>> -*	Callback entry point for the osm_mcast_mgr_process_mgrp function.
> >>>> +*	Process only requested mcast groups.
> >>>>  *
> >>>>  * SYNOPSIS
> >>>>  */
> >>>>  osm_signal_t
> >>>> -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const 
> >>>> Context2);
> >>>> +osm_mcast_mgr_process_mgroups(IN osm_mcast_mgr_t *p_mgr);
> >>>>  /*
> >>>>  * PARAMETERS
> >>>> -*	(Context1) p_mgr
> >>>> +*	p_mgr
> >>>>  *		[in] Pointer to an osm_mcast_mgr_t object.
> >>>>  *
> >>>> -*	(Context2) p_mgrp
> >>>> -*		[in] Pointer to the multicast group to process.
> >>>> -*
> >>>>  * RETURN VALUES
> >>>>  *	IB_SUCCESS
> >>>>  *
> >>>> diff --git a/opensm/include/opensm/osm_multicast.h b/opensm/include/opensm/osm_multicast.h
> >>>> index 729a2ea..f442a45 100644
> >>>> --- a/opensm/include/opensm/osm_multicast.h
> >>>> +++ b/opensm/include/opensm/osm_multicast.h
> >>>> @@ -50,6 +50,7 @@
> >>>>   #include <iba/ib_types.h>
> >>>>  #include <complib/cl_qmap.h>
> >>>> +#include <complib/cl_qlist.h>
> >>>>  #include <complib/cl_spinlock.h>
> >>>>  #include <opensm/osm_base.h>
> >>>>  #include <opensm/osm_mtree.h>
> >>>> @@ -121,6 +122,7 @@ const char *osm_get_mcast_req_type_str(IN 
> >>>> osm_mcast_req_type_t req_type);
> >>>>  * SYNOPSIS
> >>>>  */
> >>>>  typedef struct osm_mcast_mgr_ctxt {
> >>>> +	cl_list_item_t list_item;
> >>>>  	ib_net16_t mlid;
> >>>>  	osm_mcast_req_type_t req_type;
> >>>>  	ib_net64_t port_guid;
> >>>> diff --git a/opensm/include/opensm/osm_sm.h b/opensm/include/opensm/osm_sm.h
> >>>> index 4c6ce27..a676cd6 100644
> >>>> --- a/opensm/include/opensm/osm_sm.h
> >>>> +++ b/opensm/include/opensm/osm_sm.h
> >>>> @@ -140,6 +140,8 @@ typedef struct osm_sm {
> >>>>  	cl_dispatcher_t *p_disp;
> >>>>  	cl_plock_t *p_lock;
> >>>>  	atomic32_t sm_trans_id;
> >>>> +	cl_spinlock_t mgrp_lock;
> >>>> +	cl_qlist_t mgrp_list;
> >>>>  	osm_req_t req;
> >>>>  	osm_resp_t resp;
> >>>>  	osm_ni_rcv_t ni_rcv;
> >>>> diff --git a/opensm/include/opensm/osm_state_mgr.h b/opensm/include/opensm/osm_state_mgr.h
> >>>> index dada097..f51593a 100644
> >>>> --- a/opensm/include/opensm/osm_state_mgr.h
> >>>> +++ b/opensm/include/opensm/osm_state_mgr.h
> >>>> @@ -109,8 +109,6 @@ typedef struct _osm_state_mgr {
> >>>>  	osm_stats_t *p_stats;
> >>>>  	struct _osm_sm_state_mgr *p_sm_state_mgr;
> >>>>  	const osm_sm_mad_ctrl_t *p_mad_ctrl;
> >>>> -	cl_spinlock_t idle_lock;
> >>>> -	cl_qlist_t idle_time_list;
> >>>>  	cl_plock_t *p_lock;
> >>>>  	cl_event_t *p_subnet_up_event;
> >>>>  	osm_sm_state_t state;
> >>>> @@ -172,99 +170,6 @@ typedef struct _osm_state_mgr {
> >>>>  *	State Manager object
> >>>>  *********/
> >>>>  -/****s* OpenSM: State Manager/_osm_idle_item
> >>>> -* NAME
> >>>> -*	_osm_idle_item
> >>>> -*
> >>>> -* DESCRIPTION
> >>>> -*	Idle item.
> >>>> -*
> >>>> -* SYNOPSIS
> >>>> -*/
> >>>> -
> >>>> -typedef osm_signal_t(*osm_pfn_start_t) (IN void *context1, IN void 
> >>>> *context2);
> >>>> -
> >>>> -typedef void
> >>>> - (*osm_pfn_done_t) (IN void *context1, IN void *context2);
> >>>> -
> >>>> -typedef struct _osm_idle_item {
> >>>> -	cl_list_item_t list_item;
> >>>> -	void *context1;
> >>>> -	void *context2;
> >>>> -	osm_pfn_start_t pfn_start;
> >>>> -	osm_pfn_done_t pfn_done;
> >>>> -} osm_idle_item_t;
> >>>> -
> >>>> -/*
> >>>> -* FIELDS
> >>>> -*	list_item
> >>>> -*		list item.
> >>>> -*
> >>>> -*	context1
> >>>> -*		Context pointer
> >>>> -*
> >>>> -*	context2
> >>>> -*		Context pointer
> >>>> -*
> >>>> -*	pfn_start
> >>>> -*		Pointer to the start function.
> >>>> -*
> >>>> -*	pfn_done
> >>>> -*		Pointer to the dine function.
> >>>> -* SEE ALSO
> >>>> -*	State Manager object
> >>>> -*********/
> >>>> -
> >>>> -/****f* OpenSM: State Manager/osm_state_mgr_process_idle
> >>>> -* NAME
> >>>> -*	osm_state_mgr_process_idle
> >>>> -*
> >>>> -* DESCRIPTION
> >>>> -*	Formulates the osm_idle_item and inserts it into the queue and
> >>>> -*	signals the state manager.
> >>>> -*
> >>>> -* SYNOPSIS
> >>>> -*/
> >>>> -
> >>>> -ib_api_status_t
> >>>> -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
> >>>> -			   IN osm_pfn_start_t pfn_start,
> >>>> -			   IN osm_pfn_done_t pfn_done,
> >>>> -			   void *context1, void *context2);
> >>>> -
> >>>> -/*
> >>>> -* PARAMETERS
> >>>> -*	p_mgr
> >>>> -*		[in] Pointer to a State Manager object to construct.
> >>>> -*
> >>>> -*	pfn_start
> >>>> -*		[in] Pointer the start function which will be called at
> >>>> -*			idle time.
> >>>> -*
> >>>> -*	pfn_done
> >>>> -*		[in] pointer the done function which will be called
> >>>> -*			when outstanding smps is zero
> >>>> -*
> >>>> -*	context1
> >>>> -*		[in] Pointer to void
> >>>> -*
> >>>> -*	context2
> >>>> -*		[in] Pointer to void
> >>>> -*
> >>>> -* RETURN VALUE
> >>>> -*	IB_SUCCESS or IB_ERROR
> >>>> -*
> >>>> -* NOTES
> >>>> -*	Allows osm_state_mgr_destroy
> >>>> -*
> >>>> -*	Calling osm_state_mgr_construct is a prerequisite to calling any 
> >>>> other
> >>>> -*	method except osm_state_mgr_init.
> >>>> -*
> >>>> -* SEE ALSO
> >>>> -*	State Manager object, osm_state_mgr_init,
> >>>> -*	osm_state_mgr_destroy
> >>>> -*********/
> >>>> -
> >>>>  /****f* OpenSM: State Manager/osm_state_mgr_construct
> >>>>  * NAME
> >>>>  *	osm_state_mgr_construct
> >>>> diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c
> >>>> index 50b95fd..f51a45a 100644
> >>>> --- a/opensm/opensm/osm_mcast_mgr.c
> >>>> +++ b/opensm/opensm/osm_mcast_mgr.c
> >>>> @@ -815,7 +815,7 @@ static osm_mtree_node_t 
> >>>> *__osm_mcast_mgr_branch(osm_mcast_mgr_t * const p_mgr,
> >>>>  	}
> >>>>   	free(list_array);
> >>>> -      Exit:
> >>>> +Exit:
> >>>>  	OSM_LOG_EXIT(p_mgr->p_log);
> >>>>  	return (p_mtn);
> >>>>  }
> >>>> @@ -932,7 +932,7 @@ __osm_mcast_mgr_build_spanning_tree(osm_mcast_mgr_t 
> >>>> * const p_mgr,
> >>>>  		"Configured MLID 0x%X for %u ports, max tree depth = %u\n",
> >>>>  		cl_ntoh16(osm_mgrp_get_mlid(p_mgrp)), count, max_depth);
> >>>>  -      Exit:
> >>>> +Exit:
> >>>>  	OSM_LOG_EXIT(p_mgr->p_log);
> >>>>  	return (status);
> >>>>  }
> >>>> @@ -1171,7 +1171,7 @@ osm_mcast_mgr_process_single(IN osm_mcast_mgr_t * 
> >>>> const p_mgr,
> >>>>  		}
> >>>>  	}
> >>>>  -      Exit:
> >>>> +Exit:
> >>>>  	OSM_LOG_EXIT(p_mgr->p_log);
> >>>>  	return (status);
> >>>>  }
> >>>> @@ -1254,63 +1254,55 @@ osm_mcast_mgr_process_tree(IN osm_mcast_mgr_t * 
> >>>> const p_mgr,
> >>>>  							   port_guid);
> >>>>  	}
> >>>>  -      Exit:
> >>>> +Exit:
> >>>>  	OSM_LOG_EXIT(p_mgr->p_log);
> >>>>  	return (status);
> >>>>  }
> >>>>   
> >>>> /**********************************************************************
> >>>>   Process the entire group.
> >>>> -
> >>>>   NOTE : The lock should be held externally!
> >>>>   
> >>>> **********************************************************************/
> >>>> -static osm_signal_t
> >>>> -osm_mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
> >>>> -			   IN osm_mgrp_t * const p_mgrp,
> >>>> -			   IN osm_mcast_req_type_t req_type,
> >>>> -			   IN ib_net64_t port_guid)
> >>>> +static ib_api_status_t
> >>>> +mcast_mgr_process_mgrp(IN osm_mcast_mgr_t * const p_mgr,
> >>>> +		       IN osm_mgrp_t * const p_mgrp,
> >>>> +		       IN osm_mcast_req_type_t req_type,
> >>>> +		       IN ib_net64_t port_guid)
> >>>>  {
> >>>> -	osm_signal_t signal = OSM_SIGNAL_DONE;
> >>>>  	ib_api_status_t status;
> >>>> -	osm_switch_t *p_sw;
> >>>> -	cl_qmap_t *p_sw_tbl;
> >>>> -	boolean_t pending_transactions = FALSE;
> >>>>   	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp);
> >>>>  -	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> >>>> -
> >>>>  	status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp, req_type, 
> >>>> port_guid);
> >>>>  	if (status != IB_SUCCESS) {
> >>>>  		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> >>>> -			"osm_mcast_mgr_process_mgrp: ERR 0A19: "
> >>>> +			"mcast_mgr_process_mgrp: ERR 0A19: "
> >>>>  			"Unable to create spanning tree (%s)\n",
> >>>>  			ib_get_err_str(status));
> >>>> -
> >>>>  		goto Exit;
> >>>>  	}
> >>>> +	p_mgrp->last_tree_id = p_mgrp->last_change_id;
> >>>>  -	/*
> >>>> -	   Walk the switches and download the tables for each.
> >>>> +	/* Remove MGRP only if osm_mcm_port_t count is 0 and
> >>>> +	 * Not a well known group
> >>>>  	 */
> >>>> -	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
> >>>> -	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
> >>>> -		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> >>>> -		if (signal == OSM_SIGNAL_DONE_PENDING)
> >>>> -			pending_transactions = TRUE;
> >>>> -
> >>>> -		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> >>>> +	if (cl_qmap_count(&p_mgrp->mcm_port_tbl) == 0 && !p_mgrp->well_known) 
> >>>> {
> >>>> +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> >>>> +			"mcast_mgr_process_mgrp: "
> >>>> +			"Destroying mgrp with lid:0x%X\n",
> >>>> +			cl_ntoh16(p_mgrp->mlid));
> >>>> +		/* Send a Report to any InformInfo registered for
> >>>> +		   Trap 67 : MCGroup delete */
> >>>> +		osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
> >>>> +					    p_mgrp);
> >>>> +		cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
> >>>> +				    (cl_map_item_t *) p_mgrp);
> >>>> +		osm_mgrp_delete(p_mgrp);
> >>>>  	}
> >>>>  -	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
> >>>> -
> >>>> -      Exit:
> >>>> +Exit:
> >>>>  	OSM_LOG_EXIT(p_mgr->p_log);
> >>>> -
> >>>> -	if (pending_transactions == TRUE)
> >>>> -		return (OSM_SIGNAL_DONE_PENDING);
> >>>> -	else
> >>>> -		return (OSM_SIGNAL_DONE);
> >>>> +	return status;
> >>>>  }
> >>>>   
> >>>> /**********************************************************************
> >>>> @@ -1321,14 +1313,13 @@ osm_signal_t osm_mcast_mgr_process(IN 
> >>>> osm_mcast_mgr_t * const p_mgr)
> >>>>  	osm_switch_t *p_sw;
> >>>>  	cl_qmap_t *p_sw_tbl;
> >>>>  	cl_qmap_t *p_mcast_tbl;
> >>>> +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
> >>>>  	osm_mgrp_t *p_mgrp;
> >>>> -	ib_api_status_t status;
> >>>>  	boolean_t pending_transactions = FALSE;
> >>>>   	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process);
> >>>>   	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> >>>> -
> >>>>  	p_mcast_tbl = &p_mgr->p_subn->mgrp_mlid_tbl;
> >>>>  	/*
> >>>>  	   While holding the lock, iterate over all the established
> >>>> @@ -1343,16 +1334,8 @@ osm_signal_t osm_mcast_mgr_process(IN 
> >>>> osm_mcast_mgr_t * const p_mgr)
> >>>>  		/* We reached here due to some change that caused a heavy sweep
> >>>>  		   of the subnet. Not due to a specific multicast request.
> >>>>  		   So the request type is subnet_change and the port guid is 0. */
> >>>> -		status = osm_mcast_mgr_process_tree(p_mgr, p_mgrp,
> >>>> -						    OSM_MCAST_REQ_TYPE_SUBNET_CHANGE,
> >>>> -						    0);
> >>>> -		if (status != IB_SUCCESS) {
> >>>> -			osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> >>>> -				"osm_mcast_mgr_process: ERR 0A20: "
> >>>> -				"Unable to create spanning tree (%s)\n",
> >>>> -				ib_get_err_str(status));
> >>>> -		}
> >>>> -
> >>>> +		mcast_mgr_process_mgrp(p_mgr, p_mgrp,
> >>>> +				       OSM_MCAST_REQ_TYPE_SUBNET_CHANGE, 0);
> >>>>  		p_mgrp = (osm_mgrp_t *) cl_qmap_next(&p_mgrp->map_item);
> >>>>  	}
> >>>>  @@ -1364,10 +1347,14 @@ osm_signal_t osm_mcast_mgr_process(IN 
> >>>> osm_mcast_mgr_t * const p_mgr)
> >>>>  		signal = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> >>>>  		if (signal == OSM_SIGNAL_DONE_PENDING)
> >>>>  			pending_transactions = TRUE;
> >>>> -
> >>>>  		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> >>>>  	}
> >>>>  +	while (!cl_is_qlist_empty(p_list)) {
> >>>> +		cl_list_item_t *p = cl_qlist_remove_head(p_list);
> >>>> +		free(p);
> >>>> +	}
> >>>> +
> >>>>  	CL_PLOCK_RELEASE(p_mgr->p_lock);
> >>>>   	OSM_LOG_EXIT(p_mgr->p_log);
> >>>> @@ -1395,79 +1382,79 @@ osm_mgrp_t *__get_mgrp_by_mlid(IN 
> >>>> osm_mcast_mgr_t * const p_mgr,
> >>>>   
> >>>> /**********************************************************************
> >>>>    This is the function that is invoked during idle time to handle the
> >>>> -  process request. Context1 is simply the osm_mcast_mgr_t*, Context2
> >>>> -  hold the mlid, port guid and action (join/leave/delete) required.
> >>>> +  process request for mcast groups where join/leave/delete was 
> >>>> required.
> >>>>   
> >>>> **********************************************************************/
> >>>> -osm_signal_t
> >>>> -osm_mcast_mgr_process_mgrp_cb(IN void *const Context1, IN void *const 
> >>>> Context2)
> >>>> +osm_signal_t osm_mcast_mgr_process_mgroups(osm_mcast_mgr_t * p_mgr)
> >>>>  {
> >>>> -	osm_mcast_mgr_t *p_mgr = (osm_mcast_mgr_t *) Context1;
> >>>> +	cl_qlist_t *p_list = &p_mgr->p_subn->p_osm->sm.mgrp_list;
> >>>> +	osm_switch_t *p_sw;
> >>>> +	cl_qmap_t *p_sw_tbl;
> >>>>  	osm_mgrp_t *p_mgrp;
> >>>>  	ib_net16_t mlid;
> >>>> -	osm_signal_t signal = OSM_SIGNAL_DONE;
> >>>> -	osm_mcast_mgr_ctxt_t *p_ctxt = (osm_mcast_mgr_ctxt_t *) Context2;
> >>>> -	osm_mcast_req_type_t req_type = p_ctxt->req_type;
> >>>> -	ib_net64_t port_guid = p_ctxt->port_guid;
> >>>> -
> >>>> -	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgrp_cb);
> >>>> -
> >>>> -	/* nice copy no warning on size diff */
> >>>> -	memcpy(&mlid, &p_ctxt->mlid, sizeof(mlid));
> >>>> +	osm_signal_t ret, signal = OSM_SIGNAL_DONE;
> >>>> +	osm_mcast_mgr_ctxt_t *ctx;
> >>>> +	osm_mcast_req_type_t req_type;
> >>>> +	ib_net64_t port_guid;
> >>>>  -	/* we can destroy the context now */
> >>>> -	free(p_ctxt);
> >>>> +	OSM_LOG_ENTER(p_mgr->p_log, osm_mcast_mgr_process_mgroups);
> >>>>   	/* we need a lock to make sure the p_mgrp is not change other ways */
> >>>>  	CL_PLOCK_EXCL_ACQUIRE(p_mgr->p_lock);
> >>>> -	p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
> >>>>  -	/* since we delayed the execution we prefer to pass the
> >>>> -	   mlid as the mgrp identifier and then find it or abort */
> >>>> +	if (cl_is_qlist_empty(p_list)) {
> >>>> +		CL_PLOCK_RELEASE(p_mgr->p_lock);
> >>>> +		return OSM_SIGNAL_NONE;
> >>>> +	}
> >>>> +
> >>>> +	while (!cl_is_qlist_empty(p_list)) {
> >>>> +		ctx = (osm_mcast_mgr_ctxt_t *) cl_qlist_remove_head(p_list);
> >>>> +		req_type = ctx->req_type;
> >>>> +		port_guid = ctx->port_guid;
> >>>> +
> >>>> +		/* nice copy no warning on size diff */
> >>>> +		memcpy(&mlid, &ctx->mlid, sizeof(mlid));
> >>>>  -	if (p_mgrp) {
> >>>> +		/* we can destroy the context now */
> >>>> +		free(ctx);
> >>>> +
> >>>> +		/* since we delayed the execution we prefer to pass the
> >>>> +		   mlid as the mgrp identifier and then find it or abort */
> >>>> +		p_mgrp = __get_mgrp_by_mlid(p_mgr, mlid);
> >>>> +		if (!p_mgrp)
> >>>> +			continue;
> >>>>  -		/* if there was no change from the last time we processed the group
> >>>> -		   we can skip doing anything
> >>>> +		/* if there was no change from the last time
> >>>> +		 * we processed the group we can skip doing anything
> >>>>  		 */
> >>>>  		if (p_mgrp->last_change_id == p_mgrp->last_tree_id) {
> >>>>  			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> >>>> -				"osm_mcast_mgr_process_mgrp_cb: "
> >>>> +				"osm_mcast_mgr_process_mgroups: "
> >>>>  				"Skip processing mgrp with lid:0x%X change id:%u\n",
> >>>>  				cl_ntoh16(mlid), p_mgrp->last_change_id);
> >>>> -		} else {
> >>>> -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> >>>> -				"osm_mcast_mgr_process_mgrp_cb: "
> >>>> -				"Processing mgrp with lid:0x%X change id:%u\n",
> >>>> -				cl_ntoh16(mlid), p_mgrp->last_change_id);
> >>>> -
> >>>> -			signal =
> >>>> -			    osm_mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type,
> >>>> -						       port_guid);
> >>>> -			p_mgrp->last_tree_id = p_mgrp->last_change_id;
> >>>> +			continue;
> >>>>  		}
> >>>>  -		/* Remove MGRP only if osm_mcm_port_t count is 0 and
> >>>> -		 * Not a well known group
> >>>> -		 */
> >>>> -		if ((0x0 == cl_qmap_count(&p_mgrp->mcm_port_tbl)) &&
> >>>> -		    (p_mgrp->well_known == FALSE)) {
> >>>> -			osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> >>>> -				"osm_mcast_mgr_process_mgrp_cb: "
> >>>> -				"Destroying mgrp with lid:0x%X\n",
> >>>> -				cl_ntoh16(mlid));
> >>>> -
> >>>> -			/* Send a Report to any InformInfo registered for
> >>>> -			   Trap 67 : MCGroup delete */
> >>>> -			osm_mgrp_send_delete_notice(p_mgr->p_subn, p_mgr->p_log,
> >>>> -						    p_mgrp);
> >>>> -
> >>>> -			cl_qmap_remove_item(&p_mgr->p_subn->mgrp_mlid_tbl,
> >>>> -					    (cl_map_item_t *) p_mgrp);
> >>>> +		osm_log(p_mgr->p_log, OSM_LOG_DEBUG,
> >>>> +			"osm_mcast_mgr_process_mgroups: "
> >>>> +			"Processing mgrp with lid:0x%X change id:%u\n",
> >>>> +			cl_ntoh16(mlid), p_mgrp->last_change_id);
> >>>> +		mcast_mgr_process_mgrp(p_mgr, p_mgrp, req_type, port_guid);
> >>>> +	}
> >>>>  -			osm_mgrp_delete(p_mgrp);
> >>>> -		}
> >>>> +	/*
> >>>> +	   Walk the switches and download the tables for each.
> >>>> +	 */
> >>>> +	p_sw_tbl = &p_mgr->p_subn->sw_guid_tbl;
> >>>> +	p_sw = (osm_switch_t *) cl_qmap_head(p_sw_tbl);
> >>>> +	while (p_sw != (osm_switch_t *) cl_qmap_end(p_sw_tbl)) {
> >>>> +		ret = __osm_mcast_mgr_set_tbl(p_mgr, p_sw);
> >>>> +		if (ret == OSM_SIGNAL_DONE_PENDING)
> >>>> +			signal = ret;
> >>>> +		p_sw = (osm_switch_t *) cl_qmap_next(&p_sw->map_item);
> >>>>  	}
> >>>>  +	osm_dump_mcast_routes(p_mgr->p_subn->p_osm);
> >>>> +
> >>>>  	CL_PLOCK_RELEASE(p_mgr->p_lock);
> >>>>  	OSM_LOG_EXIT(p_mgr->p_log);
> >>>>  	return signal;
> >>>> diff --git a/opensm/opensm/osm_sm.c b/opensm/opensm/osm_sm.c
> >>>> index 88e6d4a..b295a77 100644
> >>>> --- a/opensm/opensm/osm_sm.c
> >>>> +++ b/opensm/opensm/osm_sm.c
> >>>> @@ -144,6 +144,7 @@ void osm_sm_construct(IN osm_sm_t * const p_sm)
> >>>>  	cl_event_construct(&p_sm->signal_event);
> >>>>  	cl_event_construct(&p_sm->subnet_up_event);
> >>>>  	cl_thread_construct(&p_sm->sweeper);
> >>>> +	cl_spinlock_construct(&p_sm->mgrp_lock);
> >>>>  	osm_req_construct(&p_sm->req);
> >>>>  	osm_resp_construct(&p_sm->resp);
> >>>>  	osm_ni_rcv_construct(&p_sm->ni_rcv);
> >>>> @@ -245,6 +246,7 @@ void osm_sm_destroy(IN osm_sm_t * const p_sm)
> >>>>  	cl_event_destroy(&p_sm->signal_event);
> >>>>  	cl_event_destroy(&p_sm->subnet_up_event);
> >>>>  	cl_spinlock_destroy(&p_sm->signal_lock);
> >>>> +	cl_spinlock_destroy(&p_sm->mgrp_lock);
> >>>>   	osm_log(p_sm->p_log, OSM_LOG_SYS, "Exiting SM\n");	/* Format Waived 
> >>>> */
> >>>>  	OSM_LOG_EXIT(p_sm->p_log);
> >>>> @@ -292,6 +294,12 @@ osm_sm_init(IN osm_sm_t * const p_sm,
> >>>>  	if (status != CL_SUCCESS)
> >>>>  		goto Exit;
> >>>>  +	cl_qlist_init(&p_sm->mgrp_list);
> >>>> +
> >>>> +	status = cl_spinlock_init(&p_sm->mgrp_lock);
> >>>> +	if (status != CL_SUCCESS)
> >>>> +		goto Exit;
> >>>> +
> >>>>  	status = osm_sm_mad_ctrl_init(&p_sm->mad_ctrl,
> >>>>  				      p_sm->p_subn,
> >>>>  				      p_sm->p_mad_pool,
> >>>> @@ -551,32 +559,43 @@ osm_sm_bind(IN osm_sm_t * const p_sm, IN const 
> >>>> ib_net64_t port_guid)
> >>>>  /**********************************************************************
> >>>>   
> >>>> **********************************************************************/
> >>>>  static ib_api_status_t
> >>>> -__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
> >>>> +__osm_sm_mgrp_process(IN osm_sm_t * const p_sm,
> >>>>  		      IN osm_mgrp_t * const p_mgrp,
> >>>>  		      IN const ib_net64_t port_guid,
> >>>>  		      IN osm_mcast_req_type_t req_type)
> >>>>  {
> >>>> -	ib_api_status_t status;
> >>>>  	osm_mcast_mgr_ctxt_t *ctx2;
> >>>>  -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_connect);
> >>>> -
> >>>>  	/*
> >>>>  	 * 'Schedule' all the QP0 traffic for when the state manager
> >>>>  	 * isn't busy trying to do something else.
> >>>>  	 */
> >>>>  	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
> >>>> +	if (!ctx2)
> >>>> +		return IB_ERROR;
> >>>> +	memset(ctx2, 0, sizeof(*ctx2));
> >>>>  	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
> >>>>  	ctx2->req_type = req_type;
> >>>>  	ctx2->port_guid = port_guid;
> >>>>  -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
> >>>> -					    osm_mcast_mgr_process_mgrp_cb,
> >>>> -					    NULL, &p_sm->mcast_mgr,
> >>>> -					    (void *)ctx2);
> >>>> +	cl_spinlock_acquire(&p_sm->mgrp_lock);
> >>>> +	cl_qlist_insert_tail(&p_sm->mgrp_list, &ctx2->list_item);
> >>>> +	cl_spinlock_release(&p_sm->mgrp_lock);
> >>>>  -	OSM_LOG_EXIT(p_sm->p_log);
> >>>> -	return (status);
> >>>> +	osm_sm_signal(p_sm, OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
> >>>> +
> >>>> +	return IB_SUCCESS;
> >>>> +}
> >>>> +
> >>>> +/**********************************************************************
> >>>> + 
> >>>> **********************************************************************/
> >>>> +static ib_api_status_t
> >>>> +__osm_sm_mgrp_connect(IN osm_sm_t * const p_sm,
> >>>> +		      IN osm_mgrp_t * const p_mgrp,
> >>>> +		      IN const ib_net64_t port_guid,
> >>>> +		      IN osm_mcast_req_type_t req_type)
> >>>> +{
> >>>> +	return __osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, req_type);
> >>>>  }
> >>>>   
> >>>> /**********************************************************************
> >>>> @@ -586,31 +605,7 @@ __osm_sm_mgrp_disconnect(IN osm_sm_t * const p_sm,
> >>>>  			 IN osm_mgrp_t * const p_mgrp,
> >>>>  			 IN const ib_net64_t port_guid)
> >>>>  {
> >>>> -	ib_api_status_t status;
> >>>> -	osm_mcast_mgr_ctxt_t *ctx2;
> >>>> -
> >>>> -	OSM_LOG_ENTER(p_sm->p_log, __osm_sm_mgrp_disconnect);
> >>>> -
> >>>> -	/*
> >>>> -	 * 'Schedule' all the QP0 traffic for when the state manager
> >>>> -	 * isn't busy trying to do something else.
> >>>> -	 */
> >>>> -	ctx2 = (osm_mcast_mgr_ctxt_t *) malloc(sizeof(osm_mcast_mgr_ctxt_t));
> >>>> -	memcpy(&ctx2->mlid, &p_mgrp->mlid, sizeof(p_mgrp->mlid));
> >>>> -	ctx2->req_type = OSM_MCAST_REQ_TYPE_LEAVE;
> >>>> -	ctx2->port_guid = port_guid;
> >>>> -
> >>>> -	status = osm_state_mgr_process_idle(&p_sm->state_mgr,
> >>>> -					    osm_mcast_mgr_process_mgrp_cb,
> >>>> -					    NULL, &p_sm->mcast_mgr, ctx2);
> >>>> -	if (status != IB_SUCCESS) {
> >>>> -		osm_log(p_sm->p_log, OSM_LOG_ERROR,
> >>>> -			"__osm_sm_mgrp_disconnect: ERR 2E11: "
> >>>> -			"Failure processing multicast group (%s)\n",
> >>>> -			ib_get_err_str(status));
> >>>> -	}
> >>>> -
> >>>> -	OSM_LOG_EXIT(p_sm->p_log);
> >>>> +	__osm_sm_mgrp_process(p_sm, p_mgrp, port_guid, 
> >>>> OSM_MCAST_REQ_TYPE_LEAVE);
> >>>>  }
> >>>>   
> >>>> /**********************************************************************
> >>>> @@ -719,8 +714,8 @@ osm_sm_mcgrp_join(IN osm_sm_t * const p_sm,
> >>>>  		goto Exit;
> >>>>  	}
> >>>>  -	CL_PLOCK_RELEASE(p_sm->p_lock);
> >>>>  	status = __osm_sm_mgrp_connect(p_sm, p_mgrp, port_guid, req_type);
> >>>> +	CL_PLOCK_RELEASE(p_sm->p_lock);
> >>>>         Exit:
> >>>>  	OSM_LOG_EXIT(p_sm->p_log);
> >>>> @@ -782,9 +777,8 @@ osm_sm_mcgrp_leave(IN osm_sm_t * const p_sm,
> >>>>   	osm_port_remove_mgrp(p_port, mlid);
> >>>>  -	CL_PLOCK_RELEASE(p_sm->p_lock);
> >>>> -
> >>>>  	__osm_sm_mgrp_disconnect(p_sm, p_mgrp, port_guid);
> >>>> +	CL_PLOCK_RELEASE(p_sm->p_lock);
> >>>>         Exit:
> >>>>  	OSM_LOG_EXIT(p_sm->p_log);
> >>>> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
> >>>> index 5c39f11..d4dd782 100644
> >>>> --- a/opensm/opensm/osm_state_mgr.c
> >>>> +++ b/opensm/opensm/osm_state_mgr.c
> >>>> @@ -76,7 +76,6 @@ osm_signal_t osm_qos_setup(IN osm_opensm_t * p_osm);
> >>>>  void osm_state_mgr_construct(IN osm_state_mgr_t * const p_mgr)
> >>>>  {
> >>>>  	memset(p_mgr, 0, sizeof(*p_mgr));
> >>>> -	cl_spinlock_construct(&p_mgr->idle_lock);
> >>>>  	p_mgr->state = OSM_SM_STATE_INIT;
> >>>>  }
> >>>>  @@ -88,9 +87,6 @@ void osm_state_mgr_destroy(IN osm_state_mgr_t * const 
> >>>> p_mgr)
> >>>>   	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_destroy);
> >>>>  -	/* destroy the locks */
> >>>> -	cl_spinlock_destroy(&p_mgr->idle_lock);
> >>>> -
> >>>>  	OSM_LOG_EXIT(p_mgr->p_log);
> >>>>  }
> >>>>  @@ -112,8 +108,6 @@ osm_state_mgr_init(IN osm_state_mgr_t * const 
> >>>> p_mgr,
> >>>>  		   IN cl_event_t * const p_subnet_up_event,
> >>>>  		   IN osm_log_t * const p_log)
> >>>>  {
> >>>> -	cl_status_t status;
> >>>> -
> >>>>  	OSM_LOG_ENTER(p_log, osm_state_mgr_init);
> >>>>   	CL_ASSERT(p_subn);
> >>>> @@ -145,17 +139,8 @@ osm_state_mgr_init(IN osm_state_mgr_t * const 
> >>>> p_mgr,
> >>>>  	p_mgr->p_lock = p_lock;
> >>>>  	p_mgr->p_subnet_up_event = p_subnet_up_event;
> >>>>  -	cl_qlist_init(&p_mgr->idle_time_list);
> >>>> -
> >>>> -	status = cl_spinlock_init(&p_mgr->idle_lock);
> >>>> -	if (status != CL_SUCCESS) {
> >>>> -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> >>>> -			"osm_state_mgr_init: ERR 3302: "
> >>>> -			"Spinlock init failed (%s)\n", CL_STATUS_MSG(status));
> >>>> -	}
> >>>> -
> >>>>  	OSM_LOG_EXIT(p_mgr->p_log);
> >>>> -	return (status);
> >>>> +	return IB_SUCCESS;
> >>>>  }
> >>>>   
> >>>> /**********************************************************************
> >>>> @@ -989,79 +974,6 @@ static ib_api_status_t 
> >>>> __osm_state_mgr_light_sweep_start(IN osm_state_mgr_t *
> >>>>  }
> >>>>   
> >>>> /**********************************************************************
> >>>> - 
> >>>> **********************************************************************/
> >>>> -static void __process_idle_time_queue_done(IN osm_state_mgr_t * const 
> >>>> p_mgr)
> >>>> -{
> >>>> -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
> >>>> -	cl_list_item_t *p_list_item;
> >>>> -	osm_idle_item_t *p_process_item;
> >>>> -
> >>>> -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_done);
> >>>> -
> >>>> -	cl_spinlock_acquire(&p_mgr->idle_lock);
> >>>> -	p_list_item = cl_qlist_remove_head(p_list);
> >>>> -
> >>>> -	if (p_list_item == cl_qlist_end(p_list)) {
> >>>> -		cl_spinlock_release(&p_mgr->idle_lock);
> >>>> -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> >>>> -			"__process_idle_time_queue_done: ERR 3314: "
> >>>> -			"Idle time queue is empty\n");
> >>>> -		return;
> >>>> -	}
> >>>> -	cl_spinlock_release(&p_mgr->idle_lock);
> >>>> -
> >>>> -	p_process_item = (osm_idle_item_t *) p_list_item;
> >>>> -
> >>>> -	if (p_process_item->pfn_done) {
> >>>> -
> >>>> -		p_process_item->pfn_done(p_process_item->context1,
> >>>> -					 p_process_item->context2);
> >>>> -	}
> >>>> -
> >>>> -	free(p_process_item);
> >>>> -
> >>>> -	OSM_LOG_EXIT(p_mgr->p_log);
> >>>> -	return;
> >>>> -}
> >>>> -
> >>>> -/**********************************************************************
> >>>> - 
> >>>> **********************************************************************/
> >>>> -static osm_signal_t __process_idle_time_queue_start(IN osm_state_mgr_t 
> >>>> *
> >>>> -						    const p_mgr)
> >>>> -{
> >>>> -	cl_qlist_t *p_list = &p_mgr->idle_time_list;
> >>>> -	cl_list_item_t *p_list_item;
> >>>> -	osm_idle_item_t *p_process_item;
> >>>> -	osm_signal_t signal;
> >>>> -
> >>>> -	OSM_LOG_ENTER(p_mgr->p_log, __process_idle_time_queue_start);
> >>>> -
> >>>> -	cl_spinlock_acquire(&p_mgr->idle_lock);
> >>>> -
> >>>> -	p_list_item = cl_qlist_head(p_list);
> >>>> -	if (p_list_item == cl_qlist_end(p_list)) {
> >>>> -		cl_spinlock_release(&p_mgr->idle_lock);
> >>>> -		OSM_LOG_EXIT(p_mgr->p_log);
> >>>> -		return OSM_SIGNAL_NONE;
> >>>> -	}
> >>>> -
> >>>> -	cl_spinlock_release(&p_mgr->idle_lock);
> >>>> -
> >>>> -	p_process_item = (osm_idle_item_t *) p_list_item;
> >>>> -
> >>>> -	CL_ASSERT(p_process_item->pfn_start);
> >>>> -
> >>>> -	signal =
> >>>> -	    p_process_item->pfn_start(p_process_item->context1,
> >>>> -				      p_process_item->context2);
> >>>> -
> >>>> -	CL_ASSERT(signal != OSM_SIGNAL_NONE);
> >>>> -
> >>>> -	OSM_LOG_EXIT(p_mgr->p_log);
> >>>> -	return signal;
> >>>> -}
> >>>> -
> >>>> -/**********************************************************************
> >>>>   * Go over all the remote SMs (as updated in the sm_guid_tbl).
> >>>>   * Find if there is a remote sm that is a master SM.
> >>>>   * If there is a remote master SM - return a pointer to it,
> >>>> @@ -1558,7 +1470,7 @@ void osm_state_mgr_process(IN osm_state_mgr_t * 
> >>>> const p_mgr,
> >>>>  		case OSM_SM_STATE_PROCESS_REQUEST:
> >>>>  			switch (signal) {
> >>>>  			case OSM_SIGNAL_IDLE_TIME_PROCESS:
> >>>> -				signal = __process_idle_time_queue_start(p_mgr);
> >>>> +				signal = osm_mcast_mgr_process_mgroups(p_mgr->p_mcast_mgr);
> >>>>  				switch (signal) {
> >>>>  				case OSM_SIGNAL_NONE:
> >>>>  					p_mgr->state = OSM_SM_STATE_IDLE;
> >>>> @@ -1604,14 +1516,6 @@ void osm_state_mgr_process(IN osm_state_mgr_t * 
> >>>> const p_mgr,
> >>>>  			switch (signal) {
> >>>>  			case OSM_SIGNAL_NO_PENDING_TRANSACTIONS:
> >>>>  			case OSM_SIGNAL_DONE:
> >>>> -				/* CALL the done function */
> >>>> -				__process_idle_time_queue_done(p_mgr);
> >>>> -
> >>>> -				/*
> >>>> -				 * Set the signal to OSM_SIGNAL_IDLE_TIME_PROCESS
> >>>> -				 * so that the next element in the queue gets processed
> >>>> -				 */
> >>>> -
> >>>>  				signal = OSM_SIGNAL_IDLE_TIME_PROCESS;
> >>>>  				p_mgr->state = OSM_SM_STATE_PROCESS_REQUEST;
> >>>>  				break;
> >>>> @@ -2424,41 +2328,3 @@ void osm_state_mgr_process(IN osm_state_mgr_t * 
> >>>> const p_mgr,
> >>>>   	OSM_LOG_EXIT(p_mgr->p_log);
> >>>>  }
> >>>> -
> >>>> -/**********************************************************************
> >>>> - 
> >>>> **********************************************************************/
> >>>> -ib_api_status_t
> >>>> -osm_state_mgr_process_idle(IN osm_state_mgr_t * const p_mgr,
> >>>> -			   IN osm_pfn_start_t pfn_start,
> >>>> -			   IN osm_pfn_done_t pfn_done, void *context1,
> >>>> -			   void *context2)
> >>>> -{
> >>>> -	osm_idle_item_t *p_idle_item;
> >>>> -
> >>>> -	OSM_LOG_ENTER(p_mgr->p_log, osm_state_mgr_process_idle);
> >>>> -
> >>>> -	p_idle_item = malloc(sizeof(osm_idle_item_t));
> >>>> -	if (p_idle_item == NULL) {
> >>>> -		osm_log(p_mgr->p_log, OSM_LOG_ERROR,
> >>>> -			"osm_state_mgr_process_idle: ERR 3321: "
> >>>> -			"insufficient memory\n");
> >>>> -		return IB_ERROR;
> >>>> -	}
> >>>> -
> >>>> -	memset(p_idle_item, 0, sizeof(osm_idle_item_t));
> >>>> -	p_idle_item->pfn_start = pfn_start;
> >>>> -	p_idle_item->pfn_done = pfn_done;
> >>>> -	p_idle_item->context1 = context1;
> >>>> -	p_idle_item->context2 = context2;
> >>>> -
> >>>> -	cl_spinlock_acquire(&p_mgr->idle_lock);
> >>>> -	cl_qlist_insert_tail(&p_mgr->idle_time_list, &p_idle_item->list_item);
> >>>> -	cl_spinlock_release(&p_mgr->idle_lock);
> >>>> -
> >>>> -	osm_sm_signal(&p_mgr->p_subn->p_osm->sm,
> >>>> -		      OSM_SIGNAL_IDLE_TIME_PROCESS_REQUEST);
> >>>> -
> >>>> -	OSM_LOG_EXIT(p_mgr->p_log);
> >>>> -
> >>>> -	return IB_SUCCESS;
> >>>> -}
> >> _______________________________________________
> >> general mailing list
> >> general at lists.openfabrics.org
> >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >>
> >> To unsubscribe, please visit 
> >> http://openib.org/mailman/listinfo/openib-general
> 


From YJia at tmriusa.com  Mon Dec 31 17:02:15 2007
From: YJia at tmriusa.com (Yicheng Jia)
Date: Mon, 31 Dec 2007 19:02:15 -0600
Subject: [ofa-general] synchronize commands issued to MTHCA
In-Reply-To: <adaejd6vz0q.fsf@cisco.com>
Message-ID: <OF5DD5B06A.128FA762-ON862573C3.0003F93E-862573C3.0005C99E@medical.local>

Hi Roland,

Thanks for your reply!

Actually, I'm working on porting the IB driver to the QNX platform. I have
resumed the work started by a former colleague, and I just found that the
synchronization code (dev->cmd.poll_sem and dev->cmd.hcr_mutex) had been
deleted for an unknown reason. After adding it back, the driver runs much
more smoothly.

However, I still get a command execution error which I believe is related to
command synchronization. The problem is that when the code that logs
"Created UDAV" runs while a SW2HW_MPT command is being executed, the
SW2HW_MPT command returns with a bad parameter error. Here is my debug
trace output:

139903841835 HCR CMD: op_code:          LE: d
139903861104 TRACE: mad.c:639/ib_mad_recv_done_handler
139903890876 HCR CMD: in_param_h:       LE: 0
139903942869 TRACE: mad.c:644/ib_mad_recv_done_handler
139903993296 HCR CMD: in_param_l:       LE: cf616000
139904038413 TRACE: verbs.c:182/ib_create_ah_from_wc
139904094753 HCR CMD: input_modifier:   LE: 1e
139904139150 TRACE: mthca_provider.c:447/mthca_ah_create
MTHCA DBG: <mthca_av.c:229> Created UDAV at 8075220/00000000:
139904197065 HCR CMD: out_pram_h:       LE: 0
139904333343   [ 0] 01000005
139904384499 HCR CMD: out_pram_l:       LE: 0
139904428086   [ 4] 0000ffff
139904478675 HCR CMD: token:            LE: ffff0000
139904520156   [ 8] 00003000
139904572059 HCR CMD: op_code_modifier: LE: 0
139904612802   [ c] 00000000
139904667693 HCR CMD: event:            LE: 0
139904708526   [10] 00000000
139904758422 HCR CMD 0x18h:             LE=80000d, BE=d008000
139904799210   [14] 00000000
139904904204   [18] 00000000
139904946792MTHCA DBG: <mthca_cmd.c:195> HCR_STATUS 40100698= d008000 ? 
8000
   [1c] 00000002
139905076860 TRACE: mthca_av.c:235/mthca_create_ah
139905112329 TRACE: mthca_av.c:243/mthca_create_ah
139905147672 TRACE: mthca_provider.c:460/mthca_ah_create
....
139906793007 HCR CMD: Status Return:              : 3

Do you have any idea?

Thanks and have a good new year!
Yicheng


Roland Dreier <rdreier at cisco.com> 
12/28/2007 11:39 PM

To
Yicheng Jia <YJia at tmriusa.com>
cc
general at lists.openfabrics.org
Subject
Re: [ofa-general] synchronize commands issued to MTHCA


 > I'm using OFED-1.0 and the problem I believe is related to command
 > synchronization of HCA. The host issues a MAD_INF command at first and
 > then a SW2HW_MTP command without waiting for the completion of the first
 > command. Both of commands return with bad parameters error.

I guess you mean the MAD_IFC and SW2HW_MPT commands?  I've never heard
of a problem like that -- more details about your hardware/software
config and the exact symptoms you see would be helpful in debugging.

Anyway OFED 1.0 is ancient by now -- you are much better off just
using drivers from the standard kernel.  If you must use OFED, then
OFED 1.2 or even a 1.3 prerelease would be better.

 > My question is why there's no synchronization mechanism for the command
 > execution on HCA, can I use "spin_lock" or "sem_wait" to synchronize
 > between every command?

The HCA firmware allows multiple commands to be queued.  The
dev->cmd.event_sem semaphore is used to limit the number of
outstanding commands to the HCA's capabilities, and the
dev->cmd.hcr_mutex mutex is used to serialize the actual writing of
commands to the HCA.
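
To make the two roles concrete, here is a minimal userspace sketch of that
pattern using POSIX semaphores and mutexes. It is not the mthca code:
MAX_OUTSTANDING, struct cmd_ctx, fake_hcr and the helper functions are
hypothetical names chosen for illustration, and the real slot limit comes
from the HCA rather than a constant.

```c
/* Userspace sketch only, not the mthca driver. All names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define MAX_OUTSTANDING 4   /* assumed limit; the real one comes from the HCA */

struct cmd_ctx {
	sem_t event_sem;            /* counts free command slots */
	pthread_mutex_t hcr_mutex;  /* serializes writes to the command register */
	unsigned long fake_hcr[2];  /* stand-in for the memory-mapped register */
	int posted;                 /* number of commands written so far */
};

static int cmd_ctx_init(struct cmd_ctx *c)
{
	c->posted = 0;
	c->fake_hcr[0] = c->fake_hcr[1] = 0;
	if (sem_init(&c->event_sem, 0, MAX_OUTSTANDING))
		return -1;
	return pthread_mutex_init(&c->hcr_mutex, NULL) ? -1 : 0;
}

/* Take a slot first, then write the command words under the mutex. */
static void post_command(struct cmd_ctx *c, unsigned long op,
			 unsigned long param)
{
	sem_wait(&c->event_sem);    /* blocks once MAX_OUTSTANDING are in flight */
	pthread_mutex_lock(&c->hcr_mutex);
	c->fake_hcr[0] = param;     /* both words are written as one unit */
	c->fake_hcr[1] = op;
	c->posted++;
	pthread_mutex_unlock(&c->hcr_mutex);
}

/* Called when the device reports completion: gives the slot back. */
static void complete_command(struct cmd_ctx *c)
{
	sem_post(&c->event_sem);
}

/* Number of command slots currently free. */
static int free_slots(struct cmd_ctx *c)
{
	int v = 0;

	sem_getvalue(&c->event_sem, &v);
	return v;
}
```

The point of the split is that the mutex is held only while the register is
written, so several commands can be in flight at once, while the semaphore
makes a caller wait only when every slot is taken.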

There was a mmiowb() added to mthca_cmd_post() fairly recently that
might fix your problems if you are running on a large SGI Altix system.

 - R.


From jackm at dev.mellanox.co.il  Mon Dec 31 23:03:38 2007
From: jackm at dev.mellanox.co.il (Jack Morgenstein)
Date: Tue, 1 Jan 2008 09:03:38 +0200
Subject: [ofa-general] synchronize commands issued to MTHCA
In-Reply-To: <OF5DD5B06A.128FA762-ON862573C3.0003F93E-862573C3.0005C99E@medical.local>
References: <OF5DD5B06A.128FA762-ON862573C3.0003F93E-862573C3.0005C99E@medical.local>
Message-ID: <200801010903.38891.jackm@dev.mellanox.co.il>

On Tuesday 01 January 2008 03:02, Yicheng Jia wrote:

Does your HCA use on-board memory?
(Run: "lspci" and look at "Mellanox" lines.  You have on-board memory
 if you see either:
	PCI bridge: Mellanox Technologies MT23108 InfiniHost HCA bridge (rev a1)
	InfiniBand: Mellanox Technologies MT23108 InfiniHost HCA (rev a1)
 OR:
   InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode)
)

In that case, when you create an AH in kernel space
(file mthca_av.c, function mthca_create_ah()), you will enter the following flow:
	if (ah->type == MTHCA_AH_ON_HCA) {
		memcpy_toio(dev->av_table.av_map + index * MTHCA_AV_SIZE,
			    av, MTHCA_AV_SIZE);
		kfree(av);
	}

Roland, do you think that the memcpy_toio() call might mess things up?

Maybe we need "wmb()" or "mmiowb()" here as well?
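
To illustrate the ordering concern, here is a userspace analogy only: C11
fences stand in for the kernel's wmb()/mmiowb(), ordinary memory stands in
for the HCA's on-board memory, and every name (struct fake_device, AV_SIZE,
write_av_then_post() and so on) is hypothetical. It shows the general
"copy, barrier, then post" pattern. Whether a missing barrier actually
explains the error reported above is not established.

```c
/* Userspace analogy, not kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define AV_SIZE 32  /* assumed entry size for this sketch */

struct fake_device {
	uint8_t av_table[4 * AV_SIZE];  /* stand-in for on-board AV memory */
	atomic_uint doorbell;           /* stand-in for the command "go" write */
};

static void fake_device_init(struct fake_device *dev)
{
	memset(dev->av_table, 0, sizeof dev->av_table);
	atomic_init(&dev->doorbell, 0);
}

/* Copy the address vector, fence, then ring the doorbell, so a consumer
 * that observes the doorbell also observes the complete AV contents. */
static void write_av_then_post(struct fake_device *dev, unsigned index,
			       const uint8_t *av, unsigned cmd)
{
	memcpy(dev->av_table + index * AV_SIZE, av, AV_SIZE);
	atomic_thread_fence(memory_order_release);  /* plays the barrier's role */
	atomic_store_explicit(&dev->doorbell, cmd, memory_order_relaxed);
}

/* Consumer side: check the doorbell, fence, then read the AV entry.
 * Returns 1 and fills 'out' if something was posted, 0 otherwise. */
static int read_av_if_posted(struct fake_device *dev, unsigned index,
			     uint8_t *out)
{
	if (atomic_load_explicit(&dev->doorbell, memory_order_relaxed) == 0)
		return 0;
	atomic_thread_fence(memory_order_acquire);
	memcpy(out, dev->av_table + index * AV_SIZE, AV_SIZE);
	return 1;
}
```

In the driver the equivalent change would be a barrier after the
memcpy_toio() call and before the next command is posted. That is the
open question here, not something this sketch confirms.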

- Jack

> Hi Roland,
> 
> Thanks for your reply!
> 
> Actually, I'm working on porting the IB driver to the QNX platform. I have
> resumed the work started by a former colleague, and I just found that the
> synchronization code (dev->cmd.poll_sem and dev->cmd.hcr_mutex) had been
> deleted for an unknown reason. After adding it back, the driver runs much
> more smoothly.
> 
> However, I still get a command execution error which I believe is related to
> command synchronization. The problem is that when the code that logs
> "Created UDAV" runs while a SW2HW_MPT command is being executed, the
> SW2HW_MPT command returns with a bad parameter error. Here is my debug
> trace output:
> 
> 139903841835 HCR CMD: op_code:          LE: d
> 139903861104 TRACE: mad.c:639/ib_mad_recv_done_handler
> 139903890876 HCR CMD: in_param_h:       LE: 0
> 139903942869 TRACE: mad.c:644/ib_mad_recv_done_handler
> 139903993296 HCR CMD: in_param_l:       LE: cf616000
> 139904038413 TRACE: verbs.c:182/ib_create_ah_from_wc
> 139904094753 HCR CMD: input_modifier:   LE: 1e
> 139904139150 TRACE: mthca_provider.c:447/mthca_ah_create
> MTHCA DBG: <mthca_av.c:229> Created UDAV at 8075220/00000000:
> 139904197065 HCR CMD: out_pram_h:       LE: 0
> 139904333343   [ 0] 01000005
> 139904384499 HCR CMD: out_pram_l:       LE: 0
> 139904428086   [ 4] 0000ffff
> 139904478675 HCR CMD: token:            LE: ffff0000
> 139904520156   [ 8] 00003000
> 139904572059 HCR CMD: op_code_modifier: LE: 0
> 139904612802   [ c] 00000000
> 139904667693 HCR CMD: event:            LE: 0
> 139904708526   [10] 00000000
> 139904758422 HCR CMD 0x18h:             LE=80000d, BE=d008000
> 139904799210   [14] 00000000
> 139904904204   [18] 00000000
> 139904946792MTHCA DBG: <mthca_cmd.c:195> HCR_STATUS 40100698= d008000 ? 
> 8000
>    [1c] 00000002
> 139905076860 TRACE: mthca_av.c:235/mthca_create_ah
> 139905112329 TRACE: mthca_av.c:243/mthca_create_ah
> 139905147672 TRACE: mthca_provider.c:460/mthca_ah_create
> ....
> 139906793007 HCR CMD: Status Return:              : 3
> 
> Do you have any idea?
> 
> Thanks and have a good new year!
> Yicheng
> 
> 
> 
> 
> 
> 

