From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 00:40:18 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 00:40:18 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070201084018.6FDD3E607F7@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #2 from erezz at voltaire.com 2007-02-01 00:40 -------
Created an attachment (id=71)
--> (https://bugs.openfabrics.org/attachment.cgi?id=71&action=view)
ofed.conf
--
Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 00:40:39 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 00:40:39 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070201084039.988DBE607F8@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #3 from erezz at voltaire.com 2007-02-01 00:40 -------
Created an attachment (id=72)
--> (https://bugs.openfabrics.org/attachment.cgi?id=72&action=view)
ofed_net.conf
From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 00:50:19 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 00:50:19 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070201085019.D3E05E607F7@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
erezz at voltaire.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |erezz at voltaire.com
Component|IB Core |iSER
------- Comment #4 from erezz at voltaire.com 2007-02-01 00:50 -------
I wasn't able to reproduce this behavior. I made the cma fix:
diff -ru openib-1.1/drivers/infiniband/core/cma.c openib-1.1-cma-fix/drivers/infiniband/core/cma.c
--- openib-1.1/drivers/infiniband/core/cma.c 2006-12-13 00:36:17.000000000 +0200
+++ openib-1.1-cma-fix/drivers/infiniband/core/cma.c 2007-02-01 09:57:47.000000000 +0200
@@ -43,6 +43,7 @@
#include
#include
#include
+#include
MODULE_AUTHOR("Sean Hefty");
MODULE_DESCRIPTION("Generic RDMA CM Agent");
Before installing OFED, I installed the open-iscsi package that was shipped
with SLES 10 (open-iscsi-0.5.545-9.12). Then, the installation was successful:
thyme:/tmp/OFED-1.1.1-ib_local_sa # ./install.sh -c ofed.conf -net ofed_net.conf
Removing previous InfiniBand Software installations
Installing OFED software into /usr/local/ofed
Running /bin/rpm -ihv --force --nodeps
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/kernel-ib-1.1-2.6.16.21_0.8_smp.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/kernel-ib-devel-1.1-2.6.16.21_0.8_smp.x86_64.rpm
Running /bin/rpm -ihv
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibcm-0.9.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibcm-devel-0.9.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibcommon-1.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibcommon-devel-1.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibmad-1.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibmad-devel-1.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibumad-1.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibumad-devel-1.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibverbs-1.0.4-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibverbs-devel-1.0.4-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libibverbs-utils-1.0.4-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libmthca-1.0.3-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libmthca-devel-1.0.3-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libopensm-2.0.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libosmcomp-2.0.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/libosmvendor-2.0.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/librdmacm-0.9.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/librdmacm-devel-0.9.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/librdmacm-utils-0.9.0-0.x86_64.rpm
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/openib-diags-1.1.0-0.x86_64.rpm
Running /bin/rpm -Uhv
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/oiscsi-iser-support-1-1.x86_64.rpm
Running /bin/rpm -Uhv
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/ofed-docs-1.1.1-0.noarch.rpm
Running /bin/rpm -Uhv
/tmp/OFED-1.1.1-ib_local_sa/RPMS/sles-release-10-15.2/ofed-scripts-1.1.1-0.noarch.rpm
IPoIB configuration for ib0:
IPADDR=192.168.10.58
NETMASK=255.255.255.0
NETWORK=192.168.10.0
BROADCAST=192.168.10.255
ONBOOT=yes
IPoIB configuration for ib1:
IPADDR=195.168.10.58
NETMASK=255.255.10.0
NETWORK=195.168.10.0
BROADCAST=195.168.10.255
ONBOOT=no
Installation finished successfully...
thyme:/tmp/OFED-1.1.1-ib_local_sa # rpm -qa|grep kernel-ib
kernel-ib-1.1-2.6.16.21_0.8_smp
kernel-ib-devel-1.1-2.6.16.21_0.8_smp
thyme:/tmp/OFED-1.1.1-ib_local_sa # rpm -ql kernel-ib-1.1-2.6.16.21_0.8_smp|grep iser
/lib/modules/2.6.16.21-0.8-smp/kernel/drivers/infiniband/ulp/iser
/lib/modules/2.6.16.21-0.8-smp/kernel/drivers/infiniband/ulp/iser/ib_iser.ko
For some reason, on your machine scsi/libiscsi.h was missing. On my machine it
is located here (this is where SLES 10 puts it):
thyme:/tmp/OFED-1.1.1-ib_local_sa # find /usr/src/linux-2.6.16.21-0.8 -name libiscsi.h
/usr/src/linux-2.6.16.21-0.8/drivers/scsi/libiscsi.h
If you take a look at
openib-1.1/kernel_patches/backport/2.6.16_sles10/include_libiscsi.patch, you
will see that iSER will look for it in the right place. Therefore, I don't
understand what happened on your machine. Please check the following:
1. rpm -q open-iscsi
2. find /usr/src/linux-2.6.16.21-0.8 -name libiscsi.h
3. Check that kernel_patches/backport/2.6.16_sles10/include_libiscsi.patch was
applied successfully.
From ogerlitz at voltaire.com Thu Feb 1 00:58:56 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 01 Feb 2007 10:58:56 +0200
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <1170275331.14294.1.camel@stevo-desktop>
References: <1170275331.14294.1.camel@stevo-desktop>
Message-ID: <45C1ABD0.5090404@voltaire.com>
Steve Wise wrote:
> where can I find this symbol? I can't load rdma_cm on rhel4u4...
> rdma_cm: Unknown symbol ip_ib_mc_map
Sean, OK, sorry not to mention the rh4u4 issue once you did the push to
OFED 1.2 ...
For a reason that no one at RH can trace... someone went and removed
all the support for ARPHRD_INFINIBAND multicast from u4 where it exists
perfectly fine in u3 and hopefully on u5 as well (Doug can you update?),
see https://bugs.openfabrics.org/show_bug.cgi?id=2661
Specifically, without the below snip from the patch, on rh4 u4 all
IPv4 ARPHRD_INFINIBAND multicast goes on the broadcast group !!!
> Index: linux-2.6.9/net/ipv4/arp.c
> ===================================================================
> --- linux-2.6.9.orig/net/ipv4/arp.c 2004-10-18 23:55:06.000000000 +0200
> +++ linux-2.6.9/net/ipv4/arp.c 2006-09-20 14:43:59.000000000 +0300
> @@ -213,6 +213,9 @@
> case ARPHRD_IEEE802_TR:
> ip_tr_mc_map(addr, haddr);
> return 0;
> + case ARPHRD_INFINIBAND:
> + ip_ib_mc_map(addr, haddr);
> + return 0;
> default:
> if (dir) {
> memcpy(haddr, dev->broadcast, dev->addr_len);
anyway, OFED wise, i see two ways to solve this:
1) adding a backport to the rdma_cm containing ip_ib_mc_map, period.
This means that apps offloading multicast traffic through the rdma cm
would use the correct group where apps working through the net stack
use the broadcast group.
2) having the rdma cm follow the net stack and make its consumer use the
broadcast group.
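A minimal sketch of what a backport for option 1 would need to supply, assuming the byte layout matches the mainline ip_ib_mc_map() inline of this period (worth checking against include/net/ip.h before relying on it). The name ip_ib_mc_map_sketch is hypothetical, the default P_Key is an assumption, and the code is written for userspace so it can be compiled on its own:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical userspace stand-in for the kernel's ip_ib_mc_map().
 * addr: IPv4 multicast address in network byte order.
 * buf:  receives the 20-byte IPoIB hardware address
 *       (4 bytes of QPN field followed by the 16-byte MGID). */
static void ip_ib_mc_map_sketch(uint32_t addr, unsigned char *buf)
{
	assert(buf != NULL);

	memset(buf, 0, 20);
	buf[1] = 0xff;	/* multicast QPN 0xffffff */
	buf[2] = 0xff;
	buf[3] = 0xff;
	buf[4] = 0xff;	/* MGID multicast prefix */
	buf[5] = 0x12;	/* link-local scope */
	buf[6] = 0x40;	/* IPv4 signature 0x401b */
	buf[7] = 0x1b;
	buf[8] = 0xff;	/* P_Key, assumed to be the default 0xffff */
	buf[9] = 0xff;

	/* The low 28 bits of the group address fill the MGID tail. */
	addr = ntohl(addr);
	buf[19] = addr & 0xff;
	buf[18] = (addr >> 8) & 0xff;
	buf[17] = (addr >> 16) & 0xff;
	buf[16] = (addr >> 24) & 0x0f;
}
```

With a helper like this in the rdma_cm backport, the CM would compute the per-group address itself while the U4 net stack keeps falling back to the broadcast address, which is exactly the mismatch described in option 1.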
Or.
From swise at opengridcomputing.com Thu Feb 1 01:01:24 2007
From: swise at opengridcomputing.com (Steve WIse)
Date: Thu, 01 Feb 2007 03:01:24 -0600
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <45C1480C.1020600@ichips.intel.com>
References: <000101c74576$fedc81f0$8698070a@amr.corp.intel.com>
<1170275680.14294.5.camel@stevo-desktop>
<45C1480C.1020600@ichips.intel.com>
Message-ID: <1170320484.654.6.camel@linux-q667.site>
On Wed, 2007-01-31 at 17:53 -0800, Sean Hefty wrote:
> Steve Wise wrote:
> > Perhaps there's no backport for this to rhel4u4?
>
> I would have thought so, but I really don't know. The function is called from
> net/ipv4/arp.c, and not directly by ipoib. So, I don't know how the backport
> patches typically handle this.
>
> - Sean
Here's what I see:
ip_ib_mc_map() is called directly from cma_join_ib_multicast(), which is
added to the ofed_1_2 cma.c via patch file:
kernel_patches/fixes/sean_multicast_1.patch
So when I compiled ofed_1_2 on rhel4u4, the cma wouldn't load because
there is no ip_ib_mc_map() in rhel4u4.
So you need a backport patch for this to work on rhel4u4. Probably many
of the older kernels.
Steve.
From mst at mellanox.co.il Thu Feb 1 01:06:28 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 11:06:28 +0200
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <45C1ABD0.5090404@voltaire.com>
References: <1170275331.14294.1.camel@stevo-desktop>
<45C1ABD0.5090404@voltaire.com>
Message-ID: <20070201090628.GC14189@mellanox.co.il>
> For a reason that no one at RH can trace... someone went and removed
> all the support for ARPHRD_INFINIBAND multicast from u4 where it exists
> perfectly fine in u3 and hopefully on u5 as well (Doug can you update?),
> see https://bugs.openfabrics.org/show_bug.cgi?id=2661
>
> Specifically, the below snip from the patch means that on rh4 u4 all
> IPv4 ARPHRD_INFINIBAND multicast goes on the broadcast group !!!
>
> > Index: linux-2.6.9/net/ipv4/arp.c
> > ===================================================================
> > --- linux-2.6.9.orig/net/ipv4/arp.c 2004-10-18 23:55:06.000000000 +0200
> > +++ linux-2.6.9/net/ipv4/arp.c 2006-09-20 14:43:59.000000000 +0300
> > @@ -213,6 +213,9 @@
> > case ARPHRD_IEEE802_TR:
> > ip_tr_mc_map(addr, haddr);
> > return 0;
> > + case ARPHRD_INFINIBAND:
> > + ip_ib_mc_map(addr, haddr);
> > + return 0;
> > default:
> > if (dir) {
> > memcpy(haddr, dev->broadcast, dev->addr_len);
>
> anyway, OFED wise, i see two ways to solve this:
>
> 1) adding a backport to the rdma_cm containing ip_ib_mc_map, period.
>
> This means that apps offloading multicast traffic through the rdma cm
> would use the correct group where apps working through the net stack
> use the broadcast group.
>
> 2) having the rdma cm follow the net stack and make its consumer use the
> broadcast group.
Correct. Since multicast is broken in other respects on U4
(sockets can't join multicast groups), I think 2 is the simplest approach.
Anyone who wants IPoIB multicast should just stay away from U4.
--
MST
From mst at mellanox.co.il Thu Feb 1 01:09:58 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 11:09:58 +0200
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <1170320484.654.6.camel@linux-q667.site>
References: <000101c74576$fedc81f0$8698070a@amr.corp.intel.com>
<1170275680.14294.5.camel@stevo-desktop>
<45C1480C.1020600@ichips.intel.com>
<1170320484.654.6.camel@linux-q667.site>
Message-ID: <20070201090958.GD14189@mellanox.co.il>
> Quoting Steve WIse :
> Subject: Re: ip_ib_mc_map?
>
> On Wed, 2007-01-31 at 17:53 -0800, Sean Hefty wrote:
> > Steve Wise wrote:
> > > Perhaps there's no backport for this to rhel4u4?
> >
> > I would have thought so, but I really don't know. The function is called from
> > net/ipv4/arp.c, and not directly by ipoib. So, I don't know how the backport
> > patches typically handle this.
> >
> > - Sean
>
> Here's what I see:
>
> ip_ib_mc_map() is called directly from cma_join_ib_multicast(), which is
> added to the ofed_1_2 cma.c via patch file:
> kernel_patches/fixes/sean_multicast_1.patch
>
> So when I compiled ofed_1_2 on rhel4u4, the cma wouldn't load because
> there is no ip_ib_mc_map() in rhel4u4.
>
> So you need a backport patch for this to work on rhel4u4. Probably many
> of the older kernels.
I think this breakage is U4 specific. Someone at RH went to the trouble to
rip all of IB related stuff out of the U4 kernel.
I think just calling ip_tr_mc_map on U4 instead will be enough.
--
MST
From ogerlitz at voltaire.com Thu Feb 1 01:17:53 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 01 Feb 2007 11:17:53 +0200
Subject: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K)
in kernel level fails
In-Reply-To: <45C0662A.7050203@dev.mellanox.co.il>
References: <45BF0575.9020507@dev.mellanox.co.il>
<45BF1866.3010807@voltaire.com>
<45C0662A.7050203@dev.mellanox.co.il>
Message-ID: <45C1B041.4000000@voltaire.com>
Dotan Barak wrote:
> I think that now, when implementation of IPoIB CM is available and SRQ
> is being used, one may
> need to use a SRQ with more than 16K WRs.
IPoIB UD uses SRQ by nature (since RX from all peers consumes buffers
from the --only-- RQ) and lives fine with 32 buffers (or 64; you can look
in the code). Moreover, my assumption is that
pps(RC) <= pps(UC) <= pps(UD)
This means that whatever number of RX buffers for UD/2K MTU is
"enough" to have no (or close to zero) packet loss under some traffic
pattern, the same pattern can be served with IPoIB CM using an SRQ of the
same size.
Or.
From swise at opengridcomputing.com Thu Feb 1 01:37:50 2007
From: swise at opengridcomputing.com (Steve WIse)
Date: Thu, 01 Feb 2007 03:37:50 -0600
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <000401c7458b$9bff77d0$8698070a@amr.corp.intel.com>
References: <000401c7458b$9bff77d0$8698070a@amr.corp.intel.com>
Message-ID: <1170322670.654.23.camel@linux-q667.site>
> Okay - I _think_ the problem is that OFED 1.2 pulled code from my git tree
> before I created an ofed_1_2 branch (which contains the fix), and didn't update
> to match my ofed_1_2 branch. The crash that you reported occurring over iWarp
> should also happen over IB for the same reason, so both are likely broken atm...
>
> Vlad, can you please update the ofed build by pulling from the ofed_1_2 branches
> of my rdma-dev.git and librdmacm.git trees?
I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
you made there will resolve this issue. It just needs to be pulled into
ofed_1_2.
Thanks!
Steve.
From ogerlitz at voltaire.com Thu Feb 1 01:38:46 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 01 Feb 2007 11:38:46 +0200
Subject: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K)
in kernel level fails
In-Reply-To:
References: <45BF0575.9020507@dev.mellanox.co.il>
<45BF1866.3010807@voltaire.com>
Message-ID: <45C1B526.30101@voltaire.com>
Roland Dreier wrote:
> > anyway, the solution that comes into my mind is to disable creating a
> > QP/SRQ for which > 128KB allocations are needed. So
> > mthca_query_device() will set the max_qp_wr and max_srq_wr attributes
> > to values whose derived size still allows to use kmalloc.
>
> But that will limit the size of the queues that userspace can create
> too. I guess we could allocate kernel wrid arrays with vmalloc(), but
> I wonder if anyone actually cares about this limit...
mmm, i would avoid vmalloc if possible. Allocating up to 128K bytes for a
kernel resource sounds fine.
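The arithmetic behind the 16K figure can be sketched as follows, assuming the per-WR bookkeeping entry is a 64-bit work request ID and the kmalloc ceiling is the 128 KB mentioned in this thread. KMALLOC_LIMIT_BYTES, WRID_ENTRY_BYTES and both helpers are illustrative names, not mthca symbols:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KMALLOC_LIMIT_BYTES (128u * 1024u)	/* assumed kmalloc ceiling */
#define WRID_ENTRY_BYTES    sizeof(uint64_t)	/* assumed: one u64 wrid per WR */

/* Largest WR count whose wrid array still fits in a single kmalloc. */
static uint32_t max_wr_for_kmalloc(size_t limit_bytes, size_t entry_bytes)
{
	assert(entry_bytes > 0);
	return (uint32_t)(limit_bytes / entry_bytes);
}

/* Nonzero if the wrid array for nwr entries exceeds the assumed limit. */
static int wrid_array_too_big(uint32_t nwr)
{
	return (size_t)nwr * WRID_ENTRY_BYTES > KMALLOC_LIMIT_BYTES;
}
```

Under these assumptions exactly 16384 WRs fit, which is consistent with creation failing only for queues larger than 16K.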
As for user space sharing the same limitation, how about adding
to the --kernel-- struct ib_device_attr "for user space" buddy fields to
max_qp_wr, max_srq_wr and max_cqe, such that each hw driver sets both
values: for the "user space" field the actual hw limitation and for the
"kernel space" field a value which would pass kmalloc.
Kernel ULPs calling ib_query_device would use the original fields, so there
is no need to patch them. The same goes for user space ULPs.
However, when the call is made from user space, uverbs_query_device
copies the "user space" attr to the resp struct.
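One way to picture the proposal, as a sketch only: the struct is a cut-down stand-in for struct ib_device_attr, and the user_max_* field names, the single kernel_cap parameter and both helper functions are hypothetical. The driver fills both sets of limits, kernel callers keep reading the original fields, and the uverbs path copies the user space values into its response:

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-in for struct ib_device_attr. */
struct dev_attr_sketch {
	int max_qp_wr;		/* kernel-safe limits */
	int max_srq_wr;
	int max_cqe;
	int user_max_qp_wr;	/* hypothetical buddy fields: raw hw limits */
	int user_max_srq_wr;
	int user_max_cqe;
};

/* Response handed back to user space. */
struct query_resp_sketch {
	int max_qp_wr;
	int max_srq_wr;
	int max_cqe;
};

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* What a hw driver's query_device would do: set both values per limit. */
static void fill_attr_sketch(struct dev_attr_sketch *a,
			     int hw_wr, int hw_cqe, int kernel_cap)
{
	assert(a != NULL);
	a->user_max_qp_wr  = hw_wr;
	a->user_max_srq_wr = hw_wr;
	a->user_max_cqe    = hw_cqe;
	a->max_qp_wr  = min_int(hw_wr, kernel_cap);
	a->max_srq_wr = min_int(hw_wr, kernel_cap);
	a->max_cqe    = min_int(hw_cqe, kernel_cap);
}

/* The uverbs path reports the user space values, not the capped ones. */
static void uverbs_query_sketch(const struct dev_attr_sketch *a,
				struct query_resp_sketch *resp)
{
	assert(a != NULL && resp != NULL);
	resp->max_qp_wr  = a->user_max_qp_wr;
	resp->max_srq_wr = a->user_max_srq_wr;
	resp->max_cqe    = a->user_max_cqe;
}
```

Because the response struct keeps its existing layout, a change along these lines would not alter the user/kernel ABI.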
Or.
From mst at mellanox.co.il Thu Feb 1 01:50:03 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 11:50:03 +0200
Subject: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K)
in kernel level fails
In-Reply-To: <45C1B526.30101@voltaire.com>
References: <45BF0575.9020507@dev.mellanox.co.il>
<45BF1866.3010807@voltaire.com>
<45C1B526.30101@voltaire.com>
Message-ID: <20070201095003.GA15505@mellanox.co.il>
> As for the user space sharing of the same limitation, how about adding
> to the --kernel-- struct ib_device_attr "for user space" buddy fields to
> max_qp_wr max_srq_wr and max_cqe such that each hw driver set both
> values: for the "user space" field the actual hw limitation and for
> "kernel space" field a value which would pass kmalloc.
We could do that I guess but no one so far used query in kernel,
and userspace values are currently good.
--
MST
From dledford at redhat.com Thu Feb 1 02:17:32 2007
From: dledford at redhat.com (Doug Ledford)
Date: Thu, 01 Feb 2007 05:17:32 -0500
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <45C1ABD0.5090404@voltaire.com>
References: <1170275331.14294.1.camel@stevo-desktop>
<45C1ABD0.5090404@voltaire.com>
Message-ID: <1170325052.2716.229.camel@fc6.xsintricity.com>
On Thu, 2007-02-01 at 10:58 +0200, Or Gerlitz wrote:
> Steve Wise wrote:
> > where can I find this symbol? I can't load rdma_cm on rhel4u4...
> > rdma_cm: Unknown symbol ip_ib_mc_map
>
> Sean, OK, sorry not to mention the rh4u4 issue once you did the push to
> OFED 1.2 ...
>
> For a reason that no one at RH can trace... someone went and removed
> all the support for ARPHRD_INFINIBAND multicast from u4 where it exists
> perfectly fine in u3 and hopefully on u5 as well (Doug can you update?),
> see https://bugs.openfabrics.org/show_bug.cgi?id=2661
Yes. It's been fixed for U5. It wasn't that the patch got removed,
it's that between U3 and U4 I did a complete rebase, which means that
all the patches from U3 were tossed out the window and a complete new
set made for U4. I just missed re-adding this one in U4.
> Specifically, the below snip from the patch means that on rh4 u4 all
> IPv4 ARPHRD_INFINIBAND multicast goes on the broadcast group !!!
>
> > Index: linux-2.6.9/net/ipv4/arp.c
> > ===================================================================
> > --- linux-2.6.9.orig/net/ipv4/arp.c 2004-10-18 23:55:06.000000000 +0200
> > +++ linux-2.6.9/net/ipv4/arp.c 2006-09-20 14:43:59.000000000 +0300
> > @@ -213,6 +213,9 @@
> > case ARPHRD_IEEE802_TR:
> > ip_tr_mc_map(addr, haddr);
> > return 0;
> > + case ARPHRD_INFINIBAND:
> > + ip_ib_mc_map(addr, haddr);
> > + return 0;
> > default:
> > if (dir) {
> > memcpy(haddr, dev->broadcast, dev->addr_len);
>
> anyway, OFED wise, i see two ways to solve this:
>
> 1) adding a backport to the rdma_cm containing ip_ib_mc_map, period.
>
> This means that apps offloading multicast traffic through the rdma cm
> would use the correct group where apps working through the net stack
> use the broadcast group.
>
> 2) having the rdma cm follow the net stack and make its consumer use the
> broadcast group.
>
> Or.
--
Doug Ledford
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 02:22:40 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 02:22:40 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070201102241.38A69E607F8@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #5 from dmitry.yulov at intel.com 2007-02-01 02:22 -------
Created an attachment (id=73)
--> (https://bugs.openfabrics.org/attachment.cgi?id=73&action=view)
The file configuration for OFED
From vlad at lists.openfabrics.org Thu Feb 1 02:23:03 2007
From: vlad at lists.openfabrics.org (vlad at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 02:23:03 -0800 (PST)
Subject: [openib-general] ofa_1_2_kernel 20070201-0200 daily build status
Message-ID: <20070201102303.B082FE607FA@openfabrics.org>
This email was generated automatically, please do not reply
Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod --with-addr_trans-mod --with-cxgb3-mod
Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.13
Passed on x86_64 with linux-2.6.18
Passed on powerpc with linux-2.6.18
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.19
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.16
Passed on powerpc with linux-2.6.17
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.14
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.12
Passed on powerpc with linux-2.6.13
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.16
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.18
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.12
Failed:
Build failed on ia64 with linux-2.6.16.21-0.8-default
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070201-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function ‘register_netevent_notifier’
/home/vlad/tmp/ofa_1_2_kernel-20070201-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function ‘addr_cleanup’:
/home/vlad/tmp/ofa_1_2_kernel-20070201-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function ‘unregister_netevent_notifier’
make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070201-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070201-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070201-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070201-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 02:30:18 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 02:30:18 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070201103018.4DA5EE607F7@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
dmitry.yulov at intel.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |INVALID
------- Comment #6 from dmitry.yulov at intel.com 2007-02-01 02:30 -------
Hi,
Thanks a lot for the explanation.
I have some comments for you:
First of all, I need to run the build script to make the RPMs; I use the
build.sh script for this. I also need to build all packages from source. I have
attached my configuration file for building the RPMs, and I see some differences
from your file.
> rpm -q open-iscsi
It was present before I ran the RPM build.
> find /usr/src/linux-2.6.16.21-0.8 -name libiscsi.h
The file is present on my machine.
> Check that kernel_patches/backport/2.6.16_sles10/include_libiscsi.patch was
> applied successfully.
I checked it, and the patch applied successfully.
Could you please try to build OFED-1.1.1-ib_local_sa from source using my
configuration file instead of yours? I got OFED-1.1.1-ib_local_sa from
https://svn.openfabrics.org/svn/openib/gen2/branches/1.1/ofed/releases/OFED-1.1.1-ib_local_sa.tgz.
From kliteyn at dev.mellanox.co.il Thu Feb 1 02:35:01 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 01 Feb 2007 12:35:01 +0200
Subject: [openib-general] [PATCH] osm: trivial casting for compilation on
windows
Message-ID: <45C1C255.4060405@dev.mellanox.co.il>
Trivial casting for compilation on windows
Signed-off-by: Yevgeny Kliteynik
---
osm/opensm/osm_subnet.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/osm/opensm/osm_subnet.c b/osm/opensm/osm_subnet.c
index f2e909b..e4e69c0 100644
--- a/osm/opensm/osm_subnet.c
+++ b/osm/opensm/osm_subnet.c
@@ -562,7 +562,7 @@ __osm_subn_opts_unpack_uint16(
if (!strcmp(p_req_key, p_key))
{
- val = strtoul(p_val_str, NULL, 0);
+ val = (uint16_t)strtoul(p_val_str, NULL, 0);
if (val != *p_val)
{
char buff[128];
--
1.4.4.1.GIT
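The cast silences the truncation warning, but an option string holding a value above 65535 is still narrowed silently. A slightly more defensive variant, sketched here with a hypothetical helper parse_uint16() that is not part of OpenSM, checks the range before narrowing:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse str as an unsigned 16-bit value, with the base auto-detected as
 * with strtoul(..., 0). Returns 0 on success, -1 on syntax or range error. */
static int parse_uint16(const char *str, uint16_t *out)
{
	char *end;
	unsigned long v;

	assert(str != NULL && out != NULL);

	errno = 0;
	v = strtoul(str, &end, 0);
	if (errno != 0 || end == str || *end != '\0' || v > UINT16_MAX)
		return -1;

	*out = (uint16_t)v;	/* explicit narrowing, as in the patch */
	return 0;
}
```

Whether rejecting out-of-range option values is the desired behaviour for the options file is a separate question from the compile fix.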
From dotanb at dev.mellanox.co.il Thu Feb 1 02:41:25 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Thu, 01 Feb 2007 12:41:25 +0200
Subject: [openib-general] IB/mthca: question about HCA profile module
parameters
Message-ID: <45C1C3D5.1050301@dev.mellanox.co.il>
Hi Moni.
I tried to use the mthca module parameter: for example i tried to change
the number of QPs.
I got several failures when i used the HCA 25204:
* sometimes i got the following error message (when using big values,
for example 512K QPs):
ib_mthca: 0000:0c: INIT_HCA command failed aborting.
ib_mthca: probe of 0000:0c: failed with error -16
* when i tried to use a small number of QPs (1024) the machine just hung
and i noticed a kernel oops message on the console
Did you verify the HCA profile module parameter feature?
Are there any known limitations on the values that should be used?
(for example: only values which are powers of two)
thanks
Dotan
From swise at opengridcomputing.com Thu Feb 1 02:53:25 2007
From: swise at opengridcomputing.com (Steve WIse)
Date: Thu, 01 Feb 2007 04:53:25 -0600
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <1170322670.654.23.camel@linux-q667.site>
References: <000401c7458b$9bff77d0$8698070a@amr.corp.intel.com>
<1170322670.654.23.camel@linux-q667.site>
Message-ID: <1170327205.654.34.camel@linux-q667.site>
On Thu, 2007-02-01 at 03:37 -0600, Steve WIse wrote:
> > Okay - I _think_ the problem is that OFED 1.2 pulled code from my git tree
> > before I created an ofed_1_2 branch (which contains the fix), and didn't update
> > to match my ofed_1_2 branch. The crash that you reported occurring over iWarp
> > should also happen over IB for the same reason, so both are likely broken atm...
> >
> > Vlad, can you please update the ofed build by pulling from the ofed_1_2 branches
> > of my rdma-dev.git and librdmacm.git trees?
>
> I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
> you made there will resolve this issue. It just needs to be pulled into
> ofed_1_2.
>
Also, I just pulled down and built the latest ofed_1_2 kernel and user
code against 2.6.20-rc7, and the ucma abi is 4. So rdma_create_qp()
will still crash even with the librdmacm code to avoid the call to
rdma_init_qp_attr for ABI 3 kernels.
Steve.
From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 03:04:17 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 03:04:17 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070201110417.93A48E607F7@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
dmitry.yulov at intel.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|INVALID |
------- Comment #7 from dmitry.yulov at intel.com 2007-02-01 03:04 -------
I tried to build the product again and saw that the patches from the
kernel_patches/backport/2.6.16_sles10/ directory are not applied automatically.
When I apply these patches manually, everything builds. How can I run the build
process so that the patches are applied automatically?
From ogerlitz at voltaire.com Thu Feb 1 03:10:48 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 01 Feb 2007 13:10:48 +0200
Subject: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K)
in kernel level fails
In-Reply-To: <20070201095003.GA15505@mellanox.co.il>
References: <45BF0575.9020507@dev.mellanox.co.il>
<45BF1866.3010807@voltaire.com>
<45C1B526.30101@voltaire.com> <20070201095003.GA15505@mellanox.co.il>
Message-ID: <45C1CAB8.2080806@voltaire.com>
Michael S. Tsirkin wrote:
>> As for the user space sharing of the same limitation, how about adding
>> to the --kernel-- struct ib_device_attr "for user space" buddy fields to
>> max_qp_wr max_srq_wr and max_cqe such that each hw driver set both
>> values: for the "user space" field the actual hw limitation and for
>> "kernel space" field a value which would pass kmalloc.
> We could do that I guess but no one so far used query in kernel,
> and userspace values are currently good.
srp calls ib_query_device but does not care about these fields. As for
IPoIB CM, if you see things as in my other email, i guess you don't need
to query either.
However, as this is an easy-to-implement change which does not
break the user/kernel ABI and allows kernel consumers to count on the query
results they get from the hw driver, i think that longer term we do
want to have it done.
Or.
From kliteyn at dev.mellanox.co.il Thu Feb 1 03:48:48 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 01 Feb 2007 13:48:48 +0200
Subject: [openib-general] [PATCH] osm: some trivial changes in the
osm_ucast_lash for compilation on windows
Message-ID: <45C1D3A0.7060201@dev.mellanox.co.il>
Hi Hal,
This patch has some trivial changes in the osm_ucast_lash.c
for compilation on windows.
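The declaration moves in this patch follow from a compiler constraint: Microsoft's C compiler of this era accepted only C89-style blocks, where every declaration precedes the first statement. A minimal sketch of the pattern, using hypothetical stand-in types rather than the real OpenSM structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ib_port_info_t and osm_physp_t. */
struct port_info_sketch {
	uint16_t base_lid;
};

struct physp_sketch {
	struct port_info_sketch port_info;
};

/* C89 style: all declarations come before the first statement. */
static int lid_matches(const struct physp_sketch *physp, uint16_t lid_no)
{
	const struct port_info_sketch *port_info;	/* declared up front */
	uint16_t port_lid;

	assert(physp != NULL);				/* first statement */
	port_info = &physp->port_info;			/* assigned afterwards */
	port_lid = port_info->base_lid;
	return port_lid == lid_no;
}
```

Declaring port_info and port_lid at the point of first use, as the original code did, is valid C99 but is rejected by a C89-only compiler.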
In general, this file needs a major cosmetic (and not only)
patch to fit better into the OSM code. Will get back to it
at some point in the future.
-- Yevgeny
Signed-off-by: Yevgeny Kliteynik
---
osm/opensm/osm_ucast_lash.c | 80 ++++++++++++++++++++++--------------------
1 files changed, 42 insertions(+), 38 deletions(-)
diff --git a/osm/opensm/osm_ucast_lash.c b/osm/opensm/osm_ucast_lash.c
index 70e5cbe..95f3ec9 100644
--- a/osm/opensm/osm_ucast_lash.c
+++ b/osm/opensm/osm_ucast_lash.c
@@ -217,6 +217,8 @@ static uint8_t find_port_from_lid(IN con
uint8_t port_count = 0;
uint8_t i=0;
osm_physp_t *p_current_physp, *p_remote_physp = NULL;
+ ib_port_info_t *port_info;
+ ib_net16_t port_lid;
uint8_t egress_port = 255;
@@ -227,8 +229,8 @@ static uint8_t find_port_from_lid(IN con
// process management port first
p_current_physp = osm_node_get_physp_ptr(p_sw->p_node, 0);
- ib_port_info_t *port_info = &p_current_physp->port_info;
- ib_net16_t port_lid = port_info->base_lid;
+ port_info = &p_current_physp->port_info;
+ port_lid = port_info->base_lid;
if (port_lid == lid_no) {
egress_port = 0;
goto Exit;
@@ -294,15 +296,15 @@ static int cycle_exists(cdg_vertex_t * s
} else {
if(current == NULL) {
current = start;
- assert(prev == NULL);
+ CL_ASSERT(prev == NULL);
}
current->visiting_number = visit_num;
if(prev != NULL) {
prev->next = current;
- assert(prev->to == current->from);
- assert(prev->visiting_number > 0);
+ CL_ASSERT(prev->to == current->from);
+ CL_ASSERT(prev->visiting_number > 0);
}
new_visit_num = visit_num + 1;
@@ -346,7 +348,7 @@ static void remove_semipermanent_depend_
while(sw != dest_switch){
v = cdg_vertex_matrix[lane][sw][i_next_switch];
- assert(v != NULL);
+ CL_ASSERT(v != NULL);
if(v->num_using_vertex == 1) {
@@ -366,7 +368,7 @@ static void remove_semipermanent_depend_
depend = i;
}
- assert(found);
+ CL_ASSERT(found);
if(v->num_using_this_depend[depend] == 1) {
for(i=depend; i<v->num_dependencies-1; i++) {
@@ -403,7 +405,7 @@ static void enqueue(lash_t *p_lash, int
switch_t **switches = p_lash->switches;
q_item_t *q_head;
- assert(switches[sw]->q_member == 0);
+ CL_ASSERT(switches[sw]->q_member == 0);
switches[sw]->q_member = 1;
switches[sw]->dist = dist;
switches[sw]->prev = prev;
@@ -454,7 +456,7 @@ static void dequeue(lash_t *p_lash, int
*dist = switches[q_min->sw]->dist;
*prev = switches[q_min->sw]->prev;
- assert(switches[q_min->sw]->q_member == 1 && !switches[q_min->sw]->mst_member);
+ CL_ASSERT(switches[q_min->sw]->q_member == 1 && !switches[q_min->sw]->mst_member);
switches[q_min->sw]->q_member = 0;
free(q_min);
}
@@ -468,12 +470,11 @@ static void dequeue(lash_t *p_lash, int
static int get_phys_connection(switch_t **switches, int switch_from, int switch_to)
{
- int i = 0;
+ unsigned int i = 0;
for (i = 0; i < switches[switch_from]->num_connections; i++)
if(switches[switch_from]->phys_connections[i] == switch_to)
return i;
- assert(1==1);
return i;
}
@@ -557,7 +558,7 @@ static void generate_routing_func_for_ms
i_dest = i_dest->next;
}
- assert(prev->next == NULL);
+ CL_ASSERT(prev->next == NULL);
prev->next = concat_dest;
concat_dest = dest;
}
@@ -590,10 +591,9 @@ static void generate_cdg_for_sp(lash_t*p
while(sw != dest_switch) {
if(cdg_vertex_matrix[lane][sw][next_switch] == NULL) {
+ unsigned i;
v = create_cdg_vertex(num_switches);
- int i;
-
for(i=0; i<num_switches; i++) {
v->dependency[i] = NULL;
v->num_using_this_depend[i] = 0;
@@ -630,7 +630,7 @@ static void generate_cdg_for_sp(lash_t*p
prev->num_using_this_depend[prev->num_dependencies]++;
prev->num_dependencies++;
- assert(prev->num_dependencies < num_switches);
+ CL_ASSERT(prev->num_dependencies < (int)num_switches);
if(prev->temp==0)
prev->num_temp_depend++;
@@ -642,7 +642,7 @@ static void generate_cdg_for_sp(lash_t*p
output_link = switches[sw]->routing_table[dest_switch].out_link;
if(sw != dest_switch) {
- assert(output_link != NONE);
+ CL_ASSERT(output_link != NONE);
next_switch = switches[sw]->phys_connections[output_link];
}
@@ -670,7 +670,7 @@ static void set_temp_depend_to_permanent
while(sw != dest_switch) {
v = cdg_vertex_matrix[lane][sw][next_switch];
- assert(v != NULL);
+ CL_ASSERT(v != NULL);
if(v->temp == 1) {
v->temp = 0;
@@ -706,13 +706,13 @@ static void remove_temp_depend_for_sp(la
while(sw != dest_switch) {
v = cdg_vertex_matrix[lane][sw][next_switch];
- assert(v != NULL);
+ CL_ASSERT(v != NULL);
if(v->temp==1) {
cdg_vertex_matrix[lane][sw][next_switch] = NULL;
free(v);
} else {
- assert(v->num_temp_depend <= v->num_dependencies);
+ CL_ASSERT(v->num_temp_depend <= v->num_dependencies);
v->num_dependencies = v->num_dependencies - v->num_temp_depend;
v->num_temp_depend = 0;
v->num_using_vertex--;
@@ -744,7 +744,8 @@ static void balance_virtual_lanes(lash_t
int *num_mst_in_lane = p_lash->num_mst_in_lane;
int ***virtual_location = p_lash->virtual_location;
int min_filled_lane, max_filled_lane, medium_filled_lane, trials;
- int old_min_filled_lane, old_max_filled_lane, i, j, new_num_min_lane, new_num_max_lane;
+ int old_min_filled_lane, old_max_filled_lane, new_num_min_lane, new_num_max_lane;
+ unsigned int i, j;
int src, dest, start, next_switch, output_link;
int stop = 0, cycle_found;
@@ -788,7 +789,7 @@ static void balance_virtual_lanes(lash_t
output_link = p_lash->switches[src]->routing_table[dest].out_link;
next_switch = p_lash->switches[src]->phys_connections[output_link];
- assert(cdg_vertex_matrix[min_filled_lane][src][next_switch] != NULL);
+ CL_ASSERT(cdg_vertex_matrix[min_filled_lane][src][next_switch] != NULL);
cycle_found = cycle_exists(cdg_vertex_matrix[min_filled_lane][src][next_switch], NULL, NULL, 1);
for(i=0; inum_switches;
switch_t *sw;
- int i;
+ unsigned int i;
sw = malloc(sizeof(*sw));
if (!sw)
@@ -926,7 +927,7 @@ static void switch_delete(switch_t *sw)
static void free_lash_structures(lash_t *p_lash)
{
- int i,j,k;
+ unsigned int i,j,k;
unsigned num_switches = p_lash->num_switches;
osm_log_t *p_log = &p_lash->p_osm->log;
@@ -988,12 +989,11 @@ static int init_lash_structures(lash_t *
unsigned vl_min = p_lash->vl_min;
unsigned num_switches = p_lash->num_switches;
osm_log_t *p_log = &p_lash->p_osm->log;
+ int status = IB_SUCCESS;
+ unsigned int i, j, k;
OSM_LOG_ENTER( p_log, init_lash_structures);
- int status = IB_SUCCESS;
- int i, j, k;
-
// initialise cdg_vertex_matrix[num_switches][num_switches][num_switches]
p_lash->cdg_vertex_matrix = (cdg_vertex_t****)malloc(vl_min * sizeof(cdg_vertex_t ****));
for (i = 0; i < vl_min; i++) {
@@ -1084,10 +1084,11 @@ static int lash_core(lash_t *p_lash)
unsigned num_switches = p_lash->num_switches;
switch_t **switches = p_lash->switches;
unsigned lanes_needed = 1;
- int i, j, k, dest_switch = 0;
+ unsigned int i, j, k, dest_switch = 0;
reachable_dest_t * dests, * idest;
int cycle_found = 0;
- int v_lane, stop = 0, output_link, i_next_switch;
+ unsigned v_lane;
+ int stop = 0, output_link, i_next_switch;
int status = IB_SUCCESS;
OSM_LOG_ENTER( p_log, lash_core);
@@ -1113,7 +1114,7 @@ static int lash_core(lash_t *p_lash)
output_link = switches[i]->routing_table[dest_switch].out_link;
i_next_switch = switches[i]->phys_connections[output_link];
- assert(p_lash->cdg_vertex_matrix[v_lane][i][i_next_switch] != NULL);
+ CL_ASSERT(p_lash->cdg_vertex_matrix[v_lane][i][i_next_switch] != NULL);
cycle_found = cycle_exists(p_lash->cdg_vertex_matrix[v_lane][i][i_next_switch], NULL, NULL, 1);
for(j=0; jsw_guid_tbl )) {
+ uint64_t current_guid;
+ switch_t *sw;
p_sw = p_next_sw;
p_next_sw = (osm_switch_t*)cl_qmap_next( &p_sw->map_item );
max_lid_ho = osm_switch_get_max_lid_ho(p_sw);
- uint64_t current_guid = p_sw->p_node->node_info.port_guid;
- switch_t *sw = p_sw->priv;
+ current_guid = p_sw->p_node->node_info.port_guid;
+ sw = p_sw->priv;
memset(p_osm->sm.ucast_mgr.lft_buf, 0xff, IB_LID_UCAST_END_HO + 1);
@@ -1244,8 +1247,8 @@ static void populate_fwd_tbls(lash_t *p_
cl_ntoh64(current_guid), -1, egress_port);
} else {
unsigned dst_lash_switch_id = get_lash_id(p_dst_sw);
- uint8_t lash_egress_port = sw->routing_table[dst_lash_switch_id].out_link;
- uint8_t physical_egress_port = sw->virtual_physical_port_table[lash_egress_port];
+ uint8_t lash_egress_port = (uint8_t)sw->routing_table[dst_lash_switch_id].out_link;
+ uint8_t physical_egress_port = (uint8_t)sw->virtual_physical_port_table[lash_egress_port];
p_osm->sm.ucast_mgr.lft_buf[lid] = physical_egress_port;
osm_log(p_log, OSM_LOG_DEBUG,
@@ -1366,7 +1369,7 @@ static void lash_cleanup(lash_t *p_lash)
if (p_lash->switches) {
unsigned id;
- for (id = 0; id < p_lash->num_switches ; id++)
+ for (id = 0; ((int)id) < p_lash->num_switches ; id++)
if (p_lash->switches[id])
switch_delete(p_lash->switches[id]);
free(p_lash->switches);
@@ -1400,6 +1403,7 @@ static int discover_network_properties(l
p_next_sw = (osm_switch_t*)cl_qmap_head( &p_subn->sw_guid_tbl );
while(p_next_sw != (osm_switch_t*)cl_qmap_end( &p_subn->sw_guid_tbl ) ) {
+ uint16_t port_count;
p_sw = p_next_sw;
p_next_sw = (osm_switch_t*)cl_qmap_next( &p_sw->map_item );
@@ -1408,7 +1412,7 @@ static int discover_network_properties(l
return -1;
id++;
- uint16_t port_count = osm_node_get_num_physp (p_sw->p_node);
+ port_count = osm_node_get_num_physp (p_sw->p_node);
// Note, ignoring port 0. management port
for (i=1; ip_remote_physp) {
ib_port_info_t *p_port_info = &p_current_physp->port_info;
- int port_vl_min = ib_port_info_get_op_vls(p_port_info);
+ uint8_t port_vl_min = ib_port_info_get_op_vls(p_port_info);
if (port_vl_min && port_vl_min < vl_min)
vl_min = port_vl_min;
}
@@ -1508,7 +1512,7 @@ static void lash_delete(void *context)
lash_t *p_lash = context;
if (p_lash->switches) {
unsigned id;
- for (id = 0; id < p_lash->num_switches ; id++)
+ for (id = 0; ((int)id) < p_lash->num_switches ; id++)
if (p_lash->switches[id])
switch_delete(p_lash->switches[id]);
free(p_lash->switches);
@@ -1534,7 +1538,7 @@ uint8_t osm_get_lash_sl(osm_opensm_t *p_
if (!p_sw || !p_sw->priv)
return OSM_DEFAULT_SL;
- return ((switch_t *)p_sw->priv)->routing_table[dst_id].lane;
+ return (uint8_t)((switch_t *)p_sw->priv)->routing_table[dst_id].lane;
}
int osm_ucast_lash_setup(osm_opensm_t *p_osm)
--
1.4.4.1.GIT
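The changes in the patch above follow a few recurring patterns: declarations
moved to the start of their block (C89 compilers reject declarations after
statements), unsigned loop counters where the bound is unsigned, and explicit
casts where a wider value is narrowed. A minimal sketch of those patterns,
with hypothetical helper names that are not taken from osm_ucast_lash.c:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the index of `wanted` in `table`,
 * or `count` if it is absent. */
static unsigned int example_find(const int *table, unsigned int count,
				 int wanted)
{
	/* C89 style: every declaration precedes the first statement. */
	unsigned int i; /* unsigned, so i < count compares like with like */

	for (i = 0; i < count; i++)
		if (table[i] == wanted)
			return i;
	return i;
}

/* The explicit cast documents an intentional narrowing from int to
 * uint8_t instead of leaving it to an implicit conversion. */
static uint8_t example_port(int out_link)
{
	return (uint8_t)out_link;
}
```

The cast only silences the conversion warning; it does not check that the
value actually fits in eight bits.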
From vlad at dev.mellanox.co.il Thu Feb 1 03:58:16 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Thu, 01 Feb 2007 13:58:16 +0200
Subject: [openib-general] MVAPICH2 SRPM and install file patches
In-Reply-To: <45C14344.9010602@cse.ohio-state.edu>
References: <45C14344.9010602@cse.ohio-state.edu>
Message-ID: <1170331096.6114.4.camel@vladsk-laptop>
On Wed, 2007-01-31 at 20:32 -0500, Shaun Rowland wrote:
> I've placed the MVAPICH2 SRPM on the OFA server in ~rowland/ofed_1_2,
> and it is linked to here:
>
> http://www.openfabrics.org/~rowland/ofed_1_2/
ofed_1_2_scripts.patch applied.
Thanks,
--
Vladimir Sokolovsky
Mellanox Technologies Ltd.
From ogerlitz at voltaire.com Thu Feb 1 04:09:11 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Thu, 01 Feb 2007 14:09:11 +0200
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <20070201090628.GC14189@mellanox.co.il>
References: <1170275331.14294.1.camel@stevo-desktop>
<45C1ABD0.5090404@voltaire.com> <20070201090628.GC14189@mellanox.co.il>
Message-ID: <45C1D867.4030208@voltaire.com>
Michael S. Tsirkin wrote:
>> 1) adding a backport to the rdma_cm containing ip_ib_mc_map, period.
>> 2) having the rdma cm follow the net stack and make its consumer use the
>> broadcast group.
> Correct. Since multicast is broken in other respects on U4
> (sockets can't join multicast groups), I think 2 is the simplest approach.
The situation in U4 is more involved: sockets doing
IP_ADD_MEMBERSHIP on some multicast group actually send and
receive traffic over the IPoIB broadcast group, which makes
cluster IPoIB kind of hell.
> Anyone who wants IPoIB multicast should just stay away from U4.
We are still interested in being able to run our multicast app over the
RDMA CM, and we want it to be done over the correct multicast group and
not over a broadcast group. So option 2 is a real problem for us.
Or.
From mst at mellanox.co.il Thu Feb 1 04:10:08 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 14:10:08 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070125191321.30934.74542.stgit@dell3.ogc.int>
References: <20070125191321.30934.74542.stgit@dell3.ogc.int>
Message-ID: <20070201121008.GA20789@mellanox.co.il>
> Quoting Steve Wise :
> Subject: [PATCH 00/12] ofed_1_2 - Neighbour update support
>
>
> Michael/Vlad:
>
> Here are the backports for snooping arp packets to generate neighbour
> update netevents. Also included is the addr.c patch to act on all valid
> neigh update events. If this series looks good to you then I'll push
> this up and you all can pull it from my git tree.
These patches seem to have created a reference leak on each neighbour;
as a result, the ipoib interface could not be brought down.
It also seems that the RHASU2 backport was missing code.
I pushed out the following:
commit d140398db0da0beb3172e0ccf14ef3023cafec9c
Author: Michael S. Tsirkin
Date: Thu Feb 1 12:21:34 2007 +0200
Fix neighbour reference leak in netevent.c
Signed-off-by: Michael S. Tsirkin
diff --git a/kernel_addons/backport/2.6.11/include/src/netevent.c b/kernel_addons/backport/2.6.11/include/src/netevent.c
index 6a8df29..0d26662 100644
--- a/kernel_addons/backport/2.6.11/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.11/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.12/include/src/netevent.c b/kernel_addons/backport/2.6.12/include/src/netevent.c
index 6a8df29..0d26662 100644
--- a/kernel_addons/backport/2.6.12/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.12/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.13/include/src/netevent.c b/kernel_addons/backport/2.6.13/include/src/netevent.c
index 6a8df29..0d26662 100644
--- a/kernel_addons/backport/2.6.13/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.13/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.14/include/src/netevent.c b/kernel_addons/backport/2.6.14/include/src/netevent.c
index 188283c..17a12ff 100644
--- a/kernel_addons/backport/2.6.14/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.14/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.15/include/src/netevent.c b/kernel_addons/backport/2.6.15/include/src/netevent.c
index 188283c..17a12ff 100644
--- a/kernel_addons/backport/2.6.15/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.15/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.15_ubuntu606/include/src/netevent.c b/kernel_addons/backport/2.6.15_ubuntu606/include/src/netevent.c
index 188283c..17a12ff 100644
--- a/kernel_addons/backport/2.6.15_ubuntu606/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.15_ubuntu606/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.16/include/src/netevent.c b/kernel_addons/backport/2.6.16/include/src/netevent.c
index 188283c..17a12ff 100644
--- a/kernel_addons/backport/2.6.16/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.16/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.16_sles10/include/src/netevent.c b/kernel_addons/backport/2.6.16_sles10/include/src/netevent.c
index 188283c..17a12ff 100644
--- a/kernel_addons/backport/2.6.16_sles10/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.16_sles10/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.17/include/src/netevent.c b/kernel_addons/backport/2.6.17/include/src/netevent.c
index 26a0920..4c67de1 100644
--- a/kernel_addons/backport/2.6.17/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.17/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c b/kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c
index 57a23ab..90fce0c 100644
--- a/kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c
@@ -39,8 +39,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
diff --git a/kernel_addons/backport/2.6.9_U2/include/src/netevent.c b/kernel_addons/backport/2.6.9_U2/include/src/netevent.c
index 5ffadd1..1589300 100644
--- a/kernel_addons/backport/2.6.9_U2/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.9_U2/include/src/netevent.c
@@ -13,10 +13,59 @@
* Fixes:
*/
-#include
-#include
#include
#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+
+static DEFINE_MUTEX(lock);
+static int count;
+
+static void destructor(struct sk_buff *skb)
+{
+ struct neighbour *n;
+ u8 *arp_ptr;
+ __be32 gw;
+
+ /* Pull the SPA */
+ arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
+ memcpy(&gw, arp_ptr, 4);
+ n = neigh_lookup(&arp_tbl, &gw, skb->dev);
+ if (n) {
+ call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
+ return;
+}
+
+static int arp_recv(struct sk_buff *skb, struct net_device *dev,
+ struct packet_type *pkt)
+{
+ struct arphdr *arp_hdr;
+ u16 op;
+
+ arp_hdr = (struct arphdr *) skb->nh.raw;
+ op = ntohs(arp_hdr->ar_op);
+
+ if ((op == ARPOP_REQUEST || op == ARPOP_REPLY) && !skb->destructor)
+ skb->destructor = destructor;
+
+ kfree_skb(skb);
+ return 0;
+}
+
+static struct packet_type arp = {
+ .type = __constant_htons(ETH_P_ARP),
+ .func = arp_recv,
+ .af_packet_priv = (void *)1,
+};
static struct notifier_block *netevent_notif_chain;
@@ -34,6 +83,12 @@ int register_netevent_notifier(struct notifier_block *nb)
int err;
err = notifier_chain_register(&netevent_notif_chain, nb);
+ if (!err) {
+ mutex_lock(&lock);
+ if (count++ == 0)
+ dev_add_pack(&arp);
+ mutex_unlock(&lock);
+ }
return err;
}
@@ -49,7 +104,16 @@ int register_netevent_notifier(struct notifier_block *nb)
int unregister_netevent_notifier(struct notifier_block *nb)
{
- return notifier_chain_unregister(&netevent_notif_chain, nb);
+ int err;
+
+ err = notifier_chain_unregister(&netevent_notif_chain, nb);
+ if (!err) {
+ mutex_lock(&lock);
+ if (--count == 0)
+ dev_remove_pack(&arp);
+ mutex_unlock(&lock);
+ }
+ return err;
}
/**
diff --git a/kernel_addons/backport/2.6.9_U3/include/src/netevent.c b/kernel_addons/backport/2.6.9_U3/include/src/netevent.c
index 5ffadd1..1589300 100644
--- a/kernel_addons/backport/2.6.9_U3/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.9_U3/include/src/netevent.c
@@ -13,10 +13,59 @@
* Fixes:
*/
-#include
-#include
#include
#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+
+static DEFINE_MUTEX(lock);
+static int count;
+
+static void destructor(struct sk_buff *skb)
+{
+ struct neighbour *n;
+ u8 *arp_ptr;
+ __be32 gw;
+
+ /* Pull the SPA */
+ arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
+ memcpy(&gw, arp_ptr, 4);
+ n = neigh_lookup(&arp_tbl, &gw, skb->dev);
+ if (n) {
+ call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
+ return;
+}
+
+static int arp_recv(struct sk_buff *skb, struct net_device *dev,
+ struct packet_type *pkt)
+{
+ struct arphdr *arp_hdr;
+ u16 op;
+
+ arp_hdr = (struct arphdr *) skb->nh.raw;
+ op = ntohs(arp_hdr->ar_op);
+
+ if ((op == ARPOP_REQUEST || op == ARPOP_REPLY) && !skb->destructor)
+ skb->destructor = destructor;
+
+ kfree_skb(skb);
+ return 0;
+}
+
+static struct packet_type arp = {
+ .type = __constant_htons(ETH_P_ARP),
+ .func = arp_recv,
+ .af_packet_priv = (void *)1,
+};
static struct notifier_block *netevent_notif_chain;
@@ -34,6 +83,12 @@ int register_netevent_notifier(struct notifier_block *nb)
int err;
err = notifier_chain_register(&netevent_notif_chain, nb);
+ if (!err) {
+ mutex_lock(&lock);
+ if (count++ == 0)
+ dev_add_pack(&arp);
+ mutex_unlock(&lock);
+ }
return err;
}
@@ -49,7 +104,16 @@ int register_netevent_notifier(struct notifier_block *nb)
int unregister_netevent_notifier(struct notifier_block *nb)
{
- return notifier_chain_unregister(&netevent_notif_chain, nb);
+ int err;
+
+ err = notifier_chain_unregister(&netevent_notif_chain, nb);
+ if (!err) {
+ mutex_lock(&lock);
+ if (--count == 0)
+ dev_remove_pack(&arp);
+ mutex_unlock(&lock);
+ }
+ return err;
}
/**
diff --git a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
index 6a8df29..0d26662 100644
--- a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
@@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
memcpy(&gw, arp_ptr, 4);
n = neigh_lookup(&arp_tbl, &gw, skb->dev);
- if (n)
+ if (n) {
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
+ neigh_release(n);
+ }
return;
}
--
MST
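The leak fixed above comes from the usual lookup/release pairing:
neigh_lookup() returns the neighbour with a reference held, so the caller
has to drop it. A minimal user-space sketch of that pairing, assuming a
hypothetical struct example_obj in place of struct neighbour (none of the
names below are kernel APIs):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object; stands in for struct neighbour. */
struct example_obj {
	int refcnt;
};

/* Like a lookup that returns a held reference: the count is bumped
 * on behalf of the caller. */
static struct example_obj *example_lookup(struct example_obj *o)
{
	if (o)
		o->refcnt++;
	return o;
}

static void example_release(struct example_obj *o)
{
	o->refcnt--;
}

/* Mirrors the shape of the fixed destructor: every successful lookup
 * is paired with exactly one release. */
static void example_notify(struct example_obj *table_entry)
{
	struct example_obj *n = example_lookup(table_entry);

	if (n) {
		/* ... deliver the event here ... */
		example_release(n);
	}
}
```

Without the release, each notification would leave the count one higher,
and the object could never be freed.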
From mst at mellanox.co.il Thu Feb 1 04:19:30 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 14:19:30 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070125191321.30934.74542.stgit@dell3.ogc.int>
References: <20070125191321.30934.74542.stgit@dell3.ogc.int>
Message-ID: <20070201121930.GB20789@mellanox.co.il>
> Here are the backports for snooping arp packets to generate neighbour
> update netevents.
OK, I went (somewhat belatedly) over this code in more depth and I see
a couple of issues that I'd like you to address:
- There's some trailing whitespace in some netevent.c files.
Could you clean these up please?
- I see:
$ diff ./kernel_addons/backport/2.6.9_U4/include/src/netevent.c
kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c
> #include <linux/skbuff.h>
Shouldn't the redhat backports include skbuff.h too?
They do use the skbuff struct, so it seems cleaner to include it
directly, and we would get identical code for redhat and suse.
- What is the reason for:
if ((op == ARPOP_REQUEST || op == ARPOP_REPLY) && !skb->destructor)
skb->destructor = destructor;
kfree_skb(skb);
Could we miss events because the skb already has a destructor?
Can we just call the destructor function directly (this is what addr.c
did previously, and this apparently worked fine)?
Steve, could you pls clone ofed git and address these?
--
MST
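The concern about missed events can be illustrated with a small user-space
sketch. It assumes a hypothetical struct example_buf with a single
destructor slot, loosely modelled on skb->destructor, and it simplifies the
free path to an immediate call of whatever is in the slot; it is not the
backport code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer with one destructor slot. */
struct example_buf {
	void (*destructor)(struct example_buf *);
	int *events; /* counter bumped when the event hook runs */
};

static void example_event(struct example_buf *b)
{
	(*b->events)++;
}

static void example_other(struct example_buf *b)
{
	(void)b; /* some unrelated owner's cleanup */
}

/* Variant 1: install the hook only if the slot is free, then let the
 * free path run whatever is in the slot. */
static void example_recv_hooked(struct example_buf *b)
{
	if (!b->destructor)
		b->destructor = example_event;
	b->destructor(b); /* stands in for the free path */
}

/* Variant 2: call the event function directly and leave the slot alone. */
static void example_recv_direct(struct example_buf *b)
{
	example_event(b);
	if (b->destructor)
		b->destructor(b);
}
```

When the slot is already taken, variant 1 never runs the event hook, while
variant 2 always does.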
From glebn at voltaire.com Thu Feb 1 04:42:30 2007
From: glebn at voltaire.com (glebn at voltaire.com)
Date: Thu, 1 Feb 2007 14:42:30 +0200
Subject: [openib-general] [RFC/BUG] libibverbs: DMA vs. CQ race
In-Reply-To:
References:
Message-ID: <20070201124230.GA23354@minantech.com>
On Mon, Jan 29, 2007 at 01:49:04PM -0800, Roland Dreier wrote:
> Even with that resolved this all seems rather unfortunate to me. I
> don't like the idea of having the kernel keep all these buffers around
> and then have the userspace library have to map the right buffer. It
> leads to awkwardness like the fact that mthca_resize_cq() seems to be
> totally screwed if ibv_cmd_resize_cq() fails for some reason -- it
> already munmap'ed the original buffer, and it can't map the new
> buffer, and so the CQ is dead with no chance to recover.
I looked through the ehca driver and it looks like it is doing exactly this
"keep all these buffers around and then have the userspace library have
to map the right buffer". ehca doesn't support resize_cq though, but
let's say this issue is also resolved: would this approach be
acceptable? This is how ehca works after all, so we are not inventing
something new here.
>
> The really strange thing about this is that this Altix
> coherent/consistent memory really isn't about the memory itself, but
> about the relationship of that memory with DMA elsewhere -- as I
> understand the code, doing dma_alloc_coherent() returns normal memory
> with a special DMA address that tells the system to flush other DMAs
> before doing DMA to the coherent region. Which isn't really what most
> people understand coherent memory to be, but it has the magic property
> of making most drivers work.
Yes. It seems Altix abuses dma_alloc_coherent() for this.
>
> So I'd really like a better solution, but I don't have one in mind
> unfortunately. Maybe we can all meditate on this and try to come up
> with something cleaner -- I really hope there is a better way to
> handle this.
>
Another approach may be to add another verb (or we can make ibv_reg_mr
do this with a special flag) for coherent memory allocation. This verb
will allocate coherent memory in the kernel and mmap it into user space.
Then the cq will be created as usual by providing the lkey to the create_cq
verb. The resize will work exactly like it works now, i.e. allocate a new cq
buffer, call resize_cq with the new buffer's lkey, copy the cqes, and
unregister the old buffer.
--
Gleb.
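The resize order described above (allocate the new buffer, switch over, copy
the entries, drop the old buffer) can be sketched in plain user-space C. The
struct and function names are hypothetical, and malloc()/free() stand in for
the registration and unregistration steps; no verbs calls are used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical CQ: a buffer of entries plus its capacity. */
struct example_cq {
	int *buf;
	size_t size; /* capacity in entries */
	size_t used; /* entries currently held */
};

/* Allocate the new buffer first, copy the entries, switch over, and
 * only then drop the old buffer. If anything fails before the switch,
 * the old buffer is untouched and the CQ stays usable. */
static int example_resize(struct example_cq *cq, size_t new_size)
{
	int *new_buf;

	if (new_size < cq->used)
		return -1;
	new_buf = malloc(new_size * sizeof(*new_buf));
	if (!new_buf)
		return -1;
	memcpy(new_buf, cq->buf, cq->used * sizeof(*new_buf));
	free(cq->buf);
	cq->buf = new_buf;
	cq->size = new_size;
	return 0;
}
```

The point of the ordering is that a failed resize leaves the caller with
the original, still valid buffer.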
From mst at mellanox.co.il Thu Feb 1 04:42:11 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 14:42:11 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070201121930.GB20789@mellanox.co.il>
References: <20070125191321.30934.74542.stgit@dell3.ogc.int>
<20070201121930.GB20789@mellanox.co.il>
Message-ID: <20070201124211.GD20789@mellanox.co.il>
> - There's some trailing whitespace in some netevent.c files.
> Could you clean these please?
OK, fixed the trailing whitespace and pushed out.
--
MST
From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 05:02:09 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 05:02:09 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070201130209.CF235E607F7@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #8 from erezz at voltaire.com 2007-02-01 05:02 -------
(In reply to comment #7)
> I tried to build the product again and I saw that the patches from the
> kernel_patches/backport/2.6.16_sles10/ directory were not applied automatically. When
> I applied these patches manually, everything built. How can I run the build process with
> the patches applied automatically?
>
What is the output of uname -a ?
on my machine:
thyme:/tmp/ofed_sa_test/OFED-1.1.1-ib_local_sa # uname -a
Linux thyme 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64
x86_64 GNU/Linux
Try the following:
Edit ofed_scripts/configure and add the line: "echo ${KVERSION}" where the
switch starts in line 214. See what happens in case 2.6.16*.
From wombat2 at us.ibm.com Thu Feb 1 05:21:36 2007
From: wombat2 at us.ibm.com (Bernard King-Smith)
Date: Thu, 1 Feb 2007 08:21:36 -0500
Subject: [openib-general] [mthca] Creation of a SRQ with many WR (> 16K)
in kernel level fails
In-Reply-To:
Message-ID:
> ----- Message from "Or Gerlitz" on Thu, 01 Feb
2007 11:17:53 +0200 -----
>
> Dotan Barak wrote:
> > I think that now, when implementation of IPoIB CM is available and SRQ
> > is being used, one may
> > need to use a SRQ with more than 16K WRs.
>
> IPoIB UD uses SRQ by nature (since RX from all peers consume buffers
> from the --only-- RQ) and lives fine with 32 buffers (or 64 you can look
> in the code). Moreover, my assumption is that
>
> pps(RC) <= pps(UC) <= pps(UD)
>
> this means that what ever number of RX buffer for UD/2K MTU which is
> "enough" to have no (or close to zero) packet loss under some traffic
> pattern, the same pattern can be served with IPoIB CM using SRQ of the
> same size.
I would expect that you will need more than 32 or 64 buffers using RC and
SRQ. With larger packets it takes longer to do receive processing on each
packet under RC. Larger packets mean it takes more time to do the checksum
and copy to the socket, because of up to 60K of data vs. 2K. The residency
time on the receive queue will be longer. In the traffic pattern where one
adapter is receiving from many adapters over the fabric, there will be a
larger imbalance between the sender rate and the receiving rate out of the
queue. Given a large enough TCP send and receive window for a single
socket to get peak bandwidth, multiple sockets will have more packets in
flight for a single destination at the same time in this pattern.
>
> Or.
>
>
>
Bernie King-Smith
IBM Corporation
Server Group
Cluster System Performance
wombat2 at us.ibm.com (845)433-8483
Tie. 293-8483 or wombat2 on NOTES
"We are not responsible for the world we are born into, only for the world
we leave when we die.
So we have to accept what has gone before us and work to change the only
thing we can,
-- The Future." William Shatner
From mst at mellanox.co.il Thu Feb 1 05:55:22 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 15:55:22 +0200
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <1170322670.654.23.camel@linux-q667.site>
References: <000401c7458b$9bff77d0$8698070a@amr.corp.intel.com>
<1170322670.654.23.camel@linux-q667.site>
Message-ID: <20070201135522.GA27688@mellanox.co.il>
> Quoting Steve WIse :
> Subject: Re: [PATCH] RE: regression in ofed 1.2
>
> > Okay - I _think_ the problem is that OFED 1.2 pulled code from my git tree
> > before I created an ofed_1_2 branch (which contains the fix), and didn't update
> > to match my ofed_1_2 branch. The crash that you reported occurring over iWarp
> > should also happen over IB for the same reason, so both are likely broken atm...
> >
> > Vlad, can you please update the ofed build by pulling from the ofed_1_2 branches
> > of my rdma-dev.git and librdmacm.git trees?
>
> I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
> you made there will resolve this issue. It just needs to be pulled into
> ofed_1_2.
OK, I've updated ofed to code from rdma-dev ofed_1_2 branch. Some notes:
- Sean, please base your branches on specific -rc from linus
(OFED 1.2 is now -rc7).
- Now that we are entering feature freeze, we should not do full replaces anymore.
So Sean, please post incremental patches, labeled ofed-1.2 clearly.
--
MST
From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 05:57:29 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 05:57:29 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070201135729.C10E3E607F7@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #9 from dmitry.yulov at intel.com 2007-02-01 05:57 -------
> Edit ofed_scripts/configure and add the line: "echo ${KVERSION}" where the
> switch starts in line 214. See what happens in case 2.6.16*.
When I try to run build.sh I see in log file:
Applying patches for 2.6.16.21-0.8-smp kernel:
/var/tmp/OFEDRPM/BUILD/openib-1.1/kernel_patches/backport/2.6.16/addr_1_netevents_revert_to_2_6_17.patch
patching file drivers/infiniband/core/addr.c
/var/tmp/OFEDRPM/BUILD/openib-1.1/kernel_patches/backport/2.6.16/ipath-backport.patch
patching file drivers/infiniband/hw/ipath/iowrite32_copy_x86_64.S
patching file drivers/infiniband/hw/ipath/ipath_backport.h
patching file drivers/infiniband/hw/ipath/ipath_diag.c
patching file drivers/infiniband/hw/ipath/ipath_driver.c
As I understand it, in this case the directory 2.6.16 is used, not 2.6.16_suse10.
I tried adding this option in the build.sh script:
configure_options="$configure_options
--with-patchdir=/root/install/OFED-1.1.1-ib_local_sa/2.6.16_sles10"
But in this case the build process breaks. I don't know how to add a patching
step to build.sh to patch the cma.c file and the kernel. Do you have any ideas?
From swise at opengridcomputing.com Thu Feb 1 05:57:28 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 1 Feb 2007 07:57:28 -0600
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
References: <000401c7458b$9bff77d0$8698070a@amr.corp.intel.com>
<1170322670.654.23.camel@linux-q667.site>
<1170327205.654.34.camel@linux-q667.site>
<20070201135619.GB27688@mellanox.co.il>
Message-ID: <000e01c74608$e9b4a040$020010ac@haggard>
>> >
>>
>> Also, I just pulled down and built the latest ofed_1_2 kernel and
>> user
>> code against 2.6.20-rc7, and the ucma abi is 4. So rdma_create_qp()
>> will still crash even with the librdmacm code to avoid the call to
>> rdma_init_qp_attr for ABI 3 kernels.
>>
>>
>> Steve.
>
> I'm a bit confused. Can you please try with latest code I've just
> pushed out?
>
Will do. This was before you pulled in Sean's code.
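To illustrate why a workaround keyed to ABI 3 does not help once the kernel reports ABI 4, here is a minimal sketch. It is not the librdmacm code: both function names are hypothetical, and the sysfs location mentioned in the comment is an assumption about where the ucma ABI version is exposed.

```c
#include <stdio.h>

/*
 * Read an integer ABI version from a sysfs-style text file.
 * ASSUMPTION: on a system with the ucma module loaded the file would be
 * something like /sys/class/misc/rdma_cm/abi_version; the path is passed
 * in so that nothing here depends on that guess.
 * Returns -1 if the file cannot be read or parsed.
 */
int read_abi_version(const char *path)
{
    FILE *f = fopen(path, "r");
    int abi = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &abi) != 1)
        abi = -1;
    fclose(f);
    return abi;
}

/*
 * Hypothetical guard: skip the problematic call only on ABI 3 kernels.
 * A kernel reporting ABI 4 falls through to the normal path.
 */
int skip_init_qp_attr(int abi_version)
{
    return abi_version == 3;
}
```

With a guard of this shape, a kernel that reports ABI 4 takes the unguarded path, which matches the behaviour described above.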
From mst at mellanox.co.il Thu Feb 1 05:56:19 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 15:56:19 +0200
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <1170327205.654.34.camel@linux-q667.site>
References: <000401c7458b$9bff77d0$8698070a@amr.corp.intel.com>
<1170322670.654.23.camel@linux-q667.site>
<1170327205.654.34.camel@linux-q667.site>
Message-ID: <20070201135619.GB27688@mellanox.co.il>
> Quoting Steve WIse :
> Subject: Re: [PATCH] RE: regression in ofed 1.2
>
> On Thu, 2007-02-01 at 03:37 -0600, Steve WIse wrote:
> > > Okay - I _think_ the problem is that OFED 1.2 pulled code from my git tree
> > > before I created an ofed_1_2 branch (which contains the fix), and didn't update
> > > to match my ofed_1_2 branch. The crash that you reported occurring over iWarp
> > > should also happen over IB for the same reason, so both are likely broken atm...
> > >
> > > Vlad, can you please update the ofed build by pulling from the ofed_1_2 branches
> > > of my rdma-dev.git and librdmacm.git trees?
> >
> > I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
> > you made there will resolve this issue. It just needs to be pulled into
> > ofed_1_2.
> >
>
> Also, I just pulled down and built the latest ofed_1_2 kernel and user
> code against 2.6.20-rc7, and the ucma abi is 4. So rdma_create_qp()
> will still crash even with the librdmacm code to avoid the call to
> rdma_init_qp_attr for ABI 3 kernels.
>
>
> Steve.
I'm a bit confused. Can you please try with latest code I've just pushed out?
--
MST
From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 06:15:18 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 06:15:18 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070201141518.A7561E607F7@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #10 from erezz at voltaire.com 2007-02-01 06:15 -------
(In reply to comment #9)
> > Edit ofed_scripts/configure and add the line: "echo ${KVERSION}" where the
> > switch starts in line 214. See what happens in case 2.6.16*.
> When I try to run build.sh I see in log file:
> Applying patches for 2.6.16.21-0.8-smp kernel:
What is the output of `uname -r`? This is VERY important. Also, can you run
`cat /etc/issue` and send the results?
>
> /var/tmp/OFEDRPM/BUILD/openib-1.1/kernel_patches/backport/2.6.16/addr_1_netevents_revert_to_2_6_17.patch
> patching file drivers/infiniband/core/addr.c
>
> /var/tmp/OFEDRPM/BUILD/openib-1.1/kernel_patches/backport/2.6.16/ipath-backport.patch
> patching file drivers/infiniband/hw/ipath/iowrite32_copy_x86_64.S
> patching file drivers/infiniband/hw/ipath/ipath_backport.h
> patching file drivers/infiniband/hw/ipath/ipath_diag.c
> patching file drivers/infiniband/hw/ipath/ipath_driver.c
>
> As I understand in this case used directory 2.6.16 not 2.6.16_suse10.
This is not good. Try to debug ofed_scripts/configure and see what happens in
the switch in apply_backport_patches.
> I try to add in build.sh script the option
> configure_options="$configure_options
> --with-patchdir=/root/install/OFED-1.1.1-ib_local_sa/2.6.16_sles10"
Don't do that.
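For readers following the debugging advice above, the sketch below shows one way an ordered pattern match on the kernel version can pick a generic directory when a distribution-specific one was intended. It is written in C with fnmatch(3) purely for illustration; it is not the contents of ofed_scripts/configure, and both patterns in the table are hypothetical.

```c
#include <fnmatch.h>
#include <stddef.h>

struct backport_rule {
    const char *pattern;  /* shell-style glob matched against `uname -r` */
    const char *dir;      /* backport directory chosen on a match */
};

/*
 * HYPOTHETICAL rule table: the first matching pattern wins, so the more
 * specific distribution pattern has to come before the generic one.
 */
static const struct backport_rule rules[] = {
    { "2.6.16.*-*-*", "2.6.16_sles10" },  /* assumed SLES10-style release string */
    { "2.6.16*",      "2.6.16"        },  /* generic kernel.org 2.6.16 */
};

/* Return the backport directory for a kernel version, or NULL if none matches. */
const char *select_backport_dir(const char *kversion)
{
    size_t i;

    for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++)
        if (fnmatch(rules[i].pattern, kversion, 0) == 0)
            return rules[i].dir;
    return NULL;
}
```

If the two rules were swapped, "2.6.16.21-0.8-smp" would match the generic pattern first and the generic directory would be used, which is the symptom reported in comment #9. Whether that is the actual cause here is not established in this thread.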
From halr at voltaire.com Thu Feb 1 06:16:59 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 01 Feb 2007 09:16:59 -0500
Subject: [openib-general] [PATCH] osm: some trivial chages in the
osm_ucast_lash for compilation on windows
In-Reply-To: <45C1D3A0.7060201@dev.mellanox.co.il>
References: <45C1D3A0.7060201@dev.mellanox.co.il>
Message-ID: <1170339359.15660.265762.camel@hal.voltaire.com>
Hi Yevgeny,
On Thu, 2007-02-01 at 06:48, Yevgeny Kliteynik wrote:
> Hi Hal,
>
> This patch has some trivial changes in the osm_ucast_lash.c
> for compilation on windows.
>
> In general, this file needs a major cosmetic (and not only)
> patch to fit better into the OSM code.
There will shortly be some work to improve this. This is one of the next
items on the list for this.
> Will get back to it at some point in the future.
Sure; this is not your problem but if you get to it first that will
help.
> -- Yevgeny
>
> Signed-off-by: Yevgeny Kliteynik
Thanks. Applied.
-- Hal
From halr at voltaire.com Thu Feb 1 06:32:35 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 01 Feb 2007 09:32:35 -0500
Subject: [openib-general] [PATCH] osm: trivial casting for compilation
on windows
In-Reply-To: <45C1C255.4060405@dev.mellanox.co.il>
References: <45C1C255.4060405@dev.mellanox.co.il>
Message-ID: <1170339465.15660.265845.camel@hal.voltaire.com>
On Thu, 2007-02-01 at 05:35, Yevgeny Kliteynik wrote:
> Trivial casting for compilation on windows
>
> Signed-off-by: Yevgeny Kliteynik
Thanks. Applied.
-- Hal
From steakdbini at yahoo.co.jp Thu Feb 1 07:20:31 2007
From: steakdbini at yahoo.co.jp ()
Date: Fri, 2 Feb 2007 00:20:31 +0900 (JST)
Subject: [openib-general]
=?ISO-2022-JP?B?g4GBW4OLgqCC6IKqgsaCpIKygrSCooLcgrWCvYH0?=
Message-ID: 20070202002015
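It has been a while. This is Mizuna.
Thank you for your email the other day.
I am sorry for taking so long to reply.
About my job, which you asked about in your previous email...
I am a full-time housewife.
I have been taking care of the house ever since last December, which is why I was busy.
Housework is fun, but it does wear me out... (><
With a life like this I never get to meet anyone, and I often find myself wanting someone to lean on.
So, you may think it strange of me to say this so suddenly, but
I would like to meet you once and talk. Would that be a bother?
I am 31 years old and live in Setagaya ward.
I would like to have a meal together and talk a lot.
If possible, this weekend somewhere around Shinjuku or Shibuya would suit me;
how does that sound?
http://mic.chu.jp/mizuna/
I have been using this site lately, so
could you send me a message from here?
I am on mixi as well, but I feel more at home on this one, so
this is the only site I use (^^;
I look forward to your reply.
Mizuna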
From tziporet at mellanox.co.il Thu Feb 1 07:40:26 2007
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Thu, 01 Feb 2007 17:40:26 +0200
Subject: [openib-general] components that have not opend the ofed_1_2 branch
Message-ID: <45C209EA.1040207@mellanox.co.il>
The following components have not opened ofed_1_2 branch:
* libibverbs - Roland
* libmthca - Roland
* libipathverbs - Bryan
* tvflash - Roland
* srptools - Ishai
* management - Hal
Please open the branch today or tomorrow at the latest.
Thanks,
Tziporet
From kliteyn at dev.mellanox.co.il Thu Feb 1 07:57:42 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Thu, 01 Feb 2007 17:57:42 +0200
Subject: [openib-general] [PATCH 10/10] osm: QoS in OpenSM
In-Reply-To: <1170344724.15660.271079.camel@hal.voltaire.com>
References: <45BF6548.80104@dev.mellanox.co.il>
<1170264561.15660.189494.camel@hal.voltaire.com>
<45C115D8.6070504@dev.mellanox.co.il>
<1170344724.15660.271079.camel@hal.voltaire.com>
Message-ID: <45C20DF6.6060809@dev.mellanox.co.il>
Hi Hal,
Hal Rosenstock wrote:
> Hi again Yevgeny,
>
> On Wed, 2007-01-31 at 17:19, Yevgeny Kliteynik wrote:
>
> [snip...]
>
>>>> + for (i = 0; i < IB_MAX_NUM_VLS; i++)
>>>> + {
>>>> + if (valid_sls[i])
>>>> + {
>>>> + vl = ib_slvl_table_get(p_slvl_tbl,i);
>>>> + if (vl == IB_DROP_VL)
>>> Does vl > Operational VLs need checking here or is it never set this way
>>> ?
>> I think that it would be better if the "setup" part would check it when
>> configuring sl2vl tables, and when VL > Operational VL it should set
>> some default value instead (VL15 looks as a good option).
>
> OK; but why scan all VLs if they are not supported ?
Agree, adding it to my ToDo list of improvements in QoS.
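One possible shape for folding the Operational VLs check into the loop is sketched below. This is not the OpenSM code: the table layout (one byte per SL instead of the packed table read through ib_slvl_table_get), the constant names and the function name are assumptions made for illustration. The alternative proposed above, normalising out-of-range VLs when the sl2vl tables are set up, would make the second condition unnecessary.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SLS  16  /* service levels 0..15 */
#define DROP_VL  15  /* an SL mapped to VL15 means the packet is dropped */

/*
 * Clear valid_sls[sl] for every SL that this port's SL2VL table maps
 * either to VL15 or to a VL the port does not operate.
 *
 * sl2vl:           HYPOTHETICAL unpacked table, one VL value per SL
 * operational_vls: number of data VLs in use (VL0 .. operational_vls - 1)
 */
void filter_valid_sls(bool valid_sls[NUM_SLS],
                      const uint8_t sl2vl[NUM_SLS],
                      uint8_t operational_vls)
{
    for (int sl = 0; sl < NUM_SLS; sl++) {
        if (!valid_sls[sl])
            continue;  /* already ruled out by an earlier hop */

        uint8_t vl = sl2vl[sl];
        if (vl == DROP_VL || vl >= operational_vls)
            valid_sls[sl] = false;
    }
}
```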
>>>> + valid_sls[i] = FALSE;
>>>> + }
>>>> + }
>>>> +
>>>> + /*
>>>> + * now get pointer to the destination port (same as above)
>>>> + */
>>>> + p_node = osm_physp_get_node_ptr( p_dest_physp );
>>>> +
>>>> + if( p_node->sw )
>>>> + {
>>>> + p_dest_physp = osm_switch_get_route_by_lid( p_node->sw, cl_ntoh16( dest_lid_ho ) );
>>>> + if ( p_dest_physp == 0 )
>>>> + {
>>>> + osm_log( p_rcv->p_log, OSM_LOG_ERROR,
>>>> + "__osm_pr_rcv_get_path_parms_qos: ERR 1F03: "
>>>> + "Cannot find routing to LID 0x%X from switch for GUID 0x%016" PRIx64 "\n",
>>>> + dest_lid_ho,
>>>> + cl_ntoh64( osm_node_get_node_guid( p_node ) ) );
>>>> + status = IB_ERROR;
>>>> + goto Exit;
>>>> + }
>>>> + }
>>>> +
>>>> + /*
>>>> + * Now go through the path step by step
>>>> + */
>>>> +
>>>> + while( p_physp != p_dest_physp )
>>>> + {
>>>> + p_physp = osm_physp_get_remote( p_physp );
>>>> + if ( p_physp == 0 )
>>>> + {
>>>> + osm_log( p_rcv->p_log, OSM_LOG_ERROR,
>>>> + "__osm_pr_rcv_get_path_parms_qos: ERR 1F04: "
>>>> + "Cannot find remote phys port when routing to LID 0x%X from node GUID 0x%016" PRIx64 "\n",
>>>> + dest_lid_ho,
>>>> + cl_ntoh64( osm_node_get_node_guid( p_node ) ) );
>>>> + status = IB_ERROR;
>>>> + goto Exit;
>>>> + }
>>>> +
>>>> + in_port_num = osm_physp_get_port_num(p_physp);
>>>> +
>>>> + /* this is point to point case (no switch in between) */
>>>> + if( p_physp == p_dest_physp )
>>>> + break;
>>>
>>> Ordering of check for switch and point to point case are different here
>>> and original routine. Should they be the same ? If so, which should
>>> change ? (Any reason why this was moved in this routine ?)
>> Not sure I'm following.
>> The order of check for switch and point to point case looks the same
>> to me (am I missing something?). The difference that I see is that
>> the mtu and rate in the original function are adjusted after the
>> check for switch, and in the new function they are adjusted before the
>> check, which I think is the same.
>
> That could have been what I was seeing. Shouldn't the two functions be
> identical in order (assuming these are to be separated)? I wouldn't
> want to see them diverge further.
The order in the new function can be changed to match the order in the
old one - I have no problem with that.
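As a reading aid for the ordering question, the sketch below assumes that "adjusting" the MTU and rate while walking the path means narrowing them to the most restrictive port seen so far. The struct and function names are hypothetical, and the values are treated as plain ordered numbers (smaller means more restrictive) rather than as real IB encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL per-port limits; smaller value = more restrictive. */
struct link_limits {
    uint8_t mtu;
    uint8_t rate;
};

/*
 * Walk the ports along a path and narrow the starting limits to the
 * minimum MTU and minimum rate encountered.
 */
struct link_limits narrow_along_path(struct link_limits limits,
                                     const struct link_limits *ports,
                                     size_t n_ports)
{
    for (size_t i = 0; i < n_ports; i++) {
        if (ports[i].mtu < limits.mtu)
            limits.mtu = ports[i].mtu;
        if (ports[i].rate < limits.rate)
            limits.rate = ports[i].rate;
    }
    return limits;
}
```

Since taking a minimum does not depend on order, the two functions would differ only if moving the adjustment changes which ports are visited before the loop exits.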
> [snip...]
>
>>>> +/**********************************************************************
>>>> + **********************************************************************/
>>>> static void
>>>> __osm_pr_rcv_build_pr(
>>>> IN osm_pr_rcv_t* const p_rcv,
>>>> @@ -774,7 +1569,8 @@ __osm_pr_rcv_build_pr(
>>>> #endif
>>>>
>>>> p_pr->pkey = p_parms->pkey;
>>>> - p_pr->sl = cl_hton16(p_parms->sl);
>>>> + ib_path_rec_set_qos_class(p_pr,p_parms->class);
>>>> + ib_path_rec_set_sl(p_pr,p_parms->sl);
>>>> p_pr->mtu = (uint8_t)(p_parms->mtu | 0x80);
>>>> p_pr->rate = (uint8_t)(p_parms->rate | 0x80);
>>>>
>>>> @@ -832,10 +1628,14 @@ __osm_pr_rcv_get_lid_pair_path(
>>>> goto Exit;
>>>> }
>>>>
>>>> - status = __osm_pr_rcv_get_path_parms( p_rcv, p_pr, p_src_port,
>>>> - p_dest_port, dest_lid_ho,
>>>> - comp_mask, &path_parms );
>>>> -
>>>> + if (p_rcv->p_subn->opt.no_qos)
>>> Shouldn't this be based on p_rcv->p_subn.opt.qos_policy_file rather than
>>> no_qos ? I think there are cases where the QoS will be used without the
>>> QoS policy (higher level QoS support).
>> By totally ignoring sl2vl tables the original function may return
>> path that isn't a "real" path - it may lead to VL15 at some point.
>> So the new function takes care of this problem.
>
> So it's a bug fix (missing functionality) in the existing QoS support.
Right. Hopefully, the new function will replace the old one, and there
won't be a need to add this functionality to the old function as a separate
task.
>> When there's no policy file, the policy parse tree is empty, and then
>> the ports would not have any qos-level to be applied on the examined path.
>> In that case the new function does whatever the old one did, plus checking
>> the path for sl2vl "consistency".
>
> Got it. Thanks.
>
> -- Hal
>
>> -- Yevgeny
>
>
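The __osm_pr_rcv_build_pr hunk quoted above replaces a plain 16-bit assignment of the SL with two setter calls. A minimal sketch of what such setters could do is given below. The bit layout (QoS class in the upper 12 bits, SL in the lower 4 bits of one network-order 16-bit field) is an assumption, and the names are hypothetical stand-ins rather than the real ib_path_rec_set_sl and ib_path_rec_set_qos_class.

```c
#include <stdint.h>
#include <arpa/inet.h>  /* htons, ntohs */

#define SL_MASK         0x000F  /* ASSUMED: low 4 bits carry the SL */
#define QOS_CLASS_MASK  0xFFF0  /* ASSUMED: high 12 bits carry the QoS class */

/* Store the SL without disturbing the QoS class bits. */
uint16_t field_set_sl(uint16_t field_be, uint8_t sl)
{
    uint16_t host = ntohs(field_be);

    host = (uint16_t)((host & QOS_CLASS_MASK) | (sl & SL_MASK));
    return htons(host);
}

/* Store the QoS class without disturbing the SL bits. */
uint16_t field_set_qos_class(uint16_t field_be, uint16_t qos_class)
{
    uint16_t host = ntohs(field_be);

    host = (uint16_t)((host & SL_MASK) | ((qos_class << 4) & QOS_CLASS_MASK));
    return htons(host);
}
```

Each setter masks out only its own bits, so the two calls can be made in either order.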
From halr at voltaire.com Thu Feb 1 07:45:34 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 01 Feb 2007 10:45:34 -0500
Subject: [openib-general] [PATCH 10/10] osm: QoS in OpenSM
In-Reply-To: <45C115D8.6070504@dev.mellanox.co.il>
References: <45BF6548.80104@dev.mellanox.co.il>
<1170264561.15660.189494.camel@hal.voltaire.com>
<45C115D8.6070504@dev.mellanox.co.il>
Message-ID: <1170344724.15660.271079.camel@hal.voltaire.com>
Hi again Yevgeny,
On Wed, 2007-01-31 at 17:19, Yevgeny Kliteynik wrote:
[snip...]
> >> + for (i = 0; i < IB_MAX_NUM_VLS; i++)
> >> + {
> >> + if (valid_sls[i])
> >> + {
> >> + vl = ib_slvl_table_get(p_slvl_tbl,i);
> >> + if (vl == IB_DROP_VL)
> >
> > Does vl > Operational VLs need checking here or is it never set this way
> > ?
> I think that it would be better if the "setup" part would check it when
> configuring sl2vl tables, and when VL > Operational VL it should set
> some default value instead (VL15 looks as a good option).
OK; but why scan all VLs if they are not supported ?
> >> + valid_sls[i] = FALSE;
> >> + }
> >> + }
> >> +
> >> + /*
> >> + * now get pointer to the destination port (same as above)
> >> + */
> >> + p_node = osm_physp_get_node_ptr( p_dest_physp );
> >> +
> >> + if( p_node->sw )
> >> + {
> >> + p_dest_physp = osm_switch_get_route_by_lid( p_node->sw, cl_ntoh16( dest_lid_ho ) );
> >> + if ( p_dest_physp == 0 )
> >> + {
> >> + osm_log( p_rcv->p_log, OSM_LOG_ERROR,
> >> + "__osm_pr_rcv_get_path_parms_qos: ERR 1F03: "
> >> + "Cannot find routing to LID 0x%X from switch for GUID 0x%016" PRIx64 "\n",
> >> + dest_lid_ho,
> >> + cl_ntoh64( osm_node_get_node_guid( p_node ) ) );
> >> + status = IB_ERROR;
> >> + goto Exit;
> >> + }
> >> + }
> >> +
> >> + /*
> >> + * Now go through the path step by step
> >> + */
> >> +
> >> + while( p_physp != p_dest_physp )
> >> + {
> >> + p_physp = osm_physp_get_remote( p_physp );
> >> + if ( p_physp == 0 )
> >> + {
> >> + osm_log( p_rcv->p_log, OSM_LOG_ERROR,
> >> + "__osm_pr_rcv_get_path_parms_qos: ERR 1F04: "
> >> + "Cannot find remote phys port when routing to LID 0x%X from node GUID 0x%016" PRIx64 "\n",
> >> + dest_lid_ho,
> >> + cl_ntoh64( osm_node_get_node_guid( p_node ) ) );
> >> + status = IB_ERROR;
> >> + goto Exit;
> >> + }
> >> +
> >> + in_port_num = osm_physp_get_port_num(p_physp);
> >> +
> >> + /* this is point to point case (no switch in between) */
> >> + if( p_physp == p_dest_physp )
> >> + break;
> >
> >
> > Ordering of check for switch and point to point case are different here
> > and original routine. Should they be the same ? If so, which should
> > change ? (Any reason why this was moved in this routine ?)
> Not sure I'm following.
> The order of check for switch and point to point case looks the same
> to me (am I missing something?). The difference that I see is that
> the mtu and rate in the original function are adjusted after the
> check for switch, and in the new function they are adjusted before the
> check, which I think is the same.
That could have been what I was seeing. Shouldn't the two functions be
identical in order (assuming these are to be separated)? I wouldn't
want to see them diverge further.
[snip...]
> >> +/**********************************************************************
> >> + **********************************************************************/
> >> static void
> >> __osm_pr_rcv_build_pr(
> >> IN osm_pr_rcv_t* const p_rcv,
> >> @@ -774,7 +1569,8 @@ __osm_pr_rcv_build_pr(
> >> #endif
> >>
> >> p_pr->pkey = p_parms->pkey;
> >> - p_pr->sl = cl_hton16(p_parms->sl);
> >> + ib_path_rec_set_qos_class(p_pr,p_parms->class);
> >> + ib_path_rec_set_sl(p_pr,p_parms->sl);
> >> p_pr->mtu = (uint8_t)(p_parms->mtu | 0x80);
> >> p_pr->rate = (uint8_t)(p_parms->rate | 0x80);
> >>
> >> @@ -832,10 +1628,14 @@ __osm_pr_rcv_get_lid_pair_path(
> >> goto Exit;
> >> }
> >>
> >> - status = __osm_pr_rcv_get_path_parms( p_rcv, p_pr, p_src_port,
> >> - p_dest_port, dest_lid_ho,
> >> - comp_mask, &path_parms );
> >> -
> >> + if (p_rcv->p_subn->opt.no_qos)
> >
> > Shouldn't this be based on p_rcv->p_subn.opt.qos_policy_file rather than
> > no_qos ? I think there are cases where the QoS will be used without the
> > QoS policy (higher level QoS support).
>
> By totally ignoring sl2vl tables the original function may return
> path that isn't a "real" path - it may lead to VL15 at some point.
> So the new function takes care of this problem.
So it's a bug fix (missing functionality) in the existing QoS support.
> When there's no policy file, the policy parse tree is empty, and then
> the ports would not have any qos-level to be applied on the examined path.
> In that case the new function does whatever the old one did, plus checking
> the path for sl2vl "consistency".
Got it. Thanks.
-- Hal
> -- Yevgeny
From monil at voltaire.com Thu Feb 1 08:17:54 2007
From: monil at voltaire.com (Moni Levy)
Date: Thu, 1 Feb 2007 18:17:54 +0200
Subject: [openib-general] OFED 1.2 release - to be reviewed in the
meeting today
In-Reply-To: <45C08E47.2040506@mellanox.co.il>
References: <45BDFF11.9080901@mellanox.co.il>
<45BFF296.8000908@cse.ohio-state.edu> <45C08E47.2040506@mellanox.co.il>
Message-ID: <6a122cc00702010817j52958d85n1d141316e29a7ebf@mail.gmail.com>
Tziporet,
On 1/31/07, Tziporet Koren wrote:
> Shaun Rowland wrote:
> >
> > Hi. I am not exactly sure where the ofed_1_2 directory for MPI SRPMs is
> > supposed to go. I assume from previous meetings this is just a
> > filesystem directory. Should it be a directory in my home directory on
> > staging.openfabrics.org, in ~/public_html, or is there something else I
> > need to do to put this into place? I think from the previous MPI
> > specific meeting, this was supposed to be done in a web directory. Since
> > I am unclear, I wanted to ask here.
>
> Please place your SRPM under your home directory at ofed_1_2 directory.
> Then you can make this directory accessible to the web in this way:
> 1. mkdir public_html
> 2. chmod 755 public_html
>
> Now you can put any stuff under public_html (also symbolic links) and it
> will be available via web
> www.openfabrics.org/~/
I have put the ib-bonding SRPM in ~monis/ofed_1_2
--Moni
>
> Tziporet
From swise at opengridcomputing.com Thu Feb 1 09:12:01 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 11:12:01 -0600
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070201121008.GA20789@mellanox.co.il>
References: <20070125191321.30934.74542.stgit@dell3.ogc.int>
<20070201121008.GA20789@mellanox.co.il>
Message-ID: <1170349921.16637.1.camel@stevo-desktop>
Looks good.
Thanks,
Steve.
On Thu, 2007-02-01 at 14:10 +0200, Michael S. Tsirkin wrote:
> > Quoting Steve Wise :
> > Subject: [PATCH 00/12] ofed_1_2 - Neighbour update support
> >
> >
> > Michael/Vlad:
> >
> > Here are the backports for snooping arp packets to generate neighbour
> > update netevents. Also included is the addr.c patch to act on all valid
> > neigh update events. If this series looks good to you then I'll push
> > this up and you all can pull it from my git tree.
>
> These patches seem to have created a reference leak on each neighbour;
> as a result, the ipoib interface could not be brought down.
> It also seems that the RHASU2 backport was missing code.
> I pushed out the following:
>
>
> commit d140398db0da0beb3172e0ccf14ef3023cafec9c
> Author: Michael S. Tsirkin
> Date: Thu Feb 1 12:21:34 2007 +0200
>
> Fix neighbour reference leak in netevent.c
>
> Signed-off-by: Michael S. Tsirkin
>
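The leak described in the commit message above follows a common pattern: a lookup that hands back a counted reference, and a caller that forgets to drop it. The userspace sketch below models that pattern with a plain integer counter. All names are hypothetical; in the kernel the corresponding pair is neigh_lookup() and neigh_release().

```c
#include <stddef.h>

struct entry {
    int key;
    int refcnt;  /* the table itself holds one reference per entry */
};

static struct entry table[] = { { 1, 1 }, { 2, 1 } };

/* On success the caller owns one extra reference and must release it. */
struct entry *entry_lookup(int key)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (table[i].key == key) {
            table[i].refcnt++;
            return &table[i];
        }
    }
    return NULL;
}

void entry_release(struct entry *e)
{
    e->refcnt--;
}

/* Stand-in for delivering a notification about the entry. */
static void notify(struct entry *e)
{
    (void)e;
}

/* Leaky variant: the reference taken by the lookup is never dropped. */
void on_event_leaky(int key)
{
    struct entry *e = entry_lookup(key);

    if (e)
        notify(e);
}

/* Fixed variant: drop the reference once the notification is done. */
void on_event_fixed(int key)
{
    struct entry *e = entry_lookup(key);

    if (e) {
        notify(e);
        entry_release(e);
    }
}
```

Each call to the leaky variant leaves the count one higher than before, so the entry can never be freed; that is consistent with an interface that cannot be brought down while its neighbours are still referenced.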
> diff --git a/kernel_addons/backport/2.6.11/include/src/netevent.c b/kernel_addons/backport/2.6.11/include/src/netevent.c
> index 6a8df29..0d26662 100644
> --- a/kernel_addons/backport/2.6.11/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.11/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.12/include/src/netevent.c b/kernel_addons/backport/2.6.12/include/src/netevent.c
> index 6a8df29..0d26662 100644
> --- a/kernel_addons/backport/2.6.12/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.12/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.13/include/src/netevent.c b/kernel_addons/backport/2.6.13/include/src/netevent.c
> index 6a8df29..0d26662 100644
> --- a/kernel_addons/backport/2.6.13/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.13/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.14/include/src/netevent.c b/kernel_addons/backport/2.6.14/include/src/netevent.c
> index 188283c..17a12ff 100644
> --- a/kernel_addons/backport/2.6.14/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.14/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.15/include/src/netevent.c b/kernel_addons/backport/2.6.15/include/src/netevent.c
> index 188283c..17a12ff 100644
> --- a/kernel_addons/backport/2.6.15/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.15/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.15_ubuntu606/include/src/netevent.c b/kernel_addons/backport/2.6.15_ubuntu606/include/src/netevent.c
> index 188283c..17a12ff 100644
> --- a/kernel_addons/backport/2.6.15_ubuntu606/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.15_ubuntu606/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.16/include/src/netevent.c b/kernel_addons/backport/2.6.16/include/src/netevent.c
> index 188283c..17a12ff 100644
> --- a/kernel_addons/backport/2.6.16/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.16/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.16_sles10/include/src/netevent.c b/kernel_addons/backport/2.6.16_sles10/include/src/netevent.c
> index 188283c..17a12ff 100644
> --- a/kernel_addons/backport/2.6.16_sles10/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.16_sles10/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.17/include/src/netevent.c b/kernel_addons/backport/2.6.17/include/src/netevent.c
> index 26a0920..4c67de1 100644
> --- a/kernel_addons/backport/2.6.17/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.17/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c b/kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c
> index 57a23ab..90fce0c 100644
> --- a/kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c
> @@ -39,8 +39,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
> diff --git a/kernel_addons/backport/2.6.9_U2/include/src/netevent.c b/kernel_addons/backport/2.6.9_U2/include/src/netevent.c
> index 5ffadd1..1589300 100644
> --- a/kernel_addons/backport/2.6.9_U2/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.9_U2/include/src/netevent.c
> @@ -13,10 +13,59 @@
> * Fixes:
> */
>
> -#include
> -#include
> #include
> #include
> +#include
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +#include
> +
> +static DEFINE_MUTEX(lock);
> +static int count;
> +
> +static void destructor(struct sk_buff *skb)
> +{
> + struct neighbour *n;
> + u8 *arp_ptr;
> + __be32 gw;
> +
> + /* Pull the SPA */
> + arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> + memcpy(&gw, arp_ptr, 4);
> + n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> + if (n) {
> + call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> + return;
> +}
> +
> +static int arp_recv(struct sk_buff *skb, struct net_device *dev,
> + struct packet_type *pkt)
> +{
> + struct arphdr *arp_hdr;
> + u16 op;
> +
> + arp_hdr = (struct arphdr *) skb->nh.raw;
> + op = ntohs(arp_hdr->ar_op);
> +
> + if ((op == ARPOP_REQUEST || op == ARPOP_REPLY) && !skb->destructor)
> + skb->destructor = destructor;
> +
> + kfree_skb(skb);
> + return 0;
> +}
> +
> +static struct packet_type arp = {
> + .type = __constant_htons(ETH_P_ARP),
> + .func = arp_recv,
> + .af_packet_priv = (void *)1,
> +};
>
> static struct notifier_block *netevent_notif_chain;
>
> @@ -34,6 +83,12 @@ int register_netevent_notifier(struct notifier_block *nb)
> int err;
>
> err = notifier_chain_register(&netevent_notif_chain, nb);
> + if (!err) {
> + mutex_lock(&lock);
> + if (count++ == 0)
> + dev_add_pack(&arp);
> + mutex_unlock(&lock);
> + }
> return err;
> }
>
> @@ -49,7 +104,16 @@ int register_netevent_notifier(struct notifier_block *nb)
>
> int unregister_netevent_notifier(struct notifier_block *nb)
> {
> - return notifier_chain_unregister(&netevent_notif_chain, nb);
> + int err;
> +
> + err = notifier_chain_unregister(&netevent_notif_chain, nb);
> + if (!err) {
> + mutex_lock(&lock);
> + if (--count == 0)
> + dev_remove_pack(&arp);
> + mutex_unlock(&lock);
> + }
> + return err;
> }
>
> /**
> diff --git a/kernel_addons/backport/2.6.9_U3/include/src/netevent.c b/kernel_addons/backport/2.6.9_U3/include/src/netevent.c
> index 5ffadd1..1589300 100644
> --- a/kernel_addons/backport/2.6.9_U3/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.9_U3/include/src/netevent.c
> @@ -13,10 +13,59 @@
> * Fixes:
> */
>
> -#include
> -#include
> #include
> #include
> +#include
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +#include
> +
> +static DEFINE_MUTEX(lock);
> +static int count;
> +
> +static void destructor(struct sk_buff *skb)
> +{
> + struct neighbour *n;
> + u8 *arp_ptr;
> + __be32 gw;
> +
> + /* Pull the SPA */
> + arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> + memcpy(&gw, arp_ptr, 4);
> + n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> + if (n) {
> + call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> + return;
> +}
> +
> +static int arp_recv(struct sk_buff *skb, struct net_device *dev,
> + struct packet_type *pkt)
> +{
> + struct arphdr *arp_hdr;
> + u16 op;
> +
> + arp_hdr = (struct arphdr *) skb->nh.raw;
> + op = ntohs(arp_hdr->ar_op);
> +
> + if ((op == ARPOP_REQUEST || op == ARPOP_REPLY) && !skb->destructor)
> + skb->destructor = destructor;
> +
> + kfree_skb(skb);
> + return 0;
> +}
> +
> +static struct packet_type arp = {
> + .type = __constant_htons(ETH_P_ARP),
> + .func = arp_recv,
> + .af_packet_priv = (void *)1,
> +};
>
> static struct notifier_block *netevent_notif_chain;
>
> @@ -34,6 +83,12 @@ int register_netevent_notifier(struct notifier_block *nb)
> int err;
>
> err = notifier_chain_register(&netevent_notif_chain, nb);
> + if (!err) {
> + mutex_lock(&lock);
> + if (count++ == 0)
> + dev_add_pack(&arp);
> + mutex_unlock(&lock);
> + }
> return err;
> }
>
> @@ -49,7 +104,16 @@ int register_netevent_notifier(struct notifier_block *nb)
>
> int unregister_netevent_notifier(struct notifier_block *nb)
> {
> - return notifier_chain_unregister(&netevent_notif_chain, nb);
> + int err;
> +
> + err = notifier_chain_unregister(&netevent_notif_chain, nb);
> + if (!err) {
> + mutex_lock(&lock);
> + if (--count == 0)
> + dev_remove_pack(&arp);
> + mutex_unlock(&lock);
> + }
> + return err;
> }
>
> /**
> diff --git a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> index 6a8df29..0d26662 100644
> --- a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> @@ -38,8 +38,10 @@ static void destructor(struct sk_buff *skb)
> arp_ptr = skb->nh.raw + sizeof(struct arphdr) + skb->dev->addr_len;
> memcpy(&gw, arp_ptr, 4);
> n = neigh_lookup(&arp_tbl, &gw, skb->dev);
> - if (n)
> + if (n) {
> call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
> + neigh_release(n);
> + }
> return;
> }
>
>
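
For readers following the SPA lookup in destructor() in the quoted patch: the code skips the fixed ARP header and the sender hardware address to reach the 4-byte sender IP address. Below is a minimal userspace sketch of that offset arithmetic. The names arp_fixed_hdr and arp_sender_ip() are made up for illustration and are not kernel code, and the length check is an addition that the quoted patch does not perform.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed 8-byte part of an ARP header. The variable-length addresses
 * (sender HW, sender IP, target HW, target IP) follow it on the wire. */
struct arp_fixed_hdr {
	uint16_t ar_hrd;	/* hardware type */
	uint16_t ar_pro;	/* protocol type */
	uint8_t  ar_hln;	/* hardware address length */
	uint8_t  ar_pln;	/* protocol address length */
	uint16_t ar_op;		/* opcode (request/reply) */
};

/* Hypothetical helper: copy the 4-byte sender protocol address (SPA)
 * out of a raw ARP packet. hw_addr_len plays the role of
 * skb->dev->addr_len in the quoted code.
 * Returns 0 on success, -1 if the packet is too short. */
static int arp_sender_ip(const uint8_t *pkt, size_t len,
			 size_t hw_addr_len, uint8_t spa[4])
{
	size_t off = sizeof(struct arp_fixed_hdr) + hw_addr_len;

	if (len < off + 4)
		return -1;
	memcpy(spa, pkt + off, 4);
	return 0;
}
```

For Ethernet (6-byte hardware address) the SPA therefore starts at byte 14 of the ARP payload.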
From swise at opengridcomputing.com Thu Feb 1 09:29:24 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 11:29:24 -0600
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070201121930.GB20789@mellanox.co.il>
References: <20070125191321.30934.74542.stgit@dell3.ogc.int>
<20070201121930.GB20789@mellanox.co.il>
Message-ID: <1170350964.16637.18.camel@stevo-desktop>
On Thu, 2007-02-01 at 14:19 +0200, Michael S. Tsirkin wrote:
> > Here are the backports for snooping arp packets to generate neighbour
> > update netevents.
>
> OK, I went (somewhat belatedly) over this code in more depth and I see
> a couple of issues that I'd like you to address:
>
> - There's some trailing whitespace in some netevent.c files.
> Could you clean these please?
>
I assume you took care of these, based on your follow-up email.
> - I see:
> $ diff ./kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> kernel_addons/backport/2.6.5_sles9_sp3/include/src/netevent.c
> > #include
>
> Should not redhat backports include skbuff.h too?
> They do use skbuff struct so it seems it is cleaner to include
> directly, and we would get identical code for redhat and suse.
>
Yup.
> - What is the reason for:
> if ((op == ARPOP_REQUEST || op == ARPOP_REPLY) && !skb->destructor)
> skb->destructor = destructor;
>
> kfree_skb(skb);
>
> Could we miss events because skb has a destructor?
Yes. I looked through the ethernet drivers and didn't see anyone using
destructors. I thought perhaps this is ok for backports. There are
ways to address this issue:
1) Enhance the current code to save off the original destructor function
if it exists and put in ours. Then when our function is called, we do
our processing, then call the original destructor function. We would
need to save the original function ptr somewhere.
2) schedule the function to happen at a later time and hope the ARP
subsystem has already updated the neigh table. I opted against this
approach because it doesn't ensure that the neigh entry was updated
before we act on it.
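
Option 1 could look roughly like the following userspace sketch. It is a toy model only: struct buf stands in for the skb, and the saved_destructor slot is hypothetical, since where to save the original pointer in a real skb is exactly the open question above.

```c
#include <stddef.h>

struct buf;
typedef void (*buf_destructor)(struct buf *);

/* Toy stand-in for an sk_buff; all fields are illustrative. */
struct buf {
	buf_destructor destructor;		/* called when the buffer is freed */
	buf_destructor saved_destructor;	/* hypothetical slot for the original */
	int log[2];				/* records the call order */
	int nlog;
};

/* Our hook: do the snoop processing first, then chain to the original. */
static void snoop_destructor(struct buf *b)
{
	b->log[b->nlog++] = 1;	/* neighbour lookup + notifier call would go here */
	if (b->saved_destructor)
		b->saved_destructor(b);
}

/* Install the hook, remembering any destructor that was already set. */
static void install_snoop(struct buf *b)
{
	if (b->destructor == snoop_destructor)
		return;		/* already hooked */
	b->saved_destructor = b->destructor;
	b->destructor = snoop_destructor;
}

/* Example pre-existing destructor, as some driver might have set. */
static void original_destructor(struct buf *b)
{
	b->log[b->nlog++] = 2;
}

/* Freeing the buffer runs whatever destructor is installed. */
static void buf_free(struct buf *b)
{
	if (b->destructor)
		b->destructor(b);
}
```

With this shape no event is missed when a destructor is already present, at the cost of needing storage for the saved pointer.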
> Can we just call the destructor function directly (this is what addr.c
> did previously, and this apparently worked fine).
The original addr.c snoop code worked fine for IB address resolution and
for the initial ARP resolution for iWARP devices, but not for notifying
iWARP devices when a neighbour changes. For instance, if the neighbour
mac address changes, then the iWARP device needs to be notified so it
can update its L2 table maintained in the device.
We need to defer calling the destructor function until the ARP subsystem
has processed this ARP packet. Through testing, I saw that our snoop
function gets called _before_ the ARP subsystem processes the ARP
packet. So the neighbour entry hasn't been updated yet. Hooking via
destructor calls our function _after_ the ARP subsystem has updated the
neighbour. So we can then lookup the neigh entry and do the callouts.
From mshefty at ichips.intel.com Thu Feb 1 09:55:10 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Thu, 01 Feb 2007 09:55:10 -0800
Subject: [openib-general] new IB CM reject reason
In-Reply-To: <20070201062431.GB4499@mellanox.co.il>
References: <000201c74585$a0bc7260$8698070a@amr.corp.intel.com>
<20070201062431.GB4499@mellanox.co.il>
Message-ID: <45C2297E.9050306@ichips.intel.com>
> No, I don't think "application crashed" makes sense as an element of wire protocol.
> I think an optional logging of errors in kernel CM would be a much better
> solution. I know I had to add some printks to it each time I was debugging SDP.
The "application crashed" scenario is what highlighted the issue. The problem
is that the CM must provide a reject reason. Which reject reason do you use?
My suggestion was for a reject reason of other/unknown/none given (pick one).
> 2. Another objection is that this feature seems to invite misuse where applications
> will use REJ reason as a hint on whether remote side crashed. But REJ could be
> lost. Wouldn't this confuse the remote side?
Currently, the CM issues the reject using "consumer defined", since nothing else
maps any better under this condition. But the reject isn't consumer defined...
By doing this, an application that expects specific private data in the reject
message won't find it, which is just as likely to confuse the remote side. This
is why I think an unknown/unspecified reject reason is needed.
How an application interprets a reject with 'unknown' reason is up to the
application, but I do think this is better than the application trying to guess
whether 'consumer defined' really does mean consumer defined.
- Sean
From mshefty at ichips.intel.com Thu Feb 1 10:01:01 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Thu, 01 Feb 2007 10:01:01 -0800
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <20070201135522.GA27688@mellanox.co.il>
References: <000401c7458b$9bff77d0$8698070a@amr.corp.intel.com>
<1170322670.654.23.camel@linux-q667.site>
<20070201135522.GA27688@mellanox.co.il>
Message-ID: <45C22ADD.40108@ichips.intel.com>
> - Sean, please base your branches on specific -rc from linus
> (OFED 1.2 is now -rc7).
My branches should be in sync with rc6. The original branches were built from
an earlier rc version, and updated by pulling in the latest rc from Linus
through my master branch. Are you wanting the history of the branches reworked
so that they get completely rebuilt off of the latest kernel?
> - Now that we are entering feature freeze, we should not do full replaces anymore.
> So Sean, please post incremental patches, labeled ofed-1.2 clearly.
Additional patches will be posted to my ofed_1_2 branch, which you should be
able to pull. Do you see a problem with this process? I don't understand why
you would need to do a full replace.
- Sean
From mst at mellanox.co.il Thu Feb 1 10:39:22 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 20:39:22 +0200
Subject: [openib-general] new IB CM reject reason
In-Reply-To: <45C2297E.9050306@ichips.intel.com>
References: <45C2297E.9050306@ichips.intel.com>
Message-ID: <20070201183922.GB15115@mellanox.co.il>
> Quoting Sean Hefty :
> Subject: Re: [openib-general] new IB CM reject reason
>
> > No, I don't think "application crashed" makes sense as an element of wire protocol.
> > I think an optional logging of errors in kernel CM would be a much better
> > solution. I know I had to add some printks to it each time I was debugging SDP.
>
> The "application crashed" scenario is what highlighted the issue. The problem
> is that the CM must provide a reject reason. Which reject reason do you use?
> My suggestion was for a reject reason of other/unknown/none given (pick one).
I'm actually happy with what existing code does (consumer reject).
I would like to highlight the lack of ability to make CM errors
go to system log as a weakness in current CM code, which hinders debugging.
Would you be interested in a patch making it possible to enable logging CM errors
and/or all CM events?
> > 2. Another objection is that this feature seems to invite misuse where applications
> > will use REJ reason as a hint on whether remote side crashed. But REJ could be
> > lost. Wouldn't this confuse the remote side?
>
> Currently, the CM issues the reject using "consumer defined", since nothing else
> maps any better under this condition. But the reject isn't consumer defined...
> By doing this, an application that expects specific private data in the reject
> message won't find it, which is just as likely to confuse the remote side. This
> is why I think an unknown/unspecified reject reason is needed.
>
> How an application interprets a reject with 'unknown' reason is up to the
> application, but I do think this is better than the application trying to guess
> whether 'consumer defined' really does mean consumer defined.
Are we talking about code 28? My spec lists it as "consumer reject".
The meaning of *private data* is consumer defined.
The consumer decided to reject the communica-
tion or EE context setup establishment attempt for
reasons other than those listed in the other REJ
codes. Typically this happens based upon infor-
mation being conveyed in the PrivateData field of
a message. It can also happen because the Con-
sumer decided for reasons unrelated to any CM
message it received to terminate the communica-
tion or EE context setup establishment attempt.
This would therefore be the appropriate Reason
code to use if the Consumer decided to destroy
the QP or EEC in the midst of the communication
or EE context setup establishment attempt.
So this really *does* seem to be what spec intended for exactly our case.
Now, I do not really object to inventing new rejection reasons: for example,
maybe we can invent one that lets us stick the errno value in private data
somehow - but it's not like there's no solution inside the spec,
and inventing a whole new reject reason just for userspace consumers
seems like a narrow approach to me.
--
MST
From swise at opengridcomputing.com Thu Feb 1 10:42:11 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 12:42:11 -0600
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <20070201135522.GA27688@mellanox.co.il>
References: <000401c7458b$9bff77d0$8698070a@amr.corp.intel.com>
<1170322670.654.23.camel@linux-q667.site>
<20070201135522.GA27688@mellanox.co.il>
Message-ID: <1170355331.16637.25.camel@stevo-desktop>
Um, now on rhel4u4 we crash creating the mcast workqueue.
The name is "ib_mcast_wq" which is too long for older kernels.
Did we lose a backport patch?
On Thu, 2007-02-01 at 15:55 +0200, Michael S. Tsirkin wrote:
> > Quoting Steve Wise :
> > Subject: Re: [PATCH] RE: regression in ofed 1.2
> >
> > > Okay - I _think_ the problem is that OFED 1.2 pulled code from my git tree
> > > before I created an ofed_1_2 branch (which contains the fix), and didn't update
> > > to match my ofed_1_2 branch. The crash that you reported occurring over iWarp
> > > should also happen over IB for the same reason, so both are likely broken atm...
> > >
> > > Vlad, can you please update the ofed build by pulling from the ofed_1_2 branches
> > > of my rdma-dev.git and librdmacm.git trees?
> >
> > I looked at your rdma-dev ofed_1_2 branch and see that the cma.c changes
> > you made there will resolve this issue. It just needs to be pulled into
> > ofed_1_2.
>
> OK, I've updated ofed to code from rdma-dev ofed_1_2 branch. Some notes:
>
> - Sean, please base your branches on specific -rc from linus
> (OFED 1.2 is now -rc7).
> - Now that we are entering feature freeze, we should not do full replaces anymore.
> So Sean, please post incremental patches, labeled ofed-1.2 clearly.
>
From sean.hefty at intel.com Thu Feb 1 10:55:20 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Thu, 1 Feb 2007 10:55:20 -0800
Subject: [openib-general] new IB CM reject reason
In-Reply-To: <20070201183922.GB15115@mellanox.co.il>
Message-ID: <000101c74632$85b37bf0$8698070a@amr.corp.intel.com>
>Would you be interested in a patch making it possible to enable logging CM
>errors
>and/or all CM events?
A patch for this would be fine with me.
>Are we talking about code 28? My spec lists it as "consumer reject".
>The meaning of *private data* is consumer defined.
>
> The consumer decided to reject the communica-
> tion or EE context setup establishment attempt for
> reasons other than those listed in the other REJ
> codes. Typically this happens based upon infor-
> mation being conveyed in the PrivateData field of
> a message. It can also happen because the Con-
> sumer decided for reasons unrelated to any CM
> message it received to terminate the communica-
> tion or EE context setup establishment attempt.
> This would therefore be the appropriate Reason
> code to use if the Consumer decided to destroy
> the QP or EEC in the midst of the communication
> or EE context setup establishment attempt.
>
>So this really *does* seem to be what spec intended for exactly our case.
I disagree. This is for the CM consumer, not the CM itself. In this case, the
CM must issue a reject that will be delivered to the remote application. The CM
has no idea what private data format the remote application expects.
>Now, I do not really object to inventing new rejection reasons: for example,
>maybe we can invent one that lets us stick the errno value in private data
>somehow - but it's not like there's no solution inside the spec,
>and inventing a whole new reject reason just for userspace consumers
>seems like a narrow approach to me.
Unless we start enforcing a policy that kernel consumers must issue a reject
before destroying a cm_id (while in the connecting phase), they have this
problem.
My claim is that the reject reasons are insufficient to cover all possible
conditions, and adding a generic 'other' reject reason solves this. Using
consumer defined, which is what is done today, is incorrect. As an alternate
solution, we could also not send any reject and just let the connection time out
on the remote side.
- Sean
From mst at mellanox.co.il Thu Feb 1 11:00:49 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 21:00:49 +0200
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <45C22ADD.40108@ichips.intel.com>
References: <45C22ADD.40108@ichips.intel.com>
Message-ID: <20070201190049.GC15115@mellanox.co.il>
> Quoting Sean Hefty :
> Subject: Re: [openib-general] [PATCH] RE: regression in ofed 1.2
>
> > - Sean, please base your branches on specific -rc from linus
> > (OFED 1.2 is now -rc7).
>
> My branches should be in sync with rc6.
If you check, they are not. The ofed_1_2 branch has an extra
commit on top of -rc6. But I figured it out already.
> so that they get completely rebuilt off of the latest kernel?
No need to do anything at this point.
> > - Now that we are entering feature freeze, we should not do full replaces anymore.
> > So Sean, please post incremental patches, labeled ofed-1.2 clearly.
>
> Additional patches will be posted to my ofed_1_2 branch, which you should be
> able to pull.
First, please post patches on list as well.
We can then just take the patch from git or from mail and add it under fixes.
> Do you see a problem with this process?
Yes. I had to jump through some hoops to first get a patch I can put in OFED due
to the issue outlined above, and then get the diff I got to apply without
conflicts, since port randomization code conflicted with the QoS patches. All
solved now - just put your patch before QoS one - but these conflicts should be
figured out by whoever submits patches.
> I don't understand why you would need to do a full replace.
We won't do a full replace, just add patches in fixes directory.
What I expect everyone to do, however, to get patches put in OFED,
is to test that the patches they post work in the OFED git tree, not just
against upstream-based git trees.
This currently includes testing for build against older kernels on various
architectures (Vlad and I put a cross-build setup for this at staging;
it now has kernel.org kernels but we will be adding distro kernels)
and testing on at least one of the main supported enterprise distros (RHEL/SLES).
I simply can't take untested patches - I have nightly tests but no time to test
all ULPs before I apply.
--
MST
From mst at mellanox.co.il Thu Feb 1 11:06:24 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 21:06:24 +0200
Subject: [openib-general] new IB CM reject reason
In-Reply-To: <000101c74632$85b37bf0$8698070a@amr.corp.intel.com>
References: <20070201183922.GB15115@mellanox.co.il>
<000101c74632$85b37bf0$8698070a@amr.corp.intel.com>
Message-ID: <20070201190624.GB6473@mellanox.co.il>
> >Are we talking about code 28? My spec lists it as "consumer reject".
> >The meaning of *private data* is consumer defined.
> >
> > The consumer decided to reject the communica-
> > tion or EE context setup establishment attempt for
> > reasons other than those listed in the other REJ
> > codes. Typically this happens based upon infor-
> > mation being conveyed in the PrivateData field of
> > a message. It can also happen because the Con-
> > sumer decided for reasons unrelated to any CM
> > message it received to terminate the communica-
> > tion or EE context setup establishment attempt.
> > This would therefore be the appropriate Reason
> > code to use if the Consumer decided to destroy
> > the QP or EEC in the midst of the communication
> > or EE context setup establishment attempt.
> >
> >So this really *does* seem to be what spec intended for exactly our case.
>
> I disagree. This is for the CM consumer, not the CM itself. In this case, the
> CM must issue a reject that will be delivered to the remote application. The CM
> has no idea what private data format the remote application expects.
Since we disagree about spec reading, would you raise this in the
relevant WG?
> >Now, I do not really object to inventing new rejection reasons: for example,
> >maybe we can invent one that lets us stick the errno value in private data
> >somehow - but it's not like there's no solution inside the spec,
> >and inventing a whole new reject reason just for userspace consumers
> >seems like a narrow approach to me.
>
> Unless we start enforcing a policy that kernel consumers must issue a reject
> before destroying a cm_id (while in the connecting phase), they have this
> problem.
>
> My claim is that the reject reasons are insufficient to cover all possible
> conditions, and adding a generic 'other' reject reason solves this. Using
> consumer defined, which is what is done today, is incorrect. As an alternate
> solution, we could also not send any reject and just let the connection time out
> on the remote side.
And my claim is that you should define private data format to go with this
other reason otherwise you are not really solving the problem.
--
MST
From mst at mellanox.co.il Thu Feb 1 11:11:28 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 21:11:28 +0200
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <1170355331.16637.25.camel@stevo-desktop>
References: <1170355331.16637.25.camel@stevo-desktop>
Message-ID: <20070201191128.GB17617@mellanox.co.il>
> Quoting Steve Wise :
> Subject: Re: [PATCH] RE: regression in ofed 1.2
>
> Um, now on rhel4u4 we crash creating the mcast workqueue.
>
> The name is "ib_mcast_wq" which is too long for older kernels.
>
> Did we lose a backport patch?
Sean, please rename the multicast wq to ib_mcast as we agreed.
I just pushed the following out:
commit efedfe57a21a134a65d951bcca73af46da609c5e
Author: Michael S. Tsirkin
Date: Thu Feb 1 21:09:16 2007 +0200
Make multicast WQ name shorter.
Signed-off-by: Michael S. Tsirkin
diff --git a/kernel_patches/fixes/merged_sean_rdma_dev_ofed_1_2.patch b/kernel_patches/fixes/merged_sean_rdma_dev_ofed_1_2.patch
index e70d4da..4b968db 100644
--- a/kernel_patches/fixes/merged_sean_rdma_dev_ofed_1_2.patch
+++ b/kernel_patches/fixes/merged_sean_rdma_dev_ofed_1_2.patch
@@ -2225,7 +2225,7 @@ index 0000000..039f1eb
+{
+ int ret;
+
-+ mcast_wq = create_singlethread_workqueue("ib_mcast_wq");
++ mcast_wq = create_singlethread_workqueue("ib_mcast");
+ if (!mcast_wq)
+ return -ENOMEM;
+
--
MST
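
The rename above can be sanity-checked with a tiny userspace sketch. It assumes a 10-character limit on workqueue names in the older kernels being backported to; that figure is an assumption on my part and is not stated in this thread, so check the target kernel's workqueue code before relying on it. WQ_NAME_MAX and wq_name_fits() are illustrative names, not kernel APIs.

```c
#include <string.h>

/* ASSUMPTION: older kernels reject workqueue names longer than this.
 * The exact limit is not given in the thread. */
#define WQ_NAME_MAX 10

/* Hypothetical helper: returns 1 if the name is short enough, else 0. */
static int wq_name_fits(const char *name)
{
	return strlen(name) <= WQ_NAME_MAX;
}
```

Under that assumption "ib_mcast_wq" (11 characters) fails the check while "ib_mcast" (8 characters) passes.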
From mst at mellanox.co.il Thu Feb 1 11:22:21 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 21:22:21 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <1170350964.16637.18.camel@stevo-desktop>
References: <1170350964.16637.18.camel@stevo-desktop>
Message-ID: <20070201192221.GD17617@mellanox.co.il>
> > Could we miss events because skb has a destructor?
>
> Yes. I looked through the ethernet drivers and didn't see anyone using
> destructors. I thought perhaps this is ok for backports. There are
> ways to address this issue:
>
> 1) Enhance the current code to save off the original destructor function
> if it exists and put in ours. Then when our function is called, we do
> our processing, then call the original destructor function. We would
> need to save the original function ptr somewhere.
>
> 2) schedule the function to happen at a later time and hope the ARP
> subsystem has already updated the neigh table. I opted against this
> approach because it doesn't ensure that the neigh entry was updated
> before we act on it.
>
> > Can we just call the destructor function directly (this is what addr.c
> > did previously, and this apparently worked fine).
>
> The original addr.c snoop code worked fine for IB address resolution and
> for the initial ARP resolution for iWARP devices, but not for notifying
> iWARP devices when a neighbour changes. For instance, if the neighbour
> mac address changes, then the iWARP device needs to be notified so it
> can update its L2 table maintained in the device.
>
> We need to defer calling the destructor function until the ARP subsystem
> has processed this ARP packet. Through testing, I saw that our snoop
> function gets called _before_ the ARP subsystem processes the ARP
> packet. So the neighbour entry hasn't been updated yet. Hooking via
> destructor calls our function _after_ the ARP subsystem has updated the
> neighbour. So we can then lookup the neigh entry and do the callouts.
Not sure what you mean by all this. You do kfree_skb immediately in the
arp processing function. Will this not call the destructor directly?
Anyway, it seems too risky to change the code a lot now.
What I am concerned about is that this could have broken working code.
To reduce the risk of problems for existing code,
I'd like to see something like the following:
if (someone asked for notification on neighbour changes)
do the destructor trick
if (someone asked for notification on address resolution)
call the destructor directly
Could you code this up please?
--
MST
From mst at mellanox.co.il Thu Feb 1 11:29:24 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Thu, 1 Feb 2007 21:29:24 +0200
Subject: [openib-general] IPoIB CM for merge?
Message-ID: <20070201192924.GE17617@mellanox.co.il>
Roland, 2.6.20 is nearly done.
Could you please spend some time reviewing IPoIB CM code?
I am concerned about missing the 2.6.21 merge window.
--
MST
From swise at opengridcomputing.com Thu Feb 1 12:01:11 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 14:01:11 -0600
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070201192221.GD17617@mellanox.co.il>
References: <1170350964.16637.18.camel@stevo-desktop>
<20070201192221.GD17617@mellanox.co.il>
Message-ID: <1170360071.16637.39.camel@stevo-desktop>
On Thu, 2007-02-01 at 21:22 +0200, Michael S. Tsirkin wrote:
> > > Could we miss events because skb has a destructor?
> >
> > Yes. I looked through the ethernet drivers and didn't see anyone using
> > destructors. I thought perhaps this is ok for backports. There are
> > ways to address this issue:
> >
> > 1) Enhance the current code to save off the original destructor function
> > if it exists and put in ours. Then when our function is called, we do
> > our processing, then call the original destructor function. We would
> > need to save the original function ptr somewhere.
> >
> > 2) schedule the function to happen at a later time and hope the ARP
> > subsystem has already updated the neigh table. I opted against this
> > approach because it doesn't ensure that the neigh entry was updated
> > before we act on it.
> >
> > > > Can we just call the destructor function directly (this is what addr.c
> > > did previously, and this apparently worked fine).
> >
> > The original addr.c snoop code worked fine for IB address resolution and
> > for the initial ARP resolution for iWARP devices, but not for notifying
> > iWARP devices when a neighbour changes. For instance, if the neighbour
> > mac address changes, then the iWARP device needs to be notified so it
> > can update its L2 table maintained in the device.
> >
> > We need to defer calling the destructor function until the ARP subsystem
> > has processed this ARP packet. Through testing, I saw that our snoop
> > function gets called _before_ the ARP subsystem processes the ARP
> > packet. So the neighbour entry hasn't been updated yet. Hooking via
> > destructor calls our function _after_ the ARP subsystem has updated the
> > neighbour. So we can then lookup the neigh entry and do the callouts.
>
> Not sure what you mean by all this. You do kfree_skb immediately in the
> arp processing function. Will this not call the destructor directly?
>
No because the skb refcnt gets bumped by the dev packet code before
passing it up to each snoop function. So the destructor fn will get
called only when the _last_ user of this skbuf frees it. If for some
reason we are the last ref, then yes, we'd get called immediately. But
that's not what happens because the snoopers get added to the end of the
list of users who want any given ethertype packet. Hope that makes
sense.
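
To make the ordering concrete, here is a toy userspace model of the behaviour described above. It is a sketch, not kernel code: struct buf, buf_get(), buf_put() and snoop_recv() are illustrative stand-ins for the skb, the refcount bump done by the dev packet code, kfree_skb(), and our arp_recv().

```c
#include <stddef.h>

/* Toy stand-in for an sk_buff with a user count. */
struct buf {
	int users;				/* number of outstanding references */
	void (*destructor)(struct buf *);	/* run when the last ref is dropped */
	int destroyed;				/* set by the destructor below */
};

/* Take a reference, as is done before handing the buffer to a handler. */
static void buf_get(struct buf *b)
{
	b->users++;
}

/* Drop a reference; the destructor runs only for the last one. */
static void buf_put(struct buf *b)
{
	if (--b->users > 0)
		return;
	if (b->destructor)
		b->destructor(b);
}

/* Stand-in for the deferred work (neighbour lookup + notifier call). */
static void mark_destroyed(struct buf *b)
{
	b->destroyed = 1;
}

/* Snoop handler: hook the destructor, then drop only our own reference. */
static void snoop_recv(struct buf *b)
{
	if (!b->destructor)
		b->destructor = mark_destroyed;
	buf_put(b);
}
```

If another user (here, the ARP subsystem) still holds a reference when snoop_recv() returns, the destructor does not fire until that user calls buf_put().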
> Anyway, it seems too risky to change the code a lot now.
> What I am concerned about is that this could have broken working code.
>
I tested it with IB and iWARP.
> To reduce the risk of problems for existing code,
> I'd like to see something like the following:
>
> if (someone asked for notification on neighbour changes)
> do the destructor trick
>
> if (someone asked for notification on address resolution)
> call the destructor directly
>
> Could you code this up please?
There's no easy way to tell who asked for notifications, and
particularly why they asked for them.
I think we should leave it as-is. If we have problems, we'll fix it.
Or you could put your arp snoop code back in addr.c and address
translation will not use netevents. But I still think we should leave
it...
From mshefty at ichips.intel.com Thu Feb 1 12:05:34 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Thu, 01 Feb 2007 12:05:34 -0800
Subject: [openib-general] new IB CM reject reason
In-Reply-To: <20070201190624.GB6473@mellanox.co.il>
References: <20070201183922.GB15115@mellanox.co.il>
<000101c74632$85b37bf0$8698070a@amr.corp.intel.com>
<20070201190624.GB6473@mellanox.co.il>
Message-ID: <45C2480E.2000904@ichips.intel.com>
> And my claim is that you should define private data format to go with this
> other reason otherwise you are not really solving the problem.
This is not a consumer issued reject. It is a CM issued reject, so the private
data is ignored. This is no different than several other reject reasons (like
invalid service ID). At best we could define the ARI, but if we knew what the
contents of the ARI should be, then we should use a more specific reject reason
than 'other'.
- Sean
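
The distinction argued for above can be sketched from the receiving application's side. This is only an illustration: code 28 is the value discussed in this thread, but REJ_OTHER_HYPOTHETICAL is a made-up placeholder for the proposed reason, not a value from the spec, and classify_reject() is not an existing CM API.

```c
/* Code 28 is the "consumer reject" reason discussed in this thread. */
#define REJ_CONSUMER_DEFINED	28
/* HYPOTHETICAL placeholder for the proposed other/unknown reason.
 * This is not a spec-assigned value. */
#define REJ_OTHER_HYPOTHETICAL	0xFFFF

enum rej_handling {
	REJ_PARSE_PRIVATE_DATA,	/* consumer-issued: private data has app meaning */
	REJ_SKIP_PRIVATE_DATA	/* CM-issued: private data should be ignored */
};

/* Decide how a receiving application treats the private data of a REJ. */
static enum rej_handling classify_reject(unsigned int reason)
{
	if (reason == REJ_CONSUMER_DEFINED)
		return REJ_PARSE_PRIVATE_DATA;
	return REJ_SKIP_PRIVATE_DATA;
}
```

With a separate reason code the application never has to guess whether a "consumer defined" reject actually carries its own private data format.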
From swise at opengridcomputing.com Thu Feb 1 12:07:21 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 14:07:21 -0600
Subject: [openib-general] [PATCH] ofed_1_2 Cleanup RHEL4U4 netevent backport
Message-ID: <1170360441.16637.41.camel@stevo-desktop>
From: Steve Wise
Add skbuff.h to include list for RHEL4U4 netevent.c file. This makes
it identical to the SLES9SP3 file.
Signed-off-by: Steve Wise
---
.../backport/2.6.9_U4/include/src/netevent.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
index 1589300..87fb55c 100644
--- a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
+++ b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
@@ -13,6 +13,7 @@
* Fixes:
*/
+#include
#include
#include
#include
From swise at opengridcomputing.com Thu Feb 1 12:09:03 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 14:09:03 -0600
Subject: [openib-general] [PATCH] ofed_1_2 Chelsio ethernet driver updates.
Message-ID: <1170360543.16637.45.camel@stevo-desktop>
From: Steve Wise
This patch updates the ofed_1_2 cxgb3 module to the latest queued
for 2.6.21.
Signed-off-by: Steve Wise
---
drivers/net/cxgb3/firmware_exports.h | 2 +-
drivers/net/cxgb3/sge.c | 21 +++++++++------------
drivers/net/cxgb3/t3_cpl.h | 3 ---
3 files changed, 10 insertions(+), 16 deletions(-)
diff --git a/drivers/net/cxgb3/firmware_exports.h b/drivers/net/cxgb3/firmware_exports.h
index 4538377..6a835f6 100755
--- a/drivers/net/cxgb3/firmware_exports.h
+++ b/drivers/net/cxgb3/firmware_exports.h
@@ -129,7 +129,7 @@ #define FW_OFLD_NUM 8
#define FW_OFLD_SGEEC_START 0
/*
- *
+ *
*/
#define FW_RI_NUM 1
#define FW_RI_SGEEC_START 65527
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 6b053bf..3f2cf8a 100755
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -601,17 +601,16 @@ static struct sk_buff *get_packet(struct
if (len <= SGE_RX_COPY_THRES) {
skb = alloc_skb(len, GFP_ATOMIC);
if (likely(skb != NULL)) {
- struct rx_desc *d = &fl->desc[fl->cidx];
- dma_addr_t mapping =
- (dma_addr_t)((u64) be32_to_cpu(d->addr_hi) << 32 |
- be32_to_cpu(d->addr_lo));
-
__skb_put(skb, len);
- pci_dma_sync_single_for_cpu(adap->pdev, mapping, len,
- PCI_DMA_FROMDEVICE);
+ pci_dma_sync_single_for_cpu(adap->pdev,
+ pci_unmap_addr(sd,
+ dma_addr),
+ len, PCI_DMA_FROMDEVICE);
memcpy(skb->data, sd->skb->data, len);
- pci_dma_sync_single_for_device(adap->pdev, mapping, len,
- PCI_DMA_FROMDEVICE);
+ pci_dma_sync_single_for_device(adap->pdev,
+ pci_unmap_addr(sd,
+ dma_addr),
+ len, PCI_DMA_FROMDEVICE);
} else if (!drop_thres)
goto use_orig_buf;
recycle:
@@ -1667,7 +1666,7 @@ #endif
credits = G_RSPD_TXQ0_CR(flags);
if (credits)
qs->txq[TXQ_ETH].processed += credits;
-
+
credits = G_RSPD_TXQ2_CR(flags);
if (credits)
qs->txq[TXQ_CTRL].processed += credits;
@@ -2220,14 +2219,12 @@ static irqreturn_t t3b_intr_napi(int irq
if (likely(map & 1)) {
dev = adap->sge.qs[0].netdev;
- BUG_ON(napi_is_scheduled(dev));
if (likely(__netif_rx_schedule_prep(dev)))
__netif_rx_schedule(dev);
}
if (map & 2) {
dev = adap->sge.qs[1].netdev;
- BUG_ON(napi_is_scheduled(dev));
if (likely(__netif_rx_schedule_prep(dev)))
__netif_rx_schedule(dev);
}
diff --git a/drivers/net/cxgb3/t3_cpl.h b/drivers/net/cxgb3/t3_cpl.h
index 96b2f36..b7a1a31 100755
--- a/drivers/net/cxgb3/t3_cpl.h
+++ b/drivers/net/cxgb3/t3_cpl.h
@@ -184,9 +184,6 @@ #define V_OPCODE(x) ((x) << S_OPCODE)
#define G_OPCODE(x) (((x) >> S_OPCODE) & 0xFF)
#define G_TID(x) ((x) & 0xFFFFFF)
-#define S_QNUM 0
-#define G_QNUM(x) (((x) >> S_QNUM) & 0xFFFF)
-
/* tid is assumed to be 24-bits */
#define MK_OPCODE_TID(opcode, tid) (V_OPCODE(opcode) | (tid))
From swise at opengridcomputing.com Thu Feb 1 12:19:43 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 14:19:43 -0600
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <20070201090958.GD14189@mellanox.co.il>
References: <000101c74576$fedc81f0$8698070a@amr.corp.intel.com>
<1170275680.14294.5.camel@stevo-desktop>
<45C1480C.1020600@ichips.intel.com>
<1170320484.654.6.camel@linux-q667.site>
<20070201090958.GD14189@mellanox.co.il>
Message-ID: <1170361183.16637.47.camel@stevo-desktop>
On Thu, 2007-02-01 at 11:09 +0200, Michael S. Tsirkin wrote:
> > Quoting Steve Wise :
> > Subject: Re: ip_ib_mc_map?
> >
> > On Wed, 2007-01-31 at 17:53 -0800, Sean Hefty wrote:
> > > Steve Wise wrote:
> > > > Perhaps there's no backport for this to rhel4u4?
> > >
> > > I would have thought so, but I really don't know. The function is called from
> > > net/ipv4/arp.c, and not directly by ipoib. So, I don't know how the backport
> > > patches typically handle this.
> > >
> > > - Sean
> >
> > Here's what I see:
> >
> > ip_ib_mc_map() is called directly from cma_join_ib_multicast(), which is
> > added to the ofed_1_2 cma.c via patch file:
> > kernel_patches/fixes/sean_multicast_1.patch
> >
> > So when I compiled ofed_1_2 on rhel4u4, the cma wouldn't load because
> > there is no ip_ib_mc_map() in rhel4u4.
> >
> > So you need a backport patch for this to work on rhel4u4. Probably many
> > of the older kernels.
>
> I think this breakage is U4 specific. Someone at RH went to the trouble to
> rip all of IB related stuff out of the U4 kernel.
>
> I think just calling ip_tr_mc_map on U4 instead will be enough.
>
I changed cma.c to call ip_tr_mc_map() and I can at least load rdma_cm
now. I didn't test any mcast, but the rdma-cm is working over iwarp...
Steve.
From jlentini at netapp.com Thu Feb 1 12:29:00 2007
From: jlentini at netapp.com (James Lentini)
Date: Thu, 1 Feb 2007 15:29:00 -0500 (EST)
Subject: [openib-general] new IB CM reject reason
In-Reply-To: <45C2480E.2000904@ichips.intel.com>
References: <20070201183922.GB15115@mellanox.co.il>
<000101c74632$85b37bf0$8698070a@amr.corp.intel.com>
<20070201190624.GB6473@mellanox.co.il>
<45C2480E.2000904@ichips.intel.com>
Message-ID:
On Thu, 1 Feb 2007, Sean Hefty wrote:
> > And my claim is that you should define private data format to go with this
> > other reason otherwise you are not really solving the problem.
>
> This is not a consumer issued reject. It is a CM issued reject, so
> the private data is ignored. This is no different than several
> other reject reasons (like invalid service ID). At best we could
> define the ARI, but if we knew what the contents of the ARI should
> be, then we should use a more specific reject reason than 'other'.
Invalid Service ID (8) appears to be an appropriate Reason value for
the case when a REQ is received for a service ID that is not
registered with the CM (either because the application crashed or
exited of its own accord).
I agree that if the reason codes are insufficient we should take this
up in the IBTA.
From or.gerlitz at gmail.com Thu Feb 1 12:40:57 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Thu, 1 Feb 2007 22:40:57 +0200
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <1170325052.2716.229.camel@fc6.xsintricity.com>
References: <1170275331.14294.1.camel@stevo-desktop>
<45C1ABD0.5090404@voltaire.com>
<1170325052.2716.229.camel@fc6.xsintricity.com>
Message-ID: <15ddcffd0702011240l3c427bfcx6fcc7f7968fcf8b9@mail.gmail.com>
On 2/1/07, Doug Ledford wrote:
> On Thu, 2007-02-01 at 10:58 +0200, Or Gerlitz wrote:
> > For a reason that no one at RH can trace... someone went and removed
> > all the support for ARPHRD_INFINIBAND multicast from u4 where it exists
> > perfectly fine in u3 and hopefully on u5 as well (Doug can you update?),
> > see https://bugs.openfabrics.org/show_bug.cgi?id=2661
> Yes. It's been fixed for U5. It wasn't that the patch got removed,
> it's that between U3 and U4 I did a complete rebase, which means that
> all the patches from U3 were tossed out the window and a complete new
> set made for U4. I just missed re-adding this one in U4.
Thanks for fixing this for U5 (which I understand is not out yet, correct?).
As for the importance to us of having IP multicast working fine with
IPoIB over RH4...
do you have an IB setup to test that?
Or.
From swise at opengridcomputing.com Thu Feb 1 13:05:34 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 15:05:34 -0600
Subject: [openib-general] [Fwd: Re: [PATCH 1/10] cxgb3 - main header
files]
In-Reply-To:
References: <1169216896.15842.6.camel@stevo-desktop>
Message-ID: <1170363934.16637.58.camel@stevo-desktop>
On Fri, 2007-01-19 at 09:07 -0800, Roland Dreier wrote:
> > Jeff has pulled in the Chelsio Ethernet driver. If you are ready to
> > merge in the RDMA driver, you can pull it from
>
> Yes, I saw that... OK, I'll get serious about reviewing the RDMA stuff.
Hey Roland,
Have you had a chance to review this?
Thanks,
Steve.
From dledford at redhat.com Thu Feb 1 14:19:21 2007
From: dledford at redhat.com (Doug Ledford)
Date: Thu, 01 Feb 2007 17:19:21 -0500
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <15ddcffd0702011240l3c427bfcx6fcc7f7968fcf8b9@mail.gmail.com>
References: <1170275331.14294.1.camel@stevo-desktop>
<45C1ABD0.5090404@voltaire.com>
<1170325052.2716.229.camel@fc6.xsintricity.com>
<15ddcffd0702011240l3c427bfcx6fcc7f7968fcf8b9@mail.gmail.com>
Message-ID: <1170368361.2716.239.camel@fc6.xsintricity.com>
On Thu, 2007-02-01 at 22:40 +0200, Or Gerlitz wrote:
> On 2/1/07, Doug Ledford wrote:
> > On Thu, 2007-02-01 at 10:58 +0200, Or Gerlitz wrote:
>
> > > For a reason that no one at RH can trace... someone went and removed
> > > all the support for ARPHRD_INFINIBAND multicast from u4 where it exists
> > > perfectly fine in u3 and hopefully on u5 as well (Doug can you update?),
> > > see https://bugs.openfabrics.org/show_bug.cgi?id=2661
>
> > Yes. It's been fixed for U5. It wasn't that the patch got removed,
> > it's that between U3 and U4 I did a complete rebase, which means that
> > all the patches from U3 were tossed out the window and a complete new
> > set made for U4. I just missed re-adding this one in U4.
>
> Thanks for fixing this for U5 (which I understand is not out yet, correct?).
Correct. Although I can get people the packages slated for U5 if they
want to test/check them out.
> As for the importance to us of having IP multicast working fine with
> IPoIB over RH4...
> do you have an IB setup to test that?
Yeah, I've got a setup, I just don't have any multicast tests that I
run. Any test programs you have for multicast in particular would be
helpful.
--
Doug Ledford
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
From mst at mellanox.co.il Thu Feb 1 14:24:05 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Fri, 2 Feb 2007 00:24:05 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <1170360071.16637.39.camel@stevo-desktop>
References: <1170360071.16637.39.camel@stevo-desktop>
Message-ID: <20070201222405.GG17617@mellanox.co.il>
> There's no easy way to tell who asked for notifications. And
> particularly why they asked for notification.
>
> I think we should leave it as-is. If we have problems, we'll fix it.
>
> Or you could put your arp snoop code back in addr.c and address
> translation will not use netevents. But I still think we should leave
> it...
I think the issues need to be addressed in some way.
I think I see another issue with the destructor approach: ib_core could
be unloaded while skb with destructor pointing to our code is still around.
This will lead to nasty crashes without clear backtrace on screen if text
segment memory gets over-written and the destructor gets called afterwards.
It currently seems that invoking the callback function directly rather than
sticking it in skb->destructor is the lesser of evils at this point.
But I'll think all this over, and I'd like to ask you to do this too,
and post some suggestions.
I can think of some more complicated approaches that might work better
for iwarp. Off the top of my head, our netevents implementation could
keep a reference on the skb, start a timer, check the users counter on skb and
call the notifier chain when it drops to 1. Let's sleep on it.
--
MST
From mst at mellanox.co.il Thu Feb 1 14:25:57 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Fri, 2 Feb 2007 00:25:57 +0200
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <1170361183.16637.47.camel@stevo-desktop>
References: <1170361183.16637.47.camel@stevo-desktop>
Message-ID: <20070201222557.GH17617@mellanox.co.il>
> Quoting Steve Wise :
> Subject: Re: ip_ib_mc_map?
>
> On Thu, 2007-02-01 at 11:09 +0200, Michael S. Tsirkin wrote:
> > > Quoting Steve Wise :
> > > Subject: Re: ip_ib_mc_map?
> > >
> > > On Wed, 2007-01-31 at 17:53 -0800, Sean Hefty wrote:
> > > > Steve Wise wrote:
> > > > > Perhaps there's no backport for this to rhel4u4?
> > > >
> > > > I would have thought so, but I really don't know. The function is called from
> > > > net/ipv4/arp.c, and not directly by ipoib. So, I don't know how the backport
> > > > patches typically handle this.
> > > >
> > > > - Sean
> > >
> > > Here's what I see:
> > >
> > > ip_ib_mc_map() is called directly from cma_join_ib_multicast(), which is
> > > added to the ofed_1_2 cma.c via patch file:
> > > kernel_patches/fixes/sean_multicast_1.patch
> > >
> > > So when I compiled ofed_1_2 on rhel4u4, the cma wouldn't load because
> > > there is no ip_ib_mc_map() in rhel4u4.
> > >
> > > So you need a backport patch for this to work on rhel4u4. Probably many
> > > of the older kernels.
> >
> > I think this breakage is U4 specific. Someone at RH went to the trouble to
> > rip all of IB related stuff out of the U4 kernel.
> >
> > I think just calling ip_tr_mc_map on U4 instead will be enough.
> >
>
> I changed cma.c to call ip_tr_mc_map() and I can at least load rdma_cm
> now. I didn't test any mcast, but the rdma-cm is working over iwarp...
So this could be a macro in kernel_addons, unless someone from
Voltaire is willing to step up with a more elaborate implementation.
--
MST
From swise at opengridcomputing.com Thu Feb 1 14:41:56 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 16:41:56 -0600
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070201222405.GG17617@mellanox.co.il>
References: <1170360071.16637.39.camel@stevo-desktop>
<20070201222405.GG17617@mellanox.co.il>
Message-ID: <1170369716.16637.69.camel@stevo-desktop>
On Fri, 2007-02-02 at 00:24 +0200, Michael S. Tsirkin wrote:
> > There's no easy way to tell who asked for notifications. And
> > particularly why they asked for notification.
> >
> > I think we should leave it as-is. If we have problems, we'll fix it.
> >
> > Or you could put your arp snoop code back in addr.c and address
> > translation will not use netevents. But I still think we should leave
> > it...
>
> I think the issues need to be addressed in some way.
>
> I think I see another issue with the destructor approach: ib_core could
> be unloaded while skb with destructor pointing to our code is still around.
> This will lead to nasty crashes without clear backtrace on screen if text
> segment memory gets over-written and the destructor gets called afterwards.
>
Yes...hmm... We could reference the module in the snoop function and
deref it in the destructor function.
> It currently seems that invoking the callback function directly rather than
> sticking it in skb->destructor is the lesser of evils at this point.
> But I'll think all this over, and I'd like to ask you to do this too,
> and post some suggestions.
>
Ok.
> I can think of some more complicated approaches that might work better
> for iwarp. Off the top of my head, our netevents implementation could
> keep a reference on the skb, start a timer, check the users counter on skb and
> call the notifier chain when it drops to 1. Let's sleep on it.
>
Ok. I'll ponder it some more. But we could solve the module unload
issue via module refs methinks.
Steve.
From mst at mellanox.co.il Thu Feb 1 14:43:04 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Fri, 2 Feb 2007 00:43:04 +0200
Subject: [openib-general] new IB CM reject reason
In-Reply-To: <45C2480E.2000904@ichips.intel.com>
References: <20070201183922.GB15115@mellanox.co.il>
<000101c74632$85b37bf0$8698070a@amr.corp.intel.com>
<20070201190624.GB6473@mellanox.co.il>
<45C2480E.2000904@ichips.intel.com>
Message-ID: <20070201224304.GI17617@mellanox.co.il>
> Quoting Sean Hefty :
> Subject: Re: new IB CM reject reason
>
> > And my claim is that you should define private data format to go with this
> > other reason otherwise you are not really solving the problem.
>
> This is not a consumer issued reject. It is a CM issued reject, so the private
> data is ignored. This is no different than several other reject reasons (like
> invalid service ID). At best we could define the ARI, but if we knew what the
> contents of the ARI should be, then we should use a more specific reject reason
> than 'other'.
I still don't really buy this, and I think you don't see my point.
The difference between ib_cm module and consumer is an artificial one -
the consumer just uses ib_cm as a convenience module. In particular, as a
feature, he gets automatic REJ generation when CM ID is destroyed.
In this case private data is all 0s.
So a custom protocol on top of ib_cm module that has its own consumer rejects for
some reason, would be wise to put something other than all 0s in its private
data if it wants to differentiate between the two kinds of consumer reject.
Most likely no one cares much about reject reasons so all this is
unnecessary.
But adding "other" reason just moves the problem up one level -
what if the actual consumer is using some library on top of CM?
Consider for example cma. It might generate rejects on its own too.
So now, there is cm, cma as a cm consumer, and the cma consumer.
So do we need yet another reject reason for cma generated rejects?
Do you see my point now?
--
MST
From mst at mellanox.co.il Thu Feb 1 14:48:41 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Fri, 2 Feb 2007 00:48:41 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <1170369716.16637.69.camel@stevo-desktop>
References: <1170369716.16637.69.camel@stevo-desktop>
Message-ID: <20070201224841.GJ17617@mellanox.co.il>
> > I can think of some more complicated approaches that might work better
> > for iwarp. Off the top of my head, our netevents implementation could
> > keep a reference on the skb, start a timer, check the users counter on skb and
> > call the notifier chain when it drops to 1. Let's sleep on it.
> >
>
> Ok. I'll ponder it some more. But we could solve the module unload
> issue via module refs methinks.
This almost never works cleanly - module can't reference itself
without races: module can get unloaded after it drops the reference
to itself and before the function exits.
But I agree such a race is mostly theoretical.
And we still have the case where destructor != NULL.
Certainly something to think about.
--
MST
From mst at mellanox.co.il Thu Feb 1 14:57:54 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Fri, 2 Feb 2007 00:57:54 +0200
Subject: [openib-general] new IB CM reject reason
In-Reply-To:
References:
Message-ID: <20070201225754.GK17617@mellanox.co.il>
> Invalid Service ID (8) appears to be an appropriate Reason value for
> the case when a REQ is received for a service ID that is not
> registered with the CM (either because the application crashed or
> exited of its own accord).
No, we are actually speaking about reject to generate when application
cancels the communication establishment (e.g. by exiting), not as a response
to any CM message.
--
MST
From or.gerlitz at gmail.com Thu Feb 1 15:18:26 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Fri, 2 Feb 2007 01:18:26 +0200
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <1170368361.2716.239.camel@fc6.xsintricity.com>
References: <1170275331.14294.1.camel@stevo-desktop>
<45C1ABD0.5090404@voltaire.com>
<1170325052.2716.229.camel@fc6.xsintricity.com>
<15ddcffd0702011240l3c427bfcx6fcc7f7968fcf8b9@mail.gmail.com>
<1170368361.2716.239.camel@fc6.xsintricity.com>
Message-ID: <15ddcffd0702011518qf115aaey862ef168784e81ca@mail.gmail.com>
On 2/2/07, Doug Ledford wrote:
> > As for the importance to us of having IP multicast working fine with
> > IPoIB over RH4...
> > do you have an IB setup to test that?
>
> Yeah, I've got a setup, I just don't have any multicast tests that I
> run. Any test programs you have for multicast in particular would be
> helpful.
This is fairly simple to do: have some multicast traffic routed over
an IPoIB subnet on two nodes, eg using
$ route add -net 224.0.0.0 netmask 255.0.0.0 dev ib0
and then
server
$ iperf -usB 224.5.5.5 -i 1
client
$ iperf -uc 224.5.5.5 -l 100 -b 50M -t 30 -i 1
Or.
From swise at opengridcomputing.com Thu Feb 1 15:23:37 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 17:23:37 -0600
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070201224841.GJ17617@mellanox.co.il>
References: <1170369716.16637.69.camel@stevo-desktop>
<20070201224841.GJ17617@mellanox.co.il>
Message-ID: <1170372217.16637.87.camel@stevo-desktop>
On Fri, 2007-02-02 at 00:48 +0200, Michael S. Tsirkin wrote:
> > > I can think of some more complicated approaches that might work better
> > > for iwarp. Off the top of my head, our netevents implementation could
> > > keep a reference on the skb, start a timer, check the users counter on skb and
> > > call the notifier chain when it drops to 1. Let's sleep on it.
> > >
Remembering which skbs to check later requires more complication. Here
is one method to handle this and do what you suggest above.
In the snoop function:
Clone the skb and save the original skb ptr in the new skb->cb area.
This area is ours to use on a freshly cloned skbuff. Add this new skb
ptr to a linked list of outstanding netevents to be processed later.
Don't free the original skb passed in. This keeps the reference on it
like you proposed above. Schedule a delayed work handler for a few
ticks in the future.
In the delayed work handler:
Walk the pending netevents skb list. For each pending skb, get the
original skb ptr from the cloned skb->cb area, and if the user count is
now 1 then do the current destructor() logic, remove the skb from the
pending list, and free both skbs. If the list is not empty reschedule
the delayed work handler for a few ticks later.
In the module unload function:
cancel any delayed work handling
walk the pending list and free the skbs and the original snooped skbs.
This solves the destructor issue and the rmmod issue, but is more
complicated. If you're worried about regressing straight rdma address
translation, then you can call the address translation timer function
synchronously in the snoop function like before and change the
addr_trans module to not use netevents...
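To make the bookkeeping concrete, here is a rough userspace model of the scheme in plain C. It is only a sketch under loud assumptions: `struct fake_skb` stands in for the real sk_buff and its users count, a small tracking struct is used instead of a cloned skb, and the timer or delayed work is replaced by calling `run_pending()` by hand. None of these names exist in the kernel.

```c
#include <stdlib.h>

/* Stand-in for sk_buff: only the reference count matters here. */
struct fake_skb {
    int users;
};

/* One entry per snooped skb that still awaits its netevent. */
struct pending {
    struct fake_skb *skb;
    struct pending *next;
};

static struct pending *pending_head;
static int notified;

/* Example callback: just counts how many events were delivered. */
static void count_notify(struct fake_skb *skb)
{
    (void)skb;
    notified++;
}

/* Snoop: take our own reference on the skb and queue it for later. */
static int snoop(struct fake_skb *skb)
{
    struct pending *p = malloc(sizeof(*p));

    if (!p)
        return -1;
    skb->users++;
    p->skb = skb;
    p->next = pending_head;
    pending_head = p;
    return 0;
}

/* Deferred handler: deliver the event for every skb whose only
 * remaining reference is ours, then drop that reference.
 * Returns the number of entries still pending, so the caller knows
 * whether to reschedule itself. */
static int run_pending(void (*notify)(struct fake_skb *))
{
    struct pending **pp = &pending_head;
    int left = 0;

    while (*pp) {
        struct pending *p = *pp;

        if (p->skb->users == 1) {
            notify(p->skb);
            p->skb->users--;
            *pp = p->next;
            free(p);
        } else {
            pp = &p->next;
            left++;
        }
    }
    return left;
}

/* Unload path: drop every reference without delivering events. */
static void flush_pending(void)
{
    while (pending_head) {
        struct pending *p = pending_head;

        pending_head = p->next;
        p->skb->users--;
        free(p);
    }
}
```

In the real thing the list would need locking against the snoop path, and the final reference drop would be a kfree_skb() rather than a bare decrement.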
Steve.
From mst at mellanox.co.il Thu Feb 1 15:33:18 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Fri, 2 Feb 2007 01:33:18 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <1170372217.16637.87.camel@stevo-desktop>
References: <1170372217.16637.87.camel@stevo-desktop>
Message-ID: <20070201233318.GO17617@mellanox.co.il>
> Quoting Steve Wise :
> Subject: Re: [PATCH 00/12] ofed_1_2 - Neighbour update support
>
> On Fri, 2007-02-02 at 00:48 +0200, Michael S. Tsirkin wrote:
> > > > I can think of some more complicated approaches that might work better
> > > > for iwarp. Off the top of my head, our netevents implementation could
> > > > keep a reference on the skb, start a timer, check the users counter on skb and
> > > > call the notifier chain when it drops to 1. Let's sleep on it.
> > > >
>
> Remembering which skbs to check later requires more complication. Here
> is one method to handle this and do what you suggest above.
>
> In the snoop function:
>
> Clone the skb and save the original skb ptr in the new skb->cb area.
> This area is ours to use on a freshly cloned skbuff. Add this new skb
> ptr to a linked list of outstanding netevents to be processed later.
> Don't free the original skb passed in. This keeps the reference on it
> like you proposed above. Schedule a delayed work handler for a few
> ticks in the future.
>
> In the delayed work handler:
>
> Walk the pending netevents skb list. For each pending skb, get the
> original skb ptr from the cloned skb->cb area, and if the user count is
> now 1 then do the current destructor() logic, remove the skb from the
> pending list, and free both skbs. If the list is not empty reschedule
> the delayed work handler for a few ticks later.
>
> In the module unload function:
>
> cancel any delayed work handling
> walk the pending list and free the skbs and the original snooped skbs.
>
> This solves the destructor issue and the rmmod issue, but is more
> complicated. If you're worried about regressing straight rdma address
> translation, then you can call the address translation timer function
> synchronously in the snoop function like before and change the
> addr_trans module to not use netevents...
Yes, this is what I proposed above. It does all sound quite complicated.
Some notes:
- you don't need an skb just to keep a void*. Create your own
structure for this.
- better use a timer than a workqueue - you are calling netevents
from atomic context on new kernels anyway.
So maybe destructor with module ref counting is better.
Dunno.
--
MST
From swise at opengridcomputing.com Thu Feb 1 15:50:27 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Thu, 01 Feb 2007 17:50:27 -0600
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070201233318.GO17617@mellanox.co.il>
References: <1170372217.16637.87.camel@stevo-desktop>
<20070201233318.GO17617@mellanox.co.il>
Message-ID: <1170373827.16637.92.camel@stevo-desktop>
On Fri, 2007-02-02 at 01:33 +0200, Michael S. Tsirkin wrote:
> > Quoting Steve Wise :
> > Subject: Re: [PATCH 00/12] ofed_1_2 - Neighbour update support
> >
> > On Fri, 2007-02-02 at 00:48 +0200, Michael S. Tsirkin wrote:
> > > > > I can think of some more complicated approaches that might work better
> > > > > for iwarp. Off the top of my head, our netevents implementation could
> > > > > keep a reference on the skb, start a timer, check the users counter on skb and
> > > > > call the notifier chain when it drops to 1. Let's sleep on it.
> > > > >
> >
> > Remembering which skbs to check later requires more complication. Here
> > is one method to handle this and do what you suggest above.
> >
> > In the snoop function:
> >
> > Clone the skb and save the original skb ptr in the new skb->cb area.
> > This area is ours to use on a freshly cloned skbuff. Add this new skb
> > ptr to a linked list of outstanding netevents to be processed later.
> > Don't free the original skb passed in. This keeps the reference on it
> > like you proposed above. Schedule a delayed work handler for a few
> > ticks in the future.
> >
> > In the delayed work handler:
> >
> > Walk the pending netevents skb list. For each pending skb, get the
> > original skb ptr from the cloned skb->cb area, and if the user count is
> > now 1 then do the current destructor() logic, remove the skb from the
> > pending list, and free both skbs. If the list is not empty reschedule
> > the delayed work handler for a few ticks later.
> >
> > In the module unload function:
> >
> > cancel any delayed work handling
> > walk the pending list and free the skbs and the original snooped skbs.
> >
> > This solves the destructor issue and the rmmod issue, but is more
> > complicated. If you're worried about regressing straight rdma address
> > translation, then you can call the address translation timer function
> > synchronously in the snoop function like before and change the
> > addr_trans module to not use netevents...
>
>
> Yes, this is what I proposed above. It does all sound quite complicated.
> Some notes:
> - you don't need an skb just to keep a void*. Create your own
> structure for this.
> - better use a timer than a workqueue - you are calling netevents
> from atomic context on new kernels anyway.
>
> So maybe destructor with module ref counting is better.
> Dunno.
We could use a global refcnt to count the number of pending destructions
and use a completion object to block unload until all the destructors
fire and the refcnt goes to zero.
From rdreier at cisco.com Thu Feb 1 20:45:11 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 01 Feb 2007 20:45:11 -0800
Subject: [openib-general] ipath and current git woes
In-Reply-To: <20070201002202.GA12386@obsidianresearch.com> (Jason
Gunthorpe's message of "Wed, 31 Jan 2007 17:22:02 -0700")
References: <20070201002202.GA12386@obsidianresearch.com>
Message-ID:
> After applying that patch the user space consumers load but we got a
> kernel oops when we tried to run a test here :<
>
> Unable to handle kernel NULL pointer dereference at 0000000000000918 RIP:
> [] :ib_ipath:ipath_mmap+0x37/0x95
So I had a look at this, and it seems that there are two bugs that
lead to this.
First of all, libipathverbs gets a response from the kernel that has a
64-bit kernel address in it, and passes that back into a call to
mmap(), where it uses that address as the offset. On 32-bit
userspace, that chops off the high bits of the address and so the
ipath kernel driver can't find the address in its list.
So that explains why things don't work. And unfortunately the obvious
fix for libipathverbs to use mmap64() instead of mmap() doesn't work,
because on Linux, mmap64() is implemented with the mmap2 system call,
which just allows the offset to be 12 bits bigger -- so it only gets
you to 44 bits, which is not enough to reach a 64-bit kernel address
(which is typically something like 0xffffc20000072000). So you
probably want to use something like a 32-bit serial number to point at
your buffers or something like that.
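The arithmetic can be checked with a few lines of plain C. This is just a sketch of the truncation, not code from libipathverbs or the driver: the function names are invented, and it assumes mmap2 counts the offset in 4096-byte units held in a 32-bit value, which is where the 32 + 12 = 44 bits come from.

```c
#include <stdint.h>

/* What a 32-bit mmap() passes down: off_t is 32 bits wide, so the
 * upper half of a 64-bit kernel address is lost. */
static uint64_t offset_via_mmap32(uint64_t kaddr)
{
    return (uint32_t)kaddr;
}

/* What mmap64()/mmap2 passes down: a 32-bit page offset in
 * 4096-byte units (assumed), so at most 44 address bits survive. */
static uint64_t offset_via_mmap2(uint64_t kaddr)
{
    uint32_t pgoff = (uint32_t)(kaddr >> 12);

    return (uint64_t)pgoff << 12;
}

/* Non-zero if `kaddr` comes out of `path` unchanged. */
static int survives(uint64_t kaddr, uint64_t (*path)(uint64_t))
{
    return path(kaddr) == kaddr;
}
```

With the example address above, 0xffffc20000072000 comes back as 0x72000 through the 32-bit path and as 0x20000072000 through the mmap2 path, so neither value matches what the kernel put on its list.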
The oops is caused by another more serious problem. Obviously a buggy
libipathverbs shouldn't be able to crash the kernel, because even if
libipathverbs is fixed then malicious userspace could do the same
thing too.
It turns out that all the handling of pending_mmaps in the ipath
driver is not really careful about userspace screwing it up. When
userspace creates a CQ, the CQ buffer is added to the device-wide list
of pending mmaps. Of course 32-bit userspace never succeeds in
mapping that CQ, so it stays on the list (the only way it gets removed
is if it is successfully mmapped). But then the destroy CQ operation
sees that the mmap is pending, and frees the structure holding the
information (without removing it from the list). And of course when
that memory gets reused, then the pending mmap list gets corrupted,
etc etc.
Of course this is ugly to fix with the current data structure -- the
list of pending mmaps is singly-linked, which means I have to walk the
whole list to delete an entry. It also makes the list walking in
ipath_mmap() unnecessarily obfuscated. I think it's much better to
just use the standard kernel list_head stuff if you're going to delete
things from the middle of the list, rather than implementing your own
singly-linked list. Sure it costs an extra pointer in each entry but
no one ever has to worry about whether you're deleting things
correctly, etc.
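As an illustration of the difference, here is a userspace imitation of the list_head idea in plain C. It is only a sketch: `struct list_node`, `struct pending_mmap` and the helper names are made up for this example and are not the kernel's list.h API or the ipath structures, but the shape is the same, with each entry carrying both pointers so it can be unlinked without walking the list.

```c
#include <stddef.h>

/* Circular doubly-linked node, in the spirit of the kernel's list_head. */
struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink in O(1): no list walk, and no need to know the head. */
static void list_del_node(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n;
    n->prev = n;
}

/* Hypothetical pending-mmap record, keyed by the offset handed to
 * userspace. */
struct pending_mmap {
    struct list_node link;
    unsigned long key;
};

/* Lookup, as the mmap handler would do it. */
static struct pending_mmap *pending_find(struct list_node *head,
                                         unsigned long key)
{
    struct list_node *p;

    for (p = head->next; p != head; p = p->next) {
        struct pending_mmap *e = (struct pending_mmap *)
            ((char *)p - offsetof(struct pending_mmap, link));

        if (e->key == key)
            return e;
    }
    return NULL;
}
```

With this layout the destroy path can simply unlink the entry before freeing it, whether or not the mmap ever happened.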
There's some other silly stuff I noticed too, like:
grep -n mmap_cnt *.[ch] /dev/null
ipath_cq.c:232: ip->mmap_cnt = 0;
ipath_mmap.c:63: ip->mmap_cnt++;
ipath_mmap.c:70: ip->mmap_cnt--;
ipath_qp.c:837: ip->mmap_cnt = 0;
ipath_srq.c:162: ip->mmap_cnt = 0;
ipath_verbs.h:178: unsigned mmap_cnt;
umm -- no one ever looks at mmap_cnt (there's a kref too), so why keep
it at all?
So Qlogic guys -- please fix this up!
- R.
From rdreier at cisco.com Thu Feb 1 20:47:10 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 01 Feb 2007 20:47:10 -0800
Subject: [openib-general] IPoIB CM for merge?
In-Reply-To: <20070201192924.GE17617@mellanox.co.il> (Michael S.
Tsirkin's message of "Thu, 1 Feb 2007 21:29:24 +0200")
References: <20070201192924.GE17617@mellanox.co.il>
Message-ID:
> Could you please spend some time reviewing IPoIB CM code?
> I am concerned about missing the 2.6.21 merge window.
Thanks for the reminder.
Can we trade? Have you looked at the cxgb3 iwarp driver? Any comments?
- R.
From rdreier at cisco.com Thu Feb 1 20:48:13 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Thu, 01 Feb 2007 20:48:13 -0800
Subject: [openib-general] [Fwd: Re: [PATCH 1/10] cxgb3 - main header
files]
In-Reply-To: <1170363934.16637.58.camel@stevo-desktop> (Steve Wise's
message of "Thu, 01 Feb 2007 15:05:34 -0600")
References: <1169216896.15842.6.camel@stevo-desktop>
<1170363934.16637.58.camel@stevo-desktop>
Message-ID:
> Have you had a chance to review this?
Still on my list.
Can we trade? Can you look at the IPoIB connected mode stuff in the
ipoib-cm branch in
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git
and let me know if you see anything you don't like?
- R.
From mike.heffner at evergrid.com Thu Feb 1 21:10:09 2007
From: mike.heffner at evergrid.com (Mike Heffner)
Date: Fri, 02 Feb 2007 00:10:09 -0500
Subject: [openib-general] Detecting when an RDMA writer process disappears
Message-ID: <45C2C7B1.7090204@evergrid.com>
Is there any method by which a receiving process that is polling in
preregistered memory regions for data from a sender performing RDMA
writes, can detect if the sender is killed? Say by a SIGKILL signal? The
RC connection is setup using the RDMA CM and there do not appear to be
any CM events created on the event channel, nor does there appear to be
any async. events created. Occasionally I will get a CQE failure on the
QP, depending on where the communication flow is, that I can use to mark
the connection failed, but this happens only about 50% of the time.
An alternative solution would be periodically sending "keep-alives" and
detecting the CQE failure, but I'd be interested to know if there are
any other options that don't require sending keep-alives.
Thanks,
Mike
--
Mike Heffner
EverGrid Software
Blacksburg, VA USA
Voice: (540) 443-3500 #603
From jgunthorpe at obsidianresearch.com Thu Feb 1 21:25:03 2007
From: jgunthorpe at obsidianresearch.com (Jason Gunthorpe)
Date: Thu, 1 Feb 2007 22:25:03 -0700
Subject: [openib-general] ipath and current git woes
In-Reply-To: <45C13771.2070406@qlogic.com>
References: <20070201002202.GA12386@obsidianresearch.com>
<45C13771.2070406@qlogic.com>
Message-ID: <20070202052503.GA19654@obsidianresearch.com>
On Wed, Jan 31, 2007 at 04:42:25PM -0800, Robert Walsh wrote:
> Jason Gunthorpe wrote:
> >Has anyone been able to use ipath with the current latest git
> >everything?
>
> We're working on getting this up to date right now. Give us a couple of
> days and we'll have some new patches ready.
OK. Things are working ok here using the same kernel and a 64 bit OFED
1.1 user space built in a chroot. That makes sense after reading
Roland's analysis...
Thanks,
Jason
From eitan at sw053.yok.mtl.com Thu Feb 1 21:40:43 2007
From: eitan at sw053.yok.mtl.com (Eitan Zahavi)
Date: Fri, 2 Feb 2007 07:40:43 +0200
Subject: [openib-general] nightly osm_sim report 2007-02-02:normal completion
Message-ID: <200702020540.l125ehia022501@sw053.yok.mtl.com>
OSM Simulation Regression Summary
OpenSM rev = Thu_Feb_1_10:25:31_2007 b8cdb7
ibutils rev = Wed_Jan_3_11:42:12_2007 913448
Total=410 Pass=409 Fail=1
Pass:
30 Stability IS1-16.topo
30 Pkey IS1-16.topo
30 OsmTest IS1-16.topo
30 Multicast IS1-16.topo
30 LidMgr IS1-16.topo
29 OsmStress IS1-16.topo
10 Stability IS3-loop.topo
10 Stability IS3-128.topo
10 Pkey IS3-128.topo
10 OsmTest IS3-loop.topo
10 OsmTest IS3-128.topo
10 OsmStress IS3-128.topo
10 Multicast IS3-loop.topo
10 Multicast IS3-128.topo
10 LidMgr IS3-128.topo
10 FatTree part-4-ary-3-tree.topo
10 FatTree merge-roots-reorder-4-ary-2-tree.topo
10 FatTree merge-roots-4-ary-2-tree.topo
10 FatTree merge-root-4-ary-3-tree.topo
10 FatTree merge-root-12-ary-2-tree.topo
10 FatTree merge-2-ary-4-tree.topo
10 FatTree half-4-ary-3-tree.topo
10 FatTree blend-4-ary-2-tree.topo
10 FatTree 4-ary-4-tree.topo
10 FatTree 4-ary-3-tree.topo
10 FatTree 32nodes-3lvl-is1.topo
10 FatTree 2-ary-4-tree.topo
10 FatTree 12-node-spaced.topo
10 FatTree 12-ary-2-tree.topo
Failures:
1 OsmStress IS1-16.topo
From mst at mellanox.co.il Thu Feb 1 22:03:22 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Fri, 2 Feb 2007 08:03:22 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <1170373827.16637.92.camel@stevo-desktop>
References: <1170373827.16637.92.camel@stevo-desktop>
Message-ID: <20070202060228.GQ17617@mellanox.co.il>
> We could use a global refcnt to count the number of pending destructions
> and use a completion object to block unload until all the destructors
> fire and the refcnt goes to zero.
It has the same race as module refcnt. So just use that.
--
MST
From bugzilla-daemon at lists.openfabrics.org Thu Feb 1 22:16:04 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Thu, 1 Feb 2007 22:16:04 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070202061604.ECEC7E607F9@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #11 from dmitry.yulov at intel.com 2007-02-01 22:16 -------
(In reply to comment #10)
> What is the output of uname -r ? This is VERY important. Also, can you run
> `cat /etc/issue` and send the results?
As you can see in my first message, I already gave my machine configuration:
>The machine configuration:
>Kernel: Linux 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
>OS: SUSE Linux Enterprise Server 10 (x86_64)
>gcc version: gcc (GCC) 4.1.0 (SUSE Linux)
Unfortunately, /etc/issue on my machine does not contain the Linux version,
because IT requirements do not allow it. I looked at the ofed_scripts/configure
file and saw that configure needs /etc/issue to choose the right patches. I
think that is not a good idea: configure should first run
cat /etc/*release* and look for the Linux version there, and only after that
check /etc/issue (if necessary).
From mst at mellanox.co.il Thu Feb 1 22:56:14 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Fri, 2 Feb 2007 08:56:14 +0200
Subject: [openib-general] IPoIB CM for merge?
In-Reply-To:
References:
Message-ID: <20070202065547.GS17617@mellanox.co.il>
> Quoting Roland Dreier :
> Subject: Re: IPoIB CM for merge?
>
> > Could you please spend some time reviewing IPoIB CM code?
> > I am concerned about missing the 2.6.21 merge window.
>
> Thanks for the reminder.
>
> Can we trade? Have you looked at the cxgb3 iwarp driver? Any comments?
I haven't yet, sorry. OK.
I am not sure I have the last version posted so I am going to go by what
is there in OFED git tree.
And I also only looked under drivers/infiniband/.
So, here are some questions: I looked in the archives and have not seen
these addressed. Maybe these can be answered and then I'll go from there?
Does this sound OK?
Files with names like
./core/cxio_hal.c
./core/cxio_hal.h
normally generate a fair bit of discussion which wasn't present here,
I did not guess everyone was just busy.
For example, why is there both struct iwch_cq and struct t3_cq?
File tcb.h comment says:
/* This file is automatically generated --- do not edit */
This looks like a GPL violation, does it not?
What's the deal with the naming convention?
Is there a reason in cxgb3, some files start with iwch and some with cxio?
How about using cxgb3 prefix all over?
--
MST
From philippe.gregoire at cea.fr Fri Feb 2 02:10:16 2007
From: philippe.gregoire at cea.fr (Philippe Gregoire)
Date: Fri, 02 Feb 2007 11:10:16 +0100
Subject: [openib-general] dry-run mode for opensm ?
Message-ID: <45C30E08.1030502@cea.fr>
Hal
Is there any way to run opensm in a dry-run mode
just to make it dump the route tables it will generate ?
We already have an embedded SM and I would like to compare the
current route tables with those that OpenSM would generate.
Thanks
Philippe
From vlad at lists.openfabrics.org Fri Feb 2 02:20:43 2007
From: vlad at lists.openfabrics.org (vlad at lists.openfabrics.org)
Date: Fri, 2 Feb 2007 02:20:43 -0800 (PST)
Subject: [openib-general] ofa_1_2_kernel 20070202-0200 daily build status
Message-ID: <20070202102043.4FA07E607F9@openfabrics.org>
This email was generated automatically, please do not reply
Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod --with-addr_trans-mod --with-cxgb3-mod
Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.14
Passed on powerpc with linux-2.6.19
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.18
Passed on powerpc with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on x86_64 with linux-2.6.13
Passed on powerpc with linux-2.6.17
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.16
Passed on ppc64 with linux-2.6.17
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.18
Passed on powerpc with linux-2.6.12
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.15
Failed:
Build failed on ia64 with linux-2.6.16.21-0.8-default
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function ‘register_netevent_notifier’
/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function ‘addr_cleanup’:
/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function ‘unregister_netevent_notifier’
make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
From mst at mellanox.co.il Fri Feb 2 03:15:32 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Fri, 2 Feb 2007 13:15:32 +0200
Subject: [openib-general] IPoIB CM for merge?
In-Reply-To:
References:
Message-ID: <20070202111532.GT17617@mellanox.co.il>
> Quoting Roland Dreier :
> Subject: Re: IPoIB CM for merge?
>
> > Could you please spend some time reviewing IPoIB CM code?
> > I am concerned about missing the 2.6.21 merge window.
>
> Thanks for the reminder.
>
> Can we trade? Have you looked at the cxgb3 iwarp driver? Any comments?
OK.
I am not sure I have the last version posted so I am going to go by what
is there in OFED git tree.
And I also only looked under drivers/infiniband/.
So, here are some questions: I looked in the archives and have not seen
these addressed. Maybe these can be answered and then I'll go from there?
Does this sound OK?
Files with names like
./core/cxio_hal.c
./core/cxio_hal.h
normally generate a fair bit of discussion which wasn't present here,
I did not guess everyone was just busy.
For example, why is there both struct iwch_cq and struct t3_cq?
File tcb.h comment says:
/* This file is automatically generated --- do not edit */
This looks like a GPL violation, does it not?
What's the deal with the naming convention?
Is there a reason in cxgb3, some files start with iwch and some with cxio?
How about using cxgb3 prefix all over?
--
MST
From bugzilla-daemon at lists.openfabrics.org Fri Feb 2 03:42:54 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Fri, 2 Feb 2007 03:42:54 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070202114254.39BAAE607F9@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #12 from dmitry.yulov at intel.com 2007-02-02 03:42 -------
Created an attachment (id=74)
--> (https://bugs.openfabrics.org/attachment.cgi?id=74&action=view)
Patch for ofed_scripts/configure
I have added a patch file for configure in my case.
From bugzilla-daemon at lists.openfabrics.org Fri Feb 2 03:56:43 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Fri, 2 Feb 2007 03:56:43 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070202115643.76DD4E607F9@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #13 from dmitry.yulov at intel.com 2007-02-02 03:56 -------
How can I apply the patch during a build.sh run? As far as I know, when I run
build.sh, my patched files are always overwritten by running
rpm -i openib-1.1.src.rpm. Is there a way to apply my patches, or do I need to
wait for a new release?
From halr at voltaire.com Fri Feb 2 06:31:36 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 02 Feb 2007 09:31:36 -0500
Subject: [openib-general] dry-run mode for opensm ?
In-Reply-To: <45C30E08.1030502@cea.fr>
References: <45C30E08.1030502@cea.fr>
Message-ID: <1170426648.15660.351722.camel@hal.voltaire.com>
Hi Philippe,
On Fri, 2007-02-02 at 05:10, Philippe Gregoire wrote:
> Hal
> Is there any way to run opensm in a dry-run mode
> just to make it dump the route tables it will generate ?
Not that I'm aware of.
> We already have an embedded SM and I would like to compare the
> current route tables with those that OpenSM would generate.
There are two options here from what I know:
1. Turn off the embedded SM temporarily and run OpenSM (in one of its
various routing modes)
2. Get your topology into a simulator and run OpenSM on it
BTW, there are scripts which will work with any SM to dump the routing
tables (dump_lfts/mgfts.sh) if that is how you are doing the comparison.
-- Hal
> Thanks
> Philippe
From swise at opengridcomputing.com Fri Feb 2 07:18:24 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 02 Feb 2007 09:18:24 -0600
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <20070202060228.GQ17617@mellanox.co.il>
References: <1170373827.16637.92.camel@stevo-desktop>
<20070202060228.GQ17617@mellanox.co.il>
Message-ID: <1170429504.26115.1.camel@stevo-desktop>
On Fri, 2007-02-02 at 08:03 +0200, Michael S. Tsirkin wrote:
> > We could use a global refcnt to count the number of pending destructions
> > and use a completion object to block unload until all the destructors
> > fire and the refcnt goes to zero.
>
> It has the same race as module refcnt. So just use that.
>
I don't understand the race. Can you explain please? This should be
able to be done without a race with a refcnt, a spinlock, a bit saying
we're unloading, and a completion object.
But maybe I'm confused ;-)
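The bookkeeping listed above (a refcnt, a bit saying we're unloading, and a completion) can be modelled as a few state transitions. This is a single-threaded sketch with hypothetical names (`unload_gate`, `gate_get`, `gate_put`, `gate_begin_unload`); in a kernel module each function body would run under one spinlock and `done` would be a completion object. It only illustrates the counting, and does not settle whether module code can still be executing after the final put, which may be the race being discussed.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for pending destructors. Plain fields stand
 * in for state that a real module would protect with a spinlock. */
struct unload_gate {
    int  pending;     /* destructors scheduled but not yet run */
    bool unloading;   /* set once module unload has begun */
    bool done;        /* models the completion: unload may proceed */
};

/* Schedule a destructor; refused once unload has started. */
static bool gate_get(struct unload_gate *g)
{
    if (g->unloading)
        return false;
    g->pending++;
    return true;
}

/* A destructor has fired; the last one after unload began signals done. */
static void gate_put(struct unload_gate *g)
{
    if (--g->pending == 0 && g->unloading)
        g->done = true;
}

/* Begin unload; returns true if unload may proceed without waiting. */
static bool gate_begin_unload(struct unload_gate *g)
{
    g->unloading = true;
    if (g->pending == 0)
        g->done = true;
    return g->done;
}
```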
From swise at opengridcomputing.com Fri Feb 2 07:28:59 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 02 Feb 2007 09:28:59 -0600
Subject: [openib-general] [Fwd: Re: [PATCH 1/10] cxgb3 - main header
files]
In-Reply-To:
References: <1169216896.15842.6.camel@stevo-desktop>
<1170363934.16637.58.camel@stevo-desktop>
Message-ID: <1170430139.26115.9.camel@stevo-desktop>
On Thu, 2007-02-01 at 20:48 -0800, Roland Dreier wrote:
> > Have you had a chance to review this?
>
> Still on my list.
>
> Can we trade? Can you look at the IPoIB connected mode stuff in the
> ipoib-cm branch in
>
> git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git
>
> and let me know if you see anything you don't like?
>
> - R.
Ok. I'll review the IPoIB connected mode code.
Steve.
From halr at voltaire.com Fri Feb 2 07:28:06 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 02 Feb 2007 10:28:06 -0500
Subject: [openib-general] components that have not opend the ofed_1_2
branch
In-Reply-To: <45C209EA.1040207@mellanox.co.il>
References: <45C209EA.1040207@mellanox.co.il>
Message-ID: <1170430064.15660.354336.camel@hal.voltaire.com>
On Thu, 2007-02-01 at 10:40, Tziporet Koren wrote:
> The following components have not opened ofed_1_2 branch:
>
> * libibverbs - Roland
> * libmthca - Roland
> * libipathverbs - Bryan
> * tvflash - Roland
> * srptools - Ishai
> * management - Hal
>
>
> Please open the branch today or tomorrow at the latest .
Done; just created the ofed_1_2 branch for management.
-- Hal
> Thanks,
> Tziporet
From swise at opengridcomputing.com Fri Feb 2 07:41:09 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 02 Feb 2007 09:41:09 -0600
Subject: [openib-general] ofa_1_2_kernel 20070202-0200 daily build status
In-Reply-To: <20070202102043.4FA07E607F9@openfabrics.org>
References: <20070202102043.4FA07E607F9@openfabrics.org>
Message-ID: <1170430869.26115.12.camel@stevo-desktop>
On Fri, 2007-02-02 at 02:20 -0800, vlad at lists.openfabrics.org wrote:
> This email was generated automatically, please do not reply
Which distro is 2.6.16.21-0.8-default? I'm sure I didn't do a netevent
backport for that.
> Failed:
> Build failed on ia64 with linux-2.6.16.21-0.8-default
> Log:
> /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function ‘register_netevent_notifier’
> /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function ‘addr_cleanup’:
> /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function ‘unregister_netevent_notifier’
> make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
> make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
> make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
> make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
> make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
> make: *** [kernel] Error 2
> ----------------------------------------------------------------------------------
From swise at opengridcomputing.com Fri Feb 2 07:54:31 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 02 Feb 2007 09:54:31 -0600
Subject: [openib-general] IPoIB CM for merge?
In-Reply-To: <20070202111532.GT17617@mellanox.co.il>
References: <20070202111532.GT17617@mellanox.co.il>
Message-ID: <1170431671.26115.25.camel@stevo-desktop>
On Fri, 2007-02-02 at 13:15 +0200, Michael S. Tsirkin wrote:
> > Quoting Roland Dreier :
> > Subject: Re: IPoIB CM for merge?
> >
> > > Could you please spend some time reviewing IPoIB CM code?
> > > I am concerned about missing the 2.6.21 merge window.
> >
> > Thanks for the reminder.
> >
> > Can we trade? Have you looked at the cxgb3 iwarp driver? Any comments?
>
> OK.
> I am not sure I have the last version posted so I am going to go by what
> is there in OFED git tree.
>
> And I also only looked under drivers/infiniband/.
>
> So, here are some questions: I looked in the archives and have not seen
> these addressed. Maybe these can be answered and then I'll go from there?
> Does this sound OK?
>
> Files with names like
> ./core/cxio_hal.c
> ./core/cxio_hal.h
> normally generate a fair bit of discussion which wasn't present here,
> I did not guess everyone was just busy.
> For example, why is there both struct iwch_cq and struct t3_cq?
>
The cxgb3/core code defines a low level interface to the RDMA bits of
the T3 device.
This code was originally a separate module (named cxio) that allowed
other RDMA middleware layers to sit on top of this core RDMA module.
At the time, there was RNIC-PI and OFA being developed. So that is the
history of this. As per the first openib review (about a year ago) of
this code I merged this core module into the cxgb3 module. I left the
file structure and names as-is because it was low priority IMO.
The t3_cq struct is the low level CQ structure used to manage both a HW
accessed CQ and a SW CQ (needed to handle error cases and out of order
completions). The iwch_cq struct contains the stuff needed to integrate
with the OFA core and uverbs code. It contains a t3_cq inline.
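The "contains a t3_cq inline" arrangement can be illustrated with a small userspace sketch. The struct fields below are placeholders rather than the driver's real members, and `iwch_cq_from_t3` is a hypothetical helper name; the point is only that embedding the low-level structure lets code holding a `struct t3_cq *` recover the enclosing provider object.

```c
#include <stddef.h>

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Low-level CQ state (fields are placeholders, not the driver's). */
struct t3_cq {
    unsigned int cqid;
    unsigned int size;
};

/* Provider-level CQ: stack-facing state plus the low-level CQ inline. */
struct iwch_cq {
    int          stack_private;  /* stands in for OFA core/uverbs state */
    struct t3_cq cq;             /* embedded, not a pointer */
};

/* Map a low-level CQ back to the provider object that contains it. */
static struct iwch_cq *iwch_cq_from_t3(struct t3_cq *cq)
{
    return container_of(cq, struct iwch_cq, cq);
}
```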
> File tcb.h comment says:
> /* This file is automatically generated --- do not edit */
> This looks like a GPL violation, does it not?
>
I can add the license if that's what you mean.
> What's the deal with the naming convention?
> Is there a reason in cxgb3, some files start with iwch and some with cxio?
> How about using cxgb3 prefix all over?
The cxio_ prefix is used for the low-level functions/types that talk
directly with the HW. The iwch_ prefix is used for the provider driver functions that
interface with the OFA stack. I'd rather not change the names.
Especially since this has already gone through several review cycles.
I'm hoping we can get this in and improve it with subsequent
submissions. Is that reasonable?
Steve.
From mshefty at ichips.intel.com Fri Feb 2 09:59:05 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Fri, 02 Feb 2007 09:59:05 -0800
Subject: [openib-general] please pull for 2.6.21: fix + add IB multicast
support
In-Reply-To: <45BF8E17.2010805@ichips.intel.com>
References: <000701c741a6$16dc4760$ff0da8c0@amr.corp.intel.com>
<45BF8E17.2010805@ichips.intel.com>
Message-ID: <45C37BE9.5040105@ichips.intel.com>
> Sean Hefty (3):
> rdma_cm: Increment port number after close to avoid re-use.
> ib_sa: track multicast join/leave requests
> rdma_cm: add multicast communication support
Assuming that you haven't looked at this yet, I updated the ib_sa patch above to
shorten the workqueue name, plus added a fourth patch to shorten the workqueue
names for ib_addr and rdma_cm. E.g. "ib_mcast_wq" became "ib_mcast".
Let me know if you need any assistance.
- Sean
From swise at opengridcomputing.com Fri Feb 2 11:18:13 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Fri, 02 Feb 2007 13:18:13 -0600
Subject: [openib-general] IPoIB connected mode review comments
In-Reply-To:
References: <1169216896.15842.6.camel@stevo-desktop>
<1170363934.16637.58.camel@stevo-desktop>
Message-ID: <1170443893.26115.59.camel@stevo-desktop>
On Thu, 2007-02-01 at 20:48 -0800, Roland Dreier wrote:
> > Have you had a chance to review this?
>
> Still on my list.
>
> Can we trade? Can you look at the IPoIB connected mode stuff in the
> ipoib-cm branch in
>
> git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git
>
> and let me know if you see anything you don't like?
>
> - R.
Here are my comments. I'm not an ib cm expert though. These are mostly
questions:
Since IPoIB is using IP addresses already, wouldn't it be simpler to use
the rdma cm to setup connections?
Could you optimize this design and only signal some of the tx wrs?
In ipoib_cm_send() you call ipoib_cm_skb_too_long() if the packet is too
large for the interface mtu. And you print a warning. But
ipoib_cm_skb_too_long() actually queues the packet for the cm case. For
ud it just drops the packet. The skb task for cm then will send an
ICMP_DEST_UNREACH for these packets. Why the difference? Also if this
packet came from the local stack via a local application, you don't want
to send DEST_UNREACH, right? (I'm probably just confused about the
purpose of this).
In ipoib_cm_tx_completion() you rearm, then drain the cq. I thought
there was some reason that it was better to do drain/rearm/drain?
Something about if you rearm and there's a cq entry mthca does another
immediate interrupt?
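The drain/rearm/drain ordering mentioned above can be sketched against a toy queue. Everything here is a model with hypothetical names (`toy_cq`, `toy_poll`, `toy_rearm`), not the verbs API, and it makes no claim about which ordering mthca prefers; it only shows why a second drain after the rearm picks up completions that arrived in between.

```c
#include <stdbool.h>

/* Toy completion queue: `ready` entries can be polled now; `late`
 * entries become visible at the moment the queue is rearmed, modelling
 * completions that land between the last poll and the rearm. */
struct toy_cq {
    int  ready;
    int  late;
    bool armed;
};

/* Poll one entry; returns 1 if an entry was consumed, 0 if empty. */
static int toy_poll(struct toy_cq *cq)
{
    if (cq->ready == 0)
        return 0;
    cq->ready--;
    return 1;
}

/* Request notification for the next completion. */
static void toy_rearm(struct toy_cq *cq)
{
    cq->armed = true;
    cq->ready += cq->late;   /* racing completions become visible */
    cq->late = 0;
}

/* Drain, rearm, then drain again so racing entries are not stranded. */
static int drain_rearm_drain(struct toy_cq *cq)
{
    int handled = 0;

    while (toy_poll(cq))
        handled++;
    toy_rearm(cq);
    while (toy_poll(cq))
        handled++;
    return handled;
}
```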
In ipoib_cm_handle_tx_wc():
When can a tx completion happen with a wr_id that isn't within the
ipoib_sendq_size range? This looks like it is really a bug condition
that should never happen. I see the same code in the rx completion path
too.
Also, what's up with the /* FIXME */ comment?
You lock the priv->lock inside of the priv->tx_lock. Is this ordering
correct and consistent across all the code?
ipoib_cm_handle_rx_wc() - what's up with the XXX comment?
What's the algorithm to keep enough buffers posted in the SRQ?
From akepner at sgi.com Fri Feb 2 13:34:15 2007
From: akepner at sgi.com (akepner at sgi.com)
Date: Fri, 2 Feb 2007 13:34:15 -0800 (PST)
Subject: [openib-general] [RFC/BUG] libibverbs: DMA vs. CQ race
In-Reply-To:
References:
Message-ID:
Thanks for having a look at this.
On Mon, 29 Jan 2007, Roland Dreier wrote:
> ....
> Well, first the changes to the userspace libmthca need to be such that
> new libmthca continues to work with old kernels....
Absolutely.
> .....
> The really strange thing about this is that this Altix
> coherent/consistent memory really isn't about the memory itself, but
> about the relationship of that memory with DMA elsewhere -- as I
> understand the code, doing dma_alloc_coherent() returns normal memory
> with a special DMA address that tells the system to flush other DMAs
> before doing DMA to the coherent region. Which isn't really what most
> people understand coherent memory to be, but it has the magic property
> of making most drivers work.
> ....
I agree that this isn't a very elegant solution, but I don't
know of a better one.
Assuming that something along the lines of the previous patch
is used, we need to address userspace/kernel compatibility.
The existing abi versioning doesn't seem to be exactly what
we want to use, though, because we want to change a verb's
semantics to work around a bug. (Changing the abi_version
may be an inevitable result, though.)
How about adding "semantic flags" to the mthca_* commands
(mthca_create_cq, etc.)? Userspace could read the contents of
a new sysfs file which, if found, would indicate the flags
that the kernel understands. Then it could pass the flags, if
it chooses, to get the kernel to use the desired semantics.
Something like:
# cat /sys/class/infiniband_verbs/uverbs0/abi_flags
0000000000000001 [64 bits of flags]
where:
enum abi_flags {
COHERENT_USER_CQ = (1<<0),
.....
};
Better/different ideas?
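On the userspace side, the check proposed above might look like the following sketch. It assumes the proposed (not existing) `abi_flags` file holds a single hexadecimal value as in the example; `parse_abi_flags` and `kernel_supports` are hypothetical helper names, and reading the file itself is left out. A missing file, as on an old kernel, would simply be treated as no flags.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

enum abi_flags {
    COHERENT_USER_CQ = (1 << 0),
};

/* Parse the hexadecimal text read from the proposed sysfs file.
 * Returns 0 (no flags) if the text does not start with a hex number. */
static uint64_t parse_abi_flags(const char *text)
{
    char *end;
    unsigned long long v = strtoull(text, &end, 16);

    if (end == text)
        return 0;
    return (uint64_t)v;
}

/* True if the kernel advertised the given semantic flag. */
static bool kernel_supports(uint64_t flags, uint64_t flag)
{
    return (flags & flag) != 0;
}
```

A library would only pass `COHERENT_USER_CQ` in a create-CQ command when `kernel_supports()` returns true, which keeps a new libmthca working with old kernels.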
--
Arthur
From pasquale.davide at gmail.com Fri Feb 2 15:17:45 2007
From: pasquale.davide at gmail.com (Davide Pasquale)
Date: Sat, 3 Feb 2007 00:17:45 +0100
Subject: [openib-general] OFED 1.1 build issue
In-Reply-To: <1169128895.31746.73017.camel@hal.voltaire.com>
References:
<20070112112201.GB2802@mellanox.co.il>
<1169123080.31746.67663.camel@hal.voltaire.com>
<1169126162.31746.70598.camel@hal.voltaire.com>
<1169128895.31746.73017.camel@hal.voltaire.com>
Message-ID:
Solved by upgrading the blade enclosure firmware to version 1.20!
Thanks.
On 18 Jan 2007 09:01:45 -0500, Hal Rosenstock wrote:
>
> On Thu, 2007-01-18 at 08:52, Davide Pasquale wrote:
> > On 18 Jan 2007 08:19:34 -0500, Hal Rosenstock
> > wrote:
> > On Thu, 2007-01-18 at 08:02, Davide Pasquale wrote:
> > >
> > > On 18 Jan 2007 07:34:43 -0500, Hal Rosenstock
> >
> > > wrote:
> > > On Thu, 2007-01-18 at 06:19, Davide Pasquale wrote:
> > > > Starting opensm I see this error in
> > /var/log/osm.log:
> > > >
> > > > OpenSM Rev:openib-2.0.5 OpenIB svn Exported
> > revision
> > > > Jan 18 12:11:39 628147 [95AA8160] ->
> > osm_vendor_bind:
> > > Binding to port
> > > > 0x18feffff8c7a8d
> > > > Jan 18 12:11:39 629557 [95AA8160] ->
> > osm_vendor_bind:
> > > Binding to port
> > > > 0x18feffff8c7a8d
> > > > Jan 18 12:11:39 630605 [41401960] -> SM port is
> > down
> > > > Jan 18 12:11:39 630693 [41401960] ->
> > > __osm_sm_state_mgr_signal_error:
> > > > ERR 3207: Invalid signal OSM_SM_SIGNAL_DISCOVER in
> > state
> > > > IB_SMINFO_STATE_DISCOVERING
> > > > Jan 18 12:11:49 631170 [41E02960] -> SM port is
> > down
> > > > Jan 18 12:11:49 631238 [41E02960] ->
> > > __osm_sm_state_mgr_signal_error:
> > > > ERR 3207: Invalid signal OSM_SM_SIGNAL_DISCOVER in
> > state
> > > > IB_SMINFO_STATE_DISCOVERING
> > > >
> > > > and the SM port is always down.
> > >
> > > The error message is benign.
> > >
> > > Is the SM port plugged into any other IB device ?
> > >
> > > -- Hal
> > >
> > > Hi Hal,
> > >
> > > we are using HP Blade System and each blade has an
> > infiniband card
> > > onboard.
> > > The SM port is plugged in the Infiniband switch internal to
> > the blade
> > > enclosure.
> > > Is this information helpful for you ?
> >
> > The port being down has nothing to do with SM operation. For
> > some
> > reason, there is no connectivity or negotiation between the
> > blades and
> > the switch.
> >
> > -- Hal
> >
> > >
> > >
> > >
> > >
> > >
> >
> >
> > Thanks!
> > What can I look to in order to solve this problem ?
>
> I don't know the HP blade system so the only thing I can say to try is
> to unseat and reseat all the blades (HCAs and switch(es)) to see if this
> resolves the problem. If it doesn't, I have no clue.
>
> -- Hal
>
> >
> > Regards,
> > Davide.
> >
> >
>
>
From sean.hefty at intel.com Fri Feb 2 16:02:23 2007
From: sean.hefty at intel.com (Sean Hefty)
Date: Fri, 2 Feb 2007 16:02:23 -0800
Subject: [openib-general] [RFC] [PATCH] ib_usa: export multicast and
informinfo registration to userspace
Message-ID: <000001c74726$94d0f500$e598070a@amr.corp.intel.com>
Export SA client capabilities for multicast and SA event registration
to userspace. Multicast and event registration are tracked on a per
port basis, with tracking done by the ib_sa kernel module.
Based on feedback from the list, a new userspace SA module was added,
rather than trying to rework the usermad interface. The user to kernel
interface is minimal, but was designed to be flexible enough to add
additional SA client support if needed. (E.g. local SA cache lookup,
SA queries, service registration, etc.)
Signed-off-by: Sean Hefty
---
The following patch is also available from the user_sa branch of my
rdma-dev.git tree, and is dependent on the informinfo branch/patch
posted earlier to the list. (A couple of small fixes to the informinfo
code have been added since the original patches.) A userspace sa library
is also available.
The informinfo and userspace support was completed as part of the
PathForward project at the request of the US National Laboratories.
diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 9edface..b5ffc78 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -18,15 +18,15 @@ config INFINIBAND_USER_MAD
need libibumad from .
config INFINIBAND_USER_ACCESS
- tristate "InfiniBand userspace access (verbs and CM)"
+ tristate "InfiniBand userspace access (verbs, CM, SA client)"
depends on INFINIBAND
---help---
Userspace InfiniBand access support. This enables the
- kernel side of userspace verbs and the userspace
- communication manager (CM). This allows userspace processes
- to set up connections and directly access InfiniBand
+ kernel side of userspace verbs, the userspace communication
+ manager (CM), and userspace SA client. This allows userspace
+ processes to set up connections and directly access InfiniBand
hardware for fast-path operations. You will also need
- libibverbs, libibcm and a hardware driver library from
+ libibverbs, libibcm, libibsa, and a hardware driver library from
.
config INFINIBAND_ADDR_TRANS
diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index 2e9c4b2..e89cf2e 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -4,7 +4,7 @@ user_access-$(CONFIG_INFINIBAND_ADDR_TRANS) := rdma_ucm.o
obj-$(CONFIG_INFINIBAND) += ib_core.o ib_mad.o ib_sa.o \
ib_cm.o iw_cm.o $(infiniband-y)
obj-$(CONFIG_INFINIBAND_USER_MAD) += ib_umad.o
-obj-$(CONFIG_INFINIBAND_USER_ACCESS) += ib_uverbs.o ib_ucm.o \
+obj-$(CONFIG_INFINIBAND_USER_ACCESS) += ib_uverbs.o ib_ucm.o ib_usa.o \
$(user_access-y)
ib_core-y := packer.o ud_header.o verbs.o sysfs.o \
@@ -28,5 +28,7 @@ ib_umad-y := user_mad.o
ib_ucm-y := ucm.o
+ib_usa-y := usa.o
+
ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_mem.o \
uverbs_marshall.o
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 172a450..771f52a 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -464,6 +464,46 @@ static const struct ib_field notice_table[] = {
.size_bits = 128 },
};
+int ib_sa_pack_attr(void *dst, void *src, int attr_id)
+{
+ switch (attr_id) {
+ case IB_SA_ATTR_MC_MEMBER_REC:
+ ib_pack(mcmember_rec_table, ARRAY_SIZE(mcmember_rec_table),
+ src, dst);
+ break;
+ case IB_SA_ATTR_INFORM_INFO:
+ ib_pack(inform_table, ARRAY_SIZE(inform_table), src, dst);
+ break;
+ case IB_SA_ATTR_NOTICE:
+ ib_pack(notice_table, ARRAY_SIZE(notice_table), src, dst);
+ break;
+ default:
+ return -EINVAL;
+ }
+ return 0;
+}
+EXPORT_SYMBOL(ib_sa_pack_attr);
+
+int ib_sa_unpack_attr(void *dst, void *src, int attr_id)
+{
+ switch (attr_id) {
+ case IB_SA_ATTR_MC_MEMBER_REC:
+ ib_unpack(mcmember_rec_table, ARRAY_SIZE(mcmember_rec_table),
+ src, dst);
+ break;
+ case IB_SA_ATTR_INFORM_INFO:
+ ib_unpack(inform_table, ARRAY_SIZE(inform_table), src, dst);
+ break;
+ case IB_SA_ATTR_NOTICE:
+ ib_unpack(notice_table, ARRAY_SIZE(notice_table), src, dst);
+ break;
+ default:
+ return -EINVAL;
+ }
+ return 0;
+}
+EXPORT_SYMBOL(ib_sa_unpack_attr);
+
static void free_sm_ah(struct kref *kref)
{
struct ib_sa_sm_ah *sm_ah = container_of(kref, struct ib_sa_sm_ah, ref);
diff --git a/drivers/infiniband/core/usa.c b/drivers/infiniband/core/usa.c
new file mode 100644
index 0000000..ae05091
--- /dev/null
+++ b/drivers/infiniband/core/usa.c
@@ -0,0 +1,792 @@
+/*
+ * Copyright (c) 2006-2007 Intel Corporation. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+MODULE_AUTHOR("Sean Hefty");
+MODULE_DESCRIPTION("IB userspace SA");
+MODULE_LICENSE("Dual BSD/GPL");
+
+static void usa_add_one(struct ib_device *device);
+static void usa_remove_one(struct ib_device *device);
+
+static struct ib_client usa_client = {
+ .name = "ib_usa",
+ .add = usa_add_one,
+ .remove = usa_remove_one
+};
+
+struct usa_device {
+ struct list_head list;
+ struct ib_device *device;
+ struct completion comp;
+ atomic_t refcount;
+ int start_port;
+ int end_port;
+};
+
+struct usa_file {
+ struct mutex file_mutex;
+ struct file *filp;
+ struct ib_sa_client sa_client;
+ struct list_head event_list;
+ struct list_head id_list;
+ wait_queue_head_t poll_wait;
+ int event_id;
+};
+
+struct usa_id {
+ struct usa_file *file;
+ struct usa_device *dev;
+ struct list_head list;
+ u64 uid;
+ int num;
+ int events_reported;
+ u16 attr_id;
+};
+
+struct usa_event {
+ struct usa_id *id;
+ struct list_head list;
+ struct ib_usa_event_resp resp;
+};
+
+struct usa_multicast {
+ struct usa_id id;
+ struct usa_event event;
+ struct ib_sa_multicast *multicast;
+};
+
+struct usa_inform_info {
+ struct usa_id id;
+ struct ib_inform_info *inform_info;
+};
+
+static DEFINE_MUTEX(usa_mutex);
+static LIST_HEAD(dev_list);
+static DEFINE_IDR(usa_idr);
+
+static struct usa_device *get_dev(__be64 guid, __u8 port_num)
+{
+ struct usa_device *dev;
+
+ mutex_lock(&usa_mutex);
+ list_for_each_entry(dev, &dev_list, list) {
+ if (dev->device->node_guid == guid) {
+ if (port_num < dev->start_port ||
+ port_num > dev->end_port)
+ break;
+ atomic_inc(&dev->refcount);
+ mutex_unlock(&usa_mutex);
+ return dev;
+ }
+ }
+ mutex_unlock(&usa_mutex);
+ return NULL;
+}
+
+static void put_dev(struct usa_device *dev)
+{
+ if (atomic_dec_and_test(&dev->refcount))
+ complete(&dev->comp);
+}
+
+static int insert_id(struct usa_id *id)
+{
+ int ret;
+
+ do {
+ ret = idr_pre_get(&usa_idr, GFP_KERNEL);
+ if (!ret)
+ break;
+
+ mutex_lock(&usa_mutex);
+ ret = idr_get_new(&usa_idr, id, &id->num);
+ mutex_unlock(&usa_mutex);
+ } while (ret == -EAGAIN);
+
+ return ret;
+}
+
+static void remove_id(struct usa_id *id)
+{
+ mutex_lock(&usa_mutex);
+ idr_remove(&usa_idr, id->num);
+ mutex_unlock(&usa_mutex);
+}
+
+static struct usa_id *get_id(int num, struct usa_file *file, u16 attr_id)
+{
+ struct usa_id *id;
+
+ id = idr_find(&usa_idr, num);
+ if (!id)
+ return ERR_PTR(-ENOENT);
+
+ if ((id->file != file) || (id->attr_id != attr_id))
+ return ERR_PTR(-EINVAL);
+
+ return id;
+}
+
+static void insert_file_id(struct usa_file *file, struct usa_id *id)
+{
+ mutex_lock(&file->file_mutex);
+ list_add_tail(&id->list, &file->id_list);
+ mutex_unlock(&file->file_mutex);
+}
+
+static void remove_file_id(struct usa_file *file, struct usa_id *id)
+{
+ mutex_lock(&file->file_mutex);
+ list_del(&id->list);
+ mutex_unlock(&file->file_mutex);
+}
+
+static void finish_event(struct usa_event *event)
+{
+ switch (be16_to_cpu(event->resp.attr_id)) {
+ case IB_SA_ATTR_MC_MEMBER_REC:
+ list_del_init(&event->list);
+ event->id->events_reported++;
+ break;
+ default:
+ list_del(&event->list);
+ if (event->id)
+ event->id->events_reported++;
+ kfree(event);
+ break;
+ }
+}
+
+static ssize_t usa_get_event(struct usa_file *file, const char __user *inbuf,
+ int in_len, int out_len)
+{
+ struct ib_usa_get_event cmd;
+ struct usa_event *event;
+ int ret = 0;
+ DEFINE_WAIT(wait);
+
+ if (out_len < sizeof(event->resp))
+ return -ENOSPC;
+
+ if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+ return -EFAULT;
+
+ mutex_lock(&file->file_mutex);
+ while (list_empty(&file->event_list)) {
+ mutex_unlock(&file->file_mutex);
+
+ if (file->filp->f_flags & O_NONBLOCK)
+ return -EAGAIN;
+
+ if (wait_event_interruptible(file->poll_wait,
+ !list_empty(&file->event_list)))
+ return -ERESTARTSYS;
+
+ mutex_lock(&file->file_mutex);
+ }
+
+ event = list_entry(file->event_list.next, struct usa_event, list);
+
+ if (copy_to_user((void __user *)(unsigned long)cmd.response,
+ &event->resp, sizeof(event->resp))) {
+ ret = -EFAULT;
+ goto done;
+ }
+
+ finish_event(event);
+done:
+ mutex_unlock(&file->file_mutex);
+ return ret;
+}
+
+static void queue_event(struct usa_file *file, struct usa_event *event)
+{
+ mutex_lock(&file->file_mutex);
+ list_move_tail(&event->list, &file->event_list);
+ wake_up_interruptible(&file->poll_wait);
+ mutex_unlock(&file->file_mutex);
+}
+
+/*
+ * We can get up to two events for a single multicast member. A second event
+ * only occurs if there's an error on an existing multicast membership.
+ * Report only the last event.
+ */
+static int multicast_handler(int status, struct ib_sa_multicast *multicast)
+{
+ struct usa_multicast *mcast = multicast->context;
+ struct usa_file *file = mcast->id.file;
+
+ mcast->event.resp.status = status;
+ if (!status) {
+ mcast->event.resp.data_len = IB_SA_ATTR_MC_MEMBER_REC_LEN;
+ ib_sa_pack_attr(mcast->event.resp.data, &multicast->rec,
+ IB_SA_ATTR_MC_MEMBER_REC);
+ }
+
+ queue_event(file, &mcast->event);
+ return 0;
+}
+
+static int join_mcast(struct usa_file *file, struct ib_usa_request *req,
+ int out_len)
+{
+ struct usa_multicast *mcast;
+ struct ib_sa_mcmember_rec rec;
+ int ret;
+
+ if (out_len < sizeof(u32))
+ return -ENOSPC;
+
+ mcast = kzalloc(sizeof *mcast, GFP_KERNEL);
+ if (!mcast)
+ return -ENOMEM;
+
+ mcast->id.dev = get_dev(req->node_guid, req->port_num);
+ if (!mcast->id.dev) {
+ ret = -ENODEV;
+ goto err1;
+ }
+
+ if (copy_from_user(mcast->event.resp.data,
+ (void __user *) (unsigned long) req->attr,
+ IB_SA_ATTR_MC_MEMBER_REC_LEN)) {
+ ret = -EFAULT;
+ goto err2;
+ }
+
+ INIT_LIST_HEAD(&mcast->event.list);
+ mcast->event.id = &mcast->id;
+ mcast->event.resp.attr_id = cpu_to_be16(IB_SA_ATTR_MC_MEMBER_REC);
+ mcast->event.resp.uid = req->uid;
+ mcast->id.attr_id = IB_SA_ATTR_MC_MEMBER_REC;
+ mcast->id.uid = req->uid;
+
+ ret = insert_id(&mcast->id);
+ if (ret)
+ goto err2;
+
+ mcast->event.resp.id = mcast->id.num;
+ if (copy_to_user((void __user *) (unsigned long) req->response,
+ &mcast->id.num, sizeof(u32))) {
+ ret = -EFAULT;
+ goto err3;
+ }
+
+ mcast->id.file = file;
+ insert_file_id(file, &mcast->id);
+
+ ib_sa_unpack_attr(&rec, mcast->event.resp.data,
+ IB_SA_ATTR_MC_MEMBER_REC);
+ mcast->multicast = ib_sa_join_multicast(&file->sa_client,
+ mcast->id.dev->device,
+ req->port_num, &rec,
+ (ib_sa_comp_mask) req->comp_mask,
+ GFP_KERNEL, multicast_handler,
+ mcast);
+ if (IS_ERR(mcast->multicast)) {
+ ret = PTR_ERR(mcast->multicast);
+ goto err4;
+ }
+
+ return 0;
+
+err4:
+ remove_file_id(file, &mcast->id);
+err3:
+ remove_id(&mcast->id);
+err2:
+ put_dev(mcast->id.dev);
+err1:
+ kfree(mcast);
+ return ret;
+}
+
+static int get_mcast(struct usa_file *file, struct ib_usa_request *req,
+ int out_len)
+{
+ struct usa_device *dev;
+ struct ib_sa_mcmember_rec rec;
+ u8 mcmember_rec[IB_SA_ATTR_MC_MEMBER_REC_LEN];
+ int ret;
+
+ if (out_len < IB_SA_ATTR_MC_MEMBER_REC_LEN)
+ return -ENOSPC;
+
+ if (req->comp_mask != IB_SA_MCMEMBER_REC_MGID)
+ return -ENOSYS;
+
+ if (copy_from_user(mcmember_rec,
+ (void __user *) (unsigned long) req->attr,
+ IB_SA_ATTR_MC_MEMBER_REC_LEN))
+ return -EFAULT;
+
+ dev = get_dev(req->node_guid, req->port_num);
+ if (!dev)
+ return -ENODEV;
+
+ ib_sa_unpack_attr(&rec, mcmember_rec, IB_SA_ATTR_MC_MEMBER_REC);
+ ret = ib_sa_get_mcmember_rec(dev->device, req->port_num,
+ &rec.mgid, &rec);
+ if (!ret) {
+ ib_sa_pack_attr(mcmember_rec, &rec, IB_SA_ATTR_MC_MEMBER_REC);
+ if (copy_to_user((void __user *) (unsigned long) req->response,
+ mcmember_rec, IB_SA_ATTR_MC_MEMBER_REC_LEN))
+ ret = -EFAULT;
+ }
+
+ put_dev(dev);
+ return ret;
+}
+
+static int process_mcast(struct usa_file *file, struct ib_usa_request *req,
+ int out_len)
+{
+ /* Only indirect requests are currently supported. */
+ if (!req->local)
+ return -ENOSYS;
+
+ switch (req->method) {
+ case IB_MGMT_METHOD_GET:
+ return get_mcast(file, req, out_len);
+ case IB_MGMT_METHOD_SET:
+ return join_mcast(file, req, out_len);
+ default:
+ return -EINVAL;
+ }
+}
+
+static int notice_handler(int status, struct ib_inform_info *info,
+ struct ib_sa_notice *notice)
+{
+ struct usa_inform_info *inform = info->context;
+ struct usa_file *file = inform->id.file;
+ struct usa_event *event;
+
+ event = kzalloc(sizeof *event, GFP_KERNEL);
+ if (!event)
+ return 0;
+
+ event->resp.uid = inform->id.uid;
+ event->id = &inform->id;
+ event->resp.status = status;
+ INIT_LIST_HEAD(&event->list);
+
+ if (notice) {
+ event->resp.attr_id = cpu_to_be16(IB_SA_ATTR_NOTICE);
+ event->resp.data_len = IB_SA_ATTR_NOTICE_LEN;
+ ib_sa_pack_attr(event->resp.data, notice, IB_SA_ATTR_NOTICE);
+ } else
+ event->resp.attr_id = cpu_to_be16(IB_SA_ATTR_INFORM_INFO);
+
+ queue_event(file, event);
+ return 0;
+}
+
+static int reg_inform(struct usa_file *file, struct ib_usa_request *req,
+ int out_len)
+{
+ struct usa_inform_info *inform;
+ struct ib_sa_inform sa_inform_info;
+ u8 net_inform_info[IB_SA_ATTR_INFORM_INFO_LEN];
+ u16 trap_number;
+ int ret;
+
+ if (out_len < sizeof(u32))
+ return -ENOSPC;
+
+ if (copy_from_user(&net_inform_info,
+ (void __user *) (unsigned long) req->attr,
+ IB_SA_ATTR_INFORM_INFO_LEN))
+ return -EFAULT;
+
+ inform = kzalloc(sizeof *inform, GFP_KERNEL);
+ if (!inform)
+ return -ENOMEM;
+
+ inform->id.dev = get_dev(req->node_guid, req->port_num);
+ if (!inform->id.dev) {
+ ret = -ENODEV;
+ goto err1;
+ }
+
+ inform->id.attr_id = IB_SA_ATTR_INFORM_INFO;
+ inform->id.uid = req->uid;
+
+ ret = insert_id(&inform->id);
+ if (ret)
+ goto err2;
+
+ if (copy_to_user((void __user *) (unsigned long) req->response,
+ &inform->id.num, sizeof(u32))) {
+ ret = -EFAULT;
+ goto err3;
+ }
+
+ inform->id.file = file;
+ insert_file_id(file, &inform->id);
+
+ ib_sa_unpack_attr(&sa_inform_info, &net_inform_info,
+ IB_SA_ATTR_INFORM_INFO);
+ trap_number = be16_to_cpu(sa_inform_info.trap.generic.trap_num);
+ inform->inform_info =
+ ib_sa_register_inform_info(&file->sa_client,
+ inform->id.dev->device,
+ req->port_num, trap_number,
+ GFP_KERNEL, notice_handler,
+ inform);
+ if (IS_ERR(inform->inform_info)) {
+ ret = PTR_ERR(inform->inform_info);
+ goto err4;
+ }
+
+ return 0;
+
+err4:
+ remove_file_id(file, &inform->id);
+err3:
+ remove_id(&inform->id);
+err2:
+ put_dev(inform->id.dev);
+err1:
+ kfree(inform);
+ return ret;
+}
+
+static int process_inform(struct usa_file *file, struct ib_usa_request *req,
+ int out_len)
+{
+ /* Only indirect requests are currently supported. */
+ if (!req->local)
+ return -ENOSYS;
+
+ if (req->method != IB_MGMT_METHOD_SET)
+ return -EINVAL;
+
+ return reg_inform(file, req, out_len);
+}
+
+static ssize_t usa_request(struct usa_file *file, const char __user *inbuf,
+ int in_len, int out_len)
+{
+ struct ib_usa_request req;
+
+ if (copy_from_user(&req, inbuf, sizeof(req)))
+ return -EFAULT;
+
+ switch (be16_to_cpu(req.attr_id)) {
+ case IB_SA_ATTR_MC_MEMBER_REC:
+ return process_mcast(file, &req, out_len);
+ case IB_SA_ATTR_INFORM_INFO:
+ return process_inform(file, &req, out_len);
+ default:
+ return -EINVAL;
+ }
+}
+
+static void *cleanup_mcast(struct usa_id *id)
+{
+ struct usa_multicast *mcast;
+
+ mcast = container_of(id, struct usa_multicast, id);
+ ib_sa_free_multicast(mcast->multicast);
+
+ mutex_lock(&id->file->file_mutex);
+ list_del(&id->list);
+ list_del(&mcast->event.list);
+ mutex_unlock(&id->file->file_mutex);
+
+ return mcast;
+}
+
+static void *cleanup_inform(struct usa_id *id)
+{
+ struct usa_inform_info *inform;
+
+ inform = container_of(id, struct usa_inform_info, id);
+ ib_sa_unregister_inform_info(inform->inform_info);
+
+ mutex_lock(&id->file->file_mutex);
+ list_del(&id->list);
+ /* TODO cleanup events */
+ mutex_unlock(&id->file->file_mutex);
+
+ return inform;
+}
+
+static int free_id(struct usa_id *id)
+{
+ void *free_obj;
+ int events_reported;
+
+ switch (id->attr_id) {
+ case IB_SA_ATTR_MC_MEMBER_REC:
+ free_obj = cleanup_mcast(id);
+ break;
+ case IB_SA_ATTR_INFORM_INFO:
+ free_obj = cleanup_inform(id);
+ break;
+ default:
+ free_obj = NULL;
+ break;
+ }
+
+ events_reported = id->events_reported;
+ put_dev(id->dev);
+ kfree(free_obj);
+
+ return events_reported;
+}
+
+static ssize_t usa_free(struct usa_file *file, const char __user *inbuf,
+ int in_len, int out_len)
+{
+ struct ib_usa_free cmd;
+ struct ib_usa_free_resp resp;
+ struct usa_id *id;
+ int ret = 0;
+
+ if (out_len < sizeof(resp))
+ return -ENOSPC;
+
+ if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+ return -EFAULT;
+
+ mutex_lock(&usa_mutex);
+ id = get_id(cmd.id, file, be16_to_cpu(cmd.attr_id));
+ if (!IS_ERR(id))
+ idr_remove(&usa_idr, id->num);
+ mutex_unlock(&usa_mutex);
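+
+ /* get_id() returns an ERR_PTR on failure; do not pass it to free_id(). */
+ if (IS_ERR(id))
+ return PTR_ERR(id);
+
+ resp.events_reported = free_id(id);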
+
+ if (copy_to_user((void __user *) (unsigned long) cmd.response,
+ &resp, sizeof resp))
+ ret = -EFAULT;
+
+ return ret;
+}
+
+static ssize_t (*usa_cmd_table[])(struct usa_file *file,
+ const char __user *inbuf,
+ int in_len, int out_len) = {
+ [IB_USA_CMD_REQUEST] = usa_request,
+ [IB_USA_CMD_GET_EVENT] = usa_get_event,
+ [IB_USA_CMD_FREE] = usa_free,
+};
+
+static ssize_t usa_write(struct file *filp, const char __user *buf,
+ size_t len, loff_t *pos)
+{
+ struct usa_file *file = filp->private_data;
+ struct ib_usa_cmd_hdr hdr;
+ ssize_t ret;
+
+ if (len < sizeof(hdr))
+ return -EINVAL;
+
+ if (copy_from_user(&hdr, buf, sizeof(hdr)))
+ return -EFAULT;
+
+ if (hdr.cmd < 0 || hdr.cmd >= ARRAY_SIZE(usa_cmd_table))
+ return -EINVAL;
+
+ if (hdr.in + sizeof(hdr) > len)
+ return -EINVAL;
+
+ ret = usa_cmd_table[hdr.cmd](file, buf + sizeof(hdr), hdr.in, hdr.out);
+ if (!ret)
+ ret = len;
+
+ return ret;
+}
+
+static unsigned int usa_poll(struct file *filp, struct poll_table_struct *wait)
+{
+ struct usa_file *file = filp->private_data;
+ unsigned int mask = 0;
+
+ poll_wait(filp, &file->poll_wait, wait);
+
+ if (!list_empty(&file->event_list))
+ mask = POLLIN | POLLRDNORM;
+
+ return mask;
+}
+
+static int usa_open(struct inode *inode, struct file *filp)
+{
+ struct usa_file *file;
+
+ file = kmalloc(sizeof *file, GFP_KERNEL);
+ if (!file)
+ return -ENOMEM;
+
+ ib_sa_register_client(&file->sa_client);
+
+ INIT_LIST_HEAD(&file->event_list);
+ INIT_LIST_HEAD(&file->id_list);
+ init_waitqueue_head(&file->poll_wait);
+ mutex_init(&file->file_mutex);
+
+ filp->private_data = file;
+ file->filp = filp;
+ return 0;
+}
+
+static int usa_close(struct inode *inode, struct file *filp)
+{
+ struct usa_file *file = filp->private_data;
+ struct usa_id *id;
+
+ while (!list_empty(&file->id_list)) {
+ id = list_entry(file->id_list.next, struct usa_id, list);
+ remove_id(id);
+ free_id(id);
+ }
+ ib_sa_unregister_client(&file->sa_client);
+
+ kfree(file);
+ return 0;
+}
+
+static void usa_add_one(struct ib_device *device)
+{
+ struct usa_device *dev;
+
+ if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
+ return;
+
+ dev = kmalloc(sizeof *dev, GFP_KERNEL);
+ if (!dev)
+ return;
+
+ dev->device = device;
+ if (device->node_type == RDMA_NODE_IB_SWITCH)
+ dev->start_port = dev->end_port = 0;
+ else {
+ dev->start_port = 1;
+ dev->end_port = device->phys_port_cnt;
+ }
+
+ init_completion(&dev->comp);
+ atomic_set(&dev->refcount, 1);
+ ib_set_client_data(device, &usa_client, dev);
+
+ mutex_lock(&usa_mutex);
+ list_add_tail(&dev->list, &dev_list);
+ mutex_unlock(&usa_mutex);
+}
+
+static void usa_remove_one(struct ib_device *device)
+{
+ struct usa_device *dev;
+
+ dev = ib_get_client_data(device, &usa_client);
+ if (!dev)
+ return;
+
+ mutex_lock(&usa_mutex);
+ list_del(&dev->list);
+ mutex_unlock(&usa_mutex);
+
+ /* TODO: force immediate device removal */
+ put_dev(dev);
+ wait_for_completion(&dev->comp);
+ kfree(dev);
+}
+
+static struct file_operations usa_fops = {
+ .owner = THIS_MODULE,
+ .open = usa_open,
+ .release = usa_close,
+ .write = usa_write,
+ .poll = usa_poll,
+};
+
+static struct miscdevice usa_misc = {
+ .minor = MISC_DYNAMIC_MINOR,
+ .name = "ib_usa",
+ .fops = &usa_fops,
+};
+
+static ssize_t show_abi_version(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return sprintf(buf, "%d\n", IB_USA_ABI_VERSION);
+}
+static DEVICE_ATTR(abi_version, S_IRUGO, show_abi_version, NULL);
+
+static int __init usa_init(void)
+{
+ int ret;
+
+ ret = misc_register(&usa_misc);
+ if (ret)
+ return ret;
+
+ ret = device_create_file(usa_misc.this_device, &dev_attr_abi_version);
+ if (ret)
+ goto err1;
+
+ ret = ib_register_client(&usa_client);
+ if (ret)
+ goto err2;
+
+ return 0;
+
+err2:
+ device_remove_file(usa_misc.this_device, &dev_attr_abi_version);
+err1:
+ misc_deregister(&usa_misc);
+ return ret;
+}
+
+static void __exit usa_cleanup(void)
+{
+ ib_unregister_client(&usa_client);
+ device_remove_file(usa_misc.this_device, &dev_attr_abi_version);
+ misc_deregister(&usa_misc);
+ idr_destroy(&usa_idr);
+}
+
+module_init(usa_init);
+module_exit(usa_cleanup);
diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
index a8e5221..f36be98 100644
--- a/include/rdma/ib_sa.h
+++ b/include/rdma/ib_sa.h
@@ -557,4 +557,7 @@ ib_sa_register_inform_info(struct ib_sa_client *client,
*/
void ib_sa_unregister_inform_info(struct ib_inform_info *info);
+int ib_sa_pack_attr(void *dst, void *src, int attr_id);
+int ib_sa_unpack_attr(void *dst, void *src, int attr_id);
+
#endif /* IB_SA_H */
diff --git a/include/rdma/ib_usa.h b/include/rdma/ib_usa.h
new file mode 100644
index 0000000..0180cab
--- /dev/null
+++ b/include/rdma/ib_usa.h
@@ -0,0 +1,97 @@
+/*
+ * Copyright (c) 2006-2007 Intel Corporation. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef IB_USA_H
+#define IB_USA_H
+
+#include
+#include
+
+#define IB_USA_ABI_VERSION 1
+
+#define IB_USA_EVENT_DATA 256
+
+enum {
+ IB_USA_CMD_REQUEST,
+ IB_USA_CMD_GET_EVENT,
+ IB_USA_CMD_FREE
+};
+
+enum {
+ IB_SA_ATTR_NOTICE_LEN = 80,
+ IB_SA_ATTR_INFORM_INFO_LEN = 36,
+ IB_SA_ATTR_MC_MEMBER_REC_LEN = 52
+};
+
+struct ib_usa_cmd_hdr {
+ __u32 cmd;
+ __u16 in;
+ __u16 out;
+};
+
+struct ib_usa_request {
+ __u64 response;
+ __u64 uid;
+ __u64 node_guid;
+ __u64 comp_mask;
+ __u64 attr;
+ __be16 attr_id;
+ __u8 method;
+ __u8 port_num;
+ __u8 local;
+};
+
+struct ib_usa_free {
+ __u64 response;
+ __u32 id;
+ __be16 attr_id;
+};
+
+struct ib_usa_free_resp {
+ __u32 events_reported;
+};
+
+struct ib_usa_get_event {
+ __u64 response;
+};
+
+struct ib_usa_event_resp {
+ __u64 uid;
+ __u32 id;
+ __u32 status;
+ __u32 data_len;
+ __be16 attr_id;
+ __u16 reserved;
+ __u8 data[IB_USA_EVENT_DATA];
+};
+
+#endif /* IB_USA_H */
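
For readers following the ABI: usa_write() above expects a struct ib_usa_cmd_hdr followed immediately by the command payload, with hdr.in giving the payload length and hdr.out the size of the caller's response buffer. A minimal userspace sketch of that layout is below. The struct definitions are copied from ib_usa.h with <stdint.h> types substituted for the kernel __u types; usa_build_cmd() is a hypothetical helper written for this illustration and is not part of the patch or of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the command enum in ib_usa.h from the patch. */
enum { IB_USA_CMD_REQUEST, IB_USA_CMD_GET_EVENT, IB_USA_CMD_FREE };

/* Mirrors struct ib_usa_cmd_hdr (__u32/__u16 replaced by stdint types). */
struct ib_usa_cmd_hdr {
	uint32_t cmd;
	uint16_t in;	/* payload bytes that follow the header */
	uint16_t out;	/* size of the caller's response buffer */
};

/* Mirrors struct ib_usa_get_event. */
struct ib_usa_get_event {
	uint64_t response;	/* user pointer to the response buffer */
};

/*
 * Hypothetical helper (an assumption, not from the patch): lay out
 * header + payload the way usa_write() parses them.
 * Returns the number of bytes to hand to write(), or 0 if the buffer
 * is too small or a length does not fit the 16-bit header fields.
 */
static size_t usa_build_cmd(void *buf, size_t buf_len, uint32_t cmd,
			    const void *payload, size_t in_len,
			    size_t out_len)
{
	struct ib_usa_cmd_hdr hdr;

	assert(buf != NULL && payload != NULL);

	if (in_len > UINT16_MAX || out_len > UINT16_MAX)
		return 0;
	if (buf_len < sizeof(hdr) + in_len)
		return 0;

	hdr.cmd = cmd;
	hdr.in = (uint16_t) in_len;
	hdr.out = (uint16_t) out_len;

	/* Header first, payload directly after it. */
	memcpy(buf, &hdr, sizeof(hdr));
	memcpy((char *) buf + sizeof(hdr), payload, in_len);
	return sizeof(hdr) + in_len;
}
```

The returned length would then be passed to write() on the device node created by the misc_register() call in the patch; the node's path depends on the udev setup and is not assumed here.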
From pradeep at us.ibm.com Fri Feb 2 16:31:32 2007
From: pradeep at us.ibm.com (Pradeep Satyanarayana)
Date: Fri, 2 Feb 2007 16:31:32 -0800
Subject: [openib-general] IPoIB CM for merge?
In-Reply-To: <1170431671.26115.25.camel@stevo-desktop>
Message-ID:
Hello Michael,
Here are a few more observations:
1. For the SRQ case, the skbs and receive buffers are posted during init,
even before the rx_qp is created. This causes a problem (at least for
non-SRQs) for the ehca. We need to call ipoib_cm_alloc_skb() and
ipoib_cm_post_receive() after the rx_qp is in the RTR state.
2. I also found that ipoib_cm_create_rx_qp() needs to initialize
.cap.max_recv_wr and .cap.max_recv_sge. Otherwise this leads to
problems such as RQ overflows, which cause communication failures.
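
To make the ordering in the two points above concrete, here is a minimal stand-alone sketch in C. It does not use the real verbs structures: mock_qp, mock_qp_cap and the mock_* helpers are hypothetical stand-ins invented for this illustration, and the queue depths are arbitrary. It models only the constraint reported above for ehca (not general verbs semantics): the receive capacities are set when the QP is created, and receives are accepted only once the QP has reached RTR.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the QP state and capability fields. */
enum mock_qp_state { MOCK_QPS_RESET, MOCK_QPS_INIT, MOCK_QPS_RTR };

struct mock_qp_cap {
	unsigned max_recv_wr;
	unsigned max_recv_sge;
};

struct mock_qp {
	enum mock_qp_state state;
	struct mock_qp_cap cap;
	unsigned posted;	/* receives currently posted */
};

/* Point 2: the receive capacities must be set at creation time. */
static void mock_create_rx_qp(struct mock_qp *qp, unsigned recv_wr,
			      unsigned recv_sge)
{
	assert(recv_wr > 0 && recv_sge > 0);
	qp->state = MOCK_QPS_RESET;
	qp->cap.max_recv_wr = recv_wr;
	qp->cap.max_recv_sge = recv_sge;
	qp->posted = 0;
}

static void mock_modify_to_rtr(struct mock_qp *qp)
{
	qp->state = MOCK_QPS_RTR;
}

/*
 * Point 1: in this model a receive is rejected before RTR, and also
 * once the receive queue is full. Returns true if it was accepted.
 */
static bool mock_post_receive(struct mock_qp *qp)
{
	if (qp->state != MOCK_QPS_RTR)
		return false;
	if (qp->posted >= qp->cap.max_recv_wr)
		return false;
	qp->posted++;
	return true;
}
```

In this model, posting before mock_modify_to_rtr() fails, and posting more than max_recv_wr receives fails as well, which corresponds to the RQ overflow mentioned in point 2.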
Pradeep
pradeep at us.ibm.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From chrisw at sous-sol.org Fri Feb 2 18:35:15 2007
From: chrisw at sous-sol.org (Chris Wright)
Date: Fri, 02 Feb 2007 18:35:15 -0800
Subject: [openib-general] [patch 11/59] [stable] [PATCH] IB/mthca: Fix
off-by-one in FMR handling on memfree
References: <20070203023504.435051000@sous-sol.org>
Message-ID: <20070203023916.739906000@sous-sol.org>
An embedded and charset-unspecified text was scrubbed...
Name: ib-mthca-fix-off-by-one-in-fmr-handling-on-memfree.patch
URL:
From kazeigan at yahoo.co.jp Fri Feb 2 18:37:39 2007
From: kazeigan at yahoo.co.jp ()
Date: Sat, 3 Feb 2007 11:37:39 +0900 (JST)
Subject: [openib-general]
=?ISO-2022-JP?B?g4GBW4OLgqCC6IKqgsaCpIKygrSCooLcgrWCvYH0?=
Message-ID: 20070203113738
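It's been a while. This is Mizuna.
Thank you for your email the other day.
I'm sorry for the late reply.
About my job, which you asked about in your last email...
I'm a full-time housewife.
I've been taking care of the house since last December, and that has kept me busy.
Housework is fun, but it does wear me out... (><
With this kind of life I don't get to meet anyone, and I often find myself wanting someone to lean on.
So, this may sound strange coming out of the blue, but
I'd like to meet you once and talk. Would that be a bother?
I'm 31 years old and live in Setagaya Ward.
I'd love to have a meal together and talk a lot♪
If possible, this weekend around Shinjuku or Shibuya would suit me.
How does that sound?
http://chu.punyu.jp/mizuna/
I've been using this site lately,
so could you email me from there?
I'm on mixi too, but I feel more at home here,
so this is the only site I use (^^;
I look forward to your reply.
Mizuna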
From xma at us.ibm.com Fri Feb 2 20:58:37 2007
From: xma at us.ibm.com (Shirley Ma)
Date: Fri, 2 Feb 2007 20:58:37 -0800
Subject: [openib-general] Multicast join group failure prevents IPoIB
performing
Message-ID:
When bringing the IPoIB interface up, I hit a default group multicast join
failure. (Could this be fixed in the SM setup?)
ib0: multicast join failed for xxxx, status -22
The interface was then UP but not RUNNING, so the nodes couldn't ping each
other. I think the interface should be UP and RUNNING
even when some multicast joins fail. I would like to provide a patch if
there is no objection. Please advise.
Thanks
Shirley Ma
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From eitan at sw053.yok.mtl.com Fri Feb 2 21:28:01 2007
From: eitan at sw053.yok.mtl.com (Eitan Zahavi)
Date: Sat, 3 Feb 2007 07:28:01 +0200
Subject: [openib-general] nightly osm_sim report 2007-02-03:normal completion
Message-ID: <200702030528.l135S13O000650@sw053.yok.mtl.com>
OSM Simulation Regression Summary
OpenSM rev = Fri_Feb_2_09:16:30_2007 db386c
ibutils rev = Wed_Jan_3_11:42:12_2007 913448
Total=410 Pass=410 Fail=0
Pass:
30 Stability IS1-16.topo
30 Pkey IS1-16.topo
30 OsmTest IS1-16.topo
30 OsmStress IS1-16.topo
30 Multicast IS1-16.topo
30 LidMgr IS1-16.topo
10 Stability IS3-loop.topo
10 Stability IS3-128.topo
10 Pkey IS3-128.topo
10 OsmTest IS3-loop.topo
10 OsmTest IS3-128.topo
10 OsmStress IS3-128.topo
10 Multicast IS3-loop.topo
10 Multicast IS3-128.topo
10 LidMgr IS3-128.topo
10 FatTree part-4-ary-3-tree.topo
10 FatTree merge-roots-reorder-4-ary-2-tree.topo
10 FatTree merge-roots-4-ary-2-tree.topo
10 FatTree merge-root-4-ary-3-tree.topo
10 FatTree merge-root-12-ary-2-tree.topo
10 FatTree merge-2-ary-4-tree.topo
10 FatTree half-4-ary-3-tree.topo
10 FatTree blend-4-ary-2-tree.topo
10 FatTree 4-ary-4-tree.topo
10 FatTree 4-ary-3-tree.topo
10 FatTree 32nodes-3lvl-is1.topo
10 FatTree 2-ary-4-tree.topo
10 FatTree 12-node-spaced.topo
10 FatTree 12-ary-2-tree.topo
Failures:
From vlad at lists.openfabrics.org Sat Feb 3 02:21:53 2007
From: vlad at lists.openfabrics.org (vlad at lists.openfabrics.org)
Date: Sat, 3 Feb 2007 02:21:53 -0800 (PST)
Subject: [openib-general] ofa_1_2_kernel 20070203-0200 daily build status
Message-ID: <20070203102154.36F92E607F9@openfabrics.org>
This email was generated automatically, please do not reply
Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod --with-addr_trans-mod --with-cxgb3-mod
Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.13
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.18
Passed on powerpc with linux-2.6.17
Passed on x86_64 with linux-2.6.15
Passed on powerpc with linux-2.6.19
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.13
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.14
Passed on powerpc with linux-2.6.16
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.14
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.15
Failed:
Build failed on ia64 with linux-2.6.16.21-0.8-default
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070203-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function ‘register_netevent_notifier’
/home/vlad/tmp/ofa_1_2_kernel-20070203-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function ‘addr_cleanup’:
/home/vlad/tmp/ofa_1_2_kernel-20070203-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function ‘unregister_netevent_notifier’
make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070203-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070203-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070203-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070203-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
From halr at voltaire.com Sat Feb 3 06:30:36 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 03 Feb 2007 09:30:36 -0500
Subject: [openib-general] OpenIB management libraries release 1.0.2
Message-ID: <1170513034.4525.15093.camel@hal.voltaire.com>
http://www.openfabrics.org/~halr/
md5sum
b9b4bdf899f1d0ff15e06915cd846a3a libibcommon-1.0.2.tar.gz
2af3ff7e38a1f49fb7514660a9991c89 libibmad-1.0.2.tar.gz
7d7690abfe9b08c8240fbf0157653b90 libibumad-1.0.2.tar.gz
From xma at us.ibm.com Sat Feb 3 08:54:41 2007
From: xma at us.ibm.com (Shirley Ma)
Date: Sat, 3 Feb 2007 09:54:41 -0700
Subject: [openib-general] Multicast join group failure prevents IPoIB
performing
In-Reply-To:
Message-ID:
According to IPoIB RFC 4391, section 5, once the IPoIB broadcast group has
been joined, the IPoIB link should be UP, since it is ready for data transfer.
The interface should be able to run for broadcast and unicast; it does not
need to wait for all multicast joins to succeed. Here is a patch that allows
the IPoIB interface to run without waiting for every multicast join to
succeed, such as the all-hosts group multicast join:
diff -urpN ipoib/ipoib_multicast.c ipoib-patch/ipoib_multicast.c
--- ipoib/ipoib_multicast.c 2006-11-29 13:57:37.000000000 -0800
+++ ipoib-patch/ipoib_multicast.c 2007-02-03 00:52:23.000000000 -0800
@@ -566,6 +566,7 @@ void ipoib_mcast_join_task(void *dev_ptr
if (!test_bit(IPOIB_MCAST_FLAG_ATTACHED, &priv->broadcast->flags)) {
ipoib_mcast_join(dev, priv->broadcast, 0);
+ netif_carrier_on(dev);
return;
}
@@ -599,7 +600,6 @@ void ipoib_mcast_join_task(void *dev_ptr
ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");
clear_bit(IPOIB_MCAST_RUN, &priv->flags);
- netif_carrier_on(dev);
}
int ipoib_mcast_start_thread(struct net_device *dev)
(See attached file: multicast.patch)
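
The behaviour argued for above can be illustrated with a small stand-alone model. This is only a sketch with mock types: mock_iface, mock_group, mock_join_task() and the join_ok flags are hypothetical names made up for this example and are not IPoIB code. It marks the carrier on as soon as the broadcast group is attached, whether or not the remaining multicast joins succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one multicast group. */
struct mock_group {
	bool join_ok;	/* would the SM accept this join? */
	bool attached;
};

/* Hypothetical model of an IPoIB interface. */
struct mock_iface {
	struct mock_group broadcast;
	struct mock_group *mcast;	/* other multicast groups */
	size_t n_mcast;
	bool carrier_on;		/* interface shows RUNNING */
};

/*
 * Join the broadcast group first. The carrier goes on once the
 * broadcast group is attached; failures of the other groups are
 * counted but do not affect the carrier.
 * Returns the number of groups that failed to join.
 */
static size_t mock_join_task(struct mock_iface *ifc)
{
	size_t failed = 0;
	size_t i;

	assert(ifc != NULL);

	ifc->broadcast.attached = ifc->broadcast.join_ok;
	if (!ifc->broadcast.attached) {
		ifc->carrier_on = false;
		return 1;
	}

	ifc->carrier_on = true;

	for (i = 0; i < ifc->n_mcast; i++) {
		ifc->mcast[i].attached = ifc->mcast[i].join_ok;
		if (!ifc->mcast[i].attached)
			failed++;
	}
	return failed;
}
```

With one failing non-broadcast group, the model reports one failure but still leaves carrier_on set, which is the UP and RUNNING state asked for in this thread.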
http://www.rfc-editor.org/rfc/rfc4391.txt
5. Setting Up an IPoIB Link
The broadcast-GID, as defined in the previous section, MUST be set up
for an IPoIB subnet to be formed. Every IPoIB interface MUST
"FullMember" join the IB multicast group defined by the broadcast-
GID. This multicast group will henceforth be referred to as the
broadcast group. The join operation returns the MTU, the Q_Key, and
other parameters associated with the broadcast group. The node then
associates the parameters received as a result of the join operation
with its IPoIB interface. The broadcast group also serves to provide
a link-layer broadcast service for protocols like ARP, net-directed,
subnet-directed, and all-subnets-directed broadcasts in IPv4 over IB
networks.
The join operation is successful only if the Subnet Manager (SM)
determines that the joining node can support the MTU registered with
the broadcast group [RFC4392] ensuring support for a common link MTU.
The SM also ensures that all the nodes joining the broadcast-GID have
paths to one another and can therefore send and receive unicast
packets. It further ensures that all the nodes do indeed form a
multicast tree that allows packets sent from any member to be
replicated to every other member. Thus, the IPoIB link is formed by
the IPoIB nodes joining the broadcast group. There is no physical
demarcation of the IPoIB link other than that determined by the
broadcast group membership.
Shirley Ma
From: Shirley Ma/Beaverton/IBM@IBMUS
Sent by: openib-general-bounces at openib.org
Date: 02/02/07 08:58 PM
To: openib-general at openib.org
Subject: [openib-general] Multicast join group failure prevents IPoIB performing
When bringing an IPoIB interface up, I hit a default group multicast join
failure. (Could this be fixed in the SM setup?)
ib0: multicast join failed for xxxx, status -22
Then the interface was UP but not RUNNING, so the nodes couldn't ping each
other. I think the right behavior is for the interface to be UP and RUNNING
even if some multicast joins fail. I would like to provide a patch if
there is no problem. Please advise.
Thanks
Shirley Ma
-------------- next part --------------
A non-text attachment was scrubbed...
Name: multicast.patch
Type: application/octet-stream
Size: 684 bytes
Desc: not available
URL:
From bugzilla-daemon at lists.openfabrics.org Sat Feb 3 23:07:21 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Sat, 3 Feb 2007 23:07:21 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070204070721.CAE32E607F9@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #14 from erezz at voltaire.com 2007-02-03 23:07 -------
(In reply to comment #13)
> I want to ask how I can apply the patch while the build.sh script runs.
> As far as I know, when I run build.sh my patched files are always overwritten
> by rpm -i openib-1.1.src.rpm. How can I apply my patches, or do I need to
> wait for a new release?
>
(In reply to comment #11)
> (In reply to comment #10)
> > What is the output of uname -r ? This is VERY important. Also, can you run
> `cat /etc/issue` and send the results?
> > >
>
> As you can see in my first message, I gave my machine configuration:
> >The machine configuration:
> >Kernel: Linux 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64
> x86_64 x86_64 GNU/Linux
> >OS: SUSE Linux Enterprise Server 10 (x86_64)
> >gcc version: gcc (GCC) 4.1.0 (SUSE Linux)
>
> Unfortunately my machine doesn't have the Linux version in /etc/issue because
> that is not allowed by IT requirements.
Why? OFED 1.1 expects that you don't change this file. This is how SuSE ships
it with SLES 10.
> I looked at the ofed_scripts/configure file
> and saw that configure needs the file /etc/issue to choose the right patches.
> I think that is not a good idea: it should first run
> cat /etc/*release* and look for the Linux version there, and only after
> that check (if necessary) the file /etc/issue.
>
I don't understand the problem.
From bugzilla-daemon at lists.openfabrics.org Sat Feb 3 23:14:59 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Sat, 3 Feb 2007 23:14:59 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070204071459.64292E607F9@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
erezz at voltaire.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Resolution| |INVALID
------- Comment #15 from erezz at voltaire.com 2007-02-03 23:14 -------
(In reply to comment #13)
> I want to ask how I can apply the patch while the build.sh script runs.
I don't agree with your patch. It assumes that SLES 10 may be corrupted. OFED
should not try to support this. If you want to use this patch for your own
purposes, just apply it (manually) before running OFED build scripts. OFED's
backport patches mechanism is not suitable for such patches.
> As far as I know, when I run build.sh my patched files are always overwritten
> by rpm -i openib-1.1.src.rpm. How can I apply my patches, or do I need to
> wait for a new release?
>
From ogerlitz at voltaire.com Sun Feb 4 00:13:40 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Sun, 04 Feb 2007 10:13:40 +0200
Subject: [openib-general] Detecting when an RDMA writer process
disappears
In-Reply-To: <45C2C7B1.7090204@evergrid.com>
References: <45C2C7B1.7090204@evergrid.com>
Message-ID: <45C595B4.3000700@voltaire.com>
Mike Heffner wrote:
> Is there any method by which a receiving process that is polling in
> preregistered memory regions for data from a sender performing RDMA
> writes, can detect if the sender is killed? Say by a SIGKILL signal? The
> RC connection is setup using the RDMA CM and there do not appear to be
> any CM events created on the event channel
If you have a process with a connected RDMA CM ID whose associated peer
process died, you should get a DISCONNECTED event. How do you verify that
there is no RDMA CM event present at the polling side?
Or.
From ogerlitz at voltaire.com Sun Feb 4 00:32:02 2007
From: ogerlitz at voltaire.com (Or Gerlitz)
Date: Sun, 04 Feb 2007 10:32:02 +0200
Subject: [openib-general] ip_ib_mc_map?
In-Reply-To: <15ddcffd0702011518qf115aaey862ef168784e81ca@mail.gmail.com>
References: <1170275331.14294.1.camel@stevo-desktop>
<45C1ABD0.5090404@voltaire.com>
<1170325052.2716.229.camel@fc6.xsintricity.com>
<15ddcffd0702011240l3c427bfcx6fcc7f7968fcf8b9@mail.gmail.com>
<1170368361.2716.239.camel@fc6.xsintricity.com>
<15ddcffd0702011518qf115aaey862ef168784e81ca@mail.gmail.com>
Message-ID: <45C59A02.6080900@voltaire.com>
Or Gerlitz wrote:
> On 2/2/07, Doug Ledford wrote:
>> Yeah, I've got a setup, I just don't have any multicast tests that I
>> run. Any test programs you have for multicast in particular would be
>> helpful.
> This is fairly simple to do: have some multicast traffic routed over
> an IPoIB subnet on two nodes, eg using
>
> $ route add -net 224.0.0.0 netmask 255.0.0.0 dev ib0
> $ iperf -usB 224.5.5.5 -i 1
OK, verifying that the problem is gone by running a client/server test is
actually harder, since while the problem persists the data is simply moved on
the broadcast group... so basically, the first thing you want to do is set up
routing, then open an iperf server and see whether the netstack has computed
a correct IPoIB multicast hw address and instructed the device to use it.
> # iperf -usB 224.5.5.5 &
This is on U3: the stack correctly computed the hw addresses for 224.5.5.5
and 224.0.0.1.
> # ip maddr show ib0
> 5: ib0
> link 00:ff:ff:ff:ff:12:40:1b:00:00:00:00:00:00:00:00:00:05:05:05
> link 00:ff:ff:ff:ff:12:40:1b:00:00:00:00:00:00:00:00:00:00:00:01
> inet 224.5.5.5
> inet 224.0.0.1
This is on U4: the stack did not compute any hw addresses for 224.5.5.5
and 224.0.0.1. The inet addresses come from /proc/net/igmp, which
means the stack is aware that this node joined these groups, but as we know the
ARPHRD_INFINIBAND case was removed from the code computing a multicast
link layer address...
> # ip maddr show ib0
> 8: ib0
> inet 224.5.5.5
> inet 224.0.0.1
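For reference, the mapping that produces the U3 link addresses above can be sketched in user space as follows. This is only an illustration: the byte layout is inferred from the ip maddr output quoted above and from the RFC 4391 IPv4 multicast mapping, the P_Key bytes are left at zero as they appear in that output, and the names ipv4_to_ipoib_mcast and format_hw_addr are made up for this example (they are not the kernel's functions).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define IPOIB_HW_ADDR_LEN 20

/* Map an IPv4 multicast group (host byte order) to a 20-byte IPoIB
 * link-layer address: 4 bytes of flags/QPN followed by a 16-byte MGID. */
static void ipv4_to_ipoib_mcast(uint32_t group,
                                unsigned char out[IPOIB_HW_ADDR_LEN])
{
    assert((group >> 28) == 0xE);      /* must be a class D address */

    memset(out, 0, IPOIB_HW_ADDR_LEN); /* reserved byte, P_Key, padding */
    out[1] = 0xff;                     /* multicast QPN 0xffffff */
    out[2] = 0xff;
    out[3] = 0xff;
    out[4] = 0xff;                     /* MGID: multicast prefix */
    out[5] = 0x12;                     /* flags and link-local scope */
    out[6] = 0x40;                     /* IPv4 signature 0x401b */
    out[7] = 0x1b;
    /* out[8..9] would carry the P_Key; left zero as in the output above */
    out[16] = (group >> 24) & 0x0f;    /* low 28 bits of the group address */
    out[17] = (group >> 16) & 0xff;
    out[18] = (group >> 8) & 0xff;
    out[19] = group & 0xff;
}

/* Render the address in the colon-separated form printed by ip maddr. */
static void format_hw_addr(const unsigned char *addr, char *buf, size_t buflen)
{
    size_t pos = 0;

    assert(buflen >= 3 * IPOIB_HW_ADDR_LEN);
    for (int i = 0; i < IPOIB_HW_ADDR_LEN; i++)
        pos += (size_t)snprintf(buf + pos, buflen - pos,
                                i ? ":%02x" : "%02x", addr[i]);
}
```

Calling ipv4_to_ipoib_mcast(0xE0050505, hw) for 224.5.5.5 and formatting the result should give the first link line shown in the U3 output, which is what a fixed U5 node is expected to print as well.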
So basically, if your U5-staged node gives the same
# ip maddr show output as U3, we have made progress. Really verifying
that this traffic does not go over the broadcast group is a little bit
harder: you would need a third active IPoIB device (another node,
or a second running IPoIB device on the rx machine, e.g. ib1), run the
iperf multicast test, and make sure the rx counters of the third
device do not advance, whereas on U3 they would advance since all
mcast traffic goes on the broadcast channel.
Please let me know if you need any further clarifications on how to test
this, and... thanks! for taking care of it.
Or.
From vlad at lists.openfabrics.org Sun Feb 4 02:22:23 2007
From: vlad at lists.openfabrics.org (vlad at lists.openfabrics.org)
Date: Sun, 4 Feb 2007 02:22:23 -0800 (PST)
Subject: [openib-general] ofa_1_2_kernel 20070204-0200 daily build status
Message-ID: <20070204102223.9F1DDE607F9@openfabrics.org>
This email was generated automatically, please do not reply
Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod --with-addr_trans-mod --with-cxgb3-mod
Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.12
Passed on powerpc with linux-2.6.17
Passed on powerpc with linux-2.6.19
Passed on ia64 with linux-2.6.19
Passed on powerpc with linux-2.6.18
Passed on x86_64 with linux-2.6.19
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.12
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.13
Passed on x86_64 with linux-2.6.17
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.12
Passed on powerpc with linux-2.6.16
Passed on ppc64 with linux-2.6.13
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.17
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.14
Failed:
Build failed on ia64 with linux-2.6.16.21-0.8-default
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070204-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function ‘register_netevent_notifier’
/home/vlad/tmp/ofa_1_2_kernel-20070204-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function ‘addr_cleanup’:
/home/vlad/tmp/ofa_1_2_kernel-20070204-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function ‘unregister_netevent_notifier’
make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070204-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070204-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070204-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070204-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
From mst at mellanox.co.il Sun Feb 4 02:57:57 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 12:57:57 +0200
Subject: [openib-general] IPoIB CM for merge?
In-Reply-To: <1170431671.26115.25.camel@stevo-desktop>
References: <1170431671.26115.25.camel@stevo-desktop>
Message-ID: <20070204105757.GC8630@mellanox.co.il>
> Quoting Steve Wise :
> Subject: Re: [openib-general] IPoIB CM for merge?
>
> On Fri, 2007-02-02 at 13:15 +0200, Michael S. Tsirkin wrote:
> > > Quoting Roland Dreier :
> > > Subject: Re: IPoIB CM for merge?
> > >
> > > > Could you please spend some time reviewing IPoIB CM code?
> > > > I am concerned about missing the 2.6.21 merge window.
> > >
> > > Thanks for the reminder.
> > >
> > > Can we trade? Have you looked at the cxgb3 iwarp driver? Any comments?
> >
> > OK.
> > I am not sure I have the last version posted so I am going to go by what
> > is there in OFED git tree.
> >
> > And I also only looked under drivers/infiniband/.
> >
> > So, here are some questions: I looked in the archives and have not seen
> > these addressed. Maybe these can be answered and then I'll go from there?
> > Does this sound OK?
> >
> > Files with names like
> > ./core/cxio_hal.c
> > ./core/cxio_hal.h
> > normally generate a fair bit of discussion which wasn't present here,
> > I did not guess everyone was just busy.
> > For example, why is there both struct iwch_cq and struct t3_cq?
> >
>
> The cxgb3/core code defines a low level interface to the RDMA bits of
> the T3 device.
>
> This code was originally a separate module (named cxio) that allowed
> other RDMA middleware layers to sit on top of this core RDMA module.
> At the time, there was RNIC-PI and OFA being developed. So that is the
> history of this. As per the first openib review (about a year ago) of
> this code I merged this core module into the cxgb3 module. I left the
> file structure and names as-is because it was low priority IMO.
>
> The t3_cq struct is the low level CQ structure used to manage both a HW
> accessed CQ and a SW CQ (needed to handle error cases and out of order
> completions). The iwch_cq struct contains the stuff needed to integrate
> with the OFA core and uverbs code. It contains a t3_cq inline.
So now that there's a common module, there's no technical reason for
the two-level structure to exist? I would say you want to at least
move the files into a common directory.
I think you will also find that for datapath operations such as poll cq,
converting a completion from hardware format to struct t3_cqe, and from
that to ib_wc, adds a non-trivial amount of overhead.
> > File tcb.h comment says:
> > /* This file is automatically generated --- do not edit */
> > This looks like a GPL violation, does it not?
> >
>
> I can add the license if that's what you mean.
I mean that this file does not seem to be the source, in the GPL sense.
The following comes from COPYING under linux source directory:
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable.
So I think you must make the actual source available under the terms of GPL.
> > What's the deal with the naming convention?
> > Is there a reason in cxgb3, some files start with iwch and some with cxio?
> > How about using cxgb3 prefix all over?
>
> The cxio_ prefix is used for the low-level functions/types that talk
> directly with the HW. iwch_ is the provider driver functions that
> interface with the OFA stack. I'd rather not change the names.
> Especially since this has already gone through several review cycles.
> I'm hoping we can get this in and improve it with subsequent
> submissions. Is that reasonable?
--
MST
From monis at voltaire.com Sun Feb 4 04:21:02 2007
From: monis at voltaire.com (Moni Shoua)
Date: Sun, 04 Feb 2007 14:21:02 +0200
Subject: [openib-general] IB/mthca: question about HCA profile module
parameters
In-Reply-To: <45C1C3D5.1050301@dev.mellanox.co.il>
References: <45C1C3D5.1050301@dev.mellanox.co.il>
Message-ID: <45C5CFAE.9000302@voltaire.com>
Dotan Barak wrote:
> Hi Moni.
>
> I tried to use the mthca module parameter: for example i tried to change
> the number of QPs.
>
> I got several failures when i used the HCA 25204:
> * sometimes i got the following error message (when using big values,
> for example 512K QPs):
> ib_mthca: 0000:0c: INIT_HCA command failed aborting.
> ib_mthca: probe of 0000:0c: failed with error -16
> * when I tried to use a small number of QPs (1024) the machine just hung
> and I noticed a kernel oops message on the console
>
>
> Did you verify the HCA profile module parameter feature?
> Is there any known limitation on the values that should be used?
> (for example: only values which are power of two)
>
>
> thanks
> Dotan
>
Hi Dotan,
I verified the profile feature up to the level of successful modprobe.
I am working now to look into your report.
thanks
From mst at mellanox.co.il Sun Feb 4 04:58:20 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 14:58:20 +0200
Subject: [openib-general] IPoIB connected mode review comments
In-Reply-To: <1170443893.26115.59.camel@stevo-desktop>
References: <1170443893.26115.59.camel@stevo-desktop>
Message-ID: <20070204125820.GA14288@mellanox.co.il>
> Quoting Steve Wise :
> Subject: IPoIB connected mode review comments
>
> On Thu, 2007-02-01 at 20:48 -0800, Roland Dreier wrote:
> > > Have you had a chance to review this?
> >
> > Still on my list.
> >
> > Can we trade? Can you look at the IPoIB connected mode stuff in the
> > ipoib-cm branch in
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git
> >
> > and let me know if you see anything you don't like?
> >
> > - R.
>
> Here are my comments. I'm not an ib cm expert though. These are mostly
> questions:
Steve, thanks for looking at the code!
I hope the following answers your questions.
>
> Since IPoIB is using IP addresses already, wouldn't it be simpler to use
> the rdma cm to setup connections?
IPoIB is not using IP addresses. It uses hardware addresses as any network
device would. So using rdma cm does not make sense.
> Could you optimize this design and only signal some of the tx wrs?
This optimization would apply to UD mode too.
No one so far came up with a way to do this cleanly.
> In ipoib_cm_send() you call ipoib_cm_skb_too_long() if the packet is too
> large for the interface mtu. And you print a warning. But
> ipoib_cm_skb_too_long() actually queues the packet for the cm case. For
> ud it just drops the packet. The skb task for cm then will send a
> ICMP_DEST_UNREACH for these packets. Why the difference?
For UD I just kept the current behaviour - I think
this can actually only happen in case of a race when packet was queued
before MTU was changed, so the originator was already notified of
the MTU change by the stack above us.
For CM the local MTU may exceed the size of a buffer that was posted on
the remote QP. So we need to send ICMP_DEST_UNREACH to reduce the
originator's dest MTU to whatever this QP actually can support.
Since this needs the original skb, and must be done from task or bh context,
we queue the skb and handle it in task context.
> Also if this
> packet came from the local stack via a local application, you don't want
> to send DEST_UNREACH, right? (I'm probably just confused about the
> purpose of this).
Yes, sending DEST_UNREACH does not seem to affect the local interface. That's why
I call update_pmtu too. It is also good to update the MTU ASAP to reduce the
number of packets that are dropped, and update_pmtu can be called from
atomic context. I do not know how to tell that the packet is from the local
stack, and it does not seem to do any harm to handle all packets in a uniform
manner.
net/ipv4/ip_gre.c and net/ipv4/ipip.c are examples of code that does something
similar.
> In ipoib_cm_tx_completion() you rearm, then drain the cq. I thought
> there was some reason that it was better to do drain/rearm/drain?
> Something about if you rearm and there's a cq entry mthca does another
> immediate interrupt?
Again, this comment applies to UD mode as well.
AFAIK so far this worked best.
> In ipoib_cm_handle_tx_wc():
>
> When can a tx completion happen with a wr_id that isn't within the
> ipoib_sendq_size range? This looks like it is really a bug condition
> that should never happen.
Because of this:
post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1))
so wr_id is always within range.
Again, this is exactly the same logic as in UD case.
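The masking in that post_send() call can be illustrated in isolation. The sketch below is not the driver code; ring_slot and is_power_of_two are hypothetical helper names, and it assumes the ring size is a power of two, which is what makes the mask equivalent to a modulo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when n has exactly one bit set. */
static bool is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Map a free-running head counter to a slot in a ring of ring_size
 * entries. The result is always in [0, ring_size - 1], so an id derived
 * from it can never fall outside the ring. */
static uint32_t ring_slot(uint32_t head, uint32_t ring_size)
{
    assert(is_power_of_two(ring_size));
    return head & (ring_size - 1);
}
```

Because the head counter is only ever used through the mask, it can be left to wrap around naturally; the slot sequence stays contiguous across the wrap as long as the ring size divides 2^32.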
> I see the same code in the rx completion path too.
It's even simpler there:
+ for (i = 0; i < ipoib_recvq_size; ++i) {
...
+ if (ipoib_cm_post_receive(dev, i)) {
...
+ }
+ }
So i is always within RX size range.
> Also, what's up with the /* FIXME */ comment?
Since I have QPs which I never post send WRs on, I should be able to set
.cap.max_send_wr to 0 and .cap.max_send_sge should not matter.
However, low level drivers do not seem to support this at the moment, so
I set these to 1 for now - this is also correct but has a small memory cost.
> You lock the priv->lock inside of the priv->tx_lock. Is this ordering
> correct and consistent across all the code?
Yes, that's the nesting rule.
> ipoib_cm_handle_rx_wc() - what's up with the XXX comment?
We have the same comment in UD code - that's where this comes from.
Basically we don't have an easy way to know the correct packet type,
and always setting it to PACKET_HOST seems to work.
> What's the algorithm to keep enough buffers posted in the SRQ?
Same as with UD really - if I can't allocate a new skb I repost
the old one and increment the dropped packet counter.
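The refill policy described here can be sketched in plain C. This is a simplified user-space model, not the IPoIB code: rx_slot, rx_stats and rx_refill are hypothetical names, buffers are opaque pointers, and the caller is assumed to have tried the allocation already and to pass NULL when it failed.

```c
#include <assert.h>
#include <stddef.h>

struct rx_slot {
    void *buf;                  /* buffer currently posted in this slot */
};

struct rx_stats {
    unsigned long rx_dropped;   /* packets dropped for lack of a buffer */
};

/* Handle a receive completion on one slot.
 *
 * fresh is a newly allocated replacement buffer, or NULL if allocation
 * failed. On success the slot takes the fresh buffer and the filled one
 * is returned for delivery. On failure the old buffer stays in the slot
 * (it is reposted), the packet counts as dropped, and NULL is returned. */
static void *rx_refill(struct rx_slot *slot, void *fresh,
                       struct rx_stats *stats)
{
    void *filled;

    assert(slot != NULL && stats != NULL);

    if (fresh == NULL) {
        stats->rx_dropped++;
        return NULL;
    }
    filled = slot->buf;
    slot->buf = fresh;
    return filled;
}
```

The point of the design is that the number of posted buffers never shrinks: an allocation failure costs one packet, not one queue entry.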
--
MST
From mst at mellanox.co.il Sun Feb 4 05:06:06 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 15:06:06 +0200
Subject: [openib-general] IPoIB CM for merge?
In-Reply-To:
References:
Message-ID: <20070204130606.GB14288@mellanox.co.il>
> Quoting Pradeep Satyanarayana :
> Subject: Re: [openib-general] IPoIB CM for merge?
>
>
> Hello Michael,
>
> Here are a few more observations :
Pradeep, I think you are posting in the wrong thread: it seems you are not
talking about my code, but rather about the project you mentioned of
implementing IPoIB CM without SRQ.
IPoIB CM currently falls back on UD mode for HCAs that do not support SRQ,
so there should be no problem for the ehca - as new code won't be activated.
As I said already, I do not see a clean way to address this limitation,
so I would rather have current IPoIB CM code merged upstream first, and think
about enhancements later.
>
> 1. For the SRQ case, the skbs and receive buffers are posted during init,
> even before the rx_qp is created. This causes a problem (at least for
> non-SRQ) for the ehca. We need to call ipoib_cm_alloc_skb() and
> ipoib_cm_post_receive() after the rx_qp is in the RTR state.
>
> 2. Also found that in ipoib_cm_create_rx_qp() one needs to initialize
> .cap.max_recv_wr and .cap.max_recv_sge. Otherwise this leads to problems
> such as RQ overflows, which cause communication failures.
Yes, I think these are some of the things that would need to be done to make IPoIB CM
work without SRQ. It is clearly not something we want to do for the SRQ case, however:
for example, posting WRs to the SRQ during connection setup would race
against completion events for other connections. And assigning .cap.max_recv_wr > 0
for a QP that is attached to an SRQ does not make sense, and might conceivably confuse
low level drivers.
--
MST
From mst at mellanox.co.il Sun Feb 4 05:07:18 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 15:07:18 +0200
Subject: [openib-general] ofa_1_2_kernel 20070202-0200 daily buildstatus
In-Reply-To: <1170430869.26115.12.camel@stevo-desktop>
References: <1170430869.26115.12.camel@stevo-desktop>
Message-ID: <20070204130718.GC14288@mellanox.co.il>
> Quoting Steve Wise :
> Subject: Re: [openib-general] ofa_1_2_kernel 20070202-0200 daily buildstatus
>
> On Fri, 2007-02-02 at 02:20 -0800, vlad at lists.openfabrics.org wrote:
> > This email was generated automatically, please do not reply
>
> Which distro is 2.6.16.21-0.8-default? I'm sure I didn't do a netevent
> backport for that.
That's SLES10 actually.
> Failed:
> Build failed on ia64 with linux-2.6.16.21-0.8-default
> Log:
> /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function ‘register_netevent_notifier’
> /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function ‘addr_cleanup’:
> /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function ‘unregister_netevent_notifier’
> make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
> make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
> make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
> make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
> make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
> make: *** [kernel] Error 2
--
MST
From mst at mellanox.co.il Sun Feb 4 05:14:14 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 15:14:14 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <1170429504.26115.1.camel@stevo-desktop>
References: <1170429504.26115.1.camel@stevo-desktop>
Message-ID: <20070204131414.GD14288@mellanox.co.il>
> Quoting Steve Wise :
> Subject: Re: [PATCH 00/12] ofed_1_2 - Neighbour update support
>
> On Fri, 2007-02-02 at 08:03 +0200, Michael S. Tsirkin wrote:
> > > We could use a global refcnt to count the number of pending destructions
> > > and use a completion object to block unload until all the destructors
> > > fire and the refcnt goes to zero.
> >
> > It has the same race as module refcnt. So just use that.
> >
>
> I don't understand the race. Can you explain please? This should be
> able to be done without a race with a refcnt, a spinlock, a bit saying
> we're unloading, and a completion object.
>
> But maybe I'm confused ;-)
In short, the rule is that you can't pass a pointer to your function
to another module and then unload your module safely without synchronizing
with that other module.
Simplified example:
destructor
{
	complete(&foo);
A:
	return;
}
module_cleanup:
{
	wait(foo)
	return;
}
Now, assume destructor runs up to point A, then your module unloads,
and the memory its text occupied is overwritten by something else.
An attempt to execute code from point A will now crash.
So completion is not better than just module refcount here.
That said, I think the race is unlikely and just using module
refcount should be sufficient, and it's certainly simple.
--
MST
From mst at mellanox.co.il Sun Feb 4 05:15:00 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 15:15:00 +0200
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <1170355331.16637.25.camel@stevo-desktop>
References: <1170355331.16637.25.camel@stevo-desktop>
Message-ID: <20070204131500.GE14288@mellanox.co.il>
> Quoting Steve Wise :
> Subject: Re: [PATCH] RE: regression in ofed 1.2
>
> Um, now on rhel4u4 we crash creating the mcast workqueue.
>
> The name is "ib_mcast_wq" which is too long for older kernels.
>
> Did we lose a backport patch?
Not sure what happened here.
Sean, could you rename ib_mcast_wq to ib_mcast please?
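A quick way to check candidate names is sketched below. Note the assumption: the thread only says the name is too long for older kernels; the 10-character limit used here is my recollection of a check in some older 2.6 workqueue code and should be verified against the kernel in question. wq_name_fits and ASSUMED_WQ_NAME_MAX are hypothetical names for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* ASSUMPTION: older 2.6 kernels reject workqueue names longer than this.
 * The exact limit is not stated in this thread. */
#define ASSUMED_WQ_NAME_MAX 10

/* True when name would be accepted under the assumed limit. */
static bool wq_name_fits(const char *name)
{
    assert(name != NULL);
    return strlen(name) <= ASSUMED_WQ_NAME_MAX;
}
```

Under that assumption "ib_mcast_wq" (11 characters) is one character over the limit, while the proposed "ib_mcast" (8 characters) fits.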
--
MST
From mst at mellanox.co.il Sun Feb 4 06:00:19 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 16:00:19 +0200
Subject: [openib-general] [RFC] [PATCH] ib_usa: export multicast and
informinfo registration to userspace
In-Reply-To: <000001c74726$94d0f500$e598070a@amr.corp.intel.com>
References: <000001c74726$94d0f500$e598070a@amr.corp.intel.com>
Message-ID: <20070204140019.GC18543@mellanox.co.il>
+static void usa_remove_one(struct ib_device *device)
+{
+ struct usa_device *dev;
+
+ dev = ib_get_client_data(device, &usa_client);
+ if (!dev)
+ return;
+
+ mutex_lock(&usa_mutex);
+ list_del(&dev->list);
+ mutex_unlock(&usa_mutex);
+
+ /* TODO: force immediate device removal */
+ put_dev(dev);
+ wait_for_completion(&dev->comp);
+ kfree(dev);
+}
I think we really need to address this TODO.
An application waiting for data from SA needs to get woken up and get
an error code indicating that the device was removed.
This is currently broken in umad, but let's do it correctly here.
--
MST
From mst at mellanox.co.il Sun Feb 4 06:02:49 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 16:02:49 +0200
Subject: [openib-general] Detecting when an RDMA writer process
disappears
In-Reply-To: <45C595B4.3000700@voltaire.com>
References: <45C2C7B1.7090204@evergrid.com>
<45C595B4.3000700@voltaire.com>
Message-ID: <20070204140249.GD18543@mellanox.co.il>
> Quoting Or Gerlitz :
> Subject: Re: Detecting when an RDMA writer process disappears
>
> Mike Heffner wrote:
> > Is there any method by which a receiving process that is polling in
> > preregistered memory regions for data from a sender performing RDMA
> > writes, can detect if the sender is killed? Say by a SIGKILL signal? The
> > RC connection is setup using the RDMA CM and there do not appear to be
> > any CM events created on the event channel
>
> If you have a process with a connected RDMA CM ID whose associated peer
> process died, you should get a DISCONNECTED event. How do you verify that
> there is no RDMA CM event present at the polling side?
You may or may not get this event in case of packet loss - same as with sockets.
Sending keepalives is really the only way if you want to handle all
cases such as remote node crash.
--
MST
From vlad at mellanox.co.il Sun Feb 4 06:34:25 2007
From: vlad at mellanox.co.il (Vladimir Sokolovsky)
Date: Sun, 04 Feb 2007 16:34:25 +0200
Subject: [openib-general] openib diags installation issue
Message-ID: <1170599665.5887.14.camel@vladsk-laptop>
Hi Hal,
I hit the following failure while executing 'make DESTDIR=/var/tmp/OFED install'.
A patch that fixes it follows the log.
/usr/bin/install -c -m 644 './man/ibprintca.8' '/var/tmp/OFED/usr/local/ofed/share/man/man8/ibprintca.8'
/usr/bin/install -c -m 644 './man/ibfindnodesusing.8' '/var/tmp/OFED/usr/local/ofed/share/man/man8/ibfindnodesusing.8'
make install-data-hook
make[3]: Entering directory `/var/tmp/OFEDRPM/BUILD/ofa_user-1.2/src/userspace/management/diags'
for script in scripts/ibqueryerrors.pl scripts/ibswportwatch.pl scripts/iblinkinfo.pl scripts/ibprintswitch.pl scripts/ibprintca.pl scripts/ibfindnodesusing.pl; do \
binname=`echo $script | sed -e "s/scripts\/\(.*\)/\1/"`; \
cat $script | sed -e "s,use lib \"\(/lib/perl\)\";,use lib \"/usr/local/ofed\1\";," > /usr/local/ofed/bin/$binname; \
chmod 755 /usr/local/ofed/bin/$binname; \
done
/bin/bash: line 2: /usr/local/ofed/bin/ibqueryerrors.pl: No such file or directory
chmod: cannot access `/usr/local/ofed/bin/ibqueryerrors.pl': No such file or directory
/bin/bash: line 2: /usr/local/ofed/bin/ibswportwatch.pl: No such file or directory
chmod: cannot access `/usr/local/ofed/bin/ibswportwatch.pl': No such file or directory
/bin/bash: line 2: /usr/local/ofed/bin/iblinkinfo.pl: No such file or directory
chmod: cannot access `/usr/local/ofed/bin/iblinkinfo.pl': No such file or directory
/bin/bash: line 2: /usr/local/ofed/bin/ibprintswitch.pl: No such file or directory
chmod: cannot access `/usr/local/ofed/bin/ibprintswitch.pl': No such file or directory
/bin/bash: line 2: /usr/local/ofed/bin/ibprintca.pl: No such file or directory
chmod: cannot access `/usr/local/ofed/bin/ibprintca.pl': No such file or directory
/bin/bash: line 2: /usr/local/ofed/bin/ibfindnodesusing.pl: No such file or directory
chmod: cannot access `/usr/local/ofed/bin/ibfindnodesusing.pl': No such file or directory
make[3]: *** [install-data-hook] Error 1
make[3]: Leaving directory `/var/tmp/OFEDRPM/BUILD/ofa_user-1.2/src/userspace/management/diags'
make[2]: *** [install-data-am] Error 2
make[2]: Leaving directory `/var/tmp/OFEDRPM/BUILD/ofa_user-1.2/src/userspace/management/diags'
make[1]: *** [install-am] Error 2
make[1]: Leaving directory `/var/tmp/OFEDRPM/BUILD/ofa_user-1.2/src/userspace/management/diags'
make: *** [install_diags] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.37589 (%install)
Patch for fixing the issue above:
diff --git a/diags/Makefile.am b/diags/Makefile.am
index 06b21fc..81ece28 100644
--- a/diags/Makefile.am
+++ b/diags/Makefile.am
@@ -150,9 +150,9 @@ dist-hook: diags.spec
install-data-hook:
for script in $(IB_SW_COUNT_DEPENDANT); do \
binname=`echo $$script | sed -e "s/scripts\/\(.*\)/\1/"`; \
- cat $$script | sed -e "s,use lib \"\(/lib/perl\)\";,use lib \"$(prefix)\1\";," > $(bindir)/$$binname; \
- chmod 755 $(bindir)/$$binname; \
+ cat $$script | sed -e "s,use lib \"\(/lib/perl\)\";,use lib \"$(prefix)\1\";," > $(DESTDIR)$(bindir)/$$binname; \
+ chmod 755 $(DESTDIR)$(bindir)/$$binname; \
done
- $(top_srcdir)/config/install-sh -m 755 -d $(prefix)/lib/perl
- $(top_srcdir)/config/install-sh -m 755 scripts/IBswcountlimits.pm $(prefix)/lib/perl
+ $(top_srcdir)/config/install-sh -m 755 -d $(DESTDIR)$(prefix)/lib/perl
+ $(top_srcdir)/config/install-sh -m 755 scripts/IBswcountlimits.pm $(DESTDIR)$(prefix)/lib/perl
From monis at voltaire.com Sun Feb 4 06:57:14 2007
From: monis at voltaire.com (Moni Shoua)
Date: Sun, 04 Feb 2007 16:57:14 +0200
Subject: [openib-general] IB/mthca: question about HCA profile module
parameters
In-Reply-To: <45C1C3D5.1050301@dev.mellanox.co.il>
References: <45C1C3D5.1050301@dev.mellanox.co.il>
Message-ID: <45C5F44A.9020802@voltaire.com>
Dotan Barak wrote:
> Hi Moni.
>
> I tried to use the mthca module parameter; for example, I tried to change
> the number of QPs.
>
> I got several failures when I used the HCA 25204:
> * sometimes I got the following error message (when using big values,
> for example 512K QPs):
> ib_mthca: 0000:0c: INIT_HCA command failed aborting.
> ib_mthca: probe of 0000:0c: failed with error -16
> * when I tried to use a small number of QPs (1024) the machine just hung
> and I noticed a kernel oops message on the console
>
OK. So I ran more tests on my setup, which now includes:
- Dual x86_64 processor (Intel Xeon)
- 1GB RAM
- 25204 HCA - fw_ver=1.1.0
For num_qp values in the range 16K to 256K I got no errors.
For lower and higher values I got errors from INIT_HCA and (not always, and only for very low values) a machine hang.
Do you have the Oops saved somewhere? Can you post it here, please?
>
> Did you verify the HCA profile module parameter feature?
As I mentioned earlier, I verified that non-default values can be assigned
and that the HCA works for some selected values.
I also noticed that illegal values cause the driver to write a message to the kernel log.
However, I didn't test the exact behaviour of all possible values for each profile variable.
> Is there any known limitation on the values that should be used?
> (for example: only values which are a power of two)
>
>
I guess it is clear that there are hardware limitations that rule out some values.
Unfortunately, even after looking for them in the PRM, I couldn't figure out what they are.
The software requires the value to be a power of 2 and corrects users who set any other value (to the nearest power of 2). In that case a warning message is written to the kernel log.
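To make the correction rule concrete, here is a small sketch of rounding a requested value to the nearest power of 2. It is not the driver's code: the function names are hypothetical, and resolving ties upward and mapping 0 to 1 are assumptions made only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v is a power of two, 0 otherwise (0 is not a power of two). */
static int is_pow2(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Largest power of two that is <= v, for v >= 1. */
static uint32_t pow2_floor(uint32_t v)
{
	uint32_t p = 1;

	assert(v >= 1);
	while (p <= v / 2)
		p *= 2;
	return p;
}

/*
 * Nearest power of two to v. Ties (e.g. 3, 768) round up, and the
 * result is capped at 2^31 so that it still fits in 32 bits.
 */
static uint32_t nearest_pow2(uint32_t v)
{
	uint32_t lo, hi;

	if (v <= 1)
		return 1;
	if (is_pow2(v))
		return v;
	lo = pow2_floor(v);
	if (lo == 0x80000000u)
		return lo;
	hi = lo * 2;
	return (v - lo < hi - v) ? lo : hi;
}
```

With this rule a request of 1000 would be corrected to 1024, and a request of 600 to 512.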
> thanks
> Dotan
>
From mst at mellanox.co.il Sun Feb 4 06:59:58 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 16:59:58 +0200
Subject: [openib-general] [PATCH 00/12] ofed_1_2 - Neighbour update
support
In-Reply-To: <1170372217.16637.87.camel@stevo-desktop>
References: <1170372217.16637.87.camel@stevo-desktop>
Message-ID: <20070204145958.GA20087@mellanox.co.il>
> If you're worried about regressing straight rdma address
> translation, then you can call the address translation timer function
> synchronously in the snoop function like before and change the
> addr_trans module to not use netevents...
This seems the prudent thing to do.
OK, I'll do that.
--
MST
From swise at opengridcomputing.com Sun Feb 4 07:48:57 2007
From: swise at opengridcomputing.com (Steve WIse)
Date: Sun, 04 Feb 2007 09:48:57 -0600
Subject: [openib-general] [PATCH] ofed_1_2 Cleanup RHEL4U4 netevent
backport]
In-Reply-To: <1170360441.16637.41.camel@stevo-desktop>
References: <1170360441.16637.41.camel@stevo-desktop>
Message-ID: <1170604137.4129.13.camel@linux-q667.site>
Vlad/Michael,
I'm still tracking this as an outstanding patch. Have you pulled this
in yet?
Thanks,
Steve.
On Thu, 2007-02-01 at 14:07 -0600, Steve Wise wrote:
> From: Steve Wise
>
> Add skbuff.h to include list for RHEL4U4 netevent.c file. This makes
> it identical to the SLES9SP3 file.
>
> Signed-off-by: Steve Wise
> ---
>
> .../backport/2.6.9_U4/include/src/netevent.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> index 1589300..87fb55c 100644
> --- a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> @@ -13,6 +13,7 @@
> * Fixes:
> */
>
> +#include
> #include
> #include
> #include
>
>
> _______________________________________________
> openib-general mailing list
> openib-general at openib.org
> http://openib.org/mailman/listinfo/openib-general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>
From swise at opengridcomputing.com Sun Feb 4 07:49:41 2007
From: swise at opengridcomputing.com (Steve WIse)
Date: Sun, 04 Feb 2007 09:49:41 -0600
Subject: [openib-general] [PATCH] ofed_1_2 Chelsio ethernet driver
updates.
In-Reply-To: <1170360543.16637.45.camel@stevo-desktop>
References: <1170360543.16637.45.camel@stevo-desktop>
Message-ID: <1170604181.4129.15.camel@linux-q667.site>
Vlad/Michael,
I'm still tracking this as an outstanding patch. Can you pull it in
please?
Thanks,
Steve.
On Thu, 2007-02-01 at 14:09 -0600, Steve Wise wrote:
> From: Steve Wise
>
> This patch updates the ofed_1_2 cxgb3 module to the latest queued
> for 2.6.21.
>
> Signed-off-by: Steve Wise
> ---
>
> drivers/net/cxgb3/firmware_exports.h | 2 +-
> drivers/net/cxgb3/sge.c | 21 +++++++++------------
> drivers/net/cxgb3/t3_cpl.h | 3 ---
> 3 files changed, 10 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/net/cxgb3/firmware_exports.h b/drivers/net/cxgb3/firmware_exports.h
> index 4538377..6a835f6 100755
> --- a/drivers/net/cxgb3/firmware_exports.h
> +++ b/drivers/net/cxgb3/firmware_exports.h
> @@ -129,7 +129,7 @@ #define FW_OFLD_NUM 8
> #define FW_OFLD_SGEEC_START 0
>
> /*
> - *
> + *
> */
> #define FW_RI_NUM 1
> #define FW_RI_SGEEC_START 65527
> diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
> index 6b053bf..3f2cf8a 100755
> --- a/drivers/net/cxgb3/sge.c
> +++ b/drivers/net/cxgb3/sge.c
> @@ -601,17 +601,16 @@ static struct sk_buff *get_packet(struct
> if (len <= SGE_RX_COPY_THRES) {
> skb = alloc_skb(len, GFP_ATOMIC);
> if (likely(skb != NULL)) {
> - struct rx_desc *d = &fl->desc[fl->cidx];
> - dma_addr_t mapping =
> - (dma_addr_t)((u64) be32_to_cpu(d->addr_hi) << 32 |
> - be32_to_cpu(d->addr_lo));
> -
> __skb_put(skb, len);
> - pci_dma_sync_single_for_cpu(adap->pdev, mapping, len,
> - PCI_DMA_FROMDEVICE);
> + pci_dma_sync_single_for_cpu(adap->pdev,
> + pci_unmap_addr(sd,
> + dma_addr),
> + len, PCI_DMA_FROMDEVICE);
> memcpy(skb->data, sd->skb->data, len);
> - pci_dma_sync_single_for_device(adap->pdev, mapping, len,
> - PCI_DMA_FROMDEVICE);
> + pci_dma_sync_single_for_device(adap->pdev,
> + pci_unmap_addr(sd,
> + dma_addr),
> + len, PCI_DMA_FROMDEVICE);
> } else if (!drop_thres)
> goto use_orig_buf;
> recycle:
> @@ -1667,7 +1666,7 @@ #endif
> credits = G_RSPD_TXQ0_CR(flags);
> if (credits)
> qs->txq[TXQ_ETH].processed += credits;
> -
> +
> credits = G_RSPD_TXQ2_CR(flags);
> if (credits)
> qs->txq[TXQ_CTRL].processed += credits;
> @@ -2220,14 +2219,12 @@ static irqreturn_t t3b_intr_napi(int irq
> if (likely(map & 1)) {
> dev = adap->sge.qs[0].netdev;
>
> - BUG_ON(napi_is_scheduled(dev));
> if (likely(__netif_rx_schedule_prep(dev)))
> __netif_rx_schedule(dev);
> }
> if (map & 2) {
> dev = adap->sge.qs[1].netdev;
>
> - BUG_ON(napi_is_scheduled(dev));
> if (likely(__netif_rx_schedule_prep(dev)))
> __netif_rx_schedule(dev);
> }
> diff --git a/drivers/net/cxgb3/t3_cpl.h b/drivers/net/cxgb3/t3_cpl.h
> index 96b2f36..b7a1a31 100755
> --- a/drivers/net/cxgb3/t3_cpl.h
> +++ b/drivers/net/cxgb3/t3_cpl.h
> @@ -184,9 +184,6 @@ #define V_OPCODE(x) ((x) << S_OPCODE)
> #define G_OPCODE(x) (((x) >> S_OPCODE) & 0xFF)
> #define G_TID(x) ((x) & 0xFFFFFF)
>
> -#define S_QNUM 0
> -#define G_QNUM(x) (((x) >> S_QNUM) & 0xFFFF)
> -
> /* tid is assumed to be 24-bits */
> #define MK_OPCODE_TID(opcode, tid) (V_OPCODE(opcode) | (tid))
>
>
>
From mst at mellanox.co.il Sun Feb 4 07:52:44 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 17:52:44 +0200
Subject: [openib-general] [PATCH] ofed_1_2 Cleanup RHEL4U4
neteventbackport]
In-Reply-To: <1170604137.4129.13.camel@linux-q667.site>
References: <1170604137.4129.13.camel@linux-q667.site>
Message-ID: <20070204155244.GC20087@mellanox.co.il>
No, but it really makes sense. Vlad?
Quoting Steve WIse :
Subject: Re: [openib-general] [PATCH] ofed_1_2 Cleanup RHEL4U4 neteventbackport]
Vlad/Michael,
I'm still tracking this as an outstanding patch. Have you pulled this
in yet?
Thanks,
Steve.
On Thu, 2007-02-01 at 14:07 -0600, Steve Wise wrote:
> From: Steve Wise
>
> Add skbuff.h to include list for RHEL4U4 netevent.c file. This makes
> it identical to the SLES9SP3 file.
>
> Signed-off-by: Steve Wise
> ---
>
> .../backport/2.6.9_U4/include/src/netevent.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> index 1589300..87fb55c 100644
> --- a/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> +++ b/kernel_addons/backport/2.6.9_U4/include/src/netevent.c
> @@ -13,6 +13,7 @@
> * Fixes:
> */
>
> +#include
> #include
> #include
> #include
>
>
--
MST
From mst at mellanox.co.il Sun Feb 4 07:54:47 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 17:54:47 +0200
Subject: [openib-general] [PATCH] ofed_1_2 Chelsio ethernet driver
updates.
In-Reply-To: <1170604181.4129.15.camel@linux-q667.site>
References: <1170360543.16637.45.camel@stevo-desktop>
<1170604181.4129.15.camel@linux-q667.site>
Message-ID: <20070204155447.GD20087@mellanox.co.il>
Vlad?
Quoting Steve WIse :
Subject: Re: [PATCH] ofed_1_2 Chelsio ethernet driver updates.
Vlad/Michael,
I'm still tracking this as an outstanding patch. Can you pull it in
please?
Thanks,
Steve.
On Thu, 2007-02-01 at 14:09 -0600, Steve Wise wrote:
> From: Steve Wise
>
> This patch updates the ofed_1_2 cxgb3 module to the latest queued
> for 2.6.21.
>
> Signed-off-by: Steve Wise
--
MST
From swise at opengridcomputing.com Sun Feb 4 07:57:57 2007
From: swise at opengridcomputing.com (Steve WIse)
Date: Sun, 04 Feb 2007 09:57:57 -0600
Subject: [openib-general] ofa_1_2_kernel 20070202-0200 daily buildstatus
In-Reply-To: <20070204130718.GC14288@mellanox.co.il>
References: <1170430869.26115.12.camel@stevo-desktop>
<20070204130718.GC14288@mellanox.co.il>
Message-ID: <1170604677.4129.20.camel@linux-q667.site>
So it's building SLES10 OK on all other platforms but ia64? It seems
like it's not including the netevent.c file. But that backport does
exist.
On Sun, 2007-02-04 at 15:07 +0200, Michael S. Tsirkin wrote:
> > Quoting Steve Wise :
> > Subject: Re: [openib-general] ofa_1_2_kernel 20070202-0200 daily buildstatus
> >
> > On Fri, 2007-02-02 at 02:20 -0800, vlad at lists.openfabrics.org wrote:
> > > This email was generated automatically, please do not reply
> >
> > Which distro is 2.6.16.21-0.8-default? I'm sure I didn't do a netevent
> > backport for that.
>
> That's SLES10 actually.
>
> > Failed:
> > Build failed on ia64 with linux-2.6.16.21-0.8-default
> > Log:
> > /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function ‘register_netevent_notifier’
> > /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function ‘addr_cleanup’:
> > /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function ‘unregister_netevent_notifier’
> > make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
> > make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
> > make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
> > make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
> > make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
> > make: *** [kernel] Error 2
>
>
From swise at opengridcomputing.com Sun Feb 4 08:14:33 2007
From: swise at opengridcomputing.com (Steve WIse)
Date: Sun, 04 Feb 2007 10:14:33 -0600
Subject: [openib-general] ofa_1_2_kernel 20070202-0200 daily buildstatus
In-Reply-To: <20070204130718.GC14288@mellanox.co.il>
References: <1170430869.26115.12.camel@stevo-desktop>
<20070204130718.GC14288@mellanox.co.il>
Message-ID: <1170605673.4129.43.camel@linux-q667.site>
Michael,
You've set up a cross-compile environment on staging.openfabrics.org, eh?
How can I use that to resolve this issue? Or is someone else
handling it?
Steve.
On Sun, 2007-02-04 at 15:07 +0200, Michael S. Tsirkin wrote:
> > Quoting Steve Wise :
> > Subject: Re: [openib-general] ofa_1_2_kernel 20070202-0200 daily buildstatus
> >
> > On Fri, 2007-02-02 at 02:20 -0800, vlad at lists.openfabrics.org wrote:
> > > This email was generated automatically, please do not reply
> >
> > Which distro is 2.6.16.21-0.8-default? I'm sure I didn't do a netevent
> > backport for that.
>
> That's SLES10 actually.
>
> > Failed:
> > Build failed on ia64 with linux-2.6.16.21-0.8-default
> > Log:
> > /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function ‘register_netevent_notifier’
> > /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function ‘addr_cleanup’:
> > /home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function ‘unregister_netevent_notifier’
> > make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
> > make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
> > make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
> > make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070202-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
> > make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
> > make: *** [kernel] Error 2
>
>
From vlad at mellanox.co.il Sun Feb 4 08:54:30 2007
From: vlad at mellanox.co.il (Vladimir Sokolovsky)
Date: Sun, 04 Feb 2007 18:54:30 +0200
Subject: [openib-general] [PATCH] ofed_1_2 Chelsio ethernet driver
updates.
In-Reply-To: <1170360543.16637.45.camel@stevo-desktop>
References: <1170360543.16637.45.camel@stevo-desktop>
Message-ID: <1170608070.5887.15.camel@vladsk-laptop>
On Thu, 2007-02-01 at 14:09 -0600, Steve Wise wrote:
> From: Steve Wise
>
> This patch updates the ofed_1_2 cxgb3 module to the latest queued
> for 2.6.21.
>
> Signed-off-by: Steve Wise
> ---
>
> drivers/net/cxgb3/firmware_exports.h | 2 +-
> drivers/net/cxgb3/sge.c | 21 +++++++++------------
> drivers/net/cxgb3/t3_cpl.h | 3 ---
> 3 files changed, 10 insertions(+), 16 deletions(-)
Applied.
--
Vladimir Sokolovsky
Mellanox Technologies Ltd.
From mst at mellanox.co.il Sun Feb 4 09:58:33 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 19:58:33 +0200
Subject: [openib-general] idea for ofed 1 2 kernel file structure
Message-ID: <6C2C79E72C305246B504CBA17B5500C905DC04@mtlexch01.mtl.com>
Hi!
I looked at the current OFED 1.2 kernel tree and there is one thing I dislike:
It is hard to see changes that are specific to OFED, since we have the whole
kernel history mixed in.
It would be easy to split the OFED-specific files into a separate directory and
have the OFED scripts combine that with the upstream kernel.
All out-of-tree modules we distribute would go there too.
What do others think about this?
From swise at opengridcomputing.com Sun Feb 4 10:19:20 2007
From: swise at opengridcomputing.com (Steve WIse)
Date: Sun, 04 Feb 2007 12:19:20 -0600
Subject: [openib-general] idea for ofed 1 2 kernel file structure
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C905DC04@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C905DC04@mtlexch01.mtl.com>
Message-ID: <1170613160.4129.110.camel@linux-q667.site>
On Sun, 2007-02-04 at 19:58 +0200, Michael S. Tsirkin wrote:
> Hi!
>
> I looked at the current OFED 1.2 kernel tree and there is one thing I
> dislike:
>
> It is hard to see changes that are specific to OFED, since we have the
> whole kernel history mixed in.
>
>
>
> It would be easy to split the OFED-specific files into a separate directory and
> have the OFED scripts combine that with the upstream kernel.
>
>
>
> All out of tree modules we distribute would go there too.
>
> What do others think about this?
>
>
I'm not exactly clear what you mean. Could you expand a little on your
idea?
From mst at mellanox.co.il Sun Feb 4 10:27:59 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 20:27:59 +0200
Subject: [openib-general] idea for ofed 1 2 kernel file structure
In-Reply-To: <1170613160.4129.110.camel@linux-q667.site>
References: <1170613160.4129.110.camel@linux-q667.site>
Message-ID: <20070204182759.GA28729@mellanox.co.il>
> Quoting Steve WIse :
> Subject: Re: idea for ofed 1 2 kernel file structure
>
> On Sun, 2007-02-04 at 19:58 +0200, Michael S. Tsirkin wrote:
> > Hi!
> >
> > I looked at the current OFED 1.2 kernel tree and there is one thing I
> > dislike:
> >
> > It is hard to see changes that are specific to OFED, since we have the
> > whole kernel history mixed in.
> >
> >
> >
> > It would be easy to split the OFED-specific files into a separate directory and
> > have the OFED scripts combine that with the upstream kernel.
> >
> >
> >
> > All out of tree modules we distribute would go there too.
> >
> > What do others think about this?
> >
> >
>
> I'm not exactly clear what you mean. Could you expand a little on your
> idea?
Well, the OFED kernel tree is currently kernel.org files + OFED files.
We could keep the OFED files in a separate tree and have the build script
put them together.
--
MST
From swise at opengridcomputing.com Sun Feb 4 11:43:27 2007
From: swise at opengridcomputing.com (Steve WIse)
Date: Sun, 04 Feb 2007 13:43:27 -0600
Subject: [openib-general] idea for ofed 1 2 kernel file structure
In-Reply-To: <20070204182759.GA28729@mellanox.co.il>
References: <1170613160.4129.110.camel@linux-q667.site>
<20070204182759.GA28729@mellanox.co.il>
Message-ID: <1170618207.4129.118.camel@linux-q667.site>
>
> Well, the OFED kernel tree is currently kernel.org files + OFED files.
> We could keep the OFED files in a separate tree and have the build script
> put them together.
>
So the ofed_1_2 tree would become just new drivers/ulps that are not in
the kernel it's based on (2.6.20), kernel_patches/, kernel_addons/, and
ofed_scripts/. Right?
I think that's a reasonable approach, and it keeps the kernel tree clean
and makes it very clear which files are added to any given kernel
release that OFED is based on.
my 2 cents.
Steve.
From mst at mellanox.co.il Sun Feb 4 12:58:55 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Sun, 4 Feb 2007 22:58:55 +0200
Subject: [openib-general] idea for ofed 1 2 kernel file structure
In-Reply-To: <1170618207.4129.118.camel@linux-q667.site>
References: <1170613160.4129.110.camel@linux-q667.site>
<20070204182759.GA28729@mellanox.co.il>
<1170618207.4129.118.camel@linux-q667.site>
Message-ID: <20070204205855.GF29029@mellanox.co.il>
> Quoting Steve WIse :
> Subject: Re: idea for ofed 1 2 kernel file structure
>
>
> >
> > Well, the OFED kernel tree is currently kernel.org files + OFED files.
> > We could keep the OFED files in a separate tree and have the build script
> > put them together.
> >
>
> So the ofed_1_2 tree would become just new drivers/ulps that are not in
> the kernel it's based on (2.6.20), kernel_patches/, kernel_addons/, and
> ofed_scripts/. Right?
Yes.
> I think that's a reasonable approach, and it keeps the kernel tree clean
> and makes it very clear which files are added to any given kernel
> release that OFED is based on.
On the other hand, we are at feature freeze, so this is
acceptable only if it can be done with minor changes
to Vlad's build scripts.
So I'll check with him.
--
MST
From dotanb at dev.mellanox.co.il Mon Feb 5 01:31:34 2007
From: dotanb at dev.mellanox.co.il (Dotan Barak)
Date: Mon, 05 Feb 2007 11:31:34 +0200
Subject: [openib-general] IB/mthca: question about HCA profile module
parameters
In-Reply-To: <45C5F44A.9020802@voltaire.com>
References: <45C1C3D5.1050301@dev.mellanox.co.il>
<45C5F44A.9020802@voltaire.com>
Message-ID: <45C6F976.3000802@dev.mellanox.co.il>
Hi Moni, and thanks for the quick response.
Moni Shoua wrote:
> OK. So I ran more tests on my setup, which now includes:
> - Dual x86_64 processor (Intel Xeon)
> - 1GB RAM
> - 25204 HCA - fw_ver=1.1.0
>
> For num_qp values in the range 16K to 256K I got no errors.
> For lower and higher values I got errors from INIT_HCA and (not always, and only for very low values) a machine hang.
> Do you have the Oops saved somewhere? Can you post it here, please?
>
>
Sorry, I don't have a dump of the kernel oops, but I strongly
believe that we saw the same kernel oops.
If needed, I will try to reproduce it one more time.
>
>> Did you verify the HCA profile module parameter feature?
>>
> As I mentioned earlier, I verified that non-default values can be assigned
> and that the HCA works for some selected values.
> I also noticed that illegal values cause the driver to write a message to the kernel log.
> However, I didn't test the exact behaviour of all possible values for each profile variable.
>
I guess this is something that needs to be done. I will add it to
our regression tests in the future.
>> Is there any known limitation on the values that should be used?
>> (for example: only values which are a power of two)
>>
>>
>>
> I guess it is clear that there are hardware limitations that rule out some values.
> Unfortunately, even after looking for them in the PRM, I couldn't figure out what they are.
> The software requires the value to be a power of 2 and corrects users who set any other value (to the nearest power of 2). In that case a warning message is written to the kernel log.
>
As far as I know, the minimum number of any resource (for example, QPs)
is the number of resources that the HCA reports as reserved.
I will open a bug in Bugzilla so that we have a record of the
problems in this feature.
thanks
Dotan
From vlad at dev.mellanox.co.il Mon Feb 5 01:50:47 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Mon, 05 Feb 2007 11:50:47 +0200
Subject: [openib-general] MVAPICH2 SRPM and install file patches
In-Reply-To: <45C14344.9010602@cse.ohio-state.edu>
References: <45C14344.9010602@cse.ohio-state.edu>
Message-ID: <1170669047.6049.4.camel@vladsk-laptop>
On Wed, 2007-01-31 at 20:32 -0500, Shaun Rowland wrote:
> I've placed the MVAPICH2 SRPM on the OFA server in ~rowland/ofed_1_2,
> and it is linked to here:
>
> http://www.openfabrics.org/~rowland/ofed_1_2/
>
Hi Shaun,
Please change mvapich2.spec to avoid using the %build macro.
It removes RPM_BUILD_ROOT on SuSE distros:
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.9418
+ umask 022
+ cd /var/tmp/OFEDRPM/BUILD
+ /bin/rm -rf /var/tmp/OFED
++ dirname /var/tmp/OFED
+ /bin/mkdir -p /var/tmp
+ /bin/mkdir /var/tmp/OFED
+ cd mvapich2-0.9.8
+ export OPEN_IB_HOME=/var/tmp/OFED/usr/local/ofed
+ OPEN_IB_HOME=/var/tmp/OFED/usr/local/ofed
--
Vladimir Sokolovsky
Mellanox Technologies Ltd.
From rdreier at cisco.com Mon Feb 5 02:15:25 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 05 Feb 2007 02:15:25 -0800
Subject: [openib-general] [GIT PULL] please pull infiniband.git
Message-ID:
Linus, please pull from
master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus
This tree is also available from kernel.org mirrors at:
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus
This is my first merge for 2.6.21:
Hoang-Nam Nguyen (2):
IB/ehca: Remove use of do_mmap()
IB/ehca: Remove obsolete prototypes
Ishai Rabinovitz (1):
IB/srp: Don't wait for response when QP is in error state.
Jason Gunthorpe (1):
IB: Make sure struct ib_user_mad.data is aligned
Michael S. Tsirkin (2):
IB: Include explicitly in
IB: Return qp pointer as part of ib_wc
Steve Wise (1):
RDMA/addr: Handle ethernet neighbour updates during route resolution
drivers/infiniband/core/addr.c | 3 +-
drivers/infiniband/core/mad.c | 11 +-
drivers/infiniband/core/uverbs_cmd.c | 2 +-
drivers/infiniband/hw/amso1100/c2_cq.c | 2 +-
drivers/infiniband/hw/ehca/ehca_classes.h | 29 +--
drivers/infiniband/hw/ehca/ehca_cq.c | 65 ++----
drivers/infiniband/hw/ehca/ehca_iverbs.h | 8 -
drivers/infiniband/hw/ehca/ehca_main.c | 6 +-
drivers/infiniband/hw/ehca/ehca_qp.c | 78 +-----
drivers/infiniband/hw/ehca/ehca_reqs.c | 2 +-
drivers/infiniband/hw/ehca/ehca_uverbs.c | 395 ++++++++++++-----------------
drivers/infiniband/hw/ipath/ipath_qp.c | 2 +-
drivers/infiniband/hw/ipath/ipath_rc.c | 8 +-
drivers/infiniband/hw/ipath/ipath_ruc.c | 8 +-
drivers/infiniband/hw/ipath/ipath_uc.c | 4 +-
drivers/infiniband/hw/ipath/ipath_ud.c | 8 +-
drivers/infiniband/hw/mthca/mthca_cmd.c | 2 +-
drivers/infiniband/hw/mthca/mthca_cq.c | 2 +-
drivers/infiniband/ulp/srp/ib_srp.c | 7 +
drivers/infiniband/ulp/srp/ib_srp.h | 1 +
include/rdma/ib_user_mad.h | 2 +-
include/rdma/ib_verbs.h | 3 +-
22 files changed, 243 insertions(+), 405 deletions(-)
From vlad at lists.openfabrics.org Mon Feb 5 02:22:18 2007
From: vlad at lists.openfabrics.org (vlad at lists.openfabrics.org)
Date: Mon, 5 Feb 2007 02:22:18 -0800 (PST)
Subject: [openib-general] ofa_1_2_kernel 20070205-0200 daily build status
Message-ID: <20070205102221.765A7E607FE@openfabrics.org>
This email was generated automatically, please do not reply
Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod --with-addr_trans-mod --with-cxgb3-mod
Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.18
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.13
Passed on powerpc with linux-2.6.19
Passed on x86_64 with linux-2.6.16
Passed on powerpc with linux-2.6.17
Passed on powerpc with linux-2.6.18
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.19
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.16
Passed on powerpc with linux-2.6.12
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.18
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.16
Failed:
Build failed on ia64 with linux-2.6.16.21-0.8-default
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380: error: implicit declaration of function ‘register_netevent_notifier’
/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c: In function ‘addr_cleanup’:
/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386: error: implicit declaration of function ‘unregister_netevent_notifier’
make[4]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o] Error 1
make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core] Error 2
make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070205-0200_linux-2.6.16.21-0.8-default_ia64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
make: *** [kernel] Error 2
----------------------------------------------------------------------------------
From bugzilla-daemon at lists.openfabrics.org Mon Feb 5 03:17:05 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Mon, 5 Feb 2007 03:17:05 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070205111705.7372CE607FE@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #16 from dmitry.yulov at intel.com 2007-02-05 03:17 -------
> I don't agree with your patch. It assumes that SLES 10 may be corrupted. OFED
> should not try to support this. If you want to use this patch for your own
> purposes, just apply it (manually) before running OFED build scripts. OFED's
> backport patches mechanism is not suitable for such patches.
I don't agree with you, because my patch does not make any changes to system
files. It only searches for the SUSE version. But if you think that OFED should
not try to support this, I think that many Intel people who will install OFED on
SLES10 platforms will be unhappy. Thanks a lot for your help.
-- Dmitry.
From vlad at dev.mellanox.co.il Mon Feb 5 03:44:23 2007
From: vlad at dev.mellanox.co.il (Vladimir Sokolovsky)
Date: Mon, 05 Feb 2007 13:44:23 +0200
Subject: [openib-general] MVAPICH2 rpmbuild issue
In-Reply-To: <45C14344.9010602@cse.ohio-state.edu>
References: <45C14344.9010602@cse.ohio-state.edu>
Message-ID: <1170675863.6049.11.camel@vladsk-laptop>
Hi Shaun,
Please check the following issue:
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.84872
+ umask 022
+ cd /var/tmp/OFEDRPM/BUILD
+ cd mvapich2-0.9.8
+ export OPEN_IB_HOME=/var/tmp/OFED/usr/local/ofed
+ OPEN_IB_HOME=/var/tmp/OFED/usr/local/ofed
+ '[' -d /var/tmp/OFED/usr/local/ofed/lib ']'
+ '[' -d /var/tmp/OFED/usr/local/ofed/lib64 ']'
+ export PREFIX=/var/tmp/OFED/usr/local/ofed/mpi/gcc/mvapich2-0.9.8-1
+ PREFIX=/var/tmp/OFED/usr/local/ofed/mpi/gcc/mvapich2-0.9.8-1
+ export CC=gcc CXX=g++ F77=gfortran
+ CC=gcc
+ CXX=g++
+ F77=gfortran
+ export ROMIO=yes
+ ROMIO=yes
+ export SHARED_LIBS=yes
+ SHARED_LIBS=yes
+ ./make.mvapich2.gen2
Could not find the OPEN_IB_HOME/lib64 or OPEN_IB_HOME/lib directory.
Exiting.
error: Bad exit status from /var/tmp/rpm-tmp.84872 (%install)
RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.84872 (%install)
ERROR: Failed executing "rpmbuild --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich2_gcc' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich2-0.9.8-1' --define 'build_root /var/tmp/OFED' --define 'open_ib_home /usr/local/ofed' --define 'ofed_build_root /var/tmp/OFED' --define 'comp_env CC=gcc CXX=g++ F77=gfortran' --define 'iwarp 0' --define 'romio 1' --define 'shared_libs 1' --define 'auto_req 1' /mswg2/work/vlad/ofed/test/OFED-1.2-alpha1/SRPMS/mvapich2-0.9.8-1.src.rpm"
--
Vladimir Sokolovsky
Mellanox Technologies Ltd.
From bugzilla-daemon at lists.openfabrics.org Mon Feb 5 03:52:57 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Mon, 5 Feb 2007 03:52:57 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070205115257.F1917E607FE@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
erezz at voltaire.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vlad at mellanox.co.il
------- Comment #17 from erezz at voltaire.com 2007-02-05 03:52 -------
(In reply to comment #16)
> > I don't agree with your patch. It assumes that SLES 10 may be corrupted. OFED
> > should not try to support this. If you want to use this patch for your own
> > purposes, just apply it (manually) before running OFED build scripts. OFED's
> > backport patches mechanism is not suitable for such patches.
>
> I don't agree with you because my patch do not any changes in system files. It
> only search version of SUSE, but if you think that OFED should not try to
> support this I think that many Intel people who will install OFED on SLES10
> platform will be unhappy. Thanks a lot for you help.
>
> -- Dmitry.
>
Note that /etc/issue belongs to a SLES package:
rpm thyme:~ # rpm -qf /etc/issue
sles-release-10-15.2
Deleting it means that you corrupt your system. One can also delete
/etc/SuSE-release and expect that OFED will work. If you decide to delete
/etc/issue (or any other file that comes with SLES 10), you'll need to change
OFED scripts for your special needs. Anyway, I maintain iSER in OFED. You may
want to ask Vlad (vlad at mellanox.co.il) what he thinks about it. He maintains
OFED's build scripts.
From bugzilla-daemon at lists.openfabrics.org Mon Feb 5 04:02:29 2007
From: bugzilla-daemon at lists.openfabrics.org (bugzilla-daemon at lists.openfabrics.org)
Date: Mon, 5 Feb 2007 04:02:29 -0800 (PST)
Subject: [openib-general] [Bug 334] Problems with build
OFED-1.1.1-ib_local_sa
In-Reply-To:
Message-ID: <20070205120229.2A624E607FE@openfabrics.org>
https://bugs.openfabrics.org/show_bug.cgi?id=334
------- Comment #18 from dmitry.yulov at intel.com 2007-02-05 04:02 -------
> Note that /etc/issue belongs to a SLES package:
> rpm thyme:~ # rpm -qf /etc/issue
> sles-release-10-15.2
> Deleting it means that you corrupt your system. One can also delete
> /etc/SuSE-release and expect that OFED will work. If you decide to delete
> /etc/issue (or any other file that comes with SLES 10), you'll need to change
> OFED scripts for your special needs. Anyway, I maintain iSER in OFED. You may
> want to ask Vlad (vlad at mellanox.co.il) what he thinks about it. He maintains
> OFED's build scripts.
Thank you. I did not delete the /etc/issue file. I have the file, but it contains
only the following:
: cat /etc/issue
************************************************
Use of this system by unauthorized persons or
in an unauthorized manner is strictly prohibited
************************************************
That is all.
From tziporet at mellanox.co.il Mon Feb 5 04:04:29 2007
From: tziporet at mellanox.co.il (Tziporet Koren)
Date: Mon, 05 Feb 2007 14:04:29 +0200
Subject: [openib-general] QoS in opensm will not be part of OFED 1.2
Message-ID: <45C71D4D.4060503@mellanox.co.il>
Hi Hal,
I had an AI to check the QoS status with OSM.
Conclusions are that QoS support in OpenSM will not be part of OFED 1.2
(I updated the plan on the Wiki)
The reasons for this are:
1. Code not ready at code freeze.
2. There are technical discussions on the list regarding some
implementation details (e.g. XML or text syntax).
3. SPEC is not published by IBTA yet.
Hal & Yevgeny - please work on a plan that will enable QoS to be merged
on the main trunk once it's ready.
Tziporet
From kliteyn at dev.mellanox.co.il Mon Feb 5 04:37:41 2007
From: kliteyn at dev.mellanox.co.il (Yevgeny Kliteynik)
Date: Mon, 05 Feb 2007 14:37:41 +0200
Subject: [openib-general] OSM QoS policy file
Message-ID: <45C72515.8090100@dev.mellanox.co.il>
Hi Hal.
I added osm/doc/qos-policy.txt file with the description of the QoS
policy file, and an example of such a file (with more comments inside).
I'm sure you'll have questions and corrections regarding this file,
so for now, to make our work easier, I'm not sending it as patch,
but just as text. Please review the file.
Thanks
-- Yevgeny
=============================================================
QoS Policy File
===============
The QoS policy file is divided into 4 sub sections:
- Port Group: a set of CAs, Routers or Switches that share
the same settings. A port group might be a partition
defined by the partition manager policy in terms of
GUIDs. Future implementations might provide support
for NodeDescription based definition of port groups.
- Fabric Setup:
Defines how the SL2VL and VLArb tables should be set up.
This policy definition assumes the computation of target
behavior should be performed outside of OpenSM.
- QoS-Levels Definition:
This section defines the possible sets of parameters for
QoS that a client might be mapped to. Each set holds: SL
and optionally: Max MTU, Max Rate, Packet Lifetime and
QoS Class.
- Matching Rules:
A list of rules that match an incoming PathRecord request
to a QoS-Level. The rules are processed in order, such that
the first match is applied. Each rule is built out of a set
of match expressions which should all match for the rule
to apply. The matching expressions are defined for the
following fields:
- SRC and DST to lists of port groups
- Service-ID to a list of Service-ID or Service-ID ranges
- QoS Class to a list of QoS Class values or ranges
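The first-match behaviour described under Matching Rules can be sketched in C
as follows. This is only an illustration and not OpenSM code: the names
(range, match_rule, lookup_qos_level) are hypothetical, only the Service-ID and
QoS Class expressions are modelled (port groups are left out), and treating an
absent expression as "matches anything" is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not taken from OpenSM. */
struct range { uint64_t lo, hi; };    /* inclusive; a single value has lo == hi */

struct match_rule {
    const struct range *service_ids;  /* Service-ID values or ranges */
    size_t n_service_ids;             /* 0: expression absent (assumed: any) */
    const struct range *qos_classes;  /* QoS Class values or ranges */
    size_t n_qos_classes;             /* 0: expression absent (assumed: any) */
    int qos_level;                    /* index of the QoS-Level to apply */
};

/* Return 1 if v falls in any of the n ranges, or if no ranges are given. */
static int in_ranges(const struct range *r, size_t n, uint64_t v)
{
    if (n == 0)
        return 1;
    for (size_t i = 0; i < n; i++)
        if (v >= r[i].lo && v <= r[i].hi)
            return 1;
    return 0;
}

/*
 * Walk the rules in order. A rule applies only if all of its expressions
 * match; the first rule that applies decides the QoS-Level.
 * Returns -1 if no rule matches.
 */
int lookup_qos_level(const struct match_rule *rules, size_t n_rules,
                     uint64_t service_id, uint64_t qos_class)
{
    for (size_t i = 0; i < n_rules; i++) {
        const struct match_rule *r = &rules[i];
        if (in_ranges(r->service_ids, r->n_service_ids, service_id) &&
            in_ranges(r->qos_classes, r->n_qos_classes, qos_class))
            return r->qos_level;   /* first match wins */
    }
    return -1;
}
```

With the rules ordered from most to least specific, a final rule with no
expressions would act as a default QoS-Level under the assumption above.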
Example of the QoS policy file
==============================
Storage0x10000000000000010x1000000000000002Virtual Serversvs1/HCA-1/P1vs3/HCA-1/P1vs3/HCA-2/P1Partition 1Part1RoutersROUTERPart1**0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7Storage1Storage20,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0StorageStorage0:255,1:127,2:63,3:31,4:15,5:7,6:3,7:18:255,9:127,10:63,11:31,12:15,13:7,14:310116123162073032111217-9,112Storage22,4719-50003
Explanation of some fields
==========================
The meaning of most tags is either intuitive or explained by the
comments along the file. One section that deserves a special
explanation is SL2VL tables definition - .
In general, VL is a function of in-port (the port that the packet
has entered through), out-port (the port that the packet is supposed
to come out from) and the SL.
In OpenSM, SL2VL table is defined on every port, where this port is
an out-port. Hence, on every port, SL2VL table is defined as function
of in-port and SL.
n,m
This means that of all the ports of the specified port group, define
SL2VL tables where to-ports are ports number n and m. Since SL2VL
table is defined per out-port, using effectively means defining
SL2VL table on ports n and m.
In order to specify that SL2VL table should be defined on all the
ports, an asterisk (*) can be used.
i,j
This means that of all the ports of the specified port group that were
not filtered out by the value, define SL2VL table only for entries
where from-ports are ports number i and j.
In order to specify that SL2VL table should be defined for all the in-ports,
an asterisk (*) can be used.
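The per-out-port view of SL2VL described above can be pictured with a small C
sketch. It is an illustration under stated assumptions, not OpenSM code: the
struct and function names and the MAX_IN_PORTS value are hypothetical, and an
empty in-port list is used here to stand for the asterisk (all in-ports).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_SLS      16   /* SL is a 4-bit field */
#define MAX_IN_PORTS 37   /* hypothetical: port 0 plus 36 external ports */

/* SL2VL table held by one out-port: VL = vl[in_port][sl]. */
struct sl2vl_table {
    uint8_t vl[MAX_IN_PORTS][NUM_SLS];
};

/* Set every entry of the table to the same default VL. */
void sl2vl_init(struct sl2vl_table *t, uint8_t default_vl)
{
    memset(t->vl, default_vl, sizeof(t->vl));
}

/*
 * Install the SL-to-VL mapping for the listed in-ports only.
 * n_in_ports == 0 plays the role of the asterisk: all in-ports.
 */
void sl2vl_set_from(struct sl2vl_table *t, const uint8_t *in_ports,
                    size_t n_in_ports, const uint8_t sl_to_vl[NUM_SLS])
{
    if (n_in_ports == 0) {
        for (size_t p = 0; p < MAX_IN_PORTS; p++)
            memcpy(t->vl[p], sl_to_vl, NUM_SLS);
        return;
    }
    for (size_t i = 0; i < n_in_ports; i++)
        if (in_ports[i] < MAX_IN_PORTS)
            memcpy(t->vl[in_ports[i]], sl_to_vl, NUM_SLS);
}

/* VL for a packet that entered through in_port carrying the given SL. */
uint8_t sl2vl_lookup(const struct sl2vl_table *t, uint8_t in_port, uint8_t sl)
{
    if (in_port >= MAX_IN_PORTS || sl >= NUM_SLS)
        return 0xff;   /* out of range: no such entry in this sketch */
    return t->vl[in_port][sl];
}
```

Selecting which out-ports get such a table corresponds to the n,m list; the
in-port list passed to sl2vl_set_from corresponds to the i,j list.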
To specify that all the SL2VL tables entries should be defined for all
the ports of a certain group, use the following:
port_group**PortGroupName
This is combination of keyword (that can be found in VLArb tables
definition) and keyword.
PortGroupName means that the ports that we're talking about
are all the ports that are connected to ports that belong to PortGroupName.
Essentially, PortGroupName means the following:
list_of_all_the_ports_that_are_connected_to_group_PortGroupName
Example of usage of :
A user has a set of 'special' nodes (e.g. storage nodes), and all the
traffic to these nodes has to get a specific VL. The solution is to define a port
group (e.g. "Storage") that will include all the ports of these nodes, and then
to configure SL2VL tables on all the switch ports that are connected to the
Storage port group by specifying StoragePortGroupName
Similar to , is combination of and
keywords.
From rdreier at cisco.com Mon Feb 5 06:20:25 2007
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 05 Feb 2007 06:20:25 -0800
Subject: [openib-general] idea for ofed 1 2 kernel file structure
References: <6C2C79E72C305246B504CBA17B5500C905DC04@mtlexch01.mtl.com>
Message-ID:
> I looked a current ofed 1.2 kernel tree and there is 1 thing I dislike:
> It is hard to see changes that are specific to OFED since we have whole
> kernel history mixed in.
I'm not sure how you have your branches set up, but if you have
something like a "linus" branch that tracks the upstream kernel, it's
easy to do stuff like "git log linus.." or "git diff linus.. drivers/infiniband"
and see the differences that way.
Using git that way (which is what it's designed for, after all) seems
better than some scripts to munge together two trees.
- R.
From halr at voltaire.com Mon Feb 5 06:00:50 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 05 Feb 2007 09:00:50 -0500
Subject: [openib-general] QoS in opensm will not be part of OFED 1.2
In-Reply-To: <45C71D4D.4060503@mellanox.co.il>
References: <45C71D4D.4060503@mellanox.co.il>
Message-ID: <1170684049.4525.195527.camel@hal.voltaire.com>
Hi Tziporet,
On Mon, 2007-02-05 at 07:04, Tziporet Koren wrote:
> Hi Hal,
>
> I had an AI to check the QoS status with OSM.
> Conclusions are that QoS support in OpenSM will not be part of OFED 1.2
> (I updated the plan on the Wiki)
>
> The reasons for this are:
> 1. Code not ready at code freeze.
> 2. There are technical discussion in the list regarding some
> implementation details (e.g. XML or text syntax).
> 3. SPEC is not published by IBTA yet.
I think this last reason also applies to the end client QoS changes as
well.
-- Hal
> Hal & Yevgeny - please work on a plan that will enable QoS to be merged
> on the main trunk once its ready.
> Tziporet
>
>
>
From xma at us.ibm.com Mon Feb 5 06:50:55 2007
From: xma at us.ibm.com (Shirley Ma)
Date: Mon, 5 Feb 2007 07:50:55 -0700
Subject: [openib-general] [PATCH] enable IPoIB only if broadcast join finish
In-Reply-To:
Message-ID:
Hi, Roland,
Please review this patch. According to IPoIB RFC 4391 section 5, once the IPoIB
broadcast group has been joined, the interface should be ready for data
transfer. In the current IPoIB implementation, the interface is UP and RUNNING
only when all default multicast joins have succeeded. We hit a problem when the
broadcast join finishes successfully but the all-hosts multicast join
fails.
Here is the patch. If possible, please give your input ASAP; we have an
urgent customer issue that needs to be resolved:
diff -urpN ipoib/ipoib_multicast.c ipoib-multicast/ipoib_multicast.c
--- ipoib/ipoib_multicast.c 2006-11-29 13:57:37.000000000 -0800
+++ ipoib-multicast/ipoib_multicast.c 2007-02-04 22:34:16.000000000 -0800
@@ -402,6 +402,11 @@ static void ipoib_mcast_join_complete(in
queue_work(ipoib_workqueue, &priv->mcast_task);
mutex_unlock(&mcast_mutex);
complete(&mcast->done);
+ /*
+ * broadcast join finished, enable carrier
+ */
+ if (mcast == priv->broadcast)
+ netif_carrier_on(dev);
return;
}
@@ -599,7 +604,6 @@ void ipoib_mcast_join_task(void *dev_ptr
ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");
clear_bit(IPOIB_MCAST_RUN, &priv->flags);
- netif_carrier_on(dev);
}
int ipoib_mcast_start_thread(struct net_device *dev)
(See attached file: ipoib-multicast.patch)
Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ipoib-multicast.patch
Type: application/octet-stream
Size: 777 bytes
Desc: not available
URL:
From michael.arndt at informatik.tu-chemnitz.de Mon Feb 5 07:18:24 2007
From: michael.arndt at informatik.tu-chemnitz.de (Michael Arndt)
Date: Mon, 5 Feb 2007 16:18:24 +0100
Subject: [openib-general] Unknown SMP Recv
Message-ID: <000901c74938$e10b2a30$21606d86@one7>
Hi,
I have changed the driver (smi) a little and have written a tool like a
router or a bridge. It receives directed route SMPs on one port and sends
them to another port. I use 3 nodes (sender on node 1, the router on node 2,
normal node on 3) and send a subnGet SMP with [0][1][1] as initial path. And
it works fine, but on the way back the router also receives a second subnGetResp
packet with no data. The header is almost the same as the real subnGetResp
packet, just the DrSLID, DrDLID, initial path and return path are 0. Are there
any ideas where this packet comes from? Ack?
Thanks Michael
From mst at mellanox.co.il Mon Feb 5 07:25:08 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Mon, 5 Feb 2007 17:25:08 +0200
Subject: [openib-general] idea for ofed 1 2 kernel file structure
In-Reply-To:
References:
Message-ID: <20070205152507.GA4246@mellanox.co.il>
> Quoting Roland Dreier :
> Subject: Re: [openib-general] idea for ofed 1 2 kernel file structure
>
> > I looked a current ofed 1.2 kernel tree and there is 1 thing I dislike:
> > It is hard to see changes that are specific to OFED since we have whole
> > kernel history mixed in.
>
> I'm not sure how you have your branches set up, but if you have
> something like a "linus" branch that tracks the upstream kernel, it's
> easy to do stuff like "git log linus.." or "git diff linus.. drivers/infiniband"
> and see the differences that way.
Limiting to drivers/infiniband is no longer sufficient, as we have components
under drivers/net etc.
Another problem is that history-rewriting tools such as git rebase
seem to easily get confused by the complicated linux history.
> Using git that way (which is what it's designed for, after all) seems
> better than some scripts to munge together two trees.
Problem is, OFED kernel code actually consists of 2 parts:
upstream kernel developed separately at lkml and out of kernel components,
developed separately. OFED does not really track linux all the time: we
only update at -RC time.
Mixing such 2 projects together does not seem to be what git was designed for.
For example, when a patch is applied upstream we need to remove it from
fixes. So after I do git pull from upstream I get a broken tree that won't
even build. Not good.
Another problem I'm trying to address is the confusion around what gets
applied as patch and what directly. This way, a bad patch won't even apply.
--
MST
From halr at voltaire.com Mon Feb 5 07:34:15 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 05 Feb 2007 10:34:15 -0500
Subject: [openib-general] Unknown SMP Recv
In-Reply-To: <000901c74938$e10b2a30$21606d86@one7>
References: <000901c74938$e10b2a30$21606d86@one7>
Message-ID: <1170689654.4525.201415.camel@hal.voltaire.com>
On Mon, 2007-02-05 at 10:18, Michael Arndt wrote:
> Hi,
>
> I have change the driver (smi) a little and have written a tool like a
> router or a bridge. It receives directed route smp's on one port and sends
> it to another port. I use 3 nodes (sender on node 1, the router on node 2,
> normal node on 3) and send a subnGet SMP with [0][1][1] as initial path. And
> it works fine, but on way back the router also receives a second subnGetResp
> packet with no data. The header is almost the same as the real subnGetResp
> packet, just the DrSLID,DrDLID, initial path, return path are 0. Are there
> any ideas where this packet come from? Ack?
A router should not allow a SMP to cross a subnet boundary. SMPs are
restricted to the local subnet.
-- Hal
> Thanks Michael
From mst at mellanox.co.il Mon Feb 5 07:38:26 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Mon, 5 Feb 2007 17:38:26 +0200
Subject: [openib-general] QoS in opensm will not be part of OFED 1.2
In-Reply-To: <1170684049.4525.195527.camel@hal.voltaire.com>
References: <45C71D4D.4060503@mellanox.co.il>
<1170684049.4525.195527.camel@hal.voltaire.com>
Message-ID: <20070205153826.GB4246@mellanox.co.il>
> > I had an AI to check the QoS status with OSM.
> > Conclusions are that QoS support in OpenSM will not be part of OFED 1.2
> > (I updated the plan on the Wiki)
> >
> > The reasons for this are:
> > 1. Code not ready at code freeze.
> > 2. There are technical discussion in the list regarding some
> > implementation details (e.g. XML or text syntax).
> > 3. SPEC is not published by IBTA yet.
>
> I think this last reason also applies to the end client QoS changes as
> well.
Yes. But the other 2 don't.
--
MST
From changquing.tang at hp.com Mon Feb 5 07:48:29 2007
From: changquing.tang at hp.com (Tang, Changqing)
Date: Mon, 5 Feb 2007 15:48:29 -0000
Subject: [openib-general] Immediate data question
In-Reply-To:
References: <6C2C79E72C305246B504CBA17B5500C905DC04@mtlexch01.mtl.com>
Message-ID: <349DCDA352EACF42A0C49FA6DCEA840350AAC4@G3W0634.americas.hpqcorp.net>
Roland:
If I only want to send/recv 4 bytes with immediate data:
On sender side:
opcode = IBV_WR_SEND_WITH_IMM;
imm_data = my_4_bytes_data;
Do I still need to specify sg_list and num_sge ?
On receiver side, because the immediate data is inside the completion
structure, do I need to post a receive for above message ?
If I need to post a receive, do I need to specify sg_list and num_sge
for the receive ?
I looked at the spec but did not find useful information.
The reason I ask is that at some point, I cannot (or only with difficulty)
provide registered memory just for 4 bytes of data.
Thank you.
--CQ
From mst at mellanox.co.il Mon Feb 5 07:49:22 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Mon, 5 Feb 2007 17:49:22 +0200
Subject: [openib-general] QoS in opensm will not be part of OFED 1.2
In-Reply-To: <1170690105.4525.201879.camel@hal.voltaire.com>
References: <1170690105.4525.201879.camel@hal.voltaire.com>
Message-ID: <20070205154922.GC4246@mellanox.co.il>
> > > > I had an AI to check the QoS status with OSM.
> > > > Conclusions are that QoS support in OpenSM will not be part of OFED 1.2
> > > > (I updated the plan on the Wiki)
> > > >
> > > > The reasons for this are:
> > > > 1. Code not ready at code freeze.
> > > > 2. There are technical discussion in the list regarding some
> > > > implementation details (e.g. XML or text syntax).
> > > > 3. SPEC is not published by IBTA yet.
> > >
> > > I think this last reason also applies to the end client QoS changes as
> > > well.
> >
> > Yes. But the other 2 don't.
>
> Right but I think that precludes it from being included in OFED right
> now.
Since the code is already included in OFED, moving it out would violate the feature
freeze rules, unless there's an actual bug this would fix.
--
MST
From halr at voltaire.com Mon Feb 5 07:41:48 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 05 Feb 2007 10:41:48 -0500
Subject: [openib-general] QoS in opensm will not be part of OFED 1.2
In-Reply-To: <20070205153826.GB4246@mellanox.co.il>
References: <45C71D4D.4060503@mellanox.co.il>
<1170684049.4525.195527.camel@hal.voltaire.com>
<20070205153826.GB4246@mellanox.co.il>
Message-ID: <1170690105.4525.201879.camel@hal.voltaire.com>
On Mon, 2007-02-05 at 10:38, Michael S. Tsirkin wrote:
> > > I had an AI to check the QoS status with OSM.
> > > Conclusions are that QoS support in OpenSM will not be part of OFED 1.2
> > > (I updated the plan on the Wiki)
> > >
> > > The reasons for this are:
> > > 1. Code not ready at code freeze.
> > > 2. There are technical discussion in the list regarding some
> > > implementation details (e.g. XML or text syntax).
> > > 3. SPEC is not published by IBTA yet.
> >
> > I think this last reason also applies to the end client QoS changes as
> > well.
>
> Yes. But the other 2 don't.
Right but I think that precludes it from being included in OFED right
now.
-- Hal
From guyg at Voltaire.COM Mon Feb 5 08:43:14 2007
From: guyg at Voltaire.COM (guyg)
Date: Mon, 05 Feb 2007 18:43:14 +0200
Subject: [openib-general] [libmthca] deadlock while trying to destroy QP
Message-ID: <45C75EA2.6000905@Voltaire.COM>
Hi Roland,
I am running a proprietary test over ofed1.1 (userspace).
I have one context where I poll my cq and another (signal handler
context) where I try to destroy my QP.
It looks like mthca_destroy_qp is trying to take a lock that
mthca_poll_cq is holding.
The deadlock occurs at the end of the test run, where there
are no more completions, so the test deadlocks and never exits.
Here is a core dump:
#0 0x0000003a6ce09172 in pthread_spin_lock () from /lib64/tls/libpthread.so.0
#1 0x0000002a959cf449 in mthca_cq_clean (cq=0x607240, qpn=3277830, srq=0x0) at src/cq.c:554
#2 0x0000002a959d28b9 in mthca_destroy_qp (qp=0x607400) at src/mthca.h:246
#3 0x000000000040117b in client_sig_handler ()
#4
#5 0x0000003a6ce09165 in pthread_spin_lock () from /lib64/tls/libpthread.so.0
#6 0x0000002a959cec91 in mthca_poll_cq (ibcq=0x607240, ne=1, wc=0x7fbffff590) at src/cq.c:467
#7 0x0000002a9557bf73 in ibv_poll_cq (cq=0x607240, num_entries=1, wc=0x7fbffff590) at /usr/local/ofed/include/infiniband/verbs.h:824
Does destroy_qp need to be dependent on the CQ?
Do you have any suggestions?
Thanks,
Guy
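In the backtrace above, client_sig_handler runs on top of mthca_poll_cq in the
same thread, and mthca_cq_clean then spins on a lock for the same CQ
(cq=0x607240), so the thread waits on itself. One common way to avoid this
kind of self-deadlock is to let the handler only set a flag and to do the
teardown from the polling loop. The sketch below is a generic illustration and
not libmthca code: poll_once and destroy_resources are hypothetical stand-ins
for ibv_poll_cq and ibv_destroy_qp, and raise() is called only to simulate a
signal arriving between polls.

```c
#include <signal.h>

/* Set by the handler, read by the polling loop. */
static volatile sig_atomic_t stop_requested;

/* Async-signal-safe: the handler takes no locks, it only sets a flag. */
static void on_signal(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Hypothetical stand-in for polling a CQ; it would take the CQ lock. */
static int poll_once(void)
{
    return 0;
}

/* Hypothetical stand-in for destroying the QP; counts its invocations. */
static int teardown_calls;
static void destroy_resources(void)
{
    teardown_calls++;
}

/*
 * Poll until the signal has been seen, then tear down from normal
 * (non-handler) context. Returns the number of teardown calls, or -1
 * if the handler could not be installed.
 */
int run_until_signalled(int signo)
{
    teardown_calls = 0;
    stop_requested = 0;
    if (signal(signo, on_signal) == SIG_ERR)
        return -1;

    while (!stop_requested) {
        (void)poll_once();
        raise(signo);   /* demo only: simulate the signal arriving */
    }

    destroy_resources();        /* no poll is in progress at this point */
    signal(signo, SIG_DFL);     /* restore the default disposition */
    return teardown_calls;
}
```

The point of the design is that the destroy call never interrupts a poll that
is holding the CQ lock, because both run sequentially in the same context.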
From michael.arndt at informatik.tu-chemnitz.de Mon Feb 5 08:56:58 2007
From: michael.arndt at informatik.tu-chemnitz.de (Michael Arndt)
Date: Mon, 5 Feb 2007 17:56:58 +0100
Subject: [openib-general] Unknown SMP Recv
References: <000901c74938$e10b2a30$21606d86@one7>
<1170689654.4525.201415.camel@hal.voltaire.com>
Message-ID: <001401c74946$a664a2e0$21606d86@one7>
Hi,
> A router should not allow a SMP to cross a subnet boundary. SMPs are
> restricted to the local subnet.
I work on a discovery mechanism for switchless InfiniBand architectures
like rings, tori or maybe hypercubes. There is just one single subnet, no
switches or routers. Please ignore the background and focus on the problem
with the second packet. Maybe you have some ideas even if you are not involved
in the whole project. That would be nice.
Thanks Michael
From mshefty at ichips.intel.com Mon Feb 5 09:07:34 2007
From: mshefty at ichips.intel.com (Sean Hefty)
Date: Mon, 05 Feb 2007 09:07:34 -0800
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <20070204131500.GE14288@mellanox.co.il>
References: <1170355331.16637.25.camel@stevo-desktop>
<20070204131500.GE14288@mellanox.co.il>
Message-ID: <45C76456.6090804@ichips.intel.com>
>>The name is "ib_mcast_wq" which is too long for older kernels.
>>
>>Did we lose a backport patch?
>
>
> Not sure what happened here.
> Sean, could you rename ib_mcast_wq to ib_mcast please?
I renamed the workqueue for what I requested to pull upstream, and I added a
patch to my pull request to rename a couple of other workqueues.
Didn't you already apply a rename patch to the ofed code?
- Sean
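As an illustration of why the rename helps, the check below compares both names
against a length limit. The limit of 10 characters is an assumption about the
older workqueue code and is not stated in this thread; OLD_WQ_NAME_MAX and
wq_name_fits are hypothetical names used only for this sketch.

```c
#include <string.h>

/* Assumed name-length limit in older kernels; not stated in the thread. */
#define OLD_WQ_NAME_MAX 10

/* Return 1 if the workqueue name fits within the assumed limit. */
int wq_name_fits(const char *name)
{
    return strlen(name) <= OLD_WQ_NAME_MAX;
}
```

Under that assumption "ib_mcast_wq" (11 characters) is rejected while
"ib_mcast" (8 characters) fits.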
From halr at voltaire.com Mon Feb 5 09:13:12 2007
From: halr at voltaire.com (Hal Rosenstock)
Date: 05 Feb 2007 12:13:12 -0500
Subject: [openib-general] Unknown SMP Recv
In-Reply-To: <001401c74946$a664a2e0$21606d86@one7>
References: <000901c74938$e10b2a30$21606d86@one7>
<1170689654.4525.201415.camel@hal.voltaire.com>
<001401c74946$a664a2e0$21606d86@one7>
Message-ID: <1170695591.4525.207604.camel@hal.voltaire.com>
On Mon, 2007-02-05 at 11:56, Michael Arndt wrote:
> Hi,
>
> > A router should not allow a SMP to cross a subnet boundary. SMPs are
> > restricted to the local subnet.
>
> I work on a discovering mechanism for switchless InfiniBand Architectures
> like Rings, Tori or maybe Hyper-Cubes. There is just one single subnet, no
> switches or routers. Please ignore the background and focus to the problem
> about the second packet. Maybe you have some ideas even you are not involved
> in the hole project. That would be nice.
Guess you don't mean IB router when you say router in your description.
I also have no theories without more information:
Is the sender a normal node ? Does normal node mean standard OpenIB
without changes ? How was the SMI changed ? On which nodes ? Only the
intermediate one ?
Aside from the initial path being [0][1][1], what are the hop count and
hop pointer ? What are DrDLID and DrSLID as well as the LIDs in the LRH
of the SMP ?
-- Hal
> Thanks Michael
From swise at opengridcomputing.com Mon Feb 5 09:19:21 2007
From: swise at opengridcomputing.com (Steve Wise)
Date: Mon, 05 Feb 2007 11:19:21 -0600
Subject: [openib-general] cxgb3.git tree merged to 2.6.20
Message-ID: <1170695961.16661.26.camel@stevo-desktop>
All,
I've updated my tree git://staging.openfabrics.org/~swise/cxgb3.git to
linux-2.6.20.
Branches:
cxgb3 - my development branch with commits that were used to review the
rdma driver (large patch series) + the T3 Ethernet driver.
for-roland - branch where roland can pull the latest rdma driver (the
same code that is in OFED 1.2)
for-ofed_1_2 - branch used to deliver the original ethernet and rdma
driver code to the ofed_1_2 tree. It is up to date with the ofed_1_2
tree wrt the drivers.
Steve.
From suri at baymicrosystems.com Mon Feb 5 09:31:02 2007
From: suri at baymicrosystems.com (Suresh Shelvapille)
Date: Mon, 5 Feb 2007 12:31:02 -0500
Subject: [openib-general] patches to 2.6.19.1 kernel for switch Operation
In-Reply-To: <1170072757.4555.242192.camel@hal.voltaire.com>
References: <000601c7419f$d4470c60$ff0da8c0@amr.corp.intel.com>
<1170072757.4555.242192.camel@hal.voltaire.com>
Message-ID: <039701c7494b$6bd5d860$1914a8c0@surioffice>
Hal:
We are upgrading to 2.6.19.1 kernel and I finally ported the changes
required for Switch operation from my current kernel (2.6.12) version.
I have tested these changes for a switch with different SM(s). But I need
the community's help to test the changes on different HCAs to make sure I
have not broken anything.
Please see if the changes look OK.
Thanks,
Suri
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smi.c.ptch
Type: application/octet-stream
Size: 1257 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: agent.c.ptch
Type: application/octet-stream
Size: 1079 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mad.c.ptch
Type: application/octet-stream
Size: 3501 bytes
Desc: not available
URL:
From akepner at sgi.com Mon Feb 5 09:33:22 2007
From: akepner at sgi.com (akepner at sgi.com)
Date: Mon, 5 Feb 2007 09:33:22 -0800 (PST)
Subject: [openib-general] idea for ofed 1 2 kernel file structure
In-Reply-To: <6C2C79E72C305246B504CBA17B5500C905DC04@mtlexch01.mtl.com>
References: <6C2C79E72C305246B504CBA17B5500C905DC04@mtlexch01.mtl.com>
Message-ID:
On Sun, 4 Feb 2007, Michael S. Tsirkin wrote:
> Hi!
> I looked at the current ofed 1.2 kernel tree and there is 1 thing I dislike:
> It is hard to see changes that are specific to OFED since we have whole
> kernel history mixed in.
I agree.
>
> It would be easy to split OFED-specific files into a separate directory and
> have OFED scripts combine that with the upstream kernel.
>
> All out of tree modules we distribute would go there too.
> What do others think about this?
>
I like that idea very much.
--
Arthur
From or.gerlitz at gmail.com Mon Feb 5 10:16:00 2007
From: or.gerlitz at gmail.com (Or Gerlitz)
Date: Mon, 5 Feb 2007 20:16:00 +0200
Subject: [openib-general] Immediate data question
In-Reply-To: <349DCDA352EACF42A0C49FA6DCEA840350AAC4@G3W0634.americas.hpqcorp.net>
References: <6C2C79E72C305246B504CBA17B5500C905DC04@mtlexch01.mtl.com>
<349DCDA352EACF42A0C49FA6DCEA840350AAC4@G3W0634.americas.hpqcorp.net>
Message-ID: <15ddcffd0702051016x4587a6das87c4ef116296662b@mail.gmail.com>
On 2/5/07, Tang, Changqing wrote:
> On sender side:
> opcode = IBV_WR_SEND_WITH_IMM;
> imm_data = my_4_bytes_data;
> Do I still need to specify sg_list and num_sge ?
At the sender side I think you can do well with:
opcode = IBV_WR_SEND
send_flags |= IBV_SEND_INLINE
sge.addr = pointer to the 4 bytes
sge.length = 4
sge.lkey = don't care
Since the 4 bytes are copied by the IB library from sge.addr
during the execution of ibv_post_send(), the ownership of sge.addr is
yours again once the call returns.
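
A minimal sketch of the sender-side setup described above, in C. The helper
name prepare_inline_send() is hypothetical, and the fallback declarations are
trimmed stand-ins (an assumption, not the real header) that exist only so the
sketch compiles on a machine without the libibverbs headers. The field and
flag names follow struct ibv_send_wr and struct ibv_sge from
<infiniband/verbs.h>.

```c
#include <stdint.h>
#include <string.h>

#if defined(__has_include)
#  if __has_include(<infiniband/verbs.h>)
#    include <infiniband/verbs.h>
#    define HAVE_VERBS_H 1
#  endif
#endif

#ifndef HAVE_VERBS_H
/* Trimmed stand-ins (assumption): only the fields and constants used
 * below are mirrored, so the sketch builds without libibverbs. */
struct ibv_sge { uint64_t addr; uint32_t length; uint32_t lkey; };
enum { IBV_WR_SEND = 2 };
enum { IBV_SEND_SIGNALED = 1 << 1, IBV_SEND_INLINE = 1 << 3 };
struct ibv_send_wr {
    uint64_t wr_id;
    struct ibv_send_wr *next;
    struct ibv_sge *sg_list;
    int num_sge;
    int opcode;
    unsigned int send_flags;
};
#endif

/* Hypothetical helper: fill in a work request that sends a 4-byte
 * payload inline. The payload does not need to be in registered
 * memory, because the library copies it during ibv_post_send(). */
void prepare_inline_send(struct ibv_send_wr *wr, struct ibv_sge *sge,
                         const uint32_t *payload, uint64_t wr_id)
{
    memset(sge, 0, sizeof(*sge));
    sge->addr = (uint64_t)(uintptr_t)payload;  /* pointer to the 4 bytes */
    sge->length = sizeof(*payload);            /* 4 */
    sge->lkey = 0;                             /* not checked for inline sends */

    memset(wr, 0, sizeof(*wr));
    wr->wr_id = wr_id;
    wr->next = NULL;
    wr->sg_list = sge;
    wr->num_sge = 1;
    wr->opcode = IBV_WR_SEND;
    wr->send_flags = IBV_SEND_INLINE | IBV_SEND_SIGNALED;
}
```

With the real library, the filled-in request would then be passed to
ibv_post_send(qp, &wr, &bad_wr). The QP must have been created with
cap.max_inline_data of at least 4 for the inline flag to be accepted.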
> On receiver side, because the immediate data is inside the completion
> structure, do I need to post a receive for above message ?
Yes, I don't see how you can get away from posting a receive WR.
> The reason I ask is that at some point it is hard (or impossible) for me
> to provide registered memory for only 4 bytes of data.
What about the MPI implementation's header? Do you have a case where only 4
bytes need to be passed to the other side?
Or.
From mst at mellanox.co.il Mon Feb 5 10:42:07 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Mon, 5 Feb 2007 20:42:07 +0200
Subject: [openib-general] [PATCH] RE: regression in ofed 1.2
In-Reply-To: <45C76456.6090804@ichips.intel.com>
References: <45C76456.6090804@ichips.intel.com>
Message-ID: <20070205184207.GB15775@mellanox.co.il>
> Quoting Sean Hefty :
> Subject: Re: [openib-general] [PATCH] RE: regression in ofed 1.2
>
> >>The name is "ib_mcast_wq" which is too long for older kernels.
> >>
> >>Did we lose a backport patch?
> >
> >
> > Not sure what happened here.
> > Sean, could you rename ib_mcast_wq to ib_mcast please?
>
> I renamed the workqueue for what I requested to pull upstream, and I added a
> patch to my pull request to rename a couple of other workqueues.
>
> Didn't you already apply a rename patch to the ofed code?
Yes, but I assumed it's in your branch so I threw it out when I took your
latest code.
--
MST
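
A small sketch of the name-length issue discussed above, in C. It assumes
that the older kernels in question reject workqueue names longer than 10
characters; that limit is not stated in the thread, and both the constant
OLD_WQ_NAME_MAX and the helper wq_name_fits() are hypothetical names used
for illustration only.

```c
#include <string.h>

/* Assumed limit for older 2.6 kernels (not stated in the thread). */
#define OLD_WQ_NAME_MAX 10

/* Hypothetical helper: returns 1 if the workqueue name fits within
 * the assumed limit, 0 otherwise. */
int wq_name_fits(const char *name)
{
    return strlen(name) <= OLD_WQ_NAME_MAX;
}
```

Under that assumption "ib_mcast_wq" (11 characters) does not fit, while
the proposed "ib_mcast" (8 characters) does.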
From mst at mellanox.co.il Mon Feb 5 10:42:46 2007
From: mst at mellanox.co.il (Michael S. Tsirkin)
Date: Mon, 5 Feb 2007 20:42:46 +0200
Subject: [openib-general] idea for ofed 1 2 kernel file structure
In-Reply-To: