[openib-general] multicast join errors
Eitan Zahavi
eitan at mellanox.co.il
Sun Jan 29 23:49:42 PST 2006
Hi Amith,
Sorry, but the ibstat output looks good.
Can you send a pointer to (or an attachment of) the code that does the
ibumad open?
It seems that your application (the exact same application) has already
opened that port.
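
To illustrate what I suspect is happening: the ibwarn in your log says the umad port is already allocated, which is what a second open of the same CA/port from one process would produce. Below is a minimal sketch of an "open once, reuse the id" guard. All names in it (get_port_once, port_open_fn, counting_open) are hypothetical and not part of libibumad or the osm vendor layer. The opener is passed in as a function pointer so the sketch compiles without libibumad; in a real program I assume it would wrap umad_open_port(ca_name, portnum), which returns a port id >= 0 on success and a negative value on failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical opener type. In a real program this would presumably wrap
 * umad_open_port(ca_name, portnum) from libibumad (assumption). */
typedef int (*port_open_fn)(const char *ca_name, int portnum);

#define MAX_CACHED_PORTS 8

struct cached_port {
    char ca_name[64];
    int  portnum;
    int  port_id;   /* id returned by the opener */
    int  in_use;
};

static struct cached_port port_cache[MAX_CACHED_PORTS];

/* Return the id for (ca_name, portnum), calling the opener only the first
 * time this pair is requested. Returns a negative value on failure. */
int get_port_once(port_open_fn open_fn, const char *ca_name, int portnum)
{
    int i, id, free_slot = -1;

    assert(open_fn != NULL && ca_name != NULL);

    for (i = 0; i < MAX_CACHED_PORTS; i++) {
        if (port_cache[i].in_use) {
            if (port_cache[i].portnum == portnum &&
                strcmp(port_cache[i].ca_name, ca_name) == 0)
                return port_cache[i].port_id;   /* already open: reuse */
        } else if (free_slot < 0) {
            free_slot = i;                      /* remember first free slot */
        }
    }

    if (free_slot < 0 || strlen(ca_name) >= sizeof(port_cache[0].ca_name))
        return -1;                              /* cache full or name too long */

    id = open_fn(ca_name, portnum);
    if (id < 0)
        return id;                              /* pass the failure through */

    strcpy(port_cache[free_slot].ca_name, ca_name);
    port_cache[free_slot].portnum = portnum;
    port_cache[free_slot].port_id = id;
    port_cache[free_slot].in_use  = 1;
    return id;
}

/* Stand-in opener for trying the sketch without hardware: it only counts
 * calls and hands out increasing ids. It does not talk to any device. */
int open_calls = 0;

int counting_open(const char *ca_name, int portnum)
{
    (void)ca_name;
    (void)portnum;
    return open_calls++;
}
```

With this guard, creating the group and then joining it from the same program would both call get_port_once() for mthca0 port 1, and only the first call would reach the real opener. If your code goes through osm_vendor_bind twice instead, the equivalent fix would be to bind once and keep the handle.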
Eitan Zahavi
Design Technology Director
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL
> -----Original Message-----
> From: amith rajith mamidala [mailto:mamidala at cse.ohio-state.edu]
> Sent: Sunday, January 29, 2006 8:05 PM
> To: Eitan Zahavi
> Cc: Hal Rosenstock; mvapich-core at cse.ohio-state.edu; openib-general at openib.org
> Subject: RE: [openib-general] multicast join errors
>
> Hi Eitan,
>
> I am sending the ibstat output:
> CA 'mthca0'
> CA type: MT25208
> Number of ports: 2
> Firmware version: 5.1.0
> Hardware version: a0
> Node GUID: 0x0006270510000004
> System image GUID: 0x0000000000000000
> Port 1:
> State: Active
> Physical state: LinkUp
> Rate: 10
> Base lid: 3
> LMC: 0
> SM lid: 105
> Capability mask: 0x02510a68
> Port GUID: 0x0006270510000005
> Port 2:
> State: Down
> Physical state: Polling
> Rate: 10
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x02510a68
> Port GUID: 0x0006270510000006
>
> Thanks,
> Amith
>
>
> On Sun, 29 Jan 2006, Eitan Zahavi wrote:
>
> > Hi Amith,
> >
> > Please send the ibstat output for that node.
> > I suspect the port 0x6270510000005 is not up.
> >
> > Eitan Zahavi
> > Design Technology Director
> > Mellanox Technologies LTD
> > Tel:+972-4-9097208
> > Fax:+972-4-9593245
> > P.O. Box 586 Yokneam 20692 ISRAEL
> >
> >
> > > -----Original Message-----
> > > From: openib-general-bounces at openib.org [mailto:openib-general-bounces at openib.org] On Behalf Of amith rajith mamidala
> > > Sent: Saturday, January 28, 2006 11:19 PM
> > > To: Hal Rosenstock
> > > Cc: mvapich-core at cse.ohio-state.edu; openib-general at openib.org
> > > Subject: Re: [openib-general] multicast join errors
> > >
> > > Hi Hal,
> > >
> > > There is only one application running on a node. I am running opensm on
> > > a different node. I am also listing the other processes I observed on
> > > doing a "ps":
> > >
> > > root 3564 11 0 Jan26 ? 00:00:00 [ib_cm/0]
> > > root 3565 11 0 Jan26 ? 00:00:00 [ib_cm/1]
> > > root 1294 11 0 Jan26 ? 00:00:00 [ib_mad1]
> > > root 1295 11 0 Jan26 ? 00:00:00 [ib_mad2]
> > > root 1298 11 0 Jan26 ? 00:00:00 [ib_mad1]
> > > root 1299 11 0 Jan26 ? 00:00:00 [ib_mad2]
> > >
> > >
> > > Thanks,
> > > Amith
> > >
> > > On 28 Jan 2006, Hal Rosenstock wrote:
> > >
> > > > Hi Amith,
> > > >
> > > > On Sat, 2006-01-28 at 12:46, amith rajith mamidala wrote:
> > > > > Hi,
> > > > >
> > > > > I was able to create multicast groups after Hal's fix. But, when I do join
> > > > > subsequently from the same program I am getting a port_alloc error:
> > > > >
> > > > > Jan 28 12:22:12 119632 [AB2223C0] -> osm_vendor_bind: Binding to port 0x6270510000005.
> > > > > -I- Created the Multicast Group:
> > > > > MGID....................0xff13a01cfe800000 : 0x0000000000000000
> > > > > PortGid.................0xfe80000000000000 : 0x0006270510000005
> > > > > qkey....................0x0
> > > > > Mlid....................0xC002
> > > > > ScopeState..............0x21
> > > > > Rate....................0x83
> > > > > Mtu.....................0x84
> > > > > Jan 28 12:22:12 140486 [AB2223C0] -> osm_vendor_bind: Binding to port 0x6270510000005.
> > > > >
> > > > > ibwarn: [4057] port_alloc: umad port id 0 is already allocated for mthca0 1
> > > > > Jan 28 12:22:12 143240 [AB2223C0] -> osm_vendor_open_port: ERR 542C: umad_open_port() failed
> > > > > Jan 28 12:22:12 143253 [AB2223C0] -> osm_vendor_bind: ERR 5424: Unable to Open Port 0x6270510000005.
> > > > > Jan 28 12:22:12 143262 [AB2223C0] -> osmv_bind_sa: ERR 5506: Failed to bind to vendor GSI
> > > > > Jan 28 12:22:12 143267 [AB2223C0] -> ibmcgrp_bind: ERR 00137: Unable to bind to SA
> > > > >
> > > > > I am trying to trace the source of this error,
> > > >
> > > > Is this the only IB application running or are there others (and if so,
> > > > what else is running)?
> > > >
> > > > -- Hal
> > > >
> > > > > Thanks,
> > > > > Amith
> > > > >
> > > > > _______________________________________________
> > > > > openib-general mailing list
> > > > > openib-general at openib.org
> > > > > http://openib.org/mailman/listinfo/openib-general
> > > > >
> > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> > > >
> > >
> >