[ofa-general] Intermittent: ib0: multicast join failed
Hal Rosenstock
hal.rosenstock at gmail.com
Mon Sep 22 11:50:03 PDT 2008
On Mon, Sep 22, 2008 at 2:43 PM, Roger Spellman <roger at terascala.com> wrote:
> Thanks, Hal.
>
> Below is the output to ibstat and ibstatus. It shows that the rate is
> 2.5 Gb/sec, rather than 10 Gb/sec.
>
> Is there a way to get it to renegotiate the rate, short of rebooting?
Try ibportstate reset on the switch peer port. You could also replug
the cable on that link.
> [root at ts-raid6-03 lib64]# ibstat
> CA 'mthca0'
> CA type: MT25204
> Number of ports: 1
> Firmware version: 1.2.936
> Hardware version: a0
> Node GUID: 0x0002c9020026e4c0
> System image GUID: 0x0002c9020026e4c3
> Port 1:
> State: Active
> Physical state: LinkUp
> Rate: 2
> Base lid: 19
> LMC: 0
> SM lid: 1
> Capability mask: 0x02510a68
> Port GUID: 0x0002c9020026e4c1
> [root at ts-raid6-03 lib64]# ibstatus
> Infiniband device 'mthca0' port 1 status:
> default gid: fe80:0000:0000:0000:0002:c902:0026:e4c1
> base lid: 0x13
> sm lid: 0x1
> state: 4: ACTIVE
> phys state: 5: LinkUp
> rate: 2.5 Gb/sec (1X)
>
>> It's likely a rate issue where the negotiated port rate is not the
>> broadcast group rate.
Yes, it's a rate problem: the link is coming up at 1X SDR, which is 2.5
Gbps, whereas I suspect that the group is 10 Gbps, so it can't join.
-- Hal
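
For anyone who wants to script this check, here is a rough sketch in Python
of the comparison described above. It pulls the rate out of ibstatus-style
text and compares it with a group rate. The helper names
(parse_ibstatus_rate, can_join) are made up for illustration, the 10 Gb/sec
group rate is only the suspected value from this thread, and the "port rate
must be at least the group rate" rule is a simplification of what the SM
actually checks.

```python
import re

# Hypothetical value: the broadcast group rate suspected in this thread.
ASSUMED_GROUP_RATE_GBPS = 10.0


def parse_ibstatus_rate(text):
    """Return (rate_gbps, link_width) from an ibstatus 'rate:' line, or None."""
    match = re.search(r"rate:\s*([\d.]+)\s*Gb/sec\s*\((\d+)X", text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


def can_join(port_rate_gbps, group_rate_gbps):
    """Simplified rule: the port must run at least as fast as the group."""
    return port_rate_gbps >= group_rate_gbps


# The relevant lines from the ibstatus output quoted in this thread.
sample = """\
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 2.5 Gb/sec (1X)
"""

parsed = parse_ibstatus_rate(sample)
if parsed is not None:
    rate, width = parsed
    ok = can_join(rate, ASSUMED_GROUP_RATE_GBPS)
    print(f"link is {width}X at {rate} Gb/sec, join possible: {ok}")
```

With the output above, the port reports 1X at 2.5 Gb/sec, which is below the
assumed 10 Gb/sec group rate, so the check says the join would not succeed.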
>> What does ibstat or ibstatus show when the join fails ? Also, what
>> about saquery -g ?
>
>> >
>> > Rebooting the node that failed to join the group always seems to solve
>> > the problem.
>
>> Yes, that's consistent with the negotiated rate being a problem.
>
>> -- Hal
>