[ofa-general] RE: [BUG REPORT] mlx4: Incorrect event is generated when SM is changing
Jack Morgenstein
jackm at mellanox.co.il
Wed Dec 3 08:01:13 PST 2008
> Things get worse if this patch
> (http://lists.openfabrics.org/pipermail/general/2008-November/055680.html)
> is applied since no event would be dispatched at all
Does this mean that you do not want the patches applied?
(I have put them into the upcoming OFED 1.4)
- Jack
> -----Original Message-----
> From: Moni Shoua [mailto:monis at Voltaire.COM]
> Sent: Wednesday, December 03, 2008 4:27 PM
> To: Roland Dreier; Jack Morgenstein
> Cc: OpenFabrics General; Olga Stern; Or Gerlitz
> Subject: [BUG REPORT] mlx4: Incorrect event is generated when
> SM is changing
>
>
> I have a small fabric with ConnectX HCAs and I run some tests
> with 2 Open SMs that take over from each other (either by
> discovering that the MASTER is not responding or by coming up
> with higher priority).
>
> I noticed that in 2.6.28-rc6 I get in IPoIB a LID_CHANGE
> event (while expecting CLIENT_REREGISTER) even though LID was
> not changed. I looked where the events are generated in
> drivers/infiniband/hw/mlx4/mad.c:smp_snoop()
>
>         if (pinfo->clientrereg_resv_subnetto & 0x80)
>                 event.event = IB_EVENT_CLIENT_REREGISTER;
>         else
>                 event.event = IB_EVENT_LID_CHANGE;
>
> and I see that the clientrereg bit is off.
>
> I checked this simultaneously with a host running kernel
> 2.6.27 and see that the bit is on and
> CLIENT_REREGISTER is generated.
>
> Things get worse if this patch
> (http://lists.openfabrics.org/pipermail/general/2008-November/055680.html)
> is applied, since no event would be dispatched at all (BTW Jack, I think
> you forgot to send the second half for mthca).
>
> I'd appreciate a clue as to where the reregister bit gets turned off.
>
> thanks
> MoniS
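
For reference, the decision quoted from smp_snoop() above can be reproduced
outside the kernel. This is only a minimal user-space sketch: it assumes the
only input that matters is the PortInfo byte the driver calls
clientrereg_resv_subnetto. The enum, macro and function names below are
hypothetical stand-ins, not the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two kernel event types quoted above. */
enum sketch_event {
    SKETCH_EVENT_LID_CHANGE,
    SKETCH_EVENT_CLIENT_REREGISTER,
};

/* Mask used in the quoted test: the top bit of clientrereg_resv_subnetto. */
#define SKETCH_CLIENT_REREG_MASK 0x80

/*
 * Mirror the quoted if/else: bit set means CLIENT_REREGISTER,
 * bit clear falls through to LID_CHANGE.
 */
enum sketch_event classify_portinfo_set(uint8_t clientrereg_resv_subnetto)
{
    if (clientrereg_resv_subnetto & SKETCH_CLIENT_REREG_MASK)
        return SKETCH_EVENT_CLIENT_REREGISTER;
    return SKETCH_EVENT_LID_CHANGE;
}

/* Small self-check covering both branches; the lower 7 bits are ignored. */
void sketch_self_check(void)
{
    assert(classify_portinfo_set(0x80) == SKETCH_EVENT_CLIENT_REREGISTER);
    assert(classify_portinfo_set(0x7f) == SKETCH_EVENT_LID_CHANGE);
}
```

With the bit off, as reported on 2.6.28-rc6, the sketch can only ever return
the LID_CHANGE stand-in, which matches the behaviour described above. It does
not show where the bit is cleared before the test is reached.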