[openib-general] osm unreliable unless -d1
Jean-Christophe Hugly
jice at pantasys.com
Mon Mar 6 15:06:46 PST 2006
On Mon, 2006-03-06 at 23:44 +0200, Sasha Khapyorsky wrote:
> On 11:44 Mon 06 Mar , Jean-Christophe Hugly wrote:
> >
> > One more detail, I am running with LMC=2 because I wanted to check that
> > the LMC>0 problems were fixed (they seem to be; I do not see any
> > LMC-related misbehaviour).
>
> Hmm, and I have some problems with LMC (even before the test, not
> investigated yet)...
I used to have big trouble with LMC>0 (basically, duplicate LIDs were
being assigned). But that seems to be fixed. At least I have not noticed
anything like that this time around.
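For anyone wanting to check for the duplicate-LID symptom by hand, here is a rough sketch, not anything taken from osm itself. It assumes each port answers to 2^LMC consecutive LIDs starting at its base LID, and that you have already collected a port-to-base-LID table some other way; the port names and helper functions below are made up for illustration.

```python
def lid_range(base_lid, lmc):
    """LIDs a port answers to: 2**lmc consecutive LIDs starting at base_lid."""
    if not 0 <= lmc <= 7:
        raise ValueError("LMC must be in 0..7")
    return range(base_lid, base_lid + (1 << lmc))


def find_duplicate_lids(ports, lmc):
    """Map each LID claimed by more than one port to the sorted port names.

    ports: dict of hypothetical port name -> base LID.
    """
    owners = {}
    for name, base in ports.items():
        for lid in lid_range(base, lmc):
            owners.setdefault(lid, []).append(name)
    return {lid: sorted(names) for lid, names in owners.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical table: "sw" was wrongly given the same base LID as "hca1".
    table = {"hca1": 4, "hca2": 8, "sw": 4}
    for lid, names in sorted(find_duplicate_lids(table, lmc=2).items()):
        print(lid, names)
```

With LMC=0 each port has a single LID, so the same check reduces to looking for repeated base LIDs.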
> Could you try without LMC?
Just did.
With lmc=0 the behaviour is the same. Your test's output is:
1: delay 0
2: delay 0
3: delay 0
4: delay 0
5: delay 0
6: delay 0
7: delay 0
8: delay 0
9: delay 675
<nothing after that yet>
I guess something happens every 10 minutes or so that puts osm back on
track; maybe if I wait another 10 minutes or so I'll get a 10th cycle
to complete.
I will rebuild with the spinlock fixes and see if by any chance it
makes a difference to this issue.
Oh, btw, on the first test I ran with LMC=0 I also put -V on the cmd line.
With all the traces going, the test passes indefinitely (well, I set
the limit at 15 cycles).
--
Jean-Christophe Hugly <jice at pantasys.com>
PANTA