[openib-general] osm unreliable unless -d1

Jean-Christophe Hugly jice at pantasys.com
Mon Mar 6 15:06:46 PST 2006


On Mon, 2006-03-06 at 23:44 +0200, Sasha Khapyorsky wrote:
> On 11:44 Mon 06 Mar     , Jean-Christophe Hugly wrote:
> > 
> > One more detail: I am running with LMC=2 because I wanted to check that
> > the LMC>0 problems were fixed (they seem to be; I do not see any
> > LMC-related misbehaviour).
> 
> Hmm, and I have some problems with LMC (even before the test, not
> investigated yet)...

I used to have big trouble with LMC>0 (basically, duplicate LIDs were
being assigned), but that seems to be fixed. At least I have not noticed
anything like that this time around.
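
For anyone who wants to check for the same symptom, here is a minimal
Python sketch of a duplicate-LID check. It assumes the usual LMC rule as
I understand it: a port ignores the low LMC bits of the destination LID,
so it answers to 2^LMC consecutive LIDs. The function names and the
port-name-to-base-LID table are hypothetical and would have to be filled
in by hand from whatever your tools report; this is not how osm does it.

```python
def lid_range(base_lid, lmc):
    """LIDs a port answers to, assuming it ignores the low `lmc` bits."""
    span = 1 << lmc                 # 2**lmc LIDs per port
    start = base_lid & ~(span - 1)  # clear the low lmc bits
    return range(start, start + span)


def find_duplicate_lids(ports, lmc):
    """Map each LID claimed by more than one port to the ports claiming it.

    `ports` is a hypothetical dict of port name -> base LID.
    """
    owners = {}
    for name, base_lid in ports.items():
        for lid in lid_range(base_lid, lmc):
            owners.setdefault(lid, []).append(name)
    return {lid: sorted(names) for lid, names in owners.items()
            if len(names) > 1}


if __name__ == "__main__":
    # Made-up example table, not taken from a real fabric.
    example = {"node1/port1": 4, "node2/port1": 8, "node3/port1": 8}
    for lid, names in sorted(find_duplicate_lids(example, lmc=2).items()):
        print("LID %d claimed by: %s" % (lid, ", ".join(names)))
```

An empty result means no two ports in the table overlap at that LMC.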

> Could you try without LMC?

Just did.
With lmc=0 the behaviour is the same. Your test's output is:
1: delay 0
2: delay 0
3: delay 0
4: delay 0
5: delay 0
6: delay 0
7: delay 0
8: delay 0
9: delay 675
<nothing after that yet>

I guess something happens every 10 minutes or so that puts osm back on
track; maybe if I wait another 10 minutes or so I'll get a 10th cycle
to complete.

I will rebuild with the spinlock fixes and see if by any chance it
makes a difference to this issue.

Oh, btw, on the first test I ran with LMC=0 I also put -V on the cmd line.
With all the traces going, the test passes indefinitely (well, I put
the end of time at 15 cycles).

-- 
Jean-Christophe Hugly <jice at pantasys.com>
PANTA



