[openib-general] nightly osm_sim report 2006-12-18:normal completion

Sasha Khapyorsky sashak at voltaire.com
Mon Dec 18 07:10:10 PST 2006


On 15:33 Mon 18 Dec     , Eitan Zahavi wrote:
> Hi Sasha,
> 
> The failure analysis takes time and is manual...
> The logs and related files are pretty big and will take space to upload.
> 
> Today I simulated with OpenSM that was compiled on the side (my bad -
> should have incorporated my patches on the clone but I was not sure this
> is not going to "contaminate" that git tree forever) with the fixes for
> DONE/DONE_PENDING.

You can commit your changes to the branch and later rebase this branch
on top of the new master, with something like 'git-rebase master my-branch'.
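
A minimal sketch of that workflow, assuming a throwaway repository. Only
the branch name my-branch comes from the command above; the file names,
commit messages and user identity are made up for illustration. Current
git spells the command 'git rebase' rather than 'git-rebase'.

```shell
# Sketch only: throwaway repo; file names and commit messages are hypothetical.
set -e
repo=$(mktemp -d)
g() { git -C "$repo" "$@"; }          # run git inside the scratch repo

g init -q
g symbolic-ref HEAD refs/heads/master # make sure the first branch is 'master'
g config user.email "dev@example.com"
g config user.name "Dev"

# Common starting point on master.
echo base > "$repo/base.txt"
g add base.txt
g commit -q -m "base"

# Local fixes are committed on a side branch, not on master.
g checkout -q -b my-branch
echo fix > "$repo/fix.txt"
g add fix.txt
g commit -q -m "local fixes"

# Meanwhile master moves forward (stands in for pulling new upstream commits).
g checkout -q master
echo upstream > "$repo/upstream.txt"
g add upstream.txt
g commit -q -m "upstream change"

# Replay my-branch on top of the new master.
g rebase -q master my-branch

# List the commits on my-branch, newest first.
g log --format=%s my-branch
```

After the rebase the local commits sit on top of the latest master, so
master itself never carries the experimental changes.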

> The tests that failed today are actually false violations:
> 1. The IS1-16 failed due to lack of free sockets to connect to the
> server. Still not clear why. I will increase the number of sockets the
> client/server try to connect on.
> 2. The IS3-128 failed due to a temporary replacement of the opensm with
> the one that has my fixes for DONE/DONE_PENDING. This was a mistake I made
> manually by compiling the "clone". As I was watching the log I
> noticed that the same wrong signal was happening.

Understood.

> BTW: The DONE/DONE_PENDING bug was discovered through a change I made in
> the simulator dispatcher. The change introduced a bug that overloaded the
> machine with a busy loop in the dispatcher. Apparently this produced
> different timing and exposed these bugs.

So the simulator shake-up turned out to be helpful. :)

Thanks for catching this.

BTW, 


> 
> EZ
> 
> > -----Original Message-----
> > From: Sasha Khapyorsky [mailto:sashak at voltaire.com]
> > Sent: Monday, December 18, 2006 3:31 PM
> > To: Eitan Zahavi
> > Cc: Eitan Zahavi; Yevgeny Kliteynik; halr at voltaire.com;
> > openib-general at openib.org
> > Subject: Re: nightly osm_sim report 2006-12-18:normal completion
> > 
> > Hi Eitan,
> > 
> > On 13:19 Mon 18 Dec     , Eitan Zahavi wrote:
> > > OSM Simulation Regression Summary
> > > OpenSM rev = Fri_Dec_15_20:29:07_2006 d5e724
> > > ibutils rev = Thu_Dec_14_21:48:18_2006 fd82d4 MOD_FILES=1
> > > Total=221 Pass=219 Fail=2
> > >
> > > Pass:
> > > 31 LidMgr IS1-16.topo
> > > 30 Stability IS1-16.topo
> > > 30 Pkey IS1-16.topo
> > > 30 Multicast IS1-16.topo
> > > 29 OsmStress IS1-16.topo
> > > 10 Stability IS3-loop.topo
> > > 10 Stability IS3-128.topo
> > > 10 Pkey IS3-128.topo
> > > 10 Multicast IS3-loop.topo
> > > 10 Multicast IS3-128.topo
> > > 10 LidMgr IS3-128.topo
> > > 9 OsmStress IS3-128.topo
> > >
> > > Failures:
> > > 1 OsmStress IS3-128.topo
> > > 1 OsmStress IS1-16.topo
> > 
> > Is it possible to have more details about failures (in case when it is
> > real failures)? Probably to upload the logs to somewhere?
> > 
> > Sasha



