[openib-general] Re: user-mode verbs on Itanium

Grant Grundler iod00d at hp.com
Mon May 9 19:03:41 PDT 2005


On Mon, May 09, 2005 at 07:50:22PM +0300, Michael S. Tsirkin wrote:
> Hello, Grant!
> Thanks for comments on the perftest.
> I think I fixed all the bugs you found.

probably - welcome

> I just did
> 
> >svn mv https://openib.org/svn/trunk/contrib/mellanox/perftest https://openib.org/svn/gen2/trunk/src/userspace
> Committed revision 2285.
> 
> so we'll be able to exchange patches in the regular way.

Excellent - thanks!

...
> > BTW, copying code wholesale is the whole point behind "Open Source".
> > One measure of success is if people in fact do copy the code.
> 
> Right, but why don't we put it in the core then?

Isn't "userspace/perftest" part of "core"?
What do you mean by "core"?

I was just looking for a place more visible than trunk/contrib/mellanox
which on the surface has nothing to do with OpenIB Gen2.

> > That might help, but it's not addressing the basic problem.
> > People retrofitting an existing application will need at least
> > three different pieces of code: subsystem initialization, connection
> > setup/tear down, and runtime communication. Just making those three
> > code blocks subroutines is NOT creating a new layer.
> 
> But once these subroutines are moved into a library, it sure
> begins to look like one.

Sure, but one where people can take the code and adapt it for their
own use. In effect implementing a custom layer each time.
Some people *need* another layer (e.g. MPI) and don't want
to program directly to verbs. They won't look for this example.

...
> > That's just my methodology for cleaning up stuff like this
> > and not the end point.
> 
> I see how separating things to subroutines would help here,
> but making things global will just hide who uses what, IMO.

Yes, but I don't want to make everything global in the end.
I mostly want to separate parameters from local variables.

> > IMHO, the current code isn't readable either.
> > main() has 268 lines and  22 (in my version, more in the original)
> > local variables. Of those 22 variables, some are really local
> > and not part of the global state but we can't easily tell which.
> 
> It's a work in progress :)

hehe...exactly :^)

> I'm not sure what you call "global state". I am trying to think
> of an application that has many connections, global state would
> be something common to all of them?

e.g. for rdma_lat global state would be:
o the connection since only one is allocated
o tstamp[] since it only runs on one CPU
o etc.

Ideally though, we want the connection state to be a parameter
in most normal apps. So I'd like to sort that out after separating
"local variables" (e.g. unsigned int i;).
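
A minimal sketch of that split, assuming hypothetical names (struct conn_state, send_iters) that are not taken from rdma_lat. The connection state arrives as a parameter, and the loop counter stays a true local:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: everything that belongs to ONE connection. */
struct conn_state {
    uint32_t qp_num;      /* placeholder for per-connection resources */
    uint64_t bytes_sent;  /* running total for this connection */
};

/*
 * The connection is a parameter, so the same routine works for an
 * app with one connection or with many. 'i' is a plain local.
 */
uint64_t send_iters(struct conn_state *conn, unsigned int iters, size_t size)
{
    unsigned int i;  /* local variable, not part of any shared state */

    assert(conn != NULL);
    for (i = 0; i < iters; i++)
        conn->bytes_sent += size;  /* stand-in for posting one send */

    return conn->bytes_sent;
}
```

With two struct conn_state instances, each call only touches the one it was handed, which is the property a global would hide.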

> You'll need my two patches I sent previously to make it work though.
> As a work around, bump max_send_sge a bit.

*nod* I'll add that to my code. I need to submit patches against
src/userspace/perftest/ so it looks more like what I have now. :^)

> > > On the other hand, it could be negligible compared to connection
> > > setup time.
> > 
> > hrm...true. We should measure and report that too.
> > A script could iterate the invocation with -n 1 to collect
> > a reasonable sample.
> 
> Problem is, connection setup involves the CM (or the socket kludge),
> so it's not trivial to define what the overhead is.

Well, just invent a definition that works for that program.
The time rdma_lat takes to set up *before* hitting
the main loop would be a reasonable definition.
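
A rough sketch of that definition, assuming POSIX clock_gettime() is available. The helper names (now_ns, time_setup_ns, dummy_setup) are made up for illustration, and dummy_setup only stands in for whatever the program really does before its main loop:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;  /* keeps NDEBUG builds free of unused-variable warnings */
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical stand-in for device open, QP creation, CM exchange, etc. */
int dummy_setup(void *arg)
{
    volatile unsigned int spin;

    (void)arg;
    for (spin = 0; spin < 1000; spin++)
        ;  /* burn a little time */
    return 0;
}

/* "Setup time" = elapsed time of everything run before the main loop. */
uint64_t time_setup_ns(int (*setup)(void *), void *arg, int *rc)
{
    uint64_t start;

    assert(setup != NULL && rc != NULL);
    start = now_ns();
    *rc = setup(arg);
    return now_ns() - start;
}
```

Whether the CM or socket exchange counts as setup is then decided by what goes inside the setup callback.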

> Ok, I checked that in: rev 2283. I also renamed --histogram to --report-all.

Yeah, I'm still wondering if anyone will want both in the same report.
The histogram can be derived from the unsorted listing.
(obviously not the converse).

The histogram is interesting for statistical evaluation of the
data (standard deviation sort of stuff).
But I'm only interested in the histogram to validate that
the test environment is reliable enough. After that I
mostly want the unsorted list.
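
To illustrate why the unsorted listing is the more basic output, here is a sketch that derives a fixed-width histogram from it. The function name and the bucketing policy (overflow goes into the last bucket) are assumptions for this example, not what the perftest code does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count samples into nbins buckets of width bin_width.
 * Values past the last bucket are clamped into the last bucket.
 * The input array is only read, so its original order survives.
 */
void build_histogram(const uint64_t *samples, size_t n,
                     uint64_t bin_width, size_t *bins, size_t nbins)
{
    size_t i;

    assert(bin_width > 0 && nbins > 0);

    for (i = 0; i < nbins; i++)
        bins[i] = 0;

    for (i = 0; i < n; i++) {
        uint64_t b = samples[i] / bin_width;

        if (b >= nbins)
            b = nbins - 1;
        bins[b]++;
    }
}
```

Going the other way is not possible: once samples are counted into buckets, the order in which they occurred is gone.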

> > > 2. get_clock in assembly replaced with asm/timex.h:
> > >    Unfortunately asm/timex.h was never intended for userspace,
> > 
> > Exact copies from linux/include/asm* are not intended for userspace.
> > But arch specific kernel headers are required to interact with the
> > kernel on any given platform (e.g. syscalls). They just need to be
> > "adjusted" by distros so they are suitable for userspace consumption.
> 
> IMO get_cycles is never needed to interact with the kernel.

Right. That's because it's referencing a read-only source of data.
And accessing TSC through a system call would defeat the point of
implementing it as an on-chip register readable by unprivileged apps.

> It could have been part of libc or something, but it's not.

Not every interaction with the kernel goes through libc.
(Thank $DEITY, glibc would be even more bloated than it already is).

It would be nice if get_cycles were a gcc intrinsic.
gcc knows which arch/model CPU it's building code for.
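
For reference, a sketch of the private one-line-assembly approach under discussion, assuming gcc-style inline asm. The x86 branch reads the TSC, the ia64 branch reads AR.ITC, and the final branch is a fallback added for this example that returns nanoseconds rather than cycles:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uint64_t cycles_t;

/* Read a free-running counter without entering the kernel. */
cycles_t get_cycles(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;

    /* rdtsc puts the low 32 bits in EAX and the high 32 bits in EDX */
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((cycles_t)hi << 32) | lo;
#elif defined(__ia64__)
    cycles_t ret;

    /* Interval Time Counter application register */
    __asm__ __volatile__("mov %0=ar.itc" : "=r"(ret));
    return ret;
#else
    /* Assumption for this sketch: no cycle counter, fall back to time. */
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (cycles_t)ts.tv_sec * 1000000000ull + (cycles_t)ts.tv_nsec;
#endif
}
```

The #if ladder is exactly the per-arch knowledge that a compiler intrinsic would remove from the application.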


> > x86_64 and ppc don't need to include config.h.
> > I'll submit patches to fix this.

x86_64 patch submitted.

ppc is harder to fix since they've wrapped __KERNEL__ around it.
Seems like not all PPC CPUs have implemented cycle counters.
PPC would be a good candidate for private one line assembly too.
And some folks will just get undefined behavior. 

> > I suggest define get_cycles() only for i386 and let everyone else use timex.h.
> 
> Sure. Still, I think I'll keep the one-line assembly till it's fixed in
> most distros :)

Which distro is shipping on PPC?

Waiting for any commercial x86-64 distros to update to 2.6.12
(assuming my patch is accepted) will be a long wait. Then they
have to update the kernel headers package too...

> I added ia64 so everyone can make progress meanwhile.

ok. That's me. :^)

> BTW, some architectures (notably sparc) define get_cycles to (0).
> Need to google for the right answer there.

That would suggest it's not implemented for those architectures.
Just like TSC was first implemented on pentium.

> I also saw this errata for Itanium:
> 
> 2. AR.ITC returns an incorrect value

Yeah. And I'll take those chances (1 in 4 billion) if it keeps the code
simpler and the machine won't crash because of it.

> Does this bug still affect Itanium systems in use out there?

Not many people liked Itanium 1. Noisy and fairly slow.
If anyone *must* use Itanium 1 to measure performance,
my deepest sympathies.

> What about Itanium 2?

It's not mentioned in the current Itanium 2 "Spec Updates".
http://www.intel.com/design/itanium2/specupdt/251141.htm
I'll assume it's fixed.

grant


