[quake3-bugzilla] [Bug 3639] BoxOnPlaneSide patch

bugzilla-daemon at icculus.org bugzilla-daemon at icculus.org
Tue Sep 15 01:51:11 EDT 2009


http://bugzilla.icculus.org/show_bug.cgi?id=3639

Patrick Baggett <baggett.patrick at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |baggett.patrick at gmail.com

--- Comment #4 from Patrick Baggett <baggett.patrick at gmail.com> 2009-09-15 01:51:08 EDT ---
Okay, so I decided to pick a random bug and work on it, and so I selected this
one. I wrote a test case and ran it:

IOQuake3 BoxOnPlaneSide(): avg 184 milliseconds (1024 tests of 10M calls)
Proposed BoxOnPlaneSide(): avg 179 milliseconds (1024 tests of 10M calls)

This difference is about 5 msec per 10M calls (under 3%), which is basically
negligible. Keep in mind this is the C version, not the x86 assembly version.
If the assembly version is even marginally faster than the C version, that
pushes the difference into epsilon-squared territory.
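
For reference, a timing harness along these lines would produce numbers in
the same format. This is only a sketch, not the test case I actually ran:
dummy_side(), time_calls_ms() and average_ms() are made-up names, and the
dummy function stands in for BoxOnPlaneSide(), which really takes a box and a
plane.

```c
#include <time.h>

/* Hypothetical stand-in signature; the real function under test takes
   a bounding box and a plane rather than a single int. */
typedef int (*test_fn)(int);

/* Placeholder work that returns 0..3, like a side mask would. */
static int dummy_side(int i)
{
    return i & 3;
}

/* Time `calls` invocations of fn and return elapsed CPU milliseconds.
   The results are summed into *sink so the loop has an observable
   result and is less likely to be optimized away. */
static double time_calls_ms(test_fn fn, long calls, volatile long *sink)
{
    clock_t start = clock();
    long sum = 0;
    for (long i = 0; i < calls; i++)
        sum += fn((int)i);
    *sink = sum;
    return 1000.0 * (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Average the elapsed time over `runs` repetitions. */
static double average_ms(test_fn fn, int runs, long calls, volatile long *sink)
{
    double total = 0.0;
    for (int r = 0; r < runs; r++)
        total += time_calls_ms(fn, calls, sink);
    return total / runs;
}
```

Calling average_ms(fn, 1024, 10000000L, &sink) would mirror the "1024 tests
of 10M calls" setup. clock() measures CPU time at a coarse resolution, which
is fine for runs that take hundreds of milliseconds.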

My guess:
The IOQuake3 BoxOnPlaneSide() uses a switch() statement. If the case is 7, then
the first 6 cases must be checked first. The proposed version can be done as an
unrolled loop. Otherwise, the fp-add/fp-mul is exactly the same...
I'm sure if you got GCC to generate a jump table the results would be basically
the same.
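
To make the guess concrete, here is a rough sketch of the two shapes being
compared. It is not the engine source or the patch attached to this bug:
plane_sketch, signbits_for(), classify(), side_switch() and side_loop() are
illustrative names, and the axial fast path is left out. The point is that
both variants do the same three multiplies and adds per corner and differ
only in how the corner is selected.

```c
/* Simplified plane with only the fields the general case needs.
   Illustrative struct, not the engine's own plane type. */
typedef struct {
    float normal[3];
    float dist;
    int   signbits;   /* bit i is set when normal[i] < 0 */
} plane_sketch;

/* Build the signbits mask from a normal. */
static int signbits_for(const float n[3])
{
    int bits = 0;
    for (int i = 0; i < 3; i++)
        if (n[i] < 0.0f)
            bits |= 1 << i;
    return bits;
}

/* Shared tail: classify the box from its two extreme corner distances. */
static int classify(float d1, float d2, float plane_dist)
{
    int sides = 0;
    if (d1 >= plane_dist) sides = 1;    /* part of the box is in front */
    if (d2 <  plane_dist) sides |= 2;   /* part of the box is behind */
    return sides;
}

/* Switch style: one hand-written case per signbits value (0..7). */
static int side_switch(const float mins[3], const float maxs[3],
                       const plane_sketch *p)
{
    const float *n = p->normal;
    float d1 = 0.0f, d2 = 0.0f;
    switch (p->signbits) {
    case 0: d1 = n[0]*maxs[0] + n[1]*maxs[1] + n[2]*maxs[2];
            d2 = n[0]*mins[0] + n[1]*mins[1] + n[2]*mins[2]; break;
    case 1: d1 = n[0]*mins[0] + n[1]*maxs[1] + n[2]*maxs[2];
            d2 = n[0]*maxs[0] + n[1]*mins[1] + n[2]*mins[2]; break;
    case 2: d1 = n[0]*maxs[0] + n[1]*mins[1] + n[2]*maxs[2];
            d2 = n[0]*mins[0] + n[1]*maxs[1] + n[2]*mins[2]; break;
    case 3: d1 = n[0]*mins[0] + n[1]*mins[1] + n[2]*maxs[2];
            d2 = n[0]*maxs[0] + n[1]*maxs[1] + n[2]*mins[2]; break;
    case 4: d1 = n[0]*maxs[0] + n[1]*maxs[1] + n[2]*mins[2];
            d2 = n[0]*mins[0] + n[1]*mins[1] + n[2]*maxs[2]; break;
    case 5: d1 = n[0]*mins[0] + n[1]*maxs[1] + n[2]*mins[2];
            d2 = n[0]*maxs[0] + n[1]*mins[1] + n[2]*maxs[2]; break;
    case 6: d1 = n[0]*maxs[0] + n[1]*mins[1] + n[2]*mins[2];
            d2 = n[0]*mins[0] + n[1]*maxs[1] + n[2]*maxs[2]; break;
    case 7: d1 = n[0]*mins[0] + n[1]*mins[1] + n[2]*mins[2];
            d2 = n[0]*maxs[0] + n[1]*maxs[1] + n[2]*maxs[2]; break;
    default: break;   /* invalid signbits: both distances stay 0 */
    }
    return classify(d1, d2, p->dist);
}

/* Loop style: each sign bit picks which accumulator gets the max corner.
   A compiler can fully unroll this fixed three-iteration loop. */
static int side_loop(const float mins[3], const float maxs[3],
                     const plane_sketch *p)
{
    float d[2] = { 0.0f, 0.0f };
    for (int i = 0; i < 3; i++) {
        int b = (p->signbits >> i) & 1;
        d[b]  += p->normal[i] * maxs[i];
        d[!b] += p->normal[i] * mins[i];
    }
    return classify(d[0], d[1], p->dist);
}
```

Both functions return the same side mask for signbits 0..7 (1 = front,
2 = back, 3 = spanning). Whether the switch becomes a compare chain or a jump
table is up to the compiler and its flags.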

SO WHAT IF THE DIFFERENCE WAS 500 MSEC?
==============================================
Let's put this into perspective. That's 500 milliseconds over 10M calls, or
50 nanoseconds per call. That means for every 100,000 calls, you save 5 msec.
Since there aren't even 100K calls to this function per frame, probably 1K-5K
at most, you'd be talking about saving tens to hundreds of microseconds per
frame.
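
The arithmetic for the hypothetical 500 msec case can be spelled out as
below. The helper names are made up for illustration, and the 1K-5K calls per
frame figure is only my estimate, not a measurement.

```c
/* Nanoseconds saved per call, given the total milliseconds saved
   across `calls` benchmark calls (1 msec = 1.0e6 nsec). */
static double saved_ns_per_call(double total_saved_ms, double calls)
{
    return total_saved_ms * 1.0e6 / calls;
}

/* Microseconds saved per frame for an assumed number of calls per frame. */
static double saved_us_per_frame(double total_saved_ms, double calls,
                                 double calls_per_frame)
{
    return saved_ns_per_call(total_saved_ms, calls) * calls_per_frame / 1000.0;
}
```

With 500 msec over 10M calls, saved_ns_per_call() gives 50 nsec per call, and
saved_us_per_frame() gives 50 usec at 1K calls per frame and 250 usec at 5K.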

BUT
===
But the difference isn't 500, it's 5, so we're talking about half a nanosecond
per call, or roughly 0.5 to 2.5 microseconds per frame at 1K-5K calls. This is
basically no speedup. For scale, light travels one meter in about 3.3
nanoseconds.

MY IDEAS
========
As for deleting the assembly, I don't think it is likely to be dramatically
faster than the C version, and even if it were, I don't think it would be a
big deal. It did exist for a reason when it was first written, though, so
while a Phenom doesn't care whether the code is hand-optimized asm or just
whatever GCC spits out, an older machine might. Replacing the C version more
or less doesn't matter either way.

Patrick
