Python performance report
Neil Toronto
ntoronto at cs.byu.edu
Wed Dec 13 20:33:30 EST 2006
System: 1.66GHz Intel Core Duo laptop with frequency scaling, Intel
945GM graphics, Ubuntu 6.10.
On this system, with com_maxfps set to 60, Python is doing fine at low
latencies, and mostly fine at high.
Here's some background. I've got the Python client game loaded and
rendering the map, all players, all items, and some effects like rocket
explosions. Most of the sounds play. A significant part of all this is
player prediction, which I'll go over now. It's actually rather
annoying, but kind of cool at the same time.
If you remember Quake before QuakeWorld, you might remember that it was
impossible to play online without a really good connection. The reason
is that your client got your current position by interpolating between
the snapshots it received, and those snapshots, of course, arrived with
some delay. The end result was that, when you moved, your move had to
bounce off the server before you actually saw it. If you want to see
what it's like, you can make Quake 3 do this by setting cg_nopredict to
1. Things like rocket fire and bullets still work this way.
Player prediction compensates for that, and has been included ever since.
Here's the short version: for every frame rendered, the client starts
with a fresh player state from the snapshot and *plays back every move
you've made that the server hasn't seen yet*.
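
To make the replay idea concrete, here is a minimal sketch in Python. It
is not the actual client game code: PlayerState, Move, apply_move and
predict are hypothetical names, and the one-dimensional "physics" is a
stand-in for the real player move code.

```python
from dataclasses import dataclass


@dataclass
class PlayerState:
    # Hypothetical, heavily simplified player state (one axis only).
    x: float
    vx: float


@dataclass
class Move:
    # Hypothetical user command: sequence number, acceleration, duration.
    seq: int
    accel: float
    dt: float


def apply_move(state, move):
    """Apply one move to a state and return the new state (toy physics)."""
    vx = state.vx + move.accel * move.dt
    return PlayerState(x=state.x + vx * move.dt, vx=vx)


def predict(snapshot_state, last_acked_seq, moves):
    """Start from the snapshot and replay every move the server hasn't seen."""
    state = snapshot_state
    for move in moves:
        if move.seq > last_acked_seq:
            state = apply_move(state, move)
    return state
```

This runs once per rendered frame, so the cost grows with the number of
unacknowledged moves.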
Yeah, so that's rough on the CPU. Your client plays back about "ping *
FPS / 1000" moves per rendered frame, with ping in milliseconds. If you
ping 500 and your FPS is 60, it plays back 30.
According to my profiling, the hardest part of moving a player is a
trace (collision detection), and your average player move makes 3 or 4
calls to trace: to find the ground, to move him forward, to slide him
against a wall or another player, and to find the ground again. At 30
moves per frame, that's roughly 100 trace calls per frame, which is
pretty expensive.
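
The arithmetic above, as a small Python sketch. The function names are
made up for illustration, and the 3-to-4 traces per move figure is the
rough average from my profiling, not a fixed constant.

```python
def replayed_moves(ping_ms, fps):
    """Approximate number of moves replayed per rendered frame."""
    return ping_ms * fps / 1000


def trace_calls_per_frame(ping_ms, fps, traces_per_move):
    """Approximate trace calls per frame spent on player prediction."""
    return replayed_moves(ping_ms, fps) * traces_per_move


# Ping 500 at 60 FPS, with 3 and 4 traces per move.
low = trace_calls_per_frame(500, 60, 3)
high = trace_calls_per_frame(500, 60, 4)
print(replayed_moves(500, 60), low, high)
```

With those inputs the estimate lands between 90 and 120 trace calls per
frame, which is where the "roughly 100" comes from.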
Contrast this with the server game, which only has to run each player
move *once*. The client game is definitely a sucker for punishment.
The good news is that Python is doing fine proxying all those trace
calls, and that the traces themselves take half the time it takes to
render a frame when you ping 500. Python does get choppy before a native
DLL does, but at normal pings (sub-200) and normal frame rates
(sub-120), it'll perform just fine on current hardware. Also, I now have
no worries whatsoever about Python's performance on either the client or
the server.
Couple of tidbits:
1) I've implemented cg_latentSnaps (from Unlagged), and it was way
easier than in 1.27. Somebody seriously cleaned up cg_snapshot.c. This
cvar allows you to simulate lag. The client game simply requests
snapshot number "current - cg_latentSnaps".
2) I've added com_laptop, which, if 1, causes Sys_Sleep(0) to be called
at the end of the (busy-wait) rendering loop. It makes Quake 3 a little
more laptop-friendly, and also lets you see the *real* amount of CPU
Quake 3 takes on your performance meter. (Otherwise it's always 100%.)
Neither of these is in the main trunk, of course, but they ought to be
considered.
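
For the curious, both cvars boil down to very little logic. Here is a
rough sketch in Python rather than the engine's C. The function names
are hypothetical, and time.sleep(0) is only a stand-in for the engine's
Sys_Sleep(0), not a claim about how that call behaves.

```python
import time


def latent_snapshot_number(current, latent_snaps):
    """Snapshot to request when simulating lag: current - cg_latentSnaps."""
    return current - latent_snaps


def run_frames(render_frame, frame_count, laptop_mode):
    """Busy-wait style frame loop with an optional sleep after each frame."""
    for _ in range(frame_count):
        render_frame()
        if laptop_mode:
            # Stand-in for Sys_Sleep(0) at the end of the rendering loop.
            time.sleep(0)


rendered = []
run_frames(lambda: rendered.append(1), 3, laptop_mode=True)
```

With laptop_mode off, the loop never gives up the CPU, which is why the
performance meter always reads 100%.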
Neil