Python performance report

Neil Toronto ntoronto at cs.byu.edu
Wed Dec 13 20:33:30 EST 2006


System: 1.66GHz Intel Core Duo laptop with frequency scaling, Intel 
945GM graphics, Ubuntu 6.10.

On this system, with com_maxfps set to 60, Python is doing fine at low 
latencies, and mostly fine at high.

Here's some background. I've got the Python client game loaded and 
rendering the map, all players, all items, and some effects like rocket 
explosions. Most of the sounds play. A significant part of all this is 
player prediction, which I'll go over now. It's actually rather 
annoying, but kind of cool at the same time.

If you remember Quake before QuakeWorld, you might remember that it was 
impossible to play online without a really good connection. The reason 
is that the client got your current position by interpolating between 
the snapshots it received, and those snapshots, of course, had some 
delay. The end result was that, when you moved, your move had to bounce 
off the server before you actually saw it. If you want to see what it's 
like, you can make Quake 3 do this by setting cg_nopredict to 1. Things 
like rocket fire and bullets are still like this.

Player prediction compensated for that, and has been included since. 
Here's the short version: for every frame rendered, the client starts 
with a fresh player state from the snapshot and *plays back every move 
you've made that the server hasn't seen yet*.
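
To make the playback idea concrete, here's a minimal Python sketch. It 
is not the actual client game code: PlayerState, Move, apply_move and 
predict are made-up names for illustration, the "physics" is a 1-D 
stand-in, and it assumes each move carries a sequence number that can 
be compared with the last one the server acknowledged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlayerState:
    x: float  # 1-D position; the real player state has far more in it


@dataclass(frozen=True)
class Move:
    seq: int        # command sequence number (assumed for this sketch)
    forward: float  # forward speed in units per second
    msec: int       # how long the move lasts, in milliseconds


def apply_move(state, move):
    """Stand-in for the real move code, which is where the traces happen."""
    return PlayerState(x=state.x + move.forward * move.msec / 1000.0)


def predict(snapshot_state, last_acked_seq, moves):
    """Start from the snapshot and replay every move the server hasn't seen."""
    state = snapshot_state
    for move in moves:
        if move.seq > last_acked_seq:
            state = apply_move(state, move)
    return state
```

The point of the sketch is that predict() runs once per rendered frame 
and always starts over from the snapshot state, so its cost grows with 
the number of unacknowledged moves.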

Yeah, so that's rough on the CPU. Every frame, your client plays back 
about "ping * FPS / 1000" moves, with ping in milliseconds. If you ping 
500 and your FPS is 60, it does 30. 
According to my profiling, the hardest part of moving a player is a 
trace (collision detection), and your average player move makes 3 or 4 
calls to trace: to find the ground, to move him forward, to slide him 
against a wall or another player, and to find the ground again. At 30 
moves per frame, that comes to roughly 100 trace calls every frame, 
which is pretty expensive.
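
The arithmetic above can be written out as a quick Python sketch. The 
function names are made up for illustration, and it assumes ping is in 
milliseconds and that every move costs the same number of traces.

```python
def moves_per_frame(ping_ms, fps):
    """Moves replayed each frame: about ping * FPS / 1000."""
    return ping_ms * fps / 1000


def traces_per_frame(ping_ms, fps, traces_per_move):
    """Trace calls each frame, assuming a fixed trace count per move."""
    return moves_per_frame(ping_ms, fps) * traces_per_move
```

With a ping of 500 and 60 FPS, moves_per_frame gives 30, and 
traces_per_frame gives 90 at 3 traces per move or 120 at 4, which is 
where the "about 100" figure comes from.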

Contrast this with the server game, which only has to run each player 
move *once*. The client game is definitely a sucker for punishment.

The good news is that Python is doing fine proxying all those trace 
calls, and that the traces themselves take half the time it takes to 
render a frame when you ping 500. Python does get choppy before a native 
DLL does, but at normal pings (sub-200) and normal frame rates 
(sub-120), it'll perform just fine on current hardware. So I now have 
no worries whatsoever about Python's performance on either the client or 
the server.

Couple of tidbits:

1) I've implemented cg_latentSnaps (from Unlagged), and it was way 
easier than in 1.27. Somebody seriously cleaned up cg_snapshot.c. This 
cvar allows you to simulate lag. The client game simply requests 
snapshot number "current - cg_latentSnaps".

2) I've added com_laptop, which, if 1, causes Sys_Sleep(0) to be called 
at the end of the (busy-wait) rendering loop. It makes Quake 3 a little 
more laptop-friendly, and also lets you see the *real* amount of CPU 
Quake 3 takes on your performance meter. (Otherwise it's always 100%.)
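
For the first tidbit, the snapshot selection can be sketched in Python 
like this. This is an illustration, not the cg_snapshot.c code: the 
function names are made up, clamping at zero is my own assumption, and 
the latency estimate assumes snapshots arrive at a steady rate that is 
passed in explicitly.

```python
def snapshot_to_request(current_snap, latent_snaps):
    """Snapshot number to ask for: current - cg_latentSnaps.

    Clamping to zero is an assumption of this sketch, so that a large
    setting can't ask for a snapshot before the first one.
    """
    return max(current_snap - max(latent_snaps, 0), 0)


def simulated_lag_ms(latent_snaps, snaps_per_second):
    """Rough extra latency, assuming evenly spaced snapshots."""
    return latent_snaps * 1000 / snaps_per_second
```

For example, at snapshot 100 with cg_latentSnaps set to 5, the sketch 
requests snapshot 95, and at a hypothetical 20 snapshots per second 
that works out to about 250 ms of simulated lag.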

Neither of these is in the main trunk, of course, but they ought to be 
considered.

Neil



