Finger info for phaethon@icculus.org...



Raw source and immediate archive at http://www.icculus.org/~phaethon/plan/

Entries are currently sorted in: Newest First

Syntax (time and date in UTC, my local timezone is PST/PDT):
[YEAR].[MONTH].[DAY] <space> ~[APPROX. HOUR] <space> <space> [SUMMARY]
<empty line>
[CONTENT]
<empty line>
<empty line>



2002.11.06 ~07 The so-called "Super Tuesday"; voting day

Election Day. The ballot reminded me again that the US is based on a
representive democracy (as most so-called democracies tend to be). Five
of the seven pages regarded "whom" to vote for, and the other two were
"what" to vote on.

I prefer not to dawdle and languish over the ramifications of a
so-called democracy that puts into office people easily... "swayed"
by... "donations"... that lead to... er... "sponsored bills" that "help"
the said "sponsors" campaign to put into office said people that were...
"persuaded". So I'll concentrate on something that came mind as I
looked at the voting mechanism doohickey, which sort of relates to
security in general and a little bit on forensics.

The two times I've ever voted in my life (today, and two years ago), the
voting mechanism looked the same. In fact, it still looks the same from
when I went in with my dad a decade ago to watch him punch his ballot.
I don't know if this is an artifact of my city in particular (been here
for nearly 16 years), or the State of California in general.

The (paper) ballot itself has roughly the same area as that taken up by
the typographic ("letter") keys on a typical PC-ish keyboard. The
ballot is basically a souped up punchcard; for those familiar with
Scantron forms, it's like a big version of one that has the choice poked
out to form a hole instead of bubbling-in with a #2 pencil.

The voting machine is all-mechanical. It's a mostly plastic thin box,
just large enough to house the ballot. The machine's business face is a
grid of hole masks (to guide the holepunch). Stiff paper pages with the
meaning of the particular holes are bound between columns of holes
(rotating on fine metal rods). Laid horizontally, the pages flip down
to hide all but one column at a time. The end result is that the voter
turns to a particular page, and the printed choice points with an arrow
to the exposed hole representing the choice. As a neat trick against
improperly inserted ballots, another layer of plastic hole mask lies
just under the machine main grid of holes, spring-loaded against the
ballot, such that the two holes do not line up (blocking the holepunch)
unless the ballot is properly inserted deeply. An equally simple
mechanism at the insertion end uses small plastic posts that the ballot
end slips over to keep it anchored against the springs. Should the
ballot slip out, the spring pops out the ballot and de-aligns the two
layers of hole masks to prevent any punching.

(ASCII art, monospace font required)
  punch holes
 contents | semi-blank page
  V V V more pages>>
+-------------+-+-------------+-+-+
|#OFFICE FIO=>|o| |#|#|
|# ERI=>|o| |#|#| <- # = zebra codes (!)
|# RUMI=>|o| ========> |#|#|
|# |o| GO TO |#|#|
|#FOO YES=>|o| NEXT PAGE |#|#|
| NO=>|o| | | |
| |o| | | |
| GOATSE YES=>|o| check for | | |
| NO=>|o| chad | | |
+-------------+-+-------------+-+-+
  next column on ballot
  |
(next page) V
+-+-------------+-+-------------+-+
|#|#PROP X YES=>|o| |#|
|#|#blah NO=>|o| |#|
|#|#blahblah |o| ========> |#|
|#|#blahblah |o| GO TO |#|
| | |o| NEXT PAGE | |
| | MEASURE Q |o| | |
| | blah YES=>|o| | |
| | blah NO=>|o| check for | |
| | blahblah |o| chad | |
+-+-------------+-+-------------+-+

Eh... where was I...

Oh yeah. I mulled over my choices "on the spot", taking it SATs-style
(mark what I dunno, mark it for return (but on another sheet paper!), go
on to others, come back later). Yeah yeah, should've researched, yadda
yadda. But while I did, I noticed bits of chads in the holes, of
choices I didn't decide on yet. So much for secret ballots. Well, OK,
so I don't know who was at it before, but it's certainly disheartening
when you see a bit of chad in the "No" hole when you're thinking "maybe
Yes...". Even more disheartening when the chad doesn't fly away despite
all your blowing and realizing the only way you're going to get rid of
it is by punching it through into the chad collector... as your choice
(can't do it with ballot removed -- the holes masks don't line up).

Anyway, I was thinking, despite all this effort to make the ballots
"secret", something so insignifcant as a random cruft of paper sticking
to an anonymous holepunch on its way out can still give away so much
information.

I find it intriguing how something so insignificant as a tiny tad of
chad can say so much. Security is one thing, secrecy another. Security
through obscurity rests on secrecy, and then some tiny bit (literally)
in just the right/wrong place can blow this kind of security.

Hrm. I forgot where I headed with this.


2002.10.31 ~23 Using balanced BST in TinyScheme

Basically I copied the trees algorithms in q3asm over to TinyScheme, the
C portion. Currently only the oblist (list of existing symbol names) is
a tree, but the speed increase in my mod (layered on top of TinyScheme)
is still noticeable, since TinyScheme constantly searches for existing
symbol names. The environment register (E), for resolving variable
names, still uses a list of frames, each frame using a list of bindings.
I don't think there's much I can do about the list of frames, but the
frames themselves might improve with the use of trees. I don't remember
how large the root (global) frame gets; I think it's beyond a thousand.

I haven't gotten around to incorporating splay trees into q3asm, since I
don't have a very strong motivation to do so at the moment. When I do,
I'll probably bring splay trees into TinyScheme as well.


2002.10.29 ~04 Splay tree, a more complex tree structure

OK, because of this q3asm thing, I'm suddenly in a tree mood.
Apparently in my textbook, three chapters after the one on R-B and A-A
trees, is splay tree. This may be a(n overkill) solution for my opcodes
worries. Basically the idea of a splay tree is to repeatedly split
apart the tree and glue them back together in different ways such that
more frequently-accessed elements rise towards the root (and thus
cheaper for *future* accesses), while still maintaining the tree's
order. The penalty here is that worst-case scenario is much much worse
than RB/AA tree, on the hopes that average-case, for the more frequent
nodes, gets better. And all that splitting/splaying also costs.

The idea is to spread out the cost of the few worst-case accesses across
many very-good-case accesses (this is where the term "amortize" comes
in, but I'm still not sure exactly what the definition of the word is).
Of course, this depends on there being *some* "frequent nodes".
Apparently the cost of splay trees is horrendous for purely-random
accesses. Fortunately, (functioning) assembly doesn't involve throwing
together random opcodes in random sequences.

The splay tree, in some ways, reminds me of Huffman coding, where the
more frequent sequences use fewer bits than less-frequent sequences. Or
of caches, scoring big wins for frequently accessed memory. Splay tree
would dynamically re-arrange itself so that the more freqently sought
node cost less to find than less frequently sought nodes.

Splay tree may also be a good idea for the symbols tree, as there are
likely to be some symbols accessed more frequently than others. For
example, g_entities[], the list of all entities in the game world.


2002.10.29 ~01 Bring back hashtable to q3asm-turbo?

Just now I bothered looking at the numbers related to q3asm-turbo's data
structures. For my mod, the hashtable q3asm-turbo had these numbers:

Symbol hash: 17819 nodes, longest chain = 45, mean non-empty = 9.320
Opcode hash: 95 nodes, longest chain = 3, mean non-empty = 1.484

The Bal.Bin.Tree implementation has these numbers:

Symbol tree: 17819 nodes, tree height = 19
Opcode tree: 95 nodes, tree height = 10
Export tree: 26515 nodes, tree height = 21

In the optimal case the symbol tree would have a height of 15 (log2 of
17819), and the opcode tree would have height of 7. The height of the
tree represents worst case time for searching.

In the hashtable implementation, the longest chain indicates worst-case
time for search, and the mean average roughly corresponds to mean
(average) search time. Within each chain in the hashtable, though,
sorting is done by linear-time insertion sort, and thus the major
portion of cost. The thing is that the opcode hash has a longest chain
of 3. This is next to squat (almost constant) for sorting and
searching. Meanwhile, the opcode tree has a height of 10, worst case
searching is visiting 10 nodes. In the tree, only 15 opcodes would be
reachable in 3 hops, while any opcode in the hashtable can be reached,
guaranteed, within 3 hops.

The question here is if I should resurrect the hashtable specifically
for the opcodes. Trees have major wins in the large-amount-of-data
department (due to reduced insertion costs), but hashing seems to win in
small-sets-of-data.


2002.10.29 ~00 q3asm "static" and balanced binary tree done

http://www.icculus.org/~phaethon/q3a/q3asm-turbo/q3asm-turbo.html
http://www.linux.ucla.edu/~phaethon/q3a/q3asm-turbo/q3asm-turbo.html

I did it. Two birds with... uh... two stones. Two separate forked
updates. The first is just a straight port of the old q3asm to the 1.32
SDK, so that patch(1) stays quiet. The second has some major mojo.

Firstly, I finally carried out my threat of moving the data structures
from hashtables to (balanced) binary trees. The speed improvement over
hashtables isn't very dramatic, but still noteworthy. Empirical testing
suggests a time decrease of 10% to 30% compared to hashtables. I'm too
lazy to set up millisecond-resolution timing, but it feels faster to me.

I actually cracked open my CS textbook and read through the trees
chapter for the nth time. I finally half-grok what's going on with the
rotations, but I'd be ****ed if I have to implement a R-B tree from
scratch on a blank screen without any references.

An added benefit of putting in trees is extra flexibility in additional
data... data... err... yeah, additional data. For example, a mapping
of private symbol names to public symbol names. Such as that implied by
the "static" keyword in C.

As I had priorly brainstormed, all symbols in a file are mangled based
on filename (actually, file index number, but close enough) to mimic
private namespaces. The "export" assembler directive, which takes as
the sole parameter the name of the local symbol to be made global,
creates a mapping such that the specified global name resolves to the
file's particular private symbol (name). The "import" directive acts
similarly, creating a mapping such that the local private symbol name
resolves to the global name, then recursively re-resolves to the
"actual" private symbol in the other file('s namespace). These mappings
are sorted and searched in a balanced binary tree. Good thing, too,
since there are upwards of 20K such entries.

Basically, it was a matter of mimicking private namespaces and shifting
names between those (fake) namespaces.


2002.10.28 ~09 Thoughts on Blender win32 build, and q3asm "static" keyword

+ Blender win32

Blender is a cross-platform app, so consistency dictates the build process
should remain cross-platform. The problem is that the primary compiler
used in win32 is Microsoft Visual... um... Studio? C++? Whatever.
(Whether this can be considered "the system's C compiler" is a topic of
another rant). The point stands that MSVC isn't very make(1)-unfriendly,
is very non-standard. The problem is that automake as-is does not work
well with MSVC as the C(++) compiler.

One solution is to attempt to conjure up the necessary MSVC files based on
the automake files. This method seems doable -- the tree structure of the
source is patterned enough to establish a build path even without the
automake files. The problem with this attempt is, again, non-standardism.
I can't grok the formats of the MSVC files (the ones packaged with the
Blender source tree). Or maybe I don't want to... but they seem just plain
weird. Primarily on issues such as whether those header lines are
critical, the sensitivity on whitespaces (valid chars and amounts), and wtf
are the comment delimiters. Oh yeah, and those crazy-ass backslash for
path separator.

Wonder if there's a grammar describing these .dsp (.dsw?) files.

+ q3asm

I have many reasons for wanting to split q3asm into two distinct programs,
q3as and q3ld. One is to assist the migration of q3vm compilation into GCC
later on (WAAAAAAY later on). Another is to allow use of library (.a)
files in Q3VM linking -- this helps cut down on reaching across
directories, as is currently done in cgame/ for obtaining bg_* (both games)
source files. Related is testing my libc reimplementation, to avoid
repeated recompilation and to test standalone worthiness (i.e. not
intimately linked to my mod). An offshoot to that is properly handling the
C keyword "static" (libc has stuff that really should be private).

I already started on a hack for "static" support, using the source filename
as a "tag" to symbol names, to make symbols unique to a particular source
file. This establishes private namespaces per file. The sticky part with
this method is trying to get globals to be globals. At some point the
filename-"tag" portion needs to be stripped (simple), and somehow resolve
to the unique still-tagged symbol (a little hairy), while ensuring no
conflicts (some more hair). The key to the breakthrough lies in the
assembler directives "import" and "export". The hard part is trying to
come up with the proper data structures to express this method of scoping.
I think I need to work out some more of this on paper.

I really want to shift q3asm's symbol table to a tree, for many reasons.
For one, trees would boost the speed some more (although the need is
arguable, as some people are getting reported time-elapsed of 0 seconds).
Second, the data structure and methods for trees are much neater than those
for a chain-and-bucket hash table. And finally, trees are just cool. For
speed reasons, the tree would need to be weight-balanced somehow. I would
like to just link to a tree library of some kind, but that would be an
extra burden on whomever wants to use the tree'd q3asm. So my options are
either to rip tree code and stuff 'em into q3asm, or dig through my CS
books again looking at the tree implementations. I understand the basic
concept of R-B tree, that the markings help keep the balance... but I just
get lost in all those rotations. :(


2002.10.27 ~06 improved RNG in Q3VM, Q3 1.32

+ Towards a better (Pseudo-)Random Number Generator.

http://www.icculus.org/~phaethon/q3a/nbrng/nbrng.html

My quest for a better RNG started in Quake 3 Urban Terror (q3ut) v2.5,
fueled by countless numbers of "Bullshit Incidents", aka "being 2.5'd".
I'm not in a position to improve q3ut myself (read: damn elitist "dev
group"), but the repeated 2.5'ing fired up my drive.

The stock random number generator in Q3A is a laughably deterministic
cyclic PRNG. Just multiplying the seed by 69069, storing the result back
into the seed, and then using the new seed value as the random number.
(Note: mods compiled to native code use the host OS's (P)RNG)

A google search for free RNG source turned up "noiz". The original noiz
acts like an ad-hoc system-wide RNG, using an entropy pool file in /etc (or
elsewhere) and explicit programs to take the pool, stirs the pool with
stdin, and writes back the altered pool; basically a shellified+disked
variant of an entropy collector.

My terminology is probably rather shoddily used and incorrectly applied,
since I just started to look into RNG. I never realized the breadth of
research on just figuring out how to *induce* randomness.

Anyway, I took noiz and modified it to be function+cvar based. The
function to actually collect entropy needs to be strategically placed
throughout the mod to correctly scoop up entropy, but otherwise the
noiz-based RNG is far more nondeterministic than the stock PRNG.

+ Quake 3 Software Development Kit v1.32.

ftp://ftp.idsoftware.com/idstuff/quake3/source/linuxq3a-sdk-1.32.x86.run

So, the new SDK to reflect the new Q3A mod (baseq3) source was released a
few days ago. Apparently most of the baseq3 changes involve
PunkBuster-related options, such as UI elements to turn it on/off. At
last id released their patched lcc, which compiles C to Q3VM assembly
(which is actually a variant on the lcc bytecode machine description).

As far as I can tell, the 1.32 q3asm doesn't have any speedups over the
prior (1.27?) version, although someone took the time to clean up the code
to keep gcc quiet.

+ PunkBuster

So now Q3 1.32 has PunkBuster. I have no idea what the basic principles
are behind it, but apparently it's supposed to reduce cheating somehow.
Since it (PB) deposits .so files on my (client) side, I presume these
checks are done client-side. One lesson I recall from the Q1 source
release is that anything run on client-side is totally untrustable, as
there is really no limit to what the client-side can "say". So I seriously
doubt PB can catch "all" or even "most" cheats, if it relies on this
client-side .so file to check for client-side... um... stuff.

Aside from that, I'm starting to notice strange lags in Q3. Now everything
feels like a 100+ ms lag, instead of the usual inherent 50ms lag. It
almost feels like CPU/client-side lag, but the framerate still pegs the
meter at high double digits. Everything, from baseq3 to ufreeze to q3ut2
to wfa, from extremely low ping to extremely high ping. Either the netcode
in Q3 1.32 has hit the shitter, or PunkBuster is stuffing Q3's face into
the shitter with extra net comm.

Neither one bodes well. Still, it would be extremely stupid if PunkBuster
maintains some kind of "out-of-bounds" channel _during_ gameplay, since
even so much as a rogue SYN packet can exponentially compound the lag on a
56K connection (a large segment of Q3 players, iirc) and completely ruin
the experience (of either side of 0wn@ge). But I find it harder to believe
Id would deliberately cripple the netcode in 1.32 sufficiently to make the
difference noticeable on a broadband connection. Which is it? Which way
would Occam's Razor cut?


Life: No Life found.

Project:
1. Project FI, Quake 3 mod (http://www.icculus.org/fi/)
 a. provide an extensible environment for a Q3 mod. The intended notion is that of "mutators" in Unreal Tournament.
 b. FI:WFC, a more faithful reproduction of Q2WF for Q3 than WFA.

2. QuakeScheme
 * Extensible language for Project FI.
 * Builds on TinySCHEME (http://tinyscheme.sourceforge.net/)
 * Deal with idiosyncrasies of Q3VM not handled by most other Scheme impls.

3. Q3VM libc
 * Implementation of Standard C Library for Q3VM bytecode.
 * Implementation of a subset of Single Unix Specification v2 (SUS v2).
 * Help import third-party library into Q3VM.

4. QS GUI/widget set
 a. Need to research advanced OO and GUI of Scheme derivatives and Common Lisp.
 b. Replication/extension of boxy widgets in Q3TA (Q3 PR 1.27+).
 c. Pie menus -- just to annoy theoddone33.

5. Q3 compilation toolchain
 [X] q3lcc sources (official version out with Q3A SDK 1.32)
 [ ] q3asm - get static to work, dammit.
 [ ] q3as - assemble-only (.asm to .o).
 [ ] q3ld - link-only (.o/.a to .qvm).

5. PalmOS stuff
 a. PiNGer (gfx viewer)
  * generalize interface to a "any-gfx" viewer (libpnm?)
 b. ZBoxZ (file manager)
  * beef up its appness: menus, dialogs, pen actions


When this .plan was written: 2002-11-06 18:38:00
.plan archives for this user are here (RSS here).
Powered by IcculusFinger v2.1.27
Stick it in the camel and go.