more query notes

Frank Foeth foeth01 at orange.nl
Fri Mar 30 07:35:51 EDT 2007


Hi Pavel,

Sorry I still haven't updated anything yet. But I did manage to write
down some notes that were swirling through my mind the past weeks. I'd
like some comments, if possible. If the notes are not helpful, please
indicate why.

Yours,

Frank
-------------- next part --------------
Query.py use

Bottom line
You define a parameter function which extracts a piece of data from somewhere;
you define a (boolean) statement which evaluates if the output of the parameter function meets a criterion;
you use the output of multiple such statements and other queries in AND or OR type relationships to determine the result of a complicated query.
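
A minimal sketch of these three layers, to make the idea concrete. The class names Parameter, Statement and Query follow the wording of these notes, but the signatures, the "AND"/"OR" mode strings and the dance records are assumptions for illustration, not the actual Query.py interface.

```python
class Parameter:
    """Extracts one piece of data from an item (assumed signature)."""
    def __init__(self, extract):
        self.extract = extract

    def __call__(self, item):
        return self.extract(item)


class Statement:
    """Boolean test applied to the output of a Parameter."""
    def __init__(self, parameter, test):
        self.parameter = parameter
        self.test = test

    def __call__(self, item):
        return bool(self.test(self.parameter(item)))


class Query:
    """Combines statements and other queries with AND or OR."""
    def __init__(self, mode, parts):
        if mode not in ("AND", "OR"):
            raise ValueError("mode must be 'AND' or 'OR'")
        self.mode = mode
        self.parts = list(parts)

    def __call__(self, item):
        results = (part(item) for part in self.parts)
        return all(results) if self.mode == "AND" else any(results)


# Hypothetical dance records; the keys are made up for this example.
dances = [
    {"title": "A", "level": 3, "bpm": 120},
    {"title": "B", "level": 7, "bpm": 180},
    {"title": "C", "level": 8, "bpm": 110},
]

level = Parameter(lambda d: d["level"])
bpm = Parameter(lambda d: d["bpm"])
hard = Statement(level, lambda v: v >= 7)
fast = Statement(bpm, lambda v: v >= 150)
easy = Statement(level, lambda v: v <= 3)

hard_and_fast = Query("AND", [hard, fast])
easy_or_both = Query("OR", [easy, hard_and_fast])  # a query nested in a query

selected = [d["title"] for d in dances if easy_or_both(d)]
```

Because statements and queries are both callables taking one item, a query can hold either kind without caring which it is.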

Complications
1. Comparing data to references is akin to the sortkey problem in the sort() function: you may need to supply a sortkey-type function. [see sort in the python documentation]
2. Complicated queries may get so confusing that they can call themselves, resulting in an endless loop. To avoid this, subqueries are checked to see if they depend on the query that is edited. (Endless loop errors are common enough, but in this case difficult to find.)
3. Calculation times can easily spiral out of control, especially in large collections in a real time application. So although optimization is not really allowed in pydance's code, some considerations regarding calculation time will be necessary.
4. Sometimes a single parameter will not do the job. E.g. in pydance both difficulty and level say something about how hard a dance is. The combo difficulty - level is most meaningful here. This is dubbed a Compound and is largely handled the same way a single parameter is.

Namespace
Although not strictly necessary, I put a separate namespace in the module. I need the module for command line use, and I know I forget a lot if things get drawn out timewise. The bonus is that this offers more control over types, something python lacks altogether. At the same time this is its main weakness: it is a breach of python's design philosophy. On the upside, it allows me to catch complication 2 in the design stage, well before it happens, in simple cases that is. If you start hacking and rehacking queries, the simple check provided will probably not prove sufficient. (I believe that if query A is added -in statement form- to query B, query B is checked for circularity, but query A should be checked too, as B changes... I doubt I do this.)
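
One way the circularity check could look, as a sketch only: when A is added to B, walk everything A depends on and refuse the addition if B turns up. This assumes every change to a query goes through add(); the class layout and the names depends_on and CircularQueryError are hypothetical, not what Query.py does.

```python
class CircularQueryError(Exception):
    """Raised when adding a sub-query would make a query depend on itself."""


class Query:
    def __init__(self, name):
        self.name = name
        self.parts = []  # sub-queries only; statements are left out for brevity

    def depends_on(self, target, _seen=None):
        """True if target is self or is reachable through sub-queries."""
        seen = set() if _seen is None else _seen
        if self is target:
            return True
        if id(self) in seen:  # already visited on this walk
            return False
        seen.add(id(self))
        return any(part.depends_on(target, seen) for part in self.parts)

    def add(self, sub):
        """Add sub as a sub-query unless that would close a loop."""
        if sub.depends_on(self):
            raise CircularQueryError(
                "%s already depends on %s" % (sub.name, self.name))
        self.parts.append(sub)


a, b, c = Query("a"), Query("b"), Query("c")
b.add(a)  # b depends on a
c.add(b)  # c depends on b, and so on a
try:
    a.add(c)  # would make a depend on itself via c and b
    loop_caught = False
except CircularQueryError:
    loop_caught = True
```

The walk starts from the query being added, so it sees the state of both queries at the moment of the change rather than at the time they were first built.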

The .info dictionary
One of my main problems with python is that it will not divulge what the hell is in a lambda function. Even something as stupid as 'lambda x: x' can never be printed back as a string. So to know what went into say a Parameter(), you'll have to do a lot of bookkeeping. At least the .info[] will allow you to store not only human readable info, but also (through eval()) machine interpretable information.
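
A sketch of the bookkeeping meant here, under the assumption that the function is handed over as source text, so that the text can be kept in .info and passed through eval(). The keys "source" and "description" are made up for the example, and eval() is only acceptable on trusted input.

```python
class Parameter:
    """Keeps the source text of its function so it can be printed and rebuilt."""

    def __init__(self, source, description=""):
        # Assumed layout of .info; the real dictionary may hold other keys.
        self.info = {"source": source, "description": description}
        self.func = eval(source)  # trusted input only

    def __call__(self, item):
        return self.func(item)

    def __repr__(self):
        return "Parameter(%r, %r)" % (
            self.info["source"], self.info["description"])


level = Parameter("lambda d: d['level']", "level of a dance")
print(level.info["description"])  # human readable
rebuilt = eval(repr(level))       # machine interpretable
```

The lambda itself still cannot be printed, but the string it was built from can, and that is enough to recreate it.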

Use in pydance - possible alterations
The namespace is not needed in pydance, and it would be possible to take it out. However, catching errors will be more difficult, and all testing code would be lost.
The freeze state (which localises some information and prevents sheer endless recalculation of the same constants; see elsewhere for more info) does not speed up the code as much as I hoped it would. Eliminating it is easy and probably best in keeping with pydance's don't-optimize rule.

Use in pydance - songselect
Currently both players choose from the same set of dances. Allowing each player an individual query could change this. Each player would be shown, and could select, only the dances allowed by his query, while still using one instance of SongDisplayItem per song. This would further trivialise selecting different games for both players (at songselect level). (Currently Gameselect receives SongItem-s which show an impressive number of optional dances, which are pruned away, especially after game selection. In theory at least, the queries would facilitate a different solution, the most extreme of which is to leave the songitems completely intact and let the queries decide whether to show a dance. I would suggest a middle road to keep the calculation time down.)

Use in pydance - query stage
To keep in touch with reality, a player needs to be able to see how many songs and dances are selected. I personally find anything less than 10 mildly annoying at best. As some song/dance collections will be seriously big, this poses a calculation time problem. If a user is allowed to get annoyed while waiting and stamps his foot a couple of times, all kinds of unwanted things may happen (both from the user's point of view and hopefully ours too.) In the two user case, this problem multiplies. 
If memory serves me correctly,
1. We should get rid of the waiting time in getting events, somewhere deep in the bowels of the program. (Maybe return PASS if insufficient time has elapsed.)
2. Furthermore, the queries should be evaluated blockwise, say check 20 song/dance-items at a time. Only after the entire query has been performed should results be made available on screen. Perhaps intermediate results should be stored for each menu item, so no query on any item will be performed twice. With current computer memory sizes, this should not be troublesome.
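
Point 2 could be sketched as a generator that handles one block per step, hands control back in between so events can be processed, and only delivers the result at the very end. The function name, the "id" key and the per-query cache dictionary are assumptions for illustration.

```python
def evaluate_in_blocks(query, items, cache, block_size=20):
    """Evaluate query over items, block_size items per step.

    Yields None after each unfinished block so the caller can handle
    events, and finally yields the complete list of accepted items.
    cache maps item id to a stored result and belongs to one query.
    """
    accepted = []
    for start in range(0, len(items), block_size):
        for item in items[start:start + block_size]:
            key = item["id"]
            if key not in cache:  # never evaluate the same item twice
                cache[key] = bool(query(item))
            if cache[key]:
                accepted.append(item)
        if start + block_size < len(items):
            yield None  # more blocks to come; nothing to show yet
    yield accepted


# Hypothetical items: 50 dances with made-up levels 0..9.
items = [{"id": i, "level": i % 10} for i in range(50)]
cache = {}
steps = 0
result = None
for outcome in evaluate_in_blocks(lambda d: d["level"] >= 8, items, cache):
    steps += 1  # the event loop would get its turn here
    if outcome is not None:
        result = outcome
```

Running the same query again with the same cache only does dictionary lookups, which is the point of storing the intermediate results.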
Especially in a two player situation where each player is allowed his own query, only SONGS with valid dances for both players should be allowed (allowed dances follow from these songs.) In practice this is most easily checked if four lists are compiled per player: one with allowed songs and one with rejected songs, and similarly allowed dances and rejected dances. At player level a song is rejected either directly or because all its dances were rejected. At application level a song is rejected if it was rejected by either player. All other songs have at least one allowed dance per player.
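
The four lists and the two rejection levels might be compiled as follows. This is a sketch: the record layout, the function names and the choice to reject all dances of a directly rejected song are assumptions.

```python
def player_lists(songs, song_ok, dance_ok):
    """Compile the four lists for one player."""
    lists = {"allowed_songs": [], "rejected_songs": [],
             "allowed_dances": [], "rejected_dances": []}
    for song in songs:
        song_passes = song_ok(song)  # False means rejected directly
        any_dance = False
        for dance in song["dances"]:
            entry = (song["title"], dance["name"])
            if song_passes and dance_ok(dance):
                lists["allowed_dances"].append(entry)
                any_dance = True
            else:
                lists["rejected_dances"].append(entry)
        # A song without a single allowed dance is rejected as well.
        target = "allowed_songs" if any_dance else "rejected_songs"
        lists[target].append(song["title"])
    return lists


def application_songs(p1, p2):
    """Songs allowed at application level: rejected by neither player."""
    rejected = set(p1["rejected_songs"]) | set(p2["rejected_songs"])
    return [title for title in p1["allowed_songs"] if title not in rejected]


# Hypothetical song collection.
songs = [
    {"title": "S1", "dances": [{"name": "easy", "level": 2},
                               {"name": "hard", "level": 8}]},
    {"title": "S2", "dances": [{"name": "easy", "level": 3}]},
    {"title": "S3", "dances": [{"name": "hard", "level": 9}]},
]

any_song = lambda song: True
p1 = player_lists(songs, any_song, lambda d: d["level"] <= 4)
p2 = player_lists(songs, any_song, lambda d: d["level"] >= 7)
shared = application_songs(p1, p2)
```

Each player's allowed dances for the shared songs can then be read straight from his own allowed_dances list.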


