[physfs] 7z/lzma in trunk

Tue Sep 26 14:30:04 EDT 2006

On Friday, 1 September 2006 21:45, Thomas Einsporn wrote:
> > So it will take some time till I got an answer and longer till that will
> > be working in physfs... Please be patient. ;)
It took me quite a while to figure out how the LZMA SDK is supposed to 
work, but after that I arrived at a simple solution.
I hope it is bug-free; at least I found no crashes when testing with some 
sample archives.
All of them were compressed with standard (read: no extra) options.
The test was compiled on Linux using GCC, so it would be nice if this got 
tested on Windows with MSVC.

This is not a small bugfix but rather a complete rework that uses much more 
of the LZMA SDK's functionality.

A patch is attached. Please note that I removed some files and added others.
D      lzma/LzmaStateDecode.c
D      lzma/LzmaStateDecode.h
A      lzma/7zExtract.c
A      lzma/7zExtract.h
A      lzma/LzmaDecode.c
A      lzma/LzmaDecode.h
A      lzma/LzmaTypes.h

The SDK has also been updated to 4.43 and is a (nearly) unmodified version:
In 7zIn.c:185 and 7zDecode.c:73 I had to change inBuffer to void* 
because otherwise GCC 4 complained about dereferencing type-punned pointers.
See also the bug report: 
http://sourceforge.net/tracker/index.php?func=detail&aid=1565800&group_id=14481&atid=114481
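
For anyone curious about that warning, here is a minimal sketch of the 
pattern. It assumes a stream callback that returns its buffer through a 
void** parameter. demo_read, demo_data and peek_first_byte are made-up names 
for illustration and are not taken from the SDK.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char Byte;

/* Sample bytes standing in for archive data (hypothetical). */
static Byte demo_data[4] = { 0x37, 0x7A, 0xBC, 0xAF };

/* Hypothetical stand-in for a stream callback that hands back a pointer
 * to its internal buffer through a void** out-parameter. */
static int demo_read(void **buffer, size_t max_size, size_t *processed)
{
    assert(buffer != NULL && processed != NULL);
    *buffer = demo_data;
    *processed = max_size < sizeof demo_data ? max_size : sizeof demo_data;
    return 0;
}

/* Pattern that triggers the type-punning warning:
 *
 *     Byte *inBuffer;
 *     demo_read((void **)&inBuffer, 1, &processed);
 *
 * The cast makes the callee store a void* into an object that was
 * declared as Byte*.
 *
 * Workaround sketched here: declare the variable as void* so that no
 * cast is needed, and convert to Byte* only when reading the data. */
static int peek_first_byte(Byte *out)
{
    void *inBuffer = NULL;
    size_t processed = 0;

    assert(out != NULL);
    if (demo_read(&inBuffer, 1, &processed) != 0 || processed != 1)
        return -1;

    *out = *(const Byte *)inBuffer;
    return 0;
}
```

Converting a void* to a Byte* afterwards is fine, because character types 
may alias any object.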

In case others want to know how this works:
A 7z "folder" means a block of the compressed archive.
SzExtract extracts one complete block and gives the offset of the requested 
file in that block.

My implementation caches one block per archive. You can experiment with the 
block size of your archives to find the best trade-off between speed and 
memory usage.
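
To illustrate the one-block cache, here is a rough sketch. It is not the code 
from the patch: DemoFile, BlockCache, demo_extract_block and cache_read are 
hypothetical names, and the "decompression" is a stand-in that just fills a 
buffer with a pattern. The point is only that a block is extracted when the 
requested file lives in a different folder than the cached one, and that the 
file data is then copied from its offset inside the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BLOCK_SIZE 16

/* Hypothetical file entry: which folder (block) holds the file, and
 * where it sits inside the decompressed block. */
typedef struct {
    unsigned folder;
    size_t offset;
    size_t size;
} DemoFile;

/* One cached block per archive. */
typedef struct {
    int valid;
    unsigned folder;
    unsigned char *data;
    size_t size;
    unsigned extract_count;   /* how often a block had to be extracted */
} BlockCache;

/* Stand-in for real decompression: fills the block with a pattern
 * derived from the folder index. */
static unsigned char *demo_extract_block(unsigned folder, size_t *size)
{
    unsigned char *buf = malloc(DEMO_BLOCK_SIZE);
    size_t i;

    if (buf == NULL)
        return NULL;
    for (i = 0; i < DEMO_BLOCK_SIZE; i++)
        buf[i] = (unsigned char)(folder * 100 + i);
    *size = DEMO_BLOCK_SIZE;
    return buf;
}

/* Copies the file's bytes into dst, extracting its block only if that
 * block is not the one currently cached. Returns 0 on success. */
static int cache_read(BlockCache *cache, const DemoFile *file,
                      unsigned char *dst)
{
    assert(cache != NULL && file != NULL && dst != NULL);

    if (!cache->valid || cache->folder != file->folder) {
        size_t size = 0;
        unsigned char *data = demo_extract_block(file->folder, &size);

        if (data == NULL)
            return -1;
        free(cache->data);            /* drop the previously cached block */
        cache->data = data;
        cache->size = size;
        cache->folder = file->folder;
        cache->valid = 1;
        cache->extract_count++;
    }

    if (file->offset > cache->size ||
        file->size > cache->size - file->offset)
        return -1;                    /* file does not fit in the block */

    memcpy(dst, cache->data + file->offset, file->size);
    return 0;
}
```

Reading two files from the same folder back to back extracts the block only 
once; alternating between folders extracts a block on every read, which is 
where the block size matters.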

It could be changed to cache more than one block and switch between them on 
read() in case you need random access on big archives.
If you only read one file at a time, the current code will be 
sufficient (IMO).
In the multi-cache approach, LZMA_read() would keep the last X cached blocks.
When it finds that the folder (read: block) of the requested file is 
currently cached, it would increase the priority of that block.
If the folder of the requested file is not currently in the cache, it would 
drop the cached block with the lowest priority (read: least frequently used) 
and cache the requested block instead.
But as this probably needs quite some fine tuning (number of cached blocks and 
priority algorithm) I didn't implement it. (And as I said, the current 
approach is probably sufficient.)
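
For reference, the bookkeeping for such a multi-block cache could look 
roughly like the sketch below. This is only an illustration of the idea, not 
something that exists in the patch: CACHE_SLOTS, CacheSlot, MultiCache and 
multi_cache_get are hypothetical names, the priority is a plain hit counter, 
and the actual decompression is left out.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 2   /* the "X" from the text; would need tuning */

typedef struct {
    int used;
    unsigned folder;     /* folder (block) index held by this slot */
    unsigned priority;   /* simple hit counter */
} CacheSlot;

typedef struct {
    CacheSlot slots[CACHE_SLOTS];
    unsigned misses;     /* how often a block had to be (re)loaded */
} MultiCache;

/* Returns the index of the slot that holds the requested folder.
 * On a hit the slot's priority is increased; on a miss an empty slot
 * or, failing that, the slot with the lowest priority is reused. */
static size_t multi_cache_get(MultiCache *mc, unsigned folder)
{
    size_t i;
    size_t victim = 0;

    assert(mc != NULL);

    for (i = 0; i < CACHE_SLOTS; i++) {
        if (mc->slots[i].used && mc->slots[i].folder == folder) {
            mc->slots[i].priority++;
            return i;
        }
    }

    for (i = 0; i < CACHE_SLOTS; i++) {
        if (!mc->slots[i].used) {
            victim = i;               /* an empty slot always wins */
            break;
        }
        if (mc->slots[i].priority < mc->slots[victim].priority)
            victim = i;
    }

    /* A real implementation would free the old block and decompress
     * the requested one here. */
    mc->slots[victim].used = 1;
    mc->slots[victim].folder = folder;
    mc->slots[victim].priority = 1;
    mc->misses++;
    return victim;
}
```

With a pure hit counter, a block that was popular early on can stay cached 
for a long time after it stops being used, which is one reason the priority 
algorithm would need tuning.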

I hope this makes its way into SVN soon. :)

--Dennis
-------------- next part --------------
A non-text attachment was scrubbed...
Name: reworked_lzma.patch
Type: text/x-diff
Size: 86317 bytes
Desc: not available
URL: <http://icculus.org/pipermail/physfs/attachments/20060926/76d6b63e/attachment.bin>