[physfs] 7z/lzma in trunk

Dennis Schridde devurandom at gmx.net
Wed Sep 27 13:40:45 EDT 2006


On Wednesday, 27 September 2006 18:03, Dennis Schridde wrote:
> On Wednesday, 27 September 2006 17:56, Dennis Schridde wrote:
> > On Wednesday, 27 September 2006 16:06, Ryan C. Gordon wrote:
> > > > After I hard-fixed the CRC calculation I tested random access on
> > > > 2 files. It was very slow too, even more so than expected... I'll
> > > > use multiple caches in the next version...
> > >
> > > Generally speaking, it's not worth going out of the way to optimize
> > > random access to formats that don't support it, like zipfiles, which
> > > have to decompress the whole file to get to the seek point...I assume
> > > lzma has the same problem.
> > >
> > > The solution is usually to cache the decompressed file in RAM inside
> > > PhysicsFS, but this is basically unacceptable on low-memory systems
> > > like the PlayStation 2...especially considering that most apps do not
> > > seek randomly through a file, or seek at all. Games and apps that need
> > > to deal with a slow file can just as easily cache the data through
> > > PHYSFS_read() without adding complexity or resource usage to the
> > > library.
> > >
> > > So if the solution is "cache more," I'd encourage you to leave it as a
> > > slow operation. If there was a fast way to jump to roughly the correct
> > > location in the compressed stream and then figure out the right
> > > plaintext offset by uncompressing a block or two of data, that would be
> > > a win, but that's not usually possible.
> >
> > Thanks for that advice.
> >
> > The current implementation (which just redirects down to 7z / LZMA)
> > works like this:
> > - Decompress the whole block that contains the file we want to read
> > from into the archive's cache.
> > - Every subsequent read just finds the correct position in that block
> > and copies the requested part into the buffer passed to LZMA_read().
> > - If a different file is read, the archive's cache is freed and the
> > block containing the new file is cached.
> >
> >
> > I bet there is some documentation on it, but this is what I found in
> > experiments:
> > - Apparently a file is always contained completely in one block. Put
> > differently, a block always includes at least one complete file.
> > - Apparently multiple files can share one block, either when working
> > on a completely solid archive or on a solid archive with a block size
> > greater than the size of 2 files.
> > - Without such solid voodoo there is exactly one file in each block.
> >
> > > (10 minutes to read 100 kilobytes of sequential data would be a problem
> > > worth optimizing though!)
> >
> > Yes, I am working on this. :)
> >
> > Question:
> > Is the current approach of using 7z's cache ok?
> > Or should I try to decompress only the needed part of the file? (I don't
> > know yet how this could work...)
>
> If that is OK (memory usage, delay on first access, etc.), then I would
> implement it so that each decompressed block is cached and each file
> keeps a reference to the block it is in and its offset within it.
> That way I only need to call SzExtract on the first access to a block,
> and if multiple files are in a block they each keep a reference to the
> "shared" block.
Implemented in the attached patch. If it is not OK this way, just tell me
and I'll dig into LZMA further and find out what I can do about it.
Speed has greatly improved, but now every block (folder) that has open
files stays allocated for as long as those files are open.
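
To illustrate the idea, here is a minimal sketch of such a shared,
reference-counted block cache. It is not the code from the patch: Block,
FileHandle, file_open(), file_read() and file_close() are hypothetical
names, and decompress_block() is only a placeholder for the real
decompression call (SzExtract in the patch), which it does not reproduce.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one decompressed 7z block ("folder"). */
typedef struct {
    unsigned char *data; /* decompressed bytes, NULL while not cached */
    size_t size;         /* size of the decompressed block */
    unsigned refs;       /* number of open files living in this block */
} Block;

/* Hypothetical per-file handle: a reference to the block plus an offset. */
typedef struct {
    Block *block;
    size_t offset;       /* where the file starts inside the block */
    size_t length;       /* size of the file */
    size_t pos;          /* current read position within the file */
} FileHandle;

static int decompress_calls = 0; /* counts the expensive step, for demonstration */

/* Placeholder for the real decompression; fills the block with dummy data. */
static int decompress_block(Block *b, size_t size) {
    size_t i;
    b->data = malloc(size);
    if (b->data == NULL) return 0;
    for (i = 0; i < size; i++) b->data[i] = (unsigned char)(i & 0xFF);
    b->size = size;
    decompress_calls++;
    return 1;
}

static void release_block(Block *b) {
    free(b->data);
    b->data = NULL;
    b->size = 0;
}

/* Open a file: decompress its block only if no other open file has done so. */
static int file_open(FileHandle *f, Block *b, size_t block_size,
                     size_t offset, size_t length) {
    if (b->data == NULL && !decompress_block(b, block_size)) return 0;
    if (offset > b->size || length > b->size - offset) {
        if (b->refs == 0) release_block(b); /* nobody else uses the block */
        return 0;
    }
    b->refs++;
    f->block = b;
    f->offset = offset;
    f->length = length;
    f->pos = 0;
    return 1;
}

/* Read: just copy from the cached block, no decompression involved. */
static size_t file_read(FileHandle *f, void *buf, size_t len) {
    size_t left;
    assert(f->block != NULL && f->block->data != NULL);
    left = f->length - f->pos;
    if (len > left) len = left;
    memcpy(buf, f->block->data + f->offset + f->pos, len);
    f->pos += len;
    return len;
}

/* Close: the block is freed only when its last open file goes away. */
static void file_close(FileHandle *f) {
    Block *b = f->block;
    if (b != NULL && --b->refs == 0) release_block(b);
    f->block = NULL;
}
```

In this sketch, opening two files from the same block calls
decompress_block() once, and the block's memory is released only when
the second file is closed, which is exactly the memory trade-off
mentioned above.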

(In LZMA-LICENSE.txt a lone "k" remained at the end, caused by me deleting 
unneeded lines in Nano with CTRL+K.)

--Dennis
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lzma_multicache.patch
Type: text/x-diff
Size: 6884 bytes
Desc: not available
URL: <http://icculus.org/pipermail/physfs/attachments/20060927/881f9ed5/attachment.bin>

