The latest version of PhysicsFS can be found at: http://icculus.org/physfs/
PhysicsFS; a portable, flexible file i/o abstraction.
This API gives you access to a system file system in ways superior to the stdio or system i/o calls. The brief benefits:
This system is largely inspired by Quake 3's PK3 files and the related fs_* cvars. If you've ever tinkered with these, then this API will be familiar to you.
With PhysicsFS, you have a single writing directory and multiple directories (the "search path") for reading. You can think of this as a filesystem within a filesystem. If (on Windows) you were to set the writing directory to "C:\MyGame\MyWritingDirectory", then no PHYSFS calls could touch anything above this directory, including the "C:\MyGame" and "C:\" directories. This prevents an application's internal scripting language from piddling over c:\config.sys, for example. If you'd rather give PHYSFS full access to the system's REAL file system, set the writing dir to "C:\", but that's generally A Bad Thing for several reasons.
Drive letters are hidden in PhysicsFS once you set up your initial paths. The search path creates a single, hierarchical directory structure. Not only does this lend itself well to general abstraction with archives, it also gives better support to operating systems like MacOS and Unix. Generally speaking, you shouldn't ever hardcode a drive letter; not only does this hurt portability to non-Microsoft OSes, but it limits your win32 users to a single drive, too. Use the PhysicsFS abstraction functions and allow user-defined configuration options, too. When opening a file, you specify it as if it were on a Unix filesystem: if you want to write to "C:\MyGame\MyConfigFiles\game.cfg", then you might set the write dir to "C:\MyGame" and then open "MyConfigFiles/game.cfg". This gives an abstraction across all platforms. Specifying a file in this way is termed "platform-independent notation" in this documentation. Specifying a filename in a form such as "C:\mydir\myfile" or "MacOS hard drive:My Directory:My File" is termed "platform-dependent notation". The only time you use platform-dependent notation is when setting up your write directory and search path; after that, all file access into those directories is done with platform-independent notation.
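To make the relationship between the two notations concrete, here is a minimal sketch of how a platform-dependent root and a platform-independent path could be combined. The helper to_platform_path() is hypothetical and is not part of PhysicsFS; it only illustrates the idea that '/' is the one separator an application writes, and that the platform's separator appears only at the boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PhysicsFS function): joins a root given in
 * platform-dependent notation with a path given in platform-independent
 * notation, replacing '/' with the platform separator "sep".
 * Returns 0 on success, -1 if the result does not fit in "out". */
int to_platform_path(const char *root, const char *indep, char sep,
                     char *out, size_t outlen)
{
    size_t rootlen, indeplen, i, pos;
    int need_sep;

    assert(root != NULL && indep != NULL && out != NULL);

    rootlen = strlen(root);
    indeplen = strlen(indep);

    /* Add a separator only if the root does not already end with one. */
    need_sep = (rootlen > 0 && root[rootlen - 1] != sep);

    if (rootlen + (size_t) need_sep + indeplen + 1 > outlen)
        return -1;

    memcpy(out, root, rootlen);
    pos = rootlen;
    if (need_sep)
        out[pos++] = sep;

    /* Translate each '/' in the platform-independent part. */
    for (i = 0; i < indeplen; i++)
        out[pos++] = (indep[i] == '/') ? sep : indep[i];

    out[pos] = '\0';
    return 0;
}
```

With a root of "C:\MyGame", a separator of '\' and the path "MyConfigFiles/game.cfg", this yields "C:\MyGame\MyConfigFiles\game.cfg", matching the example above.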
All files opened for writing are opened in relation to the write directory, which is the root of the writable filesystem. When opening a file for reading, PhysicsFS goes through the search path. This is NOT the same thing as the PATH environment variable. An application using PhysicsFS specifies directories to be searched which may be actual directories, or archive files that contain files and subdirectories of their own. See the end of these docs for currently supported archive formats.
Once the search path is defined, you may open files for reading. If you've got the following search path defined (to use a win32 example again):
  - C:\mygame
  - C:\mygame\myuserfiles
  - D:\mygamescdromdatafiles
  - C:\mygame\installeddatafiles.zip
Then a call to PHYSFS_openRead("textfiles/myfile.txt") (note the directory separator, lack of drive letter, and lack of dir separator at the start of the string; this is platform-independent notation) will check for C:\mygame\textfiles\myfile.txt, then C:\mygame\myuserfiles\textfiles\myfile.txt, then D:\mygamescdromdatafiles\textfiles\myfile.txt, then, finally, for textfiles\myfile.txt inside of C:\mygame\installeddatafiles.zip. Remember that most archive types and platform filesystems store their filenames in a case-sensitive manner, so you should be careful to specify it correctly.
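The lookup order described above can be sketched with a small in-memory model. Everything here is an assumption made for illustration: the FakeFile table and first_match() are hypothetical and do not reflect how PhysicsFS stores or scans its search path. The point is only that elements are tried in order, the first hit wins, and names are compared case-sensitively.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory file table entry: records which search path
 * element (by index) holds which platform-independent file name. */
typedef struct {
    int root_index;
    const char *name;
} FakeFile;

/* Returns the index of the first search path element that contains
 * "name", or -1 if no element does. Elements are checked in order, so
 * an earlier element shadows a later one holding the same name.
 * The comparison is case-sensitive. */
int first_match(const FakeFile *files, size_t nfiles, int nroots,
                const char *name)
{
    int root;
    size_t i;

    assert(files != NULL && name != NULL);

    for (root = 0; root < nroots; root++) {
        for (i = 0; i < nfiles; i++) {
            if (files[i].root_index == root &&
                strcmp(files[i].name, name) == 0)
                return root;
        }
    }
    return -1;
}
```

If "textfiles/myfile.txt" exists both in the second element and in the archive at the end of the path, the second element is the one that is found; asking for "TextFiles/MyFile.txt" finds nothing.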
Files opened through PhysicsFS may NOT contain "." or ".." or ":" as dir elements. Not only are these meaningless on MacOS Classic and/or Unix, they are a security hole. Also, symbolic links (which can be found in some archive types and directly in the filesystem on Unix platforms) are NOT followed until you call PHYSFS_permitSymbolicLinks(). That's left to your own discretion, as following a symlink can allow for access outside the write dir and search paths. For portability, there is no mechanism for creating new symlinks in PhysicsFS.
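As an illustration of the rule on forbidden dir elements, the following sketch checks a platform-independent path one element at a time. The function is_clean_path() is hypothetical and is not the check PhysicsFS itself performs; in particular, rejecting a ':' anywhere inside an element is a conservative assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check (not the PhysicsFS implementation): returns 1 if
 * no element of the '/'-separated path is "." or ".." and no element
 * contains ':', and 0 otherwise. */
int is_clean_path(const char *path)
{
    const char *start = path;

    assert(path != NULL);

    for (;;) {
        const char *end = strchr(start, '/');
        size_t len = end ? (size_t) (end - start) : strlen(start);

        if (len == 1 && start[0] == '.')
            return 0;                       /* "." element */
        if (len == 2 && start[0] == '.' && start[1] == '.')
            return 0;                       /* ".." element */
        if (memchr(start, ':', len) != NULL)
            return 0;                       /* ':' inside an element */

        if (end == NULL)
            break;                          /* last element checked */
        start = end + 1;
    }
    return 1;
}
```

A name such as "notes/..hidden" passes, since only an element that is exactly "." or ".." is refused, while "../secret.txt" and "C:/config.sys" are both rejected.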
The write dir is not included in the search path unless you specifically add it. While you CAN change the write dir as many times as you like, you should probably set it once and stick to it. Remember that your program will not have permission to write in every directory on Unix and NT systems.
All files are opened in binary mode; there is no endline conversion for textfiles. Other than that, PhysicsFS has some convenience functions for platform-independence. There is a function to tell you the current platform's dir separator ("\" on Windows, "/" on Unix, ":" on MacOS), which is needed only to set up your search/write paths. There is a function to tell you what CD-ROM drives contain accessible discs, and a function to recommend a good search path, etc.
A recommended order for the search path is the write dir, then the base dir, then the cdrom dir, then any archives discovered. Quake 3 does something like this, but moves the archives to the start of the search path. Build Engine games, like Duke Nukem 3D and Blood, place the archives last, and use the base dir for both searching and writing. There is a helper function (PHYSFS_setSaneConfig()) that puts together a basic configuration for you, based on a few parameters. Also see the comments on PHYSFS_getBaseDir(), and PHYSFS_getPrefDir() for info on what those are and how they can help you determine an optimal search path.
PhysicsFS 2.0 adds the concept of "mounting" archives to arbitrary points in the search path. If a zipfile contains "maps/level.map" and you mount that archive at "mods/mymod", then you would have to open "mods/mymod/maps/level.map" to access the file, even though "mods/mymod" isn't actually specified in the .zip file. Unlike the Unix mentality of mounting a filesystem, "mods/mymod" doesn't actually have to exist when mounting the zipfile. It's a "virtual" directory. The mounting mechanism allows the developer to separate archives in the tree and avoid trampling over files when adding new archives, such as including mod support in a game...keeping external content on a tight leash in this manner can be of utmost importance to some applications.
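The mapping between a virtual path and a path inside a mounted archive can be sketched as a prefix strip. The helper strip_mount_point() is hypothetical and only models the idea; how PhysicsFS resolves mount points internally is not described here. The sketch assumes the mount point is given without a trailing '/'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PhysicsFS function): if the virtual path
 * "virt" lies below the mount point "mount", returns a pointer to the
 * part of "virt" that names the file inside the archive. Returns NULL
 * if "virt" is not below the mount point. An empty mount point means
 * the archive is mounted at the root. A path equal to the mount point
 * itself names the virtual directory, not a file, and yields NULL. */
const char *strip_mount_point(const char *mount, const char *virt)
{
    size_t mlen;

    assert(mount != NULL && virt != NULL);

    mlen = strlen(mount);
    if (mlen == 0)
        return virt;

    if (strncmp(virt, mount, mlen) != 0)
        return NULL;

    /* The prefix must end on an element boundary, so that
     * "mods/mymod2/..." does not match a mount at "mods/mymod". */
    if (virt[mlen] != '/')
        return NULL;

    return virt + mlen + 1;
}
```

For an archive mounted at "mods/mymod", the virtual path "mods/mymod/maps/level.map" maps to "maps/level.map" inside the archive, while "maps/level.map" on its own does not reach that archive at all.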
PhysicsFS is mostly thread safe. The error messages returned by PHYSFS_getLastError() are unique by thread, and library-state-setting functions are mutex'd. For efficiency, individual file accesses are not locked, so you cannot safely read/write/seek/close/etc the same file from two threads at the same time. Other race conditions are bugs that should be reported/patched.
While you CAN use stdio/syscall file access in a program that has PHYSFS_* calls, doing so is not recommended, and you cannot use system filehandles with PhysicsFS and vice versa.
Note that archives need not be named as such: if you have a ZIP file and rename it with a .PKG extension, the file will still be recognized as a ZIP archive by PhysicsFS; the file's contents are used to determine its type where possible.
Currently supported archive types:
String policy for PhysicsFS 2.0 and later:
PhysicsFS 1.0 could only deal with null-terminated ASCII strings. All high ASCII chars resulted in undefined behaviour, and there was no Unicode support at all. PhysicsFS 2.0 supports Unicode without breaking binary compatibility with the 1.0 API by using UTF-8 encoding of all strings passed in and out of the library.
All strings passed through PhysicsFS are in null-terminated UTF-8 format. This means that if all you care about is English (ASCII characters <= 127) then you just use regular C strings. If you care about Unicode (and you should!) then you need to figure out what your platform wants, needs, and offers. If you are on Windows before Win2000 and build with Unicode support, your TCHAR strings are two bytes per character (this is called "UCS-2 encoding"). Any modern Windows uses UTF-16, which is two bytes per character for most characters, but some characters are four. You should convert them to UTF-8 before handing them to PhysicsFS with PHYSFS_utf8FromUtf16(), which handles both UTF-16 and UCS-2. If you're using Unix or Mac OS X, your wchar_t strings are four bytes per character ("UCS-4 encoding"). Use PHYSFS_utf8FromUcs4(). Mac OS X can give you UTF-8 directly from a CFString or NSString, and many Unixes generally give you C strings in UTF-8 format everywhere. If you have a single-byte high ASCII charset, like so many European "codepages", you may be out of luck. We'll convert from "Latin1" to UTF-8 only, and never back to Latin1. If you're above ASCII 127, all bets are off: move to Unicode or use your platform's facilities. Passing a C string with high-ASCII data that isn't UTF-8 encoded will NOT do what you expect!
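To show why a high-ASCII byte is not valid UTF-8 as-is, here is a minimal sketch of a Latin1 to UTF-8 conversion. The function latin1_to_utf8() is hypothetical and stands in for whatever conversion routine you actually use; it is not the library's own converter. It relies on the fact that Latin1 byte values map directly to Unicode code points 0 through 255, so each byte above 127 becomes a two-byte UTF-8 sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical converter (not a PhysicsFS function): converts the
 * null-terminated Latin1 string "src" to null-terminated UTF-8 in
 * "dst". Returns 0 on success, -1 if "dst" is too small. */
int latin1_to_utf8(const unsigned char *src, char *dst, size_t dstlen)
{
    size_t pos = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != 0; src++) {
        if (*src < 0x80) {
            /* ASCII: one byte, unchanged (plus room for terminator). */
            if (pos + 2 > dstlen)
                return -1;
            dst[pos++] = (char) *src;
        } else {
            /* 0x80..0xFF: two bytes, 110xxxxx 10xxxxxx. */
            if (pos + 3 > dstlen)
                return -1;
            dst[pos++] = (char) (0xC0 | (*src >> 6));
            dst[pos++] = (char) (0x80 | (*src & 0x3F));
        }
    }

    if (pos + 1 > dstlen)
        return -1;
    dst[pos] = '\0';
    return 0;
}
```

The Latin1 byte 0xE9 (e with acute accent) becomes the two bytes 0xC3 0xA9. Passing the raw 0xE9 byte where UTF-8 is expected would be read as the start of a malformed multi-byte sequence.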
Naturally, there's also PHYSFS_utf8ToUcs2(), PHYSFS_utf8ToUtf16(), and PHYSFS_utf8ToUcs4() to get data back into a format you like. Behind the scenes, PhysicsFS will use Unicode where possible: the UTF-8 strings on Windows will be converted and used with the multibyte Windows APIs, for example.
PhysicsFS offers basic encoding conversion support, but not a whole string library. Get your stuff into whatever format you can work with.
All platforms supported by PhysicsFS 2.1 and later fully support Unicode. We have dropped platforms that don't (OS/2, Mac OS 9, Windows 95, etc), as even an OS that's over a decade old should be expected to handle this well. If you absolutely must support one of these platforms, you should use an older release of PhysicsFS.
Many game-specific archivers are seriously unprepared for Unicode (the Descent HOG/MVL and Build Engine GRP archivers, for example, only offer a DOS 8.3 filename). Nothing can be done for these, but they tend to be legacy formats for existing content that was all ASCII (and thus, valid UTF-8) anyhow. Other formats, like .ZIP, don't explicitly offer Unicode support, but unofficially expect filenames to be UTF-8 encoded, and thus Just Work. Most everything does the right thing without bothering you, but it's good to be aware of these nuances in case they don't.
Other stuff:
Please see the file LICENSE.txt in the source's root directory for licensing and redistribution rights.
Please see the file CREDITS.txt in the source's "docs" directory for a more or less complete list of who's responsible for this.