Journaling Filing Systems

What's the deal? Do I want them? Does anybody care? How do I make them do stuff? And since it'll all mean nothing to you at the end, I'll put it all together simply with instructions & recommendations...

Right. Since I've been asked so many times WTF all this journaling stuff is about, let me explain.

You know how, when your Linux box bites the dust [say, for the sake of example, your nVidia drivers blow their guts out... purely hypothetically], and it reboots... but spends bloody hours checking hundreds of gigs of filesystem, then asks you "do you want to fix broken inode 23423?"

I find that annoying. And so, it seems, do a great many others.

A quick step backwards before we go on: The way the filing system works in Linux is kinda like a big tree. In order to get into /usr/local/games/quake3, your computer actually traverses this big ugly path on your hard drive, going from /, looking up where on the disk to find /usr, then going into /usr, looking for /usr/local/... and so on, and so forth. Unfortunately, if your hard drive lost the bit on / that says "here's where to find /usr", then /usr and everything below it [eg, /usr/local] would effectively be inaccessible. And that's a Bad Thing (TM).

The reason that this might ever happen at all is that changes on the disk aren't necessarily atomic. The disk doesn't instantly go from having one file to having that file with a different name; it has to do lots of on-disk edits. If the system crashes after beginning a big editing session, but before completing it, the disk is essentially corrupted.

Luckily, several things:

1) The Linux filing system is actually alarmingly robust for other reasons.
2) The tools to fix said filing system are very mature, and a damn sight smarter than you.

OK. Up till now, we've been referring to "the Linux filing system" when we mean "ext2". That's that thing in /etc/fstab that says something to the effect of:

    /dev/hda2 /usr ext2 defaults 1 1

Fine.
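If the tree-walking business from a few paragraphs back sounds a bit abstract, here's a toy sketch of it in Python. It's purely illustrative: the nested dicts, the names `root`, `lost` and `resolve`, and the whole in-memory model are my own assumptions, and none of it reflects how ext2 really stores directories on disk. It just shows why losing one entry on / takes out everything underneath it.

```python
# Toy model of the directory tree: each directory is a dict mapping a name
# to a child directory. This is an assumption for illustration only; it is
# NOT how ext2 lays things out on disk.
root = {"usr": {"local": {"games": {"quake3": {}}}}}


def resolve(tree, path):
    """Walk from / down to `path`, one component at a time."""
    node = tree
    # Split the path into its components, ignoring empty ones ("/" itself).
    for name in [part for part in path.split("/") if part]:
        if name not in node:
            raise FileNotFoundError(
                f"no entry for {name!r} on the way to {path}"
            )
        node = node[name]
    return node


# With the tree intact, the walk succeeds.
resolve(root, "/usr/local/games/quake3")

# Lose the one entry on / that says where to find /usr...
lost = root.pop("usr")

# ...and everything below it is unreachable, even though the data
# (held here in `lost`) still exists somewhere.
try:
    resolve(root, "/usr/local")
except FileNotFoundError as err:
    print(err)
```

Note that nothing under /usr was actually destroyed in this sketch; only the pointer to it went missing. That's roughly the kind of damage a filesystem checker spends its time hunting for.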
But by now, you're probably thinking "but there must be another way", and you're right.

Imagine if, instead of writing changes to the disk /a la/ changing a typo in a file, you merely remember what the change was, then make a note on the disk [that /does/ happen atomically] that the change needs to be made. Later, when the computer has a spare moment, it reads that note, and makes the necessary corrections on-disk. Until the corrections have been completed, the note isn't removed from the disk. Once they have been made, the note is atomically removed again. Cool.

At this point, your disk will always be 100% correct, right? Close. What I've just described is a system somewhat akin to ReiserFS. The system can pretty much always guarantee that, so long as the journal [the technical name for all the little notes the drive makes] is completely played back.

So why don't we all move to ReiserFS? This is kinda a matter of personal opinion, but:

1) Only metadata is journaled, not data. In essence, if the computer crashes, it can fix the filing system fairly trivially by playing back the journal. The problem is that the data, in and of itself, is actually lost. You may have corrected some typos before the machine crashed, but your system never wrote the corrections to disk - just the fact that you'd made them.
2) SPEED IS NOT OF THE ESSENCE. You may think it is, but it's not. Unless you know that you need a faster filing system, you don't. Reiser is fast, but not enough that you, the gamer, can tell.
3) The tools to fix it if/when it explodes are still quite immature. While they officially work, I don't yet trust them.

Hmmm. So we need something else. Ext3 is, IMHO, currently the best answer to this problem, because:

1) It can journal data, not just metadata [full data journaling is the data=journal mount option; the default, data=ordered, journals only metadata but writes your data to disk before the metadata that points at it]. If you change a file, it stays changed.
2) It's merely an extension gaffered onto ext2, with all the benefits of ext2 [fast, reliable, good set of tools].
3) To convert your disks to it, you need to run one command, instead of moving all your files off the relevant partition, formatting the drive, then copying them all back on again.
4) [although you can do this by hand on your machine with ext2] all of the partitions are given different "number-of-mounts-until-fsck" figures. You know how, even after your machine hasn't exploded for about 20 reboots or so, all the filing systems get checked for good measure? If, on the other hand, you make all the number-of-mounts-until-fsck numbers relatively prime, you won't see them all get checked on the same boot until their LCM of mounts is reached - which, for me, constitutes a couple thousand reboots, IIRC.
5) If everything goes very horribly wrong, you can still mount it as ext2 - the journal will break after that, but you'll still have all your data. And a journal can be fixed.

All of my filing systems are now ext3. It took me about 20 mins to do, and I have about 10 or 15 partitions on my machine doing different things.

Easy instructions:

1) Pick an ext2 partition. umount it. If it won't umount, it might serve you to try reading some of my other writing about runlevels, etc.
2) Look in /etc/fstab at that partition. Remember which one it was, ok? [eg, the /usr partition above is /dev/hda2]
3) Check that partition for errors:
       e2fsck -vf /dev/hda2
4) Add a journal:
       tune2fs -j /dev/hda2
5) Edit /etc/fstab to say "ext3" where it used to say "ext2".
6) Mount it again.

Several things:

1) If your kernel doesn't support the relevant filing system, you're onto a loser. You need to get one that does support it [notably, most out-of-the-box Mandrake, Red Hat, etc, kernels all support it]. It's included in 2.4.18 and above for anyone who's into building their own kernels.
2) You need to do this from a boot CD to get your root partition working.
How to do that is an exercise best left to the reader.
3) Read the man pages for everything I've just told you about before you run them. Note that if you fancy your chances, you can set the options yourself using the -J option, but -j sets them to defaults for the size of partition, etc. If you know how to work -J, why are you reading this?
4) If tune2fs doesn't support the "-j" option, you just gotta get a newer version from here: http://e2fsprogs.sourceforge.net/

And a note for gamers: Yes. You want this. It won't necessarily enhance your gaming machine's performance [it won't noticeably detract from it either], but it will definitely save time & effort on reboots & file system checks in the long run.

Yes, for anyone curious, there /are/ other journaling filing systems out there, and if you're an admin running a news server or similar, neither of these two is for you. But again, if that's the case, you probably already know all about this stuff.

Hope all that's helped or cleared some stuff up or whatever.

Gary (-;

Addendum, 2002-10-17

To clarify the meaning of the word "atomic", for the purposes of this exercise: It's a single operation that either did or didn't occur. It's not possible to dump a load of data to disk in an infinitesimal amount of time, so what the journal does is dump all the relevant data to disk, then mark it as dumped. It's therefore possible to have some data that an application has written that's not actually been completely put into the journal; when the time comes to play it back [if that becomes necessary], it is as if it was never put into the journal at all. Those are the breaks, there's not a lot you can do, I'm afraid :/

Addendum, 2003-07-26

Nowadays, ReiserFS is getting to the point where it's used all over the place, for many, many things. I still like ext3, though (=
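
Going back to the 2002 addendum for a second: the "dump it all to disk, then mark it as dumped" idea can be sketched in a few lines of Python. This is a toy under my own assumptions. The record layout, the transaction numbers, and the names `replay`, `journal` and `disk` are all invented for illustration, and real journals (ext3's or anyone else's) are a good deal more involved. The point is only that a change without its "dumped" marker gets ignored on playback.

```python
def replay(journal, disk):
    """Apply every fully committed transaction in `journal` to `disk`.

    journal: list of records, each either
             ("write", txid, name, contents) or ("commit", txid).
    disk:    dict mapping file names to contents (a stand-in for the
             real filesystem).

    A transaction whose commit marker never made it into the journal
    was interrupted by the crash, so it is treated as if it had never
    been written at all.
    """
    # First pass: find which transactions were marked as fully dumped.
    committed = {rec[1] for rec in journal if rec[0] == "commit"}

    # Second pass: play back only the writes from those transactions.
    for rec in journal:
        if rec[0] == "write" and rec[1] in committed:
            _, _, name, contents = rec
            disk[name] = contents
    return disk


disk = {"letter.txt": "helo wrold"}

journal = [
    ("write", 1, "letter.txt", "hello wrold"),  # first typo fixed
    ("commit", 1),                              # ...and marked as dumped
    ("write", 2, "letter.txt", "hello world"),  # second typo fixed
    # <- machine falls over here, before ("commit", 2) is written
]

replay(journal, disk)
```

After playback, `disk["letter.txt"]` holds the result of transaction 1 only. The second fix is gone, but the file is in a consistent state rather than half-edited, which is the whole trade the journal is making.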