Wednesday, 30 March 2011

File Systems Part Two: Not Invented Here

(Part Two of an article on file-systems in Linux. Catch up on Part One.)

There's more to file-systems than the descendants of our Unix ancestors. 'Plethora' doesn't even begin to describe it. Why do we care? Sadly we don't live in a Linux bubble; un-enlightened colleagues, family and friends insist on using non-native file-systems with which we often need to interact. The biggest? Microsoft blessed us with FAT and NTFS, while Apple gave us HFS through the Macintosh...

Chewing the FAT

We mentioned MS-DOS last time, but not the actual file-system under-pinning it: the most limited, but most ubiquitous, file-system, FAT. It is named after the File Allocation Table, which serves as the file index; more accurately, it is the translation table mapping file contents to storage locations on disk.
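
To make the 'translation table' idea concrete, here is a minimal sketch in Python. It is a toy model, not the real on-disk layout: the table is a plain dictionary, the names `cluster_chain` and `toy_fat` are made up for illustration, and real FAT16 accepts a range of end-of-chain values rather than the single marker assumed here.

```python
# Toy model of a File Allocation Table: entry N holds the number of the
# cluster that follows cluster N in a file, or an end-of-chain marker.
# Assumption: one marker value; real FAT16 treats 0xFFF8-0xFFFF as end-of-chain.
END_OF_CHAIN = 0xFFFF


def cluster_chain(fat, first_cluster):
    """Return the clusters that make up one file, in reading order."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            # A chain that loops back on itself means the table is corrupt.
            raise ValueError("loop in FAT chain")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


# Hypothetical table: a file starting at cluster 2, scattered across the disk.
toy_fat = {2: 3, 3: 7, 7: 5, 5: END_OF_CHAIN}
print(cluster_chain(toy_fat, 2))  # prints [2, 3, 7, 5]
```

The directory entry only records the first cluster; everything after that comes from walking the table, which is why a damaged FAT is so painful.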

Neither FAT16 nor FAT32 (named for their sixteen- and thirty-two-bit table entries) is journaled, and neither has access controls, but thanks to their use by Microsoft, VFAT and FAT32 rode the Windows 95 and 98 desktops to world domination. They became the default file-systems for flash memory devices - digital cameras, USB memory sticks and the like. Small and highly portable, they work adequately on those devices and in embedded applications, which is why we need FAT support in Linux if we want to plug in standard cameras, music players and other portable storage devices.

FAT16's simplicity (or lack of features) is both its strength and its weakness. It left us the legacy of eight-character file-names with three-character suffixes denoting the file type. You think Twitter is a challenge? You have to be very clever to get a meaningful file-name into eight characters, and very organized with your folders, named in eleven characters or fewer. Imagine how overjoyed we all were when we got long file-names of up to 255 characters in VFAT (Virtual FAT) and large-disk support in FAT32. Large partitions mostly worked fine once created; however, some software wouldn't allow creation of FAT32 partitions larger than 32GB, including, notoriously, the Windows XP installation program.
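
For a feel of how tight the eight-plus-three limit is, here is a rough checker in Python. It is a sketch under simplifying assumptions: the function name `fits_8_3` is invented, and the allowed character set (letters, digits, underscore, tilde, hyphen) is deliberately narrower than what real FAT permits.

```python
import re

# Assumption: a simplified character set. Real FAT short names allow a few
# more punctuation characters than the ones listed here.
_SHORT_NAME = re.compile(r"[A-Z0-9_~-]{1,8}(\.[A-Z0-9_~-]{1,3})?")


def fits_8_3(name):
    """True if `name` fits the classic eight-character, three-suffix form."""
    # FAT short names are case-insensitive, so compare in upper case.
    return _SHORT_NAME.fullmatch(name.upper()) is not None


for candidate in ("autoexec.bat", "README.TXT", "holiday-photos.jpeg"):
    print(candidate, fits_8_3(candidate))
```

Only the last name fails: 'holiday-photos' is fourteen characters and 'jpeg' is four.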

You really don't want to use either for your desktop file-system any more. Even with a redundant back-up copy of the FAT itself used to provide some kind of data security, FAT file-systems are far too fragile and liable to corruption. They need regular health checks and de-fragmentation in order to maintain any kind of performance.

NTFS arrived with Windows NT (standing for New Technology, which it wasn't). NTFS remains the current Windows file-system, widely used for work-groups and shared file-serving over local networks. This one is journaled and has solid access controls. It gives you a lot of network file-sharing features, large volume support and decent performance, but it's still unique to Microsoft. More importantly, there is open-source NTFS support in Linux. It enables you to access your Windows disks and create new NTFS partitions when you need to administer Windows disks or create compatible shared folders. The Samba suite of file-sharing tools in Linux provides most of the infrastructure you need to run Windows shares day-to-day without needing a doctorate in file-systems. I hardly ever manage to break NTFS. Hardly.

I doubt you'll do much technical support for family and friends (and as their tame 'computer enthusiast' you will do technical support for family and friends) without contact with NTFS.

Apples is Apples

Apple's own HFS, the Hierarchical File-System, also called Mac OS Standard, used on Macintosh computers (or other systems running Mac OS), has now evolved into HFS Plus or HFS+ or Mac OS Extended (but not, apparently, “HFS Extended”, which is wrong). If only they would make their minds up. HFS Plus is also one of the formats used by the iPod digital music player.

Like the descendants of Unix file-systems, HFS Plus has all the smart features of journaling, access controls, meta-data, aliases and symbolic links. But being Apple, they do things just that bit differently from everyone else. Parts of the original HFS used to break easily and frequently, thanks to the lack of journaling. HFS Plus is that bit more elegant.

If you have, or simply need to talk to, a Mac, the Linux kernel supports basic reading of HFS and HFS Plus. However, journaling support, which is needed for writing to a journaled HFS Plus volume, is nearly non-existent (too many licensing and patent issues). By default most modern Macs using HFS Plus have journaling enabled, and you really don't want to disable journaled writes from Linux on an HFS Plus partition.
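
If you want to know which of these file-systems your running kernel has registered, Linux lists them in /proc/filesystems. The Python sketch below parses that format; the function name `registered_filesystems` and the sample text are made up for illustration. Bear in mind that a file-system built as a module may not appear until the module is loaded, so a missing name doesn't prove a lack of support.

```python
def registered_filesystems(text):
    """Parse text in the /proc/filesystems format into a set of names.

    Each line is either '<tab>name' or 'nodev<tab>name', so the name is
    always the last whitespace-separated field.
    """
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[-1])
    return names


# Hypothetical sample; a real system will list many more entries.
sample = "nodev\tsysfs\nnodev\tproc\n\text4\n\tvfat\n\thfsplus\n"
found = registered_filesystems(sample)
for wanted in ("vfat", "hfs", "hfsplus"):
    print(wanted, wanted in found)
```

On a real Linux machine you would pass in `open("/proc/filesystems").read()` instead of the sample string.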

Mac OS also supports the Unix File System (UFS), which is based on the Fast File System (FFS) of 4.4BSD, so it's a serious industrial-strength file-system, but with less meta-data. I'm thinking life is too short to break down another flavor and it's getting close to my bed-time...

Dead End Canyon

As a side-note, IBM tried to compete with Windows when it launched OS/2, using HPFS or High Performance File-System. It was highly performant for its day. Nobody used it.

Shiny, shiny

I'm almost embarrassed to include those shiny round things – optical media.

ISO9660 is the CD-ROM file-system defined by the ISO 9660 standard. CD-ROMs and ISO images are, thankfully, well supported in Linux. You can copy and mount ISO images from almost anything using the command line or point-and-click GUI utilities. It's the surest way of moving data from one machine to another - or at least it was, until optical drives went out of fashion.
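
An ISO image is simple enough to poke at by hand. The Python sketch below reads the volume label from the primary volume descriptor, which the standard places after a sixteen-sector system area. It rests on one simplifying assumption: it only looks at sector 16, where the primary descriptor normally sits, rather than scanning the whole descriptor set. The function name `read_volume_label` and the fake image are invented for the demonstration.

```python
import io

SECTOR_SIZE = 2048  # logical sector size used by ISO 9660
PVD_SECTOR = 16     # the first 16 sectors are the reserved system area


def read_volume_label(image):
    """Return the volume label of an ISO 9660 image, or None.

    `image` is any seekable binary file object. Assumption: the primary
    volume descriptor is the first descriptor, at sector 16.
    """
    image.seek(PVD_SECTOR * SECTOR_SIZE)
    descriptor = image.read(SECTOR_SIZE)
    if len(descriptor) < SECTOR_SIZE:
        return None
    # Byte 0 is the descriptor type (1 = primary); bytes 1-5 spell "CD001".
    if descriptor[0] != 1 or descriptor[1:6] != b"CD001":
        return None
    # The volume identifier is 32 space-padded bytes starting at offset 40.
    return descriptor[40:72].decode("ascii", errors="replace").rstrip()


# Build a fake in-memory image so the example runs without a real disc.
fake = bytearray(17 * SECTOR_SIZE)
start = PVD_SECTOR * SECTOR_SIZE
fake[start] = 1
fake[start + 1:start + 6] = b"CD001"
fake[start + 40:start + 72] = b"HOLIDAY_PHOTOS".ljust(32)
print(read_volume_label(io.BytesIO(bytes(fake))))  # prints HOLIDAY_PHOTOS
```

To try it on a real image, open the file in binary mode and pass the file object in.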

UDF is the Universal Disk Format which it... isn't. UDF almost took off as a standard. You could use it to burn re-writable optical media with a 'normal' file-system, without having to finalize (close) the disk or create multiple sessions. In theory, UDF disks could be used like high-capacity floppy disks. In reality, the differences in optical re-writer drive hardware and in Windows support across versions meant you could easily trash the equivalent of the file allocation table, particularly when you deleted files. Many CD-ROM drives had trouble reading them, open or closed. Linux doesn't have nearly the same problems. It's just that using optical disks this way is painfully slow.


That rounds off the long, long list of file-systems you may see or touch in the course of Linux computing. The sharp-eyed will be asking about the one I've missed: NFS. I haven't missed it. The Network File-System isn't really a file-system; it's a protocol for file-sharing, like (but not actually like) Samba. Maybe there's a how-to article in that one... RC
