Tracking Files on the Mac


Mac OS offers several mechanisms for tracking files, each with its own strengths and weaknesses.

Paths
File paths have never been a recommended method of identifying files on the Mac, but over the last several years, as experienced developers from other platforms have started writing Mac software, there's been an increase in the number of people using them. The problem with using paths as file identifiers on the Mac is simple: they're not reliable. Traditional, colon-delimited paths have no guarantee of uniqueness because there's no guarantee that two volumes with the same name won't be mounted concurrently. The POSIX-style paths that came along with Mac OS X are unique, but not persistent; there's no guarantee that a given volume will always appear at the same mount point. They're also not robust. A path name doesn't really represent a file as a discrete object; it represents a location in the file system, and clients will use whatever happens to be at that location at the time they try to use it. POSIX paths do have some benefits for short-term use, however. They're generally simple, and they're one of the best options for representing a file that doesn't exist yet.
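To make the "location, not object" point concrete, here is a minimal sketch in Python. The file names and the helper path_follows_location are made up for illustration; nothing here is Mac-specific.

```python
import os
import tempfile

def path_follows_location():
    """Show that a saved path tracks a location, not the original file."""
    with tempfile.TemporaryDirectory() as folder:
        saved_path = os.path.join(folder, "report.txt")
        with open(saved_path, "w") as f:
            f.write("original document")

        # The original file is renamed, and something else lands at the old path.
        os.rename(saved_path, os.path.join(folder, "report (old).txt"))
        with open(saved_path, "w") as f:
            f.write("unrelated newcomer")

        # The saved path still "works", but it now opens a different file.
        with open(saved_path) as f:
            return f.read()

print(path_follows_location())
```

The path never reports an error; it simply hands back whatever is at that location now, which is exactly why it makes a poor long-term identifier.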

URLs
URLs are paths with some extra bells and whistles. That isn't meant to trivialize those bells and whistles; they're pretty cool. But ultimately a URL is a way to reference a location (which may be only conceptual in nature) and whatever happens to be there (if anything) right now.

FSRefs
The Mac OS FSRef type is an opaque structure. At a fixed 80 bytes, FSRefs are more compact than many paths, but they're also not persistent and (unlike paths) can't be reliably shared via IPC. They can also only point to existing files, as they do represent an object rather than a location. Aside from (and probably related to) their compact, predictable size, the most useful thing about FSRefs is that many API calls deal with them.

Volume/Folder/Name
In the early history of Mac OS file systems, the most reliable means of persistently referencing a file was to track a tuple consisting of some invented unique identifier for the host volume, the 32-bit folder reference number on that volume and the name of the file. This became enough of a standard that it was eventually codified in the FSSpec type. These tuples are relatively compact (although brittle with respect to the maximum length of a file name). They're also fairly robust in that the containing folder can be moved within the volume or renamed and the reference remains valid. They can even be persistent, depending on what value has been chosen for the volume ID. (The FSSpec structure itself was not persistable, because its volume ID was transient.) As the FSSpec type is now deprecated, this mechanism is probably more awkward to use than its limited benefits justify. Like paths, these can point to something that doesn't exist right now. As with paths, these ultimately reference a location rather than a specific object, but that location is relative to a reliable object (the volume/folder pair).
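A loose POSIX analogy, sketched in Python, may help show why "folder plus name" is sturdier than a full path. This is not the FSSpec API. An open directory descriptor stands in for the folder reference, and unlike a stored folder number it only lives as long as the process keeps it open. The names folder_reference_survives_rename, "Projects", "Archive" and "notes.txt" are invented for the example, and dir_fd support is assumed (it is available on Mac OS X and Linux, not on Windows).

```python
import os
import tempfile

def folder_reference_survives_rename():
    """Reference a file as (folder object, name) and rename the folder."""
    with tempfile.TemporaryDirectory() as root:
        folder = os.path.join(root, "Projects")
        os.mkdir(folder)
        with open(os.path.join(folder, "notes.txt"), "w") as f:
            f.write("hello")

        # Hold on to the folder itself rather than to its path.
        folder_fd = os.open(folder, os.O_RDONLY)
        try:
            # Any path through "Projects" is now stale...
            os.rename(folder, os.path.join(root, "Archive"))

            # ...but the name is resolved relative to the folder object.
            fd = os.open("notes.txt", os.O_RDONLY, dir_fd=folder_fd)
            with os.fdopen(fd) as f:
                return f.read()
        finally:
            os.close(folder_fd)

print(folder_reference_survives_rename())
```

The lookup by name is still a location lookup, so renaming notes.txt itself would break the reference, just as described above.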

Volume/FileID
Files on HFS volumes (as well as other formats) can have unique IDs within the volume. So if you can come up with a viable volume identifier, that plus an additional few bytes for the file ID will uniquely identify a file no matter where it moves (within the volume) and despite being renamed. It will be broken by the common UNIX-style archive-and-save technique of renaming the old file and then creating a whole new file for the current version of the document.
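Both behaviors are easy to see with a small Python sketch. It assumes the st_dev/st_ino pair from stat is an acceptable stand-in for volume ID plus file ID; note that st_dev is only meaningful while the volume is mounted, so it is not the "viable volume identifier" a persistent reference would need. The helper names file_key and demo_file_id are hypothetical.

```python
import os
import tempfile

def file_key(path):
    """Return (device, inode) as a stand-in for (volume ID, file ID)."""
    info = os.stat(path)
    return (info.st_dev, info.st_ino)

def demo_file_id():
    with tempfile.TemporaryDirectory() as folder:
        doc = os.path.join(folder, "doc.txt")
        with open(doc, "w") as f:
            f.write("version 1")
        original = file_key(doc)

        # A plain rename within the volume keeps the same file ID.
        renamed = os.path.join(folder, "renamed.txt")
        os.rename(doc, renamed)
        survives_rename = file_key(renamed) == original

        # Archive-and-save: the old file becomes the backup and a brand
        # new file is created under the document's name.
        backup = os.path.join(folder, "renamed.txt~")
        os.rename(renamed, backup)
        with open(renamed, "w") as f:
            f.write("version 2")
        survives_save = file_key(renamed) == original
        backup_is_original = file_key(backup) == original

        return survives_rename, survives_save, backup_is_original

print(demo_file_id())
```

After the save, the ID being tracked belongs to the backup copy, not to the document the user thinks of as current.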

Alias
The king of Mac file reference objects is the alias. It's robust, it's persistent and it's flexible. It's not (by a small margin) the easiest to use - typically you'll have to convert from alias to FSRef and then possibly to something else - but it should be the first choice for any file reference that has to last longer than the current pass through the event loop. Aliases can reference a location or an object, they can ask for volumes to be mounted, they can reference objects that don't exist (albeit less robustly) and they can notify the programmer that they're out of date (the file has moved, been renamed or replaced) and need to be re-saved.

Posted: Mon - July 28, 2008 at 09:10 PM          

