The proposal itself appears below.
Many developers and users in the Mac community are concerned about the direction of file system metadata in Mac OS X. I wrote about the topic in the article "Metadata, The Mac, and You":

http://arstechnica.com/reviews/01q3/metadata/metadata-1.html

and revisited it in a section of my Mac OS X 10.1 review:

http://arstechnica.com/reviews/01q4/macosx-10.1/macosx-10.1-11.html

The purpose of this proposal is to condense the philosophy and proposed changes found in those articles, augmented by the input of the larger community, and submit it formally to Apple as a bug report (Radar 2826368).

---

PHILOSOPHY

It's important to have a good picture of where one wants to be in the future. Without a clear goal, changes may seem arbitrary, and it may be difficult to weigh the value of one change versus another. Part of the concern about the direction of file system metadata in Mac OS X is due to the uncertainty about Apple's long-term goals.

To put the changes that will be proposed in this document into perspective, it's important to clearly identify the proposed "goal state" of file system metadata in Mac OS X, and the philosophy behind it.

We believe that Mac OS X should achieve a superior user experience through its native support for a complement of metadata that surpasses that of other platforms. Supporting a superset of the file system metadata found elsewhere means that Mac OS X files can be pared down to suit the limitations of file system metadata on other platforms when this is necessary or desirable. Similarly, transformations from a more limited set of metadata to Mac OS X's richer environment should be possible when that is desirable.

Even more broadly, the overriding philosophy is to maintain, and even extend, the user experience advantages inherent in doing things not just differently, but better than other platforms--without sacrificing cross-platform interoperability.
It is our position that achieving transparent cross-platform compatibility must not, and does not have to, come at the cost of the superior user experience that has defined the Macintosh. People choose the Macintosh not just because it is different, but because it is better. While removing barriers to interoperability is desirable, doing so must not take precedence over retaining the advantages of the Mac. The natural extension of the willingness to sacrifice usability for compatibility is to simply use a PC.

BASIC TENETS

There are many problems with Mac OS X's file system metadata policies and mechanisms as of Mac OS X 10.1.1. The most important are outlined below by way of the historical advantages of the Mac platform that they negate, and the basic tenets of the philosophy described above that they violate.

1. USERS MUST HAVE FULL OWNERSHIP OF THE FILE NAME.

A file's name should be entirely in the user's domain. (Note well: this means the *actual* file name, not the "displayed" file name.) Changing a file's name should never have adverse effects on its usage on a Mac OS X system. This is the interface that Mac users have lived with and enjoyed for over a decade. It is an advantage of the Mac that is so widely recognized that Apple itself has based ad campaigns on it in the past.

Mac OS X 10.1 eliminates this advantage by removing the user's control over file names through its user interface, and through Apple's official human interface recommendations to developers stating that all files created by Mac OS X applications should include file name extensions. Furthermore, OS X 10.1 obfuscates the previously direct and simple interface to file naming and file name display.

For this grievance to be successfully addressed, it must be possible for Mac OS X users to exercise full control of their file names using the simple and direct interface that they have enjoyed in classic Mac OS.
There should be no functional or interface "penalty" for choosing to operate this way in Mac OS X.

2. THE COMPUTER MUST SERVE THE USER WHEN IT COMES TO CROSS-PLATFORM COMPATIBILITY, NOT THE OTHER WAY AROUND.

When a file from another platform is encountered, Mac OS X should help the user by identifying the file using any information it has available. It should be possible to bring the file up to Mac OS X metadata standards by "upgrading" its metadata in such a way that it fits into the normal flow of file usage on the Mac.

Similarly, Mac OS X should be able to identify files that are not prepared for a successful existence on other platforms and offer to "down-grade" their metadata and/or convert their metadata representation in such a way that they will be successfully handled on other platforms.

The mechanisms for metadata representation transformation should be provided by the OS, but the policies should be user-configurable.

Mac OS X 10.1 does little of the above. It merely provides vague warnings during certain file renaming operations and hides and/or restricts file name editing. Apple further recommends that all Mac OS X applications save files with the lowest common denominator in mind.

In Mac OS X 10.1, the user must serve the computer by tolerating the mysterious file names and file naming interface he is given, avoiding too much file name manipulation, and by knowing or guessing how to manually convert files to and from various metadata representations when dealing with cross-platform file transfers.

3. THE ROBUSTNESS OF FILE IDENTIFICATION AND ASSOCIATION IS MORE IMPORTANT THAN PERFORMANCE MINUTIAE WHEN IT SERVES TO PROVIDE A SUPERIOR USER INTERFACE.

While simple "paths" may have better performance than more abstracted file identification and tracking mechanisms, the user interface provided by the abstract mechanisms is worth the trade-off.
The fact that more primitive operating systems do not "natively" support these abstractions is not an adequate reason to deprecate or abandon them. Moreover, any performance penalties and historic limitations inherent in such file access abstractions should be eliminated, not by eliminating or decreasing the abstraction, but by creating new, more advanced implementations of these abstractions in Mac OS X.

PROPOSED SOLUTIONS

There are many milestones on the road to the goals outlined at the start of this document. To achieve them all requires a substantial technological investment: new file systems, new APIs, new metadata standards, etc. Such a big change is not feasible in the short term. But an awareness of the long-term goals is necessary to see past the details of the proposed short-term changes, and to keep any subsequent changes on track.

The proposals below are split into several "phases", ordered from short-term to long-term. The baseline for these changes is Mac OS X 10.1.1. Any Mac OS X 10.1.1 feature or policy that is not explicitly or implicitly negated by a change listed below is understood to be retained. (Please note that this includes the deprecation of Mac resource forks.)

PHASE 1: EMERGENCY REPAIR

The purpose of this phase is to quickly address the most serious problems and to restore faith in Apple's direction regarding file system metadata in Mac OS X. The following changes are proposed. (Note that many of the changes depend on each other and may be harmful if implemented independently.)

---

* Officially recommend that all Mac OS X applications write HFS/HFS+ type and creator codes when creating a file.

  This change is necessary in order to maintain good backwards compatibility with classic Mac OS systems and applications, and to prevent the disappearance of one or more pieces of metadata from files created by Mac OS X applications, which would further restrict the number of possible policies that reference file system metadata (e.g.
  application binding or icon display) in the future.

* Provide preferences at all levels (system-wide, per-app/per-user, and per-task) that determine when file type information is redundantly encoded in the file name (i.e. when file name extensions are appended to files).

  The system-wide preference could be set by an administrator. It would be the default for new users on the system. User-level preferences (set in System Preferences) would determine the default setting for all applications run by that user. These settings could be overridden in applications' preferences dialogs. Finally, the default open/save dialogs and other mechanisms through which files are created should include an interface for overriding any of the above settings on a one-time, per-task basis.

* Make the application binding policy user-configurable. Include preset configurations that correspond to the traditional Mac OS and Windows policies, and allow custom policies to be created by the user that reference (or ignore) metadata of the user's choosing when determining which application a file will open in and which icon will be displayed for that file in the Finder.

  The classic Mac OS application binding policy gives precedence to the creator code (see Mac OS 9 for an example). The traditional Windows application binding policy is based on the file type (as encoded in the file name). The Mac OS X 10.1.1 application binding policy is outlined in the Mac OS X System Overview document. Those three presets should cover most users' preferences, but further configuration should be possible to ensure that every user can create a work environment that suits his or her needs.

* Change any Apple applications or system services that currently unnecessarily depend on the presence of file name extensions to also understand file type metadata stored in locations other than the file name.

* Allow the entire file name, including any extension, to be visible and editable at all times without any warnings.
  See Mac OS 9 for an example of how this should work.

* Do not deprecate classic Mac OS file identification and tracking abstractions until there is an adequate replacement that duplicates and/or surpasses their abilities.

* Maintain HFS+ as the preferred volume format for Mac OS X. It provides the best compatibility with classic Mac OS, and can natively support the most metadata.

---

Note that many of the changes proposed above (and below) specify *abilities*, not necessarily *default behaviors.* These changes do not preclude future versions of Mac OS X shipping with default behaviors that closely correspond to Mac OS X 10.1.1 as it exists today: file name extensions could still be appended by default; "smart" file name extension hiding could still be enabled by default; Finder warnings about file name extensions could still exist by default; application binding could still follow the 10.1.1 policy by default; and so on.

Retaining all of those defaults is not necessarily our recommendation, but rather is meant to highlight how little impact the changes could have on the interoperability benefits created by OS X to date, while simultaneously enabling users to take back control of their systems, if they so choose.

But enabling Mac users to set a handful of preferences and restore the working environment that they are used to should not be seen as a "temporary concession" towards "the old ways." Instead, it should be seen as a good-faith gesture towards Apple's core customers meant to at least maintain the user experience quality achieved in classic Mac OS, while buying Apple time to come up with something even better.

During this transitional period, the policy defaults are not that important (again, they could conceivably be almost identical to those in Mac OS X 10.1.1), *provided* the long-term goal (outlined in the "philosophy" section above) is clearly articulated to developers and users.
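
The layered file name extension preferences proposed in Phase 1 (system-wide, per-user, per-application, per-task) amount to a "most specific setting wins" lookup. The following Python sketch is only an illustration of that idea: the scope names, the `resolve_append_extension` function, and the fallback default are all hypothetical and do not correspond to any actual Mac OS X API.

```python
from typing import Mapping, Optional

# Hypothetical scope names, ordered from least to most specific.
SCOPES = ("system", "user", "application", "task")


def resolve_append_extension(settings: Mapping[str, Optional[bool]]) -> bool:
    """Return the effective "append a file name extension" setting.

    `settings` maps a scope name to True, False, or None (not set).
    The most specific scope that has an explicit value wins.
    """
    for scope in reversed(SCOPES):
        value = settings.get(scope)
        if value is not None:
            return value
    # Assumed fallback when nothing is configured at any level:
    # append extensions, mirroring the 10.1.1 default behavior.
    return True


# An administrator enables extensions system-wide, the user turns them
# off, and a single save operation turns them back on for one file.
effective = resolve_append_extension(
    {"system": True, "user": False, "application": None, "task": True}
)
```

Keeping the scopes in one ordered tuple means a new level could be added without touching the lookup logic, which fits the proposal's emphasis on abilities over fixed defaults.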
PHASE 2: ENHANCEMENTS

The "enhancements" phase builds on the "emergency repairs" phase, but it still does not require any radical rethinking or the creation of new standards. It requires substantial new code and user interface enhancements, but all the changes are based on known technologies and standards like HFS+, MIME, type/creator codes, etc.

---

* Add a "metadata services" framework to the OS for converting to and from various metadata representations (e.g. MIME, type/creator, file name extensions). This includes appending the appropriate file name extensions based on other file type metadata (or even the file contents), and removing file name extensions and setting other metadata as appropriate.

  These APIs would reference a customizable, per-user metadata representation mapping table much like an expanded version of Mac OS 9's "File Mapping" table in the Internet Control Panel.

* Enhance the standard open/save dialog boxes to leverage the new metadata services framework by providing both system-wide and per-application preferences for each user to control which metadata representations are used when files are saved. (Note that this is an extension of one of the "emergency repair" changes, refactored and enhanced in terms of the new metadata services framework.)

* Add menu commands and context menus to the Finder (and any other appropriate applications or services) that use the new metadata services framework to allow the user to convert selected files (or folders full of files, or entire volumes full of files) to and from different metadata representations.

* Provide a simple interface to pieces of metadata that may, in rare circumstances, have to be manually corrected by the user. (This interface should be built on the metadata services framework, of course.) At no time should the "raw" metadata values be exposed to the user in this interface.
  An example of such an interface would allow access to file type metadata in an "Advanced" tab of the "Get Info" panel in the Finder. The file type would be chosen and displayed using human-readable text such as "Microsoft Word Document." The raw format of the file type metadata (e.g. a 32-bit HFS/HFS+ type code) should never be seen by the user.

* Officially recommend that application developers leverage the metadata framework as they feel is appropriate for their application. Examples:

  A compression program could (optionally, or by default) warn users that files with a particular metadata representation (e.g. file type stored someplace other than encoded in the file name) compressed in an archive may not be readable if they are uncompressed on a foreign platform.

  The Finder could (optionally, or by default) warn users when copying files with a particular metadata representation to a volume whose metadata abilities are either unknown (or known to be limited), or when files are transferred via a protocol that does not support Mac OS X's native set of metadata.

  Files that arrive on the system via a web browser or any other network service or disk could (optionally, or by default) be brought up to "native" metadata standards by extrapolating and filling in any missing metadata according to the per-user mapping tables described above.

* Extend, enhance, or create robust, high-performance APIs for file identification and tracking.

---

The "enhancement" phase brings Mac OS X closer to the goals outlined at the start of this document. It does not rest on the laurels of the "emergency repairs" phase, thinking that they had "satisfied" the desires of the "old timers." Instead, it recognizes that those changes were just the start of a journey towards a new destination (rather than a side trip on the way towards the lowest common denominator).

PHASE 3: RADICAL CHANGES

This phase requires substantial work and cooperation with the rest of the industry.
In order for this phase to succeed, work should be concurrent with the other phases. By completing this phase, Apple will once again have established itself as a leader and an innovator in the computer industry.

---

* Create a new volume format that supports arbitrarily extensible metadata and robust data integrity (e.g. journaling), while providing extremely high performance.

  This new volume format will be the foundation for future metadata initiatives. It will provide Mac OS X with the ability to truly handle a superset of all metadata found on foreign systems. Combined with the changes in phase 1 and 2, it will enable Mac OS X to be the "skeleton key" of data formats, able to understand, store, and translate files created on any other file system or platform.

* Lead, initiate, or participate in the creation of open standards for file metadata representation.

  MIME was a start, but it is limited. Robust, hierarchical, extensible, standardized, and above all, *open* standards for file metadata representation are necessary for the future of computing. Apple should work with the rest of the industry on this problem. FireWire is one example of a successful *open* technology created by Apple that has found a place in the market and improved the products we use. If Apple can do it once, it can do it (even better) again.

  This new metadata standard should include both a standard taxonomy for basic attributes like file type, name, and dates, as well as an extension mechanism for domain- and vendor-specific attributes. There are many similarities to the development of XML and its various namespace mechanisms, schema standards, domain-specific DTDs, and so on. While a new metadata standard does not necessarily have to be based on XML, the development model XML has followed is a good guide.

* Transition Mac OS X's "native" metadata representation to one based on the new volume format and new metadata standards.
  Deprecate HFS/HFS+, type/creator codes, and other vestiges of classic Mac OS (while providing easy translation to and from that representation via the metadata services framework, of course).

---

ADDITIONAL NOTES

DARWIN/UNIX

All of the above implicitly applies to "Mac OS X applications", meaning GUI applications based on Carbon, Cocoa, Java, or similar high-level frameworks that make up "Mac OS." The Unix side of things should be addressed differently, with an awareness of (and respect for) the history, conventions, and customers of that environment. The primary strength of the Unix layer is its compatibility with other Unix-flavored OSes, and this must not be compromised unnecessarily.

Command-line tools that build on the metadata services framework and other higher-level APIs are the appropriate level of integration for the Unix layer. Current examples include the "defaults" program, and the "SetFile"/"GetFileInfo" commands. They provide integration for the Unix environment without breaking any of its conventions.

Further "additive" integration is possible through extension modules (e.g. Apache's "mod_hfs") and even new options to basic commands (e.g. new flags to the "ls" command that list Mac OS X native metadata), but it is not necessary or desirable to try to fully integrate all the Mac OS X guidelines described above into decades' worth of Unix software.

The whole Darwin layer should be treated as a separate, less abstracted OS of its own--which it is, after all. Its user interface should not influence, or be influenced by, the guidelines that apply to the "Mac OS layer."

METADATA REPRESENTATION CONVERSIONS

The process of converting from one "metadata representation" to another deserves some clarification. The most basic premise of such conversions is that the "baseline" representation must contain a minimum complement of native Mac OS X metadata.
For example, imagine a file named "resume.doc" arriving on a Mac OS X system with no metadata other than the type information encoded in its file name (i.e. ".doc"). That file can be "promoted" to the Mac OS X metadata baseline by adding the appropriate complement of metadata (via the metadata framework, and according to the user-configurable metadata representation mapping tables). (In phases 1 and 2, "the appropriate complement of metadata" would be HFS/HFS+ type and creator codes. In phase 3, it would be defined by the new metadata standards created therein.)

Note that such a "promotion" does *not* necessarily imply the *removal* of any metadata, including the file type encoded in the file name in the form of a file name extension. Once a file is "promoted", the file name is safely back in the user domain and does not have to be modified in any way by the system. (Such functionality should exist in the metadata framework, however, and should be available if the user requests it.)

Going in the other direction, "demoting" a file to a more primitive metadata representation must not remove the baseline complement of Mac OS X metadata. For example, imagine a Mac OS X Word document named "My Resume" that needs to be shared with other platforms. The most likely metadata representation conversion requires the encoding of the file type information in the file name (e.g. by appending a ".doc" extension). If the file continues to exist on a (possibly shared) Mac OS X disk (HFS/HFS+, or even UFS with its "._" metadata files), the Mac OS X metadata should remain intact even after the addition of the ".doc" file name extension.

Again, the metadata representation conversion process adheres to the basic philosophy covered at the top of this document. Mac OS X should achieve cross-platform compatibility and a superior local user experience by maintaining a large, advanced complement of metadata that represents a superset of that found elsewhere.
This collection of metadata should be created and preserved whenever possible. Preparing files for survival on other platforms does not mean removing Mac OS X's rich set of metadata. It merely means redundantly encoding file metadata as necessary to ensure that a file is usable if its metadata collection is necessarily pared down by a transfer to a more limited environment.

Note that the conversion techniques described above are not constrained to any particular events. Metadata representations may be chosen and/or translated at any time, including (but not limited to) when a file is saved by a Mac OS X application or when a file arrives on or leaves a Mac OS X system. Maintaining Mac OS X's baseline complement of rich metadata is the important part, not which metadata representation is chosen at any given time. The latter should be controlled by the user.

METADATA DETERMINATIONS

In the world of Mac OS X, "Mac OS X metadata" rules. Take file type as an example. A file's type is determined by looking at Mac OS X's native file type metadata (the HFS/HFS+ type code in phases 1 & 2, something else in phase 3).

If there is no Mac OS X file type metadata, the file type is determined by a cascade of hints and clues, all of which are secondary to the Mac OS X metadata that was missing. For example, file type information encoded in the file name may be checked next, triggering a subsequent look-up in the metadata mapping tables to determine what Mac OS X native file type is indicated by a ".qux" extension, if any. As a last resort, the file contents themselves may be examined using /etc/magic-style byte ranges and values or some other more advanced system.

In all cases, the outcome of a file type determination process is a Mac OS X file type (again, an HFS/HFS+ type code in phases 1 & 2, something else in phase 3).
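
The determination cascade described above (native type metadata first, then the file name extension via the mapping tables, then the file contents) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a proposed implementation: the function name, both lookup tables, and the example type codes are hypothetical, and the contents check is reduced to a simple leading-bytes match rather than full /etc/magic-style rules.

```python
from typing import Mapping, Optional, Sequence, Tuple

# Hypothetical per-user mapping table: file name extension -> type code.
# The four-character codes here are illustrative placeholders.
EXTENSION_TO_TYPE = {"txt": "TEXT", "pdf": "PDF "}

# Hypothetical magic table: leading bytes of the file -> type code.
MAGIC_TO_TYPE = [(b"%PDF", "PDF ")]


def determine_file_type(
    name: str,
    native_type: Optional[str] = None,
    contents: bytes = b"",
    ext_table: Mapping[str, str] = EXTENSION_TO_TYPE,
    magic_table: Sequence[Tuple[bytes, str]] = MAGIC_TO_TYPE,
) -> Optional[str]:
    """Return a native type code, or None if nothing can be determined."""
    # 1. Native metadata always wins; the file name is not consulted.
    if native_type:
        return native_type
    # 2. Fall back to the extension, looked up in the mapping table.
    _, dot, ext = name.rpartition(".")
    if dot and ext.lower() in ext_table:
        return ext_table[ext.lower()]
    # 3. Last resort: examine the leading bytes of the contents.
    for prefix, type_code in magic_table:
        if contents.startswith(prefix):
            return type_code
    return None


# A file with native type metadata keeps that type regardless of its name.
kept = determine_file_type("report.txt", native_type="PDF ")
```

Whatever route is taken, the result is always expressed as a native type code (or nothing at all), never as an extension string, which matches the rule that every determination ends in a piece of Mac OS X metadata.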
This clear prioritization of "Mac OS X metadata" over the redundant representations required for cross-platform compatibility is central to this proposal. It means that "file type" and "file name extension" are two different things, for example. "File type" is determined by the process described above. It *may* be arrived at by looking at a file name extension and then looking up the corresponding file type in the user's metadata mapping tables, but there's no guarantee that the file name extension will have any bearing on the file type. If proper Mac OS X file type metadata exists, the file name extension is just some characters at the end of the file name. Remember, the file name should be in the user's domain.

Keeping the file name in the user domain requires actually assigning proper Mac OS X metadata to "foreign" files as soon as the proper native metadata can be derived from the information available. Again, this does not necessarily imply the removal of "foreign" metadata such as file name extensions.

Also note that this prioritization of proper Mac OS X metadata over the various foreign representations does not mean that these foreign representations cannot be referenced by Mac OS X policies. For example, a user may choose a "Windows style" application binding policy which is based entirely on the file name extension. As far as Mac OS X is concerned, this policy is based entirely on the file *name* (which is, after all, a proper piece of Mac OS X metadata).

File metadata that simply does not exist in a "foreign" file in any form may be added in order to aid in classic Mac OS compatibility and/or the application binding process. For example, files from other platforms rarely provide any information about which application created them. But Mac OS X may, at the user's discretion, add creator metadata as per the metadata mapping tables.

Metadata determination strategy summary:

* Try to understand every possible metadata representation.
* Prefer proper Mac OS X metadata over all other representations.

* "Promote" files to proper Mac OS X metadata when possible.

* The outcome of all metadata determinations in Mac OS X should be a piece of proper Mac OS X metadata, regardless of which pieces of information contributed to the determination.

THE EXISTENCE OF METADATA VERSUS THE POLICIES BASED ON IT

The distinction between the existence of metadata and the policies based on it is a very important concept. "Promoting" a file to a proper set of Mac OS X metadata should never be seen as a "harmful" process. Adding file creator metadata, for example, is sometimes considered harmful in Mac OS X as it exists today due to the application binding policy that prioritizes creator metadata over all else.

Since the overriding philosophy of Mac OS X metadata should be that "more metadata is better", any situation in which the existence of metadata is considered harmful to the user experience must be dealt with by allowing the *policy* that references the "harmful" piece of metadata to be changed, *not* by recommending the removal of the piece of metadata. Making the application binding process user-configurable is the best example of this value system in action.

Remember, the more metadata that is available, the richer the possible interactions with the data can be. And while it is trivial to ignore metadata, it is impossible to reference it once it is gone.

THE MAC OS X METADATA USER EXPERIENCE

The goal of the user experience is to allow Mac users to think in terms of a "vocabulary" of file metadata defined by Apple. Every time a computer user mentions a "dot-pee-ess-dee file" or a "dot-tee-ecks-tee" file, the user experience vocabulary of the Windows platform (something that is not under Apple's control) is reinforced.
Before the advent of Mac OS X, Apple largely succeeded in defining its own user experience vocabulary for the Mac platform that was much less obscure than that of the PC world--more "user friendly", as it was known. The ultimate "friendly vocabulary" was the GUI itself. But even when the GUI became commonplace, the Mac user experience still enjoyed significant advantages due to its more sensible and friendly file management vocabulary.

Mac OS X has reversed that trend significantly. Every mention of "Mail-dot-app" and "dot-pee-list files" erodes more of the Mac platform's historic user-friendliness.

By focusing again on defining a consistent and powerful vocabulary for file metadata in Mac OS X that is friendlier than that found on other platforms, Apple can regain its leadership position in this area. By endeavoring to understand and handle the metadata vocabularies of other platforms, Apple can bring the Mac platform beyond its former ease of use by providing its users with the power to deal with any file that comes their way--and, more importantly, to quickly and easily bring it into their preferred vocabulary of the Mac.

A Mac OS X user should feel secure in the knowledge that any file he encounters can be dealt with using the simple and sensible vocabulary defined in Mac OS X, regardless of the (possibly obscure) vocabulary of the originating system. Similarly, a Mac OS X user should feel confident that the operating system, applications, and his own actions will work together to ensure that his files will survive happily on other platforms, regardless of his knowledge of the "vocabularies" of those systems.

CONCLUSION

The road to a better future for metadata in Mac OS X is long, but staying focused on the correct destination is half the battle. Keep your eyes on the prize, as they say. The prize is a superior user experience on the Mac. That is the number one goal.
As I hope was made clear in this proposal, this does not preclude vastly improved cross-platform interoperability. Chasing standards that others in the industry have been stuck with for decades and are desperately trying to transition away from is not a formula for success. Apple must innovate its way to a better tomorrow. Furthermore, the rest of the industry does not stand still. In order to provide customers with compelling reasons to keep buying Macs, Apple must continue to improve its user experience over time. Given Apple's market share, compromises are sometimes necessary to maintain acceptable levels of interoperability. But in cases where alternate solutions provide the same interoperability improvements without sacrificing the favorable aspects of the Mac user experience, Apple must do everything in its power to implement them as such. Any part of the Mac OS user experience that duplicates the experience on another platform ceases to be a compelling reason to buy a Mac.