![[Unicode Encoded Logo]](uencoded.gif)
Unleash Your Multilingual Mac
(updated 10/25/2003)
This page is for Jaguar. For Panther go here, and for Tiger go here.
One of the best-kept secrets about MacOS is the built-in support it
contains for reading and writing languages beyond English, including ones
that use non-Latin scripts and characters. This document explains these capabilities and provides various resources to help users exploit them to the maximum degree possible. Comments and additions from readers are most welcome.
OS 9 can handle input for a considerable number of languages
out of the box: Danish, Dutch, Finnish, French, German, Italian,
Norwegian, Portuguese, Spanish, Swedish, Czech, Hungarian, Polish, Slovak,
Russian, Ukrainian, Bulgarian, Hebrew, Arabic, Persian, Hindi, Nepali, Gujarati,
Punjabi, Chinese, Korean, and Japanese.
By default, multilingual input capabilities are invisible to the Mac OS 9 user. To turn
them on, you need to go to the Keyboard control panel and select at least
one keyboard layout in addition to the default. Your menu bar will then include a new
flag icon at the far right. Clicking on this menu will show whichever languages or alternative keyboards
you have checked, and selecting one will activate it.
The Keyboard control panel should allow you to choose Danish, Dutch, Finnish, French, German, Italian,
Norwegian, Portuguese, Spanish, and Swedish. For the others, you will need to get out
your system disk and do a Custom Install for the language kits of interest.
Afterwards also look in the CD Extras file on the system disk, where the
Language Kits CD Extras folder has some additional fonts and keyboards.
A nice tutorial on installing the kits is at NISUS Software
If your system does not appear to offer a Custom Install with language kits, look on the CD in the folder OS9 Applications/Apple Extras for an installer. Also look for a folder with the same name on your hard drive if you have used Restore disks to install your OS 9. Some of the latest versions of OS 9 seem to be missing Persian and Unicode Hex input, and Punjabi is called Gurmukhi. The Persian and Unicode folders from standard 9.2.1 can be found here.
After you install a language kit, except for Chinese/Japanese/Korean, you need to go back to the Keyboard Control Panel and check that the keyboards you want to use are activated. Click on the "Script" menu to choose the script you are interested in to make its keyboards visible for selection. If you are naming files in another language, you may want to go to the Fonts tab of the Appearances Control Panel and choose an appropriate font.
The keyboard shortcut for switching from one "script" to another is Command+Space. The scripts are Roman, Central European, Cyrillic, Arabic, Hebrew, Devanagari, Gurmukhi, and Gujarati. By going to the Keyboard Control Panel and poking the Options button, you can activate another shortcut, Command+Option+Space, which switches from one language to another within a given script. Note that if the same keyboard shortcut is used for an application, the script/language switching function will take precedence, and the first of these cannot be turned off. The only workaround is to deactivate the other scripts or to always type a space before invoking Command+Space (a common shortcut for "zoom") in the application.
Software for working in many other languages not included in the standard US OS 9 is available on the Internet and commercially. See the "Other Resources" section at the end of this page for some suggestions about where to look.
Some more limited multilingual access was included starting
with OS 8.5. An explanation can be found at
MacInTouch OS
8.5 Special Report: MIA
Prior to OS 8.5, working in a foreign language usually required installing separate software packages. Apple itself at one time sold individual Chinese, Japanese, and Korean language kits, and these can sometimes be found on eBay if you need one for an old system.
Note that standard US OS 9 and earlier comes with only one "system language," in which all the standard menus and dialogues are written. "Localized" versions, with the system in a language other than English, are sometimes available in other countries. If you want one of these on your machine you will have to buy it separately and install it in place of the existing system (or on a separate partition).
Whether a particular application has a localized version for a particular language depends on whether its authors have provided it. You may need to use the Language Register program in Applications/Utilities to enable all the features of a localized application. If having a non-Roman system font is critical for your purpose, a hack to allow this is found here.
Setting up your browser to read foreign language web sites usually requires having the proper
font installed for the language in question and setting some browser parameters.
In particular, you need to go into Preferences/Fonts and make sure the right
font is selected under the right language. You may also need to go to the
Character Set item under the View menu and set it for the right language.
Sometimes experimentation is necessary because there is more than one choice
for a language. The "user defined" option can be useful where
your language/font does not fit one of the other categories.
For a nice explanation of how to do this, see Alan Wood's page on Setting
Up Mac Browsers for Multilingual Support at
Mac Multilingual
Browser Page
For best results with foreign languages, you should use the latest browsers, such as Netscape 7 or Mozilla 1.2, rather than the versions of IE and Netscape that come installed with OS 9.
If you want to write in a foreign language, the procedure is usually pretty
simple. You first need to open a Mac text editor or word processor that
is able to handle other scripts (known as "WorldScript-savvy").
SimpleText, WorldText, AppleWorks 5, Nisus Writer, Word Perfect 3.5, BBEdit 6.0, Mariner Write 2.0,
and Word 2001 are examples. For scripts that run from right to left, like
Arabic and Hebrew, we have heard that Nisus
Writer and Mariner Write give
the best results.
Then you go to the Flag (keyboard) menu and select the language. To see
how the keys are mapped, go to Keycaps in the Apple menu. A small keyboard
will appear on the screen with the foreign letters in place of the usual
ones (you may have to adjust the font in the Keycaps Font menu to get this
right). You can type directly into the document from the real keyboard or
type on the screen keyboard and copy/paste the result.
You might want to print out a copy of Keycaps for the language you are using.
You can try using the Mac's built-in screen capture function (Command-Shift-3
or Command-Shift-4), or other third-party capture utilities you may have,
but sometimes these make the keys go back to normal and won't work. One
program I found that seems to do the trick is Gif-gIf-giF.
When you can't find a keyboard with specific letters that you need, all is not lost. Many fonts contain characters beyond those which the keyboard can access or which require obscure key sequences. For example, Mac Central European fonts have macron vowels which are otherwise hard to find. Useful utilities for locating and typing these are PopChar and FontBuddy.
For the more unusual scripts that use keyboards you can download helpful
manuals from the Apple web site: Cyrillic (Part 030-7977), Indian (U96600-025),
Arabic (030-7912), and Hebrew (030-7978). For a full list, see:
Apple Language Kit Manuals
What about Chinese, Japanese, and Korean, which may require the use of thousands of ideographs? For these the Mac kits
include special input methods, operated via an additional "pencil" menu located next to the Keyboard menu. Each language has several options for generating final text.
Finding English-language
documentation on how these methods function requires some extra work, since the manuals
are not provided on the OS 9 CD. "Help" in Japanese and Chinese is, however,
available from the on-screen Help Center, and in Korean via the "pencil" menu.
Fortunately, manuals for Chinese input are available at the Apple site (Parts 034-0602 and 030-4900). And
a good explanation has also been put into the Chinese-Mac
FAQ:
Chinese-Mac FAQ User Guide
The Mac's Traditional Chinese input system covers about 13,000 characters and gives you the choice of three modes using strokes/radicals (Cangjie, Jianjie, and Dayi), two using phonetics (Pinyin and Zhuyin/Bopomofo), plus Big5 hex codes. For Simplified Chinese, covering about 6,700 characters, there are two modes using strokes/radicals (Wubi Xing and Wubi Hua) and one using phonetics (ABC/Pinyin), plus GB numeric codes (Quwei). In both cases, hitting the space bar after input generates a list of possible characters for selection.
Many users find Apple's Traditional "pinyin" input option to be too primitive for their needs, as it cannot parse sentences or phrases, but requires each character to be chosen separately. Hanin and BoPoMoFo are non-Apple input methods which are considerably "smarter" in this regard. Info on Hanin can be found on the Chinese-Mac page. Another system is Cihui.
For Korean the manual can also be obtained from Apple (Part U95602-004). The Mac's "Power Input Method" provides the user with Jamo, Romaja, and (Japanese) Kana phonetic keyboards to create Hangul characters. You can also transform Hangul and Romaja into Hanja (Korean Chinese characters) using a 5,000 character dictionary: Typing option-return or control-return after your input will generate a list of Hanja for selection.
The OS 9.1/9.2 Korean Language Kit fixes some font problems with 9.04's Hanja converter and adds a second input method for Korean, called "Hangul Direct," but without any English menu or explanation regarding how it works. A rough translation of the "pencil" menu is at:
Hangul Direct IM Menu in English
In May 2001, updated fonts for Korean and Traditional Chinese were made available for System 9.1 only (also via the Software Update control panel).
For Japanese the manual is also available online (Part 030-4174, 22MB). This is particularly important because (unlike for Chinese and Korean) the "pencil" menu cannot be switched to English. For a rough translation of the key menus see:
Japanese
Language Kit Menus in English
The Mac's "Kotoeri" input method offers a choice of Romaji, Hiragana, and Katakana phonetic keyboards. Romaji can be transformed automatically into either Hiragana or Katakana, and hitting the space bar twice will convert Hiragana to a list of possible Kanji (Japanese Chinese characters). You can also key in Kanji directly via a dictionary that lets you search for about 6000 characters by radical or by any of four numerical codes. For info on other input systems favored by intensive Japanese users, ATOK and EG Word, see the Other Resources section at the end of this page.
A fourth language, Classical Vietnamese, uses a subgroup of about 3000 Traditional Chinese characters (Chu Han) or a different, unique character set derived from these (Chu Nom -- about 2000 glyphs). For Nom one must download a special font (look under "Vietnamese" in the "Other Resources" section) and use it with the Japanese Language Kit.
For info on the (limited) Unicode capabilities of OS 9, see the section devoted to this topic below.
These comments are based on OS X 10.2.8 (build 6R73), Jaguar, issued 10/3/03.
OS X is a complex animal. For the first time, Apple has deployed Unicode on a broad scale in its operating system, which provides the potential for great linguistic flexibility. At the same time, OS X operates in 3 different modes -- Classic, Carbon, and Cocoa -- each of which has different capabilities. The Darwin OS, based on Unix, is also available. Plus OS X's features have changed significantly from versions 10.0 to 10.2. So it is difficult to generalize about how applications, languages, and modes work together. Apple Article 107379 provides some basic info on Jaguar's language capabilities.
Unlike OS 9 and earlier Mac systems, which were produced in localized versions for foreign countries, OS X offers the choice of 15 system languages out of the box -- English, Japanese, French, German, Spanish, Italian, Dutch, Swedish, Danish, Norwegian, Finnish, Traditional Chinese, Simplified Chinese, Korean, and Brazilian Portuguese. The system language, which affects system-wide menus and dialogues, can also be changed, for your next login, via the Languages menu of the International pane (in System Preferences of the Desktop menu). Just move your preferred language to the top of the list.
Other system languages may be available via kits from local Apple branches. Examples I am aware of (for 10.1) are Polish, Russian, Czech (also 10.2), and maybe Greek and
Turkish. Russian is available for 10.2 here, and Ukrainian here.
If you poke the "Edit" button in the Language menu to see all varieties available, you get a list of some 64. Other than the 15 system languages, these relate primarily to user preferences regarding menus and dialogues for applications, which can have their own localizations independent of the system.
If you move a language for which you do not have the localization files installed to the top of the Languages pane, you may disable your system. To recover from this, a reinstall may be necessary, or at least installing the files for the language you chose from the Optional Installs folder of your installation CD.
Note that the system language is distinct from the keyboard language, which determines what you can type. The latter is set from the Input Menu of the International Pane. Also the language of the login function is fixed at whatever is chosen upon installation, and can only be changed by reinstalling (unless you use TinkerTool System).
If you do not want all the 15 system languages, be sure to do a Custom Install. To get rid of system languages after they have been installed (normally to liberate hard drive space, probably 150-200MB), you can check out the programs Monolingual, and DeLocalizer.
If you buy a machine with OS X outside the US, you should be aware that the OS 9/Classic that comes with it will most likely be in the local language only. If you want English instead, you will probably need to buy and install another copy of your system software with an English OS 9/Classic.
Typing foreign languages can be done in either OS X proper or in Classic. OS X switches automatically to OS 9, operating in "Classic" mode, whenever you open an application which is only designed for the older systems. In this situation your system language is that of the OS 9 which is being used, and you have access to all the language kits that you have installed on it. See the OS 9 section above for installation info. If your system does not appear to offer an OS 9 Custom Install with language kits, look on the CD in the folder OS9 Applications/Apple Extras for an installer. If you want to use the carbonized AppleWorks in Classic mode, with all the System 9 language kits available, control-click on the application icon, select Show Package Contents, open Contents and then MacOSClassic, and double-click on AppleWorks6.
If you are in OS X proper you can select keyboards for English, Danish, Catalan, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish, Swedish, Vietnamese, Russian, Bulgarian, Ukrainian, Polish, Hungarian, Czech, Slovak, Japanese, Korean, Chinese (simplified and traditional), Arabic, Persian, Hebrew, Devanagari, Gujarati, Gurmukhi (Punjabi), Irish, Croatian, Greek, Hawaiian, Icelandic, Romanian, Slovenian, Thai, and Turkish, plus Unicode Hex, US Extended (formerly called Extended Roman), and the Character Palette. To activate these, go to the Desktop menu, then to System Preferences, International, and Input Menu. (Note that the Arabic, Persian, Hebrew, and Indic scripts are not available in 10.0 and 10.1, and that Chinese and Korean may or may not be present in these versions, as explained at the end of this page.)
If you do not do a Custom Install and check the box for "fonts for additional languages," you could be missing some fonts that may be useful or essential for Arabic, Hebrew, Indic, Thai, Cyrillic, Greek, Turkish, and Romanian. If you have the 2 CD system installer, you can add these separately by inserting the second OS X 10.2 disk, opening the packages named "AdditionalFonts.pkg" and "AdditionalAsianFonts.pkg" and running the installers.
To type "accented characters" you do not necessarily need to switch to a specialized language keyboard. The standard Mac US keyboard has "dead keys" for 5 common accents activated via the Option key, and the US Extended keyboard has the same for many other diacritical marks. A Windows-style US International keyboard is also available here. Opening the KeyCaps Utility (located in Applications/Utilities) and poking the physical Option key will indicate how this works. Also here is a chart.
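The dead-key idea can be modelled in a few lines of Python. This is only a sketch: the names DEAD_KEYS and compose are made up for the illustration, and a real keyboard layout emits the finished character directly rather than going through Unicode normalization as done here. The five Option-key combinations in the table are those of the standard Mac US layout.

```python
import unicodedata

# Option-key dead keys of the standard Mac US layout, mapped to the
# Unicode combining mark each one stands for (illustrative table).
DEAD_KEYS = {
    "`": "\u0300",  # Option+` : grave
    "e": "\u0301",  # Option+e : acute
    "i": "\u0302",  # Option+i : circumflex
    "n": "\u0303",  # Option+n : tilde
    "u": "\u0308",  # Option+u : diaeresis (umlaut)
}


def compose(dead_key: str, base: str) -> str:
    """Model typing Option+dead_key followed by the letter `base`."""
    mark = DEAD_KEYS[dead_key]
    # NFC normalization merges base letter + combining mark into a single
    # code point whenever a precomposed form exists.
    return unicodedata.normalize("NFC", base + mark)


if __name__ == "__main__":
    for dead, base in [("e", "e"), ("u", "o"), ("n", "n"), ("`", "a")]:
        print(f"Option+{dead}, then {base} -> {compose(dead, base)}")
```

The dead key itself prints nothing; it only decides which accent is attached to the next letter typed.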
If you need to make unusual accented characters, like macronned vowels in a Carbon app like AppleWorks or Word, you can try the Czech or Slovak keyboards and use one of the fonts ending in CE.
Many of the available keyboards can be used with all the Carbon and Cocoa programs that run on OS X. Just as in OS 9, there is a "Flag" menu at the top right where you select the language. When Japanese, Chinese, or Korean are selected you will get the "pencil" menu and the other facilities of the input methods familiar from OS 9, which are essentially identical, although the Japanese IM has been upgraded somewhat, and the available Chinese characters are increased to 28,000. The Korean input method is that of System 9.0 rather than the somewhat different one in 9.1/9.2. For details on how to use these input methods and links to manuals, see the section above on Typing Foreign Language Texts in OS 9. For Chinese (and sometimes also Japanese and Korean as concerns applications) the key info site is the
Chinese-Mac FAQ
Apple's Traditional Chinese "pinyin" input option is considered by many users to be not very "smart." Unfortunately the alternatives often used under OS 9, Hanin and BoPoMoFo, were not available for OS X until Hanin was included in 10.2.4.
Because of the Unicode capabilities of OS X, Chinese/Japanese/Korean users potentially have access to a vastly increased number of characters (used mainly in historical documents) compared to OS 9. See the section on Unicode below for more information.
"Unicode" keyboards, including Thai, Arabic, Persian, Hebrew, Devanagari, Greek, Romanian, Slovenian, and several others, normally require a Cocoa program to function, which excludes MS Office X, IE, and AppleWorks. Examples of Unicode-savvy programs are the OS X text processor called TextEdit, the Stickies screen notes program, the Address Book, the "compose" mode of the Mail program, and the Finder. The only Carbon programs I have found in which the Unicode keyboards are available are the latest browsers, SUE, and WorldText 1.x (found in the OS X Developer Tools).
Unicode word processors worth looking at include Mellel (which has superb font control and does CJK and Arabic/Hebrew, plus characters beyond Plane 0), Nisus Writer Express, and ThinkFree Write.
Unicode-capable HTML editors include Taco HTML Edit, DreamWeaver MX 2004, Mozilla Composer, and OmniWeb. MUWSE provides a special mechanism for using the Unicode keyboards to generate strings which can be transferred to its documents, where they appear in hex NCR (&#xnnnn;) form. A more elaborate DTP and web design program is Create. Some other programs that can use the Unicode keyboards are the OmniOutliner 1.2 outlining application, the OmniGraffle diagramming/charting program, and the notepad program MoosePad.
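For readers who have not met hex NCRs (numeric character references) before, here is a minimal Python sketch of the conversion. The function name to_hex_ncr is hypothetical and this is not how MUWSE itself works internally; it simply replaces every non-ASCII character with its code point in hexadecimal, and makes no attempt to escape markup characters such as & or <.

```python
def to_hex_ncr(text: str) -> str:
    """Replace each non-ASCII character with a hexadecimal NCR."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):04X};"
        for ch in text
    )


if __name__ == "__main__":
    # A Latin letter with an accent, a Lao letter, and a Plane 2 ideograph.
    for sample in ("caf\u00e9", "\u0e81", "\U00020000"):
        print(to_hex_ncr(sample))
```

Because the result is pure ASCII, it survives being pasted into an editor or sent through a server that knows nothing about Unicode.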
For a Unicode-savvy database program, you will have to resort to the Unix side of OS X and use programs like MySQL 4.1 (works for the BMP/Plane 0) and PostgreSQL (works for all Unicode planes). The iChat and MSN Messenger chat programs are also Unicode compliant.
For more info on the significance of Unicode and on using the US Extended and Unicode Hex keyboards, see the section on Unicode below.
For a non-Apple Cyrillic keyboard and font that covers Slavonic, Old Church Slavonic, Russian, Byelorusian, Ukrainian, Serbian, Bulgarian, Macedonian, and non-Slavic Cyrillic scripts, check out Slavija.org.
Devanagari input has a bug which makes it difficult or impossible to edit text in Cocoa apps after it has been composed. Hopefully this will be fixed in a future update. (One work-around for conjuncts involves typing them as c-c-leftarrow-f-rightarrow instead of the usual c-f-c.) Input in RTL scripts also reportedly has some problems, including display overlaps, cursor jumping, inability to handle diacritics correctly, and difficulty embedding non-RTL scripts.
An addition which can help deal with some glitches in OS X's handling of Hebrew and Arabic is DirectionService.
For a non-Apple Chinese input system, you can check out PanALEX. The Chinese IM has a large number of input options, and there is onscreen help available in English. For Taiwanese, see Jason Cox's Page.
If you want to make your own keyboards, there are a couple of different approaches, often depending on whether a Unicode keyboard is required. Apple Tech Note 2056 has some information on various options. For Unicode keyboards, you can compose an XML .keylayout file along the lines of those contained in /System/Library/Keyboard Layouts/Unicode.bundle/Contents/Resources. An online utility for doing this can be found here. Another keyboard editing program can be found here. For non-Unicode scripts, you can take an existing keyboard from OS 9, rename it as a .rsrc file, and put it into /Library/Keyboard Layouts/. You can also modify such keyboards using the ResEdit program. Here are some instructions for editing a kchr resource and this site has similar instructions in French.
If there is a Unicode keyboard that you want to use in a non-Unicode-savvy app, like AppleWorks, you may be able to modify it to work in some cases. For example, in the Slovenian.keylayout file, change the id code number to something positive and also change the keyboard group number from 126 to 29 (for the CE script). Then change the name to SlovenianCE.
Unicode keyboards for Lao, Tibetan, Biblical Hebrew, Esperanto, Pinyin, Hausa, and Navajo (plus alternative keyboards for Farsi/Persian, Brazilian, Polish, Canadian French, and US International) can be found here. Also available are a super-comprehensive Latin Extended Keyboard and a keyboard for Aramaic. Another source for QWERTY keyboards in several languages, plus Armenian, Georgian, and Thaana, is here.
For information on IPA fonts and keyboards, see the Other Resources section at the end of this page.
To install keyboards that you download or create yourself, put them in Users/username/Library/Keyboard Layouts (or in Library/Keyboard Layouts if all usernames need access to them). Then go to System Preferences/International/Input Menu and check the box for the new keyboard. You may need to log out and log in again to have it appear.
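If you install layouts often, the copying step can be scripted. The following Python sketch assumes a Python 3 interpreter is available; the function install_keyboard_layout and its parameters are hypothetical names invented for this example. It only copies the file into place; you still have to tick the box in System Preferences afterwards.

```python
import shutil
from pathlib import Path


def install_keyboard_layout(layout_file, home=None, all_users=False):
    """Copy a keyboard layout file into a Keyboard Layouts folder.

    home:      the user's home folder (defaults to the current user's).
    all_users: if True, install into /Library instead, which normally
               requires administrator rights.
    Returns the path of the installed copy.
    """
    if all_users:
        target_dir = Path("/Library/Keyboard Layouts")
    else:
        base = Path(home) if home is not None else Path.home()
        target_dir = base / "Library" / "Keyboard Layouts"
    # Create the folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 returns the path of the new file inside target_dir.
    return Path(shutil.copy2(layout_file, target_dir))


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        print("Installed:", install_keyboard_layout(sys.argv[1]))
```

Saved under a name of your choice, the script takes the path of a downloaded layout file as its single argument.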
A number of online pseudo-keyboards, covering Armenian, Bengali, Cherokee, Cirth, Devanagari, Etruscan, Georgian, Gothic, Gujarati, Gurmukhi, Kannada, Khmer, Lao, Malayalam, Myanmar, Ogham, Old Italic, Old Persian Cuneiform, Oriya, Runic, Tamil, Telugu, Tengwar, and Ugaritic, are available here. Many of these can be used with the more advanced browsers to create strings of odd scripts for copy/paste operations. For best results, use Opera 6.
OS X includes a system-wide spell-checker, which is accessible from any Cocoa program via the Edit/Spelling menu. In addition to US English, 10.2 has dictionaries for British English, German, Spanish, French, Italian, Dutch, Portuguese, and Swedish. To switch dictionaries go to Edit/Spelling/Spelling/Dictionary. A non-Apple Cocoa spell-checker covering Breton, Catalan, Czech, Danish, Dutch, German, Greek, Esperanto, Faroese, French, Italian, Norwegian, Polish, Portuguese, Russian, Spanish, Swedish, and Ukrainian is CocoaSpell. The stand-alone spell-checker Excalibur can be used in both Cocoa and Carbon environments, and has dictionaries available for British, Catalan, Danish, French, Dutch, German, Haitian, Indonesian, Italian, Manx, Norwegian, Portuguese, Spanish, and Swedish.
For commercial bilingual dictionaries and related tools, check out the products of Ultralingua.
For reading foreign language web pages, the browser which comes with OS X, IE 5.2, is definitely not the best. Among the standard alternatives, the latest versions of Mozilla, Opera, Chimera Navigator, and Netscape offer much better performance with a wide variety of scripts. The Mac-only browser OmniWeb is also excellent but cannot do Arabic and Hebrew. Apple's Safari, released in beta in January 2003, is of the same quality and can do Arabic/Hebrew as well.
When appropriate fonts are installed, the better browsers have many encoding choices and can display a large number of languages and scripts, even on the same page. They convert all incoming characters to Unicode, and then search all installed fonts for corresponding glyphs.
Opera 6, the latest version of Mozilla, OmniWeb 4.2 and up, and (with some glitches) Safari are the only browsers which can read pages that employ Unicode beyond the Basic Multilingual Plane (BMP).
A good source for info on several OS X browsers is the
Mac Multilingual
Browser Page
Reading web pages in Arabic, Persian, or Hebrew is more difficult than for most other languages, and often depends on the browser, your fonts, and the page. Opera is probably the best browser for Arabic. The latest versions of the Mozilla web browser for OS X are reported to do a better job of displaying Hebrew than the alternatives. The Camino browser is essentially the same. See the section on Troubleshooting below for suggestions if Arabic pages do not display properly.
OS X, unlike OS 9, can make routine use of big Windows fonts which contain characters for dozens of languages. Note, however, that viewing complex scripts which require reordering, contextual shaping, or stacking of characters (such as Arabic, Devanagari, Tibetan, Classic Mongolian, and Thai) requires a combination of font and rendering engine technology. On the Mac this is accomplished via an AAT (Apple Advanced Typography) font and ATSUI, while Windows uses an OpenType font plus Uniscribe. The result is that when you select a Windows font in OS X, complex scripts are unlikely to display correctly, and an Apple font should be used if available.
The largest easily available Windows font I am aware of is Code2000 with 30,000 characters. If you have access to Arial Unicode MS, provided with certain MS products, this is still larger, with 50,000 characters. The multilingual capabilities of various browsers under OS X can be demonstrated by installing these and going to UTF-8 Sampler or Alan Wood's Unicode Sample Pages.
OS X has a built-in font inspector called the Character Palette, found in the Flag (keyboards) menu. This shows all the characters in a selected font in any Unicode range, and allows you to copy/paste them into documents. Note that copy/paste will not work for many characters into non-Unicode-savvy apps like Word X and AppleWorks. Similar utilities are UnicodeChecker and Unicode Font Info. UnicodeChecker includes some very useful conversion capabilities accessible from the Services menu (as long as you install the application in the Applications folder).
TextEdit can save plain text in 82 different encodings. To see them all, open the encoding menu in the Save dialogue and check "Customize Encoding Menu."
Further details on OS X Unicode reading and input capabilities, including CJK Extension B in Plane 2 and scripts in Plane 1, are contained in the section below on Unicode.
OS X also allows the sharing of files named in multiple languages over a network. If you use the Go/Connect to Server dialogue, after you enter a password you can select the character set from among Arabic, Central European, Chinese, Croatian, Cyrillic, Greek, Hebrew, Icelandic, Japanese, Korean, Romanian, Thai, Turkish, and Western.
The Mail program included with OS X automatically searches for glyphs in installed fonts for whatever encoding is indicated on the incoming text. The user can also change the encoding for received messages from the Format/Text Encodings menu. There are 35 options covering Western, Japanese, Korean, Arabic, Hebrew, Greek, Cyrillic, Thai, Chinese, Central European, Vietnamese, Turkish, Baltic Rim, and UTF-8. Unfortunately these choices are grayed-out for sending messages, which is precisely where they are really needed. Mail on its own occasionally uses something uncommon or completely inappropriate, which can make it very hard for recipients to read your mail.
One way around this problem is to modify the localization of your system. For example, in order to send Simplified Chinese as GB (rather than the uncommon UTF-8) or to send Traditional Chinese mail as Big5 (instead of the erroneous ISO-2022-JP, which will happen if all the characters are contained in the latter), temporarily put Simplified or Traditional Chinese at the top of the list in System Prefs/International/Languages before opening Mail. If you want to keep Mail in English, deselect Traditional Chinese in Mail's Get Info dialogue. Alternatively you can put Mail into Traditional Chinese via the Get Info dialogue and leave System Prefs alone. Or you can set up a new User with these parameters. Or you can add a character to your message which is definitely in Big5 and not in ISO-2022-JP (like U+2013). Or you can force the message into UTF-8 by adding the character U+2015. The same trick will likely work with other languages where Mail's outgoing encodings are wrong.
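The behaviour described above, where a message goes out as ISO-2022-JP whenever all of its characters happen to fit, can be imitated with a small Python sketch. This is an assumption about the general shape of the logic, not Mail's actual code: the function name pick_encoding, the "first candidate that fits" rule, and the order of the candidate list are all invented for the illustration. The encoding names are those of Python's standard codecs.

```python
def pick_encoding(text, candidates=("iso2022_jp", "big5", "utf_8")):
    """Return the first candidate encoding able to represent `text`.

    Hypothetical model of a mail client that tries a fixed list of
    encodings in order and settles on the first one that works.
    """
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # some character is missing from this character set
        return name
    raise ValueError("no candidate encoding can represent the text")


if __name__ == "__main__":
    samples = {
        "characters shared with Japanese": "\u65e5\u672c",
        "includes a character absent from the Japanese set": "\u4f60\u597d",
        "Devanagari": "\u0928\u092e",
    }
    for label, text in samples.items():
        print(f"{label}: {pick_encoding(text)}")
```

Under this model, adding a single character that the earlier candidates lack is enough to push the whole message into a later encoding, which is the idea behind the tricks mentioned above.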
A Unicode-savvy mail client similar to Mail is GyazMail. The program has two abilities still lacking in Mail, namely it can set the default encoding for incoming messages and also set the encoding for outgoing messages. To activate the latter, you must go to View/Customize Toolbar when in "new message" mode and add the Encoding selector to the toolbar. GNUMail.app has similar capabilities. Another alternative is to use the Mail programs included in Mozilla or Netscape 7 for OS X, which seem to be able to handle outgoing encodings without problem.
Cyclone and Codepage Converter are good utilities for translating texts from one encoding to another.
OS X includes the Darwin OS, based on the FreeBSD variety of Unix, which can be accessed via the Terminal program in Applications/Utilities. Terminal offers the choice of two shells, tcsh and zsh, and 12 encodings (including ISO-2022 and EUC Japanese, Latin 1 and 2, MacRoman, and UTF-8). To see file names in their proper script, use ls -v. The Unix X Window GUI (one version here, another here) can be added to Darwin, and in principle this can be internationalized by modifying various parameter files. Open Office is a suite of programs designed to run in X Window which should eventually have multilingual capabilities. For info on this, you can consult the OO OS X testing forum at OOoDocs and Apple's X11-user mailing list.
Finally, OS X, via the program Virtual PC for Mac allows you to run WindowsXP Home Edition (and other Windows OS's), which have extensive language capabilities of their own.
The large variety of mail clients, platforms, and encodings, plus the potential influence of intervening internet equipment, makes it impossible
to generalize regarding the sending/receiving of email in foreign languages on the Mac. An internet search using the keywords "email" and the name of the language will normally produce a page or two of wisdom on how best to do it.
One technique worth trying is to employ the mail client contained in Netscape or Mozilla, since these are programs which lots of people have (even if they don't normally use them), and the encoding for both sending and receiving can be set via the View/Character Set menu item. Outlook Express has similar capabilities, with the encoding set in the Format/Character Set menu.
The mail client Eudora requires the addition of special "tables" to function properly with many languages. When these are installed you can choose character sets via the Message/Change/Transliteration menu. John Delacour offers means for Eudora to do Unicode UTF-8 here.
The email program Magellan is especially designed to handle multilingual text, as is PowerMail.
The Mail program included with OS X, and similar clients with comparable Unicode capabilities, are discussed in the OS X section above.
Traditionally computer systems could deal with only a limited number of distinct characters at once. Handling diverse languages meant remapping the same 256 codes to different characters for each one, using a font specifically designed for it. Successful communication over the internet sometimes required synchronizing the fonts at each end and translating among a couple dozen mutually incompatible character set standards, a list of which you can find in the "character encoding menu" of any browser or email program.
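To make the remapping problem concrete, here is a minimal Python sketch (Python is not part of the original page, and the choice of byte value and codecs is purely illustrative). It decodes the same single byte under four legacy 8-bit character sets and shows that each one yields a different character.

```python
# One byte value, interpreted under several legacy 8-bit encodings.
raw = bytes([0xE9])

# These are Python's built-in codec names for the character sets;
# the selection is an illustrative assumption, not a list from this page.
legacy_codecs = ["latin_1", "mac_roman", "cp1251", "koi8_r"]


def decode_all(data, codecs):
    """Return {codec_name: decoded_text} for the same bytes under each codec."""
    return {name: data.decode(name) for name in codecs}


results = decode_all(raw, legacy_codecs)
for name, text in results.items():
    # Show the character and the Unicode codepoint it corresponds to.
    print(f"{name:10} 0xE9 -> {text} (U+{ord(text):04X})")
```

If the sender and receiver disagree about which of these tables is in force, the byte arrives intact but the reader sees the wrong letter.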
The development of Unicode, which is the agreed international standard for the unique encoding of all the characters used in different languages, changes this situation radically for the better. By creating a single character set that covers all scripts, Unicode allows the reading and writing of texts in any language, or the simultaneous display of many languages, without changing encodings and fonts. It should eventually become the common basis for text processing across all platforms and programs. A recent New York Times interview provides some useful general info.
The basic principle of Unicode is to assign a unique number (usually expressed in hexadecimal form) to every character. 1.1 million "codepoints" have been allocated for this purpose, divided among 17 "planes" with about 65,000 codepoints each. All characters in common use have been assigned to Plane 0, also known as the Basic Multilingual Plane (BMP), and some others have been placed into Planes 1, 2, and 14, as part of an ongoing process. Under the current version, Unicode 4.0, just over 96,000 characters have been allocated (plus roughly 137,000 codepoints reserved for private use), and another 90 or so scripts are in the pipeline under consideration by various committees. For further information see the Roadmap to Unicode and Michael Everson's paper Leaks in the Unicode Pipeline.
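The plane arithmetic can be sketched in a few lines of Python. This is only an illustration of the numbering scheme described above; the function name plane_of is hypothetical.

```python
PLANE_SIZE = 0x10000   # 65,536 codepoints per plane
PLANE_COUNT = 17       # planes 0 through 16
MAX_CODEPOINT = PLANE_SIZE * PLANE_COUNT - 1   # 0x10FFFF


def plane_of(codepoint):
    """Return the plane number (0-16) that a codepoint belongs to."""
    if not 0 <= codepoint <= MAX_CODEPOINT:
        raise ValueError(f"not a Unicode codepoint: {codepoint:#x}")
    # Each plane spans 0x10000 values, so the plane is the part above 16 bits.
    return codepoint >> 16


print(PLANE_SIZE * PLANE_COUNT)   # total codepoints, about 1.1 million
for cp in (0x99AC, 0x10400, 0x20000, 0xE0001):
    print(f"U+{cp:04X} is in plane {plane_of(cp)}")
```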
In practice Unicode data is represented by one of several possible "transformation formats," or UTFs. There are two common ones, UTF-16 and UTF-8. However, only UTF-8 is normally used over the internet. Unfortunately some Mac programs use the word Unicode in their encoding menus to mean UTF-16, so users need to watch out for this and specifically select UTF-8 when dealing with Unicode web pages and email. (Email also often has an additional "content transfer encoding," either "base64" or "quoted-printable," which is not related to language or character set issues.) Here is a summary of some UTF details.
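As a rough illustration of the difference, the following Python sketch encodes one short sample string (chosen here for illustration) as both UTF-8 and UTF-16, then wraps the UTF-8 bytes in the two email transfer encodings. It uses only the standard library codecs plus the base64 and quopri modules.

```python
import base64
import quopri

# Sample text: "cafe" with an accented e, a space, and the Chinese
# character for "horse" (U+99AC).
text = "caf\u00e9 \u99ac"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-be")   # big-endian, without a byte order mark


def hex_dump(data):
    """Format bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


print("UTF-8 :", hex_dump(utf8))
print("UTF-16:", hex_dump(utf16))

# The content transfer encoding only wraps the bytes for transport;
# decoding it gives back exactly the same UTF-8 data.
b64 = base64.b64encode(utf8)
qp = quopri.encodestring(utf8)
print("base64          :", b64.decode("ascii"))
print("quoted-printable:", qp.decode("ascii"))
```

The same six characters take a different number of bytes in each format, which is why a program that labels UTF-16 simply as "Unicode" produces data that a UTF-8 reader cannot interpret.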
Mac OS 9, actually beginning with OS 8.5, includes some limited support for Unicode. For example, the most advanced OS 9 browsers like Mozilla 1.2 can read UTF-8 web pages with Chinese, Japanese, Korean, Cyrillic, Greek, Arabic, Hebrew, and Devanagari, plus some languages using accented Roman if fonts such as Everson Mono Unicode and Gentium are installed. The MacBrowsers Page provides more info.
On the input side, if the Unicode language kit is installed, you can see two new items at the bottom of the Keyboard (flag) menu: Unicode Hex Input and Extended Roman (U). There are currently two (experimental) text editors available online which can make use of this OS 9 input support:
MLTE (Multilingual Text Editor)
SUE (Simple Unicode Editor)
Mozilla 1.2 Composer can be used in a similar way.
Also OS 9.1/9.2 includes a Unicode text editor called World Text, which works only on those systems. It is basically a Unicode-savvy version of SimpleText, but can work with files larger than 32K and also embed pictures, sound, and movies. A carbonized version of this is available in OS X. This page has more info.
To use the Unicode Hex Input system, you hold down the Option key and (for a character in the BMP) enter the 4-digit Unicode hex code. For example, 99AC gives the Chinese character for "horse." Any Mac or Windows TrueType Unicode font containing the characters you want to generate can be used.
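The mapping from a hex code to a character can be checked outside the keyboard system. Here is a small Python sketch; the helper name char_from_hex is hypothetical, and unicodedata is Python's standard character database module.

```python
import unicodedata


def char_from_hex(hex_code):
    """Return the character for a hex codepoint string such as '99AC'."""
    return chr(int(hex_code, 16))


horse = char_from_hex("99AC")
# Print the character together with its formal Unicode name.
print(horse, unicodedata.name(horse))
```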
The US Extended input system (called Extended Roman in 10.1 and earlier) lets you access a more limited set of characters, namely Unicode Latin Extended-A, via various key sequences. You can see how this works with the KeyCaps utility or on this page. Also available is a non-Apple Latin Extended Keyboard.
For a guide to Unicode fonts (produced essentially for Windows) of various sizes and capabilities, see
Alan Wood's Font Page.
Mac OS X, covered in the section above, has a much broader level of Unicode support than OS 9. Under OS X 10.1 and higher, with appropriate fonts installed, TextEdit can read characters in Unicode Planes 1 through 16, in addition to the usual Plane 0. The Unicode Hex Input system can also type characters from Planes 1 and above if you know the pair of 4-digit hex "surrogates" which represent them (just input the two sequences in succession). The same range of characters can be copy/pasted from the Character Palette. Custom keyboards, based on XML text files, can be created to access and input any desired set of Unicode characters.
OmniWeb 4.2 and higher, Opera 6, Safari, and Mozilla can display characters beyond the BMP in UTF-8. Both OmniWeb and Opera 6 can read such characters (assuming the font for them is installed) if a web page is encoded in UTF-16. An example is at Tex Texin's Unicode Examples Page.
One way to find the surrogate pairs for a given character code (or the character represented by a pair of surrogates) is to use Michael Kaplan's UTF-32 to 16 Translator.
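The conversion itself is simple arithmetic defined by UTF-16, so it can also be done by hand or with a short script. The following Python sketch shows both directions; the function names are hypothetical and it is not a reproduction of the translator mentioned above.

```python
def to_surrogates(codepoint):
    """Split a codepoint above the BMP into its (high, low) surrogate pair."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("only codepoints in Planes 1-16 use surrogates")
    offset = codepoint - 0x10000          # a 20-bit value
    high = 0xD800 + (offset >> 10)        # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into one codepoint."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


for cp in (0x10400, 0x1D11E, 0x20000):
    high, low = to_surrogates(cp)
    print(f"U+{cp:05X} -> {high:04X} {low:04X}")
```

The two 4-digit values printed for each character are the sequences you would type in succession with the Unicode Hex Input keyboard.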
A beta test font for Plane 1 and some other areas (planes 0 and 15) is Code2001, which contains characters for Old Persian Cuneiform, Deseret, Tengwar, Cirth, Old Italic, Gothic, Aegean Numbers, Cypriot Syllabary, Pollard Script, and Ugaritic.
The Hiragino Japanese font in OS X includes a small number of characters in Unicode Plane 2 (including about 300 from JIS X 0213), which can be accessed via the JLK character palette or the Unicode Hex Input keyboard (and some are in the phonetic input dictionary as well). The font Ming Uni provides access to the HKSCS-2001 character set, which also includes 1650 characters in Plane 2. A much more complete font for Plane 2 is Simsun (Founder Extended), which comes with MS Office XP and includes 37,000 Chinese Mainland, HKSAR, and Taiwan characters (in addition to another 28,000 Chinese and many western language characters from Plane 0). Info on its contents can be found here.
To see what Unicode characters are available on your system, a good utility is UnicodeChecker. It covers all 17 Unicode planes, can be searched by character block or name, and characters can be copy/pasted into TextEdit.
In OS X, Symbol and Zapf Dingbat characters are also produced using Unicode fonts, so that special keyboards (10.1) or the Character Palette (10.2) need to be activated in order to type them (you cannot just select the font as was possible in OS 9). This is explained in TIL 106731. If you need Wingdings-like symbols, use the Webdings font and look in the Unicode Private Use range in the Character Palette.
For codepoints in the Unicode Private Use Area (PUA) used by Apple, see this page.
The ability of applications to use OS X's excellent Unicode support varies widely. Cocoa programs like TextEdit are normally "Unicode-savvy": They can accept Unicode input via keyboard or copy/paste, and save and open Unicode text. Carbon programs are usually only "Unicode-aware" and lack key features. For example, Word X does not accept direct Unicode input, but it can save text as UTF-16 and HTML as UTF-8 (although it can only open the latter). Some Carbon programs, including AppleWorks, and almost all Classic programs, are "Unicode-deaf" and can neither input, save, nor open Unicode text.
1. OS X - Character Palette Gone: Try trashing the file Users/username/Library/Preferences/com.apple.CharPaletteServer.plist
2. OS X - International Preferences Pane Gone: Try trashing the file com.apple.preferencepanes.cache in your Users/username/Library/Caches folder.
3. OS X - The Keyboard-Font Synchronization Option Doesn't Work: This is a left-over from OS 9 and only functions with certain non-Cocoa apps, including SimpleText and WorldText and maybe AppleWorks (some versions have a bug which prevents it).
4. OS X - KeyCaps Application Doesn't Work: Make sure the KeyCaps application is in front, with its menus showing. Make sure both the right keyboard and font are selected (CE keyboards require a font with CE at the end). Use the physical (not virtual) Option and Shift keys to see changes. Try trashing the file com.apple.keycaps.plist in your Users/username/Library/Preferences folder.
5. OS X - OS 9 Asian-Script File Names Are Mangled: A possible fix for this is Apple's File Name Encoding Repair Utility.
6. OS X - System Language is set to Y, but Folder Names Are Still In English: Try going to Finder Preferences and checking the box "Hide File Extension."
7. OS X - Some Keyboards Missing: Install the Additional Fonts Packages from the 2nd Install CD. For Cyrillic in 10.1, install the Cyrillic language kit in your Classic/OS 9.
8. OS X - Japanese Input Method Stalls: See here.
9. OS X - Can't Copy/Paste Language X from Browser Y Into Program Z: Try using Mozilla instead of browser Y. Try using TextEdit instead of Program Z. Try pasting into TextEdit, changing to Rich Text, and then copy/pasting into Program Z.
10. OS X - Can't Type Cyrillic or Central European in Program X: Make sure you install the "additional fonts" in the 2nd install CD. Some programs require that you select fonts which end in CY or CE for correct input in these scripts.
11. OS X - I Get an Accented E When I Type ?: Somehow the Canadian keyboard has been activated. Go to System Prefs/International/Input Menu and change the selection to U.S.
12. OS X - Can't Input Language X Correctly into InDesign, Dreamweaver, etc: It may be useful to query participants in the Adobe or Macromedia support forums.
13. OS X - Can't Read Webpages in X Language: Install Additional Fonts Packages from 2nd Install CD, and use a good multilingual browser like Mozilla, Opera, or Safari (not IE or iCab).
14. OS X - Keyboard Choice Won't Stick: Try disabling automatic login, then logging in as Root and resetting your keyboard there.
15. OS X - Can't Type Accented Characters Like I Always Did in Windows: Macs use the Option key to access various accent dead keys. For a simple keyboard that works like the Windows US International, try USIntl.rsrc available here. (Use a Mac to access this link.)
16. OS X - Upgrading to 10.2.x Wipes Out Non-Roman Languages: Try removing files /Library/Caches/IntlDataCache.501 and /Library/Caches/IntlDataCache.sbdl.501, reset prefs in System Prefs/International/Languages, and logout/login. Try logging in as a different user to see if the problem is system-wide or just one account. Also you can clean caches with Jaguar CacheCleaner.
17. OS X - My Username Is In Asian Characters Instead of What It Should Be: Try adding a new user so that you have an odd number of them, or change the names of all your users so they begin with different letters.
18. OS X - Arabic Web Pages Don't Display Properly: Make sure you are using Opera (best), Mozilla/Netscape, or Safari and not some other browser, and that the "fonts for additional languages" package is installed. Try installing the font Arial Unicode (if you have it with an MS application) or the 12 Java Lucida fonts in Users/username/library/fonts (the Lucida fonts should be copied from the files for Java 1.4.1 - do a Sherlock search to find them). Try removing the font DecoTypeNaskh.ttf from Library/Fonts and all the Arabic/Persian fonts from your OS 9 System Folder/Fonts folder.
19. Classic - Can't Install Extra Keyboard in Classic System File on Machine That Doesn't Boot Into OS 9: Try copying the Classic System file and moving the copy to a machine that does boot into OS 9, installing the extra keyboard there, moving it back to the original machine, and replacing the original with the modified version.
20. OS 9 - Can't Find Language Kits: Look on the CD in the folder OS9 Applications/Apple Extras for an installer. Also look for a folder with the same name on your hard drive if you have used Restore disks to install your OS9.
21. OS 9 - Need Non-Roman Fonts in System to Have Certain Menus and File Names Show Up Correctly: Information on how to hack your system to use non-Roman fonts can be found here. (Note that this does not translate system dialogues, etc. into another language.)
22. General - My Foreign Language Web Page is Fine Locally But Garbage on the Server: Make sure the FTP or other program you use to upload the page is set to allow double-byte text. If using Fetch, make sure you uncheck the item "Translate ISO Characters" in Customize > Preferences > Misc.
Online Translation Sites and Translation Software
Babelfish
Logomedia
Translation Experts
Translation.net
Commercial Fonts for Unusual Languages
Linguist's Software
XenoType Technologies
Non-commercial Fonts
Everson Mono Unicode
Gentium
Everson's Worldscript
Utilities
Yamada Font Archive
SIL
Font Creation Tools
FontLab TypeTool
Apple Font Tools
PfaEdit
African Languages
Bisharat.net
Amharic
Wazema System
Aramaic, Syriac, Assyrian
Aramaic Keyboards
Aramaic Site List
Arabic
The Arabic Mac
Mac4Arabs
WinArabic
Arabic Genie
Baltic Languages
DekSoft
Chinese
The Chinese Mac FAQ
Cuneiform Scripts
Cuneiform Digital Library Initiative
Akkadian Language Page
Sumerian Language Page
Sumerian Sign List
Sumerian Script Example
Old Persian Test Page
Cyrillic Scripts
Macintosh.ru
Cyrillic WordProcessing Solutions
Mac Cyrillic Fonts and Keyboards
Help Me Learn Church Slavonic
Greek (Modern)
Hellenic Resources Network
Rainbow Computing
Greek (Ancient)
Stoa Consortium
GreekKeys
Hebrew
Hebrew on the Internet
MacHebrew
HUJI Hebrew Internet Support
Icelandic
Apple Iceland
Inuktitut
Nunacom
Font
IPA (International Phonetic Alphabet)
IPA Unicode Keyboard
SIL Encore Fonts (non-Unicode)
SIL Unicode Font
Japanese
ErgoSoft (EGBridge, EGWord)
Justsystem (ATOK)
MacFan
Korean
Patchman's Patches (Hangul for various programs)
Lao
JGLao Font
Saysettha Unicode Font
Manchu/Mongolian
Hash's Cyrillic Mongolian Keyboard and Font
Tom's Manchu Test Page
Andrew's Mongolian Test Page
Andrew's Manchu Test Page
Navajo
Navajo on the Web
Persian
FarsiWeb
Taiwanese
Jason Cox's Taiwanese Page
Tengwar, Cirth, and other Tolkien Languages
Ardalambion
UTF-8 Tengwar Test Page
UTF-8 Cirth Test Page
Thai
Apple Thailand
Software.Thai.Net
Tibetan
S-B CRI Tibetan Language Kit
Nitartha Tibetan Software
Tibetan Test Page
Tibetan Uchen Font
XenoType Tibetan Language Kit
uTibetan Font
Turkish
Turkish Fonts and Computing
Bilkom (Apple Turkey)
Ottoman Text Projects
Vietnamese
Viet-nam.org
Nom Preservation Foundation
Mojikyo (Chu Nom Script)
Yiddish
Mac OS 10 Yiddish Computing
Mac OS X Troubleshooting
The X Lab
Foreign Language Newspapers Online
Yahoo
Newspaper List
Kiosk
Sources of Language Instruction Materials
Audio Forum: The Language Source
Schoenhof's Foreign Books
Cheng and Tsui Books
Languages and Writing Systems Reference Books
Blackwell Encyclopedia of Writing Systems (Coulmas)
World's Major Languages (Comrie)
World's Writing Systems (Daniels and Bright)
Note on OS X 10.0 Chinese/Korean
During June, 2001 Apple also issued a separate OS X 10.0.3 CD (build 4M23) for Asia and Scandinavia with the system languages English, French, German, Swedish, Norwegian, Finnish, Danish, Brazilian-Portuguese, Chinese (Traditional and Simplified), Japanese, and Korean. Chinese and Korean input methods are functional in this version. According to TIL 25317, this CD is not available in the U.S. and must be installed on a separate partition from the US version of OS X.
Note on OS X 10.1 Chinese/Korean
In the most common variety of the OS X 10.1 upgrade (build 5G64 before subsequent updates, number begins with 1Z) there was no Apple "input method" for Korean and Chinese, which means you could read but not write them. There was a second variety of the 10.1 upgrade disk that installed build 5J34 (rather than 5G64, number begins with 2Z) and which had the Chinese and Korean input methods included (as well as Asian and Scandinavian system languages).
Copyright 2000,2003 by Thomas H. Gewecke