Alan Wood’s Unicode Resources

Unicode and Multilingual Editors and Word Processors for Mac OS X



Introduction

Mac OS X did not originally support as many languages and scripts as Mac OS 9. Mac OS X 10.1 supported Central European, Cyrillic and Japanese; Korean, Simplified Chinese and Traditional Chinese were made available as downloads. Mac OS X 10.2 introduced support for Arabic, Devanagari, Greek, Gujarati, Gurmukhi, Hebrew and Thai scripts. Mac OS X 10.3 introduced support for Armenian, Unified Canadian Aboriginal Syllabics and Cherokee scripts.

The editors listed below are available in versions designed for Mac OS X; editors designed for Mac OS 9 can be used in Classic mode.



BBEdit

BBEdit is a text editor for Mac OS X 10.3.9 or later that includes extensive support for producing HTML files and program code, as well as plain text files. It can edit text in several left-to-right languages and scripts, including double-byte scripts, and it supports the Mac’s Unicode keyboards. Older versions allowed only one font to be active at a time, so only one non-Latin script plus unaccented Latin characters could be displayed properly at once; current versions can display multiple scripts simultaneously. It can use any installed Web browser for WYSIWYG preview. Files that contain multiple scripts can be opened and saved with UTF-8 or UTF-16 character encoding.

Screen shot: BBEdit displaying multiple scripts simultaneously (courtesy of Mark Garrett)

HTML tags and attributes can be typed directly, or selected from a floating palette or a menu, and are shown in user-selectable colours. BBEdit includes an HTML syntax checker, and a link checker for links within your site.

It is produced by Bare Bones Software, Inc. and costs US $199.00 plus shipping. A trial copy that can be used for 30 days is available.



jEdit

jEdit is a Unicode text editor that is written in Java and can run under Mac OS X, Linux and Windows. It can be used with any text file, but is intended for editing programming and markup languages, and has syntax colouring for over 60 of these, including HTML and XML. jEdit can open and save files with any encoding that is supported by Java, including UTF-8 and UTF-16. It can use any of the normal Mac OS X keyboards, but not the Unicode Hex Input keyboard.

Screen shot: a multi-script HTML document with UTF-8 encoding in jEdit

For multi-script documents, it is convenient to use a large Unicode font such as Arial Unicode MS. To change the default font:

  1. Click the jEdit title bar, to make sure that it is the current application.
  2. On the Utilities menu, select "Global Options…".
  3. In the Global Options dialog box, select "Text Area" under jEdit Options.
  4. Click the font name in the box to the right of "Text font:".
  5. In the Font Selector dialog box, choose a "Font family" (e.g. Arial Unicode MS), and optionally choose a font size and style.
  6. Click "OK" to close the Font Selector dialog box.
  7. Click "OK" to close the Global Options dialog box.

jEdit is produced by Slava Pestov and is freeware. For more information and to download the software, visit the jEdit - Open Source programmer's text editor Web site.



Mellel

Mellel is a Unicode-aware word processor that is designed for Mac OS X and supports many scripts and languages including Latin, Cyrillic, Greek, Arabic, Farsi, Hebrew, Chinese, Japanese and Korean. In addition to its native format, it can import and export RTF files (including multi-script files from Word for Windows) and plain text files with Mac, Windows and ISO encodings. It can use the normal Mac OS X keyboards and the Unicode Hex Input keyboard.

Screen shot: a multi-script document in Mellel

The program is still being developed, and future plans include HTML import and export.

Mellel is produced by RedleX and costs US $39; a free trial version is available. More information and downloads are available from the Welcome to RedleX - Creators of Mellel Web site. The optional downloads include Arabic and Hebrew keyboards and Persian fonts.



Mozilla Composer

The Composer component of Mozilla is a multilingual HTML editor that supports Unicode and can edit files in WYSIWYG, WYSIWYG plus tags and plain HTML modes. It supports Apple’s Unicode Hex Input and Extended Roman keyboards.

Mozilla Composer screen shot.

Mozilla Composer can produce files that include multiple scripts and languages, and it can save HTML files with UTF-8 character encoding.

By default, Mozilla Composer re-formats your HTML code to conform to its idea of good style. To turn off this option, so that HTML formatting is left alone:

  1. Click the Mozilla title bar to ensure that it is the current application.
  2. Click “Mozilla” on the menu bar at the top of the screen.
  3. Click “Preferences...” on the Mozilla menu.
  4. In the Preferences dialog box, click “Composer” in the list of categories.
  5. In the When Saving Files section, click the radio button for “Retain original source formatting”.
  6. Click the “OK” button to close the Preferences dialog box.

Composer is available only as part of Mozilla, which includes the Mozilla Navigator Web browser and can be downloaded free of charge from http://www.mozilla.org/releases/.



Netscape Composer 6.2

The Composer component of Netscape 6.2 is a multilingual HTML editor that supports Unicode and can edit files in WYSIWYG, WYSIWYG plus tags and plain HTML modes. It does not yet support Apple’s Unicode Hex Input and Extended Roman keyboards.

Composer 6 screen shot.

Composer 6.2 can produce files that include multiple scripts and languages, and it can save HTML files with UTF-8 character encoding.

By default, Netscape Composer re-formats your HTML code to conform to its idea of good style. To turn off this option, so that HTML formatting is left alone:

  1. Click the Netscape title bar to ensure that it is the current application.
  2. Click "Edit" on the menu bar at the top of the screen.
  3. Click "Preferences..." on the Edit menu.
  4. In the Preferences dialog box, click "Composer" in the list of categories.
  5. In the When Saving Files section, click the radio button for "Retain original source formatting".
  6. Click the "OK" button to close the Preferences dialog box.

Composer is available only as part of Netscape 6.2, which includes Netscape Navigator and can be downloaded free of charge from Netscape 6 Release.



Nisus Writer Express

Nisus Writer Express is a word processor for Mac OS X 10.3 or later. Its preferred file format is Rich Text Format (RTF), but it can also open and save Rich Text Format Directory (RTFD), Microsoft Word, WordPerfect, AbiWord and HTML files. It can open and save text files in UTF-8, UTF-16 and several other encodings. It supports all of the keyboards for left-to-right scripts, the IMEs for CJK, and Apple’s Unicode Hex Input keyboard driver, which allows you to enter any Unicode character by holding down the Option key while typing its 4-digit hexadecimal code, e.g. 0E05 for the Thai character kho khon. From version 2.5, it supports editing of Arabic and Hebrew.
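The four hexadecimal digits typed with the Unicode Hex Input keyboard are the character's Unicode code point. As a rough illustration only (this is a Python sketch, not how the keyboard driver is implemented, and the function name char_from_hex is hypothetical), the mapping from digits to character can be reproduced like this:

```python
import unicodedata


def char_from_hex(hex_digits: str) -> str:
    """Return the character whose code point is given as four hex digits.

    Hypothetical helper for illustration; it mirrors what the user types,
    not the internals of Apple's keyboard driver.
    """
    if len(hex_digits) != 4:
        raise ValueError("expected exactly four hexadecimal digits")
    # int(..., 16) parses the digits; chr() maps the number to a character.
    return chr(int(hex_digits, 16))


thai = char_from_hex("0E05")
# Prints the character followed by its official Unicode name.
print(thai, unicodedata.name(thai))
```

Looking up the name with unicodedata is a convenient way to check that the code you intend to type is the character you expect.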

Screen shot: Unicode text displayed in Nisus Writer Express

Nisus Writer Express can open and save multi-script files produced by Word for Mac, but it has problems opening multi-script Word for Windows files.

Nisus Writer Express is a commercial application; more information is available from Nisus Writer Express. A 30-day trial version is available from Nisus Writer Express Download.



Pepper

Pepper is a text editor for Macintosh computers that runs under both Mac OS 9 and Mac OS X, and can make use of the Unicode support that has been built into Mac OS starting with version 8.5. It can therefore use Apple’s Unicode Hex Input keyboard driver, which allows you to enter any Unicode character by holding down the Option key while typing its 4-digit hexadecimal code, e.g. 0E05 for the Thai character kho khon. Unusually for a text editor, Pepper can display each script for which a Language Kit is installed in an appropriate font; the mapping can be changed in the FontMapping section of the Preferences dialog box. Alternatively, it can use a single multi-script font; this option is turned on by selecting "ATSUI text rendering" in the Editing section of the Preferences dialog box.

Screen shot: Unicode text displayed in Arial Unicode MS in Pepper

Pepper can import and export files in UTF-8, UTF-16 (big and little endian), ANSI, MacRoman, ShiftJIS, Big5 and all of the ISO 8859 character sets. Pepper has syntax styling for HTML and several programming languages. It has a few aids to producing HTML files, but you have to type in most HTML tags, and it can save files with UTF-8 character encoding in order to produce multilingual Web pages.
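To see what the choice between UTF-8 and big or little endian UTF-16 means in practice, the following Python sketch prints the bytes that each encoding produces for the same two characters. It says nothing about Pepper's own implementation; the helper name encoded_forms is hypothetical, and the explicit "-be" and "-le" codecs are used so that no byte order mark is added.

```python
def encoded_forms(text: str) -> dict:
    """Map each encoding name to the bytes it produces for `text`, as hex."""
    encodings = ("utf-8", "utf-16-be", "utf-16-le")
    # bytes.hex(" ") separates the bytes with spaces (Python 3.8 or later).
    return {name: text.encode(name).hex(" ") for name in encodings}


# Latin capital A followed by the Thai character kho khon (U+0E05).
sample = "A\u0e05"
for name, hex_bytes in encoded_forms(sample).items():
    print(f"{name:10} {hex_bytes}")
```

The two UTF-16 forms contain the same pairs of bytes in opposite order, which is why an editor needs to know (or detect) the byte order when opening such a file.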

Pepper used to be shareware but is now a commercial application. It is available from Digital Wandering and costs US $35.00.



Simredo 3

Simredo 3.31 is a freeware Unicode text editor, written in Java, that runs under various operating systems, including Mac OS X. Its default format is UTF-16, it can convert over 100 encodings to and from Unicode (UTF-8 and big and little endian UTF-16), and it includes support for Esperanto and right-to-left scripts. Simredo can also re-map the keyboard, to allow typing in unusual scripts, and has a character map (but the facility to copy selected characters to the document does not work in Mac OS X). It has no direct support for HTML, but you can type in HTML tags, or copy and paste all or part of the contents into an existing HTML document in another editor. It supports Apple’s Unicode Hex Input and Extended Roman keyboards.

Screen shot: multiple scripts displayed simultaneously in Simredo

More information about Simredo and a free download are available from the Simredo 3.3 - Java Unicode Editor Web site.



Style

Style is a shareware text editor that can read and write formats including Rich Text Format (RTF) and Unicode (UTF-16). For editing multiple languages and scripts, it uses Apple’s proprietary character sets, and converts to and from Unicode when documents are saved or opened. It does not support Apple’s Unicode Hex Input and Extended Roman keyboards. It includes an AppleScript for generating HTML files with UTF-8 encoding.
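Converting between one of Apple's legacy character sets and Unicode amounts to decoding the bytes to Unicode text and then encoding that text again. The Python sketch below illustrates the idea with Python's built-in "mac_roman" codec standing in for Apple's converters; it is an assumption-laden illustration, not a description of how Style does it, and the function name transcode is hypothetical.

```python
def transcode(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Decode bytes from one encoding to Unicode, then encode to another."""
    text = data.decode(source_encoding)
    return text.encode(target_encoding)


# "café" as MacRoman bytes: in that character set, 0x8E is é.
legacy = b"caf\x8e"
print(transcode(legacy, "mac_roman", "utf-8"))      # bytes for a UTF-8 file
print(transcode(legacy, "mac_roman", "utf-16-be"))  # bytes for a UTF-16 file
```

Characters that do not exist in the target character set cannot be converted back, so a round trip from Unicode to a legacy encoding is only safe for text within that encoding's repertoire.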

Screen shot: multiple scripts displayed simultaneously in Style

More information about Style and a trial download are available from the Welcome to Style! Web site. Style is shareware, and registration costs US $12.00.



SUE

SUE (Simple Unicode Editor) is an experimental Unicode editor that makes use of the Unicode support that is built into Mac OS X. This means that it can make full use of the large Unicode fonts that are designed for Windows, such as Arial Unicode MS and Bitstream CyberBit, as well as the Unicode fonts supplied with Mac OS X. It can also use Apple’s Unicode Hex Input keyboard driver, which allows you to enter any Unicode character by holding down the Option key while typing its 4-digit hexadecimal code, e.g. 0E05 for the Thai character kho khon.

Screen shot: Unicode text displayed in Arial Unicode MS in SUE

SUE can import and export files in a variety of Macintosh, ISO, Windows and DOS code pages as well as UTF-7, UTF-8 and 16-bit Unicode. It can save files as text, Unicode or Textension (the format of Apple’s Multilingual Text Editor technology). SUE has no direct support for HTML, but you can type in HTML tags and save your file with UTF-8 character encoding in order to produce multilingual Web pages.
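A hand-typed multilingual page needs two things to agree: the bytes of the file must be UTF-8, and the page should say so. The Python sketch below builds a minimal HTML 4.01 page with a charset declaration and saves it as UTF-8. It is only an illustration of the file you would end up with; the function names, the sample text and the file name are all hypothetical, and nothing here is specific to SUE.

```python
import html
import os
import tempfile


def build_page(title: str, paragraphs: list) -> str:
    """Build a minimal HTML 4.01 page that declares UTF-8 encoding."""
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
        '"http://www.w3.org/TR/html4/strict.dtd">\n'
        "<html>\n<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"{body}\n"
        "</body>\n</html>\n"
    )


def save_page(path: str, page: str) -> None:
    """Write the page so that the bytes on disk really are UTF-8."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(page)


page = build_page("Sample", ["English", "Ελληνικά", "Русский", "ไทย"])
with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "sample.html")
    save_page(target, page)
    with open(target, "rb") as handle:
        raw = handle.read()

# Check that the saved bytes decode back to the same text.
print(raw.decode("utf-8") == page)
```

If the declared charset and the encoding chosen when saving disagree, browsers are likely to show the non-Latin text as garbage, so it is worth checking both whenever a page is edited by hand.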

SUE is written by Tomasz Kukielka and is available from the SUE Web page.



TextEdit

TextEdit is an editor for formatted text that uses RTF (Rich Text Format) as its native format. It can also open and save plain text files in UTF-8, UTF-16, Western (Mac and Windows), Japanese, Korean, Simplified Chinese and Traditional Chinese. It supports Apple’s Extended Roman and Unicode Hex Input keyboards.

Screen shot: TextEdit displaying multiple scripts simultaneously

TextEdit is supplied with Mac OS X 10.1, and is installed as part of a default installation.



ThinkFree Write

ThinkFree Write is a Java-based word processor that can read and write RTF (Rich Text Format) files, Microsoft Word files (including multi-script files produced with Word for Windows) and HTML files with UTF-8 encoding. It supports Apple’s Extended Roman, Unicode Hex Input and other Unicode keyboards.

Screen shot: ThinkFree Write displaying multiple scripts simultaneously

ThinkFree Write is part of ThinkFree Office. The suite costs US $49.95, but a fully-functional 30-day trial version is available.



Word:mac v. X

Microsoft’s Word:mac v. X word processor for Mac OS X 10.1 uses the same file format as Word 97, Word 2000 and Word 2002 for Windows, but cannot read multi-script documents from Word for Windows. Multiple scripts are retained if native Word:mac v. X documents are transferred to Word 97, Word 2000 or Word 2002, and when Unicode (UTF-16) text and HTML (UTF-8) pages are produced.

Word:mac v. X does not support Apple’s Extended Roman, Unicode Hex Input or Vietnamese Unicode keyboards. It can see Windows Unicode fonts, but it can only use them for Latin script.

Word:mac v. X has a dialog box for picking characters from large fonts, accessed from Symbol... on the Insert menu. However, it does not show Unicode ranges and it does not work with all fonts (e.g. Lucida Grande contains Latin Extended-A, Latin Extended Additional, Greek and Cyrillic characters, but only MacRoman characters are shown).

Screen shot of Word X

Word:mac v. X can be used as a WYSIWYG HTML editor for producing multi-script Web pages. To save an existing Word document as a Web page:

  1. On the File menu, select "Save as Web Page...".
  2. In the Save: Microsoft Word dialog box, click the "Web Options..." button.
  3. In the Web Options dialog box, click the "Encoding" tab.
  4. From the pop-up list of encodings, select "Unicode (UTF-8)".
  5. Click "OK" to close the Web Options dialog box.
  6. In the Save: Microsoft Word dialog box, specify a name (Save As:) and location (Where:) for your HTML file.
  7. To make your HTML file as small as possible, click the "Save only display information into HTML" radio button. This option is recommended for a page to go on a Web site.
  8. Alternatively, to retain the entire structure of the Word document, click the "Save entire file into HTML" button. This option creates much larger files that include all of the special Word formatting that is normally not supported in Web pages.
  9. Click the "Save" button to save your document and close the Save: Microsoft Word dialog box.

To set UTF-8 as the default encoding for all HTML files produced in Word:

  1. On the Word menu, select "Preferences...".
  2. In the Preferences dialog box, click "General" in the list of categories.
  3. Click the "Web Options..." button on the General page.
  4. In the Web Options dialog box, click the "Encoding" tab.
  5. From the pop-up list of encodings, select "Unicode (UTF-8)".
  6. Click "OK" to close the Web Options dialog box.
  7. Click "OK" to close the Preferences dialog box.

To create a multi-script Web page that is not based on an existing Word document:

  1. On the File menu, select "Project Gallery...".
  2. In the Gallery, click "Blank Documents" in the Category list.
  3. Click the large "Web Page" icon, and then click "OK" to close the Gallery.

The Formatting Palette allows you to format your text. To type in other scripts or languages, select the appropriate keyboard and, if necessary, also choose an appropriate font.


Screen shot of Word X

If you need to work directly on the HTML code, open the View menu and select HTML Source. To revert to the normal WYSIWYG view, open the View menu and select Exit HTML Source.


Word 2004

Microsoft’s Word 2004 word processor for Mac OS X 10.2.8 onwards uses the same file format as Word 97, Word 2000, Word 2002 and Word 2003 for Windows, and is the first version of Word for Mac to make use of the operating system’s Unicode support, including the Unicode keyboards and fonts.

It is supplied with a range of Unicode fonts that enable it to display many multi-script documents from Word for Windows. However, it does not support editing of right-to-left scripts (e.g. Arabic and Hebrew) or complex scripts such as Thai and the Indian languages. The Insert Symbol dialog box does not show all fonts or all characters, but Apple’s Character Palette can be used instead.

Screen shot of Word 2004

A 30-day trial version of Microsoft Office 2004, which includes Word 2004, is available from Office 2004 Test Drive.


Copyright © 2001–2005 Alan Wood

Created 26th December 2001   Last updated 13th September 2005

Send comments or questions to Alan Wood
