Alan Wood’s Unicode Resources
Unicode and Multilingual Editors and Word Processors for Windows
AbiWord
AbiWord is a freeware Open Source word processor that is available for 32-bit Windows platforms, for several versions of Unix, and in an experimental version for BeOS. In addition to its own file format, it can read UTF-8, text, Rich Text Format, XHTML and Microsoft Word files. It can save as UTF-8, text, Rich Text Format, XHTML and LaTeX. You can find more information at http://www.abisource.com/products.phtml, and download it from http://www.abisource.com/free.phtml.
Aprotool TM Editor
Aprotool TM Editor is a shareware Unicode editor developed by Maedera Masahiko for use with 32-bit versions of Windows. It can handle Unicode with language tags, and has a console window which can also handle Unicode. It includes a facility to select HTML tags from a menu, and a Unicode character map. It is available with English and Japanese interfaces. You can find more information at http://hp.vector.co.jp/authors/VA002891/READTM.TXT, and download it from http://hp.vector.co.jp/authors/VA002891/CWFILES.HTM. Registration costs US $20.00 or 2000 Yen.
BabelPad
BabelPad is an editor for plain text files that is available for Windows 95 and all later versions. It can use either a single Unicode font for all ranges, or separate fonts for each range. It includes input methods for Tibetan and Yi. BabelPad can open and save files with UTF-8, UTF-16, UTF-32 and various other encodings, supports all of the Unicode 3.2 characters, and can convert Numeric Character References (NCR) and Universal Character Names (UCN) to or from Unicode characters. The BabelMap character selection utility is included. It supports Microsoft’s Visual Keyboards. BabelPad is produced by Andrew West. You can find more information, and download a free copy, from BabelStone : Software : BabelPad.
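To make the NCR and UCN conversions mentioned for BabelPad concrete, here is a minimal Python sketch of the same idea. It is not BabelPad's own code: the function names are hypothetical, and it assumes decimal or hexadecimal NCRs of the form `&#945;` / `&#x3B1;` and C-style UCNs of the form `\u03B1` / `\U00010400`.

```python
import re

# Numeric Character References: &#945; (decimal) or &#x3B1; (hexadecimal).
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")
# Universal Character Names: \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_UCN = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def ncr_to_text(s: str) -> str:
    """Replace each NCR with the character it names."""
    def repl(m):
        hex_digits, dec_digits = m.group(1), m.group(2)
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code_point)  # raises ValueError above U+10FFFF
    return _NCR.sub(repl, s)


def ucn_to_text(s: str) -> str:
    """Replace each UCN with the character it names."""
    return _UCN.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), s)


def text_to_ncr(s: str) -> str:
    """Write every non-ASCII character as a decimal NCR."""
    return "".join(c if ord(c) < 128 else f"&#{ord(c)};" for c in s)


def text_to_ucn(s: str) -> str:
    """Write every non-ASCII character as a UCN."""
    out = []
    for c in s:
        cp = ord(c)
        if cp < 128:
            out.append(c)
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            out.append(f"\\U{cp:08X}")
    return "".join(out)
```

For example, `ncr_to_text("&#945;")` and `ucn_to_text("\\u03B1")` both give the Greek letter α, and `text_to_ncr` and `text_to_ucn` reverse the conversion.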
CE UniWriter
CE UniWriter is a planned set of applications for Windows CE, one each for Chinese, French, German, Greek, Italian, Japanese, Russian and Spanish, that allow Unicode characters to be entered from the real keyboard, an on-screen keyboard, a character map, or an input box for hexadecimal numbers. CE UniWriter supports HPC, HPC Pro and PPC, and its documents are compatible with MS Pocket Office and MS Office 2000. It is being developed by Laser Publishing Group, and you can find more information at the CE UniWriter Web site.
Dreamweaver MX
Macromedia Dreamweaver MX is a fully-featured HTML editor that can edit multilingual files in code or WYSIWYG modes, or both at the same time. It can open and save files with UTF-8 encoding, as well as many other encodings. It does not support Microsoft’s Visual Keyboards.
To set the default encoding to UTF-8:
To set the encoding to UTF-8 for the current document:
Macromedia Dreamweaver is commercial software, but a 30-day trial version is available from Macromedia - Downloads. Versions 5 and earlier of Dreamweaver do not support Unicode.
EmEditor
EmEditor is a Unicode text editor that runs under Windows 95, Windows 98, Windows Me, Windows NT 4 and Windows 2000 and can read and write files in UTF-16, UTF-8, UTF-7 and many language-specific encodings. It has colour coding for HTML and several programming languages. It can only use one font at a time, so not all scripts may be readable simultaneously. It can display Chinese, Japanese and Korean (but does not support Microsoft's Global IMEs), and can display Arabic and Hebrew (but does not show them right-to-left).
EmEditor is shareware, and costs US $30 to register after a 30-day evaluation period. It is available with the user interface in English, German, Simplified Chinese, Traditional Chinese, Japanese or Korean. You can find more information at the Emurasoft Web site.
FrontPage 2000
Microsoft's FrontPage 2000 HTML editor can be used to produce multilingual Web pages with the aid of Unicode fonts and Visual Keyboards, and Proofing Tools can be purchased that provide spelling checkers, grammar checkers, thesauri and other facilities in more than 30 languages. Corporate users can change the language of the user interface, using the same MultiLanguage Pack as for Word 2000. FrontPage 2000 cannot be used with the Global IMEs.
You can set default proportional and fixed-width fonts for each language or character set; these will be used whenever a specific font is not selected. Use the Default Font tab of the Page Options dialog box, which is accessed from Page Options... on the Tools menu.
You can set the language and the encoding for a page on the Language tab of the Page Properties dialog box, which is accessed from Page Properties... on the right-click pop-up menu.
You can enable additional features, commands and resources that make it easier to use other languages in your Web pages, from Language Settings on the Microsoft Office Tools menu on the Start menu. FrontPage 2000 has been superseded by FrontPage 2002.
FrontPage 2002
Microsoft FrontPage 2002 is the successor to FrontPage 2000, and has been superseded by FrontPage 2003.
FrontPage 2003
Microsoft FrontPage 2003 is the successor to FrontPage 2002. FrontPage 2003 supports editing in all of the scripts that Windows supports, including those that require right-to-left layout, input sequence checking, and special word-breaking, and can be used with the Global IMEs. The supported scripts include Arabic, Hebrew, Thai, Indian scripts, Chinese, Japanese and Korean. FrontPage 2003 can open, edit and save files in UTF-8 and UTF-16 (BE and LE) encodings, and has full support for surrogates. It also enables the user to indicate the language of text in multilingual pages, either automatically (based on the keyboard) or manually, through the “lang” attribute, which it then uses to apply the appropriate spelling dictionary and engine for the language (when available).
Global Office
Global Office is a commercial, multilingual accessory for Microsoft Office 97 that runs under Windows 95, Windows 98 and Windows NT 4. It works with Word, Excel, PowerPoint and Outlook and provides text entry and editing facilities in over 100 languages, with on-screen keyboard displays to help with typing. Multilingual Web pages can be produced by working in Word and then saving as HTML. Global Office is produced by Unitype Inc. and costs $250 (US dollars).
Global Writer
Global Writer is a commercial, multilingual word processor that runs under Windows 95, Windows 98 and Windows NT 4 and supports more than 100 languages. It provides on-screen keyboard displays to help with typing. Text in some languages, including Chinese, Japanese, Korean, Greek, Russian and Turkish, can be pasted into Microsoft Office 97 applications.
Global Writer is produced by Unitype Inc. and costs $150 (US dollars).
GoLive
Adobe GoLive is an HTML editor that can edit multilingual files in WYSIWYG mode or with the codes visible. GoLive 5, GoLive 6 and GoLive CS can be used to produce multilingual Web pages with the aid of Unicode fonts, and can open and save files with UTF-8 character encoding. GoLive 5 supports Microsoft’s Visual Keyboards, but later versions do not. It can display Chinese, Japanese and Korean, but does not seem to be able to edit them.
Adobe GoLive 5 requires Windows 98, Windows Me, Windows NT 4 SP4 or Windows 2000. To set the default encoding to UTF-8:
Adobe GoLive is commercial software, but a 30-day trial version is available from Adobe Product Tryouts.
jEdit
jEdit is a Unicode text editor that is written in Java and can run under Mac OS X, Linux and Windows. It can be used with any text file, but is intended for editing programming and markup languages, and has syntax colouring for over 60 of these, including HTML and XML. jEdit can open and save files with any encoding that is supported by Java, including UTF-8 and UTF-16. It can use any Windows keyboard driver, but it does not support Microsoft’s Global IMEs.
For multi-script documents, it is convenient to use a large Unicode font such as Arial Unicode MS. To change the default font:
jEdit is produced by Slava Pestov and is freeware. For more information and to download the software, visit the jEdit - Open Source programmer's text editor Web site.
Mozilla Composer
The Composer component of Mozilla is a multilingual HTML editor that supports Unicode and can edit files in WYSIWYG, WYSIWYG plus tags and plain HTML modes. The Windows version can work in any language and script that is supported by the operating system and for which a font and a keyboard driver are installed. It supports Microsoft’s Visual Keyboards and Global IME. Mozilla Composer can produce files that include multiple scripts and languages, and it can save HTML files with UTF-8 character encoding. It is available only as part of Mozilla, which includes a Web browser and can be downloaded free of charge from http://www.mozilla.org/releases/. Netscape 7 is based on Mozilla.
Namo WebEditor 5
Namo WebEditor is a multilingual HTML editor for 32-bit Windows platforms, and supports Unicode. It is available with Chinese (simplified and traditional), English, German, French, Korean, Japanese or Spanish interfaces, and has dictionaries for 13 Western languages. It does not supply its own fonts or keyboard drivers, so you need to have keyboard drivers for any languages you wish to use. It supports Microsoft's Global IMEs (which allow you to type in Chinese, Japanese and Korean) and Visual Keyboards (useful if you are not very familiar with a particular keyboard layout), and it allows files to be saved in UTF-8 encoding, in addition to some CJK and other encodings. It is produced by Namo Interactive Inc. and costs US $79.00 if downloaded or $99.00 by mail order. A 45-day trial version of the current release is available to download.
Versions 3 and 4 of Namo WebEditor also support Unicode.
Netscape Composer
The Composer component of Netscape 7 is a multilingual HTML editor that supports Unicode and can edit files in WYSIWYG, WYSIWYG plus tags and plain HTML modes. The Windows version can work in any language and script that is supported by the operating system and for which a font and a keyboard driver are installed. It supports Microsoft’s Visual Keyboards and Global IME. Composer can produce files that include multiple scripts and languages, and it can save HTML files with UTF-8 character encoding. It is available only as part of Netscape 7, which includes Netscape Navigator and can be downloaded free of charge from Netscape 7.2. Netscape 7 is based on Mozilla. Version 4 of Composer does not support Unicode.
Notepad
Notepad is a basic text editor that supports Unicode and is supplied with Windows NT, Windows 2000 and Windows XP. It can be used to edit any script that is supported by Windows and for which a font and a keyboard driver are installed, including right-to-left scripts. Notepad can open, edit and save files with UTF-8 and UTF-16 (BE and LE) encodings. It always includes a BOM. The versions of Notepad supplied with Windows 95, Windows 98 and Windows ME do not support Unicode.
Nvu
Nvu is a multilingual HTML editor that supports Unicode and can edit files in WYSIWYG, WYSIWYG plus tags and plain HTML modes. The Windows version can work in any language and script that is supported by the operating system and for which a font and a keyboard driver are installed, including right-to-left scripts. It supports Microsoft’s Visual Keyboards and Global IME, and includes an FTP site manager and a CSS editor. Nvu can produce files that include multiple scripts and languages, and it can save HTML files with UTF-8 character encoding. More information and a free download are available from Nvu - The Complete Web Authoring System for Linux.
Nvu is an enhanced version of Mozilla Composer, and is also available in a version for Linux.
OnePen
OnePen is a commercial add-in that runs under Windows 95, Windows 98 and Windows NT 4 and allows blocks of foreign text to be inserted into most Windows applications, including word processors, HTML editors, spreadsheets, databases and DTP. The user interface can be selected from English, Arabic, French, German, Hebrew and Spanish. The languages that can be typed depend on the options that have been purchased, with over 100 languages supported in the most expensive version. It can import and export Unicode text and files, and can also export text as graphics. It is available from Aramedia Group, and costs from $129 to $499 (US dollars), depending on the languages required.
OpenOffice
OpenOffice is an Open Source office package that uses XML as its native format and includes a word processor and an HTML editor, as well as spreadsheet, drawing and presentation programs. The word processor can open and save formatted files in Microsoft Word and RTF formats, and plain text files in UTF-7, UTF-8, UTF-16 and several national encodings. The HTML editor has similar abilities, except for RTF and Word formats.
CJK support needs to be enabled (Tools > Options > Language Settings > Languages > Asian languages). The default encoding for HTML files needs to be set in Options (on the Tools menu).
More details and free downloads are available from OpenOffice.org.
QJot
QJot is a free, simple word processor that supports Unicode and runs under Windows 2000 and Windows XP. Its native format is RTF, but it can open, edit and save Word and WordPerfect files (though not with all of their features). It can be used with any script that is supported by Windows and for which a font and a keyboard driver are installed, including right-to-left scripts. With RTF and Word files, QJot seems to be able to use any Unicode character, including those in the supplementary planes. QJot does not seem to support Unicode in plain text or HTML files. More information and a free download are available from xtort.net - Dedicated to being massive directory of freeware.
Simredo 3
Simredo 3.31 is a freeware Unicode text editor, written in Java, that runs under Windows 95 and higher. It can convert over 100 encodings to and from Unicode (UTF-8 and big and little endian UTF-16), and includes support for Esperanto and right-to-left scripts. Simredo can also re-map the keyboard, to allow typing in unusual scripts, and has a character map from which selected characters can be copied to the document (using Ctrl+C and Ctrl+V). It has no direct support for HTML, but you can type in HTML tags, or copy and paste all or part of the contents into an existing HTML document in another editor.
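The kind of conversion described for Simredo, from a language-specific encoding to UTF-8 or to big- or little-endian UTF-16, can be illustrated with a short Python sketch. This is only an illustration of the general technique, not Simredo's implementation: the `transcode` helper is hypothetical, and ISO 8859-3 is chosen as the source encoding simply because it covers the Esperanto letters.

```python
def transcode(data: bytes, source: str, target: str, bom: bool = False) -> bytes:
    """Decode bytes from a legacy encoding and re-encode them as Unicode."""
    text = data.decode(source)        # legacy bytes -> Unicode string
    prefix = "\ufeff" if bom else ""  # optional byte order mark (U+FEFF)
    return (prefix + text).encode(target)


sample = "eĥoŝanĝo ĉiuĵaŭde"          # Esperanto phrase using all six accented letters
latin3 = sample.encode("iso8859_3")   # one byte per character in ISO 8859-3

utf8 = transcode(latin3, "iso8859_3", "utf-8")
utf16_be = transcode(latin3, "iso8859_3", "utf-16-be", bom=True)
utf16_le = transcode(latin3, "iso8859_3", "utf-16-le", bom=True)
```

With `bom=True`, the big-endian result starts with the bytes FE FF and the little-endian result with FF FE, which is how a reader can tell the two UTF-16 byte orders apart.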
More information about Simredo and a free download are available from the Simredo 3.3 - Java Unicode Editor Web site.
UDP
UDP (or Unicode Document Processor) is a Unicode text editor that has its own native format, and can also open and save files as RTF. In addition to Unicode, it has facilities for Tibetan, for Romanized Sanskrit and Pali, for Braille, and for custom keyboard layouts. It can handle limited formatting, including bold, italic, font size and color, indentation and links.
More information and a free download are available from the UDP Web site or its mirror site.
UltraEdit
UltraEdit is a text editor that runs under Windows 95 and Windows NT 4 or later, and can be used to edit UTF-16 and UTF-8 files. When used for HTML files, it can run CSE HTML Validator and HTML Tidy from a menu option.
UltraEdit only supports a single font, so for multi-script Web pages a large font such as Arial Unicode MS is needed in order to show all of the characters. The font can be changed from an option on the View menu:
In order to create multi-script files, one of UltraEdit’s settings needs to be changed so that new files are created as Unicode. This is done via the “Configuration…” option on the “Advanced” menu.
More information about UltraEdit and downloads are available from the Text Editor - HEX Editor - HTML Editor - Programmers Editor - UltraEdit Web site. The program can be used free for a trial period, and after that a license costs US $35.00.
UnicEdit
UnicEdit is a freeware Unicode text editor for Windows NT 4.0; it will not run under Windows 95 or Windows 98. It can import and export in ANSI, UCS-2, UCS-4, UTF-8 and UTF-16 formats, and it can convert files from one codepage to another. It has no direct support for HTML, but you can type in HTML tags, or open UnicEdit files in Notepad and copy and paste all or part of the contents into an existing HTML document. UnicEdit is produced by Heiner Eichmann, and can be downloaded from http://heiner-eichmann.de/software/unicedit/unicedit.htm.
UniEdit
UniEdit is a multilingual text editor that is produced by the Humanities Computing Laboratory (formerly the Humanities Computing Facility of Duke University). The current version (1.3) supports Unicode 2 and runs under Windows 3.1x or higher. Version 1.5 is soon to be released, and this will support Unicode 3 and run under 32-bit versions of Windows. The user interface is English only, but it can handle files in over 20 scripts and in hundreds of languages. It can save files in UTF-8 and other Unicode-compliant formats, but it maps to its own non-Unicode TrueType fonts to display and print text in a very wide range of languages, including Indic, CJK and a number of South-East Asian languages. It has no direct support for HTML, but you can type in HTML tags, or open UniEdit files in Notepad and copy and paste all or part of the contents into an existing HTML document. More information is available from What is UniEdit?. UniEdit is a commercial product, but a trial version is available for downloading.
UniPad
UniPad is a Unicode text editor for Windows 95, Windows 98, Windows ME, Windows NT, Windows 2000 and Windows XP.
It uses its own bitmap font, not TrueType fonts, and has on-screen keyboard layouts that allow you to type in several languages and scripts. Characters can also be selected from a character map. It can save files in UTF-8 format, so you can produce multilingual HTML files either by typing in the tags, or by opening UniPad files in Notepad and copying and pasting into an existing HTML file. The user interface is English only. More information is available from the Sharmahd Computing UniPad Web site. UniPad is a commercial product costing US $199.00, but a trial version (limited to files of no more than 1000 characters) is available for downloading.
UniRed
UniRed is a free Unicode text editor that runs under Windows 95, Windows 98, Windows Me, Windows NT 4 and Windows 2000 and can read and write files in UTF-16, UTF-8 and many ISO, Windows, Mac and language-specific encodings. It has colour coding for HTML and several programming languages. It can display lists of links and headers as an aid to navigating in HTML files.
UniRed is supplied with the user interface switchable between English, Russian and Esperanto, and optional interfaces can be downloaded for Byelorussian, Czech, Danish, French, German, Italian, Brazilian Portuguese, Sorabian(?), Slovak, Spanish or Swedish. You can download it and find more information at the UniRed Web site.
Word 97
You can use characters from Unicode fonts in Microsoft’s Word 97 (or Word 2000 or Word 2002) by picking them from the Symbol dialog box (select Symbols... on the Insert menu). Alternatively, you can create a macro, which you can then assign to a shortcut key or a toolbar button:
You can use the Find command on the Edit menu to look for a Unicode character, by supplying its decimal number preceded by ^u, for example ^u945 to find α (lower case Greek alpha). This does not work with Replace. The following macro will attempt to identify a single character that you have selected, and display its Unicode decimal character reference:
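The lookup involved is simple to state: take one selected character and report its code point as a decimal number. As a minimal sketch of that idea outside Word (this is Python, not a Word macro, and the function name `describe_character` is hypothetical):

```python
def describe_character(ch: str) -> str:
    """Report the code point of a single character in several notations."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    code_point = ord(ch)
    return (
        f"U+{code_point:04X}, "        # conventional hexadecimal notation
        f"decimal {code_point}, "      # the number used after ^u in Find
        f"reference &#{code_point};"   # decimal character reference for HTML
    )


print(describe_character("α"))
```

For the Greek lower case alpha this prints `U+03B1, decimal 945, reference &#945;`, matching the ^u945 search code given above.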
Word 97 can be used as an HTML editor for producing Web pages incorporating several scripts:
The drop-down Style menu allows you to format your text. If you need to work directly on the HTML code, click the View menu and select HTML Source. To type in other scripts or languages, select the appropriate keyboard and if necessary also choose an appropriate font. You can add support for displaying CJK scripts by installing Far East support from the ValuPack folder on the Word CD-ROM.
Multi-script documents that have been produced in Word format can be converted to HTML format by creating a new Web page with UTF-8 encoding and pasting the content of the Word-format document. Word 97 has been superseded by Word 2000, which improves on the multilingual abilities of Word 97.
Word 2000
Microsoft’s Word 2000 word processor for Windows 95, Windows 98, Windows ME, Windows NT 4, Windows 2000 and Windows XP has gained new features since Word 97, which itself supported Unicode characters and had the ability to insert them from the Symbol dialog box (accessed from Symbols... on the Insert menu). With the introduction of Visual Keyboard, which can be downloaded from the Office Update site, the ability to switch keyboard layouts has been enhanced by the option to have the new keyboard layout visible in a floating window. Typing in a new language can then be accomplished either by clicking the keys in the on-screen display, or by using the on-screen display to guide you in the use of the real keyboard.
You can enable additional features, commands and resources that make it easier to use other languages in your documents, from Language Settings on the Microsoft Office Tools menu on the Start menu. A pack of Proofing Tools can be purchased that provides spelling checkers, grammar checkers, thesauri and other facilities in more than 30 languages. Word 2000 can also be used with Microsoft’s free Global IMEs to allow input of Chinese, Japanese and Korean characters. It is now starting to challenge the commercial multilingual word processors. Word 2000 can also save files as HTML, but if you want to use the HTML files on a Web site be sure to obtain the HTML Filter 2.0 from the Microsoft Office Download Center to remove the Office-specific markup tags. Under Windows 2000, you can type Shift+Alt+X after a character to replace the character with its Unicode hexadecimal value. Type Alt+X to convert the value back to the Unicode character. This also works in WordPad under Windows 2000. For corporate users only – There is a single executable for all language versions of Word 2000 except Thai, Vietnamese and Indic languages. If you are a participant in the Open, Select or Enterprise Agreement volume licensing programmes, Microsoft will allow you to buy copies of Office 2000 Standard, Professional, Developer and Premium editions that include the MultiLanguage Pack, enabling you to take advantage of the single executable and change Word's user interface and Help files into more than 25 languages. Apart from English, the available languages include Arabic, Basque, Brazilian Portuguese, Chinese (Simplified and Traditional), Croatian, Czech, Danish, Dutch, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Slovakian, Slovenian, Spanish, Swedish and Turkish. 
The Pack also includes proofing tools for more than 30 languages, the extras being Bulgarian, Catalan, Estonian, Latvian, Lithuanian, Serbian and Ukrainian. Word 2000 has been superseded by Word 2002, which improves on the Unicode abilities of Word 2000.
Word 2002
Microsoft’s Word 2002 word processor for Windows 98, Windows ME, Windows NT 4, Windows 2000 and Windows XP has gained new Unicode features since Word 2000. I do not have access to a copy yet, but it includes the following improvements:
WorldPad
WorldPad is a text editor for Windows 98 and later, and can use the Graphite rendering engine to display complex scripts. It can save files as plain text (without formatting) or as XML (retaining formatting), it can use UTF-8 and UTF-16 encodings, and it can handle right-to-left and mixed text directions.
More information about WorldPad and a free download are available from WorldPad and Graphite.
WPS Office 2003
The WPS Office 2003 suite has a Chinese interface, and includes Unicode-aware word processor, spreadsheet, presentation and e-mail programs. It can open and save files in several formats, including Word, RTF and HTML. This is the suite that is being introduced into government establishments in China in order to avoid the need to pay licence fees for Microsoft Office.
WPS Office 2003 is produced by Kingsoft. More information (in Chinese) about WPS Office 2003 and a functioning download are available from WPS Office 2003 限次版 :: 金山软件 or WPS产品服务网.
XML Spy
XML Spy is a commercial XML Integrated Development Environment for 32-bit and 64-bit Windows platforms that includes full Unicode support as well as support for all major character-set encodings. It can import text files, Word documents, and data from Access, Oracle and SQL Server databases (those that are used in most web hosting solutions). It is produced by Altova, and costs US $999.00 for the Enterprise version, US $499.00 for the Professional version, and US $189.00 for the Standard version. A 30-day evaluation version is available.
Copyright © 1999–2010 Alan Wood