Difference between revisions of "EiffelStudio Internationalization"

Revision as of 22:44, 19 December 2006

Overview

Since i18n has been mostly implemented in Eiffel, Eiffel Studio is entering a new era of internationalization. The goal of i18n integration is to provide multiple-language support in Eiffel Studio and let users switch the interface language easily at runtime.

Steps to integrate i18n

Non-editor part

The first step concentrates on the interface of Eiffel Studio: all buttons, labels, tool tips and grids that are directly used by the Eiffel Studio project.

  1. Collect all static interface strings in the system, including some context-dependent strings.
    1. This is not strictly necessary, but doing it gives us better management and code quality. Only INTERFACE_NAMES knows about i18n.
    2. Change all types in INTERFACE_NAMES, EB_METRIC_NAMES, CONF_INTERFACE_NAMES and WARNING_MESSAGES to STRING_GENERAL. Callers should be adapted correspondingly. For some strings, two versions may be needed: one for internal use and one for the interface, especially for strings saved as preferences and string constants used in configuration XML files.
    3. Rewrite the bodies of those strings using i18n translation routines; STRING_32 instances are actually produced.
    4. Modify places that use EV_CONSTANTS, and make new classes if needed, e.g. EV_CONFIRMATION_DIALOG and EV_WARNING_DIALOG are not usable.
    5. Write scripts to extract strings from default.xml. "Directory", names and descriptions of preferences also need to be translated.
    6. Find a solution for wizards, since some of them are external executables.
  2. Build language menus to switch language.
    1. Make interface classes locale observers so that all tools know when interface names should be reread.
  3. Solve the problems in vision2.
    1. In Chinese, menu chars are conventionally parenthesized and underscored, following the menu text. This can be done by the translator.
    2. Handle "&" as both char and wchar for menu items.
    3. Fix "tab" issue for menu items.
  4. Integrate the i18n .po generation tool. This has been done in the i18n branch.
  5. Use the ec with the integrated .po generation tool to compile an ec in which interface names have been adapted.
  6. Use the .po generation tool to generate the estudio.pot file. The tool extracts the strings that are passed as parameters to `translated' and `translated_plural' and produces a .pot file.
  7. Strings used in EiffelStudio do not come only from the source code, so write scripts to extract the remaining ones and merge them into estudio.pot. Strings should be extracted from: default.xml → descriptions of preferences
  8. Duplicate the estudio.pot file to .po files named after locale ids. Each .po file represents a locale. The i18n library reads .mo files named with the correct locale id. Although .po files do not need to be named after locale ids, .mo files are produced one-to-one from .po files, so using locale ids as .po file names is reasonable.
  9. Translators open the .po files in a .po editor and translate the interface names into the target languages.
  10. Generate .mo files using the .mo generation script.
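
Steps 8 to 10 can be illustrated with a minimal shell sketch. This is not the project's .mo generation script; the function names mo_path_for and compile_all_po are hypothetical, and the only external tool assumed is msgfmt from GNU gettext. It shows the one-to-one mapping from LOCALE_ID.po to LOCALE_ID.mo.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the EiffelStudio scripts.

# Map a .po path to the .mo path that carries the same locale id.
# $1: path to a .po file, $2: directory that receives .mo files
mo_path_for() {
    printf '%s/%s.mo\n' "$2" "$(basename "$1" .po)"
}

# Compile every .po file found in $1 into a .mo file in $2.
compile_all_po() {
    command -v msgfmt >/dev/null 2>&1 || { echo "msgfmt not found" >&2; return 1; }
    mkdir -p "$2" || return 1
    for po in "$1"/*.po; do
        [ -e "$po" ] || continue   # no .po files: the glob stays literal
        msgfmt -o "$(mo_path_for "$po" "$2")" "$po" || return 1
    done
}

# Example call, using the repository layout described in this document:
# compile_all_po "$EIFFEL_SRC/Delivery/studio/lang/po_files" \
#                "$EIFFEL_SRC/Delivery/studio/lang/mo_files"
```

Keeping the path mapping in its own function makes the naming rule easy to check without running gettext.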

Editor part

  1. This step might be more complicated and will be done after the first step, probably after the 6.0 release. It concentrates on extending the editor library to accept wide characters. Internationalization of any output directed to the editor is done in this step. Many existing tools might be affected: the search tool, formatting tools, etc.
  2. An encoding conversion facility is needed.

File structure

Repository

All files are stored in %EIFFEL_SRC%\Delivery\studio\lang

%EIFFEL_SRC%\Delivery\studio\lang\script

Place where scripts for generating .mo files are put. The scripts are invoked when building a delivery.

%EIFFEL_SRC%\Delivery\studio\lang\mo_files

Place to put .mo files. Those files are actually used at runtime.
Only .mo files need to be included in a delivery.

%EIFFEL_SRC%\Delivery\studio\lang\po_files

Place to put .pot file and .po files. 

Delivery

 Windows:
 %ISE_EIFFEL%\studio\lang\mo_files\*.mo
 Unix:
 /usr/share/locale/(product_version_name)/*.mo

Maintenance

General

  • The .pot file is the PO template file generated by the .po generation tool. It is simply an untranslated file with source entries only and blank target entries.
  • .po files are the files translators actually work on. Whenever translators get a new version of the .pot file, they should update the .po file they are working on from it. Updating is normally done with third-party tools. Tools like poEdit list new and obsolete strings, and in the full list poEdit marks new strings and fuzzy strings in different colors.
  • Fuzzy marks are applied when updating: msgmerge from Gettext marks slightly changed strings as fuzzy. Once the fuzzy strings have been checked, translators should remove the fuzzy marks.
  • Obsolete strings are commented out at the end of .po files when merging. Those comments can be removed at any time.
  • When the translation or modification is done, translators only need to commit the .po file(s) they are working on.
  • Whenever a new language is to be added, a new .po file can be added directly in %EIFFEL_SRC%\Delivery\studio\lang\po_files. Eiffel Studio should be able to detect at runtime which languages are available.
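
For those who prefer the command line to a po editor, the update cycle described above could look roughly like the following sketch. The function names are hypothetical; msgmerge --update is the standard gettext invocation, and the two counters rely only on the PO file conventions that flag lines start with "#," and obsolete entries start with "#~".

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a .po file after a merge.

# Update a .po file in place from a .pot template.
# $1: .po file, $2: .pot template
update_po() {
    command -v msgmerge >/dev/null 2>&1 || { echo "msgmerge not found" >&2; return 1; }
    msgmerge --update "$1" "$2"
}

# Count entries carrying the fuzzy flag.
count_fuzzy() {
    grep -c '^#,.*fuzzy' "$1"
}

# Count obsolete entries that were commented out by the merge.
count_obsolete() {
    grep -c '^#~ msgid ' "$1"
}

# Example call:
# po_dir=$EIFFEL_SRC/Delivery/studio/lang/po_files
# update_po "$po_dir/zh_CN.po" "$po_dir/estudio.pot" && count_fuzzy "$po_dir/zh_CN.po"
```

A non-zero fuzzy count after an update means there are entries a translator still has to review.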

Translator Guide

  • Update $EIFFEL_SRC/Delivery.
  • Download a po editor: poEdit for Windows, or KBabel or gtranslator for KDE and Gnome.
  • In $EIFFEL_SRC/Delivery/studio/lang/po_files, find the .po file(s) you should work on. Take zh_CN.po as an example: open zh_CN.po in the po editor. The po editor should have a command to update from a pot file; use it to update from $EIFFEL_SRC/Delivery/studio/lang/po_files/estudio.pot. If anything has changed in estudio.pot, the po editor should report it. The translator then fills empty entries, resolves FUZZY strings or modifies translated entries.
  • When the translation is done, just commit the modified po files.
  • Make sure that po files are saved in UTF-8 encoding.
  • Never edit .po files by hand, because other translators would not see the changes if the estudio.pot file were not updated.
  • A translator should not modify the estudio.pot file.
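
Before committing, the UTF-8 requirement can be checked with a short sketch like the one below. Both function names are hypothetical. The first looks only at the charset declared in the PO header; the second assumes iconv is installed and verifies that the file's bytes decode as UTF-8.

```shell
#!/bin/sh
# Hypothetical pre-commit checks for a .po file.

# Succeeds when the PO header declares UTF-8 in its Content-Type line.
po_declares_utf8() {
    grep -qi 'Content-Type:.*charset=UTF-8' "$1"
}

# Succeeds when the whole file is valid UTF-8 (requires iconv).
po_is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Example call:
# po=$EIFFEL_SRC/Delivery/studio/lang/po_files/zh_CN.po
# po_declares_utf8 "$po" && po_is_valid_utf8 "$po" || echo "$po is not saved as UTF-8" >&2
```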

Developer Guide

  • The main thing a developer should take care of is code quality. All names that need to be translated should in principle be put in framework/interface_names. Whenever a whole sentence is needed in the interface, leave it as one sentence to be translated. Be careful about SEPARATING a sentence into terms or phrases, because the way those terms are sequenced back into a sentence varies between languages. Plural forms should be used whenever needed. There are many examples in INTERFACE_NAMES.
  • If a developer wants changes to take effect immediately, see what a maintainer and a translator should do.

Maintainer Guide

  • Build the po generation tool, which is located in the internal svn repository at $EIFFEL_SRC/tools/po_generation_tool.
  • Make sure gettext and perl are installed. On Windows, cygwin contains perl and gettext modules.
  • When there are new or modified strings in the code that need to be translated, $EIFFEL_SRC/Delivery/studio/lang/estudio.pot should be regenerated.
 To regenerate estudio.pot, one should do:
 po_generation_tool -D $EIFFEL_SRC/Eiffel $EIFFEL_SRC/framework -o $EIFFEL_SRC/Delivery/studio/lang/estudio.pot
 On Windows:
 $EIFFEL_SRC/Delivery/studio/lang/script/build_preference_entries.bat
 On Unix:
 perl $EIFFEL_SRC/Delivery/studio/lang/script/preference_po_extraction.pl
 msguniq -o $EIFFEL_SRC/Delivery/studio/lang/po_files/estudio.pot $EIFFEL_SRC/Delivery/studio/lang/po_files/estudio.pot
  • Make sure the committed estudio.pot file is generated from repository code. Do not commit an estudio.pot that was generated from local code, because other maintainers might overwrite changes of yours that are not in the repository. Modifying the estudio.pot file by hand is not recommended.
  • Commit the estudio.pot file so that translators can update from it.
  • To add support for a new language, simply copy estudio.pot to LOCALE_ID.po, where LOCALE_ID should be:
 Case 1: LL-RR
 Case 2: LL-SS-RR
 Case 3: LL_RR
 Case 4: LL_RR.Enc
 Case 5: LL_RR@SS  [sometimes the SS is simply variant information]
 LL is a two-letter language identifier from ISO 639-1 or, if there is none, a three-letter
 identifier from ISO 639-2/T
 RR is a two-letter country code from ISO 3166-1, except when it is not (en-029 ('English (Caribbean)') under Windows)
 SS under Windows is mostly either 'Latn' or 'Cyrl'. @SS on Linux is sometimes useful and sometimes meaningless
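
Adding a language can be sketched as a small shell helper that checks the LOCALE_ID shape before copying the template. The function names and the regular expression are assumptions made for illustration. The pattern is a loose approximation of the five cases above, including the three-digit region of en-029, and it does not verify that the codes are registered.

```shell
#!/bin/sh
# Hypothetical helpers for adding a new language.

# Succeeds when $1 looks like LL-RR, LL-SS-RR, LL_RR, LL_RR.Enc or LL_RR@SS.
is_locale_id() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq \
        '^[a-z]{2,3}(-([A-Z][a-z]{3}-)?([A-Z]{2}|[0-9]{3})|_[A-Z]{2}(\.[A-Za-z0-9-]+|@[A-Za-z0-9]+)?)$'
}

# Copy estudio.pot to LOCALE_ID.po without overwriting an existing file.
# $1: locale id, $2: directory containing estudio.pot
new_language_po() {
    is_locale_id "$1" || { echo "unexpected locale id: $1" >&2; return 1; }
    [ ! -e "$2/$1.po" ] || { echo "$2/$1.po already exists" >&2; return 1; }
    cp "$2/estudio.pot" "$2/$1.po"
}

# Example call:
# new_language_po zh_CN "$EIFFEL_SRC/Delivery/studio/lang/po_files"
```

Refusing to overwrite an existing .po file protects translations that are already in progress.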

Locale Id for reference

       Afrikaans (South Africa)         af-ZA         
       Amharic (Ethiopia)         am-ET         
       Arabic (U.A.E.)         ar-AE         
       Arabic (Bahrain)         ar-BH         
       Arabic (Algeria)         ar-DZ         
       Arabic (Egypt)         ar-EG         
       Arabic (Iraq)         ar-IQ         
       Arabic (Jordan)         ar-JO         
       Arabic (Kuwait)         ar-KW         
       Arabic (Lebanon)         ar-LB         
       Arabic (Libya)         ar-LY         
       Arabic (Morocco)         ar-MA         
       Arabic (Oman)         ar-OM         
       Arabic (Qatar)         ar-QA         
       Arabic (Saudi Arabia)         ar-SA         
       Arabic (Syria)         ar-SY         
       Arabic (Tunisia)         ar-TN         
       Arabic (Yemen)         ar-YE         
       Mapudungun (Chile)         arn-CL         
       Assamese (India)         as-IN         
       Azeri (Azerbaijan, Cyrillic)         az-Cyrl-AZ         
       Azeri (Azerbaijan, Latin)         az-Latn-AZ         
       Bashkir (Russia)         ba-RU         
       Belarusian (Belarus)         be-BY         
       Bulgarian (Bulgaria)         bg-BG         
       Bengali (India)         bn-IN         
       Tibetan (Bhutan)         bo-BT         
       Tibetan (PRC)         bo-CN         
       Breton (France)         br-FR         
       Bosnian (Bosnia and Herzegovina, Cyrillic)         bs-Cyrl-BA         
       Bosnian (Bosnia and Herzegovina, Latin)         bs-Latn-BA         
       Catalan (Catalan)         ca-ES         
       Corsican (France)         co-FR          -- Note: Corsican is in the msdn table, but has no LCID - maybe in future releases it will get one (corsican nationalists might threaten to blow up Microsoft HQ)
       Czech (Czech Republic)         cs-CZ         
       Welsh (United Kingdom)         cy-GB         
       Danish (Denmark)         da-DK         
       German (Austria)         de-AT         
       German (Switzerland)         de-CH         
       German (Germany)         de-DE         
       German (Liechtenstein)         de-LI         
       German (Luxembourg)         de-LU         
       Lower Sorbian (Germany)         dsb-DE         
       Divehi (Maldives)         dv-MV         
       Greek (Greece)         el-GR         
       English (Caribbean)         en-029         
       English (Australia)         en-AU         
       English (Belize)         en-BZ         
       English (Canada)         en-CA         
       English (United Kingdom)         en-GB         
       English (Ireland)         en-IE         
       English (India)         en-IN         
       English (Jamaica)         en-JM         
       English (Malaysia)         en-MY         
       English (New Zealand)         en-NZ         
       English (Philippines)         en-PH         
       English (Singapore)         en-SG         
       English (Trinidad and Tobago)         en-TT         
       English (United States)         en-US         
       English (South Africa)         en-ZA         
       English (Zimbabwe)         en-ZW         
       Spanish (Argentina)         es-AR         
       Spanish (Bolivia)         es-BO         
       Spanish (Chile)         es-CL         
       Spanish (Colombia)         es-CO         
       Spanish (Costa Rica)         es-CR         
       Spanish (Dominican Republic)         es-DO         
       Spanish (Ecuador)         es-EC         
       Spanish (Spain)         es-ES         
       Spanish (Guatemala)         es-GT         
       Spanish (Honduras)         es-HN         
       Spanish (Mexico)         es-MX         
       Spanish (Nicaragua)         es-NI         
       Spanish (Panama)         es-PA         
       Spanish (Peru)         es-PE         
       Spanish (Puerto Rico)         es-PR         
       Spanish (Paraguay)         es-PY         
       Spanish (El Salvador)         es-SV         
       Spanish (United States)         es-US         
       Spanish (Uruguay)         es-UY         
       Spanish (Venezuela)         es-VE         
       Estonian (Estonia)         et-EE         
       Basque (Basque)         eu-ES         
       Persian (Iran)         fa-IR         
       Finnish (Finland)         fi-FI         
       Filipino (Philippines)         fil-PH         
       Faroese (Faroe Islands)         fo-FO         
       French (Belgium)         fr-BE         
       French (Canada)         fr-CA         
       French (Switzerland)         fr-CH         
       French (France)         fr-FR         
       French (Luxembourg)         fr-LU         
       French (Monaco)         fr-MC         
       Frisian (Netherlands)         fy-NL         
       Irish (Ireland)         ga-IE         
       Dari (Afghanistan)         gbz-AF         
       Galician (Spain)         gl-ES         
       Alsatian (France)         gsw-FR         
       Gujarati (India)         gu-IN         
       Hausa (Nigeria, Latin)         ha-Latn-NG         
       Hebrew (Israel)         he-IL         
       Hindi (India)         hi-IN         
       Croatian (Bosnia and Herzegovina, Latin)         hr-BA         
       Croatian (Croatia)         hr-HR         
       Hungarian (Hungary)         hu-HU         
       Armenian (Armenia)         hy-AM         
       Indonesian (Indonesia)         id-ID         
       Igbo (Nigeria)         ig-NG         
       Yi (PRC)         ii-CN         
       Icelandic (Iceland)         is-IS         
       Italian (Switzerland)         it-CH         
       Italian (Italy)         it-IT         
       Inuktitut (Canada, Syllabics)         iu-Cans-CA         
       Inuktitut (Canada, Latin)         iu-Latn-CA         
       Japanese (Japan)         ja-JP         
       Georgian (Georgia)         ka-GE         
       Khmer (Cambodia)         km-KH         
       Kazakh (Kazakhstan)         kk-KZ         
       Greenlandic (Greenland)         kl-GL         
       Kannada (India)         kn-IN         
       Korean (Korea)         ko-KR         
       Konkani (India)         kok-IN         
       Kyrgyz (Kyrgyzstan)         ky-KG         
       Luxembourgish (Luxembourg)         lb-LU         
       Lao (Lao PDR)         lo-LA         
       Lithuanian (Lithuania)         lt-LT         
       Latvian (Latvia)         lv-LV         
       Maori (New Zealand)         mi-NZ         
       Macedonian (Macedonia, FYROM)         mk-MK         
       Malayalam (India)         ml-IN         
       Mongolian (Mongolia)         mn-Cyrl-MN         
       Mongolian (PRC)         mn-Mong-CN         
       Mohawk (Canada)         moh-CA         
       Marathi (India)         mr-IN         
       Malay (Brunei Darussalam)         ms-BN         
       Malay (Malaysia)         ms-MY         
       Maltese (Malta)         mt-MT         
       Norwegian (Bokmål, Norway)         nb-NO         
       Nepali (India)         ne-IN          --also missing    
       Nepali (Nepal)         ne-NP         
       Dutch (Belgium)         nl-BE         
       Dutch (Netherlands)         nl-NL         
       Norwegian (Nynorsk, Norway)         nn-NO         
       Sesotho sa Leboa/Northern Sotho (South Africa)         ns-ZA         
       Occitan (France)         oc-FR         
       Oriya (India)         or-IN         
       Punjabi (India)         pa-IN         
       Polish (Poland)         pl-PL         
       Pashto (Afghanistan)         ps-AF         
       Portuguese (Brazil)         pt-BR         
       Portuguese (Portugal)         pt-PT         
       K'iche (Guatemala)         qut-GT         
       Quechua (Bolivia)         quz-BO         
       Quechua (Ecuador)         quz-EC         
       Quechua (Peru)         quz-PE         
       Romansh (Switzerland)         rm-CH         
       Romanian (Romania)         ro-RO         
       Russian (Russia)         ru-RU         
       Kinyarwanda (Rwanda)         rw-RW         
       Sanskrit (India)         sa-IN         
       Yakut (Russia)         sah-RU         
       Sami (Northern, Finland)         se-FI         
       Sami (Northern, Norway)         se-NO         
       Sami (Northern, Sweden)         se-SE         
       Sinhala (Sri Lanka)         si-LK         
       Slovak (Slovakia)         sk-SK         
       Slovenian (Slovenia)         sl-SI         
       Sami (Southern, Norway)         sma-NO         
       Sami (Southern, Sweden)         sma-SE         
       Sami (Lule, Norway)         smj-NO         
       Sami (Lule, Sweden)         smj-SE         
       Sami (Inari, Finland)         smn-FI         
       Sami (Skolt, Finland)         sms-FI         
       Albanian (Albania)         sq-AL         
       Serbian (Bosnia and Herzegovina, Cyrillic)         sr-Cyrl-BA         
       Serbian (Serbia and Montenegro, Cyrillic)         sr-Cyrl-CS         
       Serbian (Bosnia and Herzegovina, Latin)         sr-Latn-BA         
       Serbian (Serbia and Montenegro, Latin)         sr-Latn-CS         
       Swedish (Finland)         sv-FI         
       Swedish (Sweden)         sv-SE         
       Swahili (Kenya)         sw-KE         
       Syriac (Syria)         syr-SY         
       Tamil (India)         ta-IN         
       Telugu (India)         te-IN         
       Tajik (Tajikistan)         tg-Cyrl-TJ         
       Thai (Thailand)         th-TH         
       Turkmen (Turkmenistan)         tk-TM         
       Tamazight (Algeria, Latin)         tmz-Latn-DZ         
       Setswana/Tswana (South Africa)         tn-ZA         
       Urdu (India)         ur-IN         
       Turkish (Turkey)         tr-TR         
       Tatar (Russia)         tt-RU         
       Uighur (PRC)         ug-CN         
       Ukrainian (Ukraine)         uk-UA         
       Urdu (Pakistan)         ur-PK         
       Uzbek (Uzbekistan, Cyrillic)         uz-Cyrl-UZ         
       Uzbek (Uzbekistan, Latin)         uz-Latn-UZ         
       Vietnamese (Vietnam)         vi-VN         
       Upper Sorbian (Germany)         wen-DE         
       Wolof (Senegal)         wo-SN         
       Xhosa/isiXhosa (South Africa)         xh-ZA         
       Yoruba (Nigeria)         yo-NG         
       Chinese (PRC)         zh-CN         
       Chinese (Hong Kong SAR, PRC)         zh-HK         
       Chinese (Macao SAR)         zh-MO         
       Chinese (Singapore)         zh-SG         
       Chinese (Taiwan)         zh-TW         
       Zulu/isiZulu (South Africa)         zu-ZA