Software localization
Revision as of 15:29, 22 January 2010
This article or section is currently under construction
In principle, someone is working on it and there should be a better version in a not so distant future.
If you want to modify this page, please discuss it with the person working on it (see the "history")
Introduction
Software localization (or localisation) could simply mean "translation of software to another language", including adaptation of some formats (e.g. measures and dates) and currency. But usually, software localization implies more.
Firstly, software should be internationalized (I18N), i.e. designed in a way that it can be adapted to various languages and regions without engineering changes to the programming logic. In principle, it is "stuff that has to be done once". But sometimes I18N is done in stages, e.g. developers often forget that some languages are more verbose than others (need more space) or are written in other directions (e.g. right-to-left).
Localization ("L10N") means adaptation of a product following the needs of a particular population in a precise geographic region. Such a definition implies that translation includes linguistic, cultural and ergonomic aspects (Le grand dictionnaire terminologique). McKethan and White (2005) define localization as “the process of adapting an internationalized product to a specific language, script, cultural, and coded character set environment. In localization, the same semantics are preserved while the syntax may be changed.” The authors further argue that “Localization goes beyond mere translation. The user must be able to not only select the desired language, but other local conventions as well. For instance, one can select German as a language, but also Switzerland as the specific locale of German. Locale allows for national or locale-specific variations on the usage of format, currency, spellchecker, punctuation, etc., all within the single German language area.”
Gregory M. Shreve (retrieved 16:27, 21 January 2010 (UTC)) adds adaptation of "non-textual materials". “Localization is the process of preparing locale-specific versions of a product and consists of the translation of textual material into the language and textual conventions of the target locale and the adaptation of non-textual materials and delivery mechanisms to take into account the cultural requirements of that locale.”
Usually, language localization extends to national subcultures. E.g. German would be divided into de-de (Germany) and other versions for Switzerland and Austria. National versions of a language include different words, different spellings (like "localization" vs. "localisation"), and sometimes different grammar. Conversely, a multilingual country like Switzerland would have Swiss German (de-ch), Swiss French (fr-ch) and Swiss Italian (it-ch). The latter would have in common the way date/time, decimals and currency are represented. Software translators may adopt one of the following strategies:
- Create a generic translation for one language, e.g. fr, and then add specific local variants like fr_fr, fr_be on top of it.
- Create only one translation, e.g. fr_fr, and have users cope with it. Date/time and decimal/currency representation differences should be handled though. Most often, this is the case in free open source software.
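The first strategy can be sketched as a fallback chain: a string is looked up in the regional variant first, then in the generic language, then in a default language. The following Python fragment is only an illustration under stated assumptions; the catalog contents, the key names and the function names are hypothetical and do not come from any particular project.

```python
# Hypothetical catalogs: a generic "fr" translation plus a small "fr_be"
# variant that only overrides what differs (illustrative data only).
CATALOGS = {
    "en": {"cancel": "Cancel", "seventy": "seventy"},
    "fr": {"cancel": "Annuler", "seventy": "soixante-dix"},
    "fr_be": {"seventy": "septante"},
}


def fallback_chain(locale, default="en"):
    """Return the lookup order for a locale such as 'fr_be'."""
    chain = [locale]
    if "_" in locale:
        # Generic language part, e.g. 'fr' for 'fr_be'
        chain.append(locale.split("_", 1)[0])
    if default not in chain:
        chain.append(default)
    return chain


def translate(key, locale):
    """Look the key up in each catalog of the chain; return the key itself
    if no catalog defines it."""
    for name in fallback_chain(locale):
        catalog = CATALOGS.get(name, {})
        if key in catalog:
            return catalog[key]
    return key


print(fallback_chain("fr_be"))
print(translate("seventy", "fr_be"), translate("cancel", "fr_be"))
```

With such a layout, a regional variant only needs to contain the strings that really differ, which keeps the extra work for volunteers small.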
Finally, the term software globalization (G11N), also known as National Language Support, refers to the combination of software internationalization (I18N) and localization (L10N).
Let's recapitulate and explain the abbreviations for internationalization, localization and globalization:
Internationalization, known as I18N, is abbreviated with a so-called numeronym, where 18 stands for the number of letters between the first i and the last n in internationalization.
Localization, known as l10n (or L10N), is composed of the l of localization, followed by 10 for the ten letters in between (ocalizatio), and the final n of localization.
Software globalization in short: G11N = I18N + L10N
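The construction of these abbreviations can be checked with a few lines of Python. This is just a small illustration; the function name is made up for the example.

```python
def numeronym(word):
    """Abbreviate a word as first letter + number of inner letters + last letter."""
    if len(word) < 3:
        return word  # nothing to abbreviate
    return f"{word[0]}{len(word) - 2}{word[-1]}"


for term in ("internationalization", "localization", "globalization"):
    print(term, "->", numeronym(term))
```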
Issues
Since we shall focus on translation of open source software, we shall stress the importance of "ergonomic aspects" as the #1 priority of localization. Ergonomic translation means both "surface usability" (users can understand the meaning of user interface elements and system messages) and cognitive ergonomics (users can get meaningful tasks done with the system).
Infrastructure and people
In a larger project, the list of types of participants can be quite long: E.g. Gregory M. Shreve identifies: Project managers, Translators (Generic), Localization Translators (Specialists), Terminologists, Internationalization/Localization Engineers (Software Background), Proofreaders, QA specialists, Testing engineers, Multilingual Desktop publishing specialists.
Now, what would be the absolute minimal roles in a volunteer-based open source project?
- one person to coordinate software development and translation
- one person to coach software string translators (can be the one above)
- one person to coach manual writing, including a glossary. (can be the one above)
- translators
- users that provide feedback about ergonomics (difficult to get) and spelling
Target
The target of I18N and L10N should be thought of in terms of the whole system. A software product may include:
- User documentation (includes several genres and available through several media, e.g. the software itself, HTML, paper/PDF)
- Manuals
- Short contextual help
- Glossary
- Tutorials
- ....
- Software (user)
- Menus and Icons (visual command language)
- Messages (various output)
- Command languages (maybe)
- ....
- Software documentation
- Documentation of language constants (and other useful elements)
- Developer manuals (maybe)
Since translation work is often split up between several volunteers (e.g. software modules, manuals, tutorials), there ought to be some potential for synergy that FOSS developers might try to develop. E.g. people writing the tutorials should also comment on the meaningfulness of system messages as well as on other usability issues, and not just try to adapt the user to the system.
On the code side
- Language files
- All output messages to the user must be defined as a kind of constant that the programmers will use
- Name of the constant should be meaningful to translators. E.g.
- Language files must be separate (if terms are not in a database)
- Encoding
- Use Unicode
- Space and Layout
- Space of text fields: Some languages are more verbose and one must plan for that, by using wider icons, menu items, user input fields and such or else use a "fluid" design.
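The points about language files and encoding can be illustrated with a minimal Python sketch. It assumes a simple "constant = text" file format stored as UTF-8, one file per language; the format, the sample constants and the function names are assumptions made for this example and not the format of any specific product.

```python
from pathlib import Path

# Hypothetical content of a French language file, e.g. "fr.properties".
SAMPLE_FR = """\
# Main menu
modulename.mainmenu.edit = Éditer
modulename.mainmenu.cancel = Annuler
"""


def parse_language_text(text):
    """Parse 'constant = text' lines; blank lines and # comments are skipped."""
    strings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            strings[key.strip()] = value.strip()
    return strings


def load_language_file(path):
    """Read a language file, always decoding it as UTF-8."""
    return parse_language_text(Path(path).read_text(encoding="utf-8"))


def message(strings, constant):
    """Programmers refer to the constant only; fall back to its name if untranslated."""
    return strings.get(constant, constant)


strings = parse_language_text(SAMPLE_FR)
print(message(strings, "modulename.mainmenu.edit"))
```

Returning the constant name for a missing string is one possible design choice: an untranslated label then shows up visibly in the interface instead of causing an error.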
Managing volunteers in an open source project
Open source projects don't have the funding to pay professional translators. This situation has disadvantages but also some advantages.
- Disadvantages: Quality of the translation, completion (untranslated strings for new versions, missing languages, etc.)
- Advantages: Meaningfulness of the translation (often translators are users, i.e. they have know-how of the tool which "normal affordable" translators do not have).
I (16:27, 21 January 2010 (UTC)) believe that there ought to be some strategies to improve volunteer translation efforts. The main issues:
(1) Motivating people to help and to continue translating
(2) Make sure that translation is usable by providing a decent enough translation support environment (see next item)
Technical infrastructure
Translators should "see" what they translate. This concerns several items. When translating a string, the translator should be able to see:
- The name of the constant (which must be meaningful, e.g. "modulename.mainmenu.edit" or "modulename.errormsg.upload.xxx")
- A meaningful short description. This description may include a link to a glossary.
- All other translations (language strings) the translator understands (e.g. if I translate to German, I'd like to see both English and French)
- (If possible) the constant displayed in the interface. That of course requires extra programming. Or even better: be able to edit strings directly on the interface
- Tools for consistency
- When translating hundreds of strings (and the situation gets worse if it's done by several people) there should be a way to search through all terms in all modules in three ways:
- Find same expressions in the target language and display another language next to it
- Find same expressions in another language and display the target strings next to it
- Be able to edit and consult a short glossary that includes the most important terms (might be combined with the general user manual)
- (dreaming) direct access to some online translation dictionary like the English/French grand dictionnaire terminologique
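The search described in the consistency items above can be sketched in a few lines of Python. The catalogs, the label names and the function are hypothetical; the sketch only assumes that each language is available as a mapping from constant name to text.

```python
# Hypothetical labels from two modules, in the source and the target language.
CATALOGS = {
    "en": {
        "tool.cancel": "Cancel",
        "upload.cancel": "Cancel upload",
        "tool.save": "Save",
    },
    "fr": {
        "tool.cancel": "Annuler",
        "upload.cancel": "Abandonner l'envoi",
        "tool.save": "Enregistrer",
    },
}


def search_labels(catalogs, query, search_lang, show_lang):
    """Find labels whose text in search_lang contains the query and show the
    text of the same label in show_lang next to it (None if untranslated)."""
    needle = query.casefold()
    hits = []
    for key, text in sorted(catalogs.get(search_lang, {}).items()):
        if needle in text.casefold():
            hits.append((key, text, catalogs.get(show_lang, {}).get(key)))
    return hits


for key, source, target in search_labels(CATALOGS, "cancel", "en", "fr"):
    print(f"{key}: {source!r} / {target!r}")
```

In this made-up data, the search immediately shows that "Cancel" was translated in two different ways, which is exactly the kind of inconsistency a translator wants to detect. Swapping the two language arguments gives the search in the other direction.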
Examples
Mozilla Projects / Firefox
Let's examine a few features of the Mozilla L10N strategy.
In the Mozilla project, localization strings are managed through XUL. The following fragment defines two strings to be displayed as so-called tokens
<caption label="&identityTitle.label;"/>
<description>&identityDesc.label;</description>
identityTitle.label and identityDesc.label will be substituted by strings defined in a DTD as entities.
<!ENTITY identityTitle.label "Identity">
<!ENTITY identityDesc.label "Each account has an identity, which is the ↵
↳ information that other people see when they read your messages.">
("↵ ↳" indicates a single line, broken for readability.) If you want to find more example constants defined as XML entities, locate the Firefox installation directory on your computer and examine the chrome/ab-CD.jar file, e.g. en-US.jar.
The L10N tools include
- Use of a text editor that can handle UTF-8 files
- A langpack2cvstree.sh script that converts the en-US language package into another locale.
- A command line/web tool, called compare locales: finds missing and obsolete strings in a localization
- MozillaBuild: An easy way to install everything you need to checkout/pull and checkin/push your localization and run compare-locales on Windows.
- Mozilla Translator is a tool to help translate programs
- Narro is a web application that allows online translation and coordination. You can see it in operation at l10n.mozilla.org
- Translate Toolkit (moz2po and po2moz): converts various sorts of Mozilla files to Gettext PO format for translation efforts using a PO editor and the other way round. It's used by Pootle for example (see below).
- MozLCDB, similar to PO but more dedicated to Mozilla products
- Pootle a web server for localisation that allows web-based contributions and management. Combined with the Translate Toolkit it allows Mozilla products to be localised online.
- Virtaal is an off-line PO editor developed by the Pootle team.
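To illustrate what "finding missing and obsolete strings" means for entity-based language files, here is a rough Python sketch. It is not the actual compare-locales tool and does not reproduce its behaviour; the regular expression only covers simple entity declarations like the ones shown above, and the French sample text and the extra entity names are invented for the example.

```python
import re

# Matches simple declarations such as: <!ENTITY some.name "Some text">
ENTITY_RE = re.compile(r'<!ENTITY\s+([\w.\-]+)\s+(["\'])(.*?)\2\s*>', re.DOTALL)

EN_US = '''
<!ENTITY identityTitle.label "Identity">
<!ENTITY identityDesc.label "Each account has an identity.">
'''

# Hypothetical localization: one string not yet translated, one left over.
FR = '''
<!ENTITY identityTitle.label "Identité">
<!ENTITY oldTitle.label "Ancien titre">
'''


def parse_entities(dtd_text):
    """Return a dict mapping entity names to their text."""
    return {m.group(1): m.group(3) for m in ENTITY_RE.finditer(dtd_text)}


def compare(reference, localized):
    """Return (missing, obsolete) entity names of a localization."""
    missing = sorted(set(reference) - set(localized))
    obsolete = sorted(set(localized) - set(reference))
    return missing, obsolete


missing, obsolete = compare(parse_entities(EN_US), parse_entities(FR))
print("missing:", missing)
print("obsolete:", obsolete)
```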
We wonder a bit who uses which tools. If we understand the situation right, there are several ways to translate as long as the translation does find its way back into the CVS at some point.
Also, it is not surprising that the project makes a clear distinction between "official releases" and others. Official releases include translation of the installation and migration process, localizing the start page and other web pages built into the product, customizing settings like "live bookmarks", locally relevant search engine plugins, and more.
Now I wonder if translating learning management systems into different languages could imply using a different pedagogical vocabulary. E.g. "French didactics" vs. "Belgian instructional design" vs. "Canadian constructivism" (without any English word) vs. Swiss "let's have a bit of all".
LAMS
LAMS is a system for authoring and delivering learning activities, i.e. a learning design and learning activity management software. Since I made a little commitment to help with translation, I also shall insert some comments regarding features and UI elements that we could make better - Daniel K. Schneider 14:29, 22 January 2010 (UTC).
This open source project developed an original online solution to support translation. Software translators will use two tools:
- An internationalization site (see below)
- A LAMS Translators server (that is upgraded after a translation effort so that a translator can check the look and feel on a live system and make sure that there are no conceptual errors)
- Editing translation strings
Strings to translate are called labels. A string may include slots to be filled in with data; these use the syntax {0}, {1}, etc.
The LAMS Internationalization site presents a list of available software modules and within each module, completion is shown as the following picture shows.
Translators can display a module. Each label is displayed as a list of
- English string, a French translation, update date, translator
The translator then can choose between:
- editing an isolated label
- bulk translate all labels (date and author information of all tags will be overwritten)
- translate only missing labels
Below are a few screenshots that illustrate the principle
- Homogenization
For the moment, there is no support for language-specific glossary and terminology management.
Good translators may develop several strategies to make sure that a common terminology is used.
Terminology must be homogeneous, e.g. the English word "Cancel" should normally translate to the same French word, e.g. "annuler" (vs. "abandonner"). One could also argue that translators sometimes have to use less meaningful words, because at some point MS made certain decisions and people are used to them.
"Non-natural" terminology like "branching" or "gates" raises additional challenges. In French I translated "gate" to "verrou" after looking at the German "Sperre". For various reasons I don't like the obvious translation of "Gate" to "Portail" (meaning "Portal"). Verrou means "lock" but could also mean a barrier in some geological sense. Anyhow, if several translators participate or if one translator translates once in a while, there is a very high chance that very different words are being used to deal with the same object. Also in my case, only once I decided to use LAMS for real did I stumble upon terminology (created by myself) that I found really unclear.
Fortunately in LAMS (and this is a mission-critical feature), translators can search for labels in any language (Note: actually not true; only in English and the target language) and then see the result in a similar fashion as in the bulk translate window.
- Testing
Testing is basically done by playing with a translators' test server. Translators have full rights, e.g. they can explore the admin interface, create new teacher and student users and play with these. When the project manager sees that important new translations have been committed, he will update the test server, in particular in the weeks before a software upgrade.
- Under the hood
- Labels are Java properties and may include slots for arguments, e.g.
The {0} cannot be within the range of an existing condition.
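How such numbered slots behave can be imitated with a short Python sketch. This is only an approximation for illustration: real Java message formatting has additional rules (e.g. for quoting) that are not reproduced here, and the French sentence below is an invented example, not the actual LAMS translation. The check at the end shows one thing a translation tool could verify, namely that the translation uses the same slots as the English label.

```python
import re

SLOT_RE = re.compile(r"\{(\d+)\}")


def slots(label):
    """Return the sorted slot numbers used in a label, e.g. [0, 1]."""
    return sorted({int(n) for n in SLOT_RE.findall(label)})


def fill(label, *args):
    """Replace each {n} by the n-th argument."""
    return SLOT_RE.sub(lambda m: str(args[int(m.group(1))]), label)


english = "The {0} cannot be within the range of an existing condition."
# Illustrative translation only
french = "{0} ne peut pas se trouver dans l'intervalle d'une condition existante."

print(fill(english, "start value"))
print("same slots:", slots(english) == slots(french))
```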
Technical issues
Document Formats
In open source, there exist several strategies:
- Gettext is based on the idea that the keys used to retrieve local language strings correspond to the original strings used in the source code. Documentation is also added as a programming comment just before the corresponding line. From the source code, so-called .PO files are extracted that are then used by translators.
- XLIFF (XML Localization Interchange File Format) is an XML-based format created to standardize localization. XLIFF was standardized by OASIS in 2002.
There exist converters from PO to XLIFF.
Software
- PO editors
- See gettext
Links
- Definitions
- Organizations
- How-to
- Localisation Guide and Document translation at Sourceforge.
- How to Localize Software (Developer-resource.com, retrieved 16:27, 21 January 2010 (UTC)).
- Software Localization versus Translation, TranslatorsCafé.com, by Alexander Schunk. Submitted on March 22, 2008
- for Developing Non-English Web Sites by Teaching and Learning with Technology, Penn State University (http://tlt.psu.edu/); includes several good web pages, e.g. about the HTML "lang" attribute.
- Internationalization
- Dotnet-culture.net provides information about date/time and decimal/currency representation.
- About language file formats
- XLIFF: An Aid To Localization by John Corrigan and Tim Foster, Sun Developer Network.
- XLIFF (Wikipedia)
- Example Project and languages
- Mozilla
- Does Internalization through the XUL User Interface Language
- L10n:Home Page
- Mozilla Localization Project (Archives)
- Courses
- Software Localization MCLS 600012 taught by Gregory M. Shreve. (2001, retrieved 16:27, 21 January 2010 (UTC)). Includes PPT and HTML files for reading. Also available here
- Lecture 1
- What is Software Localization? A list by Alexa Dubreuil that summarizes the course syllabus
- Software
- Translate Toolkit (Wikipedia)
- More: Trados® Freelance™, Atril Déjà Vu, STAR Transit, SDLX™, IBM TranslationManager
- Indexes
- Translators’ On-Line Resources, hosted by the Translation Journal.
Bibliography
- Dohler, Per N. (1997). Facets of Software Localization, A Translator's View. Translation Journal 1, July 1997. (retrieved 16:27, 21 January 2010 (UTC))
- McKethan, Kenneth A. (Sandy) Jr. and Graciela White (2005). Demystifying Software Globalization, Translation Journal 9 (2), April 2005. HTML, retrieved 16:27, 21 January 2010 (UTC).
- Esselink, Bert (2000). A Practical Guide to Localization, John Benjamins Publishing, ISBN 1-58811-006-0