Difference between revisions of "Pseudo Localization"

From Lingoport Wiki
(Video)
 
(41 intermediate revisions by 2 users not shown)
Line 1: Line 1:
  +
== Video ==
  +
For an introduction video on the subject, please see:
  +
  +
[[File:Pseudo-Localization Thumbnail.jpg|200px|link=https://www.youtube.com/watch?v=EdJ-oPhP0CY&ab_channel=Lingoport]]
  +
  +
which covers the topic on this Wiki page.
  +
 
== What is Pseudo-Localization? ==
 
== What is Pseudo-Localization? ==
 
Pseudo-localization is an effective way to test the localization-readiness of an application.  By pseudo-localizing the resource files, an application can be tested for internationalization without waiting for localization.
 
Pseudo-localization is an effective way to test the localization-readiness of an application.  By pseudo-localizing the resource files, an application can be tested for internationalization without waiting for localization.
   
  +
A '''pseudo-locale''' is treated like a regular locale, such as de-DE for German, but instead of translating the source strings, it modifies them in the following way:
Creating pseudo-localized resource files helps identify embedded strings, text that was externalized but should not be, text expansion issues, character-encoding problems, text concatenation issues, and UI boundary issues.
 
 
An application typically retrieves strings based on a locale, such as French. A pseudo-locale is like a normal locale, but the strings are not translated; they are simply displayed differently, as in the example below:
 
   
  +
A source string like:
[File:Pseudo-localization.jpg|500px]
 
  +
* '''Order placed successfully!'''
  +
will be pseudo-localized as:
  +
* '''[Öŕðéŕ þļåçéð šûççéššƒûļļý!------------- П國カ내]'''
   
   
  +
The accented pseudo-localized string is still legible, so a developer or QA engineer can still run and navigate the application. A start character, [, is added at the beginning of the string. The string is expanded to reflect the possible width of target-locale strings. Some characters from other writing systems are added to check for encoding or font issues. An end character, ], is added to show where the source string ends, which helps detect concatenations.
== config_pseudo_loc.xml ==
 
The file '''<code>config_pseudo_loc.xml</code>''' is the configuration file that sets up pseudo-localization instructions for LRM resource files. It is located in the global '''<code><HOME>/Lingoport_Data/L10nStreamlining/config</code>''' folder. If you need different criteria for a group or a project, the file can be copied to '''<code><HOME>/Lingoport_Data/L10nStreamlining/<group>/config</code>''' or '''<code><HOME>/Lingoport_Data/L10nStreamlining/<group>/projects/<project>/config</code>''' respectively.
 
   
  +
When the application uses the pseudo-locale, the pseudo-localized resource text will be displayed.
The '''config_pseudo_loc.xml''' file is used when running the <code>--pseudo-loc</code> command. A pseudo-localized file will be created for each base resource file if the project has a designated pseudo-locale.
 
   
  +
Creating pseudo-localized resource files helps test for:
There are 2 categories of information contained in the ''config_pseudo_loc.xml'' file.
 
  +
* embedded strings,
* pseudo localization instructions
 
  +
* text that was externalized but should not be,
* regex pattern for each parser type defining the parameters as well as special characters that should not be pseudo-localized.
 
  +
* text expansion issues,
  +
* character-encoding problems,
  +
* text concatenation issues, and
  +
* UI boundary issues.
   
  +
An application typically retrieves strings based on a locale, such as French. A pseudo-locale is like a normal locale, but the strings are not translated; they are simply displayed differently.
=== Pseudo Localization Instructions ===
 
The configuration consists of the following xml elements:
 
* '''expansion lengths''' - defines the expansion percentage based on a string's length. This is useful for testing language expansion.
 
* '''expansion character''' - the character that will be used to pad the string in order to simulate string expansion
 
* '''expansion end character''' - characters at the end of the expansion. Typically, these are symbols across a range of code points, used to ensure that content from the various code pages is supported
 
* '''start and end characters''' - characters at the start and end of the resource string. This is useful to catch resource strings that are concatenated within the code.
 
* '''accents''' - a flag indicating whether the resource string should be accented. This is needed to ensure that all visible user strings are localized.
 
   
  +
== How to identify i18n Issues with Pseudo-localization ==
==== Typical Configuration Example ====
 
  +
This section shows how to use pseudo-localization to find and correct issues.
   
  +
=== UI in the source Locale ===
&lt;expansion&gt;
 
&lt;lengths&gt;
 
&lt;length max="10" expand-percentage="200"/&gt;
 
&lt;length max="20" expand-percentage="100"/&gt;
 
&lt;length max="30" expand-percentage="80"/&gt;
 
&lt;length max="50" expand-percentage="60"/&gt;
 
&lt;length max="70" expand-percentage="40"/&gt;
 
&lt;length max="" expand-percentage="30"/&gt;
 
&lt;/lengths&gt;
 
&lt;expansion-char>-&lt;/expansion-char&gt;
 
 
&lt;!-- П: A Cyrillic character - (for Russian or ‘R’) --&gt;
 
&lt;!-- 國: A common Chinese character (Hanzi, Kanji or Hanja) defined in Unicode BMP or Plane 0 – (for Chinese or ‘C’) --&gt;
 
&lt;!-- カ: A Katakana character (used for Japanese or ‘J’) --&gt;
 
&lt;!-- 내: A Hangul character (for Korean or ‘K’) --&gt;
 
&lt;expansion-end-chars&gt; П國カ내&lt;/expansion-end-chars&gt;
 
&lt;/expansion&gt;
 
&lt;start-char>[&lt;/start-char&gt;
 
&lt;end-char>]&lt;/end-char&gt;
 
&lt;accent-chars>1&lt;/accent-chars&gt;
 
   
  +
First, let's show the user interface in the source locale, here English/US:
Based on this configuration, an example pseudo-localization would be:
 
   
  +
[[File:Source Locale UI.jpg|center|700px]]
Base Resource string: "This is an example using the typical configuration"
 
Pseudo-localized string: "[Ţĥîš îš åñ éẋåɱþļé ûšîñĝ ţĥé ţýþîçåļ çöñƒîĝûŕåţîöñ------------------------П國カ내]"
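
The length thresholds in the typical configuration can be read as a lookup table. The following Python snippet is only a sketch of that idea, not the LRM implementation: the function names are hypothetical, the empty <code>max=""</code> entry is assumed to be the catch-all, each <code>max</code> is assumed to be inclusive, and the size is rounded down. How the tool itself rounds, and whether the start, end, and expansion end characters count toward the expansion, is not stated here.

```python
# (max_length, expand_percentage) pairs copied from the typical configuration.
# None stands for the empty max="" entry, assumed here to be the catch-all.
LENGTHS = [(10, 200), (20, 100), (30, 80), (50, 60), (70, 40), (None, 30)]


def expand_percentage(text, lengths=LENGTHS):
    """Return the expansion percentage of the first threshold that fits."""
    for max_len, pct in lengths:
        # Assumption: "max" is inclusive, so a 50-character string uses 60%.
        if max_len is None or len(text) <= max_len:
            return pct
    return 0  # no catch-all entry configured: no expansion


def expansion_size(text, lengths=LENGTHS):
    """Number of extra characters to add, rounded down (an assumption)."""
    return len(text) * expand_percentage(text, lengths) // 100


if __name__ == "__main__":
    for sample in ("OK", "Order placed successfully!"):
        print(len(sample), expand_percentage(sample), expansion_size(sample))
```

Short strings get the largest relative expansion because short labels such as button captions tend to grow the most in translation.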
 
   
==== No Expansion Configuration Example ====
 
Edit the ''config_pseudo_loc.xml'' for no expansion by setting the ''expand-percentage'' to 0 and not entering any characters for the ''expansion character'' as well as the ''expansion end characters''.
 
   
  +
=== UI in the pseudo-locale with issues ===
&lt;expansion&gt;
 
  +
Below, the application is running with the pseudo-locale (for example, Esperanto), and the English strings have been pseudo-localized.
&lt;lengths&gt;
 
&lt;length max="" expand-percentage="0"/&gt;
 
&lt;/lengths&gt;
 
&lt;expansion-char>&lt;/expansion-char&gt;
 
&lt;expansion-end-chars&gt;&lt;/expansion-end-chars&gt;
 
&lt;/expansion&gt;
 
&lt;start-char>[&lt;/start-char&gt;
 
&lt;end-char>]&lt;/end-char&gt;
 
&lt;accent-chars>1&lt;/accent-chars&gt;
 
   
Based on this configuration, an example pseudo-localization would be:
 
   
  +
Some issues can be identified when the application is running using the pseudo-locale:
Base Resource string: "This is an example of no expansion configuration"
 
Pseudo-localized string: "[Ţĥîš îš åñ éẋåɱþļé öƒ ñö éẋþåñšîöñ çöñƒîĝûŕåţîöñ]"
 
   
  +
[[File:Pseudo-Localization Example Issues.jpg|center|700px]]
==== End Character Symbols with no Expansion Configuration Example ====
 
Edit the ''config_pseudo_loc.xml'' for no expansion by setting the ''expand-percentage'' to 0 and not entering any characters for the ''expansion character'' or the ''expansion end characters''. Add appropriate symbols to the ''end characters'' to ensure that content from the various code pages is supported.
 
&lt;expansion&gt;
 
&lt;lengths&gt;
 
&lt;length max="" expand-percentage="0"/&gt;
 
&lt;/lengths&gt;
 
&lt;expansion-char>&lt;/expansion-char&gt;
 
&lt;expansion-end-chars&gt;&lt;/expansion-end-chars&gt;
 
&lt;/expansion&gt;
 
&lt;start-char>[&lt;/start-char&gt;
 
&lt;!-- П: A Cyrillic character - (for Russian or ‘R’) --&gt;
 
&lt;!-- 國: A common Chinese character (Hanzi, Kanji or Hanja) defined in Unicode BMP or Plane 0 – (for Chinese or ‘C’) --&gt;
 
&lt;!-- カ: A Katakana character (used for Japanese or ‘J’) --&gt;
 
&lt;!-- 내: A Hangul character (for Korean or ‘K’) --&gt;
 
&lt;end-char> П國カ내]&lt;/end-char&gt;
 
&lt;accent-chars>1&lt;/accent-chars&gt;
 
   
  +
* <span style="color:red">1 : Truncation</span>: The end characters have been truncated, which indicates a likely UI issue with the space allotted for the text. The UI may need to be refactored to accommodate longer text in languages such as German.
Based on this configuration, an example pseudo-localization would be:
 
  +
* <span style="color:red">2 : Mojibake</span>: When the pseudo-localized text shows as mojibake, such as �, this likely indicates a character encoding or font issue: the application may not support non-ASCII or non-Latin-1 characters.
  +
* <span style="color:red">3 : Embedded String / Hard-Coded String</span>: If the text shows in the original source locale, here English, instead of being pseudo-localized, it is likely a hard-coded string that has not been externalized into a resource file. That string cannot be sent for translation and will show in the interface as the original source string. This is a common internationalization issue.
  +
* <span style="color:red">4: Concatenation</span>: The pseudo-localization shows that two strings have been put together to make up ''Y-Wing Galactic Fighter'', since the end character shows up after ''Galactic Fighter'' and a start character shows up before ''Fighter''.
   
  +
=== UI in the pseudo-locale without issues ===
Base Resource string: "This is an example of no expansion configuration"
 
  +
If the i18n issues above were corrected, running the application using the pseudo-locale would look like the following:
Pseudo-localized string: "[Ţĥîš îš åñ éẋåɱþļé öƒ ñö éẋþåñšîöñ ŵîţĥ éñð çĥåŕåçţéŕš šýɱƀöļš çöñƒîĝûŕåţîöñ П國カ내]"
 
   
  +
[[File:Pseudo-Localization No Issues.jpg.png|center|700px]]
   
==== Accent Only Configuration Example ====
+
== Configuration ==
Edit the ''config_pseudo_loc.xml'' for no expansion by setting the ''expand-percentage'' to 0 and not entering any characters for the ''expansion character'' as well as the ''expansion end characters''. Remove any characters defined in the ''start and end characters''.
 
   
  +
Localyzer makes pseudo-localization quite simple: for the resource files in a repository to be pseudo-localized, edit the Locales by selecting the pseudo-locale checkbox and entering the locale to be used, as shown below with '''eo''' (Esperanto):
&lt;expansion&gt;
 
&lt;lengths&gt;
 
&lt;length max="" expand-percentage="0"/&gt;
 
&lt;/lengths&gt;
 
&lt;expansion-char>&lt;/expansion-char&gt;
 
&lt;expansion-end-chars&gt;&lt;/expansion-end-chars&gt;
 
&lt;/expansion&gt;
 
&lt;start-char>&lt;/start-char&gt;
 
&lt;end-char>&lt;/end-char&gt;
 
&lt;accent-chars>1&lt;/accent-chars&gt;
 
   
  +
[[File:Pseudo-Localization Configuration.jpg|700px|center]]
Based on this configuration, an example pseudo-localization would be:
 
   
  +
Once this configuration is saved, the resource files will be pseudo-localized every time the project is analyzed.
Base Resource string: "This is an example of accent only configuration"
 
Pseudo-localized string: "Ţĥîš îš åñ éẋåɱþļé öƒ åççéñţ öñļý çöñƒîĝûŕåţîöñ"
 

Latest revision as of 21:50, 10 April 2024

Video

For an introduction video on the subject, please see:

 Pseudo-Localization Thumbnail.jpg (https://www.youtube.com/watch?v=EdJ-oPhP0CY&ab_channel=Lingoport)

which covers the topic on this Wiki page.

What is Pseudo-Localization?

Pseudo-localization is an effective way to test the localization-readiness of an application.  By pseudo-localizing the resource files, an application can be tested for internationalization without waiting for localization.

A pseudo-locale is treated like a regular locale, such as de-DE for German, but instead of translating the source strings, it modifies them in the following way:

A source string like:

  • Order placed successfully!

will be pseudo-localized as:

  • [Öŕðéŕ þļåçéð šûççéššƒûļļý!------------- П國カ내]


The accented pseudo-localized string is still legible, so a developer or QA engineer can still run and navigate the application. A start character, [, is added at the beginning of the string. The string is expanded to reflect the possible width of target-locale strings. Some characters from other writing systems are added to check for encoding or font issues. An end character, ], is added to show where the source string ends, which helps detect concatenations.
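
The transformation described above can be sketched in a few lines of Python. This is an illustration only, not Localyzer's implementation: the function name, its parameters, and the 50% default expansion are assumptions. The accent map is built from the substitutions visible in the examples on this page, and letters that do not appear in those examples are left unchanged.

```python
# Accent map taken from the substitutions visible in this page's examples.
# Letters not shown there (j, k, q, v, z, most capitals) stay unchanged.
ACCENTS = str.maketrans({
    "a": "å", "b": "ƀ", "c": "ç", "d": "ð", "e": "é", "f": "ƒ",
    "g": "ĝ", "h": "ĥ", "i": "î", "l": "ļ", "m": "ɱ", "n": "ñ",
    "o": "ö", "p": "þ", "r": "ŕ", "s": "š", "t": "ţ", "u": "û",
    "w": "ŵ", "x": "ẋ", "y": "ý", "O": "Ö", "T": "Ţ",
})


def pseudo_localize(text, expand_percentage=50, pad_char="-",
                    end_chars=" П國カ내", start_char="[", end_char="]"):
    """Accent the text, pad it, and wrap it in start/end markers."""
    accented = text.translate(ACCENTS)
    # Assumption: padding is a fixed percentage of the source length,
    # rounded down. The real tool may compute this differently.
    padding = pad_char * (len(text) * expand_percentage // 100)
    return f"{start_char}{accented}{padding}{end_chars}{end_char}"


if __name__ == "__main__":
    print(pseudo_localize("Order placed successfully!"))
```

Keeping each marker as a separate parameter makes it easy to switch off one aspect at a time, for example passing empty strings for the padding and end characters to test accents alone.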

When the application uses the pseudo-locale, the pseudo-localized resource text will be displayed.

Creating pseudo-localized resource files helps test for:

  • embedded strings,
  • text that was externalized but should not be,
  • text expansion issues,
  • character-encoding problems,
  • text concatenation issues, and
  • UI boundary issues.

An application typically retrieves strings based on a locale, such as French. A pseudo-locale is like a normal locale, but the strings are not translated; they are simply displayed differently.
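
As a rough illustration of how an application might pick up a pseudo-locale exactly like any other locale, the Python sketch below uses a plain dictionary in place of real resource files. The key name, the French text, the fallback behavior, and the use of "eo" as the pseudo-locale code are all assumptions made for the example.

```python
# Hypothetical in-memory resource bundles, keyed by locale code.
# "eo" plays the role of the pseudo-locale in this sketch.
RESOURCES = {
    "en": {"order.success": "Order placed successfully!"},
    "fr": {"order.success": "Commande passée avec succès !"},
    "eo": {"order.success": "[Öŕðéŕ þļåçéð šûççéššƒûļļý!------------- П國カ내]"},
}


def get_string(key, locale, source_locale="en"):
    """Look up a string for a locale, falling back to the source locale."""
    bundle = RESOURCES.get(locale, {})
    return bundle.get(key, RESOURCES[source_locale][key])


if __name__ == "__main__":
    for locale in ("en", "fr", "eo"):
        print(locale, get_string("order.success", locale))
```

Because the lookup code does not treat "eo" specially, any string that still shows in plain English while the pseudo-locale is active did not come through this path, which is what makes hard-coded strings easy to spot.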

How to identify i18n Issues with Pseudo-localization

This section shows how to use pseudo-localization to find and correct issues.

UI in the source Locale

First, let's show the user interface in the source locale, here English/US:

Source Locale UI.jpg


UI in the pseudo-locale with issues

Below, the application is running with the pseudo-locale (for example, Esperanto), and the English strings have been pseudo-localized.


Some issues can be identified when the application is running using the pseudo-locale:

Pseudo-Localization Example Issues.jpg
  • 1 : Truncation: The end characters have been truncated, which indicates a likely UI issue with the space allotted for the text. The UI may need to be refactored to accommodate longer text in languages such as German.
  • 2 : Mojibake: When the pseudo-localized text shows as mojibake, such as �, this likely indicates a character encoding or font issue: the application may not support non-ASCII or non-Latin-1 characters.
  • 3 : Embedded String / Hard-Coded String: If the text shows in the original source locale, here English, instead of being pseudo-localized, it is likely a hard-coded string that has not been externalized into a resource file. That string cannot be sent for translation and will show in the interface as the original source string. This is a common internationalization issue.
  • 4 : Concatenation: The pseudo-localization shows that two strings have been put together to make up Y-Wing Galactic Fighter, since the end character shows up after Galactic Fighter and a start character shows up before Fighter.
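
The four checks above are normally done by eye, but the same reasoning can be written down as a small Python helper. This is a hedged sketch rather than a Lingoport tool: the function name is hypothetical, it assumes [ and ] are the configured start and end characters, and it would report false positives for source strings that legitimately contain brackets.

```python
def diagnose(displayed, start_char="[", end_char="]"):
    """Return the likely i18n issues for one string as it appears in the UI."""
    issues = []
    # U+FFFD, the replacement character, is a common sign of mojibake.
    if "\ufffd" in displayed:
        issues.append("mojibake")
    # No markers at all: the text never went through the pseudo-locale.
    if start_char not in displayed and end_char not in displayed:
        issues.append("hard-coded string")
        return issues
    # The end marker should be the last visible character.
    if not displayed.endswith(end_char):
        issues.append("truncation")
    # More than one marker pair means several resources were glued together.
    if displayed.count(start_char) > 1 or displayed.count(end_char) > 1:
        issues.append("concatenation")
    return issues


if __name__ == "__main__":
    for sample in ("[Öŕðéŕ þļåçéð П國カ내]", "[Öŕðéŕ þļåçéð šûçç", "Order placed"):
        print(repr(sample), diagnose(sample))
```

A check like this could run inside automated UI tests that capture on-screen text, although visual truncation caused by clipping in the layout still needs a human or a screenshot comparison.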

UI in the pseudo-locale without issues

If the i18n issues above were corrected, running the application using the pseudo-locale would look like the following:

Pseudo-Localization No Issues.jpg.png

Configuration

Localyzer makes pseudo-localization quite simple: for the resource files in a repository to be pseudo-localized, edit the Locales by selecting the pseudo-locale checkbox and entering the locale to be used, as shown below with eo (Esperanto):

Pseudo-Localization Configuration.jpg

Once this configuration is saved, the resource files will be pseudo-localized every time the project is analyzed.