Pseudo Localization

From Lingoport Wiki
Revision as of 19:52, 28 November 2018 by Llawson (talk | contribs) (Accent Only Configuration Example)

What is Pseudo-Localization?

Pseudo-localization is an effective way to test the localization-readiness of an application.  By pseudo-localizing the resource files, an application can be tested for internationalization without waiting for localization.

Creating pseudo-localized resource files helps identify:

  • embedded strings,
  • text that was externalized but should not have been,
  • text expansion issues,
  • character-encoding problems,
  • text concatenation issues, and
  • UI boundary issues.

An application typically retrieves strings based on a locale, such as French. A pseudo-locale works like a normal locale, but its strings are not translated; they are simply displayed differently. In the example below, the pseudo-locale is 'esperanto' and the English strings have been pseudo-localized:

Pseudo-localization.jpg

Configuration: config_pseudo_loc.xml

config_pseudo_loc.xml is the configuration file that holds the pseudo-localization instructions for LRM resource files. It is located in the global <HOME>/Lingoport_Data/L10nStreamlining/config folder. If you need different criteria for a group or project, copy the file to <HOME>/Lingoport_Data/L10nStreamlining/<group>/config or <HOME>/Lingoport_Data/L10nStreamlining/<group>/projects/<project>/config, respectively.
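The three locations above can be sketched as a lookup in Python. This is only an illustration: it assumes that the most specific copy (project, then group, then global) is the one that applies, which this page implies but does not state, and the function names are hypothetical rather than part of LRM.

```python
from pathlib import Path
from typing import List, Optional

CONFIG_NAME = "config_pseudo_loc.xml"


def candidate_paths(home: Path, group: str, project: str) -> List[Path]:
    """List the documented config locations, most specific first (assumed order)."""
    base = home / "Lingoport_Data" / "L10nStreamlining"
    return [
        base / group / "projects" / project / "config" / CONFIG_NAME,  # project copy
        base / group / "config" / CONFIG_NAME,                         # group copy
        base / "config" / CONFIG_NAME,                                 # global file
    ]


def resolve_config(home: Path, group: str, project: str) -> Optional[Path]:
    """Return the first candidate that exists on disk, or None if none exists."""
    for path in candidate_paths(home, group, project):
        if path.is_file():
            return path
    return None
```

Calling resolve_config(Path.home(), "mygroup", "myproject") with placeholder group and project names returns the global file unless a group or project copy exists.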

The config_pseudo_loc.xml file is used when running the --pseudo-loc command. A pseudo-localized file will be created for each base resource file if the project has a designated pseudo-locale.

The config_pseudo_loc.xml file contains two categories of information:

  • pseudo localization instructions
  • a regex pattern for each parser type, which identifies the parameters and special characters that should not be pseudo-localized.
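To illustrate why such a pattern is needed, the following Python sketch applies a transformation to the translatable text while leaving matched parameters untouched. The pattern shown here is hypothetical and is not taken from config_pseudo_loc.xml; the real patterns depend on the parser type.

```python
import re

# Hypothetical pattern: positional parameters such as {0}, printf-style %s / %d,
# and the literal two-character escape \n. Not the pattern shipped with LRM.
PROTECTED = re.compile(r"(\{\d+\}|%[sd]|\\n)")


def transform_unprotected(text, transform):
    """Apply `transform` to everything except the protected matches."""
    # With one capturing group, re.split returns plain text at even indexes
    # and the protected matches at odd indexes.
    parts = PROTECTED.split(text)
    return "".join(
        part if index % 2 else transform(part)
        for index, part in enumerate(parts)
    )
```

For example, transform_unprotected("Hello {0}, you have %d items", str.upper) changes the surrounding text but keeps {0} and %d intact, so the string can still be formatted at run time.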

Pseudo Localization Instructions

The configuration consists of the following xml elements:

  • expansion lengths - defines the expansion percentage based on a string's length. This is useful for testing language expansion.
  • expansion character - the character used to pad the string in order to simulate string expansion.
  • expansion end characters - characters appended at the end of the expansion. Typically, these are symbols from a range of code points, to verify that content from the various code pages is supported.
  • start and end characters - characters placed at the start and end of the resource string. This is useful for catching resource strings that are concatenated within the code.
  • accents - a flag indicating whether the resource string should be accented. This helps verify that all user-visible strings are localized.
  • add-invisible-chars - a flag indicating whether instrumented characters should be prepended and appended to the pseudo-localized text. This is useful when testing the Lingoport Incontext Reviewer.
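How these elements might combine into a final string can be sketched in Python as follows. This is an assumption-laden illustration, not the LRM implementation: the order of assembly is inferred from the examples on this page, the zero-width space used as the instrumented character is a stand-in (the actual characters are not documented here), and `padding_count` is simply the number of padding characters to add.

```python
from dataclasses import dataclass


@dataclass
class PseudoLocSettings:
    expansion_char: str = "-"
    expansion_end_chars: str = "П國カ내"
    start_char: str = "["
    end_char: str = "]"
    add_invisible_chars: bool = False
    # Hypothetical stand-in for the instrumented (non-printable) character.
    invisible_char: str = "\u200b"


def assemble(text: str, padding_count: int, settings: PseudoLocSettings) -> str:
    """Wrap `text` with padding, end characters, and start/end markers."""
    padding = settings.expansion_char * padding_count
    result = (
        settings.start_char
        + text
        + padding
        + settings.expansion_end_chars
        + settings.end_char
    )
    if settings.add_invisible_chars:
        # Instrumented characters go before any start character
        # and after any end character.
        result = settings.invisible_char + result + settings.invisible_char
    return result
```

With the default settings, assemble("abc", 3, PseudoLocSettings()) produces "[abc---П國カ내]". A string that shows up in the UI without both brackets has likely been truncated or concatenated.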

Pseudo-Localization Examples

Typical Configuration Example

      <expansion>
        <lengths>
           <length max="10" expand-percentage="200"/>
           <length max="20" expand-percentage="100"/>
           <length max="30" expand-percentage="80"/>
           <length max="50" expand-percentage="60"/>
           <length max="70" expand-percentage="40"/>
           <length max=""   expand-percentage="30"/>
        </lengths>
        <expansion-char>-</expansion-char>
        
        <!-- П: A Cyrillic character - (for Russian or ‘R’) -->
        <!-- 國: A common Chinese character (Hanzi, Kanji or Hanja) defined in Unicode BMP or Plane 0 – (for Chinese or ‘C’) -->	    
        <!-- カ: A Katakana character (used for Japanese or ‘J’) -->
        <!-- 내: A Hangul character (for Korean or ‘K’) -->
        <expansion-end-chars> П國カ내</expansion-end-chars>
      </expansion>
      <start-char>[</start-char>
      <end-char>]</end-char>
      <accent-chars>1</accent-chars>
      <!-- Flag indicating whether to add instrumented (non-printable unicode characters) to the start and end of the string
        The instrumented characters will be before any start character and after any end character 
        0 - do not add instrumented characters 
        1 - add instrumented characters     -->
      <add-invisible-chars>0</add-invisible-chars>

Based on this configuration, an example pseudo-localization would be:

Base Resource string: "This is an example using the typical configuration"
Pseudo-localized string: "[Ţĥîš îš åñ éẋåɱþļé ûšîñĝ ţĥé ţýþîçåļ çöñƒîĝûŕåţîöñ------------------------П國カ내]"
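The expansion lengths in this configuration can be read as a lookup table. The Python sketch below parses the same <lengths> fragment and computes an expansion budget for a string. It assumes that max is an inclusive upper bound, that an empty max is the catch-all entry, and that the result is rounded down; none of these details are confirmed on this page, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The <lengths> fragment from the typical configuration above.
LENGTHS_XML = """
<lengths>
   <length max="10" expand-percentage="200"/>
   <length max="20" expand-percentage="100"/>
   <length max="30" expand-percentage="80"/>
   <length max="50" expand-percentage="60"/>
   <length max="70" expand-percentage="40"/>
   <length max=""   expand-percentage="30"/>
</lengths>
"""


def expand_percentage(text_length: int, lengths_xml: str = LENGTHS_XML) -> int:
    """Return the percentage of the first entry whose max covers text_length."""
    root = ET.fromstring(lengths_xml.strip())
    for entry in root.findall("length"):
        max_attr = entry.get("max", "")
        # Assumption: max is inclusive; an empty max matches any length.
        if max_attr == "" or text_length <= int(max_attr):
            return int(entry.get("expand-percentage"))
    return 0


def expansion_budget(text: str) -> int:
    """Number of extra characters implied by the table (rounded down)."""
    return len(text) * expand_percentage(len(text)) // 100
```

Under these assumptions, the 50-character base string above falls into the max="50" entry, which gives 60 percent, or 30 extra characters. How the tool divides that budget between padding characters, end characters, and start and end characters is not specified here.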

No Expansion Configuration Example

Edit the config_pseudo_loc.xml for no expansion by setting the expand-percentage to 0 and not entering any characters for the expansion character as well as the expansion end characters.

      <expansion>
        <lengths>
           <length max=""   expand-percentage="0"/>
        </lengths>
        <expansion-char></expansion-char>
        <expansion-end-chars></expansion-end-chars>
      </expansion>
      <start-char>[</start-char>
      <end-char>]</end-char>
      <accent-chars>1</accent-chars>
      <add-invisible-chars>0</add-invisible-chars>

Based on this configuration, an example pseudo-localization would be:

Base Resource string: "This is an example of no expansion configuration"
Pseudo-localized string: "[Ţĥîš îš åñ éẋåɱþļé öƒ ñö éẋþåñšîöñ çöñƒîĝûŕåţîöñ]"

End Character Symbols with no Expansion Configuration Example

Edit the config_pseudo_loc.xml for no expansion by setting the expand-percentage to 0 and not entering any characters for the expansion character or the expansion end characters. Instead, add appropriate symbols to the end character element to verify that content from the various code pages is supported.

      <expansion>
         <lengths>
            <length max=""   expand-percentage="0"/>
         </lengths>
         <expansion-char></expansion-char>
         <expansion-end-chars></expansion-end-chars>
      </expansion>
      <start-char>[</start-char>
      <!-- П: A Cyrillic character - (for Russian or ‘R’) -->
      <!-- 國: A common Chinese character (Hanzi, Kanji or Hanja) defined in Unicode BMP or Plane 0 – (for Chinese or ‘C’) -->	    
      <!-- カ: A Katakana character (used for Japanese or ‘J’) -->
      <!-- 내: A Hangul character (for Korean or ‘K’) -->
      <end-char> П國カ내]</end-char>
      <accent-chars>1</accent-chars>
      <add-invisible-chars>0</add-invisible-chars>

Based on this configuration, an example pseudo-localization would be:

Base Resource string: "This is an example of no expansion with end characters symbols configuration"
Pseudo-localized string: "[Ţĥîš îš åñ éẋåɱþļé öƒ ñö éẋþåñšîöñ ŵîţĥ éñð çĥåŕåçţéŕš šýɱƀöļš çöñƒîĝûŕåţîöñ П國カ내]"

Accent Only Configuration Example

Edit the config_pseudo_loc.xml for no expansion by setting the expand-percentage to 0 and not entering any characters for the expansion character as well as the expansion end characters. Remove any characters defined in the start and end characters.

      <expansion>
         <lengths>
            <length max=""   expand-percentage="0"/>
         </lengths>
         <expansion-char></expansion-char>
         <expansion-end-chars></expansion-end-chars>
      </expansion>
      <start-char></start-char>
      <end-char></end-char>
      <accent-chars>1</accent-chars>
      <add-invisible-chars>0</add-invisible-chars>

Based on this configuration, an example pseudo-localization would be:

Base Resource string: "This is an example of accent only configuration"
Pseudo-localized string: "Ţĥîš îš åñ éẋåɱþļé öƒ åççéñţ öñļý çöñƒîĝûŕåţîöñ"
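The accenting step itself amounts to a character substitution. The Python sketch below uses a mapping derived only from the example strings on this page, so it covers just the letters that appear in them; the full mapping used by LRM is not documented here, and unmapped characters are passed through unchanged.

```python
# Mapping inferred from the example strings on this page (partial by design).
ACCENT_MAP = str.maketrans({
    "T": "Ţ",
    "a": "å", "b": "ƀ", "c": "ç", "d": "ð", "e": "é", "f": "ƒ",
    "g": "ĝ", "h": "ĥ", "i": "î", "l": "ļ", "m": "ɱ", "n": "ñ",
    "o": "ö", "p": "þ", "r": "ŕ", "s": "š", "t": "ţ", "u": "û",
    "w": "ŵ", "x": "ẋ", "y": "ý",
})


def accent(text: str) -> str:
    """Replace each mapped letter with its accented look-alike."""
    return text.translate(ACCENT_MAP)


print(accent("This is an example of accent only configuration"))
```

Because each letter is replaced by exactly one character, the string length is unchanged, which is why this configuration tests character handling but not text expansion.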