Pseudo Localization
Revision as of 17:23, 31 July 2018
config_pseudo_loc.xml
The file config_pseudo_loc.xml configures the pseudo-localization instructions for LRM resource files. It is located in the global <HOME>/Lingoport_Data/L10nStreamlining/config folder. If a group or project needs different criteria, copy the file to <HOME>/Lingoport_Data/L10nStreamlining/<group>/config or <HOME>/Lingoport_Data/L10nStreamlining/<group>/projects/<project>/config, respectively.
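The three locations above can be sketched as a lookup list. This is only an illustration: the page does not state which copy takes precedence, so the "most specific first" order below is an assumption, as are the function names and the `/config` separator after `<project>`.

```python
from pathlib import Path
from typing import List, Optional

CONFIG_NAME = "config_pseudo_loc.xml"

def candidate_configs(home: Path,
                      group: Optional[str] = None,
                      project: Optional[str] = None) -> List[Path]:
    """List possible config locations, most specific first (assumed order)."""
    base = home / "Lingoport_Data" / "L10nStreamlining"
    candidates = []
    # A project-level copy only makes sense inside a group folder.
    if group and project:
        candidates.append(base / group / "projects" / project / "config" / CONFIG_NAME)
    if group:
        candidates.append(base / group / "config" / CONFIG_NAME)
    # The global copy is always the last fallback.
    candidates.append(base / "config" / CONFIG_NAME)
    return candidates

def resolve_config(home: Path,
                   group: Optional[str] = None,
                   project: Optional[str] = None) -> Optional[Path]:
    """Return the first candidate that exists on disk, or None."""
    for path in candidate_configs(home, group, project):
        if path.is_file():
            return path
    return None
```

For example, `candidate_configs(Path("/opt/home"), "acme", "web")` yields the project, group, and global paths in that order; `acme` and `web` are made-up names.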
The config_pseudo_loc.xml file is used when running the --pseudo-loc command. A pseudo-localized file will be created for each base resource file if the project has a designated pseudo-locale.
The config_pseudo_loc.xml file contains two categories of information:
- pseudo-localization instructions
- a regex pattern for each parser type, defining the parameters and special characters that should not be pseudo-localized
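To illustrate the second category, here is a minimal sketch of how a regex can shield parameters from pseudo-localization. The pattern and the function name are hypothetical; the actual per-parser patterns shipped in config_pseudo_loc.xml are not shown on this page.

```python
import re

# Hypothetical pattern: protects {0}-style and %s / %d-style parameters.
# It has exactly one capturing group, so re.split keeps the matches.
PROTECTED = re.compile(r"(\{\d+\}|%[sd])")

def transform_unprotected(text, transform):
    """Apply `transform` to everything except the protected parameters."""
    parts = PROTECTED.split(text)
    # With a single capturing group, re.split returns plain text at even
    # indexes and the protected matches at odd indexes.
    return "".join(part if i % 2 else transform(part)
                   for i, part in enumerate(parts))

if __name__ == "__main__":
    print(transform_unprotected("Hello {0}, you have %d items", str.upper))
```

Here `str.upper` stands in for the real accenting step, so the effect is easy to see: the surrounding text changes while `{0}` and `%d` pass through untouched.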
Pseudo Localization Instructions
The configuration consists of the following XML elements:
- expansion lengths - define the expansion percentage based on a string's length. This is useful for testing how the UI handles the text expansion that comes with translation.
- expansion character - the character used to pad the string in order to simulate string expansion.
- expansion end characters - characters placed at the end of the expansion. Typically, these are symbols from a range of code points, to verify that content from the various code pages is supported.
- start and end characters - characters placed at the start and end of the resource string. These are useful for catching resource strings that are concatenated within the code.
- accents - a flag indicating whether the resource string should be accented. Accented text shows which visible strings come from resource files, so any unaccented string in the UI has not been localized.
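The elements above combine roughly as in the following sketch. It is not LRM's implementation: the accent table only covers the letters visible in this page's example output, and how the real tool counts the end characters and the start and end characters toward the expansion is not stated here, so the padding arithmetic is an assumption.

```python
# Accent table taken from the letters visible in this page's examples.
ACCENTS = str.maketrans({
    "T": "Ţ", "a": "å", "b": "ƀ", "c": "ç", "d": "ð", "e": "é",
    "f": "ƒ", "g": "ĝ", "h": "ĥ", "i": "î", "l": "ļ", "m": "ɱ",
    "n": "ñ", "o": "ö", "p": "þ", "r": "ŕ", "s": "š", "t": "ţ",
    "u": "û", "w": "ŵ", "x": "ẋ", "y": "ý",
})

def pseudo_localize(text, percent, pad_char="-", end_chars="",
                    start="[", end="]", accent=True):
    """Accent, pad, and wrap a resource string (illustrative only)."""
    body = text.translate(ACCENTS) if accent else text
    # Assumption: padding length is a percentage of the base string length.
    pad_len = len(text) * percent // 100
    return start + body + pad_char * pad_len + end_chars + end

if __name__ == "__main__":
    print(pseudo_localize("This is an example of no expansion configuration", 0))
    # prints "[Ţĥîš îš åñ éẋåɱþļé öƒ ñö éẋþåñšîöñ çöñƒîĝûŕåţîöñ]"
```

Letters missing from the table (for example uppercase letters other than T) pass through unchanged in this sketch.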
Typical Configuration Example
<expansion>
  <lengths>
    <length max="10" expand-percentage="200"/>
    <length max="20" expand-percentage="100"/>
    <length max="30" expand-percentage="80"/>
    <length max="50" expand-percentage="60"/>
    <length max="70" expand-percentage="40"/>
    <length max="" expand-percentage="30"/>
  </lengths>
  <expansion-char>-</expansion-char>
  <!-- П: A Cyrillic character - (for Russian or ‘R’) -->
  <!-- 國: A common Chinese character (Hanzi, Kanji or Hanja) defined in Unicode BMP or Plane 0 – (for Chinese or ‘C’) -->
  <!-- カ: A Katakana character (used for Japanese or ‘J’) -->
  <!-- 내: A Hangul character (for Korean or ‘K’) -->
  <expansion-end-chars> П國カ내</expansion-end-chars>
</expansion>
<start-char>[</start-char>
<end-char>]</end-char>
<accent-chars>1</accent-chars>
Based on this configuration, an example pseudo-localization would be:
Base Resource string: "This is an example using the typical configuration"
Pseudo-localized string: "[Ţĥîš îš åñ éẋåɱþļé ûšîñĝ ţĥé ţýþîçåļ çöñƒîĝûŕåţîöñ------------------------П國カ내]"
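As a rough illustration of how the length tiers in the typical configuration might be read, the sketch below parses the `<lengths>` block and picks a percentage for a given string length. It assumes that `max` is an inclusive upper bound, that tiers are checked in document order, and that an empty `max` is a catch-all; none of this is confirmed by the page. The function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Only the <expansion> element is parsed here so the fragment has one root.
FRAGMENT = """
<expansion>
  <lengths>
    <length max="10" expand-percentage="200"/>
    <length max="20" expand-percentage="100"/>
    <length max="30" expand-percentage="80"/>
    <length max="50" expand-percentage="60"/>
    <length max="70" expand-percentage="40"/>
    <length max="" expand-percentage="30"/>
  </lengths>
</expansion>
"""

def load_tiers(xml_text):
    """Return (upper_bound_or_None, percent) pairs in document order."""
    root = ET.fromstring(xml_text.strip())
    tiers = []
    for node in root.iter("length"):
        raw_max = node.get("max", "")
        upper = int(raw_max) if raw_max else None  # empty max = no limit
        tiers.append((upper, int(node.get("expand-percentage"))))
    return tiers

def expansion_percentage(length, tiers):
    """Pick the first tier whose bound covers `length` (assumed inclusive)."""
    for upper, percent in tiers:
        if upper is None or length <= upper:
            return percent
    return 0  # no tier matched and no catch-all was configured

if __name__ == "__main__":
    base = "This is an example using the typical configuration"
    print(len(base), expansion_percentage(len(base), load_tiers(FRAGMENT)))
```

The base string in the example above is 50 characters long, so under this reading it falls into the `max="50"` tier and is expanded by 60 percent.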
No Expansion Configuration Example
<expansion>
  <lengths>
    <length max="" expand-percentage="0"/>
  </lengths>
  <expansion-char></expansion-char>
  <expansion-end-chars></expansion-end-chars>
</expansion>
<start-char>[</start-char>
<end-char>]</end-char>
<accent-chars>1</accent-chars>
Based on this configuration, an example pseudo-localization would be:
Base Resource string: "This is an example of no expansion configuration"
Pseudo-localized string: "[Ţĥîš îš åñ éẋåɱþļé öƒ ñö éẋþåñšîöñ çöñƒîĝûŕåţîöñ]"
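The start and end characters make it possible to spot problem strings when reviewing a pseudo-localized UI. The helper below is a hypothetical reviewer's aid, not part of LRM: it flags strings that lack the wrapper (possibly hard-coded or truncated) and strings that contain a wrapper character in the middle (possibly concatenated). Resource strings that legitimately contain brackets would produce false positives.

```python
def classify_ui_string(text, start="[", end="]"):
    """Classify a displayed string as 'ok', 'unwrapped', or 'concatenated'."""
    if not (text.startswith(start) and text.endswith(end)):
        # Missing wrapper: the string may be hard-coded or cut off in the UI.
        return "unwrapped"
    inner = text[len(start):len(text) - len(end)]
    if start in inner or end in inner:
        # A wrapper character inside the text suggests two resources were joined.
        return "concatenated"
    return "ok"

if __name__ == "__main__":
    for sample in ("[Ĥéļļö]", "[Ĥéļļö] [Ŵöŕļð]", "Hello"):
        print(sample, "->", classify_ui_string(sample))
```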
No Expansion with End Character Symbols Configuration Example
<expansion>
  <lengths>
    <length max="" expand-percentage="0"/>
  </lengths>
  <expansion-char></expansion-char>
  <expansion-end-chars></expansion-end-chars>
</expansion>
<start-char>[</start-char>
<!-- П: A Cyrillic character - (for Russian or ‘R’) -->
<!-- 國: A common Chinese character (Hanzi, Kanji or Hanja) defined in Unicode BMP or Plane 0 – (for Chinese or ‘C’) -->
<!-- カ: A Katakana character (used for Japanese or ‘J’) -->
<!-- 내: A Hangul character (for Korean or ‘K’) -->
<end-char> П國カ내]</end-char>
<accent-chars>1</accent-chars>
Based on this configuration, an example pseudo-localization would be:
Base Resource string: "This is an example of no expansion with end characters symbols configuration"
Pseudo-localized string: "[Ţĥîš îš åñ éẋåɱþļé öƒ ñö éẋþåñšîöñ ŵîţĥ éñð çĥåŕåçţéŕš šýɱƀöļš çöñƒîĝûŕåţîöñ П國カ내]"