Difference between revisions of "Resource Files"

From Lingoport Wiki
(How to work with unsupported File Types)
 
(20 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== What resource file types are supported by LRM?==
+
== What resource file types are supported by Localyzer?==
=== Standard LRM extensions ===
+
=== Standard Localyzer extensions ===
 
<ul>
 
<ul>
 
<li><b>[[LRM html Support | .htm and .html]]</b> files using the [[LRM_html_Support#html_parser_type | ''html'']] parser</li>
 
<li><b>[[LRM html Support | .htm and .html]]</b> files using the [[LRM_html_Support#html_parser_type | ''html'']] parser</li>
Line 18: Line 18:
   
 
=== Unique Extensions ===
 
=== Unique Extensions ===
Any file extension can be handled by LRM as long as the corresponding parser type is defined. The file must be able to be parsed correctly by the defined parser type or an error will occur.
+
Any file extension can be handled by Localyzer as long as the corresponding parser type is defined. The file must be able to be parsed correctly by the defined parser type or an error will occur.
   
 
[[File: ExtensionParserType.jpg |600px]]
 
[[File: ExtensionParserType.jpg |600px]]
   
Above is an example configuring a Jenkins LRM project. The '''Extension''' is 'properties' and the '''Parser Type''' is 'properties' so LRM will recognize the file '''myfile_en_US.properties''' as a resource file. If the filename is '''myfile_en_US.prop''', that would <u>not</u> be recognized as a properties resource file. Changing the '''Extension''' to 'prop' would allow '''myfile_en_US.prop''' to be recognized as a properties parser type file.
+
Above is an example configuring a Jenkins Localyzer project. The '''Extension''' is 'properties' and the '''Parser Type''' is 'properties' so Localyzer will recognize the file '''myfile_en_US.properties''' as a resource file. If the filename is '''myfile_en_US.prop''', that would <u>not</u> be recognized as a properties resource file. Changing the '''Extension''' to 'prop' would allow '''myfile_en_US.prop''' to be recognized as a properties parser type file.
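The matching rule described above can be pictured with a short shell sketch. It only illustrates the rule; the function name <code>matches_extension</code> is hypothetical and is not part of Localyzer.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when a file name ends in ".<extension>".
# It mirrors the rule described above; it is not Localyzer's own code.
matches_extension() {
  local filename="$1" extension="$2"
  case "$filename" in
    *".$extension") return 0 ;;
    *) return 1 ;;
  esac
}

# With the Extension set to 'properties':
matches_extension "myfile_en_US.properties" "properties" && echo "recognized"
matches_extension "myfile_en_US.prop" "properties" || echo "not recognized"
```

Passing <code>prop</code> as the second argument corresponds to changing the '''Extension''' field to 'prop'.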
   
 
The parser types are:
 
The parser types are:
Line 37: Line 37:
 
<li><b>[[LRM_XML_Support|''xml'']]</b> parser</li>
 
<li><b>[[LRM_XML_Support|''xml'']]</b> parser</li>
 
<li><b>[[LRM_yaml_Support|''yaml'']]</b> parser</li>
 
<li><b>[[LRM_yaml_Support|''yaml'']]</b> parser</li>
  +
<li><b>[[LRM_Binary_Support|''binary'']]</b> parser</li>
 
</ul>
 
</ul>
 
   
 
==How to work with unsupported File Types==
 
==How to work with unsupported File Types==
   
LRM supports a number of file types out of the box (See [[Resource_Files]]). However, other file types may represent user facing strings to be translated. In that case, some customization is required to on-board those projects.
+
Localyzer supports a number of file types out of the box (see above). However, other file types may contain user-facing strings to be translated. In that case, some customization is required to on-board those projects.
The '''bash script transform framework''' facilitates the customization.
+
The '''[[Transform_Framework|Transform Framework]]''' facilitates the customization.
 
=== Analyze the file types ===
 
If the file types fall into a category not supported by LRM out of the box, the first step is to identify the closest file type that LRM does support.
 
 
=== Use the transform framework ===
 
The transform framework needs '''three scripts''' in order to fit in with LRM. The three scripts need to be under the <code>$JENKINS_HOME/lingoport/transform/<nameoftransform>/</code> directory.
 
 
 
The <nameoftransform> must be indicative of the type of transformation to apply. For instance, it could be <code>loc</code> to handle .loc files (see below). In that case, three scripts will need to be under <code>/var/lib/jenkins/lingoport/transform/loc</code> for a typical installation where the <code>jenkins</code> user is under <code>/var/lib/jenkins</code>.
 
 
 
The three scripts to write are:
 
* '''transform_from_repo.sh''': How to transform the files from the repository so they fit into an LRM supported file type
 
* '''transform_to_repo.sh''': How to transform translated/pseudo-localized files in an LRM supported file type into the repository file type
 
* '''transform_files_list.sh''': How to transform the file names from the LRM supported file naming into the repository file naming
 
 
 
When those scripts are written, the transformation is defined in the config directory of the on-boarded project with the <code>transform.properties</code> file. This file contains one property, 'transform'. For instance, if <code>loc</code> is the directory with those three scripts under <code>$JENKINS_HOME/lingoport/transform/</code> for a <PROJECT> under a <GROUP>, the file will be:
 
 
 
<code>$JENKINS_HOME/Lingoport_Data/L10nStreamlining/<GROUP>/projects/<PROJECT>/config/transform.properties</code>
 
<pre>
 
transform=loc
 
</pre>
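A minimal sketch of that registration step, assuming the directory layout described above. The function name <code>register_transform</code> and its arguments are hypothetical; the paths and the <code>transform</code> property come from this page.

```shell
#!/bin/bash
# Sketch: check that the three transform scripts exist, then write
# transform.properties into the project's config directory.
# register_transform is a hypothetical helper, not a Lingoport command.
register_transform() {
  local jenkins_home="$1" group="$2" project="$3" transform="$4"
  local scripts_dir="$jenkins_home/lingoport/transform/$transform"
  local config_dir="$jenkins_home/Lingoport_Data/L10nStreamlining/$group/projects/$project/config"
  local script

  for script in transform_from_repo.sh transform_to_repo.sh transform_files_list.sh; do
    if [ ! -f "$scripts_dir/$script" ]; then
      echo "Missing $scripts_dir/$script" >&2
      return 1
    fi
  done

  # The config directory normally exists already for an on-boarded project;
  # mkdir -p is only a safeguard in this sketch.
  mkdir -p "$config_dir" &&
    printf 'transform=%s\n' "$transform" > "$config_dir/transform.properties"
}
```

For the example project used on this page, the call would be <code>register_transform "$JENKINS_HOME" CET json loc</code>.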
 
 
==== Bash Variables ====
 
A few Bash variables are available when called from the Lingoport Jenkins jobs that use the transform framework.
 
They are set before calling the transform framework.
 
 
* '''CLIENT_SOURCE_DIR''' : For an LRM project such as CET.json, the CLIENT_SOURCE_DIR would typically be ~jenkins/jobs/CET.json/workspace. Note: This is not necessarily the WORKSPACE of the running Jenkins job from which the transform is called (Dashboard Update for instance).
 
* '''LRM_GROUP_NAME''' : The LRM group name (e.g. 'CET')

* '''LRM_PROJECT_NAME''' : The LRM project name (e.g. 'json')
 
* '''TRANSFORM_DIR''' : The transform scripts directory (e.g. 'loc' )
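
A transform script may want to fail early when it is run outside the Jenkins jobs and these variables are missing. The following is a sketch only; the helper name <code>require_transform_vars</code> is hypothetical.

```shell
#!/bin/bash
# Sketch: stop with an error when one of the expected variables
# is unset or empty. require_transform_vars is a hypothetical helper.
require_transform_vars() {
  local name value
  for name in CLIENT_SOURCE_DIR LRM_GROUP_NAME LRM_PROJECT_NAME TRANSFORM_DIR; do
    # Indirect lookup of the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Error: $name is not set or empty" >&2
      return 1
    fi
  done
}
```

A script would typically call <code>require_transform_vars || exit 1</code> before touching any files.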
 
 
==== Example: .loc files ====
 
Say the repository contains resource files like the following <code>hmUiMessage.loc</code> file:
 
<pre>
 
;hmUiMessage.loc
 
;*********************************************************************
 
#include hmUiMain.loc
 
;*********************************************************************
 
message1 The first message
 
message2 The second message
 
message3 The third message
 
message4 The fourth message
 
</pre>
 
 
The file may not be in ASCII or UTF-8 format; for instance, this file is in UTF-16BE.
 
 
A supported file format that is close to this one is <code>properties</code>.
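
To illustrate why <code>properties</code> is a close match, the line-level mapping can be sketched as a filter from standard input to standard output. This sketch assumes the input has already been converted to UTF-8 and that keys are separated from values by whitespace, as in the sample; the function name <code>loc_lines_to_properties</code> is hypothetical.

```shell
#!/bin/bash
# Sketch: convert .loc lines on stdin to .properties lines on stdout.
#   1. Lines starting with '#' or ';' get a '# ' prefix.
#   2. The first run of whitespace on every line becomes '='. Prefixed
#      lines therefore start with '#=' and remain .properties comments.
#   3. Lines left with only '=' are emptied.
loc_lines_to_properties() {
  sed -e 's/^#/# #/' \
      -e 's/^;/# ;/' \
      -e 's/[[:space:]]\{1,\}/=/' \
      -e 's/^=$//'
}
```

The <code>#=</code> prefix produced in step 2 is what allows a reverse transform to tell original <code>#</code> and <code>;</code> lines apart from other comments.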
 
 
==== transform_from_repo.sh ====

An <i>example</i> snippet of bash code for this type of file may be something like:

<pre>
#!/bin/bash

# Find all the files ending in '.loc'
mkdir -p ~/tmp
find "$CLIENT_SOURCE_DIR" -name "*.loc" > ~/tmp/input_files.txt

# Transform each .loc file into a .properties file
while read -r FILEPATH
do
    file "$FILEPATH"
    SUFFIX=".loc"
    ROOTNAME=${FILEPATH%$SUFFIX}
    TARGET="${ROOTNAME}.properties"
    # Convert to UTF-8 (-c drops characters that cannot be converted)
    iconv -f UTF-16 -t UTF-8 -c "$FILEPATH" > "$TARGET"
    # Prefix '#' and ';' lines so they become .properties comments
    sed -i 's/^#/# #/' "$TARGET"
    sed -i 's/^;/# ;/' "$TARGET"
    # Replace the first run of whitespace on each line with '='
    sed -i -e 's/[[:space:]]\+/=/' "$TARGET"
    # Empty the lines that are left with only '='
    sed -i -e 's/^=$//' "$TARGET"
done < ~/tmp/input_files.txt
</pre>

==== transform_to_repo.sh ====

An <i>example</i> snippet of bash code for this type of file may be something like:

<pre>
#!/bin/bash

# Find all the files ending in .properties
mkdir -p ~/tmp
find "$CLIENT_SOURCE_DIR" -name "*.properties" > ~/tmp/input_files.txt

#
# Transform each .properties into a .loc
#
while read -r FILEPATH
do
    ls -l "$FILEPATH"
    SUFFIX=".properties"
    ROOTNAME=${FILEPATH%$SUFFIX}
    TARGET="${ROOTNAME}.loc"
    cp "$FILEPATH" "$TARGET"
    # Turn other '#' comments into ';' comments. Lines starting with '#='
    # are skipped here so that restored '#include' lines are not rewritten.
    sed -i -e '/^#=/!s/^#\([[:alnum:]]*\)/;\1/' "$TARGET"
    # Restore the original '#' and ';' lines
    sed -i 's/^#=#/#/' "$TARGET"
    sed -i 's/^#=;/;/' "$TARGET"
    # Replace the first '=' on each line with a tab
    sed -i -e 's/\([[:alnum:]]*\)=/\1\t/' "$TARGET"
    # Convert back to UTF-16 through a temporary file
    iconv -f UTF-8 -t UTF-16 -c "$TARGET" > "$TARGET.tmp"
    mv "$TARGET.tmp" "$TARGET"
done < ~/tmp/input_files.txt
</pre>

==== transform_files_list.sh ====

An <i>example</i> snippet of bash code for this type of file may be something like:

<pre>
#!/bin/bash

# Check if there is a parameter
if [ -z "$1" ]
then
    echo "Error: Missing the argument like /<path>/pseudo_files.txt"
    exit 1
fi

# If the file exists then do something, otherwise exit
if [ -f "$1" ]; then
    echo " File to rewrite: $1"
else
    echo " $1 not found"
    exit 1
fi

# Rename .properties to .loc files inside the list of files passed as a parameter
sed -i 's/\.properties/.loc/' "$1"
</pre>

+
== Why are there files in my repository that end in ''_LRMLQA''? ==
+
These are the Localyzer instrumented files that were created during the ''instrument resource files'' command.
+
See [[LRM_Instrumentation#LRM_Instrumentation|Localyzer Instrumentation]] for more information.
+
== What is Send Unique Filenames? ==
+
When configuring a Localyzer project in Jenkins, under the General Settings, there is a checkbox for '''Send Unique Filenames'''.
+
[[File:SendUniqueFilenames.jpg | 600px ]]
+
This box is unchecked by default. Check it if the files to be translated have the same names but are in different folders. For example, you may have resource files in two directories with identical file names:
+
../first_en_US/values.json
+
../second_en_US/values.json
+
Localyzer sends only the files to be translated. Checking the '''Send Unique Filenames''' box ensures that the files get unique names that are tracked and then returned to the correct location upon import.
+
In this example, if the box is left unchecked (the default), ''a prep kit will be created for each file'', so two prep kits would be created and sent for translation. If many folders contain files with the same name, many prep kits will be created.
 

Latest revision as of 19:29, 22 September 2021

What resource file types are supported by Localyzer?

Standard Localyzer extensions

Unique Extensions

Any file extension can be handled by Localyzer as long as the corresponding parser type is defined. The file must be able to be parsed correctly by the defined parser type or an error will occur.

ExtensionParserType.jpg

Above is an example configuring a Jenkins Localyzer project. The Extension is 'properties' and the Parser Type is 'properties' so Localyzer will recognize the file myfile_en_US.properties as a resource file. If the filename is myfile_en_US.prop, that would not be recognized as a properties resource file. Changing the Extension to 'prop' would allow myfile_en_US.prop to be recognized as a properties parser type file.

The parser types are:

How to work with unsupported File Types

Localyzer supports a number of file types out of the box (see above). However, other file types may contain user-facing strings to be translated. In that case, some customization is required to on-board those projects. The Transform Framework facilitates the customization.

Why are there files in my repository that end in _LRMLQA?

These are the Localyzer instrumented files that were created during the instrument resource files command. See Localyzer Instrumentation for more information.

What is Send Unique Filenames?

When configuring a Localyzer project in Jenkins, under the General Settings, there is a checkbox for Send Unique Filenames.

SendUniqueFilenames.jpg

This box is unchecked by default. Check it if the files to be translated have the same names but are in different folders. For example, you may have resource files in two directories with identical file names:

../first_en_US/values.json
../second_en_US/values.json

Localyzer sends only the files to be translated. Checking the Send Unique Filenames box ensures that the files get unique names that are tracked and then returned to the correct location upon import.

In this example, if the box is left unchecked (the default), a prep kit is created for each file, so two prep kits would be created and sent for translation. If many folders contain files with the same name, many prep kits will be created.
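
To decide whether the box is worth checking, it can help to see which file names occur in more than one folder of the repository. The following is a sketch only and is unrelated to how Localyzer itself generates unique names; the function name <code>duplicate_basenames</code> is hypothetical.

```shell
#!/bin/bash
# Sketch: print every file name that occurs in more than one folder.
# Arguments: the directory to scan and a file name pattern.
duplicate_basenames() {
  find "$1" -type f -name "$2" | sed 's|.*/||' | sort | uniq -d
}
```

Calling <code>duplicate_basenames . '*.json'</code> from the repository root lists each JSON file name that is shared by two or more folders.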