LRM html Support
Examples of .html or .htm files, or files with another extension that uses the html parser type
1. <html>
     <p>This is some text in an html file</p>
   </html>
2. <body>
     <p>This is a html text fragment</p>
   </body>
html parser type
valid html syntax
Files that use the html parser are expected to have valid html syntax.
htm/html uses the html parser type
When defining a project that contains LRM standard .html or .htm resource files, there is no need to define a <parser-type>, because the html parser will always be used.
unique file extension needs to define html parser type
If files with a unique (non-standard) file extension contain valid html, then the <parser-type> should be set to html in the project definition file.
XHTML - well-formed xml needs to define html parser type
A well-formed XML document, such as XHTML, can be parsed by the html parser type. If the file has well-defined keys/values, such as a .resx or strings.xml file, then the xml parser type should be used instead.
LRM interaction with html parser type files
Number of keys in file is 1
All files that are parsed using the html parser have exactly 1 key, called key1. The value that corresponds to this key is the entire contents of the html file. Because there are no individual key/value pairs, html-parsed files cannot be instrumented (that is, used in our InContext Reviewer/Translation product).
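The single-key model can be pictured with a small sketch. This is only an illustration of the idea, not LRM's parser: the function name parse_as_single_key is hypothetical, and only the key name key1 comes from the description above.

```python
def parse_as_single_key(html_text: str) -> dict:
    """Model an html-parsed file as exactly one key/value pair.

    Hypothetical helper: the key name "key1" is taken from the
    documentation; everything else is an illustrative assumption.
    """
    # The whole file content becomes the value of the only key.
    return {"key1": html_text}


resources = parse_as_single_key(
    "<html> <p>This is some text in an html file</p> </html>"
)
print(len(resources))   # number of keys in the file
print(list(resources))  # the key names
```

Because the whole file is one value, there is no finer-grained unit that a tool could map back to an individual string on screen.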
Prep kit files are always full file
If the checksum of the base file has changed, the file will be sent out in the next prep kit for all target locales. Since the file contains only 1 key, the entire file will be sent out for translation.
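The change-detection rule can be sketched as follows. The checksum algorithm LRM uses is not stated here, so SHA-256 is an assumption, and the names file_checksum and locales_needing_full_file are hypothetical.

```python
import hashlib


def file_checksum(content: bytes) -> str:
    # Assumption: SHA-256 stands in for whatever checksum LRM computes.
    return hashlib.sha256(content).hexdigest()


def locales_needing_full_file(previous_checksum, current_content, target_locales):
    """Return the locales that should receive the whole file.

    Hypothetical helper: if the base file is unchanged, nothing is sent;
    otherwise the full file goes out for every target locale.
    """
    if file_checksum(current_content) == previous_checksum:
        return []
    return list(target_locales)


old_content = b"<p>Hello</p>"
stored = file_checksum(old_content)
locales = ["fr_FR", "de_DE"]

print(locales_needing_full_file(stored, old_content, locales))
print(locales_needing_full_file(stored, b"<p>Hello world</p>", locales))
```

Even a one-word change alters the checksum, which is why any edit to the base file results in the full file being sent again.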
File can be pseudo-localized and number of words counted
Since LRM is able to parse the text portion of html parsed files, the files can be pseudo-localized and the number of words counted.
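As a rough illustration of working on the text portion only, the sketch below uses Python's standard html.parser module to pull text out of the markup, count words, and apply a toy pseudo-localization. It does not reflect how LRM counts words or pseudo-localizes; the names TextExtractor, count_words, and pseudo_localize are hypothetical, and the sketch makes no attempt to skip script or style content.

```python
from html.parser import HTMLParser

# Toy mapping: swap plain vowels for accented ones (an assumption,
# not LRM's pseudo-localization scheme).
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


class TextExtractor(HTMLParser):
    """Collect the non-empty text chunks found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text(html_text):
    extractor = TextExtractor()
    extractor.feed(html_text)
    extractor.close()
    return extractor.chunks


def count_words(html_text):
    # Words are whitespace-separated tokens in the text portion only.
    return sum(len(chunk.split()) for chunk in extract_text(html_text))


def pseudo_localize(text):
    # Brackets make truncation visible; accents make untranslated text stand out.
    return "[" + text.translate(ACCENTED) + "]"


sample = "<html> <p>This is some text in an html file</p> </html>"
print(count_words(sample))
print([pseudo_localize(chunk) for chunk in extract_text(sample)])
```

Tags never reach count_words or pseudo_localize, so the markup itself is neither counted nor altered.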
Example of Project Definition for Resources
The following is an example of html resource file definitions. See resource extensions for more information.
<resource-extensions>
  <resource-extension>
    <!-- parser-type not needed since .html is a standard LRM extension that maps to the html parser type -->
    <extension>html</extension>
    <file-name-pattern>*-l_c_v</file-name-pattern>
    <use-pattern-on-dflt-locale>0</use-pattern-on-dflt-locale>
    <file-location-pattern>l_c_v</file-location-pattern>
    <use-location-pattern-on-dflt-locale>1</use-location-pattern-on-dflt-locale>
    <base-file-encoding>UTF-8</base-file-encoding>
    <localized-file-encoding>UTF-8</localized-file-encoding>
    <parameter-regex-pattern></parameter-regex-pattern>
  </resource-extension>
  <resource-extension>
    <!-- parser-type not needed since .htm is a standard LRM extension that maps to the html parser type -->
    <extension>htm</extension>
    <file-name-pattern>*-l_c_v</file-name-pattern>
    <use-pattern-on-dflt-locale>0</use-pattern-on-dflt-locale>
    <file-location-pattern>l_c_v</file-location-pattern>
    <use-location-pattern-on-dflt-locale>1</use-location-pattern-on-dflt-locale>
    <base-file-encoding>UTF-8</base-file-encoding>
    <localized-file-encoding>UTF-8</localized-file-encoding>
    <parameter-regex-pattern></parameter-regex-pattern>
  </resource-extension>
  <resource-extension>
    <!-- parser-type is required because .myext is not a standard LRM extension -->
    <extension>myext</extension>
    <parser-type>html</parser-type>
    <file-name-pattern>*-l_c_v</file-name-pattern>
    <use-pattern-on-dflt-locale>0</use-pattern-on-dflt-locale>
    <file-location-pattern>l_c_v</file-location-pattern>
    <use-location-pattern-on-dflt-locale>1</use-location-pattern-on-dflt-locale>
    <base-file-encoding>UTF-8</base-file-encoding>
    <localized-file-encoding>UTF-8</localized-file-encoding>
    <parameter-regex-pattern></parameter-regex-pattern>
  </resource-extension>
</resource-extensions>