LRM html Support
Latest revision as of 17:19, 18 November 2019
Examples of .html or .htm files, or files whose extension uses the html parser type
1.
<html>
<p>This is some text in an html file</p>
</html>

2.
<body>
<p>This is a html text fragment</p>
</body>
html parser type
valid html syntax
Files that use the html parser are expected to have valid html syntax.
htm/html uses the html parser type
When defining a project containing LRM Standard .html or .htm resource files, there is no need to define a <parser-type>, as the html parser will always be used.
unique file extension needs to define html parser type
If files with a unique (non-standard) file extension contain valid html, then the <parser-type> should be set to html in the project definition file.
XHTML - well-formed xml needs to define html parser type
A well-formed XML document, such as a .dita file, can be parsed by using the html parser type. If the file has well-defined keys/values, as in a .resx or strings.xml file, then the xml parser type should be used instead.
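As a minimal sketch of the .dita case above, a resource extension entry might look like the following. The element names come from the project definition example on this page; the pattern and encoding values are copied from that example purely for illustration and are assumptions, not settings required for .dita files.

```xml
<resource-extension>
  <!-- assumption: .dita is not a standard LRM extension, so parser-type is given explicitly -->
  <extension>dita</extension>
  <parser-type>html</parser-type>
  <!-- illustrative values only, copied from the html example on this page -->
  <file-name-pattern>*-l_c_v</file-name-pattern>
  <use-pattern-on-dflt-locale>0</use-pattern-on-dflt-locale>
  <file-location-pattern>l_c_v</file-location-pattern>
  <use-location-pattern-on-dflt-locale>1</use-location-pattern-on-dflt-locale>
  <base-file-encoding>UTF-8</base-file-encoding>
  <localized-file-encoding>UTF-8</localized-file-encoding>
  <parameter-regex-pattern></parameter-regex-pattern>
</resource-extension>
```

Like any file handled by the html parser type, a .dita file configured this way would be treated as a single unit rather than as a set of separate keys.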
LRM interaction with html parser type files
Number of keys in file is 1
All files that are parsed using the html parser have only 1 key, called key1. The value that corresponds to this key is the entire content of the file. Because there are no individual key/value pairs, html parsed files cannot be instrumented (used in our InContext Reviewer/Translation product).
Prep kit files are always full file
If the checksum of the base file has changed, the file will be sent out in the next prep kit for all target locales. Since the file contains only 1 key, the entire file is sent out for translation.
File can be pseudo-localized and number of words counted
Since LRM is able to parse the text portion of html parsed files, the files can be pseudo-localized and the number of words counted.
Example of Project Definition for Resources
The following is an example of html resource file definitions. See resource extensions for more information.
<resource-extensions>
  <resource-extension>
    <!-- parser-type not needed since .html is a standard LRM extension that maps to the html parser type -->
    <extension>html</extension>
    <file-name-pattern>*-l_c_v</file-name-pattern>
    <use-pattern-on-dflt-locale>0</use-pattern-on-dflt-locale>
    <file-location-pattern>l_c_v</file-location-pattern>
    <use-location-pattern-on-dflt-locale>1</use-location-pattern-on-dflt-locale>
    <base-file-encoding>UTF-8</base-file-encoding>
    <localized-file-encoding>UTF-8</localized-file-encoding>
    <parameter-regex-pattern></parameter-regex-pattern>
  </resource-extension>
  <resource-extension>
    <!-- parser-type not needed since .htm is a standard LRM extension that maps to the html parser type -->
    <extension>htm</extension>
    <file-name-pattern>*-l_c_v</file-name-pattern>
    <use-pattern-on-dflt-locale>0</use-pattern-on-dflt-locale>
    <file-location-pattern>l_c_v</file-location-pattern>
    <use-location-pattern-on-dflt-locale>1</use-location-pattern-on-dflt-locale>
    <base-file-encoding>UTF-8</base-file-encoding>
    <localized-file-encoding>UTF-8</localized-file-encoding>
    <parameter-regex-pattern></parameter-regex-pattern>
  </resource-extension>
  <resource-extension>
    <!-- parser-type is required because .myext is not a standard LRM extension -->
    <extension>myext</extension>
    <parser-type>html</parser-type>
    <file-name-pattern>*-l_c_v</file-name-pattern>
    <use-pattern-on-dflt-locale>0</use-pattern-on-dflt-locale>
    <file-location-pattern>l_c_v</file-location-pattern>
    <use-location-pattern-on-dflt-locale>1</use-location-pattern-on-dflt-locale>
    <base-file-encoding>UTF-8</base-file-encoding>
    <localized-file-encoding>UTF-8</localized-file-encoding>
    <parameter-regex-pattern></parameter-regex-pattern>
  </resource-extension>
</resource-extensions>