hxltmcli requires Python 3; the reference version can be installed
with hdp-toolchain.
TL;DR: pip install hdp-toolchain[hxltm] . The output of the latest
hxltmcli --help is at the bottom of this page.
The HXLTM core ontology cor.hxltm.yml
that comes with hxltmcli can be customized with
hxltmcli --archivum-configurationem path/to/mycopy.hxltm.yml .
Bootstrapping-HXLTM (very, VERY long presentation):
1. eng-Latn: Bootstrapping technical translations and multilingual controlled vocabularies with HXLTM
2. por-Latn: Como criar do zero traduções técnicas e vocabulários controlados multilíngues com HXLTM
For advanced hackers (or people with basic knowledge of Python who are
helping others in the middle of an emergency),
it is possible to copy only the single file
hxltmcli.py,
put it on your executable path, and use it immediately instead of
the hdp-toolchain version. The hard requirements are
pip install libhxl langcodes pyyaml python-liquid
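Before relying on a standalone copy of hxltmcli.py, it may help to confirm that the four packages are importable. The following is only a minimal sketch and is not part of hxltmcli: the function name `missing_requirements` is hypothetical, and the mapping from PyPI names to import names (`hxl`, `langcodes`, `yaml`, `liquid`) is an assumption based on how these packages are commonly imported.

```python
import importlib.util

# PyPI package name -> module name it is imported as (assumed mapping).
REQUIREMENTS = {
    "libhxl": "hxl",
    "langcodes": "langcodes",
    "pyyaml": "yaml",
    "python-liquid": "liquid",
}


def missing_requirements(requirements=REQUIREMENTS):
    """Return the PyPI names whose top-level module cannot be found."""
    return [
        package
        for package, module in requirements.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_requirements()
    if missing:
        # Suggest the install command for whatever is absent.
        print("pip install " + " ".join(missing))
    else:
        print("All hard requirements are importable.")
```

The check only looks for the modules; it does not verify versions.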
TL;DR: Too Long; Didn’t read
hxltmcli
Use case: "I need to convert from HXLTM to something else"
__EPILOGUM__ = """
Exemplōrum gratiā:
HXLTM (csv) -> Translation Memory eXchange format (TMX):
hxltmcli fontem.tm.hxl.csv objectivum.tmx --objectivum-TMX
HXLTM (xlsx; sheet 7) -> Translation Memory eXchange format (TMX):
hxltmcli fontem.xlsx objectivum.tmx --sheet 7 --objectivum-TMX
HXLTM (xlsx; sheet 7, Situs interretialis) -> HXLTM (csv):
hxltmcli https://example.org/fontem.xlsx --sheet 7 fontem.tm.hxl.csv
HXLTM (Google Docs) -> HXLTM (csv):
hxltmcli https://docs.google.com/spreadsheets/(...) fontem.tm.hxl.csv
HXLTM (Google Docs) -> Translation Memory eXchange format (TMX):
hxltmcli https://docs.google.com/spreadsheets/(...) objectivum.tmx \
--objectivum-TMX
"""
hxltmdexml
Use case: "I need to convert from something else (in XML) to HXLTM"
__EPILOGUM__ = """
Exemplōrum gratiā:
XML Localization Interchange File Format (XLIFF) v2.1+: -> HXLTM (bilinguam):
hxltmdexml fontem.xlf objectivum.tm.hxl.csv
XML Localization Interchange File Format (XLIFF) v1.2: -> HXLTM (bilinguam):
hxltmdexml fontem.xlf objectivum.tm.hxl.csv
Translation Memory eXchange format (TMX): -> HXLTM:
hxltmdexml fontem.tmx objectivum.tm.hxl.csv
TBX-Basic: TermBase eXchange (TBX) Basic: -> HXLTM:
hxltmdexml fontem.tbx objectivum.tm.hxl.csv
TBX-IATE (id est, https://iate.europa.eu/download-iate) -> HXLTM (por-Latn@pt)
zcat IATE_download.zip | hxltmdexml --agendum-linguam por-Latn@pt
cat IATE_export.tbx | hxltmdexml --agendum-linguam por-Latn@pt
TBX-IATE (id est, https://iate.europa.eu/download-iate) -> HXLTM (...)
hxltmdexml IATE_export.tbx IATE_export.hxltm.csv \\
--agendum-linguam bul-Latn@bg \\
--agendum-linguam ces-Latn@cs \\
--agendum-linguam dan-Latn@da \\
--agendum-linguam dut-Latn@nl \\
--agendum-linguam ell-Latn@el \\
--agendum-linguam eng-Latn@en \\
--agendum-linguam est-Latn@et \\
--agendum-linguam fin-Latn@fi \\
--agendum-linguam fra-Latn@fr \\
--agendum-linguam ger-Latn@de \\
--agendum-linguam gle-Latn@ga \\
--agendum-linguam hun-Latn@hu \\
--agendum-linguam ita-Latn@it \\
--agendum-linguam lav-Latn@lv \\
--agendum-linguam lit-Latn@lt \\
--agendum-linguam mlt-Latn@mt \\
--agendum-linguam pol-Latn@pl \\
--agendum-linguam por-Latn@pt \\
--agendum-linguam ron-Latn@ro \\
--agendum-linguam slk-Latn@sk \\
--agendum-linguam slv-Latn@sl \\
--agendum-linguam spa-Latn@es \\
--agendum-linguam swe-Latn@sv
"""
HXLTM core file dialects
HXLTM
Terminologia Multilinguae (priore HXL Trānslātiōnem Memoriam), Datum ideam
HXLTM:
__meta:
# archivum_extensionem: .tm.hxl.csv # .tm.hxl.xlsx, xlsx, ...
archivum:
extensionem:
- .tm.hxl.csv
- .tm.hxl.xlsx
- .hxltm.xml
# - .hxltm.tmx
# - .hxltm.tbx
# - HXL-proxy
# - (...)
descriptionem: |
_[eng-Latn]
`ontologia:normam.HXLTM` is an abstraction over the several data containers
of the HXLTM implementation that can store multilingual data without loss.
Some general notes:
- The most feature-complete are the HXLTM implementations that use
tabular storage (plain HXLTM in CSV, or Google Sheets, or Excel,
or HXL-proxy, or...), which can preserve valid HXLated
columns even when they are unknown to the documented HXLTM implementation.
- The `ontologia:normam.XML`, while not a tabular implementation,
contains information that allows data to be exported with
`hxltmcli --objectivum-XML` and imported with
`hxltmdexml` with more features than would be possible with other
data standards that are close to what HXLTM is: TBX, TMX and
the tabular UTX.
- Valid HXLated columns that are unknown to HXLTM are not
intended to be exported, even if the templating engine knows the
undocumented HXL tags. The idea is that the generic XML format is still
designed to export only what could be imported back using the
same ontologia.
- If you plan to do VERY long-term data storage, consider saving
the ontologia that generated the data together with it.
- A cor.hxltm.yml with 3000 lines exported to PDF (which could be
printed if your data is already printed) takes around 48 pages
(A4 format).
- It is also possible to change the tags from Latin to your natural
language. While there are better ways to save a more compact
export, if you plan to keep a backup in a library as a
physical book, then at least customize it.
[eng-Latn]_
normam:
- <https://github.com/HXL-CPLP/forum/issues/58>
nomen:
eng-Latn: 'HXLTM: Terminologia Multilinguae (Datum ideam)'
situs_interretialis:
referens_officinale:
- <https://hdp.etica.ai/hxltm>
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# These examples may help you understand the basics of how the command line works.
# Most data can be downloaded from:
# - https://github.com/EticaAI/HXL-Data-Science-file-formats/tree/main/testum/hxltm
#
# All examples on this page reuse either this folder or online spreadsheets
# used by HXL-CPLP on the HXL-CPLP/Auxilium-Humanitarium-API project.
# [eng-Latn]_
# wget https://github.com/EticaAI/HXL-Data-Science-file-formats/archive/refs/heads/main.zip
# unzip main.zip
# cd HXL-Data-Science-file-formats-main/testum/hxltm
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both print the result to stdout
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli
### III ------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both save the data from the input to
# the output file.
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv output-file.tm.hxl.csv
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli > output-file.tm.hxl.csv
### IV ------------------------------------------------------------------------
# _[eng-Latn]
# Since the HXLTM CSV/XLSX reference format acts as a container of
# several languages (e.g. a 'main' file), most commands tend to be
# related to exporting to other formats. They are documented in other
# sections.
#
# It is also possible to manipulate (e.g. filter) or generate other
# HXLated datasets from an HXLTM main file using the HXL cli tools
# or the HXL-Proxy. These advanced cases will not be covered here.
# But see:
# - https://hxlstandard.org/
# - https://proxy.hxlstandard.org/
# - https://github.com/HXLStandard/libhxl-python/wiki
# [eng-Latn]_
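Because an HXLTM CSV is an HXLated table, it carries a row of HXL hashtags between the human-readable header and the data. The sketch below uses a simplified heuristic (every non-empty cell starts with `#`) to locate that row; it is not how hxltmcli or libhxl detect it. The function name, the `max_scan` limit, and the hashtags in the sample are illustrative assumptions rather than content taken from this page.

```python
import csv
import io

# Hypothetical sample; the hashtags are illustrative, not copied from a real file.
EXEMPLUM = """Code,Latin,Portuguese
#item+conceptum+codicem,#item+rem+i_la+i_lat+is_latn,#item+rem+i_pt+i_por+is_latn
I18N_testum_salve_mundi_testum_I18N,Salvi mundi!,Olá mundo!
"""


def hxl_hashtag_row(rows, max_scan=25):
    """Return (index, row) of the first row that looks like HXL hashtags.

    Simplified heuristic: all non-empty cells must start with '#'.
    Returns None when no such row is found in the first `max_scan` rows.
    """
    for index, row in enumerate(rows[:max_scan]):
        cells = [cell.strip() for cell in row if cell.strip()]
        if cells and all(cell.startswith("#") for cell in cells):
            return index, row
    return None


if __name__ == "__main__":
    rows = list(csv.reader(io.StringIO(EXEMPLUM)))
    print(hxl_hashtag_row(rows))
```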
HXLTM-TMETA
:
HXLTM Terminologia Multilinguae Meta
HXLTM-TMETA:
__meta:
archivum:
extensionem:
- .tmeta.json
- .tmeta.yml
descriptionem: |
_[eng-Latn]
To be documented.
[eng-Latn]_
normam:
- <https://hdp.etica.ai/hxltm/archivum/#HXLTM-TMETA>
nomen:
eng-Latn: 'HXLTM Terminologia Multilinguae Meta'
situs_interretialis:
referens_officinale:
- <https://hdp.etica.ai/hxltm>
- <https://github.com/EticaAI/HXL-Data-Science-file-formats/labels/HXLTM>
- <https://github.com/EticaAI/HXL-Data-Science-file-formats/issues/24>
HXLTM-ASA
:
HXLTM Abstractum Syntaxim Arborem
HXLTM-ASA:
__meta:
archivum:
extensionem:
- .asa.hxltm.json
- .asa.hxltm.yml
normam:
- <https://hdp.etica.ai/hxltm/archivum/#HXLTM-ASA>
descriptionem: |
_[eng-Latn]
The HXLTM-ASA is a not strictly documented Abstract Syntax Tree
of a data conversion operation.
This format, unlike the HXLTM permanent storage, is not
meant to be used by end users. In fact, JSON (or other
formats, like YAML) is more a tool for users debugging the initial
reference implementation hxltmcli OR for developers using JSON
as a more advanced input than the end-user permanent storage.
Warning: The HXLTM-ASA is not meant to be a strictly documented format
even if HXLTM eventually gets used by a large public. If necessary,
some special format could be created, but this would require feedback
from the community or some work already done by implementers.
[eng-Latn]_
Trivia:
- abstractum, <https://en.wiktionary.org/wiki/abstractus#Latin>
- syntaxim, <https://en.wiktionary.org/wiki/syntaxis#Latin>
- arborem, <https://en.wiktionary.org/wiki/arbor#Latin>
- conceptum de Abstractum Syntaxim Arborem
- <https://www.wikidata.org/wiki/Q127380>
nomen:
eng-Latn: 'HXLTM Abstractum Syntaxim Arborem'
situs_interretialis:
referens_officinale:
- <https://hdp.etica.ai/hxltm>
- <https://github.com/EticaAI/HXL-Data-Science-file-formats/labels/HXLTM>
- <https://github.com/EticaAI/HXL-Data-Science-file-formats/issues/22>
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# The HXLTM-ASA is a not strictly documented Abstract Syntax Tree
# of a data conversion operation.
#
# These are quick examples. They reuse other examples in this guide, but also
# save data to a separate file.
# [eng-Latn]_
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next example will generate an XLIFF, but we will also save the HXLTM-ASA.
#
# The '--expertum-HXLTM-ASA hxltm-asa/hxltm-exemplum-linguam.asa.hxltm.json'
# will generate a JSON output of the operation.
#
# The '--expertum-HXLTM-ASA hxltm-asa/hxltm-exemplum-linguam.asa.hxltm.yml'
# will generate a YAML output of the operation.
# [eng-Latn]_
# TODO: replace HXLTM ASA example from XLIFF to TBX or TMX
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.xlf \
--expertum-HXLTM-ASA hxltm-asa/hxltm-exemplum-linguam.asa.hxltm.json \
--objectivum-XLIFF
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.xlf \
--expertum-HXLTM-ASA hxltm-asa/hxltm-exemplum-linguam.asa.hxltm.yml \
--objectivum-XLIFF
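Since the HXLTM-ASA is intentionally not strictly documented, a cautious first step when debugging is to look only at the top-level shape of the saved JSON file. The following sketch assumes nothing about the keys inside the file; the function name `asa_summary` and the keys used in the demo are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def asa_summary(path):
    """Load an HXLTM-ASA JSON file and describe its top-level structure.

    Returns a dict mapping each top-level key to the type name of its value.
    No specific key is expected, because the format is not strictly documented.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Fallback for a non-object root (e.g. a list).
    return {"<radix>": type(data).__name__}


if __name__ == "__main__":
    # Demo with a stand-in file; the keys below are invented for illustration.
    with tempfile.TemporaryDirectory() as directory:
        exemplum = Path(directory) / "exemplum.asa.hxltm.json"
        exemplum.write_text('{"a": [], "b": {}}', encoding="utf-8")
        print(asa_summary(exemplum))
```

To inspect a real file, pass the path given to `--expertum-HXLTM-ASA`, for example `hxltm-asa/hxltm-exemplum-linguam.asa.hxltm.json`.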
HXLTM Normam (HXLTM interoperability with conventions/standards)
The hxltmcli and hxltmdexml tools are designed to work with gigabyte-size
datasets (hxltmdexml imports XML as long as you map the tags and
attributes, as this page already does for TMX, TBX, XLIFF, …).
The ontology file can be customized with --archivum-configurationem,
which means it is possible both to edit existing exporters/importers and to create new ones.
CSV-3
:
CSV 3 bilingual Source + Objective + Comment
CSV-3:
__meta:
# archivum_extensionem: .csv
archivum:
extensionem: .csv
descriptionem: |
_[eng-Latn]
The hxltm "CSV-3" export format is a somewhat basic format (but one that, in
the worst case, is still accepted by several tools) with the following column order:
> "source-language","target-language",comment
> "verbum", "كلمة","Arabic translationem de latin verbum"
Some references of tools that allow conversions using this format:
- Okapi Framework
- <https://okapiframework.org/wiki/index.php/Table_Filter>
- MateCat
- <https://site.matecat.com/support/managing-language-resources/add-glossary/>
[eng-Latn]_
normam:
- <https://datatracker.ietf.org/doc/html/rfc4180>
nomen:
eng-Latn: 'CSV 3 bilingual Source + Objective + Comment'
situs_interretialis:
referens_officinale:
- <https://datatracker.ietf.org/doc/html/rfc4180>
vicipaedia:
- <https://en.wikipedia.org/wiki/Comma-separated_values>
# Trivia:
# - ASA
# - (HXLTM) Abstractum Syntaxim Arborem
asa:
# Trivia: modus operandī, https://en.wiktionary.org/wiki/modus_operandi#Latin
modus_operandi:
# - multiplum_linguam
- bilingue
formatum:
initiale: |-
{{ globum.fontem_linguam.bcp47 | default: 'la' | quotum_rem }},{{ globum.objectivum_linguam.bcp47 | default: 'ar' | quotum_rem }},commentarium
corporeum: |-
{{ rem.de_fontem_linguam.rem | quotum_rem }},{{ rem.de_objectivum_linguam.rem | quotum_rem }},""
finale: False
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# Documentation at cor.hxltm.yml:normam.CSV-3
# [eng-Latn]_
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both print the result to stdout
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv --objectivum-CSV-3
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-CSV-3
### III ------------------------------------------------------------------------
# _[eng-Latn]
# Instead of using the default source (Latin) and objective (Arabic, the
# classic one), both examples define the source (first column) and the
# objective (second column):
# - Source language:
# - Portuguese
# - Objective (target) language:
# - Spanish
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
--fontem-linguam por-Latn@pt \
--objectivum-linguam spa-Latn@es \
--objectivum-CSV-3
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-CSV-3 --fontem-linguam por-Latn@pt --objectivum-linguam spa-Latn@es
### IV ------------------------------------------------------------------------
# _[eng-Latn]
# As in the previous example, the source (first column) and the objective
# (second column) are defined explicitly:
# - Source language:
# - Portuguese
# - Objective (target) language:
# - Spanish
#
# but now, instead of printing to stdout, the result is saved to the file
# resultatum/hxltm-exemplum-linguam.por-Latn_spa-Latn.csv
# [eng-Latn]_
# hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
# resultatum/hxltm-exemplum-linguam.por-Latn_spa-Latn.csv \
# --objectivum-CSV-3
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.por-Latn_spa-Latn.csv \
--fontem-linguam por-Latn@pt \
--objectivum-linguam spa-Latn@es \
--objectivum-CSV-3
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-CSV-3 --fontem-linguam por-Latn@pt --objectivum-linguam spa-Latn@es > resultatum/hxltm-exemplum-linguam.por-Latn_spa-Latn.csv
### V ------------------------------------------------------------------------
# _[eng-Latn]
# TODO: explain how to select the source and target language
# [eng-Latn]_
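A CSV-3 export can be read back with any RFC 4180 aware parser. The sketch below uses Python's standard `csv` module; the function name `read_csv3` is hypothetical, and the sample rows are copied from this page's CSV-3 result example (Portuguese source, Spanish objective).

```python
import csv
import io

# Rows copied from the CSV-3 result example on this page.
EXEMPLUM = '''pt,es,commentarium
Olá mundo!,¡Hola mundo!,""
"Teste, 1, 2, 3","Prueba, 1, 2, 3",""
'''


def read_csv3(textum):
    """Parse CSV-3 text into (header, rows).

    `header` is the first line (source language, objective language,
    comment column name); `rows` is a list of 3-tuples.
    """
    lector = csv.reader(io.StringIO(textum))
    header = next(lector)
    rows = [tuple(row) for row in lector if row]
    return header, rows


if __name__ == "__main__":
    header, rows = read_csv3(EXEMPLUM)
    print(header)
    for fontem, objectivum, commentarium in rows:
        print(repr(fontem), "->", repr(objectivum))
```

Quoted fields that contain commas, such as `"Teste, 1, 2, 3"`, come back as a single value, which is why a CSV parser is preferable to splitting on commas.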
Result example
pt,es,commentarium
por-Latn,spa-Latn,""
Língua portuguesa,Idioma español,""
Alfabeto latino,Alfabeto latino,""
001,001,""
∅,∅,""
Olá mundo!,¡Hola mundo!,""
"Teste, 1, 2, 3","Prueba, 1, 2, 3",""
"1, 2, 3, 4, 5, 6, 7, 8, 9, 10","1, 2, 3, 4, 5, 6, 7, 8, 9, 10",""
GSheets
:
Google Sheets, HXLTM container (read-only; native support as data source)
#### XLSX, Google Sheets ____________________________________________________
# @see https://support.microsoft.com/en-us/office/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3
# @see https://support.google.com/drive/answer/37603
GSheets:
__meta:
archivum:
extensionem: # https://docs.google.com/spreadsheets/ (...)
descriptionem: |
_[eng-Latn]
Both GSheets URLs and local/remote Microsoft Excel files have built-in
read-only support in the reference cli implementation
as containers for the data source, without an intermediate file transformation
to the CSV container of HXLTM. This means humans don't need to edit CSV
files directly.
Support in `hxltmcli` for writing directly to GSheets and
Microsoft Excel is unlikely to be implemented.
[eng-Latn]_
normam:
- <https://developers.google.com/sheets/api>
nomen:
# eng-Latn: 'Google Sheets (via CSV import)'
# eng-Latn: 'Google Sheet (native support to read, but not write, data directly from GSheets)'
eng-Latn: 'Google Sheets, HXLTM container (read-only; native support as data source)'
situs_interretialis:
referens_officinale:
- <https://www.google.com/sheets/about/>
asa:
modus_operandi:
- multiplum_linguam
# - bilingue
HXL-Proxy
:
HXL-Proxy (read-only; native support as data source)
#### HXL-Proxy _______________________________________________________________
HXL-Proxy:
__meta:
archivum:
extensionem:
descriptionem: |
_[eng-Latn]
HXL Proxy, a tool for cleaning, transforming, merging, and
validating data tagged using the Humanitarian Exchange Language (HXL)
standard.
In the context of HXLTM, HXL-Proxy is recommended for:
- HXLating (e.g. adding HXL hashtags, like the ones required by HXLTM)
any data input supported by HXL-Proxy (which is a lot)
- Doing more advanced filtering (like removing columns) or merging
different datasets by HXLTM concept value with a friendly user
interface
When to use the HXL cli tools (including hxltmcli) and when to use HXL-Proxy?
While the hxl cli tools (see
<https://github.com/HXLStandard/libhxl-python/wiki/Command-line-tools>
and <https://pypi.org/project/libhxl/>) have almost all the features
of HXL-proxy (especially with the use of the JSON spec), in the real world,
under urgency, it is still faster to set up a private Docker instance than to
teach everyone to use the cli tools.
The HXLTM reference tooling will intentionally NOT implement features
that are already possible with HXL-Proxy. Some features viable with the HXL
Standard cli tools (that would be too complex to explain) may
be added either as hxltmcli / hxltmdexml CLI options or via
`ontologia:normam.HXLTM-TMETA`.
[eng-Latn]_
normam:
- <https://github.com/HXLStandard/hxl-proxy/wiki>
nomen:
eng-Latn: 'HXL-Proxy (read-only; native support as data source)'
situs_interretialis:
referens_officinale:
- <https://github.com/HXLStandard/hxl-proxy>
# For humanitarian-use only (or to learn HXL), the UN OCHA proxy:
- <https://proxy.hxlstandard.org/>
# For intranets, or for large (500.000 rows) or non-humanitarian use,
# please set up your own HXL-proxy using Docker.
- <https://hub.docker.com/r/unocha/hxl-proxy>
asa:
modus_operandi:
- multiplum_linguam
# - bilingue
JSON-kv
:
JSON key: val; id/source → target (draft)
# TODO: create at least one different exporter, JSON-2, since JSON-kv
# would be harder to explain how to document on HXLTM sheets than
# create the exporter
JSON-kv:
__meta:
archivum:
extensionem: .json
descriptionem: |
_[eng-Latn]
This exporter/importer still needs to be created. One level is trivial, but
for 2 or more nested levels it would be simpler for the end user to just use
**HXLTM Ad Hoc Fōrmulam (HXLTM templated export)** to have full
control.
[eng-Latn]_
normam:
# Not sure where to find some place to 'explain' this format
- <https://angular.io/guide/i18n#change-the-source-language-file-location>
- <https://www.i18next.com/misc/json-format>
- <https://lokalise.com/blog/how-to-internationalize-react-application-using-i18next/>
nomen:
eng-Latn: 'JSON key: val; id/source -> target (draft)'
situs_interretialis:
referens_officinale: []
exemplum:
- <https://github.com/i18next/react-i18next/blob/master/example/react/public/locales/de/translation.json>
asa:
modus_operandi:
# - multiplum_linguam
- bilingue
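To illustrate why a single level is considered trivial, the sketch below turns bilingual (source, target) pairs into a flat JSON object. This is not the JSON-kv exporter, which per the description above does not exist yet; the function name `json_kv` and the choice of the source text as the key are assumptions made only for illustration.

```python
import json


def json_kv(pairs):
    """Serialise (key, value) pairs as a flat, one-level JSON object.

    Later pairs silently overwrite earlier ones that share the same key.
    """
    return json.dumps(dict(pairs), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Pairs taken from the por-Latn / spa-Latn examples on this page.
    pairs = [
        ("Olá mundo!", "¡Hola mundo!"),
        ("Teste, 1, 2, 3", "Prueba, 1, 2, 3"),
    ]
    print(json_kv(pairs))
```

Nested keys (two or more levels) would need a convention for splitting identifiers, which is where a templated export gives more control.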
TBX-Basim
:
TermBase eXchange (TBX) Basic 2.1
#### TBX-Basic: TermBase eXchange (TBX) Basic _______________________________
TBX-Basim:
__meta:
archivum:
extensionem: .tbx
descriptionem: |
_[eng-Latn]
See the links
[eng-Latn]_
- <http://www.terminorgs.net/Terminology-Starter-Guide.html>
- <https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf>
- <http://www.ttt.org/oscarStandards/tbx/tbx-basic.html>
exemplum:
- <http://www.ttt.org/oscarStandards/tbx/TBXBasic.zip>
normam:
- <http://www.terminorgs.net/downloads/TBX_Basic_Version_3.1.pdf>
nomen:
eng-Latn: 'TermBase eXchange (TBX) Basic 2.1'
situs_interretialis:
referens_officinale:
- <http://www.terminorgs.net/TBX-Basic.html>
- <http://www.ttt.org/oscarStandards/tbx/TBXBasic.zip>
- <http://www.ttt.org/oscarStandards/tbx/tbx-basic.html>
asa:
modus_operandi:
- multiplum_linguam
# - bilingue
de_xml:
# This is a working draft
# @see https://terminator.readthedocs.io/en/latest/tbx_conformance.html
# ontologia libellam: I glossarium > II conceptum > III linguam > IV terminum
glossarium_radicem:
signum: martif
glossarium_titulum:
signum: title
# de_attributum: False
trivium:
# de <martif> ad <title>
- martifHeader
- fileDesc
# II conceptum
conceptum_codicem:
signum: termEntry
de_attributum: id
trivium:
# de <martif> ad <termEntry>
- text
- body
# III linguam
linguam_codicem:
signum: langSet # 'la' ad <langSet xml:lang="la">
de_attributum: lang
trivium: []
# IV terminum
terminum_habendum_accuratum: True # TBX terminum habendum accuratum? Verum
terminum_habendum_multum: True
terminum_habendum_fontem: False # TBX terminum habendum fontem? Falsum
terminum_habendum_objectivum: False # TBX terminum habendum objectivum? Falsum
# @see https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf
terminum_accuratum:
# Exemplum: <descrip type="reliabilityCode">1</descrip>
ad: XML-nodum-textum
de_signum: descrip
de_attributum:
type: reliabilityCode
# de_attributum: False
viam_trivium: []
# - termSec # de <langSec> ad <term>
terminum_valorem:
signum: term # 'lat-Latn' ad <langSet xml:lang="la"><tig><term>lat-Latn</term></tig></langSet>
# de_attributum: False
trivium:
- tig # de <langSet> ad <term>
formatum:
initiale: |2
<?xml version='1.0'?>
<!DOCTYPE martif SYSTEM "TBXBasiccoreStructV02.dtd">
<martif type="TBX-Basic"
xml:lang="{{ globum.fontem_linguam.iso6391a2 | default: globum.fontem_linguam.iso6391a2 | default: 'la' }}">
<martifHeader>
<fileDesc>
<titleStmt>
<title>TBX-Basic Sample File</title>
<note>
Lorem ipsum dolor semet
</note>
</titleStmt>
<sourceDesc>
<p>Lorem ipsum dolor semet</p>
</sourceDesc>
</fileDesc>
</martifHeader>
<text>
<body>
# _[eng-Latn]
# NOTE: some IDs, like
# <termEntry id="I18N_०१२३४५६७८९_〇一二三四五六七八九十百千万亿_-1+2/3*4_٩٨٧٦٥٤٣٢١٠_零壹贰叁肆伍陆柒捌玖拾佰仟萬億_I18N">
# will generate errors to validate TBX with TBXBasiccoreStructV02.dtd
# from http://www.ttt.org/oscarStandards/tbx/TBXBasic.zip
# The problematic part is '+2/3*': '*', '+', '/'
# One way to replace:
# {{ conceptum.codicem | default: rem.de_nomen_breve.conceptum_codicem | default: 'errorem' | replace: "*", "%2A" | replace: "+", "%2B" | replace: "/", "%2F" }}
# [eng-Latn]_
corporeum: |2
<termEntry id="{{ conceptum.codicem | default: rem.de_nomen_breve.conceptum_codicem | default: 'errorem' | replace: '*', '' | replace: '+', '' | replace: '/', '' }}">
{% for item in rem.de_linguam %}
{% if item[1].rem != '' -%}
<langSet xml:lang="{{ item[1].bcp47 }}">
<tig>
<term>
{{ item[1].rem }}
</term>
</tig>
</langSet>
{%- endif -%}
{%- endfor %}
</termEntry>
finale: |2
</body>
</text>
</martif>
#### Term Base eXchange (TBX) 2008 CC-BY License _____________________________
# Term Base eXchange (TBX) (identical to ISO 30042:2008)
TBX-2008:
archivum:
extensionem: .tbx
normam:
- <https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf>
situs_interretialis:
referens_officinale:
- <https://www.gala-global.org/knowledge-center/industry-development/standards/lisa-oscar-standards>
_temporarium:
- <https://www.termbases.eu/>
- <http://www.tbxinfo.net/>
# - <https://github.com/byutrg/TBX-Spec>
- <https://byutrg.github.io/TBX-Implementor/>
- <https://github.com/byutrg/baseterm>
- <https://github.com/LTAC-Global/TBX-Basic_ImplementationGuide>
# - https://www.tbxinfo.net/tbx-dialects/
# - Convert existing spreadsheet based glossaries into TBX format
# (see our tutorial and sample spreadsheets).
# - https://www.tbxinfo.net/wp-content/uploads/2016/06/Spreadsheet-to-TBX-Min-Tutorial.pdf
# - https://www.tbxinfo.net/wp-content/uploads/2016/05/sampleSpreadsheets.zip
# - https://multilingual.com/issues/july-aug-2019/tbx-version-3-published-at-iso/
asa:
modus_operandi:
- multiplum_linguam
# - bilingue
#### TermBase eXchange (TBX) ISO 30042:2019 proprietary format _______________
TBX-2019:
__meta:
archivum:
extensionem: .tbx
descriptionem: |
- <https://www.tbxinfo.net/about/>
- <https://www.iso.org/standard/62510.html>
## TBX-IATE
Trivia: <https://iate.europa.eu/fields-explained>
### `hxltmdexml` --agendum-linguam
- bul-Latn@bg
- Bulgarian: bg; <https://iso639-3.sil.org/code/bul>
- ces-Latn@cs
- Czech: cs; <https://iso639-3.sil.org/code/ces>
- dan-Latn@da
- Danish: da; <https://iso639-3.sil.org/code/dan>
- dut-Latn@nl
- Dutch: nl; <https://iso639-3.sil.org/code/dut>
- ell-Latn@el
- Greek: el; <https://iso639-3.sil.org/code/ell>
- eng-Latn@en
- English: en; <https://iso639-3.sil.org/code/eng>
- est-Latn@et
- Estonian: et; <https://iso639-3.sil.org/code/est>
- fin-Latn@fi
- Finnish: fi; <https://iso639-3.sil.org/code/fin>
- fra-Latn@fr
- French: fr; <https://iso639-3.sil.org/code/fra>
- ger-Latn@de
- German: de; <https://iso639-3.sil.org/code/ger>
- gle-Latn@ga
- Irish: ga; <https://iso639-3.sil.org/code/gle>
- hun-Latn@hu
- Hungarian: hu; <https://iso639-3.sil.org/code/hun>
- ita-Latn@it
- Italian: it; <https://iso639-3.sil.org/code/ita>
- lav-Latn@lv
- Latvian: lv; <https://iso639-3.sil.org/code/lav>
- lit-Latn@lt
- Lithuanian: lt; <https://iso639-3.sil.org/code/lit>
- mlt-Latn@mt
- Maltese: mt; <https://iso639-3.sil.org/code/mlt>
- pol-Latn@pl
- Polish: pl; <https://iso639-3.sil.org/code/pol>
- por-Latn@pt
- Portuguese: pt; <https://iso639-3.sil.org/code/por>
- ron-Latn@ro
- Romanian: ro; <https://iso639-3.sil.org/code/ron>
- scr-Latn@hr
- Croatian: hr; <https://iso639-3.sil.org/code/scr>
- slk-Latn@sk
- Slovak: sk; <https://iso639-3.sil.org/code/slk>
- slv-Latn@sl
- Slovene: sl; <https://iso639-3.sil.org/code/slv>
- spa-Latn@es
- Spanish: es; <https://iso639-3.sil.org/code/spa>
- swe-Latn@sv
- Swedish: sv; <https://iso639-3.sil.org/code/swe>
normam:
- proprietary format
- ¯\_(ツ)_/¯
nomen:
eng-Latn: 'TermBase eXchange (TBX) ISO 30042:2019 proprietary format'
situs_interretialis:
referens_officinale:
- proprietary format
- ¯\_(ツ)_/¯
asa:
modus_operandi:
- multiplum_linguam
# - bilingue
de_xml:
# ontologia libellam: I glossarium > II conceptum > III linguam > IV terminum
glossarium_radicem:
signum: tbx # TBX-Basic: termEntry
# <tbx type="TBX-IATE" style="dca" xml:lang="en" xmlns="urn:iso:std:iso:30042:ed-2">
glossarium_titulum:
signum: title
# de_attributum: False
trivium:
# de <tbx> ad <title>
- tbxHeader
- fileDesc
# II conceptum
conceptum_codicem:
signum: conceptEntry # TBX-Basic: termEntry
de_attributum: id
trivium:
# de <martif> ad <termEntry>
- text
- body
# III linguam
linguam_codicem:
signum: langSec # TBX-Basic: langSet
de_attributum: lang
trivium:
- termSec
# IV terminum
terminum_habendum_accuratum: True
terminum_habendum_multum: True
terminum_habendum_fontem: False
terminum_habendum_objectivum: False
terminum_habendum_typum: True
terminum_accuratum:
# Exemplum: <descrip type="reliabilityCode">1</descrip>
ad: XML-nodum-textum
signum: descrip
de_attributum:
type: reliabilityCode
trivium:
- termSec # de <langSec> ad <term>
# terminum_accuratum:
# # Exemplum: <descrip type="reliabilityCode">1</descrip>
# ad: XML-nodum-textum
# de_signum: descrip
# de_attributum:
# type: reliabilityCode
# # de_attributum: False
# viam_trivium: []
# # - termSec # de <langSec> ad <term>
terminum_fontem: False # TBX terminum habendum fontem? Falsum
terminum_objectivum: False # TBX terminum habendum objectivum? Falsum
terminum_valorem:
signum: term # 'lat-Latn' ad <langSet xml:lang="la"><tig><term>lat-Latn</term></tig></langSet>
# de_attributum: False
trivium:
- tig # de <langSet> ad <term>
terminum_typum:
# Exemplum: <descrip type="reliabilityCode">1</descrip>
ad: XML-nodum-textum
signum: termNote
de_attributum:
type: termType
in_praefixum: 'TBX_'
in_suffixum: ''
trivium: []
formatum:
initiale: False
corporeum: False
finale: False
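The TBX-Basim `corporeum` template above removes `*`, `+` and `/` from concept codes before using them as `termEntry` ids, because such ids fail validation against TBXBasiccoreStructV02.dtd. The same rule can be sketched outside Liquid as follows; the function name `termentry_id` is hypothetical, and the fallback value mirrors the template's `default: 'errorem'`.

```python
def termentry_id(codicem, default="errorem", forbidden="*+/"):
    """Return a concept code with the problematic characters removed.

    Mirrors the chained `replace` filters of the TBX-Basim template:
    an empty or missing code falls back to `default`.
    """
    if not codicem:
        return default
    # Map each forbidden character to None so str.translate deletes it.
    return codicem.translate({ord(character): None for character in forbidden})


if __name__ == "__main__":
    print(termentry_id("I18N_-1+2/3*4_I18N"))  # prints I18N_-1234_I18N
    print(termentry_id(""))                    # prints errorem
```

Stripping characters can make two different codes collide; the percent-style replacement mentioned in the template comment avoids that, at the cost of ids that may still need checking against the DTD.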
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# Explanation about the format at cor.hxltm.yml:normam.TBX-Basim
# [eng-Latn]_
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both print the result to stdout
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv --objectivum-TBX-Basim
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-TBX-Basim
### III ------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: they save the input data on a file on
# disk.
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.tbx \
--objectivum-TBX-Basim
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-TBX-Basim > resultatum/hxltm-exemplum-linguam.tbx
### IV ------------------------------------------------------------------------
# _[eng-Latn]
# TODO: document eventual new options to the --objectivum-TMX here.
# [eng-Latn]_
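When the same export is run from a script, it can be convenient to assemble the command as an argument list first. The sketch below only builds the list and does not run anything; the function name `hxltmcli_argv` and its parameters are hypothetical, while the flags it emits (`--fontem-linguam`, `--objectivum-linguam`, `--objectivum-<format>`) are the ones shown in the examples on this page.

```python
import shlex


def hxltmcli_argv(fontem, objectivum=None, formatum=None,
                  fontem_linguam=None, objectivum_linguam=None):
    """Build an hxltmcli argument list without executing it."""
    argv = ["hxltmcli", fontem]
    if objectivum is not None:
        argv.append(objectivum)  # output file; omitted means stdout
    if fontem_linguam is not None:
        argv += ["--fontem-linguam", fontem_linguam]
    if objectivum_linguam is not None:
        argv += ["--objectivum-linguam", objectivum_linguam]
    if formatum is not None:
        argv.append("--objectivum-" + formatum)
    return argv


if __name__ == "__main__":
    argv = hxltmcli_argv(
        "hxltm-exemplum-linguam.tm.hxl.csv",
        "resultatum/hxltm-exemplum-linguam.tbx",
        formatum="TBX-Basim",
    )
    print(shlex.join(argv))
    # To execute it (requires hxltmcli on the PATH):
    # import subprocess; subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.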
Result example
<?xml version='1.0'?>
<!DOCTYPE martif SYSTEM "TBXBasiccoreStructV02.dtd">
<martif type="TBX-Basic"
xml:lang="la">
<martifHeader>
<fileDesc>
<titleStmt>
<title>TBX-Basic Sample File</title>
<note>
Lorem ipsum dolor semet
</note>
</titleStmt>
<sourceDesc>
<p>Lorem ipsum dolor semet</p>
</sourceDesc>
</fileDesc>
</martifHeader>
<text>
<body>
<termEntry id="L10N_ego_codicem">
<langSet xml:lang="la">
<tig>
<term>
lat-Latn
</term>
</tig>
</langSet>
<langSet xml:lang="pt">
<tig>
<term>
por-Latn
</term>
</tig>
</langSet>
<langSet xml:lang="en">
<tig>
<term>
eng-Latn
</term>
</tig>
</langSet>
<langSet xml:lang="es">
<tig>
<term>
spa-Latn
</term>
</tig>
</langSet>
<langSet xml:lang="ar">
<tig>
<term>
arb-Arab
</term>
</tig>
</langSet>
<langSet xml:lang="hi">
<tig>
<term>
hin-Deva
</term>
</tig>
</langSet>
<langSet xml:lang="sl">
<tig>
<term>
slv-Latn
</term>
</tig>
</langSet>
</termEntry>
<termEntry id="L10N_ego_linguam_nomen">
<langSet xml:lang="la">
<tig>
<term>
Lingua Latina
</term>
</tig>
</langSet>
<langSet xml:lang="pt">
<tig>
<term>
Língua portuguesa
</term>
</tig>
</langSet>
<langSet xml:lang="en">
<tig>
<term>
English language
</term>
</tig>
</langSet>
<langSet xml:lang="es">
<tig>
<term>
Idioma español
</term>
</tig>
</langSet>
<langSet xml:lang="ar">
<tig>
<term>
اللغة العربية
</term>
</tig>
</langSet>
<langSet xml:lang="hi">
<tig>
<term>
हिन्दी भाषा
</term>
</tig>
</langSet>
<langSet xml:lang="sl">
<tig>
<term>
Slovenščina
</term>
</tig>
</langSet>
</termEntry>
<termEntry id="L10N_ego_scriptum_nomen">
<langSet xml:lang="la">
<tig>
<term>
Abecedarium Latinum
</term>
</tig>
</langSet>
<langSet xml:lang="pt">
<tig>
<term>
Alfabeto latino
</term>
</tig>
</langSet>
<langSet xml:lang="en">
<tig>
<term>
Latin script
</term>
</tig>
</langSet>
<langSet xml:lang="es">
<tig>
<term>
Alfabeto latino
</term>
</tig>
</langSet>
<langSet xml:lang="hi">
<tig>
<term>
देवनागरी लिपि
</term>
</tig>
</langSet>
<langSet xml:lang="sl">
<tig>
<term>
Latinska abeceda
</term>
</tig>
</langSet>
</termEntry>
<termEntry id="L10N_ego_patriam_UN_M49_numerum">
<langSet xml:lang="la">
<tig>
<term>
001
</term>
</tig>
</langSet>
<langSet xml:lang="pt">
<tig>
<term>
001
</term>
</tig>
</langSet>
<langSet xml:lang="en">
<tig>
<term>
001
</term>
</tig>
</langSet>
<langSet xml:lang="es">
<tig>
<term>
001
</term>
</tig>
</langSet>
<langSet xml:lang="ar">
<tig>
<term>
001
</term>
</tig>
</langSet>
<langSet xml:lang="hi">
<tig>
<term>
001
</term>
</tig>
</langSet>
<langSet xml:lang="sl">
<tig>
<term>
001
</term>
</tig>
</langSet>
</termEntry>
<termEntry id="L10N_ego_patriam_UN_P_codicem">
<langSet xml:lang="la">
<tig>
<term>
∅
</term>
</tig>
</langSet>
<langSet xml:lang="pt">
<tig>
<term>
∅
</term>
</tig>
</langSet>
<langSet xml:lang="en">
<tig>
<term>
∅
</term>
</tig>
</langSet>
<langSet xml:lang="es">
<tig>
<term>
∅
</term>
</tig>
</langSet>
<langSet xml:lang="ar">
<tig>
<term>
∅
</term>
</tig>
</langSet>
<langSet xml:lang="hi">
<tig>
<term>
∅
</term>
</tig>
</langSet>
<langSet xml:lang="sl">
<tig>
<term>
∅
</term>
</tig>
</langSet>
</termEntry>
<termEntry id="I18N_testum_salve_mundi_testum_I18N">
<langSet xml:lang="la">
<tig>
<term>
Salvi mundi!
</term>
</tig>
</langSet>
<langSet xml:lang="pt">
<tig>
<term>
Olá mundo!
</term>
</tig>
</langSet>
<langSet xml:lang="en">
<tig>
<term>
Hello, World!
</term>
</tig>
</langSet>
<langSet xml:lang="es">
<tig>
<term>
¡Hola mundo!
</term>
</tig>
</langSet>
<langSet xml:lang="hi">
<tig>
<term>
नमस्ते दुनिया
</term>
</tig>
</langSet>
</termEntry>
<termEntry id="I18N_إختبار_טעסט_测试_테스트_испытание_I18N">
<langSet xml:lang="la">
<tig>
<term>
Testum, I, II, III
</term>
</tig>
</langSet>
<langSet xml:lang="pt">
<tig>
<term>
Teste, 1, 2, 3
</term>
</tig>
</langSet>
<langSet xml:lang="en">
<tig>
<term>
Test, 1, 2, 3
</term>
</tig>
</langSet>
<langSet xml:lang="es">
<tig>
<term>
Prueba, 1, 2, 3
</term>
</tig>
</langSet>
<langSet xml:lang="hi">
<tig>
<term>
परीक्षा, १, २, ३
</term>
</tig>
</langSet>
</termEntry>
<termEntry id="I18N_०१२३४५६७८९_〇一二三四五六七八九十百千万亿_-1234_٩٨٧٦٥٤٣٢١٠_零壹贰叁肆伍陆柒捌玖拾佰仟萬億_I18N">
<langSet xml:lang="la">
<tig>
<term>
I,V, X, L, C, D, M
</term>
</tig>
</langSet>
<langSet xml:lang="pt">
<tig>
<term>
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
</term>
</tig>
</langSet>
<langSet xml:lang="en">
<tig>
<term>
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
</term>
</tig>
</langSet>
<langSet xml:lang="es">
<tig>
<term>
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
</term>
</tig>
</langSet>
<langSet xml:lang="hi">
<tig>
<term>
०, १, २, ३, ४, ५, ६, ७, ८, ९
</term>
</tig>
</langSet>
<langSet xml:lang="sl">
<tig>
<term>
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
</term>
</tig>
</langSet>
</termEntry>
</body>
</text>
</martif>
TSV-3
:
TSV-3 bilingual Source + Objective + Comment
TSV-3:
__meta:
archivum:
extensionem: .tab
descriptionem: |
_[eng-Latn]
The hxltm "TSV-3" is the version of the "CSV-3" with tabs.
It exports with the following column order:
> source-language target-language comment
> verbum كلمة Arabic translationem de latin verbum
This format is less common than CSV-3, but may be useful when tab
is never used inside the fields.
[eng-Latn]_
normam:
- <https://datatracker.ietf.org/doc/html/rfc4180>
- <https://www.iana.org/assignments/media-types/text/tab-separated-values>
# - <http://dataprotocols.org/linear-tsv/>
nomen:
eng-Latn: 'TSV-3 bilingual Source + Objective + Comment'
situs_interretialis:
vicipaedia:
- <https://en.wikipedia.org/wiki/Tab-separated_values>
asa:
modus_operandi:
# - multiplum_linguam
- bilingue
formatum:
initiale: |-
{{ globum.fontem_linguam.bcp47 | default: 'la' }} {{ globum.objectivum_linguam.bcp47 | default: 'ar' }} commentarium
corporeum: |-
{{ rem.de_fontem_linguam.rem | quotum_rem: ' ' }} {{ rem.de_objectivum_linguam.rem | quotum_rem: ' ' }}
finale: False
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# Documentation at cor.hxltm.yml:normam.TSV-3
# [eng-Latn]_
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both print the result to stdout
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv --objectivum-TSV-3
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-TSV-3
### III ------------------------------------------------------------------------
# _[eng-Latn]
# Instead of using the default source (Latin) and objective (Arabic)
# languages, both examples define the source (first column) and objective
# (second column) languages:
# - Source language:
# - Portuguese
# - Objective (target) language:
# - Spanish
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
--fontem-linguam por-Latn@pt \
--objectivum-linguam spa-Latn@es \
--objectivum-TSV-3
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-TSV-3 --fontem-linguam por-Latn@pt --objectivum-linguam spa-Latn@es
### IV -------------------------------------------------------------------------
# _[eng-Latn]
# Instead of using the default source (Latin) and objective (Arabic)
# languages, both examples define the source (first column) and objective
# (second column) languages:
# - Source language:
# - Portuguese
# - Objective (target) language:
# - Spanish
#
# but now, instead of printing to stdout, save the result to the file
# resultatum/hxltm-exemplum-linguam.por-Latn_spa-Latn.tsv
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.por-Latn_spa-Latn.tsv \
--fontem-linguam por-Latn@pt \
--objectivum-linguam spa-Latn@es \
--objectivum-TSV-3
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-TSV-3 --fontem-linguam por-Latn@pt --objectivum-linguam spa-Latn@es > resultatum/hxltm-exemplum-linguam.por-Latn_spa-Latn.tsv
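When these calls are scripted from Python, the argument list can be assembled programmatically. The following is a minimal sketch and not part of hxltmcli: the helper names build_hxltmcli_args and run_hxltmcli are hypothetical, and only the positional arguments and flags shown in the examples above are used. It assumes hxltmcli is available on the executable path.

```python
import shlex
import subprocess
from typing import List, Optional


def build_hxltmcli_args(
    fontem: str,
    objectivum_flag: str,
    resultatum: Optional[str] = None,
    fontem_linguam: Optional[str] = None,
    objectivum_linguam: Optional[str] = None,
) -> List[str]:
    """Build an argument list using only flags shown in the examples above."""
    args = ["hxltmcli", fontem]
    if resultatum is not None:
        # Second positional argument: write to this file instead of stdout.
        args.append(resultatum)
    if fontem_linguam is not None:
        args += ["--fontem-linguam", fontem_linguam]
    if objectivum_linguam is not None:
        args += ["--objectivum-linguam", objectivum_linguam]
    args.append(objectivum_flag)
    return args


def run_hxltmcli(args: List[str]) -> str:
    """Run the command and return its stdout (requires hxltmcli on PATH)."""
    completed = subprocess.run(args, check=True, capture_output=True, text=True)
    return completed.stdout


# Hypothetical usage: same options as the Portuguese -> Spanish example.
example_args = build_hxltmcli_args(
    "hxltm-exemplum-linguam.tm.hxl.csv",
    "--objectivum-TSV-3",
    fontem_linguam="por-Latn@pt",
    objectivum_linguam="spa-Latn@es",
)
print(shlex.join(example_args))
```

Passing a list to subprocess.run avoids shell quoting issues with file names; shlex.join is used only to display the equivalent command line.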
Result example
pt es commentarium
por-Latn spa-Latn
Língua portuguesa Idioma español
Alfabeto latino Alfabeto latino
001 001
∅ ∅
Olá mundo! ¡Hola mundo!
Teste, 1, 2, 3 Prueba, 1, 2, 3
1, 2, 3, 4, 5, 6, 7, 8, 9, 10 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
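A file in this layout can be read back with the Python standard library. The sketch below is illustrative only: the function name read_tsv3 is hypothetical, the sample is a shortened copy of the result above, and it assumes the columns are separated by real tab characters (written explicitly as \t here) and that the comment column may be missing on a row.

```python
import csv
import io

# Shortened copy of the result example; tab separators are an assumption.
SAMPLE_TSV3 = (
    "pt\tes\tcommentarium\n"
    "por-Latn\tspa-Latn\n"
    "Língua portuguesa\tIdioma español\n"
    "Olá mundo!\t¡Hola mundo!\n"
)


def read_tsv3(text):
    """Return (source_lang, target_lang, rows) parsed from TSV-3 text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    source_lang, target_lang = header[0], header[1]
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        # Pad short rows so the optional comment column is always present.
        source, target, comment = (fields + ["", "", ""])[:3]
        rows.append({"source": source, "target": target, "comment": comment})
    return source_lang, target_lang, rows


source_lang, target_lang, rows = read_tsv3(SAMPLE_TSV3)
for row in rows:
    print(source_lang, row["source"], "|", target_lang, row["target"])
```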
TMX
:
Translation Memory eXchange format (TMX)
TMX:
__meta:
archivum:
extensionem: .tmx
normam:
- https://www.gala-global.org/tmx-14b
- https://www.gala-global.org/sites/default/files/migrated-pages/docs/tmx14%20%281%29.dtd
nomen:
eng-Latn: 'Translation Memory eXchange format (TMX)'
situs_interretialis:
referens_officinale:
- https://www.gala-global.org/knowledge-center/industry-development/standards/lisa-oscar-standards
asa:
modus_operandi:
- multiplum_linguam
# - bilingue
de_xml:
# ontologia libellam: I glossarium > II conceptum > III linguam > IV terminum
glossarium_radicem:
signum: tmx
# <!DOCTYPE tmx SYSTEM "tmx14.dtd"><tmx version="1.4">...
glossarium_titulum: False
# II conceptum
conceptum_codicem:
signum: tu # <tu tuid="L10N_ego_codicem">
de_attributum: tuid
trivium:
# de <tmx> ad <tu>
- body
# III linguam
linguam_codicem:
signum: tuv
de_attributum: lang
trivium: []
# IV terminum
# terminum_habendum_accuratum: True
terminum_habendum_multum: True
terminum_habendum_fontem: False # TMX terminum habendum fontem? Falsum
terminum_habendum_objectivum: False # TMX terminum habendum objectivum? Falsum
# terminum_accuratum: False # TMX terminum habendum accuratum? Falsum
# terminum_fontem: False # TMX terminum habendum fontem? Falsum
# terminum_objectivum: False # TMX terminum habendum objectivum? Falsum
terminum_valorem:
signum: seg # 'lat-Latn' ad <tu tuid="L10N_ego_codicem"><tuv xml:lang="la"><seg>lat-Latn</seg>
# de_attributum: False
trivium: []
formatum:
initiale: |2
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE tmx SYSTEM "tmx14.dtd">
<tmx version="1.4">
<header
creationtool="hxltmcli.py"
creationtoolversion="{{ globum.instrumentum_versionem }}"
segtype="sentence"
o-tmf="UTF-8"
adminlang="{{ globum.instrumentum_versionem }}"
srclang="{{ globum.fontem_linguam.iso6391a2 | default: globum.fontem_linguam.iso6391a2 | default: 'la' }}"
datatype="PlainText"
/>
<body>
corporeum: |2
<tu tuid="{{ conceptum.codicem | default: rem.de_nomen_breve.conceptum_codicem | default: 'errorem' }}">
{%- if conceptum.conceptum_codicem_wikidata %}
<prop type="wikidata">{{ conceptum.conceptum_codicem_wikidata }}</prop>
{% endif -%}
{% for item in rem.de_linguam %}
{% if item[1].rem != '' -%}
<tuv xml:lang="{{ item[1].bcp47 }}">
<seg>{{ item[1].rem }}</seg>
</tuv>
{%- endif -%}
{%- endfor %}
</tu>
finale: |2
</body>
</tmx>
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# Explanation about the format at cor.hxltm.yml:normam.TMX
# [eng-Latn]_
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both print the result to stdout
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv --objectivum-TMX
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-TMX
### III ------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both save the result to a file on
# disk.
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.tmx \
--objectivum-TMX
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-TMX > resultatum/hxltm-exemplum-linguam.tmx
### IV -------------------------------------------------------------------------
# _[eng-Latn]
# @TODO: document any new options for --objectivum-TMX here, like
# --agendum-linguam
# [eng-Latn]_
Result example
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE tmx SYSTEM "tmx14.dtd">
<tmx version="1.4">
<header
creationtool="hxltmcli.py"
creationtoolversion=""
segtype="sentence"
o-tmf="UTF-8"
adminlang=""
srclang="la"
datatype="PlainText"
/>
<body>
<tu tuid="L10N_ego_codicem">
<tuv xml:lang="la">
<seg>lat-Latn</seg>
</tuv>
<tuv xml:lang="pt">
<seg>por-Latn</seg>
</tuv>
<tuv xml:lang="en">
<seg>eng-Latn</seg>
</tuv>
<tuv xml:lang="es">
<seg>spa-Latn</seg>
</tuv>
<tuv xml:lang="ar">
<seg>arb-Arab</seg>
</tuv>
<tuv xml:lang="hi">
<seg>hin-Deva</seg>
</tuv>
<tuv xml:lang="sl">
<seg>slv-Latn</seg>
</tuv>
</tu>
<tu tuid="L10N_ego_linguam_nomen">
<tuv xml:lang="la">
<seg>Lingua Latina</seg>
</tuv>
<tuv xml:lang="pt">
<seg>Língua portuguesa</seg>
</tuv>
<tuv xml:lang="en">
<seg>English language</seg>
</tuv>
<tuv xml:lang="es">
<seg>Idioma español</seg>
</tuv>
<tuv xml:lang="ar">
<seg>اللغة العربية</seg>
</tuv>
<tuv xml:lang="hi">
<seg>हिन्दी भाषा</seg>
</tuv>
<tuv xml:lang="sl">
<seg>Slovenščina</seg>
</tuv>
</tu>
<tu tuid="L10N_ego_scriptum_nomen">
<tuv xml:lang="la">
<seg>Abecedarium Latinum</seg>
</tuv>
<tuv xml:lang="pt">
<seg>Alfabeto latino</seg>
</tuv>
<tuv xml:lang="en">
<seg>Latin script</seg>
</tuv>
<tuv xml:lang="es">
<seg>Alfabeto latino</seg>
</tuv>
<tuv xml:lang="hi">
<seg>देवनागरी लिपि</seg>
</tuv>
<tuv xml:lang="sl">
<seg>Latinska abeceda</seg>
</tuv>
</tu>
<tu tuid="L10N_ego_patriam_UN_M49_numerum">
<tuv xml:lang="la">
<seg>001</seg>
</tuv>
<tuv xml:lang="pt">
<seg>001</seg>
</tuv>
<tuv xml:lang="en">
<seg>001</seg>
</tuv>
<tuv xml:lang="es">
<seg>001</seg>
</tuv>
<tuv xml:lang="ar">
<seg>001</seg>
</tuv>
<tuv xml:lang="hi">
<seg>001</seg>
</tuv>
<tuv xml:lang="sl">
<seg>001</seg>
</tuv>
</tu>
<tu tuid="L10N_ego_patriam_UN_P_codicem">
<tuv xml:lang="la">
<seg>∅</seg>
</tuv>
<tuv xml:lang="pt">
<seg>∅</seg>
</tuv>
<tuv xml:lang="en">
<seg>∅</seg>
</tuv>
<tuv xml:lang="es">
<seg>∅</seg>
</tuv>
<tuv xml:lang="ar">
<seg>∅</seg>
</tuv>
<tuv xml:lang="hi">
<seg>∅</seg>
</tuv>
<tuv xml:lang="sl">
<seg>∅</seg>
</tuv>
</tu>
<tu tuid="I18N_testum_salve_mundi_testum_I18N">
<tuv xml:lang="la">
<seg>Salvi mundi!</seg>
</tuv>
<tuv xml:lang="pt">
<seg>Olá mundo!</seg>
</tuv>
<tuv xml:lang="en">
<seg>Hello, World!</seg>
</tuv>
<tuv xml:lang="es">
<seg>¡Hola mundo!</seg>
</tuv>
<tuv xml:lang="hi">
<seg>नमस्ते दुनिया</seg>
</tuv>
</tu>
<tu tuid="I18N_إختبار_טעסט_测试_테스트_испытание_I18N">
<tuv xml:lang="la">
<seg>Testum, I, II, III</seg>
</tuv>
<tuv xml:lang="pt">
<seg>Teste, 1, 2, 3</seg>
</tuv>
<tuv xml:lang="en">
<seg>Test, 1, 2, 3</seg>
</tuv>
<tuv xml:lang="es">
<seg>Prueba, 1, 2, 3</seg>
</tuv>
<tuv xml:lang="hi">
<seg>परीक्षा, १, २, ३</seg>
</tuv>
</tu>
<tu tuid="I18N_०१२३४५६७८९_〇一二三四五六七八九十百千万亿_-1+2/3*4_٩٨٧٦٥٤٣٢١٠_零壹贰叁肆伍陆柒捌玖拾佰仟萬億_I18N">
<tuv xml:lang="la">
<seg>I,V, X, L, C, D, M</seg>
</tuv>
<tuv xml:lang="pt">
<seg>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</seg>
</tuv>
<tuv xml:lang="en">
<seg>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</seg>
</tuv>
<tuv xml:lang="es">
<seg>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</seg>
</tuv>
<tuv xml:lang="hi">
<seg>०, १, २, ३, ४, ५, ६, ७, ८, ९</seg>
</tuv>
<tuv xml:lang="sl">
<seg>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</seg>
</tuv>
</tu>
</body>
</tmx>
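The de_xml mapping for TMX (concept in tu with attribute tuid, language in tuv, term value in seg) can be followed by hand with the Python standard library. This is a minimal sketch under stated assumptions, not how hxltmdexml is implemented: the function name tmx_to_dict is hypothetical, the sample is a shortened copy of the result above, and the language is read from the xml:lang attribute as it appears in the output.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Shortened copy of the result example above.
SAMPLE_TMX = """<!DOCTYPE tmx SYSTEM "tmx14.dtd">
<tmx version="1.4">
  <header creationtool="hxltmcli.py" srclang="la" />
  <body>
    <tu tuid="L10N_ego_linguam_nomen">
      <tuv xml:lang="la"><seg>Lingua Latina</seg></tuv>
      <tuv xml:lang="pt"><seg>Língua portuguesa</seg></tuv>
    </tu>
    <tu tuid="I18N_testum_salve_mundi_testum_I18N">
      <tuv xml:lang="la"><seg>Salvi mundi!</seg></tuv>
      <tuv xml:lang="en"><seg>Hello, World!</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def tmx_to_dict(tmx_text):
    """Map each tuid to a {language: segment text} dictionary."""
    root = ET.fromstring(tmx_text)
    result = {}
    for tu in root.iter("tu"):
        terms = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            text = (seg.text or "") if seg is not None else ""
            terms[tuv.get(XML_LANG)] = text
        result[tu.get("tuid")] = terms
    return result


concepts = tmx_to_dict(SAMPLE_TMX)
for tuid, terms in concepts.items():
    print(tuid, terms)
```

Concepts that lack a language, such as the Slovenian greeting in the result above, simply have no key for it in the inner dictionary.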
UTX
:
Universal Terminology eXchange (UTX) (working draft)
UTX:
__meta:
archivum:
extensionem: .utx
situs_interretialis:
referens_officinale:
- <http://www.aamt.info/english/utx/>
vicipaedia:
- <https://en.wikipedia.org/wiki/Universal_Terminology_eXchange>
normam:
- <https://aamt.info/wp-content/uploads/2019/06/utx1.20-specification-e.pdf>
- <https://aamt.info/wp-content/uploads/2019/06/utx1.20-specification-e.docx>
nomen:
eng-Latn: 'Universal Terminology eXchange (UTX) (working draft)'
exemplum:
- <https://aamt.info/english/download/#UTX_Glossaries>
- <https://docs.google.com/spreadsheets/d/13gBCYAd3tbty10W2W6GX9rA-3oT2LnI2u0UG1dpIEkI/pubhtml?widget=true&headers=false>
- <https://docs.google.com/spreadsheets/d/1YONjd_pb7iXJvGlYBeFtZCTCUrDxRfbvKf-OLbCw3LE/pubhtml?widget=true&headers=false>
- <https://aamt.info/wp-content/uploads/2019/06/yakushite-soccer-ej-utx1.20.utx>
asa:
modus_operandi:
- multiplum_linguam
- bilingue
formatum:
# TODO: formatum.initiale should have access to at least the first 10
# data lines. This would allow some inferences without requiring
# everyone to write raw Python code (id est, Liquid filters).
initiale: |-
#UTX 1.20; directionality: {{ glossarium.translationem_directionem | default: 'multi' }};
# @TODO: implement the quotum_lineam filter. At the moment we have
# blank line at the end
# [{{ item[1].statum }}]
corporeum: |-
{%- if tabulam.lineam_indicem contains 1 -%}
#{%- for item in rem.de_linguam -%}
term:{{ item[1].bcp47 | quotum_rem }},
{%- endfor %}
{% endif -%}
{%- for item in rem.de_linguam -%}
{% if item[1].rem != '' -%}
{{- item[1].rem | quotum_rem -}},
{%- endif -%}
{%- endfor -%}
finale: false
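The corporeum template above writes a value only when item[1].rem is not empty, so a concept with a missing translation produces fewer fields than the #term: header line. The sketch below flags such rows. It is only an illustration: the function names read_utx and misaligned_rows are hypothetical, and it assumes the comma-separated layout with a trailing comma shown in the result example of this section, using two lines copied from it.

```python
import csv
import io

# Header lines and two data rows copied from this section's result example.
SAMPLE_UTX = (
    "#UTX 1.20; directionality: multi;\n"
    "#term:la,term:pt,term:en,term:es,term:ar,term:hi,term:sl,\n"
    "lat-Latn,por-Latn,eng-Latn,spa-Latn,arb-Arab,hin-Deva,slv-Latn,\n"
    'Salvi mundi!,Olá mundo!,"Hello, World!",¡Hola mundo!,नमस्ते दुनिया,\n'
)


def read_utx(text):
    """Return (languages, rows) from the comma-separated output."""
    languages = []
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue
        if fields[0].startswith("#"):
            if fields[0].startswith("#term:"):
                # Column header: drop the leading '#', then the 'term:' prefix.
                names = [fields[0][1:]] + fields[1:]
                languages = [
                    name.split(":", 1)[1]
                    for name in names
                    if name.startswith("term:")
                ]
            continue  # other '#' lines are metadata
        # Drop the empty field produced by the trailing comma.
        values = fields[:-1] if fields[-1] == "" else fields
        rows.append(values)
    return languages, rows


def misaligned_rows(languages, rows):
    """Rows whose number of values differs from the number of languages."""
    return [row for row in rows if len(row) != len(languages)]


languages, rows = read_utx(SAMPLE_UTX)
for row in misaligned_rows(languages, rows):
    print("fewer values than columns:", row)
```

For a row flagged this way, the position of a value no longer identifies its language, so the original HXLTM file is needed to tell which translations were missing.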
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# Explanation about the format at cor.hxltm.yml:normam.UTX
# [eng-Latn]_
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both print the result to stdout
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv --objectivum-UTX
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-UTX
### III ------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both save the result to a file on
# disk.
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.utx \
--objectivum-UTX
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-UTX > resultatum/hxltm-exemplum-linguam.utx
### IV -------------------------------------------------------------------------
# _[eng-Latn]
# @TODO: document any new options for --objectivum-UTX here, like
# --agendum-linguam
# [eng-Latn]_
Result example
#UTX 1.20; directionality: multi;
#term:la,term:pt,term:en,term:es,term:ar,term:hi,term:sl,
lat-Latn,por-Latn,eng-Latn,spa-Latn,arb-Arab,hin-Deva,slv-Latn,
Lingua Latina,Língua portuguesa,English language,Idioma español,اللغة العربية,हिन्दी भाषा,Slovenščina,
Abecedarium Latinum,Alfabeto latino,Latin script,Alfabeto latino,देवनागरी लिपि,Latinska abeceda,
001,001,001,001,001,001,001,
∅,∅,∅,∅,∅,∅,∅,
Salvi mundi!,Olá mundo!,"Hello, World!",¡Hola mundo!,नमस्ते दुनिया,
"Testum, I, II, III","Teste, 1, 2, 3","Test, 1, 2, 3","Prueba, 1, 2, 3","परीक्षा, १, २, ३",
"I,V, X, L, C, D, M","1, 2, 3, 4, 5, 6, 7, 8, 9, 10","1, 2, 3, 4, 5, 6, 7, 8, 9, 10","1, 2, 3, 4, 5, 6, 7, 8, 9, 10","०, १, २, ३, ४, ५, ६, ७, ८, ९","1, 2, 3, 4, 5, 6, 7, 8, 9, 10",
XML
:
XML Glōssārium, HXLTM container (generic multilingual XML)
XML:
__meta:
archivum:
extensionem: .hxltm.xml
descriptionem: |
_[eng-Latn]
The .hxltm.xml named 'XML Glōssārium' is an example of a multilingual
glossary exported to an XML format that can be imported back.
With the help of the reference CLI tool, hxltmdexml, the
ontologia:normam.XML.de_xml explains how to convert the
XML file back to an HXLTM CSV working file, so that it can be used with the
reference CLI tool hxltmcli.
[eng-Latn]_
normam:
- <https://hdp.etica.ai/hxltm/archivum/#XML>
- <https://terminator.readthedocs.io/en/latest/_images/TBX_termEntry_structure.png>
nomen:
eng-Latn: 'XML Glōssārium (generic multilingual XML)'
situs_interretialis:
referens_officinale:
- <https://hdp.etica.ai/hxltm/archivum/#XML>
asa:
modus_operandi:
- multiplum_linguam
# - bilingue
de_xml:
# ontologia libellam: I glossarium > II conceptum > III linguam > IV terminum
glossarium_radicem:
signum: glossarium
glossarium_titulum: False
# II conceptum
conceptum_codicem:
signum: conceptum
de_attributum: _
trivium:
# de <glossarium> ad <conceptum>
- datum
# III linguam
linguam_codicem:
signum: linguam
de_attributum: _
trivium: []
linguam_linguam:
ad: 'XML-nodum-attributum:_'
de_signum: linguam
# signum: linguam
# de_attributum:
# type: reliabilityCode
trivium: []
# IV terminum
terminum_habendum_accuratum: True
terminum_habendum_type: True
terminum_habendum_multum: True
terminum_habendum_fontem: False
terminum_habendum_objectivum: False
terminum_accuratum:
ad: XML-nodum-textum
de_signum: accuratum
# de_attributum: False
trivium: []
terminum_valorem:
signum: de-textum # 'lat-Latn' ad <tu tuid="L10N_ego_codicem"><tuv xml:lang="la"><seg>lat-Latn</seg>
# de_attributum: False
trivium:
- terminum
# @TODO: make the Latin terms, like <glossarium>, <caput> and <datum>,
# configurable, so users could bootstrap versions even in
# non-Latin scripts.
# Why allow this? The answer would be: why not?
formatum:
# I glossarium > II conceptum > III linguam > IV terminum
__:
___: Latium
# glōssārium, https://en.wiktionary.org/wiki/glossarium#Latin
glossarium: glossarium
# datum, https://en.wiktionary.org/wiki/datum#Latin
datum: datum
# caput, https://en.wiktionary.org/wiki/caput#Latin
caput: caput
# conceptum, https://en.wiktionary.org/wiki/conceptus#Latin
conceptum: conceptum
# dēfīnītiōnem, https://en.wiktionary.org/wiki/definitio#Latin
definitionem: definitionem
# contextum, https://en.wiktionary.org/wiki/contextus#Latin
contextum: contextum
# titulum, https://en.wiktionary.org/wiki/titulus#Latin
titulum: titulum
# linguam, https://en.wiktionary.org/wiki/lingua#Latin
linguam: linguam
# librārium, https://en.wiktionary.org/wiki/librarium#Latin
librarium: librarium
# partem ōrātiōnis, https://en.wiktionary.org/wiki/pars_orationis#Latin
partem-orationis: partem-orationis
# terminum, https://en.wiktionary.org/wiki/terminus#Latin
terminum: terminum
# fontem, https://en.wiktionary.org/wiki/fons#Latin
fontem: fontem
# objectīvum, https://en.wiktionary.org/wiki/objectivus#Latin
objectivum: objectivum
rem: rem
# de-textum: de-textum
initiale: |2-
<glossarium _="Latium" __="hxltmcli.py" ___="{{ globum.instrumentum_versionem }}">
<caput>
<titulum>{{ globum.glossarium_titulum }}</titulum>
</caput>
<datum>
<!-- _[eng-Latn]@TODO Use librarium as XLIFF file / Excel worksheets. The "_" is (an non-Latin) script neutral default ID [eng-Latn]_ -->
<librarium _="_">
corporeum: |2
<conceptum _="{{ conceptum.codicem | default: rem.de_nomen_breve.conceptum_codicem }}">
{%- for item in rem.de_linguam %}
{% if item[1].rem != '' -%}
<linguam _="{{ item[1].linguam }}">
<definitionem></definitionem>
<terminum>
{% if item[1].accuratum -%}
<accuratum>{{ item[1].accuratum }}</accuratum>
{%- endif %}
<rem>{{ item[1].rem }}</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
{%- endif -%}
{%- endfor %}
</conceptum>
finale: |2
</librarium>
</datum>
</glossarium>
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# Explanation about the format at cor.hxltm.yml:normam.XML
# [eng-Latn]_
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both print the result to stdout
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv --objectivum-XML
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-XML
### III ------------------------------------------------------------------------
# _[eng-Latn]
# The next 2 examples are equivalent: both save the result to a file on
# disk.
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.hxltm.xml \
--objectivum-XML
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-XML > resultatum/hxltm-exemplum-linguam.hxltm.xml
Result example
<glossarium _="Latium" __="hxltmcli.py" ___="">
<caput>
<titulum></titulum>
</caput>
<datum>
<!-- _[eng-Latn]@TODO Use librarium as XLIFF file / Excel worksheets. The "_" is (an non-Latin) script neutral default ID [eng-Latn]_ -->
<librarium _="_">
<conceptum _="L10N_ego_codicem">
<linguam _="lat-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>lat-Latn</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="por-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>por-Latn</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="eng-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>eng-Latn</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="spa-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>spa-Latn</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="arb-Arab">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>arb-Arab</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="hin-Deva">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>hin-Deva</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="slv-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>slv-Latn</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
</conceptum>
<conceptum _="L10N_ego_linguam_nomen">
<linguam _="lat-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Lingua Latina</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="por-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Língua portuguesa</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="eng-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>English language</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="spa-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Idioma español</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="arb-Arab">
<definitionem></definitionem>
<terminum>
<accuratum>1</accuratum>
<rem>اللغة العربية</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="hin-Deva">
<definitionem></definitionem>
<terminum>
<accuratum>1</accuratum>
<rem>हिन्दी भाषा</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="slv-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>1</accuratum>
<rem>Slovenščina</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
</conceptum>
<conceptum _="L10N_ego_scriptum_nomen">
<linguam _="lat-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Abecedarium Latinum</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="por-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Alfabeto latino</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="eng-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Latin script</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="spa-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Alfabeto latino</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="hin-Deva">
<definitionem></definitionem>
<terminum>
<accuratum>1</accuratum>
<rem>देवनागरी लिपि</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="slv-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>1</accuratum>
<rem>Latinska abeceda</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
</conceptum>
<conceptum _="L10N_ego_patriam_UN_M49_numerum">
<linguam _="lat-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>001</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="por-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>001</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="eng-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>001</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="spa-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>001</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="arb-Arab">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>001</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="hin-Deva">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>001</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="slv-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>001</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
</conceptum>
<conceptum _="L10N_ego_patriam_UN_P_codicem">
<linguam _="lat-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>∅</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="por-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>∅</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="eng-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>∅</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="spa-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>∅</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="arb-Arab">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>∅</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="hin-Deva">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>∅</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="slv-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>∅</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
</conceptum>
<conceptum _="I18N_testum_salve_mundi_testum_I18N">
<linguam _="lat-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Salvi mundi!</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="por-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Olá mundo!</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="eng-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Hello, World!</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="spa-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>¡Hola mundo!</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="hin-Deva">
<definitionem></definitionem>
<terminum>
<accuratum>1</accuratum>
<rem>नमस्ते दुनिया</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
</conceptum>
<conceptum _="I18N_إختبار_טעסט_测试_테스트_испытание_I18N">
<linguam _="lat-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Testum, I, II, III</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="por-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Teste, 1, 2, 3</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="eng-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Test, 1, 2, 3</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="spa-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>Prueba, 1, 2, 3</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="hin-Deva">
<definitionem></definitionem>
<terminum>
<accuratum>1</accuratum>
<rem>परीक्षा, १, २, ३</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
</conceptum>
<conceptum _="I18N_०१२३४५६७८९_〇一二三四五六七八九十百千万亿_-1+2/3*4_٩٨٧٦٥٤٣٢١٠_零壹贰叁肆伍陆柒捌玖拾佰仟萬億_I18N">
<linguam _="lat-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>I,V, X, L, C, D, M</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="por-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="eng-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="spa-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>10</accuratum>
<rem>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="hin-Deva">
<definitionem></definitionem>
<terminum>
<accuratum>1</accuratum>
<rem>०, १, २, ३, ४, ५, ६, ७, ८, ९</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
<linguam _="slv-Latn">
<definitionem></definitionem>
<terminum>
<accuratum>1</accuratum>
<rem>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</rem>
</terminum>
<!-- <terminum-fontem></terminum-fontem> -->
<!-- <terminum-objectivum></terminum-objectivum> -->
</linguam>
</conceptum>
</librarium>
</datum>
</glossarium>
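The nesting of the export above (glossarium > datum > librarium > conceptum > linguam > terminum) can be read back with the Python standard library. The following is only a minimal sketch under assumptions: `SAMPLE` is a reduced, hand-written document with the same element names, and `read_glossarium` is a hypothetical helper, not part of hxltmcli or hxltmdexml.

```python
import xml.etree.ElementTree as ET

# Reduced sample with the same nesting as the export above (assumption: a
# real file may carry extra elements, which this sketch simply ignores).
SAMPLE = """<glossarium>
  <datum>
    <librarium>
      <conceptum _="I18N_testum">
        <linguam _="eng-Latn">
          <definitionem></definitionem>
          <terminum>
            <accuratum>10</accuratum>
            <rem>Test, 1, 2, 3</rem>
          </terminum>
        </linguam>
        <linguam _="spa-Latn">
          <definitionem></definitionem>
          <terminum>
            <accuratum>10</accuratum>
            <rem>Prueba, 1, 2, 3</rem>
          </terminum>
        </linguam>
      </conceptum>
    </librarium>
  </datum>
</glossarium>"""


def read_glossarium(xml_text):
    """Return {concept code: {language code: (term, accuracy)}}."""
    radix = ET.fromstring(xml_text)
    resultatum = {}
    for conceptum in radix.iter("conceptum"):
        per_linguam = {}
        for linguam in conceptum.findall("linguam"):
            terminum = linguam.find("terminum")
            if terminum is None:
                continue  # language block without a term
            rem = terminum.findtext("rem", default="")
            accuratum = int(terminum.findtext("accuratum", default="0"))
            # The "_" attribute carries the language code, e.g. "spa-Latn"
            per_linguam[linguam.get("_")] = (rem, accuratum)
        resultatum[conceptum.get("_")] = per_linguam
    return resultatum


glossarium = read_glossarium(SAMPLE)
print(glossarium["I18N_testum"]["spa-Latn"])
```

Keying the result by concept code first and language code second mirrors the conceptum > linguam order of the XML itself.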
XLIFF
:
XML Localization Interchange File Format (XLIFF) v2.1
# @TODO: JLIFF (XLIFF on JSON) <https://github.com/oasis-tcs/xliff-omos-jliff>
XLIFF:
__meta:
archivum:
extensionem: .xlf
situs_interretialis:
referens_officinale:
- <https://www.oasis-open.org/committees/xliff/>
vicipaedia:
- <https://en.wikipedia.org/wiki/XLIFF>
exemplum:
- <https://github.com/oasis-tcs/xliff-xliff-22>
- <https://github.com/oasis-tcs/xliff-xliff-22/blob/master/xliff-21/test-suite/core/valid/allExtensions.xlf>
- <https://github.com/oasis-tcs/xliff-xliff-22/blob/master/xliff-21/test-suite/core/valid/everything-core.xlf>
normam:
- <https://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html>
# - <https://docs.oasis-open.org/xliff/xliff-core/v2.1/os/schemas/>
# @see <https://github.com/redhat-developer/vscode-xml/wiki/XMLValidation#XML-catalog-with-XSD>
# @see <https://github.com/redhat-developer/vscode-xml/issues/315>
- <https://docs.oasis-open.org/xliff/xliff-core/v2.1/os/schemas/catalog.xml>
nomen:
eng-Latn: 'XML Localization Interchange File Format (XLIFF) v2.1'
asa:
modus_operandi:
# - multiplum_linguam
- bilingue
de_xml:
# This is a working draft
# @see https://terminator.readthedocs.io/en/latest/tbx_conformance.html
# ontologia libellam: I glossarium > II conceptum > III linguam > IV terminum
glossarium_radicem:
signum: xliff
# Exemplum I: <xliff version="2.0">
# Exemplum II: <xliff version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0">
glossarium_titulum: False
# II conceptum
conceptum_codicem:
signum: unit
de_attributum: id
trivium:
# de <xliff> ad <unit>
- file
# III linguam
linguam_codicem: False # XLIFF est bilingue
linguam_fontem_codicem:
# Exemplum: 'pt' ad '<source xml:lang="pt">por-Latn</source>'
signum: source
de_attributum: lang
trivium: []
linguam_objectivum_codicem:
# Exemplum: 'es' ad '<target xml:lang="es">spa-Latn</target>'
signum: target
de_attributum: lang
trivium: []
# IV terminum
terminum_accuratum: False # XLIFF terminum habendum accuratum? Falsum
terminum_multum: False # XLIFF est bilingue
terminum_habendum_fontem: True
terminum_habendum_objectivum: True
terminum_fontem_valorem:
# Exemplum: 'por-Latn' ad <source xml:lang="pt">por-Latn</source>
signum: source
# de_attributum: False
trivium: []
terminum_objectivum_valorem:
# Exemplum: 'spa-Latn' ad <target xml:lang="es">spa-Latn</target>
signum: target
# de_attributum: False
trivium: []
formatum:
# @see https://docs.oasis-open.org/xliff/xliff-core/v2.1/os/schemas/catalog.xml
# @see https://docs.oasis-open.org/xliff/xliff-core/v2.1/os/schemas/xliff_core_2.0.xsd
initiale: |2
<?xml version="1.0"?>
<xliff version="2.0"
xmlns="urn:oasis:names:tc:xliff:document:2.0"
xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0"
xmlns:val="urn:oasis:names:tc:xliff:validation:2.0"
srcLang="{{ globum.fontem_linguam.bcp47 | default: 'la' }}"
trgLang="{{ globum.objectivum_linguam.bcp47 | default: 'ar' }}">
<file id="f1">
corporeum: |2
{% if rem.de_fontem_linguam -%}
<unit id="{{ conceptum.codicem | default: rem.de_nomen_breve.conceptum_codicem | default: 'errorem' | replace: '*', '' | replace: '+', '' | replace: '/', '' }}">
{% if rem.de_auxilium_linguam or rem.de_nomen_breve.referens_situs_interretialis.size > 0 -%}
<notes>
{%- for item in rem.de_auxilium_linguam -%}
<note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[{{- item.linguam -}}]
{{- item.rem -}}
[{{- item.linguam -}}]_
</note>
{%- endfor %}
{% for item in rem.de_nomen_breve.referens_situs_interretialis -%}
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
{{ item }}
</note>
{% endfor -%}
</notes>
{% else -%}
<!--
non rem.de_auxilium_linguam aut rem.de_nomen_breve.referens_situs_interretialis
-->
{% endif -%}
<segment state="{{ rem.de_objectivum_linguam.codicem_XLIFF | default: 'initial' }}">
<source>{{ rem.de_fontem_linguam.rem }}</source>
{%- if rem.de_objectivum_linguam and rem.de_objectivum_linguam.rem != '' %}
<target>{{ rem.de_objectivum_linguam.rem }}</target>
{%- else %}
<!-- non rem.de_objectivum_linguam -->
{%- endif %}
</segment>
</unit>
{%- else -%}
<!-- non rem.de_fontem_linguam -->
{%- endif %}
# <!-- {{ rem }} -->
finale: |2
</file>
</xliff>
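The `corporeum` template above derives each unit id through chained Liquid `default` and `replace` filters and emits `<target>` only when a translation exists. The same logic can be sketched in plain Python. This is an illustration only: `unit_id` and `render_unit` are hypothetical names, and the reference implementation renders the Liquid template rather than calling code like this.

```python
import xml.etree.ElementTree as ET


def unit_id(conceptum_codicem=None, nomen_breve_codicem=None):
    """Mimic the chained `default` and `replace` filters of the template."""
    # First non-empty value wins, falling back to the literal 'errorem'
    codicem = conceptum_codicem or nomen_breve_codicem or "errorem"
    # The template strips these three characters from the id
    for signum in ("*", "+", "/"):
        codicem = codicem.replace(signum, "")
    return codicem


def render_unit(codicem, fontem, objectivum="", state="initial"):
    """Build one <unit>; <target> is emitted only when it is non-empty."""
    unit = ET.Element("unit", {"id": unit_id(codicem)})
    segment = ET.SubElement(unit, "segment", {"state": state})
    ET.SubElement(segment, "source").text = fontem
    if objectivum:
        ET.SubElement(segment, "target").text = objectivum
    return ET.tostring(unit, encoding="unicode")


print(unit_id("I18N_-1+2/3*4_I18N"))  # I18N_-1234_I18N
print(render_unit("I18N_testum", "Olá mundo!", "¡Hola mundo!", "final"))
```

This explains why a concept code containing `-1+2/3*4` shows up as `-1234` in a generated unit id.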
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# Documentation at cor.hxltm.yml:normam.XLIFF
# [eng-Latn]_
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next two examples are equivalent: both print the result to stdout
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv --objectivum-XLIFF
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-XLIFF
### III ------------------------------------------------------------------------
# _[eng-Latn]
# The next two examples are equivalent: both save the result to a file on
# disk.
# - XLIFF source language:
# - Portuguese
# - XLIFF objective (target) language:
# - Spanish
# - Auxiliary languages (multiple values accepted):
# - Esperanto
# - English
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.por-Latn--spa-Latn.xlf \
--fontem-linguam por-Latn@pt \
--objectivum-linguam spa-Latn@es \
--auxilium-linguam epo-Latn@eo,eng-Latn@en \
--objectivum-XLIFF
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-XLIFF --fontem-linguam por-Latn@pt --objectivum-linguam spa-Latn@es --auxilium-linguam epo-Latn@eo,eng-Latn@en > resultatum/hxltm-exemplum-linguam.por-Latn--spa-Latn.xlf
### IV ------------------------------------------------------------------------
# _[eng-Latn]
# Silence errors with --silentium.
#
# Sometimes it may be necessary to ignore errors (such as a missing source term
# to translate) and still generate the output format, even if it is invalid.
# Using --silentium can help suppress some warnings.
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.por-Latn--spa-Latn.xlf \
--fontem-linguam por-Latn@pt \
--objectivum-linguam spa-Latn@es \
--auxilium-linguam epo-Latn@eo,eng-Latn@en \
--objectivum-XLIFF \
--silentium
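The language arguments in the commands above combine two codes, as in `por-Latn@pt`, and `--auxilium-linguam` takes a comma-separated list. The sketch below shows one way such values could be split. It assumes the part before `@` is an ISO 639-3 code plus an ISO 15924 script and the part after `@` is the BCP 47 tag (consistent with `srcLang="pt"` in the generated XLIFF). `Linguam`, `parse_linguam` and `parse_linguam_list` are hypothetical names, not the parser hxltmcli actually uses.

```python
from typing import List, NamedTuple, Optional


class Linguam(NamedTuple):
    linguam: str          # e.g. "por-Latn"
    iso6393: str          # e.g. "por"
    iso15924: str         # e.g. "Latn"
    bcp47: Optional[str]  # e.g. "pt"; None when the "@..." part is absent


def parse_linguam(textum: str) -> Linguam:
    """Split a value such as 'por-Latn@pt' into its parts."""
    linguam, _, bcp47 = textum.strip().partition("@")
    iso6393, _, iso15924 = linguam.partition("-")
    return Linguam(linguam, iso6393, iso15924, bcp47 or None)


def parse_linguam_list(textum: str) -> List[Linguam]:
    """Split a comma-separated value such as 'epo-Latn@eo,eng-Latn@en'."""
    return [parse_linguam(pars) for pars in textum.split(",") if pars.strip()]


print(parse_linguam("por-Latn@pt"))
print([item.bcp47 for item in parse_linguam_list("epo-Latn@eo,eng-Latn@en")])
```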
Result example
<?xml version="1.0"?>
<xliff version="2.0"
xmlns="urn:oasis:names:tc:xliff:document:2.0"
xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0"
xmlns:val="urn:oasis:names:tc:xliff:validation:2.0"
srcLang="pt"
trgLang="es">
<file id="f1">
<unit id="L10N_ego_codicem">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]eng-Latn[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
</notes>
<segment state="final">
<source>por-Latn</source>
<target>spa-Latn</target>
</segment>
</unit>
<unit id="L10N_ego_linguam_nomen">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]English language[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
</notes>
<segment state="final">
<source>Língua portuguesa</source>
<target>Idioma español</target>
</segment>
</unit>
<unit id="L10N_ego_scriptum_nomen">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]Latin script[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://www.unicode.org/iso15924/
</note>
</notes>
<segment state="final">
<source>Alfabeto latino</source>
<target>Alfabeto latino</target>
</segment>
</unit>
<unit id="L10N_ego_patriam_UN_M49_numerum">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]001[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://en.wikipedia.org/wiki/UN_M49
</note>
</notes>
<segment state="final">
<source>001</source>
<target>001</target>
</segment>
</unit>
<unit id="L10N_ego_patriam_UN_P_codicem">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]∅[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://en.wikipedia.org/wiki/Place_code
</note>
</notes>
<segment state="final">
<source>∅</source>
<target>∅</target>
</segment>
</unit>
<unit id="I18N_testum_salve_mundi_testum_I18N">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]Hello, World![eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
</notes>
<segment state="final">
<source>Olá mundo!</source>
<target>¡Hola mundo!</target>
</segment>
</unit>
<unit id="I18N_إختبار_טעסט_测试_테스트_испытание_I18N">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]Test, 1, 2, 3[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://www.iana.org/domains/reserved
</note>
</notes>
<segment state="final">
<source>Teste, 1, 2, 3</source>
<target>Prueba, 1, 2, 3</target>
</segment>
</unit>
<unit id="I18N_०१२३४५६७८९_〇一二三四五六七八九十百千万亿_-1234_٩٨٧٦٥٤٣٢١٠_零壹贰叁肆伍陆柒捌玖拾佰仟萬億_I18N">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]1, 2, 3, 4, 5, 6, 7, 8, 9, 10[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://en.wikipedia.org/wiki/List_of_numeral_systems
</note>
</notes>
<segment state="final">
<source>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</source>
<target>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</target>
</segment>
</unit>
</file>
</xliff>
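In the result above, auxiliary-language hints travel inside `<note category="de_auxilium_linguam">` elements, wrapped as `_[eng-Latn]...[eng-Latn]_`. A consumer could recover them as sketched below. This is a minimal illustration using only the standard library: `SAMPLE` is a single unit copied from the result above, and `read_units` and the `AUXILIUM` pattern are hypothetical helpers, not part of the HXLTM tools.

```python
import re
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:2.0"}
# Matches _[eng-Latn]Hello, World![eng-Latn]_ ; the empty _[][]_ note is skipped
AUXILIUM = re.compile(
    r"_\[(?P<linguam>[^\]]+)\](?P<rem>.*?)\[(?P=linguam)\]_", re.S)

SAMPLE = """<?xml version="1.0"?>
<xliff version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0"
  srcLang="pt" trgLang="es">
  <file id="f1">
    <unit id="I18N_testum_salve_mundi_testum_I18N">
      <notes><note appliesTo="source" priority="3"
        category="de_auxilium_linguam">
        _[eng-Latn]Hello, World![eng-Latn]_
      </note><note appliesTo="source" priority="3"
        category="de_auxilium_linguam">
        _[][]_
      </note>
      </notes>
      <segment state="final">
        <source>Olá mundo!</source>
        <target>¡Hola mundo!</target>
      </segment>
    </unit>
  </file>
</xliff>"""


def read_units(xml_text):
    """Yield one dict per <unit>, with auxiliary notes keyed by language."""
    radix = ET.fromstring(xml_text)
    for unit in radix.iterfind(".//x:unit", NS):
        auxilium = {}
        for note in unit.iterfind("x:notes/x:note", NS):
            if note.get("category") != "de_auxilium_linguam":
                continue
            par = AUXILIUM.search(note.text or "")
            if par:
                auxilium[par.group("linguam")] = par.group("rem")
        yield {
            "id": unit.get("id"),
            "state": unit.find("x:segment", NS).get("state"),
            "fontem": unit.findtext("x:segment/x:source", namespaces=NS),
            "objectivum": unit.findtext("x:segment/x:target", namespaces=NS),
            "auxilium": auxilium,
        }


units = list(read_units(SAMPLE))
print(units[0]["auxilium"])
```

The namespace map is needed because every element in an XLIFF 2.0 file lives in the `urn:oasis:names:tc:xliff:document:2.0` namespace.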
XLIFF-obsoletum
:
XML Localization Interchange File Format (XLIFF) v1.2
XLIFF-obsoletum:
__meta:
archivum:
extensionem: .xlf
situs_interretialis:
referens_officinale:
- <https://www.oasis-open.org/committees/xliff/>
vicipaedia:
- <https://en.wikipedia.org/wiki/XLIFF>
normam:
- <https://docs.oasis-open.org/xliff/xliff-core/xliff-core.html>
- <https://docs.oasis-open.org/xliff/v1.2/os/xliff-core-1.2-strict.xsd>
- <http://docs.oasis-open.org/xliff/v1.2/cs02/xliff-core-1.2-transitional.xsd>
nomen:
eng-Latn: 'XML Localization Interchange File Format (XLIFF) v1.2'
asa:
modus_operandi:
# - multiplum_linguam
- bilingue
de_xml:
# This is a working draft
# @see https://terminator.readthedocs.io/en/latest/tbx_conformance.html
# ontologia libellam: I glossarium > II conceptum > III linguam > IV terminum
glossarium_radicem:
signum: xliff
# Exemplum I: <xliff version="1.2">
# Exemplum II: <xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
glossarium_titulum: False
# II conceptum
conceptum_codicem:
signum: trans-unit
de_attributum: id
trivium:
# de <xliff> ad <trans-unit>
- file
# III linguam
linguam_codicem: False # XLIFF-obsoletum est bilingue
linguam_fontem_codicem:
# Exemplum: 'pt' ad '<source xml:lang="pt">por-Latn</source>'
signum: source
de_attributum: lang
trivium: []
linguam_objectivum_codicem:
# Exemplum: 'es' ad '<target xml:lang="es">spa-Latn</target>'
signum: target
de_attributum: lang
trivium: []
# IV terminum
terminum_habendum_accuratum: False # XLIFF terminum habendum accuratum? Falsum
terminum_habendum_multum: False # XLIFF-obsoletum est bilingue
terminum_habendum_fontem: True
terminum_habendum_objectivum: True
terminum_fontem_valorem:
# Exemplum: 'por-Latn' ad <source xml:lang="pt">por-Latn</source>
signum: source
# de_attributum: False
trivium: []
terminum_objectivum_valorem:
# Exemplum: 'spa-Latn' ad <target xml:lang="es">spa-Latn</target>
signum: target
# de_attributum: False
trivium: []
formatum:
initiale: |2
<?xml version="1.0"?>
<xliff version="1.2"
xmlns="urn:oasis:names:tc:xliff:document:1.2">
<file
source-language="{{ globum.fontem_linguam.bcp47 | default: 'la' }}"
target-language="{{ globum.objectivum_linguam.bcp47 | default: 'ar' }}"
datatype="plaintext"
original="exemplum.ext">
<body>
corporeum: |2
{% if rem.de_fontem_linguam and rem.de_fontem_linguam.rem != '' -%}
{%- comment -%}
_[eng-Latn]
Since we're targeting XLIFF 2.1 and this is XLIFF 1.2, we hardcode
some variables directly in this Liquid template instead of spending time
doing it in the Python code.
If this template seems ugly, uglier still are the tools that, as of 2021,
still have poor support for XLIFF 2.x.
state: http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#state
state-qualifier: http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html#state-qualifier
[eng-Latn]_
{%- endcomment -%}
{% capture rem_fontem_linguam_attr -%}
xml:lang="{{ rem.de_fontem_linguam.bcp47 }}"
{%- endcapture -%}
{% if rem.de_objectivum_linguam.rem and rem.de_objectivum_linguam.rem != '' %}
{% capture XLIFF_obsoletum_state -%}
{{ rem.de_objectivum_linguam.codicem_XLIFF | default: 'new' }}
{%- endcapture -%}
{% capture rem_objectivum_linguam_attr -%}
xml:lang="{{ rem.de_objectivum_linguam.bcp47 }}"
{%- endcapture -%}
{%- if XLIFF_obsoletum_state == 'initial' -%}
{%- assign XLIFF_obsoletum_state = 'new' -%}
{%- assign XLIFF_obsoletum_approved = 'no' -%}
{%- assign XLIFF_obsoletum_statequalifier_attr = '' -%}
{%- assign XLIFF_obsoletum_approved_attr = '' -%}
{%- endif -%}
{%- if XLIFF_obsoletum_state == 'final' -%}
{%- assign XLIFF_obsoletum_state = 'final' -%}
{%- assign XLIFF_obsoletum_approved = 'yes' -%}
{%- assign XLIFF_obsoletum_translatesource_attr = ' translate="no"' -%}
{%- assign XLIFF_obsoletum_statequalifier_attr = ' state-qualifier="id-match"' -%}
{%- assign XLIFF_obsoletum_approved_attr = ' approved="yes"' -%}
{%- endif -%}
{%- endif -%}
<trans-unit id="{{- conceptum.codicem | default: rem.de_nomen_breve.conceptum_codicem | default: 'errorem' | replace: '*', '' | replace: '+', '' | replace: '/', '' -}}"
{{- XLIFF_obsoletum_translatesource_attr -}}
{{- XLIFF_obsoletum_approved_attr -}}>
<source {{ rem_fontem_linguam_attr }}>
{{- rem.de_fontem_linguam.rem -}}
</source>
{%- if rem.de_objectivum_linguam.rem != '' %}
<target {{ rem_objectivum_linguam_attr }} state="{{ XLIFF_obsoletum_state }}"{{ XLIFF_obsoletum_statequalifier_attr }}>
{{- rem.de_objectivum_linguam.rem -}}
</target>
{%- else -%}
<!-- non rem.de_objectivum_linguam.rem -->
{%- endif %}
{% if rem.de_auxilium_linguam %}
{%- for item in rem.de_auxilium_linguam -%}
{%- if item.rem -%}
<note annotates="source" priority="2">
_[{{- item.linguam -}}]{{- item.rem -}}[{{- item.linguam -}}]_
</note>
{%- endif -%}
{%- endfor %}
{%- else -%}
<!--
non rem.de_auxilium_linguam
-->
{%- endif %}
{% if rem.de_nomen_breve.referens_situs_interretialis.size > 0 -%}
{% for item in rem.de_nomen_breve.referens_situs_interretialis -%}
<note annotates="source" priority="1"
from="referens_situs_interretialis">
{{ item }}
</note>
{%- endfor %}
{%- else -%}
<!--
non rem.de_nomen_breve.referens_situs_interretialis
-->
{%- endif %}
</trans-unit>
{%- else -%}
<!-- non rem.de_fontem_linguam -->
{%- endif %}
finale: |2
</body>
</file>
</xliff>
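The Liquid template above hardcodes how an XLIFF 2.x state is translated into XLIFF 1.2 attributes: `initial` becomes `new`, and `final` additionally marks the unit as approved and not translatable. The sketch below restates that mapping in plain Python to make it easier to follow. `xliff12_attributes` is a hypothetical function written for this illustration; the tool itself performs this step inside the template.

```python
def xliff12_attributes(codicem_xliff=None):
    """Return (trans-unit attributes, target attributes) for a given state.

    Mirrors the template: a missing state defaults to 'new', 'initial' is
    renamed to 'new', and 'final' adds the approval-related attributes.
    Any other value passes through unchanged, as in the template.
    """
    state = codicem_xliff or "new"
    unit_attr = {}
    target_attr = {}
    if state == "initial":
        state = "new"
    elif state == "final":
        unit_attr = {"translate": "no", "approved": "yes"}
        target_attr = {"state-qualifier": "id-match"}
    return unit_attr, {"state": state, **target_attr}


for codicem in (None, "initial", "final"):
    print(codicem, xliff12_attributes(codicem))
```

Returning the two attribute sets separately reflects where the template writes them: on `<trans-unit>` and on `<target>` respectively.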
Command line examples
### I -------------------------------------------------------------------------
# _[eng-Latn]
# Documentation at cor.hxltm.yml:normam.XLIFF-obsoletum
# [eng-Latn]_
### II -------------------------------------------------------------------------
# _[eng-Latn]
# The next two examples are equivalent: both print the result to stdout
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv --objectivum-XLIFF-obsoletum
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-XLIFF-obsoletum
### III ------------------------------------------------------------------------
# _[eng-Latn]
# The next two examples are equivalent: both save the result to a file on
# disk.
# - XLIFF source language:
# - Portuguese
# - XLIFF objective (target) language:
# - Spanish
# - Auxiliary languages (multiple values accepted):
# - Esperanto
# - English
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.por-Latn--spa-Latn.obsoletum.xlf \
--fontem-linguam por-Latn@pt \
--objectivum-linguam spa-Latn@es \
--auxilium-linguam epo-Latn@eo,eng-Latn@en \
--objectivum-XLIFF-obsoletum
cat hxltm-exemplum-linguam.tm.hxl.csv | hxltmcli --objectivum-XLIFF-obsoletum --fontem-linguam por-Latn@pt --objectivum-linguam spa-Latn@es --auxilium-linguam epo-Latn@eo,eng-Latn@en > resultatum/hxltm-exemplum-linguam.por-Latn--spa-Latn.obsoletum.xlf
### IV ------------------------------------------------------------------------
# _[eng-Latn]
# Silence errors with --silentium.
#
# Sometimes it may be necessary to ignore errors (such as a missing source term
# to translate) and still generate the output format, even if it is invalid.
# Using --silentium can help suppress some warnings.
# [eng-Latn]_
hxltmcli hxltm-exemplum-linguam.tm.hxl.csv \
resultatum/hxltm-exemplum-linguam.por-Latn--spa-Latn.obsoletum.xlf \
--fontem-linguam por-Latn@pt \
--objectivum-linguam spa-Latn@es \
--auxilium-linguam epo-Latn@eo,eng-Latn@en \
--objectivum-XLIFF-obsoletum \
--silentium
Result example
<?xml version="1.0"?>
<xliff version="2.0"
xmlns="urn:oasis:names:tc:xliff:document:2.0"
xmlns:fs="urn:oasis:names:tc:xliff:fs:2.0"
xmlns:val="urn:oasis:names:tc:xliff:validation:2.0"
srcLang="pt"
trgLang="es">
<file id="f1">
<unit id="L10N_ego_codicem">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]eng-Latn[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
</notes>
<segment state="final">
<source>por-Latn</source>
<target>spa-Latn</target>
</segment>
</unit>
<unit id="L10N_ego_linguam_nomen">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]English language[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
</notes>
<segment state="final">
<source>Língua portuguesa</source>
<target>Idioma español</target>
</segment>
</unit>
<unit id="L10N_ego_scriptum_nomen">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]Latin script[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://www.unicode.org/iso15924/
</note>
</notes>
<segment state="final">
<source>Alfabeto latino</source>
<target>Alfabeto latino</target>
</segment>
</unit>
<unit id="L10N_ego_patriam_UN_M49_numerum">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]001[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://en.wikipedia.org/wiki/UN_M49
</note>
</notes>
<segment state="final">
<source>001</source>
<target>001</target>
</segment>
</unit>
<unit id="L10N_ego_patriam_UN_P_codicem">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]∅[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://en.wikipedia.org/wiki/Place_code
</note>
</notes>
<segment state="final">
<source>∅</source>
<target>∅</target>
</segment>
</unit>
<unit id="I18N_testum_salve_mundi_testum_I18N">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]Hello, World![eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
</notes>
<segment state="final">
<source>Olá mundo!</source>
<target>¡Hola mundo!</target>
</segment>
</unit>
<unit id="I18N_إختبار_טעסט_测试_테스트_испытание_I18N">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]Test, 1, 2, 3[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://www.iana.org/domains/reserved
</note>
</notes>
<segment state="final">
<source>Teste, 1, 2, 3</source>
<target>Prueba, 1, 2, 3</target>
</segment>
</unit>
<unit id="I18N_०१२३४५६७८९_〇一二三四五六七八九十百千万亿_-1234_٩٨٧٦٥٤٣٢١٠_零壹贰叁肆伍陆柒捌玖拾佰仟萬億_I18N">
<notes><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[eng-Latn]1, 2, 3, 4, 5, 6, 7, 8, 9, 10[eng-Latn]_
</note><note appliesTo="source" priority="3"
category="de_auxilium_linguam">
_[][]_
</note>
<note appliesTo="source" priority="1"
category="referens_situs_interretialis">
https://en.wikipedia.org/wiki/List_of_numeral_systems
</note>
</notes>
<segment state="final">
<source>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</source>
<target>1, 2, 3, 4, 5, 6, 7, 8, 9, 10</target>
</segment>
</unit>
</file>
</xliff>
XLSX
:
Microsoft Excel, HXLTM container (read-only; native support as data source)
#### XLSX, Google Sheets ____________________________________________________
# @see https://support.microsoft.com/en-us/office/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3
# @see https://support.google.com/drive/answer/37603
XLSX:
__meta:
archivum:
extensionem: .xlsx
descriptionem: |
_[eng-Latn]
Both Google Sheets URLs and local or remote Microsoft Excel files are
supported natively, read-only, by the reference CLI implementation as
data source containers, with no intermediate conversion to the
HXLTM CSV container. This means humans don't need to edit CSV
files directly.
Support in `hxltmcli` for writing directly to Google Sheets and
Microsoft Excel is unlikely to be implemented.
[eng-Latn]_
nomen:
eng-Latn: 'Microsoft Excel, HXLTM container (read-only; native support as data source)'
# eng-Latn: 'Microsoft Excel (native support to read, but not write, data directly with .XLSX)'
asa:
modus_operandi:
- multiplum_linguam
# - bilingue
YAML
:
YAML (planned, but no draft)
YAML:
__meta:
archivum:
extensionem: .yml
normam:
# Not sure where to find some place to 'explain' this format
- <https://guides.rubyonrails.org/i18n.html>
nomen:
eng-Latn: 'YAML (planned, but no draft)'
situs_interretialis:
referens_officinale:
- <https://guides.rubyonrails.org/i18n.html>
exemplum:
- <https://github.com/i18next/react-i18next/blob/master/example/react/public/locales/de/translation.json>
asa:
modus_operandi:
# - multiplum_linguam
- bilingue
HXLTM Ad Hoc Fōrmulam (HXLTM templated export)
About creating a new HXLTM Ad Hoc Fōrmulam: unlike the customizable HXLTM Normam (which supports manipulating gigabyte-sized data), this strategy is optimized for end users who are unlikely to care about loading data in chunks or about how to import the result back into an HXLTM working file.
TODO: this is a draft. Document the already implemented functionality.
HXLTM ontologies
Ontologia
Full file at https://github.com/EticaAI/HXL-Data-Science-file-formats/blob/main/ontologia/cor.hxltm.yml.
libellam:
glossarium:
conceptum:
_TBX: entry-level
linguam:
_TBX: language-level
terminum:
_TBX: term-level
# rem:
# _TBX: term-level
# Trivia: ontologia, https://la.wikipedia.org/wiki/Ontologia
ontologia:
# Trivia: commūne, https://en.wiktionary.org/wiki/conceptus#Latin
commune:
# Trivia: conceptum, https://en.wiktionary.org/wiki/conceptus#Latin
conceptum:
# Trivia:
# - accūrātum, https://en.wiktionary.org/wiki/accuratus
# - reliabilityCode, https://iate.europa.eu/fields-explained
# - reliabilityCode, https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf
accuratum:
__HXL: '#status +conceptum +accuratum'
__nomen_breve: 'accuratum' # __nomen_breve + __libellam : 'conceptum.accuratum'
__id: ontologia.commune.conceptum.accuratum
__libellam: conceptum
__valorem_optionem: "ontologia_aliud.accuratum"
__valorem_maximum: 10
__valorem_minimum: 0
__valorem_typum: numerum
_TBX: &referens_ontologia-commune-conceptum-accuratum-_TBX
# @see https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf
_nomen: reliabilityCode
_descriptionem: |
A code assigned to a data-category or record indicating accuracy
and/or completeness. The content of the <descrip> element when it
has a type attribute value of 'reliabilityCode' shall be a value
from 1 (least reliable) to 10 (most reliable).
_xml: <descrip type='reliabilityCode'>__valorem__</descrip>
# Trivia: cōdicem, https://en.wiktionary.org/wiki/codex#Latin
codicem:
__HXL: '#item +conceptum +codicem'
__nomen_breve: 'codicem' # conceptum.codicem
__id: ontologia.commune.conceptum.codicem
__libellam: conceptum
_XLIFF:
__HXL_bilingue: '#x_xliff +unit +id'
# Trivia: dēprecātum, https://en.wiktionary.org/wiki/deprecatus#Latin
deprecatum:
__HXL: '#meta +conceptum +codicem +deprecatum'
__nomen_breve: 'codicem_deprecatum' # conceptum.codicem_deprecatum
__id: ontologia.commune.conceptum.codicem.deprecatum
__libellam: conceptum
__valorem_typum: compactum_textum
_exemplum:
- exemplum_codicem_a|exemplum_codicem_b|123456
# Trivia: alternātīvum, https://en.wiktionary.org/wiki/alternativus#Latin
alternativum:
__HXL: '#meta +conceptum +codicem +alternativum'
__nomen_breve: 'codicem_alternativum' # conceptum.codicem_alternativum
__id: ontologia.commune.conceptum.codicem.alternativum
__libellam: conceptum
__valorem_typum: compactum_textum
_exemplum:
- Q1065|UNTERM5f40d95f1d17bf8c85256a01000080af|IATE787725
# ontologia.extensionem.conceptum.codicem.iate
# ontologia.extensionem.conceptum.codicem.unterm
# ontologia.extensionem.conceptum.codicem.wikidata
# (...)
dominium:
__HXL: '#item +conceptum +dominium' # Always a list
__nomen_breve: 'dominium' # conceptum.dominium
__id: ontologia.commune.conceptum.dominium
__libellam: conceptum
# See also
# - https://iate.europa.eu/developers
# - https://iate.europa.eu/em-api/domains/_tree?pretty=true
_TBX:
_id: DC-489
_descriptionem: |
Refers to a location in the corpus (such as a software application
user interface, product packaging, or an industrial process) where
the term frequently occurs. A sample sentence that contains
the term.
_nomen: Subject field
_level: Concept
_xml: <descrip type='subjectField'>
_TMX:
_descriptionem: |
Property - The <prop> element is used to define the various
properties of the parent element (or of the document when
<prop> is used in the <header> element).
These properties are not defined by the standard.
It is the responsibility of each tool provider to publish the
types and values of the properties it uses.
If the tool exports unpublished properties types, their
values should begin with the prefix "x-".
Example:
<prop type='user-defined'>name:domain value:Computer science</prop>
<prop type='x-domain'>Computer science</prop>
_UTX:
_nomen: domain property
_descriptionem: |
The domain property is a text string that indicates the domain of
the glossary. This property is used when you need to group multiple
glossaries into a domain. If you use glossary IDs as domain names,
the domain property is not necessary.
Example
domain: Aerospace
_XLIFF:
# (We will not implement deeper levels than 0 now)
__HXL_bilingue: '#x_xliff +group +group_0'
compactum_json:
__HXL: '#meta +conceptum +json'
__id: ontologia.commune.conceptum.compactum_json
# # Better not provide a generic compactum_textum for conceptum
# compactum_textum:
# __HXL: '#meta +conceptum +textum'
# __id: ontologia.commune.conceptum.compactum_textum
# meta:
# __HXL: '#meta +conceptum'
# __id: ontologia.commune.conceptum.meta
# # Note: this field is unlikely to appear as-is in spreadsheets
# # edited by humans, but it is a catch-all for metadata related
# # to a concept that does not have a specialized field.
# # Recommended data type: JSON
# Trivia: referēns, https://en.wiktionary.org/wiki/referens
referens:
# Situs interretialis
situs_interretialis:
__HXL: '#meta +item +url +list'
__nomen_breve: 'referens_situs_interretialis' # conceptum.referens_situs_interretialis
__id: ontologia.commune.conceptum.referens.situs_interretialis
__libellam: conceptum
# TODO: TBXBasic 6.17 Subject field, <descrip type="subjectField">
# Trivia: "typum", https://en.wiktionary.org/wiki/typus#Latin
typum:
__HXL: '#item +conceptum +typum'
# @TODO: This may need to move elsewhere, since it is term level,
# not concept level.
__nomen_breve: 'typum' # rem.typum
__id: ontologia.commune.conceptum.typum
__libellam: rem
_TBX: &referens_ontologia-commune-rem-typum-__linguam__-_TBX
_id: DC-2677
_nomen: Term type
_xml: <termNote type='termType'>
_descriptionem: |
Permissible values and their ISOcat PIDs are as follows:
Value ISOcat PID
fullForm www.isocat.org/datcat/DC-321
acronym www.isocat.org/datcat/DC-334
abbreviation www.isocat.org/datcat/DC-331
shortForm www.isocat.org/datcat/DC-332
variant www.isocat.org/datcat/DC-330
phrase www.isocat.org/datcat/DC-339
_XLIFF:
# temporary hashtag. Needs better naming
__HXL_bilingue: '#x_xliff +unit +note +note_category__termtype'
# Trivia: rem, https://en.wiktionary.org/wiki/res#Latin
rem:
# Trivia: linguam, https://en.wiktionary.org/wiki/lingua#Latin
__linguam__:
# Exemplum: '#item +rem +i_en +i_eng +is_Latn'
__HXL: '#item +rem __linguam__'
__nomen_breve: 'rem__L__' # rem.rem__L__
__id: ontologia.commune.rem.__linguam__
__libellam: rem
_TBX: &referens_ontologia-commune-rem-__linguam__-_TBX
_id: DC-1823
_nomen: Term
_descriptionem: |
Refers to a location in the corpus (such as a software application
user interface, product packaging, or an industrial process) where
the term frequently occurs. A sample sentence that contains
the term.
_xml: '<term>'
_UTX: &referens_ontologia-commune-rem-__linguam__-_UTX
_nomen: Term
_descriptionem: |
A term is a headword of either the source or target language(s).
A term in a UTX glossary should be in the basic form of the word
such as a headword in a dictionary.
See also "7. Appendix A: UTX content guidelines."
Note: Term definitions are optional in a UTX glossary.
_XLIFF:
# __HXL_bilingue: ''
__HXL_fontem: '#x_xliff +source __linguam__'
__HXL_fontem_alternativum_I: '#x_xliff +unit +note +note_category__altsource1 __linguam__'
__HXL_fontem_alternativum_II: '#x_xliff +unit +note +note_category__altsource2 __linguam__'
__HXL_fontem_alternativum_III: '#x_xliff +unit +note +note_category__altsource3 __linguam__'
__HXL_fontem_alternativum_IV: '#x_xliff +unit +note +note_category__altsource4 __linguam__'
__HXL_fontem_alternativum_V: '#x_xliff +unit +note +note_category__altsource5 __linguam__'
__HXL_objectivum: '#x_xliff +target __linguam__'
# Trivia:
# - linguam, https://en.wiktionary.org/wiki/lingua#Latin
# - dē, https://en.wiktionary.org/wiki/de#Latin
# - imperium, https://en.wiktionary.org/wiki/imperium#Latin
__linguam_de_imperium__:
# Exemplum: '#item +rem +i_en +i_eng +is_Latn +TODO_THINK_ABOUT_HXLTAG'
__HXL: '#item +rem __linguam_de_imperium__'
__nomen_breve: 'rem__L_I__' # rem.rem__L_I__
__id: ontologia.commune.rem.__linguam_de_imperium__
__libellam: rem
_TBX: *referens_ontologia-commune-rem-__linguam__-_TBX
_UTX: *referens_ontologia-commune-rem-__linguam__-_UTX
_XLIFF:
# __HXL_bilingue: ''
__HXL_fontem: '#x_xliff +source __linguam_de_imperium__'
__HXL_fontem_alternativum_I: '#x_xliff +unit +note +note_category__altsource1 __linguam_de_imperium__'
__HXL_fontem_alternativum_II: '#x_xliff +unit +note +note_category__altsource2 __linguam_de_imperium__'
__HXL_fontem_alternativum_III: '#x_xliff +unit +note +note_category__altsource3 __linguam_de_imperium__'
__HXL_fontem_alternativum_IV: '#x_xliff +unit +note +note_category__altsource4 __linguam_de_imperium__'
__HXL_fontem_alternativum_V: '#x_xliff +unit +note +note_category__altsource5 __linguam_de_imperium__'
__HXL_objectivum: '#x_xliff +target __linguam_de_imperium__'
accuratum:
__linguam__:
__HXL: '#status +rem +accuratum __linguam__'
__nomen_breve: 'accuratum__L__' # rem.accuratum__L__
__id: ontologia.commune.rem.__linguam__.accuratum
__libellam: rem
__valorem_optionem: "ontologia_aliud.accuratum"
__valorem_maximum: 10
__valorem_minimum: 0
__valorem_typum: numerum
_TBX: *referens_ontologia-commune-conceptum-accuratum-_TBX
__linguam_de_imperium__:
__HXL: '#status +rem +accuratum __linguam_de_imperium__'
__nomen_breve: 'accuratum__L_I__' # rem.accuratum__L_I__
__id: ontologia.commune.rem.accuratum.__linguam_de_imperium__
__libellam: rem
__valorem_optionem: "ontologia_aliud.accuratum"
__valorem_maximum: 10
__valorem_minimum: 0
__valorem_typum: numerum
_TBX: *referens_ontologia-commune-conceptum-accuratum-_TBX
compactum_json:
__linguam__:
__HXL: '#meta +rem +json __linguam__'
__nomen_breve: 'json__L__' # rem.json__L__
__id: ontologia.commune.rem.compactum_json.__linguam__
__libellam: rem
__linguam_de_imperium__:
__HXL: '#meta +rem +json __linguam_de_imperium__'
__nomen_breve: 'json__L_I__' # rem.json__L_I__
__id: ontologia.commune.rem.compactum_json.__linguam_de_imperium__
__libellam: rem
# Note: we will NOT provide a generic #meta +rem (compactum_textum), only compactum_json
# compactum_textum:
# __linguam__:
# __HXL: '#meta +rem +textum __linguam__'
# __id: ontologia.commune.rem.compactum_textum.__linguam__
# - status
# - HXL hashtag #status, https://hxlstandard.org/standard/1-1final/dictionary/#tag_status
# - statum, https://en.wiktionary.org/wiki/status#Latin
statum:
compactum_json:
__linguam__:
__HXL: '#status +rem +json __linguam__'
__id: ontologia.commune.rem.status.compactum_json.__linguam__
__libellam: rem
__nomen_breve: 'statum_rem_json__L__'
__linguam_de_imperium__:
__HXL: '#status +rem +json __linguam_de_imperium__'
__id: ontologia.commune.rem.status.compactum_json.__linguam_de_imperium__
__libellam: rem
compactum_textum:
__linguam__:
__HXL: '#status +rem +textum __linguam__'
__id: ontologia.commune.rem.__linguam__.status.compactum_textum
__libellam: rem
__nomen_breve: 'statum_rem_textum__L__'
_XLIFF:
# _[eng-Latn]
# XLIFF assumes that the source text is already 'perfect'. That is
# not true for HXLTM use cases, which allow any language to be
# used as the source. The alternative is to record the accuracy of
# the source only as metadata, and to use the status for the
# target language (since previous content may already be
# translated).
# [eng-Latn]_
#
# __HXL_bilingue: ''
# TODO: better naming for __HXL_fontem
__HXL_fontem: '#x_xliff +unit +note +note_category__sourcestatus __linguam__'
__HXL_objectivum: '#x_xliff +segment +state __linguam__'
# TODO: document in YAML how a program could 'transform' the text
# into what XLIFF expects. Aliases are already defined at the
# bottom of this file
__linguam_de_imperium__:
__HXL: '#status +rem +textum __linguam_de_imperium__'
__id: ontologia.commune.rem.compactum_textum.__linguam_de_imperium__
_XLIFF:
# __HXL_bilingue: ''
# TODO: better naming for __HXL_fontem
__HXL_fontem: '#x_xliff +unit +note +note_category__sourcestatus __linguam_de_imperium__'
__HXL_objectivum: '#x_xliff +segment +state __linguam_de_imperium__'
# MOVED TO ontologia.glossarium
# # Trivia: contextum, https://en.wiktionary.org/wiki/contextus#Latin
# contextum:
# __linguam__:
# __HXL: '#item +rem +contextum __linguam__'
# __id: ontologia.commune.rem.contextum.__linguam__
# _TBX: &referens_ontologia-commune-rem-contextum-__linguam__-_TBX
# _id: DC-149
# _descriptionem: A sample sentence that contains the term.
# _xml: <descrip type='context'>
# __linguam_de_imperium__:
# __HXL: '#item +rem +contextum __linguam_de_imperium__'
# __id: ontologia.commune.rem.contextum.__linguam_de_imperium__
# _TBX: *referens_ontologia-commune-rem-contextum-__linguam__-_TBX
# MOVED TO ontologia.glossarium
# # Trivia: dēfīnītiōnem, https://en.wiktionary.org/wiki/definitio#Latin
# definitionem:
# __linguam__:
# __HXL: '#item +rem +definitionem __linguam__'
# __id: ontologia.commune.rem.definitionem.__linguam__
# _TBX: &referens_ontologia-commune-rem-definitionem-__linguam__-_TBX
# _id: DC-168
# _descriptionem:
# _nomen: Definition
# _level: Concept, Language
# _xml: <descrip type='definition'>
# __linguam_de_imperium__:
# __HXL: '#item +rem +definitionem __linguam_de_imperium__'
# __id: ontologia.commune.rem.definitionem.__linguam_de_imperium__
# _TBX: *referens_ontologia-commune-rem-definitionem-__linguam__-_TBX
# Trivia:
# - genus_grammaticum,
# - https://la.wikipedia.org/wiki/Genus_grammaticum
# - https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders
genus_grammaticum:
__linguam__:
__HXL: '#item +rem +genus_grammaticum __linguam__'
__id: ontologia.commune.rem.genus_grammaticum.__linguam__
_TBX: &referens_ontologia-commune-rem-genus_grammaticum-__linguam__-_TBX
_id: DC-245
_descriptionem: |
Picklist, with permissible values as follows:
• masculine
• feminine
• neuter
• other
_nomen: Gender
_level: Term
_xml: <termNote type='grammaticalGender'>
__linguam_de_imperium__:
__HXL: '#item +rem +genus_grammaticum __linguam_de_imperium__'
__id: ontologia.commune.rem.genus_grammaticum.__linguam_de_imperium__
_TBX: *referens_ontologia-commune-rem-genus_grammaticum-__linguam__-_TBX
# Trivia: annotātiōnem, https://en.wiktionary.org/wiki/annotatio#Latin
annotationem:
__linguam__:
__HXL: '#meta +rem +annotationem __linguam__'
__id: ontologia.commune.rem.annotationem.__linguam__
__valorem_typum: textum # annotationem Always free text
_TBX: &referens_ontologia-commune-rem-annotationem-__linguam__-_TBX
_id: DC-382
_descriptionem: |
Any kind of note, such as a usage note, explanation,
or instruction
_nomen: Note
_level: Concept, Language, Term
_xml: <note>
__linguam_de_imperium__:
__HXL: '#meta +rem +annotationem __linguam_de_imperium__'
__id: ontologia.commune.rem.annotationem.__linguam_de_imperium__
__valorem_typum: textum # annotationem Always free text
_TBX: *referens_ontologia-commune-rem-annotationem-__linguam__-_TBX
# Trivia: partem ōrātiōnis, https://en.wiktionary.org/wiki/pars_orationis#Latin
partem_orationis:
__linguam__:
__HXL: '#item +rem +partem_orationis __linguam__'
__id: ontologia.commune.rem.partem_orationis.__linguam__
_TBX: &referens_ontologia-commune-rem-partem_orationis-__linguam__-_TBX
_id: DC-396
_nomen: Part of speech
_descriptionem: |
Content type picklist
Permissible values and their ISOcat PIDs are as follows:
Value ISOcat PID
noun www.isocat.org/datcat/DC-1333
verb www.isocat.org/datcat/DC-1424
adjective www.isocat.org/datcat/DC-1230
adverb www.isocat.org/datcat/DC-1232
properNoun www.isocat.org/datcat/DC-384
other www.isocat.org/datcat/DC-4336
In TBX-Default, the data type for part of speech is plainText.
TBX-Basic's use of picklist is in compliance with TBX-Default
because picklist is more constrained than plainText.
The other value can be used for terms of the phrase type.
_xml: <termNote type='partOfSpeech'>
__linguam_de_imperium__:
__HXL: '#item +rem +partem_orationis __linguam_de_imperium__'
__id: ontologia.commune.rem.partem_orationis.__linguam_de_imperium__
_TBX: *referens_ontologia-commune-rem-partem_orationis-__linguam__-_TBX
# # TODO: TBXBasic 6.17 Subject field, <descrip type="subjectField">
# # Trivia: "typum", https://en.wiktionary.org/wiki/typus#Latin
# typum:
# __linguam__:
# __HXL: '#item +rem +typum __linguam__'
# __id: ontologia.commune.rem.typum.__linguam__
# _TBX: &referens_ontologia-commune-rem-typum-__linguam__-_TBX
# _id: DC-2677
# _nomen: Term type
# _xml: <termNote type='termType'>
# _descriptionem: |
# Permissible values and their ISOcat PIDs are as follows:
# Value ISOcat PID
# fullForm www.isocat.org/datcat/DC-321
# acronym www.isocat.org/datcat/DC-334
# abbreviation www.isocat.org/datcat/DC-331
# shortForm www.isocat.org/datcat/DC-332
# variant www.isocat.org/datcat/DC-330
# phrase www.isocat.org/datcat/DC-339
# __linguam_de_imperium__:
# __HXL: '#item +rem +typum __linguam_de_imperium__'
# __id: ontologia.commune.rem.typum.__linguam_de_imperium__
# _TBX: *referens_ontologia-commune-rem-typum-__linguam__-_TBX
# Trivia: glōssārium, https://en.wiktionary.org/wiki/glossarium
glossarium:
# Trivia: rem, https://en.wiktionary.org/wiki/res#Latin
rem:
# Trivia: contextum, https://en.wiktionary.org/wiki/contextus#Latin
contextum:
__linguam__:
__HXL: '#item +rem +contextum __linguam__'
# __nomen_breve: 'G_rem_contextum__L__'
__nomen_breve: 'contextum__L__' # linguam.contextum__L__
__id: ontologia.glossarium.rem.contextum.__linguam__
__libellam: linguam
_TBX: &referens_ontologia-commune-rem-contextum-__linguam__-_TBX
_id: DC-149
_descriptionem: A sample sentence that contains the term.
_xml: <descrip type='context'>
_XLIFF:
# __HXL_bilingue: ''
__HXL_fontem: '#x_xliff +unit +note +note_category__context __linguam__'
__HXL_fontem_alternativum_I: '#x_xliff +unit +note +note_category__contextalt1 __linguam__'
__HXL_fontem_alternativum_II: '#x_xliff +unit +note +note_category__contextalt2 __linguam__'
__HXL_fontem_alternativum_III: '#x_xliff +unit +note +note_category__contextalt3 __linguam__'
__HXL_fontem_alternativum_IV: '#x_xliff +unit +note +note_category__contextalt4 __linguam__'
__HXL_fontem_alternativum_V: '#x_xliff +unit +note +note_category__contextalt5 __linguam__'
# __HXL_objectivum: ''
__linguam_de_imperium__:
__HXL: '#item +rem +contextum __linguam_de_imperium__'
# __nomen_breve: 'G_rem_contextum__L_I__'
__nomen_breve: 'contextum__L_I__' # linguam.contextum__L_I__
__id: ontologia.glossarium.rem.contextum.__linguam_de_imperium__
__libellam: linguam
_TBX: *referens_ontologia-commune-rem-contextum-__linguam__-_TBX
_XLIFF:
# __HXL_bilingue: ''
__HXL_fontem: '#x_xliff +unit +note +note_category__context __linguam_de_imperium__'
__HXL_fontem_alternativum_I: '#x_xliff +unit +note +note_category__contextalt1 __linguam_de_imperium__'
__HXL_fontem_alternativum_II: '#x_xliff +unit +note +note_category__contextalt2 __linguam_de_imperium__'
__HXL_fontem_alternativum_III: '#x_xliff +unit +note +note_category__contextalt3 __linguam_de_imperium__'
__HXL_fontem_alternativum_IV: '#x_xliff +unit +note +note_category__contextalt4 __linguam_de_imperium__'
__HXL_fontem_alternativum_V: '#x_xliff +unit +note +note_category__contextalt5 __linguam_de_imperium__'
# __HXL_objectivum: ''
# Trivia: dēfīnītiōnem, https://en.wiktionary.org/wiki/definitio#Latin
definitionem:
__linguam__:
__HXL: '#item +rem +definitionem __linguam__'
# __nomen_breve: 'G_rem_definitionem__L__'
__nomen_breve: 'definitionem__L__' # linguam.definitionem__L__
__id: ontologia.glossarium.rem.definitionem.__linguam__
__libellam: linguam
_TBX: &referens_ontologia-commune-rem-definitionem-__linguam__-_TBX
_id: DC-168
_descriptionem:
_nomen: Definition
_level: Concept, Language
_xml: <descrip type='definition'>
_XLIFF:
# __HXL_bilingue: ''
__HXL_fontem: '#x_xliff +unit +note +note_category__definition __linguam__'
__HXL_fontem_alternativum_I: '#x_xliff +unit +note +note_category__definitionalt1 __linguam__'
__HXL_fontem_alternativum_II: '#x_xliff +unit +note +note_category__definitionalt2 __linguam__'
__HXL_fontem_alternativum_III: '#x_xliff +unit +note +note_category__definitionalt3 __linguam__'
__HXL_fontem_alternativum_IV: '#x_xliff +unit +note +note_category__definitionalt4 __linguam__'
__HXL_fontem_alternativum_V: '#x_xliff +unit +note +note_category__definitionalt5 __linguam__'
# __HXL_objectivum: ''
__linguam_de_imperium__:
__HXL: '#item +rem +definitionem __linguam_de_imperium__'
# __nomen_breve: 'G_rem_definitionem__L_I__'
__nomen_breve: 'definitionem__L_I__' # linguam.definitionem__L_I__
__id: ontologia.glossarium.rem.definitionem.__linguam_de_imperium__
__libellam: linguam
_TBX: *referens_ontologia-commune-rem-definitionem-__linguam__-_TBX
_XLIFF:
# __HXL_bilingue: ''
__HXL_fontem: '#x_xliff +unit +note +note_category__definition __linguam_de_imperium__'
__HXL_fontem_alternativum_I: '#x_xliff +unit +note +note_category__definitionalt1 __linguam_de_imperium__'
__HXL_fontem_alternativum_II: '#x_xliff +unit +note +note_category__definitionalt2 __linguam_de_imperium__'
__HXL_fontem_alternativum_III: '#x_xliff +unit +note +note_category__definitionalt3 __linguam_de_imperium__'
__HXL_fontem_alternativum_IV: '#x_xliff +unit +note +note_category__definitionalt4 __linguam_de_imperium__'
__HXL_fontem_alternativum_V: '#x_xliff +unit +note +note_category__definitionalt5 __linguam_de_imperium__'
# __HXL_objectivum: ''
# Trivia: extēnsiōnem, https://en.wiktionary.org/wiki/extensio#Latin
extensionem:
conceptum:
codicem:
iate:
__HXL: '#meta +conceptum +codicem +iate'
__nomen_breve: 'codicem_iate' # conceptum.codicem_iate
__id: ontologia.extensionem.conceptum.codicem.iate
__libellam: conceptum
_url: https://iate.europa.eu/fields-explained
# _url_rem: ' ???? '
_compactum_json:
- #meta +json +conceptum
_id: ontologia.commune.conceptum.compactum_json
_JSONPath: codicem.iate
_exemplum:
{"codicem": {"iate": 787725}}
_compactum_textum:
- #meta +conceptum +codicem +alternativum
_id: ontologia.commune.conceptum.codicem.alternativum
_praefixum: "IATE"
_suffixum: ""
_exemplum:
- IATE787725
_XLIFF:
__HXL_bilingue: '#x_xliff +unit +note +note_category__iate'
unterm:
__HXL: '#meta +conceptum +codicem +unterm'
__nomen_breve: 'codicem_unterm' # conceptum.codicem_unterm
__id: ontologia.extensionem.conceptum.codicem.unterm
__libellam: conceptum
_url: https://unterm.un.org/
_url_rem: 'https://unterm.un.org/unterm/display/record/unhq/na?OriginalId={{valorem}}'
_compactum_json:
- _id: ontologia.commune.conceptum.compactum_json
_JSONPath: codicem.unterm
#meta +conceptum +json
_exemplum:
{"codicem": {"unterm": "5f40d95f1d17bf8c85256a01000080af"}}
_compactum_textum:
#meta +conceptum +codicem +alternativum
- _id: ontologia.commune.conceptum.codicem.alternativum
_praefixum: "UNTERM"
_suffixum: ""
_exemplum: UNTERM5f40d95f1d17bf8c85256a01000080af
_XLIFF:
__HXL_bilingue: '#x_xliff +unit +note +note_category__unterm'
wikidata:
__HXL: '#meta +conceptum +codicem +wikidata'
__nomen_breve: 'codicem_wikidata' # conceptum.codicem_wikidata
__id: ontologia.extensionem.conceptum.codicem.wikidata
__libellam: conceptum
_url: https://www.wikidata.org/
_url_rem: 'https://www.wikidata.org/wiki/{{valorem}}'
_compactum_json:
- #meta +json +conceptum
_id: ontologia.commune.conceptum.compactum_json
_JSONPath: codicem.wikidata
_exemplum:
{"codicem": {"wikidata": "Q1065"}}
_compactum_textum:
- #meta +conceptum +codicem +alternativum
_id: ontologia.commune.conceptum.codicem.alternativum
_praefixum: "Q"
_suffixum: ""
_exemplum: Q1065
_XLIFF:
__HXL_bilingue: '#x_xliff +unit +note +note_category__wikidata'
# @see https://docs.google.com/spreadsheets/d/1couRYFuVLnr6CfIMEiXKBamJtmcHinSAy1K1e69rNqw/edit#gid=141644151
normam:
org_hxlstandard:
__HXL: '#meta +conceptum +normam +normam_org_hxlstandard'
__nomen_breve: 'normam_org_hxlstandard'
__id: ontologia.extensionem.conceptum.normam.org_hxlstandard
__libellam: conceptum
_url: https://hxlstandard.org/
_compactum_json:
- #meta +json +conceptum
_id: ontologia.commune.conceptum.compactum_json
_JSONPath: normam.org_hxlstandard
_exemplum:
{"normam": {"org_hxlstandard": "#meta+id"}}
# _compactum_textum:
# - #meta +conceptum +normam +alternativum
# _id: ontologia.commune.conceptum.normam.alternativum
# _praefixum: ""
# _suffixum: ""
# _exemplum:
# -
# _XLIFF:
# __HXL_bilingue: '#x_xliff +unit +note +note_category__iate'
# Trivia: https://www.wikidata.org/wiki/Q333761
tm:
# @see https://multifarious.filkin.com/2018/08/23/xliff-2-x-the-translators-panacea/
# @see https://blog.zingword.com/xliff-2-0-and-how-big-companies-are-preventing-translators-from-improving-their-lives-209e384ed8ea
# @see https://github.com/tingley/interoperability-now/wiki
# Trivia: conceptum, https://en.wiktionary.org/wiki/conceptus#Latin
conceptum:
# Trivia: trānslātiōnem, https://en.wiktionary.org/wiki/translatio#Latin
translationem:
# Trivia: dīrēctiōnem, https://en.wiktionary.org/wiki/directio#Latin
directionem:
__HXL: '#meta +conceptum +translationem +directionem'
# __nomen_breve: 'TM_conceptum_translationem_directionem'
__nomen_breve: 'translationem_directionem' # conceptum.translationem_directionem
__id: ontologia.tm.conceptum.translationem.directionem
__libellam: conceptum
_compactum_json:
#meta +json +conceptum
_id: ontologia.commune.conceptum.compactum_json
_JSONPath: translationem.directionem
_exemplum:
{"translationem": {"directionem": []}}
_UTX: &referens_ontologia-commune-conceptum-meta-_UTX
_nomen: Translation direction
_descriptionem: |
The direction of translation from one language to another
(translation direction) can be unidirectional, bidirectional, or
multidirectional. This information can be specified with
"3.2.14 directionality property."
A unidirectional glossary is a glossary whose translation
direction is primarily one-way, i.e. from the source language to
the target language.
> Example: unidirectional bilingual Japanese-English UTX glossary
- Source language: Japanese, target language: English
- Primary translation direction: Japanese to English
- Note: Some terms in a unidirectional glossary may be exported
and used in the reverse direction in an ad-hoc manner.
This operation is called reverse-exporting. In this case, the
source language becomes the target language, and vice versa.
A reverse-exported unidirectional glossary may contain problems
because the consequence of the reversal may not be thoroughly
examined when compared with a full bidirectional glossary.
A bidirectional glossary is a glossary that is designed to be
used in two-way translation. Terms in one language can be
translated into another, and vice versa.
> Example: bidirectional bilingual Japanese-English UTX glossary
>
> - Language 1: Japanese, language 2: English
> - Translation direction: Japanese ⇔ English
> Example: bidirectional multilingual English-French-German
> glossary
>
> - Language 1: English, language 2: French, language 3: German
> - Translation direction: English ⇔ French, English ⇔ German
(but not French ⇔ German)
A multidirectional glossary is a type of multilingual glossary
that is designed to be used in any combination of languages in the
glossary.
> Example: multidirectional multilingual English-Japanese-Chinese
> glossary
>
> - Language 1: English, language 2: Japanese, language 3: Chinese
> - Translation direction: any combination of the above
# Trivia: rem, https://en.wiktionary.org/wiki/res#Latin
rem:
# Trivia: trānslātiōnem, https://en.wiktionary.org/wiki/translatio#Latin
translationem:
# Trivia: dīrēctiōnem, https://en.wiktionary.org/wiki/directio#Latin
directionem:
__linguam__:
# Exemplum: '#meta +rem +translationem +directionem +i_en +i_eng +is_Latn'
__HXL: '#meta +rem +translationem +directionem __linguam__'
# __nomen_breve: 'TM_rem_translationem_directionem__L__'
__nomen_breve: 'translationem_directionem__L__' # conceptum.translationem_directionem__L__
__id: ontologia.tm.rem.translationem.directionem.__linguam__
__libellam: conceptum
_compactum_json:
#meta +json +rem __linguam__
_id: ontologia.commune.rem.__linguam__.compactum_json
_JSONPath: translationem.directionem
_exemplum:
{"translationem": {"directionem": []}}
_UTX: *referens_ontologia-commune-conceptum-meta-_UTX
__linguam_de_imperium__:
__HXL: '#meta +rem +translationem +directionem __linguam_de_imperium__'
# __nomen_breve: 'TM_rem_translationem_directionem__L_I__'
__nomen_breve: 'translationem_directionem__L_I__' # conceptum.translationem_directionem__L_I__
__id: ontologia.tm.rem.translationem.directionem.__linguam_de_imperium__
__libellam: conceptum
_compactum_json:
#meta +json +rem __linguam_de_imperium__
_id: ontologia.commune.rem.__linguam_de_imperium__.compactum_json
_JSONPath: translationem.directionem
_exemplum:
{"translationem": {"directionem": []}}
_UTX: *referens_ontologia-commune-conceptum-meta-_UTX
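The ontology above relies on a few recurring conventions: the `__linguam__` placeholder inside `__HXL` templates, the dotted `_JSONPath` used with `_compactum_json`, the `_praefixum`/`_suffixum` pair used with `_compactum_textum`, and the `{{valorem}}` marker in `_url_rem`. The following is a minimal Python sketch of how a program might apply them. The function names are hypothetical and are not part of hxltmcli; treating `_JSONPath` as a plain dotted path is also an assumption.

```python
import json


def expand_linguam(template: str, attributes: str) -> str:
    """Replace the __linguam__ placeholder with concrete HXL language attributes."""
    return template.replace("__linguam__", attributes)


def from_compactum_json(cell: str, dotted_path: str):
    """Follow a dotted path (as in `_JSONPath: codicem.iate`) inside a JSON cell.

    Returns None when any step of the path is missing.
    """
    node = json.loads(cell)
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def to_compactum_textum(value, praefixum: str = "", suffixum: str = "") -> str:
    """Build the compact text form: prefix + value + suffix."""
    return f"{praefixum}{value}{suffixum}"


def expand_url_rem(template: str, value) -> str:
    """Fill the {{valorem}} marker of a `_url_rem` template."""
    return template.replace("{{valorem}}", str(value))


if __name__ == "__main__":
    # Values below are copied from the examples in the ontology itself.
    print(expand_linguam(
        "#meta +rem +translationem +directionem __linguam__",
        "+i_en +i_eng +is_Latn",
    ))
    code = from_compactum_json('{"codicem": {"iate": 787725}}', "codicem.iate")
    print(to_compactum_textum(code, praefixum="IATE"))
    print(expand_url_rem("https://www.wikidata.org/wiki/{{valorem}}", "Q1065"))
```

Because `__linguam_de_imperium__` does not contain the exact substring `__linguam__`, the simple string replacement above leaves templates that use the other placeholder untouched.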
Aliases
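The `accuratum` aliases in this section use two special keys: `"?"` for a column that does not exist at all and `"_"` for a column that exists but has an empty cell. The sketch below shows one way a program could resolve a row against such a table. It is an illustration only: the function name, the `accuratum` column name, and the reduced alias table are assumptions, not hxltmcli behaviour.

```python
# Reduced copy of the alias table: only the numeric IATE value is kept.
ACCURATUM_ALIASES = {
    "?": {"_IATE_valorem_numerum": 6},   # column absent from the dataset
    "_": {"_IATE_valorem_numerum": 1},   # column present, cell empty
    "0": {"_IATE_valorem_numerum": 1},
    "1": {"_IATE_valorem_numerum": 1},
    "2": {}, "3": {}, "4": {}, "5": {},  # no IATE equivalent defined
    "6": {"_IATE_valorem_numerum": 6},
    "7": {}, "8": {},
    "9": {"_IATE_valorem_numerum": 9},
    "10": {"_IATE_valorem_numerum": 10},
}


def iate_numerum(row: dict, column: str = "accuratum", aliases=ACCURATUM_ALIASES):
    """Return the IATE reliability number for a row, or None if undefined."""
    if column not in row:
        key = "?"
    else:
        value = row[column]
        if value is None or str(value).strip() == "":
            key = "_"
        else:
            key = str(value).strip()
    return aliases.get(key, {}).get("_IATE_valorem_numerum")


if __name__ == "__main__":
    print(iate_numerum({}))                   # column missing
    print(iate_numerum({"accuratum": ""}))    # empty cell
    print(iate_numerum({"accuratum": "9"}))   # explicit value
    print(iate_numerum({"accuratum": "3"}))   # value without an IATE mapping
```

Values such as `"2"` or `"7"` map to empty entries in the table, so the sketch returns `None` for them rather than guessing a neighbouring IATE code.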
# Trivia:
# - ontologia, https://la.wikipedia.org/wiki/Ontologia
# - aliud, https://en.wiktionary.org/wiki/alius#Latin, https://en.wiktionary.org/wiki/alias#English
ontologia_aliud:
# Trivia:
# - accūrātum, https://en.wiktionary.org/wiki/accuratus
# - reliabilityCode, https://iate.europa.eu/fields-explained
# - reliabilityCode, https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf
#
# Term Base eXchange (TBX) 2008 CC-BY License ................................
# @see https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf
#
# reliabilityCode:
# A code assigned to a data-category or record indicating accuracy and/or
# completeness. The content of the <descrip> element when it has a type
# attribute value of 'reliabilityCode' shall be a value from 1
# (least reliable) to 10 (most reliable).
#
# Interactive Terminology for Europe (IATE) ..................................
# @see https://iate.europa.eu/fields-explained
#
# Reliability code
# IATE uses four codes to indicate the reliability of terms:
#
# Nº Code Description Explanation
#
# 1 ★ Reliability Automatically assigned to terms entered by
# not verified non-native speakers. Also, all lookup forms have
# a reliability of one.
#
# 6 ★★ Minimum Automatically assigned to terms entered or updated
# reliability by native speakers.
#
# 9 ★★★ Reliable Manually assigned by a terminologist following a
# reliability assessment. Reliable terms should
# satisfy at least one of the following criteria:
# - having been obtained from a trusted source;
# - having been agreed on by a representative
# body of same-language terminologists;
# - being the common designation of the concept
# in its field.
#
# 10 ★★★★ Very reliable Manually assigned following a reliability
# assessment. Very reliable terms are:
# - well-established and widely accepted by
# experts as the correct designation, or
# - confirmed by a trusted and authoritative
# source, in particular a reliable written
# source.
accuratum:
"?":
# The '?' key expresses what to do when the entire column does not exist,
# as opposed to a single value being missing
_IATE_valorem_codicem: "★★"
_IATE_valorem_descriptionem: |
Automatically assigned to terms entered or updated by native speakers.
_IATE_valorem_nomen: "Minimum reliability"
_IATE_valorem_numerum: 6
"_":
# The '_' key expresses what to do when the column does exist, but one
# particular item has no value
_IATE_valorem_codicem: "★"
_IATE_valorem_descriptionem: |
Automatically assigned to terms entered by non-native speakers.
Also, all lookup forms have a reliability of one.
_IATE_valorem_nomen: "Reliability not verified"
_IATE_valorem_numerum: 1
"0":
# The '0' key expresses what to do when the column does exist, but one
# particular item evaluates to 0
_IATE_valorem_codicem: "★"
_IATE_valorem_descriptionem: |
Automatically assigned to terms entered by non-native speakers.
Also, all lookup forms have a reliability of one.
_IATE_valorem_nomen: "Reliability not verified"
_IATE_valorem_numerum: 1
"1":
_IATE_valorem_codicem: "★"
_IATE_valorem_descriptionem: |
Automatically assigned to terms entered by non-native speakers.
Also, all lookup forms have a reliability of one.
_IATE_valorem_nomen: "Reliability not verified"
_IATE_valorem_numerum: 1
"2": {}
"3": {}
"4": {}
"5": {}
"6":
_IATE_valorem_codicem: "★★"
_IATE_valorem_descriptionem: |
Automatically assigned to terms entered or updated by native speakers.
_IATE_valorem_nomen: "Minimum reliability"
_IATE_valorem_numerum: 6
"7": {}
"8": {}
"9":
_IATE_valorem_codicem: "★★★"
_IATE_valorem_descriptionem: |
Manually assigned by a terminologist following a reliability
assessment. Reliable terms should satisfy at least one of the
following criteria:
- having been obtained from a trusted source;
- having been agreed on by a representative body of same-language
terminologists;
- being the common designation of the concept in its field.
N.B. This code was automatically assigned to many entries, regardless
of their previous validation status, following the merger of existing
databases to create IATE. Therefore some entries marked as ‘reliable’
are not necessarily so.
_IATE_valorem_nomen: "Reliable"
_IATE_valorem_numerum: 9
"10":
_IATE_valorem_codicem: "★★★★"
_IATE_valorem_descriptionem: |
Manually assigned following a reliability assessment.
Very reliable terms are:
- well-established and widely accepted by experts as the correct
designation, or
- confirmed by a trusted and authoritative source, in particular a
reliable written source.
_IATE_valorem_nomen: "Very reliable"
_IATE_valorem_numerum: 10
genus_grammaticum:
# TODO: Several languages have more than 3 genders, but only 6 aliases are
# defined for Latin so far. This needs improvement.
# @see https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders
### Lingua Latina ------------------------------------------------------------
# > https://la.wikipedia.org/wiki/Genus_grammaticum
# Genus grammaticum
# Genus grammaticum est aliqua proprietas sive nominis sive interdum etiam
# verbi. Genera circiter in quarta parte linguarum mundi distinguuntur.
# Divisiones principales sunt haec:
#
# - masculinum et femininum
# - (e.g. Francogallice, Hispanice, Hindi-Urdu, Arabice, Hebraice)
# - masculinum, femininum, neutrum
# - (e.g. Latine, Graece, Theodisce, Islandice, Sanscritice)
# - animatum, inanimatum
# - (e.g. Ojibwayense et probabiliter lingua Protoindoeuropaea)
# - commune, neutrum (e.g. Danice, Suecice).
lat_commune:
_aliud: 'TBX_other'
# _codicem: lat_commune
_codicem_TBX: TBX_other
_descriptionem: |
- https://la.wikipedia.org/wiki/Genus_grammaticum
- https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders#Common_and_neuter
codicem_lat: commune
lat_animatum:
_aliud: 'TBX_other'
# _codicem: lat_animatum
_codicem_TBX: TBX_other
_descriptionem: |
- https://la.wikipedia.org/wiki/Genus_grammaticum
- https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders#Animate_and_inanimate
codicem_lat: animatum
lat_femininum:
_aliud: 'TBX_feminine'
# _codicem: lat_femininum
_codicem_TBX: TBX_feminine
_descriptionem: |
- https://la.wikipedia.org/wiki/Genus_grammaticum
- https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders
codicem_lat: femininum
lat_inanimatum:
_aliud: 'TBX_other'
# _codicem: lat_inanimatum
_codicem_TBX: TBX_other
_descriptionem: |
- https://la.wikipedia.org/wiki/Genus_grammaticum
- https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders#Animate_and_inanimate
codicem_lat: inanimatum
lat_neutrum:
_aliud: 'TBX_neuter'
# _codicem: lat_neutrum
_codicem_TBX: TBX_neuter
_descriptionem: |
- https://la.wikipedia.org/wiki/Genus_grammaticum
- https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders#Common_and_neuter
codicem_lat: neutrum
lat_masculinum:
_aliud: 'TBX_masculine'
# _codicem: lat_masculinum
_codicem_TBX: TBX_masculine
_descriptionem: |
- https://la.wikipedia.org/wiki/Genus_grammaticum
- https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders
codicem_lat: masculinum
# TODO: Lingua latina, 'More than three grammatical genders'
# @see https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders#More_than_three_grammatical_genders
# Term Base eXchange (TBX) 2008 CC-BY License ................................
# @see http://www.terminorgs.net/downloads/TBX_Basic_Version_3.1.pdf
# @see https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf
#
# 6.8 Gender
# Identifier www.isocat.org/datcat/DC-245
# XML representation <termNote type="grammaticalGender">
# Level Term
# Content type Picklist, with permissible values as follows:
# • masculine
# • feminine
# • neuter
# • other
#
# <termNoteSpec name="grammaticalGender" datcatId="ISO12620A-020202">
# <contents datatype="picklist" forTermComp="yes">
# masculine feminine neuter otherGender
# </contents>
# </termNoteSpec>
TBX_masculine:
codicem_TBX: masculine
TBX_feminine:
codicem_TBX: feminine
TBX_neuter:
codicem_TBX: neuter
TBX_other:
codicem_TBX: other
# Trivia: partem ōrātiōnis, https://en.wiktionary.org/wiki/pars_orationis#Latin
partem_orationis:
### Lingua Latina ---------------------------------------------------------
# @see https://en.wikipedia.org/wiki/Latin_grammar
# @see https://la.wikipedia.org/wiki/Grammatica_Latina
# @see http://www.butte.edu/departments/cas/tipsheets/grammar/parts_of_speech.html
# - Charlton T. Lewis and Charles Short, A new Latin Dictionary,
# New York/Oxford 1891 (1879)
# - https://archive.org/details/LewisAndShortANewLatinDictionary
lat_adverbium:
_aliud: 'TBX_adverb|UTX_adverb'
_codicem: lat_adverbium
_codicem_TBX: TBX_adverb
_codicem_UTX: UTX_adverb
_codicem_wikidata: Q380057 # https://www.wikidata.org/wiki/Q380057
_normam: https://la.wikipedia.org/wiki/Adverbium
codicem_lat: adverbium
lat_nomen_adiectivum:
_aliud: 'TBX_adjective|UTX_adjective'
_codicem: lat_nomen_adiectivum
_codicem_TBX: TBX_adjective
_codicem_UTX: UTX_adjective
_codicem_wikidata: Q34698 # https://www.wikidata.org/wiki/Q34698
_normam: https://la.wikipedia.org/wiki/Adiectivum
codicem_lat: nomen_adiectivum
# substantīvum
lat_nomen_substantivum:
_aliud: 'TBX_noun|UTX_noun'
_codicem: lat_nomen_substantivum
_codicem_TBX: TBX_noun
_codicem_UTX: UTX_noun
_codicem_wikidata: Q1084 # https://www.wikidata.org/wiki/Q1084
_normam: https://la.wikipedia.org/wiki/Nomen_substantivum
codicem_lat: nomen_substantivum
lat_nomen_proprium:
_aliud: 'TBX_properNoun|UTX_properNoun'
_codicem: lat_nomen_proprium
_codicem_TBX: TBX_properNoun
_codicem_UTX: UTX_properNoun
_codicem_wikidata: Q147276 # https://www.wikidata.org/wiki/Q147276
_normam: https://la.wikipedia.org/wiki/Nomen_proprium
codicem_lat: nomen_proprium
# TODO: UTX_prenominal, https://pt.wikipedia.org/wiki/Coloca%C3%A7%C3%A3o_pronominal
lat_verbum:
_aliud: 'TBX_verb|UTX_verb'
_codicem: lat_verbum
_codicem_TBX: TBX_verb
_codicem_UTX: UTX_verb
_codicem_wikidata: Q24905 # https://www.wikidata.org/wiki/Q24905
_normam: https://la.wikipedia.org/wiki/Verbum_(temporale)
codicem_lat: verbum
# TODO: UTX_vt, Transitive verb, https://en.wikipedia.org/wiki/Transitive_verb
# TODO: UTX_vi, Intransitive verb, https://en.wikipedia.org/wiki/Intransitive_verb
# sentence, maybe https://en.wiktionary.org/wiki/sententia
# phrasem1, https://en.wiktionary.org/wiki/phrasis#Latin
# TODO: TBX_other
### ------------------------------------------------------------------------
# http://www.terminorgs.net/downloads/TBX_Basic_Version_3.1.pdf
TBX_noun:
_codicem: DC-1333
codicem_TBX: noun
TBX_verb:
_codicem: DC-1424
codicem_TBX: verb
TBX_adjective:
_codicem: DC-1230
codicem_TBX: adjective
TBX_adverb:
_codicem: DC-1232
codicem_TBX: adverb
TBX_properNoun:
_codicem: DC-384
codicem_TBX: properNoun
TBX_other:
_codicem: DC-4336
codicem_TBX: other
### ------------------------------------------------------------------------
# https://aamt.info/wp-content/uploads/2019/06/utx1.20-specification-e.pdf
UTX_noun:
_nomen: Noun
codicem_UTX: noun
UTX_properNoun:
_nomen: Proper noun
codicem_UTX: properNoun
UTX_verb:
codicem_UTX: verb
UTX_vt:
_nomen: Transitive verb
codicem_UTX: vt
UTX_vi:
_nomen: Intransitive verb
codicem_UTX: vi
UTX_adjective:
codicem_UTX: adjective
UTX_prenominal:
codicem_UTX: prenominal
UTX_adverb:
codicem_UTX: adverb
UTX_sentence:
codicem_UTX: sentence
# Trivia:
# - rem, https://en.wiktionary.org/wiki/res#Latin
# - trānslātiōnem, https://en.wiktionary.org/wiki/translatio#Latin
# - status
# - HXL hashtag #status, https://hxlstandard.org/standard/1-1final/dictionary/#tag_status
# - statum, https://en.wiktionary.org/wiki/status#Latin
# translationem_status:
rem_statum:
### Lingua Latina ---------------------------------------------------------
# TODO: lingua latina
# Trivia:
# - rem, https://en.wiktionary.org/wiki/res#Latin
# - fīnāle, https://en.wiktionary.org/wiki/finalis#Latin
lat_rem_finale:
_aliud: 'TBX_preferred|UTX_approved|XLIFF_final'
codicem_lat: rem_finale
# Trivia: initiāle, https://en.wiktionary.org/wiki/initialis#Latin
lat_rem_initiale:
# TBX does not have a value for 'initial', but accuratum can be used for this
_aliud: 'UTX_provisional|XLIFF_initial'
codicem_lat: rem_initiale
# Trivia:
# - temporārium, https://en.wiktionary.org/wiki/temporarius#Latin
# - non, https://en.wiktionary.org/wiki/non#Latin
# - nātīvum, https://en.wiktionary.org/wiki/nativus#Latin
lat_rem_temporarium_de_non_nativum:
# TBX does not have a value for 'initial', but we can use accuratus for this
_aliud: 'UTX_provisional|XLIFF_initial'
codicem_lat: rem_temporarium_de_non_nativum
# Trivia:
# - temporārium, https://en.wiktionary.org/wiki/temporarius#Latin
lat_rem_temporarium:
# TBX does not have a value for 'initial', but we can use accuratus for this
_aliud: 'UTX_provisional|XLIFF_initial'
codicem_lat: rem_temporarium
# Trivia: vacuum, https://en.wiktionary.org/wiki/vacuus#Latin
lat_rem_vacuum:
_aliud: ''
codicem_lat: rem_vacuum
### TermBase eXchange (TBX) "TBX-Basic" 3.1 --------------------------------
# 3.1 7 Additional information about data categories
# The term type data category is optional. When a term has no term type
# value, it is assumed to be an ordinary entry term that is not an
# abbreviation or a variant of another term or an abbreviation of
# another full form term.
TBX_preferred:
_codicem: DC-72
_codicem_isocat: preferredTerm‐admn‐sts
_descriptionem: |
The term that, among a set of synonymous terms, is most recommended
for use.
codicem_TBX: preferred
TBX_admitted:
_codicem: DC-73
_codicem_isocat: admittedTerm‐admn‐sts
_descriptionem: The term is acceptable for use
codicem_TBX: admitted
TBX_notRecommended:
_codicem: DC-74
_codicem_isocat: deprecatedTerm‐admn‐sts
_descriptionem: The term should not be used.
codicem_TBX: notRecommended
TBX_obsolete:
_codicem: DC-75
_codicem_isocat: supersededTerm‐admn‐sts
_descriptionem: |
The term is no longer used, usually because a more modern term has
replaced it.
codicem_TBX: obsolete
### Universal Terminology eXchange UTX 1.20 --------------------------------
# The term status field indicates the status of a term.
# There are 7 statuses: blank, provisional, approved, non-standard,
# forbidden, rejected, or obsolete. Only a glossary administrator and a
# delegate can change the value of a term status.
# Note: If a glossary does not have a term status field, all entries
# are considered to be approved.
# (...)
# 5. Advanced concepts
# 5.1 Single term status and per-language term status
# There are two methods of applying term status: single term status and
# per-language term status. A glossary can use either of these to
# indicate the term status.
# 5.1.1 Single term status
#
# (...)
#
# 5.1.2 Per-language term status
# Per-language term status specifies the term status of a term for a
# particular language rather than a pair of two languages. For example,
# if it is used in a bilingual unidirectional glossary, it requires two
# term status columns, one for the source language, and one for the
# target language.
#
# If a language tag is not specified, the term status is treated as
# single term status (UTX 1.11 style).
# Note: Per-language term status is introduced in UTX 1.20 to handle
# bilingual bidirectional glossaries and multilingual glossaries.
#
# 5.1.3 Term status behaviors for an MT dictionary
#
# (...)
#
UTX_provisional:
_descriptionem: |
The term status "provisional" indicates that a target term is
proposed by a contributor but not yet authorized by the glossary
administrator. As provisional status is temporary, the glossary
administrator should promptly decide the term status such as
"approved."
Note: The glossary administrator may also choose to exclude (delete)
the term from the glossary, or move it to another glossary
codicem_UTX: provisional
UTX_approved:
_descriptionem: |
The term status "approved" indicates that an entry has been approved
for the particular glossary (domain) by the glossary administrator.
An approved status indicates that the term must be used with the
highest priority, whenever applicable. If a term has synonyms or
alternative spellings, such as "plugin" and "plug-in," only one of
these should have approved status.
An approved term in one language is paired with another approved term
in another language. If the parts of speech of these multiple entries
are different, then they are different terms. For example, "plot"
can be a noun and a verb, and each can have approved status.
codicem_UTX: approved
# 4.5.3 Blank term status
# If the term status is left blank, it is considered as approved
# (a change from UTX 1.11). The term status of a term paired with a
# non-standard, forbidden, rejected, and obsolete term (explained later)
# can also be blank, which implicitly indicates approved status.
UTX_non-standard:
_descriptionem: |
The term status "non-standard" indicates one or more terms that are
less-preferred within a group of synonyms or alternative spellings.
Note: The glossary administrator decides whether the term is
less-preferred or not for a particular glossary. Therefore, this
status could vary in different glossaries, or with a different
glossary administrator.
codicem_UTX: non-standard
UTX_forbidden:
_descriptionem: |
The term status "forbidden" indicates that a term must not be used.
A term is marked as forbidden not only for being inappropriate as a
translation, but also if it is inappropriate within the context of the
end-result document.
A forbidden term, unlike a non-standard term, should not be provided
as a translation candidate.
Note: A term is “forbidden” because it is inappropriate from
linguistic, social, terminological, branding, or other viewpoints.
Up to UTX 1.11, only a target term could be indicated as forbidden.
UTX 1.20 allows any term (including a source term) to be indicated as
forbidden.
Forbidden terms can be exported from a UTX glossary for terminological
checking. Based on this information, a function of a translation tool
or a dedicated terminological checker can ascertain whether
translation files contain any undesirable terms.
codicem_UTX: forbidden
UTX_rejected:
_descriptionem: |
The term status "rejected" indicates that a term is not appropriate
for inclusion in a glossary. Rejected terms can be kept in the glossary
for record keeping, moved into a separate list, or deleted at a
later time.
codicem_UTX: rejected
UTX_obsolete:
_descriptionem: |
The term status "obsolete" indicates that a term was previously used,
but should no longer be used. Obsolete terms can be kept in the
glossary for record keeping, moved into a separate list, or deleted
at a later time.
codicem_UTX: obsolete
### XML Localization Interchange File Format XLIFF 2.1 ---------------------
# @see https://docs.oasis-open.org/xliff/xliff-core/v2.1/xliff-core-v2.1.html#state
# @see http://docs.oasis-open.org/xliff/xliff-core/v2.1/os/xliff-core-v2.1-os.html#substate
#
# State - indicates the state of the translation of a segment.
#
# Value description: The value MUST be set to one of the following values:
#
# initial - indicates the segment is in its initial state.
# translated - indicates the segment has been translated.
# reviewed - indicates the segment has been reviewed.
# final - indicates the segment is finalized and ready to be used.
#
# The 4 defined states constitute a simple linear state machine that
# advances in the above given order. No particular workflow or process is
# prescribed, except that the three states more advanced than the default
# initial assume the existence of a Translation within the segment.
# One can further specify the state of the Translation using the subState
# attribute.
#
# Default value: initial
XLIFF_initial:
_descriptionem: indicates the segment is in its initial state
codicem_XLIFF: initial
XLIFF_translated:
_descriptionem: indicates the segment has been translated
codicem_XLIFF: translated
XLIFF_reviewed:
_descriptionem: indicates the segment has been reviewed
codicem_XLIFF: reviewed
XLIFF_final:
_descriptionem: indicates the segment is finalized and ready to be used
codicem_XLIFF: final
# Trivia:
# - terminum, https://en.wiktionary.org/wiki/terminus#Latin
# - typum, https://en.wiktionary.org/wiki/typus#Latin
terminum_typum: &ontologia_aliud_terminum_typum
### TermBase eXchange (TBX) "TBX-Basic" 3.1 + IATE ------------------------
# @see http://www.terminorgs.net/downloads/TBX_Basic_Version_3.1.pdf
# @see https://iate.europa.eu/fields-explained
# @see https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf
# @TODO: maybe add the ones from tbx_oscar_0
TBX_fullForm:
_codicem: DC-321
codicem_TBX: fullForm
TBX_acronym:
_codicem: DC-334
codicem_TBX: acronym
TBX_abbreviation:
_codicem: DC-331
codicem_TBX: abbreviation
TBX_formula: # Used on IATE
codicem_TBX: formula
TBX_shortForm:
_codicem: DC-332
codicem_TBX: shortForm
TBX_variant:
_codicem: DC-330
codicem_TBX: variant
TBX_phrase:
_codicem: DC-339
codicem_TBX: phrase
### Universal Terminology eXchange UTX 1.20 --------------------------------
# 4.4.2 sentence and special characters
# sentence is a special pos field item that indicates that the "term"
# is a sentence.
# Note: sentence should only be used when necessary. sentence would be
# used for a user interface message in the form of a sentence, for example.
# Entries of pairs of translated sentences should be stored in a translation
# memory format (such as TMX) rather than a glossary. When a UTX glossary is
# exported for an MT system that does not treat sentence as a type of
# part of speech, sentence entries can be treated as nouns.
# @deprecated rem_typum. Use terminum_typum
rem_typum: *ontologia_aliud_terminum_typum
ontologia_aliud_familiam:
lat:
TBX:
XLIFF:
UTX:
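The `_aliud` values of the `lat_rem_*` entries above list equivalent codes from other vocabularies, separated by `|`. As a minimal sketch, assuming the YAML has already been loaded into a plain dictionary, such a mapping could be resolved per family as below. The dictionary copies only a few entries by hand from the excerpt, and the function name `resolve_status` is hypothetical; this is not necessarily how hxltmcli resolves these values internally.

```python
# Subset of rem_statum, copied by hand from the YAML excerpt above.
REM_STATUM = {
    "lat_rem_finale": {
        "_aliud": "TBX_preferred|UTX_approved|XLIFF_final",
        "codicem_lat": "rem_finale",
    },
    "lat_rem_initiale": {
        "_aliud": "UTX_provisional|XLIFF_initial",
        "codicem_lat": "rem_initiale",
    },
    "lat_rem_vacuum": {"_aliud": "", "codicem_lat": "rem_vacuum"},
    "TBX_preferred": {"codicem_TBX": "preferred"},
    "UTX_approved": {"codicem_UTX": "approved"},
    "UTX_provisional": {"codicem_UTX": "provisional"},
    "XLIFF_final": {"codicem_XLIFF": "final"},
    "XLIFF_initial": {"codicem_XLIFF": "initial"},
}


def resolve_status(key, family, table=REM_STATUM):
    """Return the code of `key` in `family` (e.g. 'XLIFF'), or None.

    Hypothetical helper: follows the '|' separated `_aliud` list and
    picks the first alias that belongs to the requested family.
    """
    entry = table.get(key, {})
    for alias in entry.get("_aliud", "").split("|"):
        alias = alias.strip()
        # Aliases are prefixed with the family name, e.g. 'XLIFF_final'.
        if alias.startswith(family + "_") and alias in table:
            return table[alias].get("codicem_" + family)
    return None


print(resolve_status("lat_rem_finale", "XLIFF"))  # final
print(resolve_status("lat_rem_initiale", "TBX"))  # None
```

The second call returns `None` because, as the comment in the YAML notes, TBX has no value for an initial status.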
Data types
# Trivia:
# - ontologia, https://la.wikipedia.org/wiki/Ontologia
# - datum, https://en.wiktionary.org/wiki/datum#Latin
# - typum, https://en.wiktionary.org/wiki/typus#Latin
# - fōrmātum, https://en.wiktionary.org/wiki/formatus#Latin
# - normam, https://en.wiktionary.org/wiki/norma#Latin
# - digitum, https://en.wiktionary.org/wiki/digitus#Latin
# - digit, https://en.wiktionary.org/wiki/digit#English
# - 'Unicode Digit' (Unicode has several classes of numbers; digit is
# its name for 0 1 2 3 4 5 6 7 8 9)
# - textum, https://en.wiktionary.org/wiki/textus#Latin
ontologia_datum_typum:
formatum:
compactum_json:
_normam: https://www.json.org/
_typum: textum
compactum_textum:
_descriptionem: |
_[eng-Latn] Note: compactum_textum is different from textum. While textum
is free-form text, compactum_textum is meant to be used as a
human-editable compact form of values that could be stored in the equivalent
compactum_json.
Most of the time this means creating controlled constants, documented in
cor.hxltm.yml->ontologia_aliud, to represent other values. So
compactum_textum MUST be both editable by humans and
parseable by computers.
Also, when more than one constant is needed in compactum_textum, the
separator is the character "|".
Whitespace at the start and end of a term should be ignored.
Unknown values should be ignored. Errors should only stop processing
if the user asks for a stricter or debugging mode.
[eng-Latn]_
_typum: textum
numerum:
_exemplum:
- "123456789"
- "0123456789ABCDEF"
- "0123456789ABCDEF"
- "Ⅰ Ⅱ Ⅲ Ⅳ Ⅴ Ⅵ Ⅶ Ⅷ Ⅸ" # https://en.wikipedia.org/wiki/Numerals_in_Unicode#Roman_numerals
- "零 壹 貳 參 肆 伍 陸 柒 捌 玖" # https://en.wikipedia.org/wiki/Chinese_numerals
- "〇 一 二 三 四 五 六 七 八 九" # https://en.wikipedia.org/wiki/Chinese_numerals
# Several more examples at https://en.wikipedia.org/wiki/Numeral_system
_typum: textum
numerum_digitum:
_descriptionem: |
- https://en.wikipedia.org/wiki/Numerals_in_Unicode
_exemplum:
- "123456789"
_typum: numerum_digitum
textum:
_descriptionem: |
_[eng-Latn] Free text. Even line breaks are allowed.
The only hard requirement is a format that can be represented in CSV
(which means escaping characters).
DO NOT escape non-US-ASCII characters. This is annoying. Fix your
system to accept Unicode and, if necessary, only treat differently
control characters that are writing-system neutral.
[eng-Latn]_
typum:
numerum_digitum: {}
textum: {}
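The `compactum_textum` rules described above (separator `|`, whitespace around a term ignored, unknown values ignored unless a stricter mode is requested) can be sketched in Python as follows. The function name, the `strict` parameter and the choice to skip empty terms are assumptions made for illustration; the excerpt does not show how hxltmcli implements this.

```python
def parse_compactum_textum(value, known, strict=False):
    """Split a compactum_textum value into the constants found in `known`.

    Hypothetical helper illustrating the documented rules only.
    """
    result = []
    for raw in value.split("|"):
        term = raw.strip()  # whitespace around a term is ignored
        if not term:
            continue  # assumption: empty terms are skipped silently
        if term in known:
            result.append(term)
        elif strict:
            # Only the stricter/debugging mode stops on unknown values.
            raise ValueError(f"Unknown constant: {term!r}")
    return result


KNOWN = {"TBX_preferred", "UTX_approved", "XLIFF_final"}
print(parse_compactum_textum(" UTX_approved | XLIFF_final|typo ", KNOWN))
# ['UTX_approved', 'XLIFF_final']
```

With `strict=True` the same input would raise `ValueError` for `typo` instead of dropping it.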
hxltmcli --help
# hxltmcli can be installed with hdp-toolchain
# @see https://pypi.org/project/hdp-toolchain/
pip install hdp-toolchain[hxltm]
hxltmcli --help
usage: hxltmcli [-h] [--sheet [number]] [--selector [path]]
[--http-header header] [--remove-headers] [--strip-tags]
[--ignore-certs]
[--log debug|info|warning|error|critical|none]
[--fontem-linguam [fontem_linguam]]
[--objectivum-linguam [objectivum_linguam]]
[--agendum-linguam agendum_linguam]
[--non-agendum-linguam non_agendum_linguam]
[--auxilium-linguam auxilium_linguam]
[--fontem-normam [fontem_normam]]
[--tmeta-archivum [tmeta_archivum]]
[--objectivum-normam [objectivum_normam]]
[--objectivum-formulam OBJECTIVUM_FORMULAM]
[--objectivum-HXLTM] [--objectivum-TMX]
[--objectivum-TBX-Basim] [--objectivum-UTX] [--objectivum-XML]
[--objectivum-XLIFF] [--objectivum-XLIFF-obsoletum]
[--objectivum-CSV-3] [--objectivum-TSV-3]
[--objectivum-JSON-kv]
[--objectivum-formatum-speciale [objectivum_formatum_speciale]]
[--limitem-quantitatem [limitem_quantitatem]]
[--limitem-initiale-lineam [limitem_initiale_lineam]]
[--non-securum-limitem]
[--selectum-columnam-numerum columnam_numerum]
[--non-selectum-columnam-numerum non_columnam_numerum]
[--crudum-objectivum-caput [fon_hxlattrs]]
[--crudum-fontem-linguam-hxlattrs [fon_hxlattrs]]
[--crudum-fontem-linguam-bcp47 [fon_bcp47]]
[--crudum-objectivum-linguam-hxlattrs [obj_hxlattrs]]
[--crudum-objectivum-linguam-bcp47 [obj_bcp47]]
[--archivum-configurationem]
[--archivum-configurationem-appendicem] [--silentium]
[--expertum-HXLTM-ASA [hxltm_asa]]
[--expertum-HXLTM-ASA-verbosum] [--experimentum-est]
[--venandum-insectum-est]
[infile] [outfile]
_[eng-Latn] hxltmcli v0.8.7 is an implementation of HXLTM tagging conventions
on HXL to manage and export tabular data to popular translation memories
and glossaries file formats with non-close standards.
[eng-Latn]_"
positional arguments:
infile HXL file to read (if omitted, use standard input).
outfile HXL file to write (if omitted, use standard output).
optional arguments:
-h, --help show this help message and exit
--sheet [number] Select sheet from a workbook (1 is first sheet)
--selector [path] JSONPath expression for starting point in JSON input
--http-header header Custom HTTP header to send with request
--remove-headers Strip text headers from the CSV output
--strip-tags Strip HXL tags from the CSV output
--ignore-certs Don't verify SSL connections (useful for self-signed)
--log debug|info|warning|error|critical|none
Set minimum logging level
--fontem-linguam [fontem_linguam], -FL [fontem_linguam]
(For bilingual operations) Source natural language
(use if not auto-detected). Must be like {ISO
639-3}-{ISO 15924}. Example: lat-Latn. Accept a single
value.
--objectivum-linguam [objectivum_linguam], -OL [objectivum_linguam]
(For bilingual and monolingual operations) Target
natural language (use if not auto-detected). Must be
like {ISO 639-3}-{ISO 15924}. Example: arb-Arab.
Requires: mono or bilingual operation. Accept a single
value.
--agendum-linguam agendum_linguam, -AL agendum_linguam
(Planned, but not fully implemented yet) Restrict
working languages to a list. Useful for HXLTM to HXLTM
or multilingual formats like TBX and TMX. Requires:
multilingual operation. Accepts multiple values.
--non-agendum-linguam non_agendum_linguam, -non-AL non_agendum_linguam
(Planned, but not implemented yet) Inverse of
--agendum-linguam. Document one or more languages that
should be ignored if they exist. Requires:
multilingual operation. Accept multiple values.
--auxilium-linguam auxilium_linguam, -AUXL auxilium_linguam
(Planned, but not implemented yet) Define auxiliary
language. Requires: bilingual operation (and file
format allow metadata). Default: Esperanto and
Interlingua Accepts multiple values.
--fontem-normam [fontem_normam]
(For data exchange) Source of data convention
Recommended convention: use "{UN M49}_{P-Code}" when
endorsed by regional government, and reverse domain
name notation with "_" for other cases. Examples:
076_BR (Brazil, adm0, Federal level); 076_BR33
(Brazil, adm1, Minas Gerais State, uses PCode);
076_BR3106200 (Brazil, adm2, Belo Horizonte city, uses
PCode).
--tmeta-archivum [tmeta_archivum]
(Draft, not fully implemented) Optional YAML metadata
for advanced processing operations.
--objectivum-normam [objectivum_normam]
(For data exchange) Target of data convention
Recommended convention: use "{UN M49}_{P-Code}" when
endorsed by regional government, and reverse domain
name notation with "_" for other cases. Example:
org_hxlstandard
--objectivum-formulam OBJECTIVUM_FORMULAM
Template file to use as reference to generate an
output. Less powerful than custom file but can be used
for simple cases.
--objectivum-HXLTM, --HXLTM
Save output as HXLTM (default). Multilingual output
format.
--objectivum-TMX, --TMX
Export to Translation Memory eXchange (TMX) v1.4b.
Multilingual output format
--objectivum-TBX-Basim, --TBX-Basim
(Working draft) Export to Term Base eXchange (TBX)
Basic Multilingual output format
--objectivum-UTX, --UTX
(Planned, but not implemented yet) Export to Universal
Terminology eXchange (UTX). Multilingual output format
--objectivum-XML Export to XML format. Multilingual output format
--objectivum-XLIFF, --XLIFF, --XLIFF2
Export to XLIFF (XML Localization Interchange File
Format) v2.1. (mono or bi-lingual support only as per
XLIFF specification)
--objectivum-XLIFF-obsoletum, --XLIFF-obsoletum, --XLIFF1
(Not implemented) Export to XLIFF (XML Localization
Interchange File Format) v1.2, an obsolete format for
lazy developers who don't implemented XLIFF 2
(released in 2014) yet.
--objectivum-CSV-3, --CSV-3
(Not implemented yet) Export to Bilingual CSV with
BCP47 headers (source to target) plus comments on last
column Bilingual operation.
--objectivum-TSV-3, --TSV-3
(Not implemented yet) Export to Bilingual TAB with
BCP47 headers (source to target) plus comments on last
column Bilingual operation.
--objectivum-JSON-kv, --JSON-kv
(Not implemented yet) Export to Bilingual JSON. Keys
are ID (if available) or source natural language.
Values are target language. No comments are exported.
Monolingual/Bilingual
--objectivum-formatum-speciale [objectivum_formatum_speciale]
(Not fully implemented yet) In addition to use a
output format (like --objectivum-TMX) inform an
special additional key that customize the base format
(like normam.TMX) already existing on
ego.hxltm.yml/venditorem.hxltm.yml/cor.hxltm.yml.
Example: "hxltmcli fontem.hxl.csv objectivum.tmx
--objectivum-TMX --objectivum-formatum-speciale TMX-
de-marcus"
--limitem-quantitatem [limitem_quantitatem]
(Advanced, large data sets) Customize the limit of the
maximum number of raw rows can be in a single step.
Try increments of 1 million.Use value -1 to disable
limits (even if means exhaust all computer memory
require full restart). Defaults to 1048576 (but to
avoid non-expert humans or automated work flows
generate output with missing data without no one
reading the warning messages if the --limitem-
quantitatem was reached AND no customization was done
on --limitem-initiale-lineam an exception will abort
--limitem-initiale-lineam [limitem_initiale_lineam]
(Advanced, large data sets) When working in batches
and the initial row to process is not the first one
(starts from 0) use this option if is inviable
increase to simply --limitem-quantitatem
--non-securum-limitem, --ad-astra-per-aspera
(For situational/temporary usage, as in "one weekend"
NOT six months) Disable any secure hardware limits and
make the program try harder tolerate (even if means
ignore entire individual rows or columns) but still
work with what was left from the dataset. This option
assume is acceptable not try protect from exhaust all
memory or disk space when working with large data sets
and (even for smaller, but not well know from the
python or YAML ontologia) the current human user
evaluated that the data loss is either false positive
or tolerable until permanent fix.
--selectum-columnam-numerum columnam_numerum
(Advanced) Select only columns from source HXLTM
dataset by a list of index numbers (starts by zero).
As example: to select the first 3 columns use "0,1,2"
and NOT "1,2,3"
--non-selectum-columnam-numerum non_columnam_numerum
(Advanced) Exclude columns from source HXLTM dataset
by a list of index numbers (starts by zero). As
example: to ignore the first ("Excel A"), and fifth
("Excel: E") column:use "0,4" and not "1,5"
--crudum-objectivum-caput [fon_hxlattrs]
(Advanced override for tabular output, like CSV).
Explicit define first line of output (separed by ,)
Example: "la,ar,Annotationem"
--crudum-fontem-linguam-hxlattrs [fon_hxlattrs], --fon-hxlattrs [fon_hxlattrs]
(Advanced override for --fontem-linguam). Explicit HXL
Attributes for source language. Example:
"+i_la+i_lat+is_latn"
--crudum-fontem-linguam-bcp47 [fon_bcp47], --fon-bcp47 [fon_bcp47]
(Advanced override for --fontem-linguam). Explicit
IETF BCP 47 language tag for source language. Example:
"la"
--crudum-objectivum-linguam-hxlattrs [obj_hxlattrs], --obj-hxlattrs [obj_hxlattrs]
(Advanced override for --objectivum-linguam). Explicit
HXL Attributes for target language. Example:
"+i_ar+i_arb+is_arab"
--crudum-objectivum-linguam-bcp47 [obj_bcp47], --obj-bcp47 [obj_bcp47]
(Advanced override for --objectivum-linguam). Explicit
IETF BCP 47 language tag for target language. Example:
"ar"
--archivum-configurationem
Path to custom configuration file (The cor.hxltm.yml)
--archivum-configurationem-appendicem
(Not implemented yet)Path to custom configuration file
(The cor.hxltm.yml)
--silentium Silence warnings? Try to not generate any warning. May
generate invalid output
--expertum-HXLTM-ASA [hxltm_asa]
(Expert mode) Save an Abstract Syntax Tree in JSON
format to a file path. With --expertum-HXLTM-ASA-
verbosum output entire dataset data. File extensions
with .yml/.yaml = YAML output. Files extensions with
.json/.json5 = JSONs output. Default: JSON output.
Good for debugging.
--expertum-HXLTM-ASA-verbosum
(Expert mode) Enable --expertum-HXLTM-ASA verbose mode
--experimentum-est (Internal testing only) Enable undocumented feature
--venandum-insectum-est, --debug
Enable debug? Extra information for program debugging
Exemplōrum gratiā:
HXLTM (csv) -> Translation Memory eXchange format (TMX):
hxltmcli fontem.tm.hxl.csv objectivum.tmx --objectivum-TMX
HXLTM (xlsx; sheet 7) -> Translation Memory eXchange format (TMX):
hxltmcli fontem.xlsx objectivum.tmx --sheet 7 --objectivum-TMX
HXLTM (xlsx; sheet 7, Situs interretialis) -> HXLTM (csv):
hxltmcli https://example.org/fontem.xlsx --sheet 7 fontem.tm.hxl.csv
HXLTM (Google Docs) -> HXLTM (csv):
hxltmcli https://docs.google.com/spreadsheets/(...) fontem.tm.hxl.csv
HXLTM (Google Docs) -> Translation Memory eXchange format (TMX):
hxltmcli https://docs.google.com/spreadsheets/(...) objectivum.tmx --objectivum-TMX
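The help text for `--selectum-columnam-numerum` and `--non-selectum-columnam-numerum` stresses that column indexes start at zero (spreadsheet columns A and E are "0,4", not "1,5"). A small Python sketch for building that value from spreadsheet column labels is shown below; the helper names `column_index` and `column_list` are hypothetical and are not part of hxltmcli.

```python
def column_index(letters):
    """Convert a spreadsheet column label ('A', 'E', 'AA') to a 0-based index."""
    index = 0
    for char in letters.strip().upper():
        if not "A" <= char <= "Z":
            raise ValueError(f"Invalid column label: {letters!r}")
        # Column labels are a base-26 number with digits A=1 ... Z=26.
        index = index * 26 + (ord(char) - ord("A") + 1)
    if index == 0:
        raise ValueError("Empty column label")
    return index - 1


def column_list(labels):
    """Build the comma-separated list of 0-based indexes."""
    return ",".join(str(column_index(label)) for label in labels)


print(column_list(["A", "E"]))       # 0,4
print(column_list(["A", "B", "C"]))  # 0,1,2
```

The resulting string could then be passed on the command line, for example `--non-selectum-columnam-numerum "0,4"`.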
hxltmdexml --help
# hxltmdexml can be installed with hdp-toolchain
# @see https://pypi.org/project/hdp-toolchain/
pip install hdp-toolchain[hxltm]
hxltmdexml --help
usage: hxltmdexml [-h] [--agendum-linguam [agendum_linguam]]
[--fontem-linguam [fontem_linguam]]
[--objectivum-linguam [objectivum_linguam]]
[--archivum-configurationem] [--venandum-insectum-est]
[infile] [outfile]
_[eng-Latn]
hxltmdexml v0.7.0 is an (not feature-by-feature) conversor from some
XML formats to HXLTM tabular working file.
[eng-Latn]_"
positional arguments:
infile HXL file to read (if omitted, use standard input).
outfile HXL file to write (if omitted, use standard output).
optional arguments:
-h, --help show this help message and exit
--agendum-linguam [agendum_linguam], -AL [agendum_linguam]
List of working languages. Required for multilinguam
formats (like TBX and TBX) both to avoid scan the
source file and to be sure about HXL attributes of the
output format. Example I (Latin and Arabic): lat-
Latn@la,arb-Arab@ar . Example II (TBX IATE,
es,en,fr,la,pt,mul): spa-Latn@es,eng-Latn@en,fra-
Latn@fr,lat-Latn@la,por-Latn@pt,mul-Zyyy
--fontem-linguam [fontem_linguam], -FL [fontem_linguam]
--objectivum-linguam [objectivum_linguam], -OL [objectivum_linguam]
--archivum-configurationem
Path to custom configuration file (The cor.hxltm.yml)
--venandum-insectum-est, --debug
Enable debug? Extra information for program debugging
Exemplōrum gratiā:
XML Localization Interchange File Format (XLIFF) v2.1+: -> HXLTM (bilinguam):
hxltmdexml fontem.xlf objectivum.tm.hxl.csv
XML Localization Interchange File Format (XLIFF) v1.2: -> HXLTM (bilinguam):
hxltmdexml fontem.xlf objectivum.tm.hxl.csv
Translation Memory eXchange format (TMX): -> HXLTM:
hxltmdexml fontem.tmx objectivum.tm.hxl.csv
TBX-Basic: TermBase eXchange (TBX) Basic: -> HXLTM:
hxltmdexml fontem.tbx objectivum.tm.hxl.csv
TBX-IATE (id est, https://iate.europa.eu/download-iate) -> HXLTM (por-Latn@pt)
zcat IATE_download.zip | hxltmdexml --agendum-linguam por-Latn@pt
cat IATE_export.tbx | hxltmdexml --agendum-linguam por-Latn@pt
TBX-IATE (id est, https://iate.europa.eu/download-iate) -> HXLTM (...)
hxltmdexml IATE_export.tbx IATE_export.hxltm.csv \
--agendum-linguam bul-Latn@bg \
--agendum-linguam ces-Latn@cs \
--agendum-linguam dan-Latn@da \
--agendum-linguam dut-Latn@nl \
--agendum-linguam ell-Latn@el \
--agendum-linguam eng-Latn@en \
--agendum-linguam est-Latn@et \
--agendum-linguam fin-Latn@fi \
--agendum-linguam fra-Latn@fr \
--agendum-linguam ger-Latn@de \
--agendum-linguam ger-Latn@de \
--agendum-linguam gle-Latn@ga \
--agendum-linguam hun-Latn@hu \
--agendum-linguam ita-Latn@it \
--agendum-linguam lav-Latn@lv \
--agendum-linguam lit-Latn@lt \
--agendum-linguam mlt-Latn@mt \
--agendum-linguam pol-Latn@pl \
--agendum-linguam por-Latn@pt \
--agendum-linguam ron-Latn@ro \
--agendum-linguam slk-Latn@sk \
--agendum-linguam slv-Latn@sl \
--agendum-linguam spa-Latn@es \
--agendum-linguam swe-Latn@sv
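The `--agendum-linguam` examples above use comma-separated items such as `lat-Latn@la` or `mul-Zyyy`. A minimal Python sketch for splitting such a value into its parts follows. It assumes the part before `@` is `{ISO 639-3}-{ISO 15924}` (as the hxltmcli help states for language options) and that the optional part after `@` is a BCP 47 tag, which is inferred from the examples and not confirmed by the help text. The names `Lingua` and `parse_agendum_linguam` are hypothetical.

```python
from typing import NamedTuple, Optional


class Lingua(NamedTuple):
    iso6393: str          # e.g. 'lat'
    iso15924: str         # e.g. 'Latn'
    bcp47: Optional[str]  # e.g. 'la'; None when no '@' part is given


def parse_agendum_linguam(value):
    """Parse 'lat-Latn@la,arb-Arab@ar' into a list of Lingua tuples."""
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        # Split off the optional '@' suffix, then the language-script pair.
        code, _, bcp47 = item.partition("@")
        iso6393, sep, iso15924 = code.partition("-")
        if not sep or not iso6393 or not iso15924:
            raise ValueError(f"Expected {{ISO 639-3}}-{{ISO 15924}}: {item!r}")
        result.append(Lingua(iso6393, iso15924, bcp47 or None))
    return result


for lingua in parse_agendum_linguam("lat-Latn@la,arb-Arab@ar,mul-Zyyy"):
    print(lingua.iso6393, lingua.iso15924, lingua.bcp47)
```

Validating the codes against the actual ISO registries is left out of this sketch.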
License
EticaAI has dedicated the work to the public domain by waiving all of its rights to the work worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law. You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.