HXL-Data-Science-file-formats

The heart of the HXLm, HDP and HDPLisp ontologies

Protip: if you cannot make a local cache of the GitHub repository or install the hdp-toolchain Python package from PyPI, https://hdp.etica.ai/ontologia/ is a public endpoint.
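As a rough illustration of the protip above, the sketch below prefers a local copy of an ontology file and only falls back to the public endpoint when the file is missing. The helper names (`ontology_url`, `load_ontology_json`) are hypothetical, and it is an assumption that the endpoint mirrors the repository layout (for example `json/core.vkg.json`).

```python
import json
import urllib.request
from pathlib import Path

# Assumption: the public endpoint mirrors the layout of the ontologia/ folder.
PUBLIC_BASE = "https://hdp.etica.ai/ontologia/"


def ontology_url(relative_path: str, base: str = PUBLIC_BASE) -> str:
    """Build the URL of one ontology file, e.g. 'json/core.vkg.json'."""
    return base.rstrip("/") + "/" + relative_path.lstrip("/")


def load_ontology_json(relative_path: str, local_dir: str = "ontologia"):
    """Load a JSON ontology file from a local copy, else from the endpoint."""
    local_file = Path(local_dir) / relative_path
    if local_file.is_file():
        # Local cache (or repository checkout) is available: no network needed.
        return json.loads(local_file.read_text(encoding="utf-8"))
    # Hypothetical fallback: fetch the same relative path from the endpoint.
    with urllib.request.urlopen(ontology_url(relative_path), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `load_ontology_json("json/core.vkg.json")` from the repository root would read the local file when it exists.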

When feasible, the internal parts of hxlm.core that deal with ontology are stored in this folder, even if that makes the initial implementation harder or a bit less efficient than dedicated “advanced” strategies built on state-of-the-art tools.

This strategy is likely to make it easier for non-developers, such as individuals interested in adding new languages or proposing corrections, to update the internals.



Knowledge Graph and JSON Schemas

Knowledge graph on Wikipedia

Note: the contents of ontologia/json/ are generated from the ontologia/*.yml files, with the exception of ontologia/hdp.json-schema.json, which is not yet automated.

Localization Knowledge Graph

core.lkg.yml

json/core.lkg.json

# Generate ontologia/json/core.lkg.json
yq < ontologia/core.lkg.yml > ontologia/json/core.lkg.json

Vocabulary Knowledge Graph

core.vkg.yml

json/core.vkg.json

# Generate ontologia/json/core.vkg.json
yq < ontologia/core.vkg.yml > ontologia/json/core.vkg.json
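Because the files under ontologia/json/ are generated from the *.yml sources, it can help to check whether any generated file is missing or out of date before relying on it. The following is a minimal sketch using only the standard library; the function name `stale_json_files` is hypothetical, and comparing modification times is an assumption that may not hold right after a fresh clone, where timestamps reflect checkout time.

```python
from pathlib import Path


def stale_json_files(ontologia_dir: str = "ontologia"):
    """Return names of *.yml sources whose json/<name>.json is missing or older."""
    root = Path(ontologia_dir)
    stale = []
    for source in sorted(root.glob("*.yml")):
        # core.lkg.yml -> json/core.lkg.json
        target = root / "json" / (source.stem + ".json")
        if not target.is_file():
            stale.append(source.name)
        elif target.stat().st_mtime < source.stat().st_mtime:
            stale.append(source.name)
    return stale
```

Any name returned by `stale_json_files()` is a candidate for re-running the corresponding yq command.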

JSON Schema

Latin

Other natural languages

TODO: explain more about it (Emerson Rocha, 2021-03 09:46 UTC)

HXLTM

Exchange Codes and terms

Prebuild tables

Common Operational Datasets

Gender/Sex codes

HXL

Human Anatomy

Language codes

TODO: also add the macrolanguage mapping https://iso639-3.sil.org/sites/iso639-3/files/downloads/iso-639-3-macrolanguages.tab

Location codes

Location codes at adm0
Location at adm1, adm2, adm3, adm4, adm5

TODO: we should also explain how to obtain these without using HDPLisp (Emerson Rocha, 2021-04-13 22:28 UTC)

Numbers (draft)

Writing system codes

ISO

The ontologia/iso folder contains symlinks to generated resources that are already HXLated.

ISO 639-3

ISO 3166

TODO: work around how to get at least some subdivisions (Emerson Rocha, 2021-04-13 23:55 UTC)

ISO 3166 country/territory codes

ISO 15924

URN resolver

The URN:DATA specification (early draft)

Default values for the urnresolver

When the urnresolver command-line utility is not given a user-customized file, this is the file it loads.
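The fallback described above can be pictured with a small sketch. This is not the actual urnresolver code: the function name `choose_resolver_file` and both file arguments are hypothetical, and the real utility may look for user files in other ways.

```python
from pathlib import Path
from typing import Optional


def choose_resolver_file(user_file: Optional[str], default_file: str) -> Path:
    """Pick the user-customized file when it exists, else the bundled default."""
    if user_file and Path(user_file).is_file():
        return Path(user_file)
    # No usable user file: fall back to the default values shipped here.
    return Path(default_file)
```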

Platform dependent ontologies

Python Data classes

Protip: even if you are not a Python programmer but are debugging some HXLm implementation (or want to understand more about how the objects are related), this folder can help you. This also means that advanced users who know other programming languages but not Python can still give feedback by focusing on this folder.

While not as portable, the contents of this folder use a specialized kind of Python class called dataclasses. The non-buzzword meaning is that the code in this folder is (or should be) more a representation of how data objects are manipulated than classes that actually change behavior. In theory, they should be simpler to port to other programming languages.
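To show what "a representation of data rather than behavior" looks like, here is a minimal sketch with Python's standard `dataclasses` module. The class names, field names, and the example URN are all hypothetical and are not taken from hxlm.core.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ExampleResource:
    """Hypothetical data object; not an actual hxlm.core class."""
    urn: str
    languages: List[str] = field(default_factory=list)


@dataclass
class ExampleCollection:
    """Hypothetical container showing how data objects relate to each other."""
    name: str
    resources: List[ExampleResource] = field(default_factory=list)


collection = ExampleCollection(
    name="demo",
    resources=[ExampleResource(urn="urn:data:example", languages=["lat", "por"])],
)

# Dataclasses convert to plain dictionaries, which map directly to JSON/YAML
# and are straightforward to mirror in other programming languages.
as_plain_dict = asdict(collection)
```

Since the classes hold only typed fields, a reader who does not know Python can still follow which object contains which.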

Other programming languages

Note: at the moment (2021-04-01) there is no interest in implementing the non-portable underlying classes (at least from HXLm.core.HPD) in other languages, as is done for the HXL Standard (see https://github.com/HXLStandard and https://hxlstandard.org/developer-documentation/).

Recommendation: if interest in porting some features arises over the years, one good approach would be to use the parts of the ontologies that are platform independent. Different implementations would also take different approaches, and minimum viable products could be built faster without needing a more object-oriented approach.

To Do’s

While the idea behind the hxlm.core project is to output production-ready toolchains (and, to make localization easier, the number of core keywords and the ways they are used are kept as minimal as possible), other works already exist, like the Perl module Lingua::Romana::Perligata, that can at least help with the usage of internal terms in Latin.