Coverage for src/metador_core/container/__init__.py: 100%
4 statements
coverage.py v7.3.2, created at 2023-11-02 09:33 +0000
"""Metador interface to manage metadata in HDF5 containers.

Works with plain h5py.File and IH5Record subclasses and can be extended to work
with any type of archive providing the required functions in a h5py-like interface.

When assembling a container, compliance with the Metador container specification is
ensured by using it through the MetadorContainer interface.

Technical Metador container specification (not required for users):

Metador uses only HDF5 Groups and Datasets. We call both kinds of objects Nodes.
Note that HardLinks, SymLinks, ExternalLinks or region references cannot be used.

Users are free to lay out data in the container as they please, with one exception:
a user-defined Node MUST NOT have a name starting with "metador_".
"metador_" is a reserved prefix for Group and Dataset names used to manage
technical bookkeeping structures that are needed for providing all container features.
For each HDF5 Group or Dataset there MAY exist a corresponding
Group for Metador-compatible metadata that is prefixed with "metador_meta_".

For "/foo/bar" the metadata is to be found...
    ...in a group "/foo/metador_meta_bar", if "/foo/bar" is a dataset,
    ...in a group "/foo/bar/metador_meta_" if it is a group.
We write meta("/foo/bar") to denote that group.
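
The mapping from a node path to its metadata group can be sketched as a pure string
function. The name `meta_path` and the `is_dataset` flag are assumptions made for
illustration. A real implementation would presumably inspect the node type itself.

```python
META_PREFIX = "metador_meta_"


def meta_path(node_path: str, is_dataset: bool) -> str:
    # Hypothetical helper computing meta(node_path) as described above.
    node_path = node_path.rstrip("/")
    if is_dataset:
        # Dataset: sibling group named "metador_meta_<dataset name>".
        parent, _, name = node_path.rpartition("/")
        return f"{parent}/{META_PREFIX}{name}"
    # Group: child group named "metador_meta_".
    return f"{node_path}/{META_PREFIX}"
```

For "/foo/bar" this yields "/foo/metador_meta_bar" when `is_dataset` is true
and "/foo/bar/metador_meta_" otherwise.
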
Given schemas with entrypoint names X, Y and Z such that X is the parent schema of Y,
and Y is the parent schema of Z and a node "/foo/bar" annotated by a JSON object of
type Z, that JSON object MUST be stored as a newline-terminated, utf-8 encoded byte
sequence at the path meta("/foo/bar")/X/Y/Z/=UUID, where the UUID is unique in the
container.
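
As a sketch of this storage rule, the following hypothetical helpers build the path
of a metadata object and encode its JSON payload. Both function names are
illustrative. The code also assumes that "=UUID" means a literal "=" followed by the
usual textual UUID form, and that `json.dumps` output is an acceptable serialization.

```python
import json
import uuid


def metadata_object_path(meta_group, schema_chain, obj_uuid):
    # meta_group: result of meta(...), e.g. "/foo/metador_meta_bar"
    # schema_chain: entrypoint names from root schema to the actual type
    return "/".join([meta_group.rstrip("/"), *schema_chain, f"={obj_uuid}"])


def encode_metadata(obj):
    # Newline-terminated, UTF-8 encoded JSON byte sequence.
    return (json.dumps(obj) + "\n").encode("utf-8")


# Example: an object of type Z (child of Y, child of X) with a fresh UUID.
example_path = metadata_object_path(
    "/foo/metador_meta_bar", ["X", "Y", "Z"], uuid.uuid4()
)
```

Using `uuid.uuid4()` makes collisions within one container extremely unlikely, but
the specification only requires uniqueness and does not prescribe how the UUID is
generated.
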
For metadata attached to an object we expect the following to hold:

Node instance uniqueness:
Each schema MAY be instantiated explicitly for each node at most ONCE.
Collections thus must be represented on schema-level whenever needed.

Parent Validity:
Any object of a subschema MUST also be a valid instance of all its parent schemas.
The schema developers are responsible for ensuring this by correct implementation
of subschemas.

Parent Consistency:
Any objects of a subtype of schema X that are stored at the same node SHOULD result
in the same object when parsed as X (they agree on the "common" information).
Thus, any child object can be used to retrieve the same parent view on the data.
The container creator is responsible for ensuring this property. In case it is not
fulfilled, retrieving data for a more abstract type will yield it from ANY present subtype
instance (but always the same one, as long as the container does not change)!

If at least one metadata object is stored, a container MUST have a "/metador_toc" Group,
containing a lookup index of all metadata objects following a registered metadata schema.
This index structure MUST be in sync with the node metadata annotations.
Keeping this structure in sync is the responsibility of the container interface.

This means (using the previous example) that for "/foo/bar" annotated by Z there also
exists a dataset "/metador_toc/X/Y/Z/=UUID" containing the full path to the metadata node,
i.e. "meta(/foo/bar)/X/Y/Z/=UUID". Conversely, there must not be any empty entry-point
named Groups, and all listed paths in the TOC must point to an existing node.
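
The relation between TOC entries and metadata objects can be sketched without HDF5
by modelling the TOC as a plain dict. The names `toc_entry_path` and `toc_in_sync`
are hypothetical, and the check is deliberately partial: it does not look for empty
entry-point named Groups.

```python
TOC_ROOT = "/metador_toc"


def toc_entry_path(schema_chain, obj_uuid):
    # Path of the TOC dataset for an object with the given schema chain.
    return "/".join([TOC_ROOT, *schema_chain, f"={obj_uuid}"])


def toc_in_sync(toc, metadata_objects):
    # toc: dict mapping TOC dataset path -> stored full path of the object
    # metadata_objects: set of paths of all metadata objects in the container
    for entry, target in toc.items():
        suffix = entry[len(TOC_ROOT):]  # e.g. "/X/Y/Z/=UUID"
        if not entry.startswith(TOC_ROOT + "/") or not target.endswith(suffix):
            return False
    # Every listed path must exist and every object must be listed.
    return set(toc.values()) == set(metadata_objects)
```
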
A valid container MUST contain a dataset "/metador_version" holding a string of the form "X.Y".

A correctly implemented library supporting an older minor version MUST be able to open a
container with increased minor version without problems (by ignoring unknown data),
so for a minor update of this specification only new entities may be defined.
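
A compatibility check following this rule might look like the sketch below. It
assumes that X and Y are non-negative integers (major and minor version) and that
containers with a different major version are not supported. Neither assumption is
stated explicitly above, and both function names are hypothetical.

```python
def parse_version(text):
    # Split a version string of the form "X.Y" into (major, minor).
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def can_open(library_version, container_version):
    # A newer minor version of the container is acceptable, because a
    # minor update may only define new entities, which can be ignored.
    lib_major, _ = parse_version(library_version)
    container_major, _ = parse_version(container_version)
    return lib_major == container_major
```

Under these assumptions, a library implementing "1.0" can open a "1.3" container
but rejects a "2.0" container.
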
Known technical limitations:

Due to the fact that versioning of plugins such as schemas is coupled to the versioning
of the respective Python packages, it is not (directly) possible to use two different
versions of the same schema in the same environment (with the exception of mappings, as
they may bring their own equivalent schema classes).

Minor version updates of packages providing schemas must ensure that the classes providing
schemas are backward-compatible (i.e. can parse instances of older minor versions).

Major version updates must also provide mappings migrating between the old and new schema
versions. If the schema did not change, the mapping is simply the identity.
"""

from .interface import MetadorContainerTOC, MetadorMeta
from .provider import ContainerProxy
from .wrappers import MetadorContainer, MetadorDataset, MetadorGroup, MetadorNode

__all__ = [
    "MetadorContainer",
    "MetadorNode",
    "MetadorGroup",
    "MetadorDataset",
    "MetadorMeta",
    "MetadorContainerTOC",
    "ContainerProxy",
]