container

Metador interface to manage metadata in HDF5 containers.

Works with plain h5py.File and IH5Record subclasses and can be extended to work with any type of archive providing the required functions in a h5py-like interface.

When assembling a container, compliance with the Metador container specification is ensured by accessing it through the MetadorContainer interface.

Technical Metador container specification (not required for users):

Metador uses only HDF5 Groups and Datasets. We call both kinds of objects Nodes. Notice that HardLinks, SymLinks, ExternalLinks or region references cannot be used.

Users are free to lay out data in the container as they please, with one exception: a user-defined Node MUST NOT have a name starting with "metador_". "metador_" is a reserved prefix for Group and Dataset names used to manage technical bookkeeping structures that are needed for providing all container features.

For each HDF5 Group or Dataset there MAY exist a corresponding Group for Metador-compatible metadata that is prefixed with "metador_meta_".

For "/foo/bar" the metadata is to be found in a group "/foo/metador_meta_bar" if "/foo/bar" is a dataset, or in a group "/foo/bar/metador_meta_" if it is a group. We write meta("/foo/bar") to denote that group.

Given schemas with entry-point names X, Y and Z, such that X is the parent schema of Y and Y is the parent schema of Z, and a node "/foo/bar" annotated by a JSON object of type Z, that JSON object MUST be stored as a newline-terminated, UTF-8 encoded byte sequence at the path meta("/foo/bar")/X/Y/Z/=UUID, where the UUID is unique within the container.
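As an illustrative sketch (not the actual library API), the path rules above can be expressed in a few lines; the helper names meta_base_path and metadata_object_path are assumptions modeled directly on the specification:

```python
def meta_base_path(node_path: str, is_dataset: bool) -> str:
    # meta("/foo/bar") as defined above: a sibling group for datasets,
    # a child group "metador_meta_" for groups
    if is_dataset:
        parent, _, name = node_path.rpartition("/")
        return f"{parent}/metador_meta_{name}"
    return node_path.rstrip("/") + "/metador_meta_"


def metadata_object_path(
    node_path: str, is_dataset: bool, schema_chain: list, obj_uuid: str
) -> str:
    # full storage path meta(node)/X/Y/Z/=UUID for a schema chain [X, Y, Z]
    base = meta_base_path(node_path, is_dataset)
    return base + "/" + "/".join(schema_chain) + f"/={obj_uuid}"
```

For a dataset "/foo/bar" annotated by Z this yields "/foo/metador_meta_bar/X/Y/Z/=UUID", matching the rule above.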

For metadata attached to an object we expect the following to hold:

Node instance uniqueness: Each schema MAY be instantiated explicitly for each node at most ONCE. Collections must therefore be represented at the schema level whenever needed.

Parent Validity: Any object of a subschema MUST also be a valid instance of all its parent schemas. Schema developers are responsible for ensuring this through correct implementation of subschemas.

Parent Consistency: Any objects of subtypes of a schema X that are stored at the same node SHOULD result in the same object when parsed as X (i.e., they agree on the "common" information). Thus, any child object can be used to retrieve the same parent view of the data. The container creator is responsible for ensuring this property. If it is not fulfilled, retrieving data for a more abstract type will yield it from an arbitrary present subtype instance (but always the same one, as long as the container does not change)!
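The parent consistency rule can be pictured with plain dicts (a simplification; real instances are schema objects): "parsing as the parent X" is modeled here as projecting onto X's fields, so two child objects satisfy the rule exactly when their projections coincide:

```python
def parse_as_parent(child_obj: dict, parent_fields: set) -> dict:
    # "parsing a child instance as schema X" modeled as keeping only X's fields
    return {k: v for k, v in child_obj.items() if k in parent_fields}


# two different subtype instances stored at the same node...
y_obj = {"title": "my data", "y_extra": 1}
z_obj = {"title": "my data", "z_extra": 2}

# ...are consistent, because they agree on the common (parent) information:
x_fields = {"title"}
consistent = parse_as_parent(y_obj, x_fields) == parse_as_parent(z_obj, x_fields)
```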

If at least one metadata object is stored, a container MUST have a "/metador_toc" Group containing a lookup index of all metadata objects that follow a registered metadata schema. This index structure MUST be in sync with the node metadata annotations. Keeping it in sync is the responsibility of the container interface.

This means (using the previous example) that for "/foo/bar" annotated by Z there also exists a dataset "/metador_toc/X/Y/Z/=UUID" containing the full path of the metadata node, i.e. "meta(/foo/bar)/X/Y/Z/=UUID". Conversely, there must not be any empty entry-point-named Groups, and every path listed in the TOC must point to an existing node.
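A minimal sketch of the TOC invariant, again with assumed helper names: each stored metadata object has exactly one mirror entry under "/metador_toc" whose content is the metadata node's path, and the two sides must coincide:

```python
def toc_entry_path(schema_chain: list, obj_uuid: str) -> str:
    # TOC entry mirroring meta(node)/X/Y/Z/=UUID
    return "/metador_toc/" + "/".join(schema_chain) + f"/={obj_uuid}"


def toc_in_sync(toc: dict, meta_node_paths: set) -> bool:
    # invariant: the paths stored in TOC entries and the actually
    # existing metadata nodes are exactly the same set
    return set(toc.values()) == meta_node_paths
```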

A valid container MUST contain a dataset "/metador_version" holding a string of the form "X.Y".

A correctly implemented library supporting an older minor version MUST be able to open a container with an increased minor version without problems (by ignoring unknown data); therefore, a minor update of this specification may only define new entities.
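The compatibility rule boils down to: same major version, any minor version. A sketch of the intended check (not the library's actual code, which additionally rejects unsupported major versions at construction time):

```python
def can_open(supported: list, container: list) -> bool:
    # a library supporting spec version X.Y can open any container with
    # version X.Y', since minor updates only add new entities that the
    # older library simply ignores; a different major version is rejected
    return supported[0] == container[0]
```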

Known technical limitations:

Because the versioning of plugins such as schemas is coupled to the versioning of the respective Python packages, it is not (directly) possible to use two different versions of the same schema in the same environment (with the exception of mappings, as they may bring their own equivalent schema classes).

Minor version updates of packages providing schemas must ensure that the classes providing schemas are backward-compatible (i.e. can parse instances of older minor versions).

Major version updates must also provide mappings that migrate between the old and new schema versions. If the schema did not change, the mapping is simply the identity.

MetadorContainerTOC

Interface to the Metador metadata index (table of contents) of a container.

Source code in src/metador_core/container/interface.py
class MetadorContainerTOC:
    """Interface to the Metador metadata index (table of contents) of a container."""

    def __init__(self, container: MetadorContainer):
        self._container = container
        self._raw = self._container.__wrapped__

        ver = self.spec_version if M.METADOR_VERSION_PATH in self._raw else None
        if ver:
            if ver >= [2]:
                msg = f"Unsupported Metador container version: {ver}"
                raise ValueError(msg)
        else:
            if self._container.acl[NodeAcl.read_only]:
                msg = "Container is read-only and does not look like a Metador container! "
                msg += "Please open in writable mode to initialize Metador structures!"
                raise ValueError(msg)

            # writable + no version = fresh (for metador), initialize it
            self._raw[M.METADOR_VERSION_PATH] = M.METADOR_SPEC_VERSION
            self._raw[M.METADOR_UUID_PATH] = str(uuid1())

        # if we're here, we have a prepared container TOC structure

        # proceed to initialize TOC
        self._driver_type: MetadorDriverEnum = get_driver_type(self._raw)

        self._packages = TOCPackages(self._raw)
        self._schemas = TOCSchemas(self._raw, self._packages)
        self._links = TOCLinks(self._raw, self._schemas)

    # ----

    @property
    def driver_type(self) -> MetadorDriverEnum:
        """Return the type of the container driver."""
        return self._driver_type

    @property
    def driver(self) -> Type[MetadorDriver]:
        """Return the container driver class used by the container."""
        return METADOR_DRIVERS[self.driver_type]

    @property
    def source(self) -> Any:
        """Return data underlying thes container (file, set of files, etc. used with the driver)."""
        return get_source(self._raw, self.driver_type)

    # ----

    @property
    def container_uuid(self) -> UUID:
        """Return UUID of the container."""
        uuid = self._raw[M.METADOR_UUID_PATH]
        uuid_ds = cast(H5DatasetLike, uuid)
        return UUID(uuid_ds[()].decode("utf-8"))

    @property
    def spec_version(self) -> List[int]:
        """Return Metador container specification version of the container."""
        ver = cast(H5DatasetLike, self._raw[M.METADOR_VERSION_PATH])
        return list(map(int, ver[()].decode("utf-8").split(".")))

    @property
    def schemas(self):
        """Information about all schemas used for metadata objects in this container."""
        return self._schemas

    def query(
        self,
        schema: Union[str, Type[S]],
        version: Optional[SemVerTuple] = None,
        *,
        node: Optional[MetadorNode] = None,
    ) -> Iterator[MetadorNode]:
        """Return nodes that contain a metadata object compatible with the given schema."""
        schema_name, schema_ver = plugin_args(schema, version)
        if not schema_name:  # could be e.g. empty string
            msg = "A schema name, plugin reference or class must be provided!"
            raise ValueError(msg)

        start_node: MetadorNode = node or self._container["/"]

        # check start node metadata explicitly
        if (schema_name, schema_ver) in start_node.meta:
            yield start_node

        if not isinstance(start_node, H5GroupLike):
            return  # the node is not group-like, cannot be traversed down

        # collect nodes below start node recursively
        # NOTE: yielding from the collect_nodes does not work :'(
        # so we have to actually materialize the list >.<
        # but we expose only the generator interface anyway (better design)
        # (maybe consider replacing visititems with a custom traversal here)
        ret: List[MetadorNode] = []

        def collect_nodes(_, node: MetadorNode):
            if (schema_name, schema_ver) in node.meta:
                ret.append(node)

        start_node.visititems(collect_nodes)
        yield from iter(ret)

driver_type property

driver_type: MetadorDriverEnum

Return the type of the container driver.

driver property

driver: Type[MetadorDriver]

Return the container driver class used by the container.

source property

source: Any

Return data underlying the container (file, set of files, etc. used with the driver).

container_uuid property

container_uuid: UUID

Return UUID of the container.

spec_version property

spec_version: List[int]

Return Metador container specification version of the container.

schemas property

schemas

Information about all schemas used for metadata objects in this container.

query

query(
    schema: Union[str, Type[S]],
    version: Optional[SemVerTuple] = None,
    *,
    node: Optional[MetadorNode] = None
) -> Iterator[MetadorNode]

Return nodes that contain a metadata object compatible with the given schema.

Source code in src/metador_core/container/interface.py
def query(
    self,
    schema: Union[str, Type[S]],
    version: Optional[SemVerTuple] = None,
    *,
    node: Optional[MetadorNode] = None,
) -> Iterator[MetadorNode]:
    """Return nodes that contain a metadata object compatible with the given schema."""
    schema_name, schema_ver = plugin_args(schema, version)
    if not schema_name:  # could be e.g. empty string
        msg = "A schema name, plugin reference or class must be provided!"
        raise ValueError(msg)

    start_node: MetadorNode = node or self._container["/"]

    # check start node metadata explicitly
    if (schema_name, schema_ver) in start_node.meta:
        yield start_node

    if not isinstance(start_node, H5GroupLike):
        return  # the node is not group-like, cannot be traversed down

    # collect nodes below start node recursively
    # NOTE: yielding from the collect_nodes does not work :'(
    # so we have to actually materialize the list >.<
    # but we expose only the generator interface anyway (better design)
    # (maybe consider replacing visititems with a custom traversal here)
    ret: List[MetadorNode] = []

    def collect_nodes(_, node: MetadorNode):
        if (schema_name, schema_ver) in node.meta:
            ret.append(node)

    start_node.visititems(collect_nodes)
    yield from iter(ret)
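The recursive part of query can be mimicked with nested dicts standing in for HDF5 groups (a self-contained sketch, not the library code): visititems-style callbacks cannot yield, which is why the implementation materializes a list before yielding from it.

```python
def query_nodes(root: dict, has_schema) -> list:
    """Collect paths of all descendant nodes matching a predicate."""
    found = []

    def walk(prefix: str, group: dict) -> None:
        # depth-first visit of all descendants, like h5py's visititems
        for name, node in group.items():
            path = f"{prefix}/{name}"
            if has_schema(node):
                found.append(path)
            if isinstance(node, dict):  # group-like -> recurse
                walk(path, node)

    walk("", root)
    return found


tree = {"foo": {"bar": "annotated", "baz": {"qux": "annotated"}}}
matches = query_nodes(tree, lambda n: n == "annotated")
```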

MetadorMeta

Interface to Metador metadata objects stored at a single HDF5 node.

Source code in src/metador_core/container/interface.py
class MetadorMeta:
    """Interface to Metador metadata objects stored at a single HDF5 node."""

    # helpers for __getitem__ and __setitem__

    @staticmethod
    def _require_schema(
        schema_name: str, schema_ver: Optional[SemVerTuple]
    ) -> Type[MetadataSchema]:
        """Return compatible installed schema class, if possible.

        Raises KeyError if no suitable schema was found.

        Raises TypeError if an auxiliary schema is requested.
        """
        schema_class = schemas._get_unsafe(
            schema_name, schema_ver
        )  # can raise KeyError
        if schema_class.Plugin.auxiliary:  # reject auxiliary schemas in container
            msg = f"Cannot attach instances of auxiliary schema '{schema_name}' to a node!"
            raise TypeError(msg)
        return schema_class

    @staticmethod
    def _parse_obj(
        schema: Type[S], obj: Union[str, bytes, Dict[str, Any], MetadataSchema]
    ) -> S:
        """Return original object if it is an instance of passed schema, or else parse it.

        Raises ValidationError if parsing fails.
        """
        if isinstance(obj, schema):
            return obj  # skip validation, already correct model!
        # try to convert/parse it:
        if isinstance(obj, (str, bytes)):
            return schema.parse_raw(obj)
        if isinstance(obj, MetadataSchema):
            return schema.parse_obj(obj.dict())
        else:  # dict
            return schema.parse_obj(obj)

    # raw getters and setters don't care about the environment,
    # they work only based on what objects are available and compatible
    # and do not perform validation etc.

    def _get_raw(
        self, schema_name: str, version: Optional[SemVerTuple] = None
    ) -> Optional[StoredMetadata]:
        """Return stored metadata for given schema at this node (or None).

        If a version is passed, the stored version must also be compatible.
        """
        # retrieve stored instance (if suitable)
        ret: Optional[StoredMetadata] = self._objs.get(schema_name)
        if not version:
            return ret  # no specified version -> anything goes
        # otherwise: only return if it is compatible
        req_ref: Optional[PluginRef] = None
        req_ref = schemas.PluginRef(name=schema_name, version=version)
        return ret if ret and req_ref.supports(ret.schema) else None

    def _set_raw(self, schema_ref: PluginRef, obj: MetadataSchema) -> None:
        """Store metadata object as instance of passed schema at this node."""
        # reserve UUID, construct dataset path and store metadata object
        obj_uuid = self._mc.metador._links.fresh_uuid()
        obj_path = f"{self._base_dir}/{_ep_name_for(schema_ref)}={str(obj_uuid)}"
        # store object
        self._mc.__wrapped__[obj_path] = bytes(obj)
        obj_node = self._mc.__wrapped__[obj_path]
        assert isinstance(obj_node, H5DatasetLike)
        stored_obj = StoredMetadata(uuid=obj_uuid, schema=schema_ref, node=obj_node)
        self._objs[schema_ref.name] = stored_obj
        # update TOC
        self._mc.metador._links.register(stored_obj)
        return

    def _del_raw(self, schema_name: str, *, _unlink: bool = True) -> None:
        """Delete stored metadata for given schema at this node."""
        # NOTE: _unlink is only for the destroy method
        stored_obj = self._objs[schema_name]
        # unregister in TOC (will also trigger clean up there)
        if _unlink:
            self._mc.metador._links.unregister(stored_obj.uuid)
        # remove metadata object
        del self._objs[stored_obj.schema.name]
        del self._mc.__wrapped__[stored_obj.node.name]
        # no metadata objects left -> remove metadata dir
        if not self._objs:
            del self._mc.__wrapped__[self._base_dir]
        return

    # helpers for container-level operations (move, copy, delete etc.)

    def _destroy(self, *, _unlink: bool = True):
        """Unregister and delete all metadata objects attached to this node."""
        # NOTE: _unlink is only set to false for node copy without metadata
        for schema_name in list(self.keys()):
            self._del_raw(schema_name, _unlink=_unlink)

    # ----

    def __init__(self, node: MetadorNode):
        self._mc: MetadorContainer = node._self_container
        """Underlying container (for convenience)."""

        self._node: MetadorNode = node
        """Underlying actual user node."""

        is_dataset = isinstance(node, H5DatasetLike)
        self._base_dir: str = M.to_meta_base_path(node.name, is_dataset)
        """Path of this metador metadata group node.

        Actual node exists iff any metadata is stored for the node.
        """

        self._objs: Dict[str, StoredMetadata] = {}
        """Information about available metadata objects."""

        # load available object metadata encoded in the node names
        meta_grp = cast(H5GroupLike, self._mc.__wrapped__.get(self._base_dir, {}))
        for obj_node in meta_grp.values():
            assert isinstance(obj_node, H5DatasetLike)
            obj = StoredMetadata.from_node(obj_node)
            self._objs[obj.schema.name] = obj

    # ----

    def keys(self) -> KeysView[str]:
        """Return names of explicitly attached metadata objects.

        Transitive parent schemas are not included.
        """
        return self._objs.keys()

    def values(self) -> ValuesView[StoredMetadata]:
        self._node._guard_acl(NodeAcl.skel_only)
        return self._objs.values()

    def items(self) -> ItemsView[str, StoredMetadata]:
        self._node._guard_acl(NodeAcl.skel_only)
        return self._objs.items()

    # ----

    def __len__(self) -> int:
        """Return number of explicitly attached metadata objects.

        Transitive parent schemas are not counted.
        """
        return len(self.keys())

    def __iter__(self) -> Iterator[str]:
        """Iterate listing schema names of all actually attached metadata objects.

        Transitive parent schemas are not included.
        """
        return iter(self.keys())

    # ----

    def query(
        self,
        schema: Union[
            str, Tuple[str, Optional[SemVerTuple]], PluginRef, Type[MetadataSchema]
        ] = "",
        version: Optional[SemVerTuple] = None,
    ) -> Iterator[PluginRef]:
        """Return schema names for which objects at this node are compatible with passed schema.

        Will also consider compatible child schema instances.

        Returned iterator will yield passed schema first, if an object is available.
        Apart from this, the order is not specified.
        """
        schema_name, schema_ver = plugin_args(schema, version)
        # no schema selected -> list everything
        if not schema_name:
            for obj in self.values():
                yield obj.schema
            return

        # try exact schema (in any compatible version, if version specified)
        if obj := self._get_raw(schema_name, schema_ver):
            yield obj.schema

        # next, try compatible child schemas of compatible versions of requested schema
        compat = set().union(
            *(
                self._mc.metador.schemas.children(ref)
                for ref in self._mc.metador.schemas.versions(schema_name, schema_ver)
            )
        )
        avail = {self._get_raw(s).schema for s in self.keys()}
        for s_ref in avail.intersection(compat):
            yield s_ref

    def __contains__(
        self,
        schema: Union[
            str, Tuple[str, Optional[SemVerTuple]], PluginRef, Type[MetadataSchema]
        ],
    ) -> bool:
        """Check whether a compatible metadata object for given schema exists.

        Will also consider compatible child schema instances.
        """
        if schema == "" or isinstance(schema, tuple) and schema[0] == "":
            return False  # empty query lists everything, here the logic is inverted!
        return next(self.query(schema), None) is not None

    @overload
    def __getitem__(self, schema: str) -> MetadataSchema:
        ...

    @overload
    def __getitem__(self, schema: Type[S]) -> S:
        ...

    def __getitem__(self, schema: Union[str, Type[S]]) -> Union[S, MetadataSchema]:
        """Like get, but will raise KeyError on failure."""
        if ret := self.get(schema):
            return ret
        raise KeyError(schema)

    @overload
    def get(
        self, schema: str, version: Optional[SemVerTuple] = None
    ) -> Optional[MetadataSchema]:
        ...

    @overload
    def get(
        self, schema: Type[S], version: Optional[SemVerTuple] = None
    ) -> Optional[S]:
        ...

    def get(
        self, schema: Union[str, Type[S]], version: Optional[SemVerTuple] = None
    ) -> Optional[Union[MetadataSchema, S]]:
        """Get a parsed metadata object matching the given schema (if it exists).

        Will also consider compatible child schema instances.
        """
        self._node._guard_acl(NodeAcl.skel_only)

        # normalize arguments
        schema_name, schema_ver = plugin_args(schema, version)

        # get a compatible schema instance that is available at this node
        compat_schema = next(self.query(schema_name, schema_ver), None)
        if not compat_schema:
            return None  # not found

        # get class of schema and parse object
        schema_class = self._require_schema(schema_name, schema_ver)
        if obj := self._get_raw(compat_schema.name, compat_schema.version):
            return cast(S, self._parse_obj(schema_class, obj.node[()]))
        return None

    def __setitem__(
        self, schema: Union[str, Type[S]], value: Union[Dict[str, Any], MetadataSchema]
    ) -> None:
        """Store metadata object as instance of given schema.

        Raises KeyError if passed schema is not installed in environment.

        Raises TypeError if passed schema is marked auxiliary.

        Raises ValueError if an object for the schema already exists.

        Raises ValidationError if passed object is not valid for the schema.
        """
        self._node._guard_acl(NodeAcl.read_only)
        schema_name, schema_ver = plugin_args(schema)

        # if self.get(schema_name, schema_ver):  # <- also subclass schemas
        # NOTE: for practical reasons let's be more lenient here and allow redundancy
        # hence only check if exact schema (modulo version) is already there
        if self._get_raw(schema_name):  # <- only same schema
            msg = f"Metadata object for schema {schema_name} already exists!"
            raise ValueError(msg)

        schema_class = self._require_schema(schema_name, schema_ver)
        checked_obj = self._parse_obj(schema_class, value)
        self._set_raw(schema_class.Plugin.ref(), checked_obj)

    def __delitem__(self, schema: Union[str, Type[MetadataSchema]]) -> None:
        """Delete metadata object explicitly stored for the passed schema.

        If a schema class is passed, its version is ignored,
        as each node may contain at most one explicit instance per schema.

        Raises KeyError if no metadata object for that schema exists.
        """
        self._node._guard_acl(NodeAcl.read_only)
        schema_name, _ = plugin_args(schema)

        if self._get_raw(schema_name) is None:
            raise KeyError(schema_name)  # no (explicit) metadata object

        self._del_raw(schema_name)

keys

keys() -> KeysView[str]

Return names of explicitly attached metadata objects.

Transitive parent schemas are not included.

Source code in src/metador_core/container/interface.py
def keys(self) -> KeysView[str]:
    """Return names of explicitly attached metadata objects.

    Transitive parent schemas are not included.
    """
    return self._objs.keys()

__len__

__len__() -> int

Return number of explicitly attached metadata objects.

Transitive parent schemas are not counted.

Source code in src/metador_core/container/interface.py
def __len__(self) -> int:
    """Return number of explicitly attached metadata objects.

    Transitive parent schemas are not counted.
    """
    return len(self.keys())

__iter__

__iter__() -> Iterator[str]

Iterate listing schema names of all actually attached metadata objects.

Transitive parent schemas are not included.

Source code in src/metador_core/container/interface.py
def __iter__(self) -> Iterator[str]:
    """Iterate listing schema names of all actually attached metadata objects.

    Transitive parent schemas are not included.
    """
    return iter(self.keys())

query

query(
    schema: Union[
        str,
        Tuple[str, Optional[SemVerTuple]],
        PluginRef,
        Type[MetadataSchema],
    ] = "",
    version: Optional[SemVerTuple] = None,
) -> Iterator[PluginRef]

Return schema names for which objects at this node are compatible with passed schema.

Will also consider compatible child schema instances.

Returned iterator will yield passed schema first, if an object is available. Apart from this, the order is not specified.

Source code in src/metador_core/container/interface.py
def query(
    self,
    schema: Union[
        str, Tuple[str, Optional[SemVerTuple]], PluginRef, Type[MetadataSchema]
    ] = "",
    version: Optional[SemVerTuple] = None,
) -> Iterator[PluginRef]:
    """Return schema names for which objects at this node are compatible with passed schema.

    Will also consider compatible child schema instances.

    Returned iterator will yield passed schema first, if an object is available.
    Apart from this, the order is not specified.
    """
    schema_name, schema_ver = plugin_args(schema, version)
    # no schema selected -> list everything
    if not schema_name:
        for obj in self.values():
            yield obj.schema
        return

    # try exact schema (in any compatible version, if version specified)
    if obj := self._get_raw(schema_name, schema_ver):
        yield obj.schema

    # next, try compatible child schemas of compatible versions of requested schema
    compat = set().union(
        *(
            self._mc.metador.schemas.children(ref)
            for ref in self._mc.metador.schemas.versions(schema_name, schema_ver)
        )
    )
    avail = {self._get_raw(s).schema for s in self.keys()}
    for s_ref in avail.intersection(compat):
        yield s_ref

__contains__

__contains__(
    schema: Union[
        str,
        Tuple[str, Optional[SemVerTuple]],
        PluginRef,
        Type[MetadataSchema],
    ]
) -> bool

Check whether a compatible metadata object for given schema exists.

Will also consider compatible child schema instances.

Source code in src/metador_core/container/interface.py
def __contains__(
    self,
    schema: Union[
        str, Tuple[str, Optional[SemVerTuple]], PluginRef, Type[MetadataSchema]
    ],
) -> bool:
    """Check whether a compatible metadata object for given schema exists.

    Will also consider compatible child schema instances.
    """
    if schema == "" or isinstance(schema, tuple) and schema[0] == "":
        return False  # empty query lists everything, here the logic is inverted!
    return next(self.query(schema), None) is not None

__getitem__

__getitem__(
    schema: Union[str, Type[S]]
) -> Union[S, MetadataSchema]

Like get, but will raise KeyError on failure.

Source code in src/metador_core/container/interface.py
def __getitem__(self, schema: Union[str, Type[S]]) -> Union[S, MetadataSchema]:
    """Like get, but will raise KeyError on failure."""
    if ret := self.get(schema):
        return ret
    raise KeyError(schema)

get

get(
    schema: Union[str, Type[S]],
    version: Optional[SemVerTuple] = None,
) -> Optional[Union[MetadataSchema, S]]

Get a parsed metadata object matching the given schema (if it exists).

Will also consider compatible child schema instances.

Source code in src/metador_core/container/interface.py
def get(
    self, schema: Union[str, Type[S]], version: Optional[SemVerTuple] = None
) -> Optional[Union[MetadataSchema, S]]:
    """Get a parsed metadata object matching the given schema (if it exists).

    Will also consider compatible child schema instances.
    """
    self._node._guard_acl(NodeAcl.skel_only)

    # normalize arguments
    schema_name, schema_ver = plugin_args(schema, version)

    # get a compatible schema instance that is available at this node
    compat_schema = next(self.query(schema_name, schema_ver), None)
    if not compat_schema:
        return None  # not found

    # get class of schema and parse object
    schema_class = self._require_schema(schema_name, schema_ver)
    if obj := self._get_raw(compat_schema.name, compat_schema.version):
        return cast(S, self._parse_obj(schema_class, obj.node[()]))
    return None
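The lookup strategy of get can be summarized with a plain-dict model (get_compat is a hypothetical helper, not part of the API): an instance of the exact schema is preferred, otherwise any stored instance of a compatible child schema is used.

```python
def get_compat(stored: dict, schema: str, children: dict):
    # an instance of the exact schema wins
    if schema in stored:
        return stored[schema]
    # otherwise any stored instance of a (transitive) child schema
    # is acceptable, since it is also a valid instance of the parent
    for child in children.get(schema, ()):
        if child in stored:
            return stored[child]
    return None
```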

__setitem__

__setitem__(
    schema: Union[str, Type[S]],
    value: Union[Dict[str, Any], MetadataSchema],
) -> None

Store metadata object as instance of given schema.

Raises KeyError if passed schema is not installed in environment.

Raises TypeError if passed schema is marked auxiliary.

Raises ValueError if an object for the schema already exists.

Raises ValidationError if passed object is not valid for the schema.

Source code in src/metador_core/container/interface.py
def __setitem__(
    self, schema: Union[str, Type[S]], value: Union[Dict[str, Any], MetadataSchema]
) -> None:
    """Store metadata object as instance of given schema.

    Raises KeyError if passed schema is not installed in environment.

    Raises TypeError if passed schema is marked auxiliary.

    Raises ValueError if an object for the schema already exists.

    Raises ValidationError if passed object is not valid for the schema.
    """
    self._node._guard_acl(NodeAcl.read_only)
    schema_name, schema_ver = plugin_args(schema)

    # if self.get(schema_name, schema_ver):  # <- also subclass schemas
    # NOTE: for practical reasons let's be more lenient here and allow redundancy
    # hence only check if exact schema (modulo version) is already there
    if self._get_raw(schema_name):  # <- only same schema
        msg = f"Metadata object for schema {schema_name} already exists!"
        raise ValueError(msg)

    schema_class = self._require_schema(schema_name, schema_ver)
    checked_obj = self._parse_obj(schema_class, value)
    self._set_raw(schema_class.Plugin.ref(), checked_obj)

__delitem__

__delitem__(
    schema: Union[str, Type[MetadataSchema]]
) -> None

Delete metadata object explicitly stored for the passed schema.

If a schema class is passed, its version is ignored, as each node may contain at most one explicit instance per schema.

Raises KeyError if no metadata object for that schema exists.

Source code in src/metador_core/container/interface.py
def __delitem__(self, schema: Union[str, Type[MetadataSchema]]) -> None:
    """Delete metadata object explicitly stored for the passed schema.

    If a schema class is passed, its version is ignored,
    as each node may contain at most one explicit instance per schema.

    Raises KeyError if no metadata object for that schema exists.
    """
    self._node._guard_acl(NodeAcl.read_only)
    schema_name, _ = plugin_args(schema)

    if self._get_raw(schema_name) is None:
        raise KeyError(schema_name)  # no (explicit) metadata object

    self._del_raw(schema_name)
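The set/delete semantics above (at most one explicit metadata object per schema, `ValueError` on duplicates, `KeyError` when deleting a missing object) can be sketched with a plain dict-backed store. This is only an illustration: `MetaStore` and the schema name `core.bib` are hypothetical stand-ins, and the real interface additionally parses, validates, and registers objects in the TOC.

```python
# Hypothetical sketch of the per-node metadata semantics: each schema name
# maps to at most ONE explicit metadata object.
class MetaStore:
    def __init__(self):
        self._objs = {}  # schema name -> metadata object

    def __setitem__(self, schema: str, obj: dict) -> None:
        # mirrors __setitem__: reject a second explicit instance per schema
        if schema in self._objs:
            raise ValueError(f"Metadata object for schema {schema} already exists!")
        self._objs[schema] = obj

    def __delitem__(self, schema: str) -> None:
        # mirrors __delitem__: KeyError if no explicit object is stored
        if schema not in self._objs:
            raise KeyError(schema)
        del self._objs[schema]

meta = MetaStore()
meta["core.bib"] = {"title": "example"}
try:
    meta["core.bib"] = {"title": "duplicate"}  # second instance is rejected
except ValueError:
    pass
del meta["core.bib"]
```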

ContainerProxy

Bases: Protocol[T]

Abstract interface for Metador container providers.

This interface acts like a proxy to access containers by some identifier.

The identifier type parameter T is in the simplest case the Metador container UUID. In more complex cases, it could be a different unique identifier with a non-trivial relationship to Metador container UUIDs (many-to-many). Therefore, T is implementation-specific.

There are many ways to store and organize containers; this interface serves as the implementation target for generic service components such as container-centric Flask blueprints, so that they can be reused more easily across different backends and services.

Note that only containment and retrieval are possible - on purpose. Knowing and iterating over all containers in a system is not always possible.

Source code in src/metador_core/container/provider.py
class ContainerProxy(Protocol[T]):
    """Abstract interface for Metador container providers.

    This interface acts like a proxy to access containers by some identifier.

    The identifier type parameter T is in the simplest case the Metador
    container UUID. In more complex cases, it could be a different unique
    identifier with a non-trivial relationship to Metador container UUIDs
    (many-to-many). Therefore, T is implementation-specific.

    There are many ways to store and organize containers; this interface serves
    as the implementation target for generic service components such as
    container-centric Flask blueprints, so that they can be reused more easily
    across different backends and services.

    Note that only containment and retrieval are possible - on purpose.
    Knowing and iterating over all containers in a system is not always possible.
    """

    def __contains__(self, key: T) -> bool:
        """Return whether a resource key is known to the proxy."""
        # return self.get(key) is not None

    def get(self, key: T) -> Optional[MetadorContainer]:
        """Get a container instance, if resource key is known to the proxy.

        Implement this method in subclasses to support the minimal interface.
        """

    def __getitem__(self, key: T) -> MetadorContainer:
        """Get a container instance, if resource key is known to the proxy.

        Default implementation is in terms of `get`.
        """
        if (ret := self.get(key)) is not None:
            return ret
        raise KeyError(key)
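To make the protocol concrete, here is a minimal in-memory provider satisfying it. This is a sketch under stated assumptions: `DictProxy` is hypothetical, it uses `str` keys and plain objects in place of real identifiers and `MetadorContainer` instances, and a real backend would resolve keys against actual storage.

```python
from typing import Dict, Optional, Protocol, TypeVar, runtime_checkable

T = TypeVar("T")

@runtime_checkable
class ContainerProxy(Protocol[T]):
    # trimmed copy of the protocol's minimal interface
    def __contains__(self, key: T) -> bool: ...
    def get(self, key: T) -> Optional[object]: ...

# Hypothetical in-memory provider; a real backend would return
# MetadorContainer instances instead of plain objects.
class DictProxy:
    def __init__(self, containers: Dict[str, object]):
        self._containers = dict(containers)

    def __contains__(self, key: str) -> bool:
        return self.get(key) is not None

    def get(self, key: str) -> Optional[object]:
        return self._containers.get(key)

    def __getitem__(self, key: str):
        # default __getitem__ in terms of get, as in the protocol
        if (ret := self.get(key)) is not None:
            return ret
        raise KeyError(key)

proxy = DictProxy({"c1": "container-one"})
assert isinstance(proxy, ContainerProxy)  # structurally satisfies the protocol
assert "c1" in proxy and "missing" not in proxy
assert proxy["c1"] == "container-one"
```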

__contains__

__contains__(key: T) -> bool

Return whether a resource key is known to the proxy.

Source code in src/metador_core/container/provider.py
def __contains__(self, key: T) -> bool:
    """Return whether a resource key is known to the proxy."""

get

get(key: T) -> Optional[MetadorContainer]

Get a container instance, if resource key is known to the proxy.

Implement this method in subclasses to support the minimal interface.

Source code in src/metador_core/container/provider.py
def get(self, key: T) -> Optional[MetadorContainer]:
    """Get a container instance, if resource key is known to the proxy.

    Implement this method in subclasses to support the minimal interface.
    """

__getitem__

__getitem__(key: T) -> MetadorContainer

Get a container instance, if resource key is known to the proxy.

Default implementation is in terms of get.

Source code in src/metador_core/container/provider.py
def __getitem__(self, key: T) -> MetadorContainer:
    """Get a container instance, if resource key is known to the proxy.

    Default implementation is in terms of `get`.
    """
    if (ret := self.get(key)) is not None:
        return ret
    raise KeyError(key)

MetadorContainer

Bases: MetadorGroup

Wrapper class adding Metador container interface to h5py.File-like objects.

The wrapper ensures that any actions done to IH5Records through this interface also work with plain h5py.Files.

There are no guarantees about behaviour with h5py methods not supported by IH5Records.

Given old: MetadorContainer, MetadorContainer(old.data_source, driver=old.data_driver) should be able to construct another object to access the same data (assuming it is not locked).

Source code in src/metador_core/container/wrappers.py
class MetadorContainer(MetadorGroup):
    """Wrapper class adding Metador container interface to h5py.File-like objects.

    The wrapper ensures that any actions done to IH5Records through this interface
    also work with plain h5py.Files.

    There are no guarantees about behaviour with h5py methods not supported by IH5Records.

    Given `old: MetadorContainer`, `MetadorContainer(old.data_source, driver=old.data_driver)`
    should be able to construct another object to access the same data (assuming it is not locked).
    """

    __wrapped__: H5FileLike

    _self_SUPPORTED: Set[str] = {"mode", "flush", "close"}

    # ---- new container-level interface ----

    _self_toc: MetadorContainerTOC

    @property
    def metador(self) -> MetadorContainerTOC:
        """Access interface to Metador metadata object index."""
        return self._self_toc

    def __init__(
        self,
        name_or_obj: Union[MetadorDriver, Any],
        mode: Optional[OpenMode] = "r",
        *,
        # NOTE: driver takes class instead of enum to also allow subclasses
        driver: Optional[Type[MetadorDriver]] = None,
    ):
        """Initialize a MetadorContainer instance from file(s) or a supported object.

        The `mode` argument is ignored when simply wrapping an object.

        If a data source such as a path is passed, will instantiate the object first,
        using the default H5File driver or the passed `driver` keyword argument.
        """
        # wrap the h5file-like object (will set self.__wrapped__)
        super().__init__(self, to_h5filelike(name_or_obj, mode, driver=driver))
        # initialize metador-specific stuff
        self._self_toc = MetadorContainerTOC(self)

    # not clear if we want these in the public interface. keep this private for now:

    # def _find_orphan_meta(self) -> List[str]:
    #     """Return list of paths to metadata that has no corresponding user node anymore."""
    #     ret: List[str] = []

    #     def collect_orphans(name: str):
    #         if M.is_meta_base_path(name):
    #             if M.to_data_node_path(name) not in self:
    #                 ret.append(name)

    #     self.__wrapped__.visit(collect_orphans)
    #     return ret

    # def _repair(self, remove_orphans: bool = False):
    #     """Repair container structure on best-effort basis.

    #     This will ensure that the TOC points to existing metadata objects
    #     and that all metadata objects are listed in the TOC.

    #     If remove_orphans is set, will erase metadata not belonging to an existing node.

    #     Notice that missing schema plugin dependency metadata cannot be restored.
    #     """
    #     if remove_orphans:
    #         for path in self._find_orphan_meta():
    #             del self.__wrapped__[path]
    #     self.toc._links.find_broken(repair=True)
    #     missing = self.toc._links._find_missing("/")
    #     self.toc._links.repair_missing(missing)

    # ---- pass through HDF5 group methods to a wrapped root group instance ----

    def __getattr__(self, key: str):
        if key in self._self_SUPPORTED:
            return getattr(self.__wrapped__, key)
        return super().__getattr__(key)  # ask group for method

    # context manager: return the wrapper back, not the raw thing:

    def __enter__(self):
        self.__wrapped__.__enter__()
        return self

    def __exit__(self, *args):
        return self.__wrapped__.__exit__(*args)

    # we want these also to be forwarded to the wrapped group, not the raw object:

    def __dir__(self):
        return list(set(super().__dir__()).union(type(self).__dict__.keys()))

    # make wrapper transparent:

    def __repr__(self) -> str:
        return repr(self.__wrapped__)  # shows that it's a File, not just a Group
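The context-manager methods above follow a pattern worth spelling out: `__enter__` delegates to the wrapped object but returns the wrapper itself, so `with`-blocks yield the Metador interface rather than the raw file. A minimal self-contained sketch, where `FakeFile` and `Wrapper` are hypothetical stand-ins for an h5py.File-like object and the wrapper class:

```python
class FakeFile:
    """Stand-in for an h5py.File-like object."""
    def __init__(self):
        self.closed = False
    def __enter__(self):
        return self
    def __exit__(self, *args):
        self.closed = True
        return False

class Wrapper:
    def __init__(self, wrapped):
        self.__wrapped__ = wrapped
    def __enter__(self):
        self.__wrapped__.__enter__()
        return self  # return the wrapper back, not the raw thing
    def __exit__(self, *args):
        return self.__wrapped__.__exit__(*args)

raw = FakeFile()
with Wrapper(raw) as w:
    assert isinstance(w, Wrapper)  # the with-block sees the wrapper
assert raw.closed  # __exit__ was forwarded to the wrapped object
```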

metador property

metador: MetadorContainerTOC

Access interface to Metador metadata object index.

__init__

__init__(
    name_or_obj: Union[MetadorDriver, Any],
    mode: Optional[OpenMode] = "r",
    *,
    driver: Optional[Type[MetadorDriver]] = None
)

Initialize a MetadorContainer instance from file(s) or a supported object.

The mode argument is ignored when simply wrapping an object.

If a data source such as a path is passed, will instantiate the object first, using the default H5File driver or the passed driver keyword argument.

Source code in src/metador_core/container/wrappers.py
def __init__(
    self,
    name_or_obj: Union[MetadorDriver, Any],
    mode: Optional[OpenMode] = "r",
    *,
    # NOTE: driver takes class instead of enum to also allow subclasses
    driver: Optional[Type[MetadorDriver]] = None,
):
    """Initialize a MetadorContainer instance from file(s) or a supported object.

    The `mode` argument is ignored when simply wrapping an object.

    If a data source such as a path is passed, will instantiate the object first,
    using the default H5File driver or the passed `driver` keyword argument.
    """
    # wrap the h5file-like object (will set self.__wrapped__)
    super().__init__(self, to_h5filelike(name_or_obj, mode, driver=driver))
    # initialize metador-specific stuff
    self._self_toc = MetadorContainerTOC(self)

MetadorDataset

Bases: MetadorNode

Metador wrapper for a HDF5 Dataset.

Source code in src/metador_core/container/wrappers.py
class MetadorDataset(MetadorNode):
    """Metador wrapper for a HDF5 Dataset."""

    __wrapped__: H5DatasetLike

    # manually assembled from public methods which h5py.Dataset provides
    _self_RO_FORBIDDEN = {"resize", "make_scale", "write_direct", "flush"}

    def __getattr__(self, key):
        if self.acl[NodeAcl.read_only] and key in self._self_RO_FORBIDDEN:
            self._guard_acl(NodeAcl.read_only, key)
        if self.acl[NodeAcl.skel_only] and key == "get":
            self._guard_acl(NodeAcl.skel_only, key)

        return getattr(self.__wrapped__, key)

    # prevent getter of node if marked as skel_only
    def __getitem__(self, *args, **kwargs):
        self._guard_acl(NodeAcl.skel_only, "__getitem__")
        return self.__wrapped__.__getitem__(*args, **kwargs)

    # prevent mutating method calls if node is marked as read_only

    def __setitem__(self, *args, **kwargs):
        self._guard_acl(NodeAcl.read_only, "__setitem__")
        return self.__wrapped__.__setitem__(*args, **kwargs)
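The ACL-guard pattern above (read_only blocks mutation, skel_only blocks reading data) can be sketched standalone. Everything here is hypothetical: `UnsupportedOp` stands in for `UnsupportedOperationError`, and `GuardedDataset` mimics only the guard logic, not the h5py forwarding.

```python
class UnsupportedOp(Exception):
    pass

class GuardedDataset:
    def __init__(self, data, *, read_only=False, skel_only=False):
        self._data = data
        self.acl = {"read_only": read_only, "skel_only": skel_only}

    def _guard(self, flag: str, method: str):
        # mirrors _guard_acl: refuse the call if the flag is set
        if self.acl[flag]:
            raise UnsupportedOp(f"Cannot use {method}, the node is marked as {flag}!")

    def __getitem__(self, idx):
        self._guard("skel_only", "__getitem__")  # skeleton nodes hide data
        return self._data[idx]

    def __setitem__(self, idx, val):
        self._guard("read_only", "__setitem__")  # read-only nodes refuse writes
        self._data[idx] = val

ds = GuardedDataset([1, 2, 3], read_only=True)
assert ds[0] == 1  # reading is still allowed
try:
    ds[0] = 9  # mutation is guarded
except UnsupportedOp:
    pass
```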

MetadorGroup

Bases: MetadorNode

Wrapper for a HDF5 Group.

Source code in src/metador_core/container/wrappers.py
class MetadorGroup(MetadorNode):
    """Wrapper for a HDF5 Group."""

    __wrapped__: H5GroupLike

    def _destroy_meta(self, _unlink: bool = True):
        """Destroy all attached metadata at and below this node (recursively)."""
        super()._destroy_meta(_unlink=_unlink)  # this node
        for child in self.values():  # recurse
            child._destroy_meta(_unlink=_unlink)

    # these access entities in read-only way:

    get = _wrap_method("get", is_read_only_method=True)
    __getitem__ = _wrap_method("__getitem__", is_read_only_method=True)

    # these just create new entities with no metadata attached:

    create_group = _wrap_method("create_group")
    require_group = _wrap_method("require_group")
    create_dataset = _wrap_method("create_dataset")
    require_dataset = _wrap_method("require_dataset")

    def __setitem__(self, name, value):
        if any(map(lambda x: isinstance(value, x), _H5_REF_TYPES)):
            raise ValueError(f"Unsupported reference type: {type(value).__name__}")

        return _wrap_method("__setitem__")(self, name, value)

    # following all must be filtered to hide metador-specific structures:

    # must wrap nodes passed into the callback function and filter visited names
    def visititems(self, func):
        def wrapped_func(name, node):
            if M.is_internal_path(node.name):
                return  # skip path/node
            return func(name, self._wrap_if_node(node))

        return self.__wrapped__.visititems(wrapped_func)  # RAW

    # paths passed to visit also must be filtered, so must override this one too
    def visit(self, func):
        def wrapped_func(name, _):
            return func(name)

        return self.visititems(wrapped_func)

    # following also depend on the filtered sequence, directly
    # filter the items, derive other related functions based on that

    def items(self):
        for k, v in self.__wrapped__.items():
            if v is None:
                # NOTE: e.g. when nodes are deleted/moved during iteration,
                # v can suddenly be None -> we need to catch this case!
                continue
            if not M.is_internal_path(v.name):
                yield (k, self._wrap_if_node(v))

    def values(self):
        return map(lambda x: x[1], self.items())

    def keys(self):
        return map(lambda x: x[0], self.items())

    def __iter__(self):
        return iter(self.keys())

    def __len__(self):
        return len(list(self.keys()))

    def __contains__(self, name: str):
        self._guard_path(name)
        if name[0] == "/" and self.name != "/":
            return name in self["/"]
        segs = name.lstrip("/").split("/")
        has_first_seg = segs[0] in self.keys()
        if len(segs) == 1:
            return has_first_seg
        else:
            if nxt := self.get(segs[0]):
                return "/".join(segs[1:]) in nxt
            return False

    # these we can take care of but are a bit more tricky to think through

    def __delitem__(self, name: str):
        self._guard_acl(NodeAcl.read_only, "__delitem__")
        self._guard_path(name)

        node = self[name]
        # clean up metadata (recursively, if a group)
        node._destroy_meta()
        # kill the actual data
        return _wrap_method("__delitem__")(self, name)

    def move(self, source: str, dest: str):
        self._guard_acl(NodeAcl.read_only, "move")
        self._guard_path(source)
        self._guard_path(dest)

        src_metadir = self[source].meta._base_dir
        # if actual data move fails, an exception will prevent the rest
        self.__wrapped__.move(source, dest)  # RAW

        # if we're here, no problems -> proceed with moving metadata
        dst_node = self[dest]
        if isinstance(dst_node, MetadorDataset):
            dst_metadir = dst_node.meta._base_dir
            # dataset has its metadata stored in parallel -> need to take care of it
            meta_base = dst_metadir
            if src_metadir in self.__wrapped__:  # RAW
                self.__wrapped__.move(src_metadir, dst_metadir)  # RAW
        else:
            # directory where to fix up metadata object TOC links
            # when a group was moved, all metadata is contained in dest -> search it
            meta_base = dst_node.name

        # re-link metadata object TOC links
        if meta_base_node := self.__wrapped__.get(meta_base):
            assert isinstance(meta_base_node, H5GroupLike)
            missing = self._self_container.metador._links.find_missing(meta_base_node)
            self._self_container.metador._links.repair_missing(missing, update=True)

    def copy(
        self,
        source: Union[str, MetadorGroup, MetadorDataset],
        dest: Union[str, MetadorGroup],
        **kwargs,
    ):
        self._guard_acl(NodeAcl.read_only, "copy")

        # get source node and its name without the path and its type
        src_node: MetadorNode
        if isinstance(source, str):
            self._guard_path(source)
            src_node = self[source]
        elif isinstance(source, MetadorNode):
            src_node = source
        else:
            raise ValueError("Copy source must be path, Group or Dataset!")
        src_is_dataset: bool = isinstance(src_node, MetadorDataset)
        src_name: str = src_node.name.split("/")[-1]
        # user can override name at target
        dst_name: str = kwargs.pop("name", src_name)

        # fix up target path
        dst_path: str
        if isinstance(dest, str):
            self._guard_path(dest)
            dst_path = dest
        elif isinstance(dest, MetadorGroup):
            dst_path = dest.name + f"/{dst_name}"
        else:
            raise ValueError("Copy dest must be path or Group!")

        # get other allowed options
        without_attrs: bool = kwargs.pop("without_attrs", False)
        without_meta: bool = kwargs.pop("without_meta", False)
        if kwargs:
            raise ValueError(f"Unknown keyword arguments: {kwargs}")

        # perform copy
        copy_kwargs = {
            "name": None,
            "shallow": False,
            "expand_soft": True,
            "expand_external": True,
            "expand_refs": True,
            "without_attrs": without_attrs,
        }
        self.__wrapped__.copy(source, dst_path, **copy_kwargs)  # RAW
        dst_node = self[dst_path]  # exists now

        if src_is_dataset and not without_meta:
            # because metadata lives in parallel group, need to copy separately:
            src_meta: str = src_node.meta._base_dir
            dst_meta: str = dst_node.meta._base_dir  # node will not exist yet
            self.__wrapped__.copy(src_meta, dst_meta, **copy_kwargs)  # RAW

            # register in TOC:
            dst_meta_node = self.__wrapped__[dst_meta]
            assert isinstance(dst_meta_node, H5GroupLike)
            missing = self._self_container.metador._links.find_missing(dst_meta_node)
            self._self_container.metador._links.repair_missing(missing)

        if not src_is_dataset:
            if without_meta:
                # need to destroy copied metadata copied with the source group
                # but keep TOC links (they point to original copy!)
                dst_node._destroy_meta(_unlink=False)
            else:
                # register copied metadata objects under new uuids
                missing = self._self_container.metador._links.find_missing(dst_node)
                self._self_container.metador._links.repair_missing(missing)

    def __getattr__(self, key):
        if hasattr(self.__wrapped__, key):
            raise UnsupportedOperationError(key)  # deliberately unsupported
        else:
            msg = f"'{type(self).__name__}' object has no attribute '{key}'"
            raise AttributeError(msg)
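The filtering and containment logic of the group wrapper can be sketched over plain nested dicts: names with the reserved "metador_" prefix are hidden, and `__contains__` recurses one path segment at a time. This is a loose sketch — `is_internal`, `visible_items`, and `contains` are hypothetical helpers, and the real wrapper raises on internal paths via `_guard_path` instead of silently returning False.

```python
def is_internal(name: str) -> bool:
    # names using the reserved "metador_" prefix are bookkeeping structures
    return name.split("/")[-1].startswith("metador_")

def visible_items(group: dict) -> dict:
    # analogous to MetadorGroup.items(): hide internal children
    return {k: v for k, v in group.items() if not is_internal(k)}

def contains(group: dict, name: str) -> bool:
    # analogous to MetadorGroup.__contains__: recurse segment by segment
    segs = name.lstrip("/").split("/")
    items = visible_items(group)
    if segs[0] not in items:
        return False
    if len(segs) == 1:
        return True
    nxt = items[segs[0]]
    return isinstance(nxt, dict) and contains(nxt, "/".join(segs[1:]))

root = {"foo": {"bar": 1, "metador_meta_bar": {}}, "metador_container_uuid": 0}
assert contains(root, "foo/bar")
assert not contains(root, "foo/metador_meta_bar")  # internal paths are hidden
assert list(visible_items(root)) == ["foo"]
```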

MetadorNode

Bases: ObjectProxy

Wrapper for h5py and IH5 Groups and Datasets providing Metador-specific features.

In addition to metadata management, this wrapper also provides helpers to reduce possible mistakes when implementing interfaces, by allowing nodes to be marked as

  • read_only (regardless of the writability of the underlying opened container) and
  • local_only (preventing access to (meta)data above this node)

Note that these are "soft" restrictions to prevent errors and can be bypassed.

Source code in src/metador_core/container/wrappers.py
class MetadorNode(wrapt.ObjectProxy):
    """Wrapper for h5py and IH5 Groups and Datasets providing Metador-specific features.

    In addition to metadata management, also provides helpers to reduce possible
    mistakes when implementing interfaces, by allowing nodes to be marked as

    * read_only (regardless of the writability of the underlying opened container) and
    * local_only (preventing access to (meta)data above this node)

    Note that these are "soft" restrictions to prevent errors and can be bypassed.
    """

    __wrapped__: H5NodeLike

    @staticmethod
    def _parse_access_flags(kwargs) -> NodeAclFlags:
        # NOTE: mutating kwargs, removes keys that are inspected!
        return {flag: kwargs.pop(flag.name, False) for flag in iter(NodeAcl)}

    def __init__(self, mc: MetadorContainer, node: H5NodeLike, **kwargs):
        flags = self._parse_access_flags(kwargs)
        lp = kwargs.pop("local_parent", None)
        if kwargs:
            raise ValueError(f"Unknown keyword arguments: {kwargs}")

        super().__init__(node)
        self._self_container: MetadorContainer = mc

        self._self_flags: NodeAclFlags = flags
        self._self_local_parent: Optional[MetadorGroup] = lp

    def _child_node_kwargs(self):
        """Return kwargs to be passed to a child node.

        Ensures that {read,skel,local}_only status is passed down correctly.
        """
        return {
            "local_parent": self if self.acl[NodeAcl.local_only] else None,
            **{k.name: v for k, v in self.acl.items() if v},
        }

    def restrict(self, **kwargs) -> MetadorNode:
        """Restrict this object to be local_only or read_only.

        Pass local_only=True and/or read_only=True to enable the restriction.

        local_only means that the node may not access the parent or file objects.
        read_only means that mutable actions cannot be done (even if container is mutable).
        """
        added_flags = self._parse_access_flags(kwargs)
        if added_flags[NodeAcl.local_only]:
            # was set as local explicitly for this node ->
            self._self_local_parent = None  # remove its ability to go up

        # can only set, but not unset!
        self._self_flags.update({k: True for k, v in added_flags.items() if v})
        if kwargs:
            raise ValueError(f"Unknown keyword arguments: {kwargs}")

        return self

    @property
    def acl(self) -> Dict[NodeAcl, bool]:
        """Return ACL flags of current node."""
        return dict(self._self_flags)

    def _guard_path(self, path: str):
        if M.is_internal_path(path):
            msg = f"Trying to use a Metador-internal path: '{path}'"
            raise ValueError(msg)
        if self.acl[NodeAcl.local_only] and path[0] == "/":
            msg = f"Node is marked as local_only, cannot use absolute path '{path}'!"
            raise ValueError(msg)

    def _guard_acl(self, flag: NodeAcl, method: str = "this method"):
        if self.acl[flag]:
            msg = f"Cannot use {method}, the node is marked as {flag.name}!"
            raise UnsupportedOperationError(msg)

    # helpers

    def _wrap_if_node(self, val):
        """Wrap value into a metador node wrapper, if it is a suitable group or dataset."""
        if isinstance(val, H5GroupLike):
            return MetadorGroup(self._self_container, val, **self._child_node_kwargs())
        elif isinstance(val, H5DatasetLike):
            return MetadorDataset(
                self._self_container, val, **self._child_node_kwargs()
            )
        else:
            return val

    def _destroy_meta(self, _unlink: bool = True):
        """Destroy all attached metadata at and below this node."""
        self.meta._destroy(_unlink=_unlink)

    # need that to add our new methods

    def __dir__(self):
        names = set.union(
            *map(
                lambda x: set(x.__dict__.keys()),
                takewhile(lambda x: issubclass(x, MetadorNode), type(self).mro()),
            )
        )
        return list(set(super().__dir__()).union(names))

    # make wrapper transparent

    def __repr__(self):
        return repr(self.__wrapped__)

    # added features

    @property
    def meta(self) -> MetadorMeta:
        """Access the interface to metadata attached to this node."""
        return MetadorMeta(self)

    @property
    def metador(self) -> MetadorContainerTOC:
        """Access the info about the container this node belongs to."""
        return WithDefaultQueryStartNode(self._self_container.metador, self)

    # wrap existing methods as needed

    @property
    def name(self) -> str:
        return self.__wrapped__.name  # just for type checker not to complain

    @property
    def attrs(self):
        if self.acl[NodeAcl.read_only] or self.acl[NodeAcl.skel_only]:
            return WrappedAttributeManager(self.__wrapped__.attrs, self.acl)
        return self.__wrapped__.attrs

    @property
    def parent(self) -> MetadorGroup:
        if self.acl[NodeAcl.local_only]:
            # allow child nodes of local-only nodes to go up to the marked parent
            # (or it is None, if this is the local root)
            if lp := self._self_local_parent:
                return lp
            else:
                # raise exception (illegal non-local access)
                self._guard_acl(NodeAcl.local_only, "parent")

        return MetadorGroup(
            self._self_container,
            self.__wrapped__.parent,
            **self._child_node_kwargs(),
        )

    @property
    def file(self) -> MetadorContainer:
        if self.acl[NodeAcl.local_only]:
            # raise exception (illegal non-local access)
            self._guard_acl(NodeAcl.local_only, "parent")
        return self._self_container

acl property

acl: Dict[NodeAcl, bool]

Return ACL flags of current node.

meta property

meta: MetadorMeta

Access the interface to metadata attached to this node.

metador property

metador: MetadorContainerTOC

Access the info about the container this node belongs to.

restrict

restrict(**kwargs) -> MetadorNode

Restrict this object to be local_only or read_only.

Pass local_only=True and/or read_only=True to enable the restriction.

local_only means that the node may not access the parent or file objects. read_only means that mutable actions cannot be done (even if container is mutable).

Source code in src/metador_core/container/wrappers.py
def restrict(self, **kwargs) -> MetadorNode:
    """Restrict this object to be local_only or read_only.

    Pass local_only=True and/or read_only=True to enable the restriction.

    local_only means that the node may not access the parent or file objects.
    read_only means that mutable actions cannot be done (even if container is mutable).
    """
    added_flags = self._parse_access_flags(kwargs)
    if added_flags[NodeAcl.local_only]:
        # was set as local explicitly for this node ->
        self._self_local_parent = None  # remove its ability to go up

    # can only set, but not unset!
    self._self_flags.update({k: True for k, v in added_flags.items() if v})
    if kwargs:
        raise ValueError(f"Unknown keyword arguments: {kwargs}")

    return self
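The "can only set, but not unset" comment above describes monotone flags: once a restriction is enabled it cannot be cleared by a later call. A hypothetical sketch of just that update rule, without the ACL enum or kwarg validation of the real `restrict`:

```python
def restrict(flags: dict, **kwargs) -> dict:
    # flags accumulate: a later False does NOT clear an already-set flag
    flags.update({k: True for k, v in kwargs.items() if v})
    return flags

acl = {"read_only": False, "local_only": False}
restrict(acl, read_only=True)
restrict(acl, read_only=False)  # no effect: restrictions are monotone
assert acl == {"read_only": True, "local_only": False}
```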