Skip to content

(Draft) New Data Model#480

Draft
SKernchen wants to merge 285 commits into
developfrom
refactor/data-model
Draft

(Draft) New Data Model#480
SKernchen wants to merge 285 commits into
developfrom
refactor/data-model

Conversation

@SKernchen

Copy link
Copy Markdown
Contributor

This pull request is not ready yet. It is only for showing the process

led02 and others added 30 commits August 1, 2025 21:16
The conversion tests (`_to_python` and `_to_expanded_json`) are pretty long now...
Pull in relevant changes from `develop`
Disable tests that fail due to data model
…or/381-test-ld_container

# Conflicts:
#	poetry.lock
Update poetry.lock for added test dependencies.

@zyzzyxdonta zyzzyxdonta left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some first very high-level comments. Will continue the review another day.

Comment thread conftest.py

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should a top-level conftest exist? I think this belongs into test/ or test/hermes_test/

Comment thread pyproject.toml
Comment thread pyproject.toml Outdated

@zyzzyxdonta zyzzyxdonta Jun 3, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Without having read any of the docs,) I find the jargon strange in this module. The module is called context_manager. It contains HermesCache which is a context manager, and HermesContext which is not a context manager but returns context managers. Maybe this could be cleared up by renaming the module. Maybe nothing with context in it as this term is already used for the context of the JSON-LD object.

from .error import HermesContextError


class HermesCache:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this class is only ever instantiated in HermesContext. So it might be a good idea to add a docstring to the module that explains how to get such a cache object (i.e. via HermesContext).

Comment thread src/hermes/model/error.py Outdated
Comment thread src/hermes/model/error.py Outdated
Comment thread src/hermes/model/merge/container.py Outdated
"""

@classmethod
def vocabulary(cls, base_url: str = "http://spam.eggs/") -> dict:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only example.org and example.com are safe test domains

dict[str, Union["JSON_LD_VALUE", BASIC_TYPE, TIME_TYPE, "ld_dict", "ld_list"]],
]
""" Type description of valid JSON_LD objects that are partially represented by ld_containers """
PYTHONIZED_LD_CONTAINER: TypeAlias = Union[

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does pythonized mean? 😅 The Python representation of the thing?

Comment on lines +107 to +109
for mapping in active_ctx["mappings"].values():
if "@container" in mapping and long_iri:
value = {x: "none" for x in mapping["@container"]}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this is for.

The and long_iri I guess saves work if no IRI is given. But this could be expressed more clearly by doing if long_iri is None: return long_iri at the beginning of the function, just like _compact_iri() does it.

value overwritten in each new iteration in each iteration because it @container can only exist once? In that case, this would be clearer by breaking out of the loop.

But what does "none" do?

I think this whole module goes deep into pyld internals and a couple of comments would be helpful to understand what is going on.

Comment thread docs/source/dev/data_model.md Outdated
Comment thread docs/source/dev/data_model.md Outdated
Comment thread docs/source/dev/data_model.md Outdated
Comment thread docs/source/dev/data_model.md Outdated
Comment thread poetry.lock

@zyzzyxdonta zyzzyxdonta Jun 4, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the given poetry.lock, the docs don't compile. A lot of our packages are old and incompatible with the ones that aren't pinned. I got it to work like this:

diff --git i/pyproject.toml w/pyproject.toml
index 17ea508..932427b 100644
--- i/pyproject.toml
+++ w/pyproject.toml
@@ -92,13 +92,13 @@ pytest-httpserver = "^1.1.5"
 optional = true
 
 [tool.poetry.group.docs.dependencies]
-Sphinx = "^6.2.1"
+Sphinx = "^8.0.0"
 # Sphinx - Additional modules
-myst-parser = "^2.0.0"
+myst-parser = "^4.0.0"
 sphinx-book-theme = "^1.0.1"
 sphinx-favicon = "^0.2"
 sphinxcontrib-contentui = "^0.2.5"
-sphinxcontrib-images = "^0.9.4"
+sphinxcontrib-images = "^1.0.1"
 sphinx-icon = "^0.1.2"
 sphinx-autobuild = "^2021.3.14"
 sphinx-autoapi = "^3.0.0"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to remove 'sphinxemoji' too or constrain setuptools to <82 (the dependency pkg_resources (deprecated) was removed in setuptools version 82 but was necessary for 'sphinxemoji').

Also building the docs loally always (at least for me) results in errors from myst (failed linkings).
This has been an issue for longer though, but I haven't been able to fix it so far.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, the setuptools change was exactly the same problem I had. I didn't have any problems with sphinxemoji though. Strange ... 🤔

Yes, the errors from myst have been there for a while. I think it's mostly unused or missing references/labels. I haven't looked into it.

:caption: Injecting additional schemas
from hermes.model import SoftwareMetadata

# Contents served at https://bar.net/schema.jsonld:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please only use reserved example domains! https://en.wikipedia.org/wiki/Example.com

it will always be returned in a **list**-like object!
```

The reason for providing data in list-like objects is that JSON-LD treats all property values as arrays.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to the related section in the JSON-LD spec would be nice

Comment thread docs/source/dev/data_model.md Outdated
Python data:

```{code-block} python
:caption: Naive containment assertion that raises

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't though. The mock output is from the path where the assertion holds.

Comment thread docs/source/dev/data_model.md Outdated
Comment thread docs/source/tutorials/writing-a-plugin-for-hermes.md
Comment thread docs/source/tutorials/writing-a-plugin-for-hermes.md Outdated
Comment thread docs/source/tutorials/writing-a-plugin-for-hermes.md Outdated
Comment thread docs/source/tutorials/writing-a-plugin-for-hermes.md Outdated
Comment thread docs/source/tutorials/writing-a-plugin-for-hermes.md Outdated
}
```

HERMES would use the {py:class}`~hermes.model.merge.action.Reject` strategy for merging values of the key `full_property_iri` in objects of type `full_type_iri`. (A key in strategies being `None` instead of a string indicates to HERMES that its value is to be used as a default [i.e. if no more specific entry exists].)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels hacky. Have you considered collections.defaultdict?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole file feels more like a reference than a tutorial. A tutorial should be "learning by doing". An example could be having the reader re-build the git-plugin from scratch.

See also: https://diataxis.fr/tutorials/

Comment thread docs/source/tutorials/writing-a-plugin-for-hermes.md Outdated
Comment thread docs/source/tutorials/writing-a-plugin-for-hermes.md Outdated
Comment thread docs/source/tutorials/writing-a-plugin-for-hermes.md Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants