Description
The class hierarchy for rdflib.namespace and rdflib.term is introducing errors in implementation.
- object
  - rdflib.namespace.ClosedNamespace
    - rdflib.namespace._RDFNamespace
  - rdflib.term.Node
    - rdflib.term.Identifier
      - rdflib.term.BNode
      - rdflib.term.Literal
      - rdflib.term.URIRef
        - rdflib.term.Genid
          - rdflib.term.RDFLibGenid
- str
  - rdflib.namespace.Namespace
  - rdflib.namespace.URIPattern
For example, the lack of relationship between Namespace and ClosedNamespace causes multiple definitions of term, __getitem__, __getattr__, and (with PR#1237) __contains__.
Additionally, ClosedNamespace does not get the convenience that Namespace gains by inheriting from str, so it loses some portability. For example, to check whether ref = URIRef(...) belongs to ns = Namespace(...), the code would be ref.startswith(ns). With a ClosedNamespace cns, however, this is an error, and the code has to be ref.startswith(cns.uri) (ignoring for a moment whether a prefix check is a valid membership test).
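A minimal sketch of the prefix-check difference, using simplified stand-in classes rather than rdflib's real ones. The class bodies, the example URI, and the variable names are assumptions made for illustration; only the base classes (str versus object) and the uri attribute mirror what is described above.

```python
# Stand-ins only: these are NOT rdflib's implementations.
class Namespace(str):
    """Open namespace that is itself a string."""

    def term(self, name):
        return str(self) + name


class ClosedNamespace(object):
    """Closed namespace that wraps a URI instead of being a string."""

    def __init__(self, uri, terms):
        self.uri = uri
        self._terms = set(terms)


ref = "http://example.org/ns#thing"  # plays the role of a URIRef
ns = Namespace("http://example.org/ns#")
cns = ClosedNamespace("http://example.org/ns#", ["thing"])

# Works: ns is a str, so str.startswith accepts it directly.
open_match = ref.startswith(ns)

# Fails: cns is not a str, so str.startswith raises TypeError.
try:
    ref.startswith(cns)
    closed_error = None
except TypeError as exc:
    closed_error = exc

# Workaround: reach into the wrapped URI.
closed_match = ref.startswith(cns.uri)
```

The same caller code therefore has to know which kind of namespace it was handed.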
First Step
Refactoring the Namespace and ClosedNamespace hierarchy should be easy and should not break existing functionality.
Proposed class hierarchy:
- str
  - rdflib.namespace.Namespace
    - rdflib.namespace.ClosedNamespace
      - rdflib.namespace._RDFNamespace
This should give ClosedNamespace some new methods, inherited from the str type.
This may already have an open PR in #1213. However, the implementation there doesn't address the duplicate implementations of __getitem__, __getattr__, and term.
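One way the proposed hierarchy could remove the duplication, sketched with stand-in classes. This is not the code from #1213 or from rdflib; the method bodies, the KeyError-to-AttributeError mapping, and the example URI are assumptions. The idea is that __getitem__ and __getattr__ live once on Namespace and route through term, so ClosedNamespace only overrides term (and, optionally, __contains__).

```python
# Stand-ins only: a sketch of the proposed str -> Namespace -> ClosedNamespace chain.
class Namespace(str):
    def term(self, name):
        return str(self) + name

    def __getitem__(self, key):
        # Defined once; subclasses customise behaviour through term().
        return self.term(key)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("__"):
            raise AttributeError(name)
        try:
            return self.term(name)
        except KeyError as exc:
            raise AttributeError(name) from exc

    def __contains__(self, ref):
        # Prefix check only, as in the open-namespace case.
        return str(ref).startswith(str(self))


class ClosedNamespace(Namespace):
    def __new__(cls, uri, terms):
        inst = super().__new__(cls, uri)
        inst._terms = frozenset(terms)
        return inst

    def term(self, name):
        # The only behaviour that differs: reject unknown terms.
        if name not in self._terms:
            raise KeyError("term '%s' not in namespace '%s'" % (name, str(self)))
        return super().term(name)

    def __contains__(self, ref):
        # Prefix must match and the remainder must be a declared term.
        return super().__contains__(ref) and str(ref)[len(self):] in self._terms


cns = ClosedNamespace("http://example.org/ns#", ["thing"])
ref = "http://example.org/ns#thing"
prefix_ok = ref.startswith(cns)  # no .uri needed: cns is a str
```

With this shape, cns.thing and cns["thing"] both resolve through the single inherited code path, and an unknown term raises KeyError for item access or AttributeError for attribute access.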
Next Steps
Investigate issues with the URIRef representation. Identifier currently has multiple inheritance from Node and text_type. Double check representation.
There is a lot of added complexity coming from Python 2 support and six. Removing six and the now-EOL Python 2 compatibility could simplify long-term maintenance. (#1014 polled on this for 6.0.0, but there doesn't seem to be an issue open to remove it and coordinate?) We'd also be inheriting from the built-in str and not text_type!
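To make the Identifier point concrete, here is a small sketch of the multiple-inheritance shape with stand-in classes (again, not rdflib's code). On Python 3, six.text_type is simply str, so the alias below is what the base class already resolves to. The __repr__ format is an illustrative assumption.

```python
# On Python 3, six.text_type is the built-in str; the alias adds nothing.
text_type = str


class Node(object):
    __slots__ = ()


class Identifier(Node, str):  # was: Identifier(Node, text_type)
    __slots__ = ()

    def __new__(cls, value):
        return str.__new__(cls, value)


class URIRef(Identifier):
    __slots__ = ()

    def __repr__(self):
        # Illustrative format only: class name wrapped around the str repr.
        return "%s(%s)" % (type(self).__name__, str.__repr__(self))


ref = URIRef("http://example.org/a")
mro_names = [cls.__name__ for cls in URIRef.__mro__]
```

Inspecting mro_names shows where str sits relative to Node, which is the first thing to check when a representation comes out differently than expected.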
Scope Creep
I'm unsure of what the proper granularity for a ticket in rdflib is, but if we want to add some outer bounds for scope: adding type annotations to the code base may help tremendously with finding errors like this.
Also, I'd like to contribute, not just make work for other people :) I've got dev cycles I can put on rdflib and am happy to discuss the best direction for my energy with the maintainers. I'm already meeting to discuss some work on PySHACL soon. If there's a public roadmap beyond the 6.0.0 tag or Wiki, I'd love to learn more!