@rubensworks
Member

@rubensworks rubensworks commented Sep 25, 2025

While #194 introduced sameValue and sameTerm support for triple terms, <, <=, >, and >= were still missing, so this PR adds them.
It also adds ORDER BY support for triple terms.

The same approach as the RDF-star spec was followed here, with some small tweaks.

Thanks to the keen eye of @jitsedesmet for noticing this was missing!



@rubensworks rubensworks requested review from Tpt, afs, hartig and kasei September 25, 2025 10:18
@rubensworks rubensworks added the spec:substantive Change in the spec affecting its normative content (class 3) –see also spec:bug, spec:new-feature label Sep 25, 2025
Contributor

@hartig hartig left a comment


In addition to the suggestions in my review, I propose to remove the sentence about triple terms that is after the operator mapping table in Sec.17.3.

@kasei
Contributor

kasei commented Sep 25, 2025

Aside from bringing ORDER BY support, I don't think this PR makes sense. We don't define < or > for IRIs or blank nodes, so the term-wise comparisons in compareTripleTerm won't work for at least subjects and predicates (and at least the non-literal objects). I think this means that compareTripleTerm can never return anything but 0 or raise an error.

A smaller issue (if we ignore the above) is that the signature of compareTripleTerm in 17.4.2.3 uses term1 and term2, but the definition uses A and B.

@rubensworks
Member Author

We don't define < or > for IRIs or blank nodes, so the term-wise comparisons in compareTripleTerm won't work for at least subjects and predicates (and at least the non-literal objects). I think this means that compareTripleTerm can never return anything but 0 or raise an error.

Correct, for subjects and predicates, it can only return 0 (since equality is defined for IRIs and bnodes) or error.
This means comparisons such as compareTripleTerm(<<(:s :p 123)>>, <<(:s :p 124)>>) become possible.
We already have spec tests with this behaviour (see the op-* tests), so this PR ensures this behaviour is properly defined in the spec.
If we would not go forward with this PR, these spec tests will have to be retracted.

This change also offers a baseline for systems that offer operator extensions for defining < and > for IRIs and blank nodes.

@kasei
Contributor

kasei commented Sep 25, 2025

Correct, for subjects and predicates, it can only return 0 (since equality is defined for IRIs and bnodes) or error.
This means comparisons such as compareTripleTerm(<<(:s :p 123)>>, <<(:s :p 124)>>) become possible.

How would this comparison work? The definition says:

If SUBJECT(A) = SUBJECT(B) evaluates to true, go to the next step. If SUBJECT(A) < SUBJECT(B) evaluates to true, return -1.
If SUBJECT(A) > SUBJECT(B) evaluates to true, return 1. If any of the evaluations cause an error, raise an error.

The SUBJECT(A) > SUBJECT(B) will raise an error on any system that has not extended the operator mapping table, so compareTripleTerm raises an error at this point, correct?

If we would not go forward with this PR, these spec tests will have to be retracted.

Those tests aren't approved yet, just proposed, so while we can clean them up no "retraction" would be necessary.

@rubensworks
Member Author

The SUBJECT(A) > SUBJECT(B) will raise an error on any system that has not extended the operator mapping table, so compareTripleTerm raises an error at this point, correct?

If the subjects of A and B are equal (e.g. when evaluating compareTripleTerm(<<(:s :p 123)>>, <<(:s :p 124)>>)), this case applies:

If SUBJECT(A) = SUBJECT(B) evaluates to true, go to the next step.

So SUBJECT(A) < SUBJECT(B) and SUBJECT(A) > SUBJECT(B) will never be evaluated.
The evaluation will continue with step 2, being to compare the predicates, and so on.

I do see how "the next step" could be interpreted differently, so wording should be improved there.
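To illustrate the intended step-by-step reading, here is a minimal Python sketch (not spec text). The `IRI` and `Triple` classes, `ExpressionError`, and all function names are hypothetical, and `less_than` assumes a system without operator extensions, so `<` only works on numeric literals. Equality is tested first for each component, and `<` / `>` are only consulted when the components differ.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: "Term"
    predicate: "Term"
    object: "Term"


# Hypothetical term model: IRIs, numeric literals (plain numbers), triple terms.
Term = Union[IRI, int, float, Triple]


class ExpressionError(Exception):
    """Stands in for a SPARQL expression evaluation error."""


def less_than(a: Term, b: Term) -> bool:
    # Assumption: no operator extensions, so "<" exists only for numeric literals.
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a < b
    raise ExpressionError(f"'<' is not defined for {a!r} and {b!r}")


def compare_component(a: Term, b: Term) -> int:
    if a == b:  # equality is defined for every term kind, so test it first
        return 0
    if isinstance(a, Triple) and isinstance(b, Triple):
        return compare_triple_term(a, b)
    if less_than(a, b):
        return -1
    if less_than(b, a):
        return 1
    raise ExpressionError(f"{a!r} and {b!r} are incomparable")


def compare_triple_term(a: Triple, b: Triple) -> int:
    pairs = ((a.subject, b.subject), (a.predicate, b.predicate), (a.object, b.object))
    for x, y in pairs:
        result = compare_component(x, y)
        if result != 0:
            return result  # the first differing component decides
    return 0


s, p = IRI("http://example.org/s"), IRI("http://example.org/p")
print(compare_triple_term(Triple(s, p, 123), Triple(s, p, 124)))  # -1
```

With equal subjects and predicates, only the objects are compared with `<`. If the subjects were two different IRIs, this sketch would raise `ExpressionError`, which matches the "0 or error" behaviour discussed above.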

@kasei
Contributor

kasei commented Sep 25, 2025

Apologies. I misread your example, and didn't notice that subject and predicate were the same. However, this feels like a very small special case that could just as easily be handled by OBJECT(?tt1) < OBJECT(?tt2). Writing language like "If SUBJECT(A) > SUBJECT(B) evaluates to true…" seems really strange to me knowing that it will only make sense in systems that have implemented extensions to the spec.

My opinion is that we should not include < and > operator mappings for triple terms, but include special handling in ORDER BY similar to the current language for IRIs.

If we did include the operator support for triple terms, I think we would need to also include similar definitions for IRIs and blank nodes for consistency (which to be clear I think would be a bad idea).

@rubensworks rubensworks requested a review from hartig September 26, 2025 05:36
@rubensworks
Member Author

rubensworks commented Sep 26, 2025

My opinion is that we should not include < and > operator mappings for triple terms, but include special handling in ORDER BY similar to the current language for IRIs.

I understand your concern. The lack of default comparison support for IRIs and bnodes can introduce confusion indeed (it already has in the tests).
I don't have a strong opinion on adding this, I mainly followed the RDF-star spec here.
But IMO the added value outweighs the downsides.
Curious to hear other people's thoughts on this.

If we did include the operator support for triple terms, I think we would need to also include similar definitions for IRIs and blank nodes for consistency (which to be clear I think would be a bad idea).

I would also not add support for bnodes indeed. However, I don't see immediate issues with adding IRI support, so I wouldn't be against that.

@Tpt
Contributor

Tpt commented Sep 26, 2025

+1 to @kasei. I have a hard time finding a use case for this extension of <. Having it for triple terms but not for IRIs seems quite confusing to me.

@niklasl

niklasl commented Sep 26, 2025

Chiming in from the side: All I can say about > (and <) is that it is sometimes very useful for "directed difference": an ?a != ?b filter yields duplicate result rows (?a, ?b) and (?b, ?a), whereas ?a > ?b yields only one row per pair. Granted, STR(?a) > STR(?b) is a sufficient workaround (though with a possible performance penalty).
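A small Python sketch of the "directed difference" idea, for illustration only. The `values` list is a made-up set of bindings, and Python string comparison stands in for the `STR(?a) > STR(?b)` workaround.

```python
from itertools import product

# Hypothetical bindings for ?a and ?b (both range over the same values).
values = ["a", "b", "c"]

# FILTER(?a != ?b): every unordered pair shows up twice, once per direction.
unordered = [(a, b) for a, b in product(values, repeat=2) if a != b]

# FILTER(STR(?a) > STR(?b)): each unordered pair shows up exactly once.
directed = [(a, b) for a, b in product(values, repeat=2) if a > b]

print(len(unordered), len(directed))  # 6 3
```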

@jitsedesmet

I also do not have a strong opinion, but I also think that adding support for IRI comparison makes sense.
Adding support for blank nodes does not make sense to me, since it would likely depend on the labels, which are often auto-generated by the engines themselves, each with its own label-generating scheme. I therefore think adding support would only fracture the landscape (imo).

If support for IRIs were added, I think adding support for triple terms would also make sense. But like the majority here, I also think adding triple term comparison without defining IRI comparison is really confusing (when reasoning about what an engine would return).

@TallTed
Member

TallTed commented Sep 26, 2025

@hartig

remove the sentence about triple terms that is after the operator mapping table in Sec.17.3.

There are no section numbers here. Can you please change the above quoted snippet to identify that location in some way which is visible in the HTML/ReSpec source?

@hartig
Contributor

hartig commented Sep 26, 2025

@hartig

remove the sentence about triple terms that is after the operator mapping table in Sec.17.3.

There are no section numbers here. Can you please change the above quoted snippet to identify that location in some way which is visible in the HTML/ReSpec source?

@TallTed , @rubensworks has already implemented my suggestion, see commit 189d62e

@rubensworks
Member Author

As discussed during the last SPARQL task force meeting, this PR has been updated to include comparison for IRIs and blank nodes.
These are only defined within the scope of compareTripleTerm (an alternative would be to include dedicated entries for them in the operator mapping table, but that felt a bit too strong to me).
How they are implemented is up to the system.
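As a rough Python sketch of what "defined only within the scope of compareTripleTerm" could look like, the IRI comparison can be a parameter of the triple term comparison rather than part of the general `<` operator. Everything here is hypothetical: the `IRI` and `Triple` classes, the `compare_iri` parameter, and the default of comparing IRI strings by code point are assumptions chosen for illustration. Blank nodes are left out, and mixed term kinds simply raise.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: Any
    predicate: Any
    object: Any


def cmp(a, b) -> int:
    # Three-way comparison helper returning -1, 0 or 1.
    return (a > b) - (a < b)


def iri_by_string(a: IRI, b: IRI) -> int:
    # ASSUMPTION: one possible system-defined order, by code point of the IRI string.
    return cmp(a.value, b.value)


def compare_triple_term(
    a: Triple, b: Triple, compare_iri: Callable[[IRI, IRI], int] = iri_by_string
) -> int:
    components = zip(
        (a.subject, a.predicate, a.object), (b.subject, b.predicate, b.object)
    )
    for x, y in components:
        if isinstance(x, IRI) and isinstance(y, IRI):
            result = compare_iri(x, y)  # only reachable from inside this function
        elif isinstance(x, Triple) and isinstance(y, Triple):
            result = compare_triple_term(x, y, compare_iri)
        elif isinstance(x, (int, float)) and isinstance(y, (int, float)):
            result = cmp(x, y)
        else:
            raise TypeError(f"cannot compare {x!r} with {y!r}")
        if result != 0:
            return result
    return 0


p = IRI("http://example.org/p")
t1 = Triple(IRI("http://example.org/a"), p, 1)
t2 = Triple(IRI("http://example.org/b"), p, 1)
print(compare_triple_term(t1, t2))  # -1
```

A system could pass a different `compare_iri` without touching the general operators, so IRIs stay incomparable with `<` everywhere else.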

<code>xsd:strings</code>, <code>xsd:booleans</code> and <code>xsd:dateTimes</code>. Pairs of
<code>xsd:strings</code>, <code>xsd:booleans</code>, <code>xsd:dateTimes</code>, and
<a data-cite="RDF12-CONCEPTS#dfn-triple-term">Triple Terms</a>. Pairs of
IRIs are ordered by comparing them as literals with datatype <code>xsd:string</code>.</p>
Member


I think we need to say something about case-sensitive ordering being the norm in some environments and case-insensitive ordering in others, as well as how to override both norms, assuming that is possible anywhere if not everywhere. I know we've said in some place(s) that lexical ordering is to be code point order, which is effectively case-sensitive, but I don't think this will suffice for all use cases. I believe code point order will also prove problematic when trying to compare, for instance, ä as a single character and as a composite character (i.e., a combination of a and ¨), and in making all the "a"s (uppercase, lowercase, with diacriticals, etc.) appear as a group and be in a predictable order within that group.
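A short Python illustration of the code point concern, using only the standard library. Python compares strings by code point, so it can stand in for code point order here; the word list is made up.

```python
import unicodedata

precomposed = "\u00e4"  # "ä" as a single code point
decomposed = "a\u0308"  # "a" followed by COMBINING DIAERESIS

# The two spellings render alike but are different code point sequences.
print(precomposed == decomposed)  # False

# NFC normalization maps the decomposed form to the precomposed one.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True

# Code point order puts uppercase before lowercase and splits the two "ä" forms.
words = ["apple", "Zebra", "\u00e4rger", "a\u0308rger", "Apple"]
by_code_point = sorted(words)
# Result order: Apple, Zebra, apple, a+U+0308 "rger", U+00E4 "rger"
```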

Contributor

@afs afs Oct 26, 2025


"Functions and Operators" has fn:collation-key and https://www.w3.org/TR/xpath-functions-31/#choosing-a-collation to handle this.

(I don't see anything new to SPARQL 1.2 here so I think it's not for this PR.)

rubensworks and others added 2 commits October 24, 2025 09:14
Co-authored-by: Ted Thibodeau Jr
@jitsedesmet

jitsedesmet commented Oct 26, 2025

@rubensworks
Thank you for the update on the conclusion made by the working-group.

These are only defined within the scope of compareTripleTerm (an alternative would be to include dedicated entries for them in the operator mapping table, but that felt a bit too strong to me).

I think this makes sense from a spec perspective, but I wonder what this means for query engines that have already implemented comparators for blank nodes and IRIs.
So imagine my engine has a comparator for these term types, call it comparator A.
Now the spec will say that within triple terms we should compare using comparator B.
That means that my engine would have two, possibly conflicting, orderings. Meaning the ordering of

SELECT * {
    ?s ?p ?o .
    FILTER( isTriple(?o) ).
} ORDER BY (?o)

could differ from the ordering of:

SELECT * {
    ?s ?p ?o .
    FILTER (isTriple(?o)) .
    BIND ( subject(?o) as ?subj ) .
} ORDER BY ?subj ?o

My engine is now not consistent with itself. Not saying this is necessarily wrong, but it is something to be aware of.

Ignore what I said, I just read the spec changes and my assumptions based on your message were wrong.

@afs
Contributor

afs commented Oct 26, 2025

These are only defined within the scope of compareTripleTerm (an alternative would be to include dedicated entries for them in the operator mapping table, but that felt a bit too strong to me).

I agree this is not for the operator mapping table - otherwise it would allow the use in any expression, not limited to ORDER BY.

It is probably worth introducing a piece of terminology for this. Having a name means we can have a (short) section about it, define the function, and pass the function into the order process.

(It is sort-of there, not by name, in "15.1 ORDER BY".)

"pairwise comparison" vs "ordering comparison" -- feels a bit unnatural and clunky to repeat it over-and-over again.

"comparison" vs "collation" -- collation seems right although the dominant usage of "collation" is the natural language case.

"order collation"?
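A rough Python sketch of the "named function passed into the ordering process" idea, to make the separation concrete. All names (`Row`, `filter_less_than`, `iri_ordering`, `order_by`) are hypothetical, and rows are simplified to dictionaries mapping a variable name to an IRI string. Comparing IRIs as strings follows the "ordered by comparing them as literals with datatype xsd:string" wording in the diff above.

```python
from functools import cmp_to_key
from typing import Callable, Dict, List

Row = Dict[str, str]  # hypothetical solution mapping: variable name -> IRI string


def filter_less_than(a: str, b: str) -> bool:
    # Stand-in for "<" in expressions: IRIs are not in the operator mapping table.
    raise TypeError("'<' is not defined for IRIs")


def iri_ordering(a: str, b: str) -> int:
    # Ordering comparison used only by ORDER BY: IRIs compared as strings.
    return (a > b) - (a < b)


def order_by(
    rows: List[Row], var: str, ordering: Callable[[str, str], int]
) -> List[Row]:
    # The ordering function is a parameter of the sort, not an operator.
    return sorted(rows, key=cmp_to_key(lambda r1, r2: ordering(r1[var], r2[var])))


rows = [{"s": "http://example.org/b"}, {"s": "http://example.org/a"}]
ordered = order_by(rows, "s", iri_ordering)
print([r["s"] for r in ordered])
# ['http://example.org/a', 'http://example.org/b']
```

Keeping the ordering function separate means ORDER BY can sort IRIs while `FILTER(?x < ?y)` on IRIs still raises an error.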

@rubensworks
Member Author

@afs Just to make sure I understand your comment correctly:

I agree this is not for the operator mapping table - otherwise it would allow the use in any expression, not limited to ORDER BY.
It is probably worth introducing a piece of terminology for this. Having a name means we can have a (short) section about it, define the function, and pass the function into the order process.
(It is sort-of there, not by name, in "15.1 ORDER BY".)

15.1 ORDER BY explicitly refers to the "<" operator, which is extended towards triple terms within this PR.
What additional changes do you foresee exactly in ORDER BY?

"pairwise comparison" vs "ordering comparison" -- feels a bit unnatural and clunky to repeat it over-and-over again.
"comparison" vs "collation" -- collation seems right although the dominant usage of "collation" is the natural language case.
"order collation"?

Are you referring to the ORDER BY section here, or the new section for compareTripleTerm?
