Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
Jena 3.13.1
-
None
Description
It looks like Jena assumes that OpDistinct preserves order, but order is not preserved when spilling occurs. This is only a problem when the ARQ.spillToDiskThreshold setting has been configured.
Consider the following query:
PREFIX : <http://example.org/>
SELECT DISTINCT *
WHERE
{ ?x :p ?v }
ORDER BY ASC(?v)
Here's the query plan for this query:
(distinct
(order ((asc ?v))
(bgp (triple ?x <http://example.org/p> ?v))))
Jena executes the ORDER BY ASC(?v) before the DISTINCT, relying on the SPARQL requirement:
The order of Distinct(Ψ) must preserve any ordering given by OrderBy.
But, when spilling, QueryIterDistinct (which executes OpDistinct) creates a DistinctDataBag with a BindingComparator without any sort conditions. As a result, the DISTINCT operation sorts using "compareBindingsSyntactic()" and doesn't preserve the ORDER BY ASC(?v) requirement.
Note that some query plans will reorder the ORDER BY and DISTINCT, making things work correctly. For example, adding a LIMIT 5 clause to the query above results in a "(top (5 (asc ?v))" operation that doesn't suffer from the bug.
You can reproduce this by injecting the following into QueryTest.java then running the ARQTestRefEngine tests:
void runTestSelect(Query query, QueryExecution qe)
{
qe.getContext().set(ARQ.spillToDiskThreshold, 4); // add this line
...
For example, "ARQTestRefEngine -> Algebra optimizations -> QueryTest.opt-top-05" will fail with:
Query: PREFIX : <http://example/> SELECT DISTINCT * WHERE { ?x :p ?v } ORDER BY ASC(?v) LIMIT 5 Got: 5 -------------------------------- ------------- | x | v | ============= | :x1 | 1 | | :x2 | 2 | | :x10 | 10 | | :x11 | 11 | | :x12 | 12 | ------------- Expected: 5 ----------------------------- ------------ | x | v | ============ | :x1 | 1 | | :x2 | 2 | | :x3 | 3 | | :x3a | 3 | | :x4 | 4 | ------------ junit.framework.AssertionFailedError: Results do not match at junit.framework.Assert.fail(Assert.java:57) at junit.framework.Assert.assertTrue(Assert.java:22) at junit.framework.TestCase.assertTrue(TestCase.java:192) at org.apache.jena.sparql.junit.QueryTest.runTestSelect(QueryTest.java:284) at org.apache.jena.sparql.junit.QueryTest.runTestForReal(QueryTest.java:201) at org.apache.jena.sparql.junit.EarlTestCase.runTest(EarlTestCase.java:88) at junit.framework.TestCase.runBare(TestCase.java:141) at junit.framework.TestResult$1.protect(TestResult.java:122) at junit.framework.TestResult.runProtected(TestResult.java:142) at junit.framework.TestResult.run(TestResult.java:125) at junit.framework.TestCase.run(TestCase.java:129)
Attachments
Issue Links
- links to