[TINKERPOP-1397] StarVertex self edge has buggy interaction with graph filters - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 3.1.3
Fix Version/s: 3.1.4, 3.2.2
Component/s: process
Labels:
None

Description

When StarVertex adds a self-loop, it adds an instance of concrete type StarOutEdge to both its outEdges map and its inEdges map.

https://github.com/apache/tinkerpop/blob/8f7218d53a31cf41f4a0269d64ac1c27dfc0907a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/util/star/StarGraph.java#L321-L330

(line 329 adds a StarOutEdge to the inEdges map)

Having a StarOutEdge in the inEdges map mostly doesn't matter. However, this does matter in the applyGraphFilter method. This method uses instanceof StarOutEdge to guess whether an edge's original direction was in or out:

https://github.com/apache/tinkerpop/blob/8f7218d53a31cf41f4a0269d64ac1c27dfc0907a/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/util/star/StarGraph.java#L470

If you trace applyGraphFilter through, you see that a StarVertex with a self-loop can pop out of applyGraphFilter with two out edges corresponding to the self loop and no in edges. This leads to weird traversal results. One way to trigger this is to write a GraphFilterAware InputRDD that produces StarVertex instances and then run a g.V()....inE("somelabel"), so that TP pushes down the "somelabel" constraint, which is how I got here.

I think addEdge should continue putting a StarOutEdge into outEdges, like it already does. However, it should put a StarInEdge into inEdges. This would increase the object overhead associated with StarVertices that have self loops (two edge objs instead of one). If we wanted to be obsessive about optimizing this case we could probably pervert the inEdge/outEdge datastructure contents to do it, but IMHO that's not worth it. Actually, I'm no longer convinced that's safe, since I think it would alter some of StarVertex's other semantics. For one, I think retrieving an edge and setting a property on it would probably break.

I'll make a PR soon.

I don't know all of the versions this affects, but I do know it affects 3.2.1.

Attachments

Issue Links

links to

GitHub Pull Request #377

GitHub Pull Request #378

Activity

Ascending order - Click to sort in descending order

Dan LaRocque added a comment - 04/Aug/16 06:00

I opened https://github.com/apache/tinkerpop/pull/372.

This solves my immediate problem. It also preserves the following semantics:

StarVertex self edges are still represented by a single object
StarVertex self edges still only allocate a single ID (per previous point)

Maybe broken semantics that I missed will come up in review. Right now, I can only see the following risk:

PR introduces new edge class StarSelfEdge. Existing classes StarOutEdge and StarInEdge were already public. Any third-party code that assumes all star edges were of type StarOutEdge or StarInEdge (and nothing else) will break when it encounters StarSelfEdge. I don't know of any such code first-hand, but it's clearly theoretically possible that it exists out there somewhere, since StarOut/InEdge are public.

I ran mvn -pl tinkergraph-gremlin clean verify, which I think indirectly includes StarGraphTest.

Dan LaRocque added a comment - 04/Aug/16 06:00 I opened https://github.com/apache/tinkerpop/pull/372 . This solves my immediate problem. It also preserves the following semantics: StarVertex self edges are still represented by a single object StarVertex self edges still only allocate a single ID (per previous point) Maybe broken semantics that I missed will come up in review. Right now, I can only see the following risk: PR introduces new edge class StarSelfEdge . Existing classes StarOutEdge and StarInEdge were already public. Any third-party code that assumes all star edges were of type StarOutEdge or StarInEdge (and nothing else) will break when it encounters StarSelfEdge . I don't know of any such code first-hand, but it's clearly theoretically possible that it exists out there somewhere, since StarOut/InEdge are public. I ran mvn -pl tinkergraph-gremlin clean verify , which I think indirectly includes StarGraphTest.

ASF GitHub Bot added a comment - 11/Aug/16 00:01

GitHub user dalaro opened a pull request:

https://github.com/apache/tinkerpop/pull/377

~~TINKERPOP-1397~~ StarGraph filtering with self edges

See:

old PR #372
JIRA issue https://issues.apache.org/jira/browse/TINKERPOP-1397

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dalaro/incubator-tinkerpop ~~TINKERPOP-1397~~

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/377.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #377

commit 0022b7f6be25eb7d3c778b137beb6e8a7d2784ca
Author: Dan LaRocque <dalaro@hopcount.org>
Date: 2016-08-10T22:52:13Z

~~TINKERPOP-1397~~ Fix StarGraph.addEdge

For self-loops, StarGraph.addEdge used to put a single StarOutEdge
into both its inEdges and outEdges maps, potentially causing problems
in applyGraphFilter.

This change makes StarGraph.addEdge put the appropriate type of edge
(Star(Out/In)Edge) in the associated map. The IDs for each edge
instance are kept in agreement.

This change is @okram's, who suggested it in PR #372. I merely
reviewed it and added a couple of comments.

commit 7842c4e7e2e3c2b33fc1bca7a8a80e165080bc59
Author: Dan LaRocque <dalaro@hopcount.org>
Date: 2016-08-10T22:59:54Z

~~TINKERPOP-1397~~ Add StarGraph bothE filtering test

commit b4aaa69b9c10a61c7ab2f7476c90e4728ef032e8
Author: Dan LaRocque <dalaro@hopcount.org>
Date: 2016-08-10T23:24:10Z

Merge branch '~~TINKERPOP-1397~~-tp31' into ~~TINKERPOP-1397~~

ASF GitHub Bot added a comment - 11/Aug/16 00:01 GitHub user dalaro opened a pull request: https://github.com/apache/tinkerpop/pull/377 TINKERPOP-1397 StarGraph filtering with self edges See: old PR #372 JIRA issue https://issues.apache.org/jira/browse/TINKERPOP-1397 You can merge this pull request into a Git repository by running: $ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1397 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/377.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #377 commit 0022b7f6be25eb7d3c778b137beb6e8a7d2784ca Author: Dan LaRocque <dalaro@hopcount.org> Date: 2016-08-10T22:52:13Z TINKERPOP-1397 Fix StarGraph.addEdge For self-loops, StarGraph.addEdge used to put a single StarOutEdge into both its inEdges and outEdges maps, potentially causing problems in applyGraphFilter. This change makes StarGraph.addEdge put the appropriate type of edge (Star(Out/In)Edge) in the associated map. The IDs for each edge instance are kept in agreement. This change is @okram's, who suggested it in PR #372. I merely reviewed it and added a couple of comments. commit 7842c4e7e2e3c2b33fc1bca7a8a80e165080bc59 Author: Dan LaRocque <dalaro@hopcount.org> Date: 2016-08-10T22:59:54Z TINKERPOP-1397 Add StarGraph bothE filtering test commit b4aaa69b9c10a61c7ab2f7476c90e4728ef032e8 Author: Dan LaRocque <dalaro@hopcount.org> Date: 2016-08-10T23:24:10Z Merge branch ' TINKERPOP-1397 -tp31' into TINKERPOP-1397

ASF GitHub Bot added a comment - 11/Aug/16 00:03

GitHub user dalaro opened a pull request:

https://github.com/apache/tinkerpop/pull/378

~~TINKERPOP-1397~~ Fix StarGraph.addEdge

See:

old PR #372
JIRA issue https://issues.apache.org/jira/browse/TINKERPOP-1397

As far as I can tell, GraphFilter did not exist in 3.1, and it may not even be possible to induce the buggy traversal behavior describe in #372 (which was all based on master / 3.2). The underlying datastructure corruption – putting a StarOutEdge into the `inEdges` map – does exist on 3.1, and @okram's `addEdge` fix applies cleanly on 3.1, but without `applyGraphFilter`, I'm not sure it even matters for correctness on 3.1.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dalaro/incubator-tinkerpop ~~TINKERPOP-1397~~-tp31

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/378.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #378

commit 0022b7f6be25eb7d3c778b137beb6e8a7d2784ca
Author: Dan LaRocque <dalaro@hopcount.org>
Date: 2016-08-10T22:52:13Z

~~TINKERPOP-1397~~ Fix StarGraph.addEdge

For self-loops, StarGraph.addEdge used to put a single StarOutEdge
into both its inEdges and outEdges maps, potentially causing problems
in applyGraphFilter.

This change makes StarGraph.addEdge put the appropriate type of edge
(Star(Out/In)Edge) in the associated map. The IDs for each edge
instance are kept in agreement.

This change is @okram's, who suggested it in PR #372. I merely
reviewed it and added a couple of comments.

ASF GitHub Bot added a comment - 11/Aug/16 00:03 GitHub user dalaro opened a pull request: https://github.com/apache/tinkerpop/pull/378 TINKERPOP-1397 Fix StarGraph.addEdge See: old PR #372 JIRA issue https://issues.apache.org/jira/browse/TINKERPOP-1397 As far as I can tell, GraphFilter did not exist in 3.1, and it may not even be possible to induce the buggy traversal behavior describe in #372 (which was all based on master / 3.2). The underlying datastructure corruption – putting a StarOutEdge into the `inEdges` map – does exist on 3.1, and @okram's `addEdge` fix applies cleanly on 3.1, but without `applyGraphFilter`, I'm not sure it even matters for correctness on 3.1. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1397 -tp31 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/378.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #378 commit 0022b7f6be25eb7d3c778b137beb6e8a7d2784ca Author: Dan LaRocque <dalaro@hopcount.org> Date: 2016-08-10T22:52:13Z TINKERPOP-1397 Fix StarGraph.addEdge For self-loops, StarGraph.addEdge used to put a single StarOutEdge into both its inEdges and outEdges maps, potentially causing problems in applyGraphFilter. This change makes StarGraph.addEdge put the appropriate type of edge (Star(Out/In)Edge) in the associated map. The IDs for each edge instance are kept in agreement. This change is @okram's, who suggested it in PR #372. I merely reviewed it and added a couple of comments.

Dan LaRocque added a comment - 11/Aug/16 02:10

A lot of discussion happened recently in the old github PR for this issue: https://github.com/apache/tinkerpop/pull/372.

However, there are two shortcomings in that PR:

I misnamed the source branch
I targeted master, but Stephen rightly pointed out that this could affect tp31

So, I made two new PRs:

https://github.com/apache/tinkerpop/pull/377 (master)
https://github.com/apache/tinkerpop/pull/378 (tp31)

Dan LaRocque added a comment - 11/Aug/16 02:10 A lot of discussion happened recently in the old github PR for this issue: https://github.com/apache/tinkerpop/pull/372 . However, there are two shortcomings in that PR: I misnamed the source branch I targeted master, but Stephen rightly pointed out that this could affect tp31 So, I made two new PRs: https://github.com/apache/tinkerpop/pull/377 (master) https://github.com/apache/tinkerpop/pull/378 (tp31)

ASF GitHub Bot added a comment - 11/Aug/16 14:55

Github user okram commented on the issue:

https://github.com/apache/tinkerpop/pull/377

This solution makes sense to me – no new `SelfEdge` class and we now have a test case. Thanks.

VOTE +1.

ASF GitHub Bot added a comment - 11/Aug/16 14:55 Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/377 This solution makes sense to me – no new `SelfEdge` class and we now have a test case. Thanks. VOTE +1.

ASF GitHub Bot added a comment - 11/Aug/16 14:56

Github user okram commented on the issue:

https://github.com/apache/tinkerpop/pull/378

VOTE +1.

ASF GitHub Bot added a comment - 11/Aug/16 14:56 Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/378 VOTE +1.

ASF GitHub Bot added a comment - 11/Aug/16 15:55

Github user dkuppitz commented on the issue:

https://github.com/apache/tinkerpop/pull/378

VOTE: +1

ASF GitHub Bot added a comment - 11/Aug/16 15:55 Github user dkuppitz commented on the issue: https://github.com/apache/tinkerpop/pull/378 VOTE: +1

ASF GitHub Bot added a comment - 11/Aug/16 15:56

Github user dkuppitz commented on the issue: