TinkerPop / TINKERPOP-1585

OLAP dedup over non elements


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.3
    • Fix Version/s: 3.2.4
    • Component/s: hadoop, process
    • Labels: None

    Description

      OLAP dedup() is highly inefficient when it is fed non-element values, such as the ids in the queries below.

      In a customer project, a query similar to the following returned a result in slightly more than 6 seconds:

      persistedRDD.
        V().hasLabel("label1","label2").
        inE("edgeLabel1","edgeLabel2").outV().
        id().count()
      

      The same query with dedup() added:

      persistedRDD.
        V().hasLabel("label1","label2").
        inE("edgeLabel1","edgeLabel2").outV().
        id().dedup().count()
      

      ...took more than 120 seconds.
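
      To make the difference between the two queries concrete, here is a minimal plain-Java sketch of what each one computes. It does not use TinkerPop and is not how OLAP dedup() is implemented. The class name, method names, and sample ids are hypothetical, and an in-memory list stands in for the ids that reach the end of the traversal.

```java
import java.util.Arrays;
import java.util.List;

public class DedupCountSketch {

    // Analogous to id().count(): every id is counted, duplicates included.
    static long countAll(List<Long> ids) {
        return ids.stream().count();
    }

    // Analogous to id().dedup().count(): each distinct id is counted once.
    static long countDistinct(List<Long> ids) {
        return ids.stream().distinct().count();
    }

    public static void main(String[] args) {
        // Hypothetical out-vertex ids, one entry per traversed edge.
        List<Long> ids = Arrays.asList(1L, 1L, 2L, 3L, 3L, 3L);
        System.out.println(countAll(ids));      // prints 6
        System.out.println(countDistinct(ids)); // prints 3
    }
}
```

      The second query returns a different value (the number of distinct ids), so only the two running times are directly comparable, not the results.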

            People

              Assignee: okram Marko A. Rodriguez
              Reporter: dkuppitz Daniel Kuppitz
              Votes: 2
              Watchers: 4
