Apache Arrow / ARROW-12253

[Rust] [Ballista] Implement scalable joins

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 5.0.0
    • Component/s: Rust - Ballista
    • Labels: None

    Description

      The main issue limiting scalability in Ballista today is that joins are implemented as hash joins in which each partition of the probe side causes the entire left (build) side to be loaded into memory.

      To make joins scalable, we need to hash-partition the left and right inputs on the join keys, so that rows with equal keys land in the same partition number on both sides and each pair of matching left and right partitions can be joined in parallel.

      There is already work underway in DataFusion to implement this, which we can leverage.
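
The partitioning idea described above can be illustrated with a minimal, standard-library-only sketch. It is not the Ballista or DataFusion implementation: the `Row` type and the functions `hash_partition`, `join_partition`, and `partitioned_join` are hypothetical names chosen for illustration, rows are simplified to `(key, payload)` pairs instead of Arrow record batches, and OS threads stand in for distributed executors.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical row type: (join key, payload). A real engine would use Arrow record batches.
type Row = (u64, String);

/// Split rows into `n` partitions by hashing the join key.
/// Rows with equal keys always land in the same partition index.
fn hash_partition(rows: Vec<Row>, n: usize) -> Vec<Vec<Row>> {
    assert!(n > 0, "partition count must be positive");
    let mut parts: Vec<Vec<Row>> = vec![Vec::new(); n];
    for row in rows {
        let mut hasher = DefaultHasher::new();
        row.0.hash(&mut hasher);
        let idx = (hasher.finish() % n as u64) as usize;
        parts[idx].push(row);
    }
    parts
}

/// Inner hash join of one left partition (build side) with one right partition (probe side).
/// Only this single left partition is held in the hash table, not the whole left input.
fn join_partition(left: &[Row], right: &[Row]) -> Vec<(u64, String, String)> {
    let mut table: HashMap<u64, Vec<&str>> = HashMap::new();
    for (k, v) in left {
        table.entry(*k).or_default().push(v.as_str());
    }
    let mut out = Vec::new();
    for (k, rv) in right {
        if let Some(lvs) = table.get(k) {
            for lv in lvs {
                out.push((*k, lv.to_string(), rv.clone()));
            }
        }
    }
    out
}

/// Hash-partition both inputs the same way, then join partition i of the left
/// with partition i of the right, one thread per partition pair.
fn partitioned_join(left: Vec<Row>, right: Vec<Row>, n: usize) -> Vec<(u64, String, String)> {
    let left_parts = hash_partition(left, n);
    let right_parts = hash_partition(right, n);
    std::thread::scope(|s| {
        let handles: Vec<_> = left_parts
            .iter()
            .zip(right_parts.iter())
            .map(|(l, r)| s.spawn(move || join_partition(l, r)))
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("join worker panicked"))
            .collect::<Vec<_>>()
    })
}
```

Because both sides use the same hash function and partition count, a key can only match within its own partition pair, so each worker needs memory proportional to one left partition rather than the entire left input. The order of the output rows depends on the partition assignment, so callers that need a stable order should sort the result.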

          People

            andygrove Andy Grove