Apache Arrow / ARROW-8901

[C++] Reduce number of take kernels

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: C++
    • Labels: None

    Description

      After ARROW-8792, we can observe that we are generating 312 take kernels:

      In [1]: import pyarrow.compute as pc

      In [2]: reg = pc.function_registry()

      In [3]: reg.get_function('take')
      Out[3]:
      arrow.compute.Function
      kind: vector
      num_kernels: 312

      You can see them all here: https://gist.github.com/wesm/c3085bf40fa2ee5e555204f8c65b4ad5

      Supporting only int16, int32, and int64 index types is probably sufficient for almost all value types; for other index types we can insert implicit casts (once implicit-cast insertion is implemented in the execution code). If we find a performance hot path that needs a specialization for another index type, we can always add it.
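
      To make the implicit-cast idea concrete, here is a minimal pure-Python sketch of how a dispatcher might map an arbitrary integer index type onto one of the three supported ones. Everything in it (resolve_index_type, the type tables, the overflow-check flag for uint64) is hypothetical and for illustration only; it is not Arrow's dispatch code.

```python
# Hypothetical sketch: these names are illustrative, not Arrow's API.

# Index types that would keep dedicated take kernels under the proposal.
SUPPORTED_INDEX_TYPES = ("int16", "int32", "int64")

# (is_signed, bit_width) for every integer type usable as an index.
INTEGER_TYPES = {
    "int8": (True, 8), "int16": (True, 16),
    "int32": (True, 32), "int64": (True, 64),
    "uint8": (False, 8), "uint16": (False, 16),
    "uint32": (False, 32), "uint64": (False, 64),
}


def resolve_index_type(index_type):
    """Return (kernel_index_type, needs_cast, needs_overflow_check)."""
    signed, bits = INTEGER_TYPES[index_type]
    if index_type in SUPPORTED_INDEX_TYPES:
        return index_type, False, False
    # A signed target holds every source value if it has at least as many
    # bits as a signed source, or strictly more bits than an unsigned one.
    required_bits = bits if signed else bits + 1
    for candidate in SUPPORTED_INDEX_TYPES:
        if INTEGER_TYPES[candidate][1] >= required_bits:
            return candidate, True, False
    # uint64 has no lossless signed target (assumption: fall back to int64
    # and validate the values at cast time).
    return "int64", True, True


for name in INTEGER_TYPES:
    print(name, "->", resolve_index_type(name))
```

      With a table like this, each value type needs three index specializations instead of one per integer type, at the cost of a cast for the less common index types.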

      Additionally, we should be able to collapse the date/time kernels, since take only moves memory and does not interpret the values.
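
      The "just moving memory" point can be sketched in pure Python: a take over fixed-width values only needs the byte width, so types with the same width can share one kernel. The function names and the BYTE_WIDTH table below are hypothetical, and the sketch ignores validity bitmaps (nulls) entirely.

```python
import struct

# Physical byte widths for a few Arrow fixed-width logical types.
BYTE_WIDTH = {
    "int32": 4, "date32": 4, "time32": 4,
    "int64": 8, "date64": 8, "time64": 8, "timestamp": 8, "duration": 8,
}


def take_fixed_width(data, byte_width, indices):
    """Gather byte_width-sized slots from data at the given indices."""
    length = len(data) // byte_width
    out = bytearray()
    for i in indices:
        if not 0 <= i < length:
            raise IndexError(f"take index {i} out of bounds")
        out += data[i * byte_width:(i + 1) * byte_width]
    return bytes(out)


def take(logical_type, data, indices):
    # Hypothetical dispatch: the logical type only selects a byte width,
    # so date32, time32, and int32 all reach the same code path.
    return take_fixed_width(data, BYTE_WIDTH[logical_type], indices)


# Four date32 values (days since the epoch) stored as little-endian int32.
days = struct.pack("<4i", 18000, 18001, 18002, 18003)
picked = take("date32", days, [3, 0])
print(struct.unpack("<2i", picked))  # (18003, 18000)
```

      Under this assumption the number of date/time kernels depends on the number of distinct byte widths rather than the number of logical types.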

            People

              Assignee: Wes McKinney (wesm)
              Reporter: Wes McKinney (wesm)