[ARROW-8901] [C++] Reduce number of take kernels - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.0.0
Component/s: C++
Labels: None

External issue URL:
https://github.com/apache/arrow/issues/25035

Description

After ARROW-8792, we can observe that we are generating 312 take kernels:

In [1]: import pyarrow.compute as pc

In [2]: reg = pc.function_registry()

In [3]: reg.get_function('take')
Out[3]:
arrow.compute.Function
kind: vector
num_kernels: 312

You can see them all here: https://gist.github.com/wesm/c3085bf40fa2ee5e555204f8c65b4ad5

It's probably going to be sufficient to only support int16, int32, and int64 index types for almost all types and insert implicit casts (once we implement implicit-cast-insertion into the execution code) for other index types. If we determine that there is some performance hot path where we need to specialize for other index types, then we can always do that.
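To make the index-type idea concrete, here is a rough pure-Python sketch of how dispatch with implicit casts might look. Everything in it is hypothetical and for illustration only: the names `SUPPORTED_INDEX_TYPES`, `dispatch_index_type`, and `take` are not Arrow APIs, and the promotion rule (smallest supported signed type that can hold the input type, with a checked cast for uint64) is an assumption rather than what the C++ code necessarily does.

```python
# Hypothetical sketch: these names are illustrative, not Arrow's C++ API.

# Index types that would get a dedicated take kernel.
SUPPORTED_INDEX_TYPES = ("int16", "int32", "int64")

# (is_signed, bit_width) for every integer index type a caller might pass.
INTEGER_TYPES = {
    "int8": (True, 8), "int16": (True, 16),
    "int32": (True, 32), "int64": (True, 64),
    "uint8": (False, 8), "uint16": (False, 16),
    "uint32": (False, 32), "uint64": (False, 64),
}


def dispatch_index_type(index_type):
    """Return the supported index type that `index_type` would be cast to."""
    signed, bits = INTEGER_TYPES[index_type]
    # An unsigned type needs one extra bit to fit in a signed type.
    needed_bits = bits if signed else bits + 1
    for candidate in SUPPORTED_INDEX_TYPES:
        if INTEGER_TYPES[candidate][1] >= needed_bits:
            return candidate
    # uint64 has no wider signed type. Assumption: fall back to int64 and
    # rely on a checked cast that rejects values above the int64 maximum.
    return "int64"


def take(values, indices, index_type):
    """Select values[i] for each index after the implicit index cast."""
    kernel_type = dispatch_index_type(index_type)
    limit = 2 ** (INTEGER_TYPES[kernel_type][1] - 1)
    for i in indices:
        # Checked cast: every index must be non-negative and fit the kernel type.
        if not 0 <= i < limit:
            raise ValueError(f"index {i} is out of range for {kernel_type}")
    return [values[i] for i in indices]
```

With this rule, int8 and uint8 indices would go through the int16 kernel, uint16 through int32, and uint32 and uint64 through int64, so only three index specializations are needed per value type.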

Additionally, we should be able to collapse the date/time kernels since we're just moving memory.
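As a sketch of why the date/time kernels can collapse: date32 and time32 are stored as 32-bit integers, while date64, time64, timestamp, and duration are stored as 64-bit integers, so a take only has to copy fixed-width slots without interpreting them. The pure-Python code below illustrates that idea on raw bytes. The names `PHYSICAL_BYTE_WIDTH`, `take_fixed_width`, and `take_temporal` are hypothetical, and little-endian packing is assumed for the example data.

```python
import struct

# Hypothetical sketch: the table and function names are illustrative.

# Byte width of the physical storage behind each Arrow date/time type.
PHYSICAL_BYTE_WIDTH = {
    "date32": 4, "time32": 4,
    "date64": 8, "time64": 8, "timestamp": 8, "duration": 8,
}


def take_fixed_width(buffer, indices, byte_width):
    """Copy the byte_width-sized slot at each index without decoding it."""
    out = bytearray()
    for i in indices:
        out += buffer[i * byte_width:(i + 1) * byte_width]
    return bytes(out)


def take_temporal(logical_type, buffer, indices):
    """Route every date/time type to the one kernel for its storage width."""
    return take_fixed_width(buffer, indices, PHYSICAL_BYTE_WIDTH[logical_type])


# Usage: three date32 values (days since the UNIX epoch), little-endian.
days = struct.pack("<3i", 18404, 18405, 18406)
taken = take_temporal("date32", days, [2, 0])
result = struct.unpack("<2i", taken)  # (18406, 18404)
```

The logical type is only needed to look up the width and to label the output, so six date/time types here share two code paths.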

Issue Links

is a child of

ARROW-8894 [C++] C++ array kernels framework and execution buildout (umbrella issue)

Open

is related to

ARROW-8919 [C++] Add "DispatchBest" APIs to compute::Function that selects a kernel that may require implicit casts to invoke

Resolved

ARROW-8970 [C++] Reduce shared library / binary code size (umbrella issue)

Resolved

People

Assignee: Wes McKinney

Reporter: Wes McKinney

Votes: 0

Watchers: 2

Dates

Created: 22/May/20 23:54

Updated: 11/Jan/23 08:03

Resolved: 12/Jun/20 18:49