Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
In the wake of ARROW-8792, this issue serves as an umbrella for follow-up work and the associated "buildout", which includes things like:
- Implementation of many new function types and adding new kernel cases to existing functions
- Adding implicit casting functionality to function execution
- Creation of "bound" physical array expressions and execution thereof
- Pipeline execution (executing multiple kernels while eliminating temporary allocation)
- Parallel execution of scalar and aggregate kernels (including parallel execution of pipelined kernels)
There are quite a few existing JIRAs in the project that I'll attach to this issue, and I'll open more issues as things occur to me to help organize the work.
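To make the "pipeline execution" item concrete, here is a minimal sketch of the general idea: run several scalar kernels over small chunks and reuse two scratch buffers, so no full-length temporary array is allocated between kernels. Everything in it is an assumption for illustration only. `ScalarKernel` and `RunPipelined` are hypothetical names, plain `std::vector<int64_t>` stands in for Arrow arrays, and none of this reflects the actual arrow::compute API or how the project will implement pipelining.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical kernel signature (NOT Arrow's API): read n values, write n values.
using ScalarKernel =
    std::function<void(const int64_t* in, int64_t* out, std::size_t n)>;

// Hypothetical helper: apply `kernels` in order, one chunk at a time.
// Intermediate results go into two chunk-sized scratch buffers that are
// reused for every chunk; only the final kernel writes into the output.
std::vector<int64_t> RunPipelined(const std::vector<int64_t>& input,
                                  const std::vector<ScalarKernel>& kernels,
                                  std::size_t chunk_size = 1024) {
  assert(chunk_size > 0);
  std::vector<int64_t> output(input.size());
  std::vector<int64_t> scratch_a(chunk_size), scratch_b(chunk_size);

  for (std::size_t offset = 0; offset < input.size(); offset += chunk_size) {
    const std::size_t n = std::min(chunk_size, input.size() - offset);
    const int64_t* src = input.data() + offset;
    int64_t* final_dst = output.data() + offset;

    if (kernels.empty()) {  // No kernels: the pipeline is the identity.
      std::copy(src, src + n, final_dst);
      continue;
    }
    for (std::size_t k = 0; k < kernels.size(); ++k) {
      const bool last = (k + 1 == kernels.size());
      // Alternate between the scratch buffers so a kernel never reads
      // from the buffer it is writing to.
      int64_t* dst = last ? final_dst
                          : (k % 2 == 0 ? scratch_a.data() : scratch_b.data());
      kernels[k](src, dst, n);
      src = dst;
    }
  }
  return output;
}
```

Calling `RunPipelined` with an "add one" kernel followed by a "multiply by two" kernel computes `(x + 1) * 2` for each element while only ever holding two chunk-sized temporaries. Because chunks are independent, the same loop could in principle be split across threads, which is where the parallel-execution item would come in.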
Issue Links
- is a parent of
  - ARROW-555 [C++] String algorithm library for StringArray/BinaryArray (Open)
  - ARROW-1489 [C++] Add casting option to set unsafe casts to null rather than some garbage value (Open)
  - ARROW-1574 [C++] Implement kernel function that converts a dense array to dictionary given known dictionary (Open)
  - ARROW-3120 [C++] Parallelize execution of ScalarAggregateFunction (Open)
  - ARROW-3978 [C++] Implement hashing, dictionary-encoding for StructArray (Open)
  - ARROW-4097 [C++] Add function to "conform" a dictionary array to a target new dictionary (Open)
  - ARROW-11090 [C++] Support temporal arithmetic ({time,date}{32,64}, timestamp, interval) (Open)
  - ARROW-1569 [C++] Kernel functions for determining monotonicity (ascending or descending) for well-ordered types (In Progress)
  - ARROW-1567 [C++] Implement "fill null" kernels that replace null values with some scalar replacement value (Resolved)
  - ARROW-1568 [C++] Implement "drop null" kernels that return array without nulls (Resolved)
  - ARROW-1699 [C++] Forward, backward fill kernel functions (Resolved)
  - ARROW-1846 [C++] Implement "any" reduction kernel for boolean data (Resolved)
  - ARROW-1888 [C++] Implement casts from one struct type to another (with same field names and number of fields) (Resolved)
  - ARROW-6978 [R] Add bindings for sum and mean compute kernels (Resolved)
  - ARROW-7010 [C++] Support lossy casts from decimal128 to float32 and float64/double (Resolved)
  - ARROW-7011 [C++] Implement casts from float/double to decimal128 (Resolved)
  - ARROW-8922 [C++] Implement example string scalar kernel function to assist with string kernels buildout per ARROW-555 (Resolved)
  - ARROW-8934 [C++] Add timestamp subtract kernel aliased to int64 subtract implementation (Resolved)
  - ARROW-8937 [C++] Add "parse_strptime" function for string to timestamp conversions using the kernels framework (Resolved)
  - ARROW-8938 [R] Provide binding for arrow::compute::CallFunction (Resolved)
  - ARROW-13174 [C++][Compute] Add strftime kernel (Resolved)
  - ARROW-7009 [C++] Refactor filter/take kernels to use Datum instead of overloads (Resolved)
  - ARROW-3122 [C++] Incremental Variance, Standard Deviation aggregators (Closed)
  - ARROW-3802 [C++] Cast to/from halffloat not implemented (Open)
  - ARROW-5005 [C++] Implement support for using selection vectors in scalar aggregate function kernels (Open)
  - ARROW-5530 [C++] Add options to ValueCount/Unique/DictEncode kernel to toggle null behavior (Open)
  - ARROW-5890 [C++][Python] Support ExtensionType arrays in more kernels (Open)
  - ARROW-7245 [C++] Allow automatic String -> LargeString promotions when concatenating tables (Open)
  - ARROW-8897 [C++] Determine strategy for propagating failures in initializing built-in function registry in arrow/compute (Open)
  - ARROW-8898 [C++] Determine desirable maximum length for ExecBatch in pipelined and parallel execution of kernels (Open)
  - ARROW-8921 [C++] Add "TypeResolver" class interface to replace current OutputType::Resolver pattern (Open)
  - ARROW-8936 [C++] Parallelize execution of arrow::compute::ScalarFunction (Open)
  - ARROW-9003 [C++] Add VectorFunction wrapping arrow::Concatenate (Open)
  - ARROW-9006 [C++] Deprecate or remove Scalar::Parse and Scalar::CastTo (Open)
  - ARROW-12748 [C++] Arithmetic kernels for numeric arrays (Open)
  - ARROW-13121 [C++][Compute] Extract preallocation logic from KernelExecutor (Open)
  - ARROW-13122 [C++][Compute] Dispatch* should examine options as well as input types (Open)
  - ARROW-13339 [C++] Implement hash_aggregate kernels (umbrella issue) (Open)
  - ARROW-971 [C++/Python] Implement Array.isvalid/notnull/isnull as scalar functions (Resolved)
  - ARROW-2665 [Python/C++] Add index() method to find first occurrence of Python scalar (Resolved)
  - ARROW-5854 [Python] Expose compare kernels on Array class (Resolved)
  - ARROW-6456 [C++] Possible to reduce object code generated in compute/kernels/take.cc? (Resolved)
  - ARROW-6982 [R] Add bindings for compare and boolean kernels (Resolved)
  - ARROW-7017 [C++] Refactor AddKernel to support other operations and types (Resolved)
  - ARROW-7179 [C++][Compute] Consolidate fill_null and coalesce (Resolved)
  - ARROW-8025 [C++] Implement cast to Binary and FixedSizeBinary (Resolved)
  - ARROW-8500 [C++] Use selection vectors in Filter implementation for record batches, tables (Resolved)
  - ARROW-8876 [C++] Implement casts from date types to Timestamp (Resolved)
  - ARROW-8895 [C++] Add C++ unit tests for filter and take functions on temporal type inputs, including timestamps (Resolved)
  - ARROW-8896 [C++] Reimplement dictionary unpacking in Cast kernels using Take (Resolved)
  - ARROW-8901 [C++] Reduce number of take kernels (Resolved)
  - ARROW-8903 [C++] Implement optimized "unsafe take" for use with selection vectors for kernel execution (Resolved)
  - ARROW-8917 [C++][Compute] Formalize "metafunction" concept (Resolved)
  - ARROW-8918 [C++] Add cast "metafunction" to FunctionRegistry that addresses dispatching to appropriate type-specific CastFunction (Resolved)
  - ARROW-8919 [C++] Add "DispatchBest" APIs to compute::Function that selects a kernel that may require implicit casts to invoke (Resolved)
  - ARROW-8923 [C++] Improve usability of arrow::compute::CallFunction by moving ExecContext* argument to end and adding default (Resolved)
  - ARROW-8926 [C++] Improve docstrings in new public APIs in arrow/compute and fix miscellaneous typos (Resolved)
  - ARROW-8928 [C++] Measure microperformance associated with ExecBatchIterator (Resolved)
  - ARROW-8929 [C++] Change compute::Arity:VarArgs min_args default to 0 (Resolved)
  - ARROW-8933 [C++] Reduce generated code in vector_hash.cc (Resolved)
  - ARROW-8969 [C++] Reduce generated code in compute/kernels/scalar_compare.cc (Resolved)
  - ARROW-8976 [C++] compute::CallFunction can't Filter/Take with ChunkedArray (Resolved)
  - ARROW-9029 [C++] Implement BitBlockCounter interface for blockwise popcounts of validity bitmaps (Resolved)
  - ARROW-9045 [C++] Improve and expand Take/Filter benchmarks (Resolved)
  - ARROW-9055 [C++] Add sum/mean kernels for Boolean type (Resolved)
  - ARROW-9056 [C++] Support scalar aggregation over scalars (Resolved)
  - ARROW-11928 [C++][Compute] Add ExecNode hierarchy (Resolved)
  - ARROW-11929 [C++][Compute] Promote Expression to the compute namespace (Resolved)
  - ARROW-11930 [C++][Dataset][Compute] Refactor Dataset scans to use an ExecNode graph (Resolved)
  - ARROW-12499 [C++][Compute][R] Add ScalarAggregateOptions to Any and All kernels (Resolved)
  - ARROW-12980 [C++] Kernels to extract datetime components should be timezone aware (Resolved)
  - ARROW-13005 [C++] Support filter/take for union data type. (Resolved)
  - ARROW-13025 [C++][Compute] Enhance FunctionOptions with equality, debug representability, and serializability (Resolved)
  - ARROW-13033 [C++] Kernel to localize naive timestamps to a timezone (preserving clock-time) (Resolved)
  - ARROW-13054 [C++] Add option to specify the first day of the week for the "day_of_week" temporal kernel (Resolved)
  - ARROW-13064 [C++] Add a general "if, ifelse, ..., else" kernel ("CASE WHEN") (Resolved)
  - ARROW-13220 [C++] Add a 'choose' kernel/scalar compute function (Resolved)
  - ARROW-13222 [C++] Support variable-width types in case_when function (Resolved)
  - ARROW-13548 [C++] Implement datediff kernel (Resolved)
  - ARROW-13549 [C++] Implement timestamp to date/time cast that extracts value (Resolved)
  - ARROW-6071 [C++] Implement casting Binary <-> LargeBinary (Resolved)
  - ARROW-6072 [C++] Implement casting List <-> LargeList (Resolved)
  - ARROW-6122 [C++] SortToIndices kernel must support FixedSizeBinary (Closed)
  - ARROW-6123 [C++] ArgSort kernel should not materialize the output internal (Closed)
  - ARROW-6923 [C++] Option for Filter kernel how to handle nulls in the selection vector (Closed)
  - ARROW-7083 [C++] Determine the feasibility and build a prototype to replace compute/kernels with gandiva kernels (Closed)
  - ARROW-8916 [Python] Add relevant glue for implementing each kind of FunctionOptions (Closed)
  - ARROW-12053 [C++] Implement aggregate compute functions for decimal datatypes (Closed)
  - ARROW-6974 [C++] Refactor temporal casts to work with Scalar inputs (Closed)
- is related to
  - ARROW-9042 [C++] Add Subtract and Multiply arithmetic kernels with wrap-around behavior (Resolved)
  - ARROW-8939 [C++] Arrow-native C++ Data Frame-style programming interface for analytics (umbrella issue) (Open)
  - ARROW-3520 [C++] Implement List Flatten kernel (Resolved)
  - ARROW-4333 [C++] Sketch out design for kernels and "query" execution in compute layer (Resolved)
  - ARROW-5760 [C++] Optimize Take implementation (Resolved)
  - ARROW-8961 [C++] Add utf8proc library to toolchain (Resolved)
  - ARROW-8891 [C++] Split non-cast compute kernels into a separate shared library (Open)
  - ARROW-7871 [Python] Expose more compute kernels (Resolved)
  - ARROW-8966 [C++] Move arrow::ArrayData to a separate header file (Resolved)
  - ARROW-8989 [C++] Document available functions in compute::FunctionRegistry (Resolved)
  - ARROW-9075 [C++] Optimize Filter implementation (Resolved)
  - ARROW-8772 [C++] Expand SumKernel benchmark to more types (Resolved)
  - ARROW-6990 [C++] Support casting between decimal types with compatible precision/scales (Closed)
  - ARROW-8214 [C++] Flatbuffers based serialization protocol for Expressions (Closed)
  - ARROW-8905 [C++] Collapse Take APIs from 8 to 1 or 2 (Closed)
  - ARROW-6775 [C++] [Python] Proposal for several Array utility functions (Resolved)
- relates to
  - ARROW-8792 [C++] Improved declarative compute function / kernel development framework, normalize calling conventions (Resolved)