| Key | Parent | Summary | Assignee | Reporter | Status | Resolution |
|---|---|---|---|---|---|---|
| SPARK-28132 | SPARK-22216 | Update document type conversion for Pandas UDFs (pyarrow 0.13.0, pandas 0.24.2, Python 3.7) | Hyukjin Kwon | Hyukjin Kwon | Resolved | Fixed |
| SPARK-25798 | SPARK-22216 | Internally document type conversion between Pandas data and SQL types in Pandas UDFs | Hyukjin Kwon | Hyukjin Kwon | Resolved | Fixed |
| SPARK-25640 | SPARK-22216 | Clarify/Improve EvalType for grouped aggregate and window aggregate | Unassigned | Li Jin | Resolved | Incomplete |
| SPARK-25601 | SPARK-22216 | Register Grouped aggregate UDF Vectorized UDFs for SQL Statement | Hyukjin Kwon | Hyukjin Kwon | Resolved | Fixed |
| SPARK-25328 | SPARK-22216 | Add an example for having two columns as the grouping key in group aggregate pandas UDF | Hyukjin Kwon | Xiao Li | Resolved | Fixed |
| SPARK-25274 | SPARK-22216 | Improve toPandas with Arrow by sending out-of-order record batches | Bryan Cutler | Bryan Cutler | Resolved | Fixed |
| SPARK-25272 | SPARK-22216 | Show some kind of test output to indicate pyarrow tests were run | Bryan Cutler | Bryan Cutler | Resolved | Won't Fix |
| SPARK-24976 | SPARK-22216 | Allow None for Decimal type conversion (specific to PyArrow 0.9.0) | Hyukjin Kwon | Hyukjin Kwon | Resolved | Fixed |
| SPARK-24796 | SPARK-22216 | Support GROUPED_AGG_PANDAS_UDF in Pivot | Unassigned | Xiao Li | Resolved | Incomplete |
| SPARK-24624 | SPARK-22216 | Can not mix vectorized and non-vectorized UDFs | Li Jin | Xiao Li | Resolved | Fixed |
| SPARK-24561 | SPARK-22216 | User-defined window functions with pandas udf (bounded window) | Li Jin | Li Jin | Resolved | Fixed |
| SPARK-24334 | SPARK-22216 | Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator | Li Jin | Li Jin | Resolved | Fixed |
| SPARK-24324 | SPARK-22216 | Pandas Grouped Map UserDefinedFunction mixes column labels | Bryan Cutler | Cristian Consonni | Resolved | Fixed |
| SPARK-23800 | SPARK-22216 | Support partial function and callable object with pandas UDF | Unassigned | Li Jin | Resolved | Incomplete |
| SPARK-23633 | SPARK-22216 | Update Pandas UDFs section in sql-programming-guide | Li Jin | Li Jin | Resolved | Fixed |
| SPARK-23446 | SPARK-22216 | Explicitly check supported types in toPandas | Hyukjin Kwon | Hyukjin Kwon | Resolved | Fixed |
| SPARK-23401 | SPARK-22216 | Improve test cases for all supported types and unsupported types | Aleksandr Koriagin | Hyukjin Kwon | Resolved | Fixed |
| SPARK-23380 | SPARK-22216 | Adds a conf for Arrow fallback in toPandas/createDataFrame with Pandas DataFrame | Hyukjin Kwon | Hyukjin Kwon | Resolved | Fixed |
| SPARK-23352 | SPARK-22216 | Explicitly specify supported types in Pandas UDFs | Hyukjin Kwon | Hyukjin Kwon | Resolved | Fixed |
| SPARK-23334 | SPARK-22216 | Fix pandas_udf with return type StringType() to handle str type properly in Python 2. | Takuya Ueshin | Takuya Ueshin | Resolved | Fixed |
| SPARK-23314 | SPARK-22216 | Pandas grouped udf on dataset with timestamp column error | Li Jin | Felix Cheung | Resolved | Fixed |
| SPARK-23302 | SPARK-22216 | Refactor group aggregate pandas UDF to its own catalyst rules | Unassigned | Li Jin | Resolved | Incomplete |
| SPARK-23261 | SPARK-22216 | Rename Pandas UDFs | Xiao Li | Xiao Li | Resolved | Fixed |
| SPARK-23047 | SPARK-22216 | Change MapVector to NullableMapVector in ArrowColumnVector | Li Jin | Li Jin | Resolved | Fixed |
| SPARK-23030 | SPARK-22216 | Decrease memory consumption with toPandas() collection using Arrow | Bryan Cutler | Bryan Cutler | Resolved | Fixed |
| SPARK-23011 | SPARK-22216 | Support alternative function form with group aggregate pandas UDF | Li Jin | Li Jin | Resolved | Fixed |
| SPARK-22980 | SPARK-22216 | Using pandas_udf when inputs are not Pandas's Series or DataFrame | Hyukjin Kwon | Xiao Li | Resolved | Fixed |
| SPARK-22978 | SPARK-22216 | Register Scalar Vectorized UDFs for SQL Statement | Xiao Li | Xiao Li | Resolved | Fixed |
| SPARK-22930 | SPARK-22216 | Improve the description of Vectorized UDFs for non-deterministic cases | Li Jin | Xiao Li | Resolved | Fixed |
| SPARK-22409 | SPARK-22216 | Add function type argument to pandas_udf | Li Jin | Li Jin | Resolved | Fixed |
| SPARK-22324 | SPARK-22216 | Upgrade Arrow to version 0.8.0 and upgrade Netty to 4.1.17 | Bryan Cutler | Bryan Cutler | Resolved | Fixed |
| SPARK-22323 | SPARK-22216 | Design doc for different types of pandas_udf | Unassigned | Li Jin | Resolved | Fixed |
| SPARK-22274 | SPARK-22216 | User-defined aggregation functions with pandas udf | Li Jin | Li Jin | Resolved | Fixed |
| SPARK-22239 | SPARK-22216 | User-defined window functions with pandas udf (unbounded window) | Li Jin | Li Jin | Resolved | Fixed |
| SPARK-21404 | SPARK-22216 | Simple Vectorized Python UDFs using Arrow | Unassigned | Bryan Cutler | Closed | Fixed |
| SPARK-21190 | SPARK-22216 | SPIP: Vectorized UDFs in Python | Bryan Cutler | Reynold Xin | Resolved | Fixed |
| SPARK-20791 | SPARK-22216 | Use Apache Arrow to Improve Spark createDataFrame from Pandas.DataFrame | Bryan Cutler | Bryan Cutler | Resolved | Fixed |
| SPARK-20396 | SPARK-22216 | groupBy().apply() with pandas udf in pyspark | Li Jin | Li Jin | Resolved | Fixed |