Description
Vectorized (pandas) UDFs can be registered and then used in SQL statements.
For example,
>>> import random
>>> from pyspark.sql.types import IntegerType
>>> from pyspark.sql.functions import pandas_udf
>>> random_pandas_udf = pandas_udf(
...     lambda x: random.randint(0, 100) + x, IntegerType()
... ).asNondeterministic()  # doctest: +SKIP
>>> _ = spark.catalog.registerFunction(
...     "random_pandas_udf", random_pandas_udf, IntegerType())  # doctest: +SKIP
>>> spark.sql("SELECT random_pandas_udf(2)").collect()  # doctest: +SKIP
[Row(random_pandas_udf(2)=84)]
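A scalar pandas UDF is called with a whole batch of values as a pandas.Series rather than once per row. The sketch below is not part of this change. It only illustrates, with plain pandas and no Spark session, what the lambda in the example above does to one batch. The name `add_random` and the three-row batch are hypothetical.

```python
import random

import pandas as pd


def add_random(x: pd.Series) -> pd.Series:
    # Same body as the lambda in the example above.
    # random.randint runs once per call, so every row in the batch
    # receives the same offset.
    return random.randint(0, 100) + x


batch = pd.Series([2, 2, 2])  # hypothetical batch of input values
result = add_random(batch)

# Difference between output and input for each row of the batch.
offsets = result - batch
print(offsets.nunique())  # 1: a single offset shared by the whole batch
```

Under this reading, the randomness in the example is per batch, not per row. A row-wise random offset would need a vectorized source of random numbers of the same length as the input Series.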