Description
In SPARK-26560, we ensured that a Hive UDF loaded from a JAR is executed regardless of the current thread's context classloader.
cloud_fan pointed out another potential issue in the post-review of SPARK-26560. Quoting the comment:
Found a potential problem: here we call HiveSimpleUDF.dataType (which is a lazy val), to force to load the class with the corrected class loader.
However, if the expression gets transformed later, which copies HiveSimpleUDF, then calling HiveSimpleUDF.dataType will re-trigger the class loading, and at that time there is no guarantee that the corrected classloader is used.
I think we should materialize the loaded class in HiveSimpleUDF.
This JIRA issue tracks the effort of verifying the potential issue and fixing it.
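The concern in the quoted comment can be illustrated with a minimal sketch. It does not use Spark's actual `HiveSimpleUDF`; `LazyUdf`, `MaterializedUdf`, and the `loads` counter are hypothetical stand-ins, and `java.lang.String` substitutes for a UDF class from a JAR. It only shows that a `lazy val` is re-evaluated on each copy of a case class, whereas a class reference carried as a constructor field survives the copy.

```scala
import java.util.concurrent.atomic.AtomicInteger

object LazyValCopyDemo {
  // Counts how many times the simulated class loading runs.
  val loads = new AtomicInteger(0)

  // Hypothetical stand-in for an expression whose dataType is a lazy val
  // that triggers class loading on first access.
  case class LazyUdf(className: String) {
    lazy val dataType: String = {
      loads.incrementAndGet()
      // Resolved against whichever context classloader is current at the
      // moment of first access, not at construction time.
      val loader = Thread.currentThread().getContextClassLoader
      Class.forName(className, false, loader).getName
    }
  }

  // Hypothetical alternative: the already-loaded class is a constructor
  // field, so copy() carries it along and no reloading is needed.
  case class MaterializedUdf(className: String, loaded: Class[_]) {
    def dataType: String = loaded.getName
  }

  def main(args: Array[String]): Unit = {
    val original = LazyUdf("java.lang.String")
    original.dataType          // first load
    original.dataType          // cached on this instance, no reload
    val copied = original.copy()
    copied.dataType            // the copy has a fresh lazy val: loads again
    println(s"loads performed: ${loads.get}")

    val materialized = MaterializedUdf("java.lang.String", classOf[String])
    println(materialized.copy().loaded eq materialized.loaded)
  }
}
```

In this sketch, each copy of `LazyUdf` repeats the load under whatever classloader happens to be current, which is the situation the comment warns about. Whether the real expression behaves this way is exactly what this issue is meant to verify.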
Issue Links
- relates to: SPARK-26560 Repeating select on udf function throws analysis exception - function not registered (Resolved)