When the VersionSuite test runs from sql/hive, it downloads several different versions of Hive.
Unfortunately, the IsolatedClientClassloader (which is used by the VersionSuite) resolves them from a hard-coded, fixed set of repositories:
    val classpath = quietly {
      SparkSubmitUtils.resolveMavenCoordinates(
        hiveArtifacts.mkString(","),
        SparkSubmitUtils.buildIvySettings(
          Some(""),
          ivyPath),
        exclusions = version.exclusions)
    }
The problem is with the hard-coded repositories:
1. It is hard to run the unit tests in an environment where only a single internal Maven repository is available (and central/datanucleus are not).
2. It is impossible to run the unit tests against custom-built Hive/Hadoop artifacts (which are not available from the central repository).
VersionSuite already has a dedicated environment variable, SPARK_VERSIONS_SUITE_IVY_PATH, for defining a custom local repository to use as the Ivy cache.
I suggest adding another environment variable, SPARK_VERSIONS_SUITE_IVY_REPOSITORIES, read in HiveClientBuilder.scala, so that additional remote repositories can be configured when testing the different Hive versions.
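A minimal sketch of how the proposed variable could be parsed is below. It is not the actual patch. The object and method names (IvyRepositories, parse, remoteRepos) are hypothetical, and it assumes that the first argument of SparkSubmitUtils.buildIvySettings accepts a comma-separated list of repository URLs wrapped in an Option, as the call quoted above suggests.

```scala
// Hypothetical helper, for illustration only; not part of Spark.
object IvyRepositories {
  // Variable name taken from the proposal in this issue.
  val EnvName = "SPARK_VERSIONS_SUITE_IVY_REPOSITORIES"

  /** Splits a comma-separated list of repository URLs, trimming and dropping blank entries. */
  def parse(raw: Option[String]): Seq[String] =
    raw.toSeq.flatMap(_.split(",").toSeq).map(_.trim).filter(_.nonEmpty)

  /**
   * Returns the repositories as a single comma-separated string,
   * or None when the variable is unset or contains no usable entry.
   * Assumption: this value would be passed as the first argument of buildIvySettings.
   */
  def remoteRepos(env: Map[String, String] = sys.env): Option[String] = {
    val repos = parse(env.get(EnvName))
    if (repos.isEmpty) None else Some(repos.mkString(","))
  }
}
```

With a helper like this, HiveClientBuilder.scala could hand IvyRepositories.remoteRepos() to the Ivy settings, so an unset variable keeps the current behaviour and a set one adds the internal or custom repositories.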
Issue Links
- duplicates
SPARK-19458 loading hive jars from the local repo which has already downloaded
- Resolved