Description
Currently, the spark-examples submodule builds two main JAR files. One is a JAR file without any shaded/relocated dependencies (target/spark-examples_<scala-version><spark-version>.jar), and the other is a JAR file containing all of the shaded dependencies (under target/scala-<scala-version>/spark-examples-<spark-version>-hadoop<hadoop-version>.jar). The shaded spark-examples JAR comes out to roughly 120MB and contains many duplicates of classes already found in the spark-assembly.
Most dependencies of the non-shaded spark-examples JAR are already provided by spark-assembly, but the JAR is still unable to find Guava at runtime. spark-assembly shades AND relocates Guava (to org.spark-project.guava), while no relocation occurs in the non-shaded spark-examples.jar, so its references to the original com.google.common classes cannot be resolved.
An example of one such failure (Spark as built by the Bigtop distribution):
$ spark-submit --deploy-mode client --class org.apache.spark.examples.sql.hive.HiveFromSpark /usr/lib/spark/lib/spark-examples.jar
...
...
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/io/Files
    at org.apache.spark.examples.sql.hive.HiveFromSpark$.<init>(HiveFromSpark.scala:35)
    at org.apache.spark.examples.sql.hive.HiveFromSpark$.<clinit>(HiveFromSpark.scala)
    at org.apache.spark.examples.sql.hive.HiveFromSpark.main(HiveFromSpark.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.google.common.io.Files
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 12 more
Command exiting with ret '1'
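One way to confirm the diagnosis is to probe the classpath for both the original and the relocated Guava class names. The following is only a minimal sketch: the class name GuavaClasspathCheck and its isLoadable helper are hypothetical, and the relocated name org.spark-project.guava.io.Files is an assumption derived from the relocation prefix mentioned above, not something verified against a particular spark-assembly build.

```java
// Hypothetical diagnostic helper, not part of Spark.
class GuavaClasspathCheck {

    // Returns true if the named class can be found by the given class loader.
    // The class is located but not initialized (second argument is false).
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = GuavaClasspathCheck.class.getClassLoader();
        String[] candidates = {
            // Name referenced by the non-shaded spark-examples JAR (from the stack trace).
            "com.google.common.io.Files",
            // Assumed relocated name inside spark-assembly.
            "org.spark-project.guava.io.Files"
        };
        for (String name : candidates) {
            System.out.println(name + " loadable: " + isLoadable(name, loader));
        }
    }
}
```

If the explanation above is right, running this with the same classpath as the failing job would be expected to report the com.google.common name as not loadable, while the relocated name may be loadable.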