Description
I am running Spark code that reads data from Phoenix, on a cluster with Spark 2.3.0 installed. When I run it in IntelliJ, it fails with this error:
java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
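For context: in Spark 2.x, DataFrame is only a compile-time type alias (type DataFrame = Dataset[Row], declared in the org.apache.spark.sql package object), so no org.apache.spark.sql.DataFrame class file exists at runtime. A library compiled against Spark 1.x, where DataFrame was a concrete class, will therefore fail with exactly this error. A minimal sketch (my own illustration, not from the failing project) showing the missing class on a Spark 2.x classpath:

object AliasCheck {
  def main(args: Array[String]): Unit = {
    // Compiles: DataFrame is a type alias resolved at compile time only.
    val df: org.apache.spark.sql.DataFrame = null

    // Succeeds: Dataset is a real class shipped in spark-sql 2.x.
    println(Class.forName("org.apache.spark.sql.Dataset"))

    // Throws ClassNotFoundException on a Spark 2.x classpath: the alias
    // leaves no DataFrame class file behind.
    println(Class.forName("org.apache.spark.sql.DataFrame"))
  }
}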
Spark code:
package com.ahct.hbase

import org.apache.spark.sql._

object Test1 {
  def main(args: Array[String]): Unit = {
    val zkUrl = "192.168.240.101:2181"
    val spark = SparkSession.builder()
      .appName("SparkPhoenixTest1")
      .master("local[2]")
      .getOrCreate()
    val df = spark.read.format("org.apache.phoenix.spark")
      .option("zkurl", zkUrl)
      .option("table", "\"bigdata\".\"tbs1\"")
      .load()
    df.show()
  }
}
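As a cross-check, here is a sketch (not something from the failing run; it assumes the Phoenix JDBC driver that phoenix-spark already pulls in) that reads the same table through Spark's generic JDBC source instead of the phoenix-spark connector, which avoids PhoenixRDD entirely:

// Sketch: read the same table via plain JDBC instead of the phoenix-spark
// connector. If this works while .format("org.apache.phoenix.spark") fails,
// the problem is the connector jar, not the Phoenix/HBase setup.
val jdbcDf = spark.read
  .format("jdbc")
  .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
  .option("url", s"jdbc:phoenix:$zkUrl")
  .option("dbtable", "\"bigdata\".\"tbs1\"")
  .load()
jdbcDf.show()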
My pom.xml, which correctly specifies the Spark version as 2.3.0:
<properties>
  <maven.compiler.source>1.8</maven.compiler.source>
  <maven.compiler.target>1.8</maven.compiler.target>
  <encoding>UTF-8</encoding>
  <scala.version>2.11.8</scala.version>
  <spark.version>2.3.0</spark.version>
  <hadoop.version>2.6.0-cdh5.14.2</hadoop.version>
  <hive.version>1.1.0-cdh5.14.2</hive.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>${scala.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>${hive.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>${hive.version}</version>
  </dependency>
  <dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>42.2.5</version>
  </dependency>
  <dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-spark</artifactId>
    <version>4.14.0-cdh5.14.2</version>
  </dependency>
  <dependency>
    <groupId>org.apache.twill</groupId>
    <artifactId>twill-api</artifactId>
    <version>0.8.0</version>
  </dependency>
  <dependency>
    <groupId>joda-time</groupId>
    <artifactId>joda-time</artifactId>
    <version>2.9.9</version>
  </dependency>
  <!-- Test -->
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.8.1</version>
    <scope>test</scope>
  </dependency>
</dependencies>
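I suspect (though I have not confirmed) that the phoenix-spark 4.14.0-cdh5.14.2 artifact was compiled against the Spark 1.6 bundled with CDH 5.x, where org.apache.spark.sql.DataFrame was still a concrete class; running mvn dependency:tree -Dincludes=org.apache.spark would show which Spark artifacts each dependency pulls in. A small diagnostic sketch (whereIs is a hypothetical helper, not part of my project) that prints which jar each relevant class is loaded from:

// Hypothetical diagnostic helper (illustration only): report the jar each
// class is loaded from, to spot a Spark 1.x artifact on the classpath.
def whereIs(className: String): Unit = {
  val location =
    try Option(Class.forName(className).getProtectionDomain.getCodeSource)
      .map(_.getLocation.toString)
      .getOrElse("(no code source)")
    catch { case _: ClassNotFoundException => "NOT ON CLASSPATH" }
  println(s"$className -> $location")
}

whereIs("org.apache.spark.sql.Dataset")        // expect the spark-sql 2.3.0 jar
whereIs("org.apache.spark.sql.DataFrame")      // expect NOT ON CLASSPATH on Spark 2.x
whereIs("org.apache.phoenix.spark.PhoenixRDD") // the phoenix-spark jar in use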
Here is the full console output from IntelliJ, including the stack trace for this error:
"C:\Program Files\Java\jdk1.8.0_111\bin\java" "-javaagent:D:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2017.3.5\lib\idea_rt.jar=61050:D:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2017.3.5\bin" -Dfile.encoding=UTF-8 -classpath C:\Users\ZX~1\AppData\Local\Temp\classpath.jar com.ahct.hbase.Test3 Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 19/03/31 19:59:49 INFO SparkContext: Running Spark version 2.3.0 19/03/31 19:59:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 19/03/31 19:59:50 INFO SparkContext: Submitted application: SparkPhoenixTest3 19/03/31 19:59:50 INFO SecurityManager: Changing view acls to: ZX 19/03/31 19:59:51 INFO SecurityManager: Changing modify acls to: ZX 19/03/31 19:59:51 INFO SecurityManager: Changing view acls groups to: 19/03/31 19:59:51 INFO SecurityManager: Changing modify acls groups to: 19/03/31 19:59:51 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ZX); groups with view permissions: Set(); users with modify permissions: Set(ZX); groups with modify permissions: Set() 19/03/31 19:59:53 INFO Utils: Successfully started service 'sparkDriver' on port 61072. 19/03/31 19:59:53 INFO SparkEnv: Registering MapOutputTracker 19/03/31 19:59:53 INFO SparkEnv: Registering BlockManagerMaster 19/03/31 19:59:53 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 19/03/31 19:59:53 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 19/03/31 19:59:53 INFO DiskBlockManager: Created local directory at C:\Users\ZX\AppData\Local\Temp\blockmgr-7386bf6d-b0f4-40b0-b015-ed0191990e1c 19/03/31 19:59:53 INFO MemoryStore: MemoryStore started with capacity 899.7 MB 19/03/31 19:59:53 INFO SparkEnv: Registering OutputCommitCoordinator 19/03/31 19:59:54 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041. 19/03/31 19:59:54 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042. 19/03/31 19:59:54 INFO Utils: Successfully started service 'SparkUI' on port 4042. 19/03/31 19:59:54 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://DESKTOP-7M1BH3H:4042 19/03/31 19:59:54 INFO Executor: Starting executor ID driver on host localhost 19/03/31 19:59:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 61085. 19/03/31 19:59:54 INFO NettyBlockTransferService: Server created on DESKTOP-7M1BH3H:61085 19/03/31 19:59:54 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 19/03/31 19:59:54 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None) 19/03/31 19:59:54 INFO BlockManagerMasterEndpoint: Registering block manager DESKTOP-7M1BH3H:61085 with 899.7 MB RAM, BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None) 19/03/31 19:59:54 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None) 19/03/31 19:59:54 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None) 19/03/31 19:59:55 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/D:/ahty/AHCT/code/scala-test/spark-warehouse/'). 
19/03/31 19:59:55 INFO SharedState: Warehouse path is 'file:/D:/ahty/AHCT/code/scala-test/spark-warehouse/'. 19/03/31 19:59:56 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint 19/03/31 19:59:58 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 419.0 KB, free 899.3 MB) 19/03/31 19:59:59 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 29.4 KB, free 899.3 MB) 19/03/31 19:59:59 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on DESKTOP-7M1BH3H:61085 (size: 29.4 KB, free: 899.7 MB) 19/03/31 19:59:59 INFO SparkContext: Created broadcast 0 from newAPIHadoopRDD at PhoenixRDD.scala:49 19/03/31 19:59:59 INFO deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum 19/03/31 19:59:59 INFO deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum 19/03/31 19:59:59 INFO QueryLoggerDisruptor: Starting QueryLoggerDisruptor for with ringbufferSize=8192, waitStrategy=BlockingWaitStrategy, exceptionHandler=org.apache.phoenix.log.QueryLoggerDefaultExceptionHandler@7b5cc918... 19/03/31 19:59:59 INFO ConnectionQueryServicesImpl: An instance of ConnectionQueryServices was created. 19/03/31 20:00:00 INFO RecoverableZooKeeper: Process identifier=hconnection-0x1cbc5693 connecting to ZooKeeper ensemble=192.168.240.101:2181 19/03/31 20:00:00 INFO ZooKeeper: Client environment:zookeeper.version=3.4.5-cdh5.14.2--1, built on 03/27/2018 20:39 GMT 19/03/31 20:00:00 INFO ZooKeeper: Client environment:host.name=DESKTOP-7M1BH3H 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.version=1.8.0_111 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.vendor=Oracle Corporation 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.home=C:\Program Files\Java\jdk1.8.0_111\jre 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.class.path=C:\Users\ZX~1\AppData\Local\Temp\classpath.jar;D:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2017.3.5\lib\idea_rt.jar 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.library.path=C:\Program Files\Java\jdk1.8.0_111\bin;C:\WINDOWS\Sun\Java\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\Python27\;C:\Python27\Scripts;C:\Program Files (x86)\Intel\iCLS Client\;C:\ProgramData\Oracle\Java\javapath;D:\app\ZX\product\11.2.0\client_1\bin;C:\Program Files (x86)\RSA SecurID Token Common;C:\Program Files\Intel\iCLS Client\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;c:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\;c:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;c:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;D:\Program Files\Git\cmd;C:\Program Files\Mercurial\;D:\Go\bin;C:\TDM-GCC-64\bin;D:\Program Files (x86)\scala\bin;D:\python;D:\python\Scripts;C:\WINDOWS\System32\OpenSSH\;c:\program files\Mozilla Firefox;D:\Program Files\wkhtmltox\bin;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;D:\Program Files\nodejs\;C:\ProgramData\chocolatey\bin;D:\code\ahswww\play5;D:\code\mysql-5.7.24\bin;C:\VisualSVN 
Server\bin;C:\Users\ZX\AppData\Local\Microsoft\WindowsApps;C:\Program Files (x86)\SSH Communications Security\SSH Secure Shell;C:\Users\ZX\AppData\Local\GitHubDesktop\bin;C:\Users\ZX\AppData\Local\Microsoft\WindowsApps;;D:\Program Files\Microsoft VS Code\bin;C:\Users\ZX\AppData\Roaming\npm;C:\Program Files\JetBrains\PyCharm 2018.3.1\bin;;. 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.io.tmpdir=C:\Users\ZX~1\AppData\Local\Temp\ 19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.compiler=<NA> 19/03/31 20:00:00 INFO ZooKeeper: Client environment:os.name=Windows 10 19/03/31 20:00:00 INFO ZooKeeper: Client environment:os.arch=amd64 19/03/31 20:00:00 INFO ZooKeeper: Client environment:os.version=10.0 19/03/31 20:00:00 INFO ZooKeeper: Client environment:user.name=ZX 19/03/31 20:00:00 INFO ZooKeeper: Client environment:user.home=C:\Users\ZX 19/03/31 20:00:00 INFO ZooKeeper: Client environment:user.dir=D:\ahty\AHCT\code\scala-test 19/03/31 20:00:00 INFO ZooKeeper: Initiating client connection, connectString=192.168.240.101:2181 sessionTimeout=90000 watcher=hconnection-0x1cbc56930x0, quorum=192.168.240.101:2181, baseZNode=/hbase 19/03/31 20:00:00 INFO ClientCnxn: Opening socket connection to server hadoop001.local/192.168.240.101:2181. Will not attempt to authenticate using SASL (unknown error) 19/03/31 20:00:00 INFO ClientCnxn: Socket connection established, initiating session, client: /192.168.240.101:61089, server: hadoop001.local/192.168.240.101:2181 19/03/31 20:00:00 INFO ClientCnxn: Session establishment complete on server hadoop001.local/192.168.240.101:2181, sessionid = 0x169cc35e45e0013, negotiated timeout = 40000 19/03/31 20:00:01 INFO ConnectionQueryServicesImpl: HConnection established. Stacktrace for informational purposes: hconnection-0x1cbc5693 java.lang.Thread.getStackTrace(Thread.java:1556) org.apache.phoenix.util.LogUtil.getCallerStackTrace(LogUtil.java:55) org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:427) org.apache.phoenix.query.ConnectionQueryServicesImpl.access$400(ConnectionQueryServicesImpl.java:267) org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2515) org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2491) org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76) org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2491) org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255) org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150) org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221) java.sql.DriverManager.getConnection(DriverManager.java:664) java.sql.DriverManager.getConnection(DriverManager.java:208) org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:113) org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:58) org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil.getSelectColumnMetadataList(PhoenixConfigurationUtil.java:354) org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:118) org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:60) org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:431) org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239) 
org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227) org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164) com.ahct.hbase.Test3$.main(Test3.scala:25) com.ahct.hbase.Test3.main(Test3.scala) 19/03/31 20:00:02 INFO deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame at java.lang.Class.getDeclaredMethods0(Native Method) at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) at java.lang.Class.getDeclaredMethod(Class.java:2128) at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475) at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72) at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498) at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472) at java.security.AccessController.doPrivileged(Native Method) at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472) at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178) at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43) at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100) at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:342) at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:335) at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159) at org.apache.spark.SparkContext.clean(SparkContext.scala:2292) at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:371) at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:370) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:363) at org.apache.spark.rdd.RDD.map(RDD.scala:370) at org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:131) at org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:60) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:431) at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164) at com.ahct.hbase.Test3$.main(Test3.scala:25) at com.ahct.hbase.Test3.main(Test3.scala) Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 36 more 19/03/31 20:00:08 INFO SparkContext: Invoking stop() from shutdown hook 19/03/31 20:00:08 INFO SparkUI: Stopped Spark web UI at http://DESKTOP-7M1BH3H:4042 19/03/31 20:00:08 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
19/03/31 20:00:08 INFO MemoryStore: MemoryStore cleared 19/03/31 20:00:08 INFO BlockManager: BlockManager stopped 19/03/31 20:00:08 INFO BlockManagerMaster: BlockManagerMaster stopped 19/03/31 20:00:08 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 19/03/31 20:00:08 INFO SparkContext: Successfully stopped SparkContext 19/03/31 20:00:08 INFO ShutdownHookManager: Shutdown hook called Process finished with exit code 1