Details
Type: Bug
Status: Open
Priority: Blocker
Resolution: Unresolved
Affects Version/s: 0.10.1
Fix Version/s: None
Description
Hi,
I downloaded the Zeppelin Dockerfile from the first link below, took the Spark standalone Dockerfile from the second link, and merged them into one (attached). I was able to build the image and run the cluster.
1) Zeppelin 0.10.1: https://hub.docker.com/r/apache/zeppelin/dockerfile
2) Spark standalone: https://github.com/apache/zeppelin/blob/master/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
Spark version:
ENV SPARK_PROFILE="2.4"
ENV SPARK_VERSION="2.4.8"
ENV HADOOP_PROFILE="2.7"
ENV SPARK_HOME="/usr/local/spark"
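For reference, the image was then built from the merged Dockerfile; a minimal sketch of that step is below (the tag imagename matches the one used in the run command further down; the exact build command is an assumption, since only the Dockerfile itself is attached):
docker build -t imagename .   # run from the directory containing the merged Dockerfile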
Command used to run the container:
docker run -it \
-p 8080:8080 \
-p 7077:7077 \
-p 8888:8888 \
-p 8081:8081 \
-h sparkmaster \
--name imagename \
imagename bash;
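A quick sanity check from the host can confirm the container is up with the expected ports before testing Zeppelin; the commands below are generic checks, not output taken from the attached logs (imagename and hostname are the values used above):
docker ps --filter name=imagename                    # container should be Up with 7077/8080/8081/8888 mapped
docker exec imagename ps -ef | grep -i spark         # Spark master and worker JVMs should be listed
curl -s -o /dev/null -w "%{http_code}\n" http://hostname:8080/   # Zeppelin UI should return 200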
3) I am able to connect to the Zeppelin UI at http://hostname:8080/ and run basic R/Python notebooks, but when I run a simple piece of Spark code it keeps running, never finishes, and gets stuck. I have to restart the Zeppelin daemon every time to test again. The application ID doesn't even show up in the Spark master UI at http://hostname:8081/.
4) I am able to run spark-shell and pyspark and do spark-submit within the container, and I can see the application in the Spark master UI, but none of this works from Zeppelin. I have been struggling to make it run for the past week.
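For comparison, the kind of submission that does succeed from inside the container looks roughly like the sketch below; SparkPi is just the stock example shipped with Spark, and the wildcard in the jar name is an assumption about the exact Spark 2.4.8 build:
docker exec -it imagename bash -c \
  '$SPARK_HOME/bin/spark-submit \
     --master spark://sparkmaster:7077 \
     --class org.apache.spark.examples.SparkPi \
     $SPARK_HOME/examples/jars/spark-examples_*.jar 100'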
Note: I had set SPARK_HOME to /usr/local/spark/ and master to spark://sparkmaster:7077 in the Spark interpreter settings, but it didn't help. There is also no interpreter log being generated in the Zeppelin logs folder.
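The same two settings can also be cross-checked in conf/zeppelin-env.sh inside the image, which is the other place Zeppelin reads SPARK_HOME and MASTER from (the path below assumes ZEPPELIN_HOME points at the Zeppelin install directory, as in the official image):
# in $ZEPPELIN_HOME/conf/zeppelin-env.sh
export SPARK_HOME=/usr/local/spark
export MASTER=spark://sparkmaster:7077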
I have attached the logs (Spark master, slave, and Zeppelin logs), but there are no errors in them. I have pasted the process ID details as well. There are many processes running that don't get killed when I restart the daemon.
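A quick way to check for those leftover processes is sketched below; RemoteInterpreterServer is the main class Zeppelin's interpreter JVMs run under, so grepping for it shows whether old interpreter processes survive a daemon restart (the ZEPPELIN_HOME reference is an assumption based on the official image):
docker exec imagename ps -ef | grep RemoteInterpreterServer    # orphaned interpreter JVMs, if any
docker exec imagename bash -c '$ZEPPELIN_HOME/bin/zeppelin-daemon.sh restart'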