Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Not A Problem
- Affects Version/s: 1.3.1
- Fix Version/s: None
- Component/s: None
- Environment: OS X
Description
- Download Spark 1.3.1 pre-built for Hadoop 2.6 from the Spark downloads page.
- Add localhost to your slaves file and run start-all.sh.
- Fire up PySpark and try reading from S3 with something like this:
sc.textFile('s3n://bucket/file_*').count()
- You will get an error like this:
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe. : java.io.IOException: No FileSystem for scheme: s3n
Reading from file:///... works. Spark 1.3.1 prebuilt for Hadoop 2.4 works. Spark 1.3.0 works.
It is only the combination of Spark 1.3.1 prebuilt for Hadoop 2.6 and S3 access that fails.
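The likely cause (consistent with the "Not A Problem" resolution and the linked HADOOP-11863): starting with Hadoop 2.6, the S3 filesystem implementations were moved out of hadoop-common into the separate hadoop-aws module, so the Spark build prebuilt for Hadoop 2.6 no longer ships them on its default classpath. A sketch of a workaround is to pull the module in at launch time; the artifact version below (2.6.0) is an assumption and should match the Hadoop version bundled with your Spark distribution:

```shell
# Hypothetical workaround sketch: fetch hadoop-aws (and its AWS SDK
# dependency, resolved transitively) and add it to the driver/executor
# classpath when starting PySpark. Match 2.6.0 to your bundled Hadoop.
pyspark --packages org.apache.hadoop:hadoop-aws:2.6.0
```

Alternatively, the hadoop-aws and aws-java-sdk jars can be passed explicitly via --jars if the cluster has no network access for dependency resolution.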
Issue Links
- relates to
  - HADOOP-11863 Document process of deploying alternative file systems like S3 and Azure to the classpath. (Open)
  - SPARK-7481 Add spark-hadoop-cloud module to pull in object store support (Resolved)