Description
Reading s3n://bucket/{a/,b/,c/} fails when one of the glob alternatives matches nothing. I get:
Exception in thread "main" java.lang.NullPointerException
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:992)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:177)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
at spark.HadoopRDD.<init>(HadoopRDD.scala:51)
at spark.SparkContext.hadoopFile(SparkContext.scala:186)
at spark.SparkContext.textFile(SparkContext.scala:155)
at com.celtra.analyzer.LogAnalyzer.analyzeSufficientS3Logs(LogAnalyzer.scala:52)
at com.celtra.analyzer.App$.main(App.scala:164)
at com.celtra.analyzer.App.main(App.scala)
I'm not sure whether this is specific to S3 or all filesystems.
This was occurring in Hadoop 0.20.205, and I confirmed it is still present in 1.0.3.
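A possible workaround, sketched here under assumptions rather than taken from the report: expand the brace group on the client side, drop the alternatives that do not exist, and pass the surviving paths as a comma-separated string, so the glob expansion never sees an alternative that matches nothing. The class GlobWorkaround, its two methods, and the exists predicate are hypothetical names for illustration. In real use the predicate would probably wrap something like Hadoop's FileSystem.exists(Path), and the sketch assumes the input format accepts comma-separated paths. Only a single, non-nested {...} group is handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of Hadoop or Spark.
class GlobWorkaround {

    // Expands the first {x,y,z} group into one path per alternative.
    // Patterns without a complete group are returned unchanged.
    // Nested or multiple groups are not handled in this sketch.
    static List<String> expandBraces(String pattern) {
        List<String> out = new ArrayList<>();
        int open = pattern.indexOf('{');
        int close = pattern.indexOf('}', open + 1);
        if (open < 0 || close < 0) {
            out.add(pattern);
            return out;
        }
        String prefix = pattern.substring(0, open);
        String suffix = pattern.substring(close + 1);
        for (String alt : pattern.substring(open + 1, close).split(",")) {
            out.add(prefix + alt + suffix);
        }
        return out;
    }

    // Keeps only the expanded paths accepted by the caller-supplied
    // existence check and joins them with commas.
    static String existingInputs(String pattern, Predicate<String> exists) {
        List<String> kept = new ArrayList<>();
        for (String path : expandBraces(pattern)) {
            if (exists.test(path)) {
                kept.add(path);
            }
        }
        return String.join(",", kept);
    }
}
```

If b/ is the missing prefix, existingInputs("s3n://bucket/{a/,b/,c/}", check) would yield a string naming only the a/ and c/ paths, which could then be handed to textFile in place of the original glob. An empty result means nothing exists and should be handled by the caller before reading.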
Issue Links
- relates to HADOOP-15748: S3 listing inconsistency can raise NPE in globber (Resolved)