Details
- Improvement
- Status: Open
- Major
- Resolution: Unresolved
- 2.8.0
- None
- None
Description
LocatedFileStatusFetcher parallelizes path listing, but it still issues a separate recursive listing call for every subdirectory.
If it were switched to use FileSystem.listFiles(path, recursive), object stores with high-performance implementations of that operation would see a significant speedup.
HADOOP-13208 implements that operation for S3A; Azure, Swift, etc. could do the same.
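The difference in request count can be sketched with a toy model. This is only an illustration, not the LocatedFileStatusFetcher code: `ToyStore`, `treewalk`, `listFilesRecursive` and the sample keys are hypothetical names, and the store is an in-memory sorted key set standing in for an object store. In Hadoop itself the flat call is `FileSystem.listFiles(Path, boolean recursive)`, which returns a `RemoteIterator<LocatedFileStatus>`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class ListingSketch {
    /** Hypothetical toy object store: sorted file keys plus a counter of list requests. */
    public static class ToyStore {
        public final TreeSet<String> keys = new TreeSet<>();
        public int listRequests = 0;

        private static String asPrefix(String dir) {
            return dir.endsWith("/") ? dir : dir + "/";
        }

        /** One-level listing of direct children; subdirectories end with '/'. Costs one request. */
        public List<String> listStatus(String dir) {
            listRequests++;
            String prefix = asPrefix(dir);
            TreeSet<String> children = new TreeSet<>();
            for (String key : keys.tailSet(prefix)) {
                if (!key.startsWith(prefix)) break; // keys sharing a prefix are contiguous
                String rest = key.substring(prefix.length());
                int slash = rest.indexOf('/');
                children.add(slash < 0 ? key : prefix + rest.substring(0, slash + 1));
            }
            return new ArrayList<>(children);
        }

        /** Flat recursive listing: every file under dir from a single prefix scan. One request. */
        public List<String> listFilesRecursive(String dir) {
            listRequests++;
            String prefix = asPrefix(dir);
            List<String> files = new ArrayList<>();
            for (String key : keys.tailSet(prefix)) {
                if (!key.startsWith(prefix)) break;
                files.add(key);
            }
            return files;
        }
    }

    /** Treewalk: one listStatus request per directory visited. */
    public static List<String> treewalk(ToyStore store, String dir) {
        List<String> files = new ArrayList<>();
        for (String child : store.listStatus(dir)) {
            if (child.endsWith("/")) {
                files.addAll(treewalk(store, child));
            } else {
                files.add(child);
            }
        }
        return files;
    }

    /** Made-up sample data: four files spread over four directories. */
    public static ToyStore sampleStore() {
        ToyStore store = new ToyStore();
        store.keys.add("data/a/1.csv");
        store.keys.add("data/a/2.csv");
        store.keys.add("data/b/c/3.csv");
        store.keys.add("data/top.csv");
        return store;
    }

    public static void main(String[] args) {
        ToyStore walked = sampleStore();
        treewalk(walked, "data");
        ToyStore flat = sampleStore();
        flat.listFilesRecursive("data");
        System.out.println("treewalk requests: " + walked.listRequests);
        System.out.println("flat listing requests: " + flat.listRequests);
    }
}
```

In this model the treewalk cost grows with the number of directories, while the flat listing stays at a single scan; a real object store would page through results, so the actual saving depends on the store and its listFiles implementation.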
Issue Links
- is depended upon by: HADOOP-13525 Optimize uses of FS operations in the ASF analysis frameworks and libraries (Resolved)
- relates to: HIVE-14165 Remove Hive file listing during split computation (Closed)