Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 2.0.0-alpha
- None
- None
- Reviewed
Description
Currently, HDFS exposes which datanodes hold a block, which lets clients make scheduling decisions for locality and load balancing. Extending this to also expose which disk on a datanode holds a block would enable finer-grained scheduling, per disk rather than per datanode.
This API would likely look similar to FileSystem#getFileBlockLocations, but would also involve a series of RPCs to the responsible datanodes to determine disk IDs.
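The two-step lookup described above can be illustrated with a minimal, self-contained sketch. It is not the Hadoop API: `DiskLocationSketch`, `DatanodeClient`, `getDiskIds`, and `resolveDisks` are hypothetical names, and the per-datanode RPC is modeled as a plain interface call. The sketch assumes the block-to-datanode mapping has already been fetched, groups blocks by datanode so each datanode is queried once, and merges the replies into a per-block view.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are Hadoop APIs.
class DiskLocationSketch {

    /** Stand-in for the per-datanode RPC: maps each requested block id to a disk id. */
    interface DatanodeClient {
        Map<Long, Integer> getDiskIds(String datanode, List<Long> blockIds);
    }

    /**
     * @param datanodesByBlock block id -> datanodes holding a replica
     *                         (the information HDFS already exposes)
     * @return block id -> (datanode -> disk id) for every replica a datanode reported
     */
    static Map<Long, Map<String, Integer>> resolveDisks(
            Map<Long, List<String>> datanodesByBlock, DatanodeClient client) {
        // Group block ids by datanode so each datanode is queried only once.
        Map<String, List<Long>> blocksByDatanode = new LinkedHashMap<>();
        for (Map.Entry<Long, List<String>> e : datanodesByBlock.entrySet()) {
            for (String dn : e.getValue()) {
                blocksByDatanode.computeIfAbsent(dn, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Issue one (simulated) RPC per datanode and keep the replies.
        Map<String, Map<Long, Integer>> replies = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> e : blocksByDatanode.entrySet()) {
            replies.put(e.getKey(), client.getDiskIds(e.getKey(), e.getValue()));
        }

        // Merge the replies back into a per-block view.
        Map<Long, Map<String, Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<Long, List<String>> e : datanodesByBlock.entrySet()) {
            Map<String, Integer> diskByDatanode = new LinkedHashMap<>();
            for (String dn : e.getValue()) {
                Map<Long, Integer> reply = replies.get(dn);
                Integer disk = (reply == null) ? null : reply.get(e.getKey());
                if (disk != null) {
                    diskByDatanode.put(dn, disk); // skip replicas the datanode did not report
                }
            }
            result.put(e.getKey(), diskByDatanode);
        }
        return result;
    }
}
```

A scheduler could then prefer tasks whose blocks sit on different disks of the same datanode. Batching the block ids per datanode is a design choice of this sketch, made to keep the number of round trips proportional to the number of datanodes rather than the number of replicas.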
Attachments
Issue Links
- is depended upon by
  - HBASE-6572 Tiered HFile storage (Closed)
- is related to
  - HDFS-3969 Small bug fixes and improvements for disk locations API (Closed)
  - MAPREDUCE-4577 HDFS-3672 broke TestCombineFileInputFormat.testMissingBlocks() test (Closed)
  - HDFS-2832 Enable support for heterogeneous storages in HDFS - DN as a collection of storages (Closed)
  - HDFS-8895 Remove deprecated BlockStorageLocation APIs (Resolved)
  - HBASE-6572 Tiered HFile storage (Closed)
- is superseded by
  - HDFS-8887 Expose storage type and storage ID in BlockLocation (Resolved)