Details
Description
When the number of blocks on the DataNode grows large, we run into a few issues:
- Block reports take a long time to process on the NameNode. In testing, a block report with 6 million blocks takes close to one second to process on the NameNode, and the NameSystem write lock is held for that entire time.
- We hit the default protobuf message limit of 64MB somewhere around 10 million blocks. While we can increase the message size limit, it already takes over 7 seconds to serialize/deserialize a block report of this size.
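As a stopgap for the second issue (see HDFS-6756 in the links below), the RPC message cap can be raised via the `ipc.maximum.data.length` setting. A sketch of the relevant `core-site.xml` entry; the 128MB value here is illustrative, not a recommendation:

```xml
<!-- core-site.xml: raise the maximum IPC message size (default 64MB).
     134217728 bytes = 128MB; chosen for illustration only. -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
```

Raising the cap only defers the problem: serialization time and NameNode lock hold time still grow with report size.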
HDFS-2832 introduced the concept of a DataNode as a collection of storages, i.e. the NameNode is aware of each volume (storage directory) attached to a given DataNode. This makes it straightforward to split block reports from the DN, sending one report per storage directory, which mitigates both problems above.
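The per-storage split described above can be sketched as follows. This is a hypothetical illustration, not the actual DataNode code: the `StorageBlockReport` record and `buildReports` method are invented names, and real block reports carry much more than block IDs.

```java
import java.util.*;

public class PerStorageReportSketch {
    // Hypothetical report message: a storage directory ID plus the IDs
    // of the blocks that live on that storage.
    record StorageBlockReport(String storageId, List<Long> blockIds) {}

    // Build one report per storage directory instead of a single
    // monolithic report covering every volume on the DataNode, so each
    // RPC message stays small and the NameNode can process (and hold
    // its write lock for) one storage at a time.
    static List<StorageBlockReport> buildReports(Map<String, List<Long>> blocksByStorage) {
        List<StorageBlockReport> reports = new ArrayList<>();
        for (Map.Entry<String, List<Long>> e : blocksByStorage.entrySet()) {
            reports.add(new StorageBlockReport(e.getKey(), List.copyOf(e.getValue())));
        }
        return reports;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> blocks = new LinkedHashMap<>();
        blocks.put("DS-vol1", List.of(1L, 2L, 3L));
        blocks.put("DS-vol2", List.of(4L, 5L));
        // One report per storage directory: prints 2 for two volumes.
        System.out.println(buildReports(blocks).size());
    }
}
```

Each report is bounded by the size of a single volume rather than the whole node, so no individual RPC approaches the protobuf message limit.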
Attachments
Issue Links
- breaks: HDFS-6340 DN can't finalize upgrade (Closed)
- depends upon: HDFS-4988 Datanode must support all the volumes as individual storages (Resolved)
- is duplicated by: HDFS-6756 Default ipc.maximum.data.length should be increased to 128MB from 64MB (Resolved)
- is related to: HDFS-7579 Improve log reporting during block report rpc failure (Closed)