Details
Type: Improvement
Status: Resolved
Priority: Trivial
Resolution: Fixed
Description
Currently, the BlockManager.metaSave method (which is called by the "-metasave" dfs CLI command) reports both "under replicated" and "missing" blocks under the same metric, "Metasave: Blocks waiting for reconstruction:", as shown in the code snippet below:
synchronized (neededReconstruction) {
  out.println("Metasave: Blocks waiting for reconstruction: "
      + neededReconstruction.size());
  for (Block block : neededReconstruction) {
    dumpBlockMeta(block, out);
  }
}
neededReconstruction is an instance of LowRedundancyBlocks, which currently wraps 5 priority queues. Four of these queues hold different under-replicated scenarios, while the fifth is dedicated to corrupt/missing blocks.
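To illustrate why a single size() call mixes the two cases, here is a minimal sketch of a five-level queue wrapper. LeveledBlockQueues, its methods, and the use of plain long block IDs are hypothetical stand-ins for illustration; this is not the actual LowRedundancyBlocks code, and the assumption that the corrupt queue is the last level (index 4) is taken from the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for LowRedundancyBlocks (not the HDFS class).
class LeveledBlockQueues {
    // Assumed layout: levels 0..3 hold under-replicated cases,
    // level 4 holds corrupt/missing blocks.
    static final int LEVELS = 5;
    static final int CORRUPT_LEVEL = 4;

    private final List<Set<Long>> queues = new ArrayList<>();

    LeveledBlockQueues() {
        for (int i = 0; i < LEVELS; i++) {
            queues.add(new LinkedHashSet<>());
        }
    }

    void add(long blockId, int level) {
        queues.get(level).add(blockId);
    }

    // Total across every level, so corrupt blocks are counted too.
    int size() {
        int total = 0;
        for (Set<Long> q : queues) {
            total += q.size();
        }
        return total;
    }

    // Blocks in the corrupt/missing level only.
    int corruptSize() {
        return queues.get(CORRUPT_LEVEL).size();
    }

    // Blocks that are under replicated but not corrupt/missing.
    int lowRedundancySize() {
        return size() - corruptSize();
    }

    public static void main(String[] args) {
        LeveledBlockQueues q = new LeveledBlockQueues();
        q.add(1L, 0);             // under replicated
        q.add(2L, 2);             // under replicated
        q.add(3L, CORRUPT_LEVEL); // corrupt/missing
        System.out.println("all levels: " + q.size());
        System.out.println("corrupt only: " + q.corruptSize());
    }
}
```

With two under-replicated blocks and one corrupt block queued, size() counts all three, which is the number the current metasave line would print.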
Thus, the metasave report may suggest that some corrupt blocks are merely under replicated. This can mislead admins and operators trying to track down missing or corrupt blocks, and/or other issues related to BlockManager metrics.
I would like to propose a patch with trivial changes that reports corrupt blocks separately.
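One way such a separated report could look is sketched below. This is an assumption-laden illustration, not the proposed patch itself: MetaSaveReportSketch, its methods, the use of plain block ID lists, and the wording of the second report line are all hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a metasave-style report with two separate sections.
class MetaSaveReportSketch {

    // Writes under-replicated and corrupt/missing blocks under separate headings.
    static void report(List<Long> lowRedundancy, List<Long> corrupt, PrintWriter out) {
        out.println("Metasave: Blocks waiting for reconstruction: " + lowRedundancy.size());
        for (long id : lowRedundancy) {
            out.println("  block " + id);
        }
        // Assumed wording for the new line; the real patch may phrase it differently.
        out.println("Metasave: Blocks currently missing: " + corrupt.size());
        for (long id : corrupt) {
            out.println("  block " + id);
        }
    }

    // Convenience wrapper that returns the report as a String.
    static String render(List<Long> lowRedundancy, List<Long> corrupt) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        report(lowRedundancy, corrupt, out);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(Arrays.asList(1L, 2L), Arrays.asList(3L)));
    }
}
```

Keeping the two counts on separate lines lets an operator see at a glance whether any blocks are actually missing, rather than inferring it from the per-block details.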