HDFS-10667: Report more accurate info about data corruption location

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: datanode, hdfs
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Per

      https://issues.apache.org/jira/browse/HDFS-10587?focusedCommentId=15376897&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15376897

      DataNode 10.6.129.77 reports:

      2016-07-13 11:49:01,512 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving blk_1116167880_42906656 src: /10.6.134.229:43844 dest: /10.6.129.77:5080
      2016-07-13 11:49:01,543 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Checksum error in block blk_1116167880_42906656 from /10.6.134.229:43844
      org.apache.hadoop.fs.ChecksumException: Checksum error: DFSClient_NONMAPREDUCE_2019484565_1 at 81920 exp: 1352119728 got: -1012279895
              at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)
              at org.apache.hadoop.util.NativeCrc32.verifyChunkedSumsByteArray(NativeCrc32.java:69)
              at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:347)
              at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:294)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.verifyChunks(BlockReceiver.java:421)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:558)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:789)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:917)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:174)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:80)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
              at java.lang.Thread.run(Thread.java:745)
      2016-07-13 11:49:01,543 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for blk_1116167880_42906656
      java.io.IOException: Terminating due to a checksum error.java.io.IOException: Unexpected checksum mismatch while writing blk_1116167880_42906656 from /10.6.134.229:43844
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:571)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:789)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:917)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:174)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:80)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
              at java.lang.Thread.run(Thread.java:745)
      

      and

      https://issues.apache.org/jira/browse/HDFS-10587?focusedCommentId=15378879&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15378879

      When only a packet is verified, the position reported in the checksum exception is relative to the packet buffer, not to the block. So 81920, the position in the exception above, is an offset into the packet, not into the block.

      This JIRA is created to report more accurate corruption location information: the offset in the file, the offset in the block, and the offset in the packet.
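
      To illustrate how the three offsets relate, here is a minimal sketch. It is not taken from the attached patches: the class `CorruptionLocation`, its method names, and the packet and block start values are hypothetical. Only the packet-relative position 81920 comes from the log above. It assumes the receiver knows where the packet starts within the block and where the block starts within the file.

```java
/**
 * Hypothetical helper, not part of Hadoop: derives block and file offsets
 * from the packet-relative position reported in a ChecksumException.
 */
public final class CorruptionLocation {
    private CorruptionLocation() {}

    /** Offset in block = where the packet starts in the block + position inside the packet. */
    public static long offsetInBlock(long packetStartInBlock, long offsetInPacket) {
        return packetStartInBlock + offsetInPacket;
    }

    /** Offset in file = where the block starts in the file + position inside the block. */
    public static long offsetInFile(long blockStartInFile, long offsetInBlock) {
        return blockStartInFile + offsetInBlock;
    }

    /** Builds a message that names all three locations explicitly. */
    public static String describe(long blockStartInFile, long packetStartInBlock, long offsetInPacket) {
        long inBlock = offsetInBlock(packetStartInBlock, offsetInPacket);
        long inFile = offsetInFile(blockStartInFile, inBlock);
        return String.format(
            "offset in packet: %d, offset in block: %d, offset in file: %d",
            offsetInPacket, inBlock, inFile);
    }

    public static void main(String[] args) {
        // 81920 is the position from the log above; the other two values are made up.
        long blockStartInFile = 2L * 128 * 1024 * 1024; // assumed: third 128 MB block of the file
        long packetStartInBlock = 1024L * 1024;         // assumed: packet begins 1 MB into the block
        System.out.println(describe(blockStartInFile, packetStartInBlock, 81920L));
    }
}
```

      Reporting all three values in one message removes the ambiguity described above, where a bare position could be read as either a packet offset or a block offset.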

      See

      https://issues.apache.org/jira/browse/HDFS-10587?focusedCommentId=15387083&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15387083

      Attachments

        1. HDFS-10667.005.patch
          1 kB
          Yuanbo Liu
        2. HDFS-10667.004.patch
          1 kB
          Yuanbo Liu
        3. HDFS-10667.003.patch
          2 kB
          Yuanbo Liu
        4. HDFS-10667.002.patch
          2 kB
          Yuanbo Liu
        5. HDFS-10667.001.patch
          2 kB
          Yuanbo Liu

            People

              Assignee: Yuanbo Liu (yuanbo)
              Reporter: Yongjun Zhang (yzhangal)