  1. Hadoop HDFS
  2. HDFS-12935

Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.9.0, 3.0.0-beta1, 3.0.0
    • Fix Version/s: 3.1.0, 2.10.0, 2.9.1, 3.0.1
    • Component/s: tools
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      In HA mode, most functions still work when one namenode is down. Consider the following two scenarios:
      (1) nn1 up and nn2 down
      (2) nn1 down and nn2 up
      These two scenarios should be equivalent. However, some DFSAdmin commands behave differently in each. When only nn1 is up, the command is sent to nn1 successfully and takes effect, even though an exception (an IOException from connecting to the down namenode nn2) is then reported. When only nn2 is up, the command has no effect at all, and the only output is the exception from failing to connect to nn1.
      Take "hdfs dfsadmin -setBalancerBandwidth", which sets the balancer bandwidth for datanodes, as an example. It works, and all datanodes receive the new value, only when nn1 is up. If only nn2 is up, the command throws an exception immediately and no datanode gets the bandwidth setting. Approximately ten DFSAdmin commands follow similar logic and may show the same ambiguity.
      [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
      active
      [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
      Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820
      setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to jiangjianfei02:9820 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
      [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
      active
      [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
      setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to jiangjianfei01:9820 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
      [root@jiangjianfei01 ~]#
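
      The transcript above is consistent with a loop that contacts the namenodes in configured order and aborts at the first IOException. The following is a minimal, self-contained sketch of that reading, not the actual DFSAdmin code: HaFanOutSketch, NamenodeCall, failFast, tryAll and cluster are all hypothetical names, and tryAll is only one possible way to make the two scenarios equivalent, not necessarily what the attached patches do.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from Hadoop. */
class HaFanOutSketch {

    /** Stand-in for one RPC to one namenode. */
    interface NamenodeCall {
        void run() throws IOException;
    }

    /**
     * Fail-fast fan-out: the first unreachable namenode aborts the loop,
     * so namenodes listed after it are never contacted.
     * Names of namenodes that were reached are appended to 'reached'.
     */
    static void failFast(Map<String, NamenodeCall> namenodes, List<String> reached)
            throws IOException {
        for (Map.Entry<String, NamenodeCall> e : namenodes.entrySet()) {
            e.getValue().run();
            reached.add(e.getKey());
        }
    }

    /**
     * Tolerant fan-out: every namenode is attempted, and failures are
     * collected and returned instead of aborting the loop.
     */
    static List<IOException> tryAll(Map<String, NamenodeCall> namenodes, List<String> reached) {
        List<IOException> failures = new ArrayList<>();
        for (Map.Entry<String, NamenodeCall> e : namenodes.entrySet()) {
            try {
                e.getValue().run();
                reached.add(e.getKey());
            } catch (IOException ex) {
                failures.add(ex);
            }
        }
        return failures;
    }

    /** Builds a two-namenode cluster in configured order nn1, nn2. */
    static Map<String, NamenodeCall> cluster(boolean nn1Up, boolean nn2Up) {
        Map<String, NamenodeCall> m = new LinkedHashMap<>();
        m.put("nn1", call("nn1", nn1Up));
        m.put("nn2", call("nn2", nn2Up));
        return m;
    }

    /** Simulated call that fails when the namenode is down. */
    private static NamenodeCall call(String name, boolean up) {
        return () -> {
            if (!up) {
                throw new IOException("Connection refused: " + name);
            }
        };
    }
}
```

      With cluster(true, false), failFast reaches nn1 and then throws for nn2; with cluster(false, true), it throws before nn2 is ever contacted, which mirrors the asymmetry in the transcript. tryAll reaches exactly one namenode and reports exactly one failure in both scenarios.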

      Attachments

        1. HDFS_12935.001.patch
          5 kB
          Jianfei Jiang
        2. HDFS-12935.002.patch
          35 kB
          Jianfei Jiang
        3. HDFS-12935.003.patch
          35 kB
          Jianfei Jiang
        4. HDFS-12935.004.patch
          32 kB
          Jianfei Jiang
        5. HDFS-12935.005.patch
          36 kB
          Jianfei Jiang
        6. HDFS-12935.006.patch
          36 kB
          Jianfei Jiang
        7. HDFS-12935.006-branch.2.patch
          36 kB
          Jianfei Jiang
        8. HDFS-12935.007-branch.2.patch
          38 kB
          Jianfei Jiang
        9. HDFS-12935.007.patch
          38 kB
          Jianfei Jiang
        10. HDFS-12935.008.patch
          38 kB
          Jianfei Jiang
        11. HDFS-12935.009.patch
          38 kB
          Jianfei Jiang
        12. HDFS-12935.009-branch.2.patch
          39 kB
          Jianfei Jiang
        13. HDFS-12935.009-branch-2.patch
          39 kB
          Brahma Reddy Battula

            People

              Assignee: Jianfei Jiang (jiangjianfei)
              Reporter: Jianfei Jiang (jiangjianfei)