Hadoop HDFS / HDFS-13678

StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.5.0
    • Fix Version/s: None
    • Component/s: rolling upgrades
    • Labels: None

    Description

      In version 2.6.0, HDFS added support for more storage types, implemented in HDFS-6584. However, this appears to be an incompatible change: when we rolling-upgrade our cluster from 2.5.0 to 2.6.0, the following error is thrown.

      2018-06-14 11:43:39,246 ERROR [DataNode: [[[DISK]file:/home/vipshop/hard_disk/dfs/, [DISK]file:/data1/dfs/, [DISK]file:/data2/dfs/]] heartbeating to xx.xx.xx.xx:8022] org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService for Block pool BP-670256553-xx.xx.xx.xx-1528795419404 (Datanode Uuid ab150e05-fcb7-49ed-b8ba-f05c27593fee) service to xx.xx.xx.xx:8022
      java.lang.ArrayStoreException
       at java.util.ArrayList.toArray(ArrayList.java:412)
       at java.util.Collections$UnmodifiableCollection.toArray(Collections.java:1034)
       at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1030)
       at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:836)
       at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:146)
       at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:566)
       at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:664)
       at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:835)
       at java.lang.Thread.run(Thread.java:748)
      

      The scenario is that the old DN fails to parse the StorageType it receives from the new NN. The error occurs while sending the heartbeat to the NN, so blocks are not reported to the NN successfully, which leads to subsequent errors.

      Corresponding logic in 2.5.0:

        public static BlockCommand convert(BlockCommandProto blkCmd) {
          ...
      
          StorageType[][] targetStorageTypes = new StorageType[targetList.size()][];
          List<StorageTypesProto> targetStorageTypesList = blkCmd.getTargetStorageTypesList();
          if (targetStorageTypesList.isEmpty()) { // missing storage types
            for(int i = 0; i < targetStorageTypes.length; i++) {
              targetStorageTypes[i] = new StorageType[targets[i].length];
              Arrays.fill(targetStorageTypes[i], StorageType.DEFAULT);
            }
          } else {
            for(int i = 0; i < targetStorageTypes.length; i++) {
              List<StorageTypeProto> p = targetStorageTypesList.get(i).getStorageTypesList();
              targetStorageTypes[i] = p.toArray(new StorageType[p.size()]);  <==== error here
            }
          }
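
      For context on the ArrayStoreException in the trace above: Collection.toArray(T[]) infers T from the argument alone, independently of the list's element type, so the call compiles even though the list actually holds StorageTypeProto values, and it only fails at runtime when the first element is stored into the StorageType[] array. A minimal, self-contained sketch of the same failure mode, using two hypothetical stand-in enums rather than the real Hadoop classes:

        import java.util.ArrayList;
        import java.util.Arrays;
        import java.util.List;

        public class ToArrayMismatchDemo {
          // Hypothetical stand-ins for StorageTypeProto and StorageType.
          enum ProtoType { DISK, SSD }
          enum JavaType  { DISK, SSD }

          public static void main(String[] args) {
            List<ProtoType> protos =
                new ArrayList<>(Arrays.asList(ProtoType.DISK, ProtoType.SSD));

            // Compiles cleanly because toArray is declared as <T> T[] toArray(T[] a),
            // but throws java.lang.ArrayStoreException when the first ProtoType
            // element is copied into the JavaType[] array, matching the stack
            // trace reported in this issue.
            JavaType[] converted = protos.toArray(new JavaType[protos.size()]);
            System.out.println(Arrays.toString(converted));
          }
        }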
      

      Given the current logic, it would be better to return the default storage type instead of throwing an exception when StorageType changes in newer versions (new fields or new types added) during a rolling upgrade; a sketch of such a fallback follows the snippet below.

        public static StorageType convertStorageType(StorageTypeProto type) {
          switch(type) {
          case DISK:
            return StorageType.DISK;
          case SSD:
            return StorageType.SSD;
          case ARCHIVE:
            return StorageType.ARCHIVE;
          case RAM_DISK:
            return StorageType.RAM_DISK;
          case PROVIDED:
            return StorageType.PROVIDED;
          default:
            throw new IllegalStateException(
                "BUG: StorageTypeProto not found, type=" + type);
          }
        }
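
      A minimal sketch of that suggestion, assuming the converter falls back to StorageType.DEFAULT and logs a warning for proto values it does not recognize (the LOG field and the exact fallback choice are illustrative, not an actual patch):

        public static StorageType convertStorageType(StorageTypeProto type) {
          switch(type) {
          case DISK:
            return StorageType.DISK;
          case SSD:
            return StorageType.SSD;
          case ARCHIVE:
            return StorageType.ARCHIVE;
          case RAM_DISK:
            return StorageType.RAM_DISK;
          case PROVIDED:
            return StorageType.PROVIDED;
          default:
            // Unknown value, e.g. sent by a newer NN: degrade gracefully
            // instead of failing the heartbeat during a rolling upgrade.
            LOG.warn("Unrecognized StorageTypeProto " + type
                + ", falling back to " + StorageType.DEFAULT);
            return StorageType.DEFAULT;
          }
        }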
      

People

    • Assignee: Unassigned
    • Reporter: Yiqun Lin