Uploaded image for project: 'ORC'
  1. ORC
  2. ORC-931

Optimize RunLengthIntegerWriterV2 code for better readability

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Trivial
    • Resolution: Fixed
    • 1.7.0, 1.8.0
    • 1.7.0, 1.8.0
    • Java
    • None

    Description

      RunLengthIntegerWriterV2.java
      512-546 line

        if (diffBitsLH > 1) {
            for (int i = 0; i < numLiterals; i++) {
              baseRedLiterals[i] = literals[i] - min;
            }
            brBits95p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 0.95);
            brBits100p = utils.percentileBits(baseRedLiterals, 0, numLiterals, 1.0);
            if ((brBits100p - brBits95p) != 0 && Math.abs(min) < BASE_VALUE_LIMIT) {
              encoding = EncodingType.PATCHED_BASE;
              preparePatchedBlob();
              return;
            } else {
              encoding = EncodingType.DIRECT;
              return;
            }
          } else {
            // if difference in bits between 95th percentile and 100th percentile is
            // 0, then patch length will become 0. Hence we will fallback to direct
            encoding = EncodingType.DIRECT;
            return;
          }
      

      All three conditional branch logics have been completed and the return statement is redundant.

      691-704 line

            if (fixedRunLength < MIN_REPEAT) {
                variableRunLength = fixedRunLength;
                fixedRunLength = 0;
                determineEncoding();
                writeValues();
              } else if (fixedRunLength >= MIN_REPEAT
                  && fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
                encoding = EncodingType.SHORT_REPEAT;
                writeValues();
              } else {
                encoding = EncodingType.DELTA;
                isFixedDelta = true;
                writeValues();
              }
      

      fixedRunLength >= MIN_REPEAT is redundant, the previous condition already ensures this. Extract the writeValues() method to the end. It seems better for conditional judgements to deal only with encoding and state.

      772-781 line

                if (fixedRunLength >= MIN_REPEAT) {
                  if (fixedRunLength <= MAX_SHORT_REPEAT_LENGTH) {
                    encoding = EncodingType.SHORT_REPEAT;
                    writeValues();
                  } else {
                    encoding = EncodingType.DELTA;
                    isFixedDelta = true;
                    writeValues();
                  }
                }
      

      Ditto

      Attachments

        Issue Links

          Activity

            People

              Guiyankuang Yiqun Zhang
              Guiyankuang Yiqun Zhang
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: