Hive / HIVE-10163

CommonMergeJoinOperator calls WritableComparator.get() in the inner loop

Details

    Description

      The CommonMergeJoinOperator wastes CPU by calling WritableComparator.get() to look up the comparator for every WritableComparable key in every row it compares.

      @SuppressWarnings("rawtypes")
      private int compareKeys(List<Object> k1, List<Object> k2) {
        int ret = 0;
        ....
          // The comparator is looked up again for every key of every row.
          ret = WritableComparator.get(key_1.getClass()).compare(key_1, key_2);
          if (ret != 0) {
            return ret;
          }
        }
      

      Most of the cost of that get() call is deep inside ReflectionUtils.setConf, which uses reflection to set the comparator's configuration. That work is repeated for each row being compared.
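
      A common way to avoid a per-row lookup like this is to resolve the comparator once per key position and reuse it for later rows. The sketch below illustrates that idea only and is not taken from the attached patches. To stay self-contained it uses plain java.util.Comparator, and expensiveLookup is a hypothetical stand-in for WritableComparator.get(); the class name, the lookups counter, and the assumption that every non-null key at a given position has the same class are also illustrative.

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class CachedKeyComparison {
    // Counts calls to the stand-in lookup so the saving is observable.
    static final AtomicInteger lookups = new AtomicInteger();

    // Hypothetical stand-in for WritableComparator.get(Class):
    // assumed to be costly, so it should not run once per row.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Comparator<Object> expensiveLookup(Class<?> keyClass) {
        lookups.incrementAndGet();
        return (a, b) -> ((Comparable) a).compareTo(b);
    }

    // One cached comparator per key position, filled lazily.
    private final Comparator<Object>[] cache;

    @SuppressWarnings("unchecked")
    CachedKeyComparison(int keyCount) {
        cache = (Comparator<Object>[]) new Comparator[keyCount];
    }

    // Assumes both lists have the same length and contain no nulls.
    int compareKeys(List<Object> k1, List<Object> k2) {
        for (int i = 0; i < k1.size(); i++) {
            Object key1 = k1.get(i);
            Object key2 = k2.get(i);
            if (cache[i] == null) {
                // Look the comparator up only the first time this position is seen.
                cache[i] = expensiveLookup(key1.getClass());
            }
            int ret = cache[i].compare(key1, key2);
            if (ret != 0) {
                return ret;
            }
        }
        return 0;
    }
}
```

      With this shape, the number of lookups is bounded by the number of key columns rather than by the number of rows compared. Null keys and key positions whose class can change between rows would need extra handling that this sketch leaves out.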

      Attachments

        1. HIVE-10163.3.patch
          5 kB
          Gunther Hagleitner
        2. HIVE-10163.2.patch
          6 kB
          Gopal Vijayaraghavan
        3. HIVE-10163.1.patch
          3 kB
          Gunther Hagleitner
        4. mergejoin-parallel-lock.png
          20 kB
          Gopal Vijayaraghavan
        5. mergejoin-parallel-bt.png
          86 kB
          Gopal Vijayaraghavan
        6. mergejoin-comparekeys.png
          80 kB
          Gopal Vijayaraghavan

        Activity

          People

            hagleitn Gunther Hagleitner
            gopalv Gopal Vijayaraghavan
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: