Hadoop YARN / YARN-11664

Remove HDFS Binaries/Jars Dependency From YARN


Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.5.0
    • Component/s: yarn
      Release Note:
      To support YARN deployments in clusters without HDFS, some changes have been made in packaging. A new hadoop-common class org.apache.hadoop.fs.HdfsCommonConstants has been added, and the HDFS class org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair has moved from hdfs-client to hadoop-common. YARN handlers for DSQuotaExceededException have been replaced by use of the superclass ClusterStorageCapacityExceededException.

    Description

      In principle, Hadoop YARN is independent of HDFS and can work with any Hadoop-compatible filesystem. Currently, however, some YARN code depends directly on HDFS classes, which forces YARN to bring HDFS binaries/jars onto its classpath. The idea behind this JIRA is to remove that dependency so that YARN can run without HDFS binaries/jars.

      Scope
      1. Non-test classes are considered.
      2. Some test classes that come in as transitive dependencies are considered.

      Out of scope
      1. All other test classes in the YARN modules.

       

      --------------------------------------------------------------------------------------------------------------------------------------------------------------------

      A quick search in the YARN modules revealed the following HDFS dependencies:

      1. Constants

      import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier;
      import org.apache.hadoop.hdfs.DFSConfigKeys;
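
      Per the release note above, these constants can be duplicated into a new hadoop-common class (org.apache.hadoop.fs.HdfsCommonConstants) and the YARN imports re-pointed at it. A minimal sketch of what that class might look like; the exact set of fields it carries is an assumption here:

      package org.apache.hadoop.fs;

      import org.apache.hadoop.io.Text;

      /**
       * HDFS-related constants needed by non-HDFS modules such as YARN,
       * hosted in hadoop-common so that no hdfs jar is required on the
       * classpath. The field shown is illustrative.
       */
      public final class HdfsCommonConstants {

        /**
         * Kind of an HDFS delegation token; mirrors the value of
         * DelegationTokenIdentifier.HDFS_DELEGATION_KIND so YARN can
         * match token kinds without importing the HDFS class.
         */
        public static final Text HDFS_DELEGATION_KIND =
            new Text("HDFS_DELEGATION_TOKEN");

        private HdfsCommonConstants() {
          // constants holder, not instantiable
        }
      }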

       

       
      2. Exception

      import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;

       

      3. Utility

      import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;
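
      IOStreamPair is a small, self-contained holder of an input/output stream pair with no HDFS-specific logic, which is what makes moving it from hdfs-client to hadoop-common cheap. Its shape is roughly the following (a sketch, not the verbatim source):

      package org.apache.hadoop.hdfs.protocol.datatransfer;

      import java.io.Closeable;
      import java.io.IOException;
      import java.io.InputStream;
      import java.io.OutputStream;

      import org.apache.hadoop.io.IOUtils;

      /**
       * A pair of an input stream and an output stream, closed together.
       * Nothing here depends on HDFS internals, so the class can live in
       * hadoop-common unchanged (keeping its package for compatibility).
       */
      public class IOStreamPair implements Closeable {
        public final InputStream in;
        public final OutputStream out;

        public IOStreamPair(InputStream in, OutputStream out) {
          this.in = in;
          this.out = out;
        }

        @Override
        public void close() throws IOException {
          IOUtils.closeStream(in);
          IOUtils.closeStream(out);
        }
      }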

       

      Both YARN and HDFS depend on the hadoop-common module, so:

      • Constant variables and utility classes can be moved to hadoop-common.
      • Instead of DSQuotaExceededException, use its parent exception ClusterStorageCapacityExceededException, as sketched below.
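
      For the exception case, a YARN-side handler rewritten against the hadoop-common superclass could look like the sketch below. The class and method here are illustrative, not actual YARN code; the point is that DSQuotaExceededException extends ClusterStorageCapacityExceededException, so catching the superclass covers the same HDFS quota failure without any hdfs import:

      import java.io.IOException;
      import java.io.OutputStream;

      import org.apache.hadoop.fs.ClusterStorageCapacityExceededException;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      /** Illustrative handler, not actual YARN code. */
      class QuotaAwareWriter {
        private volatile boolean storageExhausted = false;

        void write(FileSystem fs, Path path, byte[] data) throws IOException {
          try (OutputStream out = fs.create(path)) {
            out.write(data);
          } catch (ClusterStorageCapacityExceededException e) {
            // Previously this would catch
            // org.apache.hadoop.hdfs.protocol.DSQuotaExceededException.
            // The superclass is thrown for the same quota/capacity
            // failures, so the handling stays equivalent.
            storageExhausted = true;
          }
        }
      }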

      People

      Assignee: Syed Shameerur Rahman (srahman)
      Reporter: Syed Shameerur Rahman (srahman)
