[SPARK-10181] HiveContext is used with the user principal/Unix username instead of the keytab principal


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.3, 1.6.0
    • Component/s: SQL
    • Labels: kerberos

    Description

      `bin/spark-submit --num-executors 1 --executor-cores 5 --executor-memory 5G --driver-java-options -XX:MaxPermSize=4G --driver-class-path lib/datanucleus-api-jdo-3.2.6.jar:lib/datanucleus-core-3.2.10.jar:lib/datanucleus-rdbms-3.2.9.jar:conf/hive-site.xml --files conf/hive-site.xml --master yarn --principal sparkjob --keytab /etc/security/keytabs/sparkjob.keytab --conf spark.yarn.executor.memoryOverhead=18000 --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=4G" --conf spark.eventLog.enabled=false ~/test.py`

      With:

      #!/usr/bin/python
      from pyspark import SparkContext
      from pyspark.sql import HiveContext

      sc = SparkContext()
      sqlContext = HiveContext(sc)

      query = """ SELECT * FROM fm.sk_cluster """
      rdd = sqlContext.sql(query)

      rdd.registerTempTable("test")
      sqlContext.sql("CREATE TABLE wcs.test LOCATION '/tmp/test_gl' AS SELECT * FROM test")

      Ends up with:

      Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=ua80tl, access=READ_EXECUTE, inode="/tmp/test_gl/.hive-staging_hive_2015-08-24_10-43-09_157_7805739002405787834-1/ext-10000":sparkjob:hdfs:drwxr-x--

      (Our umask denies read access to 'other' by default.)
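
      The mismatch above (staging directory owned by sparkjob, the keytab principal, but read attempted as the Unix user ua80tl) indicates that part of the job authenticates with the keytab while the HiveContext side acts as the local Unix user. As a diagnostic sketch (not part of the original report, and relying on PySpark's internal _jvm gateway), one can print which principal the driver JVM is actually logged in as:

      #!/usr/bin/python
      # Diagnostic sketch: show the principal the driver JVM authenticates as.
      # With --principal/--keytab honored, the login user should be the keytab
      # principal (sparkjob@REALM); in this bug the Hive/HDFS calls are instead
      # attributed to the Unix user (ua80tl).
      from pyspark import SparkContext

      sc = SparkContext()
      ugi = sc._jvm.org.apache.hadoop.security.UserGroupInformation
      print("login user:   %s" % ugi.getLoginUser().getUserName())
      print("current user: %s" % ugi.getCurrentUser().getUserName())
      print("keytab-based: %s" % ugi.isLoginKeytabBased())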


People

    • Assignee: Yu Gao (crystal_gaoyu)
    • Reporter: Bolke de Bruin (bolke)
    • Votes: 1
    • Watchers: 6
