Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Description
While testing the security plugin at my company, I noticed that running a "select * from table" and reading the table's path on HDFS directly produce the same plan, except that the raw path read exposes only the path URI, and this case is not handled by the PrivilegesBuilder class. I wrote an internal patch for this module at my company to address the issue by adding the following case to the buildQuery function:
case l: LogicalRelation =>
  if (l.catalogTable.nonEmpty) {
    mergeProjection(l.catalogTable.get)
  } else if (l.relation.isInstanceOf[HadoopFsRelation]) {
    for (path <- l.relation.asInstanceOf[HadoopFsRelation].location.rootPaths) {
      privilegeObjects += new SparkPrivilegeObject(
        SparkPrivilegeObjectType.DFS_URI, path.toString, path.toString)
    }
  }
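For illustration, here is a minimal spark-shell sketch of the difference this branch is meant to cover, using a hypothetical table db.events and a hypothetical warehouse path (the names are only examples): the table-based read carries a catalogTable, while the path-based read exposes only the rootPaths of its HadoopFsRelation.

import org.apache.spark.sql.execution.datasources.LogicalRelation

// hypothetical table and path, assuming a running spark-shell session
val fromTable = spark.sql("select * from db.events")
val fromPath  = spark.read.parquet("hdfs://namenode/warehouse/db.db/events")

val tableRel = fromTable.queryExecution.optimizedPlan.collect { case l: LogicalRelation => l }.head
val pathRel  = fromPath.queryExecution.optimizedPlan.collect { case l: LogicalRelation => l }.head

println(tableRel.catalogTable.isDefined) // true: the table identity is available for authorization
println(pathRel.catalogTable.isDefined)  // false: only the location's rootPaths identify the data read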
The patch also adds the following case to the buildCommand function:
case i: InsertIntoHadoopFsRelationCommand =>
  i.catalogTable foreach { t =>
    addTableOrViewLevelObjs(
      t.identifier,
      outputObjs,
      i.partitionColumns.map(_.name),
      t.schema.fieldNames)
  }
  if (i.catalogTable.isEmpty) {
    outputObjs += new SparkPrivilegeObject(
      SparkPrivilegeObjectType.DFS_URI, i.outputPath.toString, i.outputPath.toString)
  }
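For the write side, a short sketch of the case this second branch covers, again with a hypothetical staging path: writing a DataFrame straight to a filesystem location involves no metastore table, so the planned InsertIntoHadoopFsRelationCommand has an empty catalogTable and only its outputPath identifies what is being written.

// a minimal spark-shell sketch, assuming a hypothetical staging path
val df = spark.range(10).toDF("id")

// this write is planned as an InsertIntoHadoopFsRelationCommand with catalogTable == None,
// so under the patch above it would conceptually produce:
//   new SparkPrivilegeObject(SparkPrivilegeObjectType.DFS_URI,
//     "hdfs://namenode/tmp/staging/ids", "hdfs://namenode/tmp/staging/ids")
df.write.mode("overwrite").parquet("hdfs://namenode/tmp/staging/ids")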
I understand that this project proposes Hive authorization rather than HDFS authorization, but even so, people in the Spark ecosystem often write temporary files without metastore tables, and those reads and writes should also pass through authorization.
I am creating this issue to ask the maintainers whether this is relevant and within the scope of the Security module, so that I can provide a patch for it.