Details
- Type: Bug
- Status: In Progress
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.4.6, 3.0.0, 3.1.0
Description
Before the fix for HIVE-14380 (https://issues.apache.org/jira/browse/HIVE-14380), which landed in Hive 2.2.0, Hive runs an encryption check on the source and destination paths when it moves files from the staging directory to the final table directory:
// Some comments here
if (!isSrcLocal) {
  // For NOT local src file, rename the file
  if (hdfsEncryptionShim != null
      && (hdfsEncryptionShim.isPathEncrypted(srcf) || hdfsEncryptionShim.isPathEncrypted(destf))
      && !hdfsEncryptionShim.arePathsOnSameEncryptionZone(srcf, destf)) {
    LOG.info("Copying source " + srcf + " to " + destf + " because HDFS encryption zones are different.");
    success = FileUtils.copy(srcf.getFileSystem(conf), srcf, destf.getFileSystem(conf), destf,
        true,    // delete source
        replace, // overwrite destination
        conf);
  } else {
The hdfsEncryptionShim instance holds a single global FileSystem instance that belongs to the default file system. As a result, the check fails whenever it is given a path that belongs to a remote file system.
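The failure mode can be illustrated with a small self-contained sketch. It does not use the Hadoop or Hive classes: `SimpleFs`, `checkWithDefaultFs`, and `checkWithPathFs` are hypothetical names made up for this illustration, and the model has no real encryption zones. It only shows why a check routed through one default file system rejects a path on another cluster, and how resolving the file system from the path itself (as `srcf.getFileSystem(conf)` does in the snippet above) avoids that.

```java
import java.net.URI;
import java.util.Objects;

public class WrongFsSketch {

    /** Hypothetical stand-in for a file system bound to one scheme and authority. */
    static final class SimpleFs {
        private final URI root;

        SimpleFs(String root) {
            this.root = URI.create(root);
        }

        /** Rejects paths owned by another file system, mimicking the "Wrong FS" error. */
        void checkPath(URI path) {
            boolean sameScheme = Objects.equals(root.getScheme(), path.getScheme());
            boolean sameAuthority = Objects.equals(root.getAuthority(), path.getAuthority());
            if (!sameScheme || !sameAuthority) {
                throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + root);
            }
        }

        /** Validates ownership first; this model has no encryption zones, so it returns false. */
        boolean isPathEncrypted(URI path) {
            checkPath(path);
            return false;
        }
    }

    /** Failing pattern: every check goes through the one default file system. */
    static boolean checkWithDefaultFs(SimpleFs defaultFs, URI path) {
        return defaultFs.isPathEncrypted(path);
    }

    /** Alternative: build the file system from the path's own scheme and authority. */
    static boolean checkWithPathFs(URI path) {
        SimpleFs owner = new SimpleFs(path.getScheme() + "://" + path.getAuthority());
        return owner.isPathEncrypted(path);
    }

    public static void main(String[] args) {
        SimpleFs defaultFs = new SimpleFs("hdfs://cluster1");
        URI remote = URI.create("hdfs://cluster2/user/warehouse/bdms_hzyaoqin_test.db/abc");
        try {
            checkWithDefaultFs(defaultFs, remote);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // the remote path is rejected
        }
        System.out.println(checkWithPathFs(remote)); // the per-path lookup succeeds
    }
}
```

Whether the actual fix should resolve the file system per path or skip the check for non-default file systems is a design decision for the patch; the sketch only demonstrates the first option.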
For example, I
key int NULL
# Detailed Table Information
Database bdms_hzyaoqin_test_2
Table abc
Owner bdms_hzyaoqin
Created Time Mon May 11 15:14:15 CST 2020
Last Access Thu Jan 01 08:00:00 CST 1970
Created By Spark 2.4.3
Type MANAGED
Provider hive
Table Properties [transient_lastDdlTime=1589181255]
Location hdfs://cluster2/user/warehouse/bdms_hzyaoqin_test.db/abc
Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat org.apache.hadoop.mapred.TextInputFormat
OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Storage Properties [serialization.format=1]
Partition Provider Catalog
Time taken: 0.224 seconds, Fetched 18 row(s)
The table abc lives on the remote HDFS 'hdfs://cluster2'. Running the command below in a Spark SQL job whose default file system is 'hdfs://cluster1' fails with a "Wrong FS" error:
insert into bdms_hzyaoqin_test_2.abc values(1);
Error in query: java.lang.IllegalArgumentException: Wrong FS: hdfs://cluster2/user/warehouse/bdms_hzyaoqin_test.db/abc/.hive-staging_hive_2020-05-11_17-10-27_123_6306294638950056285-1/-ext-10000/part-00000-badf2a31-ab36-4b60-82a1-0848774e4af5-c000, expected: hdfs://cluster1
Issue Links
- is related to: HIVE-14380 Queries on tables with remote HDFS paths fail in "encryption" checks. (Closed)
- relates to: SPARK-31684 Overwrite partition failed with 'WRONG FS' when the target partition is not belong to the filesystem as same as the table (Resolved)