Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Not A Bug
- Affects Versions: Impala 2.9.0, Impala 2.10.0
Description
For CTAS queries, Impala has to do some gymnastics to figure out the path of the new table before the table has actually been created by the Hive Metastore. It looks like that logic has a bug which prevents it from working with HDFS HA.
This issue manifests as follows. When running a CTAS through the Impala shell, you will see:
Failed to open HDFS file for writing: hdfs://<active_nn_host_port>/user/hive/warehouse/impala/_impala_insert_staging/a44bbb42e3e6030a_3708fc8f00000000/.a44bbb42e3e6030a-3708fc8f00000000_955325278_dir/a44bbb42e3e6030a-3708fc8f00000000_2079801042_data.0. Error(255): Unknown error 255
In the HDFS NN logs you may see something strange like this:
hdfsOpenFile(hdfs://<active_nn_host_port>/user/hive/warehouse/impala/_impala_insert_staging/a44bbb42e3e6030a_3708fc8f00000000/.a44bbb42e3e6030a-3708fc8f00000000_955325278_dir/a44bbb42e3e6030a-3708fc8f00000000_2079801042_data.0.): FileSystem#create((Lorg/apache/hadoop/fs/Path;ZISJ)Lorg/apache/hadoop/fs/FSDataOutputStream;) error:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby.
The problem is most likely in CreateTableAsSelectStmt.analyze():

...
try (MetaStoreClient client = analyzer.getCatalog().getMetaStoreClient()) {
  // Set a valid location of this table using the same rules as the
  // metastore. If the user specified a location for the table this
  // will be a no-op.
  msTbl.getSd().setLocation(analyzer.getCatalog().getTablePath(msTbl).toString());  // <--- Wrong path with HDFS HA enabled!
...
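The underlying issue is that the staging path gets qualified against a concrete NameNode host:port rather than the cluster's logical HA nameservice, so the path becomes invalid after a failover. A minimal language-agnostic sketch of the intended fix, in Python for illustration (the function name and URIs are made up, not Impala code):

```python
from urllib.parse import urlsplit, urlunsplit

def qualify_with_nameservice(path_uri: str, default_fs: str) -> str:
    """Re-qualify an HDFS path against the logical nameservice
    (fs.defaultFS) instead of a concrete NameNode host:port,
    so a NameNode failover does not invalidate the path."""
    p = urlsplit(path_uri)
    d = urlsplit(default_fs)
    # Keep the path, swap the authority for the logical nameservice.
    return urlunsplit((d.scheme, d.netloc, p.path, "", ""))

# A path stamped with the currently active NameNode...
stale = "hdfs://nn1.example.com:8020/user/hive/warehouse/impala"
# ...rewritten to use the HA logical nameservice:
print(qualify_with_nameservice(stale, "hdfs://nameservice1"))
# hdfs://nameservice1/user/hive/warehouse/impala
```

With the logical URI, the HDFS client library resolves whichever NameNode is currently active, which is exactly what the hardcoded host:port form cannot do.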
Workaround
Split the CTAS into a separate CREATE TABLE statement followed by an INSERT INTO ... SELECT.
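For example (table names here are illustrative, not from the original report):

```sql
-- Instead of the failing CTAS:
--   CREATE TABLE new_tbl AS SELECT * FROM src_tbl;
-- run the two statements separately:
CREATE TABLE new_tbl LIKE src_tbl;
INSERT INTO new_tbl SELECT * FROM src_tbl;
```

Because the table already exists when the INSERT runs, Impala no longer has to guess the table path before the Hive Metastore creates it, which sidesteps the HA path-resolution bug.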