Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
TestAvroSchemaResolution.test_avro_schema_resolution has recently started failing when building against a Hive version that includes HIVE-24157.
query_test.test_avro_schema_resolution.TestAvroSchemaResolution.test_avro_schema_resolution[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: avro/snap/block] (from pytest)
query_test/test_avro_schema_resolution.py:36: in test_avro_schema_resolution
    self.run_test_case('QueryTest/avro-schema-resolution', vector, unique_database)
common/impala_test_suite.py:690: in run_test_case
    self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:523: in __verify_results_and_errors
    replace_filenames_with_placeholder)
common/test_result_verifier.py:456: in verify_raw_results
    VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:278: in verify_query_result_is_equal
    assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E     10 != 0
The failing query is:
select count(*) from functional_avro_snap.avro_coldef
The cause is that data loading for avro_coldef failed. The DML statement is:
INSERT OVERWRITE TABLE avro_coldef PARTITION(year=2014, month=1) SELECT bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col FROM (select * from functional.alltypes order by id limit 5) a;
The failure (found in the HiveServer2 log) is:
2021-01-24T01:52:16,340 ERROR [9433ee64-d706-4fa4-a146-18d71bf17013 HiveServer2-Handler-Pool: Thread-4946] parse.CalcitePlanner: CBO failed, skipping CBO.
org.apache.hadoop.hive.ql.exec.UDFArgumentException: Casting DATE/TIMESTAMP types to NUMERIC is prohibited (hive.strict.timestamp.conversion)
    at org.apache.hadoop.hive.ql.udf.TimestampCastRestrictorResolver.getEvalMethod(TimestampCastRestrictorResolver.java:62) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:168) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:149) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:260) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:292) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getFuncExprNodeDescWithUdfData(TypeCheckProcFactory.java:987) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.ParseUtils.createConversionCast(ParseUtils.java:163) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:8551) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:7908) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11100) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10972) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11901) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11771) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:593) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12678) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:423) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:288) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:221) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:194) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:607) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:553) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:547) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:127) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:260) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.cli.operation.Operation.run(Operation.java:274) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:565) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:551) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:567) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_144]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_144]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
This check was introduced in HIVE-24157. Running DESCRIBE on the table shows that timestamp_col is a bigint:
0: jdbc:hive2://localhost:11050> desc avro_coldef;
INFO : Compiling command(queryId=systest_20210125012100_83dadafd-8e20-4a45-8dd2-54d3a6f4b6e2): desc avro_coldef
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=systest_20210125012100_83dadafd-8e20-4a45-8dd2-54d3a6f4b6e2); Time taken: 0.016 seconds
INFO : Executing command(queryId=systest_20210125012100_83dadafd-8e20-4a45-8dd2-54d3a6f4b6e2): desc avro_coldef
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=systest_20210125012100_83dadafd-8e20-4a45-8dd2-54d3a6f4b6e2); Time taken: 0.008 seconds
INFO : OK
+--------------------------+------------+----------+
| col_name                 | data_type  | comment  |
+--------------------------+------------+----------+
| bool_col                 | boolean    |          |
| tinyint_col              | int        |          |
| smallint_col             | int        |          |
| int_col                  | int        |          |
| bigint_col               | bigint     |          |
| float_col                | float      |          |
| double_col               | double     |          |
| date_string_col          | string     |          |
| string_col               | string     |          |
| timestamp_col            | bigint     |          |
| year                     | int        |          |
| month                    | int        |          |
|                          | NULL       | NULL     |
| # Partition Information  | NULL       | NULL     |
| # col_name               | data_type  | comment  |
| year                     | int        |          |
| month                    | int        |          |
+--------------------------+------------+----------+
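The INSERT selects timestamp_col from functional.alltypes, so writing it into this bigint column requires a conversion cast (see ParseUtils.createConversionCast in the stack trace), and that cast hits the new restriction.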
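To make the mismatch concrete, here is a minimal Python sketch that compares the SELECT list against the target table column by column and reports any pair that would need a DATE/TIMESTAMP to NUMERIC cast. It is only an illustrative model, not Hive's implementation: the function name prohibited_casts and the type sets are hypothetical, the source types are the assumed column types of functional.alltypes (they are not shown in this report), and the target types are copied from the desc avro_coldef output.

```python
# Illustrative model of the hive.strict.timestamp.conversion restriction.
# This is NOT Hive's code; names and type sets here are hypothetical.
DATETIME_TYPES = {"date", "timestamp"}
NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double", "decimal"}


def prohibited_casts(source_cols, target_cols):
    """Pair columns by position and return (name, source_type, target_type)
    for every pair that would need a DATE/TIMESTAMP -> NUMERIC cast."""
    hits = []
    for (name, src_type), (_, dst_type) in zip(source_cols, target_cols):
        if src_type in DATETIME_TYPES and dst_type in NUMERIC_TYPES:
            hits.append((name, src_type, dst_type))
    return hits


# Assumed types of the functional.alltypes columns in the SELECT list.
SOURCE = [
    ("bool_col", "boolean"), ("tinyint_col", "tinyint"),
    ("smallint_col", "smallint"), ("int_col", "int"),
    ("bigint_col", "bigint"), ("float_col", "float"),
    ("double_col", "double"), ("date_string_col", "string"),
    ("string_col", "string"), ("timestamp_col", "timestamp"),
]

# Non-partition columns of avro_coldef, as reported by desc avro_coldef.
TARGET = [
    ("bool_col", "boolean"), ("tinyint_col", "int"),
    ("smallint_col", "int"), ("int_col", "int"),
    ("bigint_col", "bigint"), ("float_col", "float"),
    ("double_col", "double"), ("date_string_col", "string"),
    ("string_col", "string"), ("timestamp_col", "bigint"),
]

if __name__ == "__main__":
    for name, src, dst in prohibited_casts(SOURCE, TARGET):
        print(f"{name}: {src} -> {dst} needs a prohibited cast")
```

Under these assumptions only timestamp_col is flagged; the tinyint and smallint columns widen to int, which the restriction does not cover.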