Details
Bug
Status: Closed
Major
Resolution: Fixed
4.0.0
None
Description
HIVE-21329 added automatic sizing of the Tez unordered partitioned KV buffer based on group-by statistics. However, in some corner cases the group-by statistics set the data size to Long.MAX_VALUE, which in turn sets the unordered KV buffer size to Integer.MAX_VALUE. This buffer size is expected to be in MB, so converting Integer.MAX_VALUE from MB to bytes overflows and the following exception is thrown.
2019-03-23T01:35:17,760 INFO [Dispatcher thread {Central}] HistoryEventHandler.criticalEvents: [HISTORY][DAG:dag_1553330105749_0001_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=Map 1, taskAttemptId=attempt_1553330105749_0001_1_00_000000_0, creationTime=1553330117468, allocationTime=1553330117524, startTime=1553330117562, finishTime=1553330117755, timeTaken=193, status=FAILED, taskFailureType=NON_FATAL, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Error while running task ( failure ) : attempt_1553330105749_0001_1_00_000000_0:java.lang.IllegalArgumentException
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:108)
    at org.apache.tez.runtime.common.resources.MemoryDistributor.registerRequest(MemoryDistributor.java:177)
    at org.apache.tez.runtime.common.resources.MemoryDistributor.requestMemory(MemoryDistributor.java:110)
    at org.apache.tez.runtime.api.impl.TezTaskContextImpl.requestInitialMemory(TezTaskContextImpl.java:214)
    at org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput.initialize(UnorderedPartitionedKVOutput.java:76)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable._callInternal(LogicalIOProcessorRuntimeTask.java:537)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:520)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:505)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
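The arithmetic behind the failure can be sketched in a few lines of Java. This is only an illustration, not code from Hive or Tez: the class name KvBufferOverflowSketch and its helper methods are hypothetical, and whether the actual conversion happens in 32-bit arithmetic exactly as shown is an assumption. It shows a Long.MAX_VALUE data size being clamped to Integer.MAX_VALUE MB, and that value wrapping to a negative number when converted to bytes as an int.

```java
public class KvBufferOverflowSketch {

    // Hypothetical helper: turn a data size in bytes (from statistics) into
    // a buffer size in MB, clamped to the int range.
    static int dataSizeToBufferMb(long dataSizeBytes) {
        long mb = dataSizeBytes / (1024L * 1024L);
        return (int) Math.min((long) Integer.MAX_VALUE, Math.max(1L, mb));
    }

    // MB -> bytes using 32-bit arithmetic; silently wraps for large inputs.
    static int mbToBytesInt(int mb) {
        return mb * 1024 * 1024;
    }

    // MB -> bytes using 64-bit arithmetic; no wraparound for any int input.
    static long mbToBytesLong(int mb) {
        return mb * 1024L * 1024L;
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE is the data size reported in the GBY stats.
        int bufferMb = dataSizeToBufferMb(Long.MAX_VALUE);
        System.out.println(bufferMb);                 // 2147483647
        System.out.println(mbToBytesInt(bufferMb));   // -1048576 (wrapped)
        System.out.println(mbToBytesLong(bufferMb));  // 2251799812636672
    }
}
```

A negative or otherwise invalid request size of this kind is what an argument check such as Preconditions.checkArgument would reject. Capping the statistics-derived buffer size well below Integer.MAX_VALUE before it is handed over is one way to avoid the wraparound.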
Stats for the GBY operator are set to Long.MAX_VALUE, as seen below:
2019-03-23T01:35:16,466 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: [0] STATS-TS[0] (logs): numRows: 1795 dataSize: 4443078 basicStatsState: PARTIAL colStatsState: NONE colStats: {severity= colName: severity colType: string countDistincts: 359 numNulls: 89 avgColLen: 100.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true}
2019-03-23T01:35:16,466 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: Estimating row count for GenericUDFOPEqual(Column[severity], Const string ERROR) Original num rows: 1795 New num rows: 5
2019-03-23T01:35:16,467 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: [1] STATS-FIL[8]: numRows: 5 dataSize: 12376 basicStatsState: PARTIAL colStatsState: NONE colStats: {severity= colName: severity colType: string countDistincts: 359 numNulls: 89 avgColLen: 100.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true}
2019-03-23T01:35:16,467 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] exec.FilterOperator: Setting stats (Num rows: 5 Data size: 12376 Basic stats: PARTIAL Column stats: NONE) on: FIL[8]
2019-03-23T01:35:16,468 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] exec.SelectOperator: Setting stats (Num rows: 5 Data size: 12376 Basic stats: PARTIAL Column stats: NONE) on: SEL[2]
2019-03-23T01:35:16,468 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: [1] STATS-SEL[2]: numRows: 5 dataSize: 12376 basicStatsState: PARTIAL colStatsState: NONE colStats: {severity= colName: severity colType: string countDistincts: 359 numNulls: 89 avgColLen: 100.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true}
2019-03-23T01:35:16,471 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: STATS-GBY[3]: inputSize: 4443078 maxSplitSize: 256000000 parallelism: 1 containsGroupingSet: false sizeOfGroupingSet: 1
2019-03-23T01:35:16,471 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: [Case 1] STATS-GBY[3]: cardinality: 5
2019-03-23T01:35:16,472 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] exec.GroupByOperator: Setting stats (Num rows: 1 Data size: 9223372036854775807 Basic stats: PARTIAL Column stats: NONE) on: GBY[3]
2019-03-23T01:35:16,472 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: [0] STATS-GBY[3]: numRows: 1 dataSize: 9223372036854775807 basicStatsState: PARTIAL colStatsState: NONE colStats: {severity= colName: severity colType: string countDistincts: 1 numNulls: 18 avgColLen: 100.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true, _col0= colName: _col0 colType: bigint countDistincts: 1 numNulls: 0 avgColLen: 8.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: false}
2019-03-23T01:35:16,473 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] exec.ReduceSinkOperator: Setting stats (Num rows: 1 Data size: 9223372036854775807 Basic stats: PARTIAL Column stats: NONE) on: RS[4]
2019-03-23T01:35:16,474 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: [0] STATS-RS[4]: numRows: 1 dataSize: 9223372036854775807 basicStatsState: PARTIAL colStatsState: NONE colStats: {severity= colName: severity colType: string countDistincts: 1 numNulls: 18 avgColLen: 100.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true, _col0= colName: _col0 colType: bigint countDistincts: 1 numNulls: 0 avgColLen: 8.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: false}
2019-03-23T01:35:16,474 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: STATS-GBY[5]: inputSize: 1 maxSplitSize: 256000000 parallelism: 1 containsGroupingSet: false sizeOfGroupingSet: 1
2019-03-23T01:35:16,474 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: [Case 7] STATS-GBY[5]: cardinality: 0
2019-03-23T01:35:16,474 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] stats.StatsUtils: STATS-GBY[5]: Equals 0 in number of rows. 0 rows will be set to 1
2019-03-23T01:35:16,474 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] exec.GroupByOperator: Setting stats (Num rows: 1 Data size: 9223372036854775807 Basic stats: PARTIAL Column stats: NONE) on: GBY[5]
2019-03-23T01:35:16,474 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: [0] STATS-GBY[5]: numRows: 1 dataSize: 9223372036854775807 basicStatsState: PARTIAL colStatsState: NONE colStats: {severity= colName: severity colType: string countDistincts: 1 numNulls: 18 avgColLen: 100.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true, _col0= colName: _col0 colType: bigint countDistincts: 1 numNulls: 0 avgColLen: 8.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: false}
2019-03-23T01:35:16,474 DEBUG [c779e956-b3b9-451a-8248-6ae7c669854f main] annotation.StatsRulesProcFactory: [0] STATS-FS[7]: numRows: 1 dataSize: 9223372036854775807 basicStatsState: PARTIAL colStatsState: NONE colStats: {severity= colName: severity colType: string countDistincts: 1 numNulls: 36 avgColLen: 100.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: true, _col0= colName: _col0 colType: bigint countDistincts: 1 numNulls: 0 avgColLen: 8.0 numTrues: 0 numFalses: 0 isPrimaryKey: false isEstimated: false}