Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Impala 2.6.0
Description
I was running tests in a loop for a long time and the impalad crashed with this DCHECK:
F0319 08:43:11.185613 14963 scanner-context.h:249] Check failed: idx < streams_.size() (0 vs. 0)
*** Check failure stack trace: ***
    @ 0x268534d google::LogMessage::Fail()
    @ 0x2687c76 google::LogMessage::SendToLog()
    @ 0x2684e6d google::LogMessage::Flush()
    @ 0x268871e google::LogMessageFatal::~LogMessageFatal()
    @ 0x15e3cd3 impala::ScannerContext::GetStream()
    @ 0x15e15fa impala::HdfsScanNode::ProcessSplit()
    @ 0x15e04af impala::HdfsScanNode::ScannerThread()
    @ 0x15fd41f boost::_mfi::mf0<>::operator()()
    @ 0x15fc5a2 boost::_bi::list1<>::operator()<>()
    @ 0x15fb051 boost::_bi::bind_t<>::operator()()
    @ 0x15f8e1e boost::detail::function::void_function_obj_invoker0<>::invoke()
    @ 0x128e6be boost::function0<>::operator()()
    @ 0x1554a03 impala::Thread::SuperviseThread()
    @ 0x155c134 boost::_bi::list4<>::operator()<>()
    @ 0x155c077 boost::_bi::bind_t<>::operator()()
    @ 0x155c03a boost::detail::thread_data<>::run()
    @ 0x196980a thread_proxy
    @ 0x7f540b43a6aa start_thread
    @ 0x7f5408989e9d (unknown)
It appears that it hit a memory limit in the scanner:
I0319 08:43:11.183944 14967 hdfs-scan-node.cc:1204] Scan node (id=3) ran into a parse error for scan range hdfs://localhost:20500/test-warehouse/tpch.lineitem_text_gzip/000003_0.gz(0:39234185). Processed 4946173 bytes.
Memory Limit Exceeded
HDFS_SCAN_NODE (id=3) could not allocate 24.00 KB without exceeding limit.
Query(74c1f2a155c7140:78a6b5a156fe08c) Limit: Consumption=228.66 MB
  Fragment 74c1f2a155c7140:78a6b5a156fe08d: Consumption=8.00 KB
    EXCHANGE_NODE (id=17): Consumption=0
      DataStreamRecvr: Consumption=0
  Block Manager: Limit=6.69 GB Consumption=132.50 MB
  Fragment 74c1f2a155c7140:78a6b5a156fe095: Consumption=2.25 MB
    AGGREGATION_NODE (id=13): Consumption=2.25 MB
    EXCHANGE_NODE (id=12): Consumption=0
      DataStreamRecvr: Consumption=0
  Fragment 74c1f2a155c7140:78a6b5a156fe092: Consumption=152.30 MB
    AGGREGATION_NODE (id=8): Consumption=1.25 MB
    HASH_JOIN_NODE (id=7): Consumption=24.00 KB
    HASH_JOIN_NODE (id=6): Consumption=13.02 MB
    HASH_JOIN_NODE (id=5): Consumption=138.01 MB
    HDFS_SCAN_NODE (id=2): Consumption=0
    EXCHANGE_NODE (id=10): Consumption=0
    EXCHANGE_NODE (id=11): Consumption=0
    EXCHANGE_NODE (id=14): Consumption=0
  Fragment 74c1f2a155c7140:78a6b5a156fe098: Consumption=43.06 MB
    AGGREGATION_NODE (id=4): Consumption=3.03 MB
    HDFS_SCAN_NODE (id=3): Consumption=40.01 MB
    DataStreamSender: Consumption=12.00 KB
  Fragment 74c1f2a155c7140:78a6b5a156fe09c: Consumption=29.04 MB
    HDFS_SCAN_NODE (id=1): Consumption=29.04 MB
  Fragment 74c1f2a155c7140:78a6b5a156fe08f: Consumption=0
    SORT_NODE (id=9): Consumption=0
    AGGREGATION_NODE (id=16): Consumption=0
    EXCHANGE_NODE (id=15): Consumption=0
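To make the failure mode concrete: the "(0 vs. 0)" in the DCHECK means GetStream() was called with index 0 while streams_ was empty. The following is a minimal sketch of that shape, not the Impala source. ScannerContextSketch, Stream, ClearStreams() and ProcessSplitSketch() are hypothetical stand-ins, and the guard in ProcessSplitSketch() only illustrates one way a caller could avoid the out-of-range access; it is not necessarily the fix that went in. The sketch throws where the real code would hit a fatal DCHECK.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a scanner stream; not Impala's real type.
struct Stream {
  int id;
};

// Hypothetical stand-in for a scanner context that owns its streams.
class ScannerContextSketch {
 public:
  void AddStream(int id) {
    streams_.push_back(std::unique_ptr<Stream>(new Stream{id}));
  }

  // Models an error path (for example a failed allocation) that leaves the
  // context without streams. Whether the real error path does this is an
  // assumption.
  void ClearStreams() { streams_.clear(); }

  std::size_t NumStreams() const { return streams_.size(); }

  // Same condition as the failed check: idx < streams_.size().
  // Throws instead of aborting so the failure can be observed.
  Stream* GetStream(std::size_t idx) const {
    if (idx >= streams_.size()) {
      throw std::out_of_range("Check failed: idx < streams_.size()");
    }
    return streams_[idx].get();
  }

 private:
  std::vector<std::unique_ptr<Stream>> streams_;
};

// Returns true if a stream was available to process, false otherwise.
// The emptiness check up front keeps GetStream(0) from being reached
// when the context has no streams.
bool ProcessSplitSketch(const ScannerContextSketch& ctx) {
  if (ctx.NumStreams() == 0) return false;
  Stream* stream = ctx.GetStream(0);
  return stream != nullptr;
}
```

With an empty context, calling GetStream(0) directly trips the check, while ProcessSplitSketch() returns false without touching the streams.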
I think I touched that error path last, but I'll assign to Michael in the meantime since he's actively working on this code. Feel free to assign back to me.