Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Version: Impala 2.9.0
Description
The Parquet scanner always transfers decompression buffers to the scratch batch:
Status BaseScalarColumnReader::ReadDataPage() {
  // We're about to move to the next data page. The previous data page is
  // now complete, pass along the memory allocated for it.
  parent_->scratch_batch_->mem_pool()->AcquireData(decompressed_data_pool_.get(), false);
The scratch batch's buffers are in turn passed along with the row batch. This is safe, but it is unnecessary in the many cases where the batch holds no pointers into the decompression buffer: when the column contains only fixed-length data, or when the data page is dictionary-encoded.
The unnecessary transfer can make problems like IMPALA-4923 worse than they would otherwise be, because extra data is transferred across threads.
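The condition described above can be sketched in C++ as follows. This is only an illustration of the idea, not the actual fix: SimplePool, PageEncoding, BatchReferencesPage and FinishDataPage are hypothetical names invented for this example, and the pool merely tracks chunk sizes rather than real memory. The sketch assumes that rows can point into the decompressed page only when the column is variable-length and the page is not dictionary-encoded.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a memory pool: it only records chunk sizes.
struct SimplePool {
  std::vector<std::size_t> chunks;

  void Allocate(std::size_t bytes) { chunks.push_back(bytes); }

  // Take ownership of every chunk in 'src', leaving 'src' empty.
  void AcquireData(SimplePool* src) {
    for (std::size_t c : src->chunks) chunks.push_back(c);
    src->chunks.clear();
  }

  // Release every chunk held by this pool.
  void FreeAll() { chunks.clear(); }

  std::size_t total_bytes() const {
    std::size_t sum = 0;
    for (std::size_t c : chunks) sum += c;
    return sum;
  }
};

// Hypothetical page encodings; only the distinction matters here.
enum class PageEncoding { PLAIN, DICTIONARY };

// Assumption taken from the description: output rows may point into the
// decompressed page only for variable-length, non-dictionary-encoded data.
bool BatchReferencesPage(bool var_len_column, PageEncoding encoding) {
  return var_len_column && encoding == PageEncoding::PLAIN;
}

// Called when a data page is complete. Hands the decompression buffer to the
// scratch pool only if rows may still reference it; otherwise frees it here.
void FinishDataPage(bool var_len_column, PageEncoding encoding,
                    SimplePool* decompressed_pool, SimplePool* scratch_pool) {
  if (BatchReferencesPage(var_len_column, encoding)) {
    scratch_pool->AcquireData(decompressed_pool);
  } else {
    decompressed_pool->FreeAll();
  }
}
```

With this shape, a fixed-length or dictionary-encoded page leaves the scratch pool untouched, so nothing extra travels with the row batch to the consuming thread.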
Issue Links
- is related to
  - IMPALA-6054 Parquet dictionary pages should be freed on dictionary construction (Resolved)
- relates to
  - IMPALA-4923 Operators running on top of selective Parquet scans spend a lot of time calling impala::MemPool::FreeAll on empty batches (Resolved)