Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- Impala 2.6.0
Description
Kudu should annotate each column in a batch with whether it is nullable. Today the scanner checks whether the slot is null for every row and every column of a Kudu batch. It would be much more efficient to store a per-column bit in the KuduScanBatch indicating the column's nullability, so that the per-row check can be skipped for columns that can never be null.
Status KuduScanner::KuduRowToImpalaTuple(const KuduScanBatch::RowPtr& row,
    RowBatch* row_batch, Tuple* tuple) {
  for (int i = 0; i < scan_node_->tuple_desc_->slots().size(); ++i) {
    const SlotDescriptor* info = scan_node_->tuple_desc_->slots()[i];
    void* slot = tuple->GetSlot(info->tuple_offset());
    if (row.IsNull(i)) {
      SetSlotToNull(tuple, *info);
      continue;
    }
    int max_len = -1;
    switch (info->type().type) {
      case TYPE_VARCHAR:
        max_len = info->type().len;
        DCHECK_GT(max_len, 0);
For a basic scan, the null check consumes 4% of the CPU cycles.
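The proposal can be illustrated with a small, self-contained sketch. It does not use the real Kudu or Impala APIs: `MockRow`, `MockTuple`, `ScanStats`, `ConvertBatch`, and the `column_nullable` vector are all hypothetical stand-ins, and the assumption is that the batch can expose one nullability bit per column. With that bit available, the per-cell null check is only paid for columns that can actually contain nulls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scanned row: one value and one null flag per column.
struct MockRow {
  std::vector<int64_t> values;
  std::vector<bool> is_null;
  bool IsNull(std::size_t col) const { return is_null[col]; }
};

// Hypothetical stand-in for an output tuple with per-slot null indicators.
struct MockTuple {
  std::vector<int64_t> slots;
  std::vector<bool> null_bits;
};

// Counts per-cell null checks, purely to make the saving visible.
struct ScanStats {
  std::size_t null_checks = 0;
};

// Converts rows to tuples. column_nullable[i] is the assumed per-column bit:
// when it is false, the per-row IsNull() call for column i is skipped.
std::vector<MockTuple> ConvertBatch(const std::vector<MockRow>& rows,
                                    const std::vector<bool>& column_nullable,
                                    ScanStats* stats) {
  const std::size_t num_cols = column_nullable.size();
  std::vector<MockTuple> out;
  out.reserve(rows.size());
  for (const MockRow& row : rows) {
    assert(row.values.size() == num_cols && row.is_null.size() == num_cols);
    MockTuple tuple;
    tuple.slots.assign(num_cols, 0);
    tuple.null_bits.assign(num_cols, false);
    for (std::size_t i = 0; i < num_cols; ++i) {
      if (column_nullable[i]) {
        // Only nullable columns pay for the per-cell check.
        ++stats->null_checks;
        if (row.IsNull(i)) {
          tuple.null_bits[i] = true;
          continue;
        }
      }
      tuple.slots[i] = row.values[i];
    }
    out.push_back(std::move(tuple));
  }
  return out;
}
```

In this sketch, a batch of two rows over one non-nullable and one nullable column performs two null checks instead of four. How much of the measured cost this would recover in the real scanner is not something the sketch can show.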
Issue Links
- is related to IMPALA-3346 Kudu scanner : Improve perf of DecodeRowsIntoRowBatch (Resolved)