Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- Impala 2.0
- None
- None
Description
While looking at the code that generates the memory layout in the frontend, I found some possible inconsistencies:
1) If the slot size is larger than 1 byte, the code tries to align the next slot to a multiple of the slot size. However, this can be bad for all sizes that are not strictly powers of two. Imagine a char(3) column followed by a 4-byte value. In the worst case, the 4-byte value will be spread over two words, making data access expensive and error-prone with regard to thread safety.
2) If I understand correctly, the char slot is inlined by default unless it exceeds a certain size. However, I'm not sure this case is handled correctly, because for all char types the ScalarType class returns the complete length regardless of the decision to inline it. When the value is not inlined, the slot size should be equivalent to the size required to store a normal string.
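To make both points concrete, here is a minimal Java sketch. It is not Impala's actual frontend code: the class `SlotLayoutSketch`, its method names, and the two constants (an inline limit of 128 bytes and a 16-byte string slot) are assumptions chosen for illustration only. It shows how aligning to a multiple of a non-power-of-two size can leave a slot straddling a word boundary, and what a slot-size computation that respects the inlining decision might look like.

```java
/** Illustrative sketch only; not Impala's actual frontend code. */
final class SlotLayoutSketch {
    // ASSUMPTION: illustrative values, not taken from Impala's source.
    static final int ASSUMED_MAX_CHAR_INLINE_LENGTH = 128;
    static final int ASSUMED_STRING_SLOT_SIZE = 16;

    /** Rounds offset up to the next multiple of alignTo (alignTo > 0). */
    static int alignUp(int offset, int alignTo) {
        return (offset + alignTo - 1) / alignTo * alignTo;
    }

    /** Returns the smallest power of two that is >= size (size >= 1). */
    static int nextPowerOfTwo(int size) {
        return size <= 1 ? 1 : Integer.highestOneBit(size - 1) << 1;
    }

    /** True if the byte range [offset, offset + size) crosses a wordSize boundary. */
    static boolean straddlesWord(int offset, int size, int wordSize) {
        return offset / wordSize != (offset + size - 1) / wordSize;
    }

    /** Slot size for char(len): the full length if inlined, else a string slot. */
    static int charSlotSize(int len) {
        return len <= ASSUMED_MAX_CHAR_INLINE_LENGTH ? len : ASSUMED_STRING_SLOT_SIZE;
    }

    public static void main(String[] args) {
        // Point 1: a 1-byte slot at offset 0, then char(3) aligned to a multiple of 3.
        int charOffset = alignUp(1, 3);
        int afterChar = charOffset + 3;
        System.out.println("char(3) offset: " + charOffset
            + ", straddles 4-byte word: " + straddlesWord(charOffset, 3, 4));
        // A 4-byte value placed directly after the char(3) slot crosses a word boundary.
        System.out.println("4-byte value at " + afterChar
            + " straddles: " + straddlesWord(afterChar, 4, 4));
        // Aligning to a power of two avoids the straddle.
        int aligned = alignUp(afterChar, nextPowerOfTwo(4));
        System.out.println("4-byte value at " + aligned
            + " straddles: " + straddlesWord(aligned, 4, 4));

        // Point 2: the slot size depends on whether the char is inlined.
        System.out.println("char(3) slot size: " + charSlotSize(3));
        System.out.println("char(200) slot size: " + charSlotSize(200));
    }
}
```

Rounding the alignment up to a power of two (capped at the word size, if desired) costs at most a few bytes of padding per slot group, but keeps every fixed-size value inside a single word.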