Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
We assume that the size fits in an int:
https://github.com/apache/impala/blob/308fda110758b0fc58e5b1f477d635aac29aea75/be/src/exec/hdfs-orc-scanner.cc#L253
If the size overflows, we can incorrectly hit the following error check (the check is meant to avoid crashing on corrupt metadata). I see no other way this could cause problems: if the check still succeeds (because the overflow produced a valid-looking length), the data will be read correctly.
This looks like a trivial fix, but I am concerned about the lack of testing for files larger than 2 GB.
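To illustrate the failure mode, here is a minimal standalone sketch, not the actual Impala scanner code. The helper names (`NarrowUnchecked`, `LooksValid`, `FitsInInt32`) are hypothetical, and it assumes a 32-bit int on a two's-complement platform, where an out-of-range narrowing wraps modulo 2^32. Depending on the value, the narrowed size either goes negative (and trips a sanity check) or wraps to a plausible-looking positive length.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: narrows a 64-bit byte count to 32 bits without any
// range check, mimicking the "size fits in an int" assumption.
int32_t NarrowUnchecked(uint64_t size) {
  return static_cast<int32_t>(size);
}

// Hypothetical stand-in for a corrupt-metadata sanity check: a negative
// length is treated as invalid.
bool LooksValid(int32_t length) {
  return length >= 0;
}

// One possible guard: validate the value while it is still 64 bits wide,
// before any narrowing happens.
bool FitsInInt32(uint64_t size) {
  return size <= static_cast<uint64_t>(std::numeric_limits<int32_t>::max());
}
```

With these helpers, a 3 GiB size narrows to a negative value and is rejected by `LooksValid`, while a 5 GiB size wraps to 1 GiB and passes it. `FitsInInt32` rejects both before the cast is made.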