Description
git.commit.id.abbrev=3c89b30
I have a folder with 10k complex files. The generated cache file is around 486 MB. The numbers below indicate that query performance regressed after the metadata cache was generated: the same count(*) query took 30.835 seconds before the refresh and 47.614 seconds after it.
0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from `complex_sparse_50000files`;
+----------+
| EXPR$0   |
+----------+
| 1000000  |
+----------+
1 row selected (30.835 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> refresh table metadata `complex_sparse_50000files`;
+-------+---------------------------------------------------------------------+
| ok    | summary                                                             |
+-------+---------------------------------------------------------------------+
| true  | Successfully updated metadata for table complex_sparse_50000files.  |
+-------+---------------------------------------------------------------------+
1 row selected (10.69 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from `complex_sparse_50000files`;
+----------+
| EXPR$0   |
+----------+
| 1000000  |
+----------+
1 row selected (47.614 seconds)
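To put a number on the regression, the two count(*) timings from the console output above can be compared directly. This is only a rough sketch: the function name `slowdown` and the constants are illustrative, and the figures are single-run timings copied from this report, not a controlled benchmark.

```python
# Timings copied from the console output in this report (seconds).
BEFORE_CACHE_S = 30.835  # count(*) before REFRESH TABLE METADATA
AFTER_CACHE_S = 47.614   # count(*) after the metadata cache was generated


def slowdown(before_s: float, after_s: float) -> float:
    """Return how many times slower the second run is than the first."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return after_s / before_s


ratio = slowdown(BEFORE_CACHE_S, AFTER_CACHE_S)
# Report both the ratio and the relative increase in elapsed time.
print(f"slowdown: {ratio:.2f}x ({(ratio - 1) * 100:.0f}% slower)")
```

A ratio above 1.0 means the query with the metadata cache is slower than the query without it, which is the opposite of what the cache is meant to achieve.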
Issue Links
- relates to: DRILL-7064 Leverage the summary's totalRowCount and totalNullCount for COUNT() queries (also prevent eager expansion of files) (Resolved)