Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Won't Fix
Description
These days, the advantages of columnar storage and vectorized processing for analytic workloads hardly need restating. Both approaches are well known as state-of-the-art techniques in the database community and are also accepted in practice.
Since we started the Tajo project in 2010, we have planned a new engine that uses both JIT query compilation and vectorized processing. My colleagues and I have surveyed columnar stores, vectorized processing, cache-conscious techniques, and query compilation.
In this issue, we will design and implement the new engine. The key implementation plan is as follows:
- Implemented in C++
- Vectorization primitives will be generated by LLVM.
- Two or more primitives can be fused ("blurred" together) by the JIT compiler, depending on the situation.
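To make the plan above more concrete, here is a minimal hand-written sketch of what a vectorization primitive and a combined primitive might look like in C++. It is only an illustration under stated assumptions: every name in it (`map_add_i32`, `select_gt_i32`, `fused_add_select_gt_i32`, `run_unfused`, `run_fused`) is hypothetical, no LLVM is involved, and the combined loop is written by hand here, whereas the plan is for such code to be generated at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical primitive: element-wise addition over one batch of values.
// Overflow handling is ignored in this sketch.
void map_add_i32(const int32_t* a, const int32_t* b, int32_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// Hypothetical primitive: writes the positions of values greater than
// `threshold` into `sel` (which must hold n entries) and returns the count.
std::size_t select_gt_i32(const int32_t* in, int32_t threshold,
                          uint32_t* sel, std::size_t n) {
    std::size_t k = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sel[k] = static_cast<uint32_t>(i);  // always written, kept only if k advances
        k += (in[i] > threshold);
    }
    return k;
}

// Hypothetical combined primitive: the same two steps in a single loop,
// so the intermediate sum is never materialized in memory.
std::size_t fused_add_select_gt_i32(const int32_t* a, const int32_t* b,
                                    int32_t threshold, uint32_t* sel, std::size_t n) {
    std::size_t k = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sel[k] = static_cast<uint32_t>(i);
        k += (a[i] + b[i] > threshold);
    }
    return k;
}

// Evaluates "a + b > threshold" with two primitive calls and a temporary vector.
std::vector<uint32_t> run_unfused(const std::vector<int32_t>& a,
                                  const std::vector<int32_t>& b, int32_t threshold) {
    assert(a.size() == b.size());
    std::vector<int32_t> tmp(a.size());
    std::vector<uint32_t> sel(a.size());
    map_add_i32(a.data(), b.data(), tmp.data(), a.size());
    sel.resize(select_gt_i32(tmp.data(), threshold, sel.data(), a.size()));
    return sel;
}

// Evaluates the same predicate with the single combined primitive.
std::vector<uint32_t> run_fused(const std::vector<int32_t>& a,
                                const std::vector<int32_t>& b, int32_t threshold) {
    assert(a.size() == b.size());
    std::vector<uint32_t> sel(a.size());
    sel.resize(fused_add_select_gt_i32(a.data(), b.data(), threshold,
                                       sel.data(), a.size()));
    return sel;
}
```

Both `run_unfused` and `run_fused` return the same selection vector for the same inputs. The difference is that the unfused path writes and re-reads a temporary batch, while the combined path keeps the sum in a register; which one is faster can depend on batch size and selectivity, which is presumably why the plan says the choice is made "according to the situation".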
This is an umbrella issue; we will create many subtasks under it.
The design references are as follows:
- DSM vs. NSM: CPU Performance Tradeoffs in Block-Oriented Query Processing
- Efficiently Compiling Efficient Query Plans for Modern Hardware
- Just-in-time Compilation in Vectorized Query Execution
- MonetDB/X100: Hyper-Pipelining Query Execution
- Column-Stores vs. Row-Stores: How Different Are They Really?
- Balancing vectorized query execution with bandwidth-optimized storage