Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
Description
In multi-dimensional languages such as MDX, DAX, Tableau, you can define calculations in your models that can be re-evaluated in other dimensional contexts. (The models are often called cubes, and the calculations are often called measures.)
In SQL, the model is a view (or a sub-query in the FROM clause) but the columns are just values. Suppose you have a private Employees table, a Departments view that rolls Employees up to department level and has an averageSalary column. Now suppose you wish to roll up averageSalary to the region level. The values that went into averageSalary are not available to you, either directly or indirectly, so the best you can do is to average-the-averages.
In this proposed (and experimental) feature, you can define a special kind of column - a measure - in the SELECT list of a view (or sub-query in a FROM clause), and it remains a calculation. When a query uses a measure column, the calculation is re-evaluated in the context of that query.
To some extent, this breaches the "black box" property of SQL views. Hitherto, a SQL view can be replaced with a table that has the same contents, and all queries that use that view will return the same results. That property no longer holds. But the view remains a useful "hiding" abstraction, and the rows that compose that view cannot be viewed directly.
Like dimensional models, measures in SQL would allow high-level abstractions such as key-performance indicators (KPIs) to be shared and composed. Unlike dimensional models, the models remain relational, namely, it is still possible to enumerate and count the rows in a model.
Consider the following view and query that uses it:
CREATE VIEW EmpSummary AS SELECT deptno, job, AVG(sal) AS avg_sal, AVG(sal) AS MEASURE avg_sal_measure, COUNT(*) + 1 AS MEASURE count_plus_one_measure FROM Emp GROUP BY deptno, job; SELECT deptno, AVG(avg_sal) AS a1, AGGREGATE(avg_sal_measure) AS a2, AGGREGATE(count_plus_one_measure) AS c1 FROM EmpSummary GROUP BY deptno;
Note that there is a special aggregate function, AGGREGATE, that rolls up measures. Columns a1 and a2 will contain different values; the first averages the averages, and the second computes the average from the raw data. Column c1 will return the number of employees in each department plus one, not rolling up the "plus one" for each distinct job in the department.
This is just a brief sketch illustrating the purpose of measures. This feature is experimental, the syntax will no doubt change, and much of the semantics (for example, what expressions are valid as measures, whether measures remain measures they appear in the SELECT clause of an enclosing query, and what is the "context" in which a measure is evaluated) need to be ironed out.
Attachments
Issue Links
- depends upon
-
CALCITE-5105 Add MEASURE type and AGGREGATE aggregate function
- Closed
- is related to
-
CALCITE-4483 WITHIN DISTINCT clause for aggregate functions (experimental)
- Closed
-
CALCITE-6425 Attributes
- Open
- relates to
-
CALCITE-5692 Add AT operator, for context-sensitive expressions
- Open
-
CALCITE-5802 In RelBuilder add method aggregateRex, to allow aggregating complex expressions such as "1 + SUM(x + 2)"
- Resolved
-
CALCITE-5869 LEAST_RESTRICTIVE does not use inner type of MEASURE for comparisons
- Closed
-
CALCITE-6013 RelBuilder should simplify plan by pruning unused measures
- Closed
-
CALCITE-6443 Create view based on LookML model
- Open
-
CALCITE-6459 Measures inside structured types (ROW, ARRAY, MAP)
- Open
-
CALCITE-6525 Query with one-to-many join of measure to regular table
- In Progress
-
CALCITE-6519 Non-aggregate query that uses measure in ORDER BY
- Resolved
- links to