OLAP specifications
Online analytical processing (OLAP) specifications provide the ability to return ranking, row numbering, and aggregation information as a scalar value in the result of a query. An OLAP specification can be included in an expression, in a select-list, or in the ORDER BY clause of a select-statement. The query result to which the OLAP specifications is applied is the result table of the innermost subselect that includes the OLAP specification.
Syntax for OLAP specifications
ordered-OLAP-specification
- 1 FL 504 This passthrough-only function cannot run on Db2 for z/OS without acceleration. See Accelerating queries with passthrough-only expressions.
lag-function
lead-function
window-partition-clause
window-order-clause
numbering-specification
aggregation-specification
aggregate-function
- 1 You cannot specify DISTINCT or ALL for an aggregate function that is included in an aggregation-specification.
OLAP-aggregate-function
- 1 FL 504 These passthrough-only functions cannot run on Db2 for z/OS without acceleration. See Accelerating queries with passthrough-only expressions.
first-value-function
last-value-function
nth-value-function
ratio-to-report-function
window-aggregation-group-clause
group-start
group-between
group-bound-1
group-bound-2
group-end
RANK, DENSE_RANK, and ROW_NUMBER are sometimes called window functions.
An OLAP specification is not valid in a WHERE, VALUES, GROUP BY, HAVING, or SET clause. An OLAP specification cannot be used as an argument of an aggregate function.
When invoking an OLAP specification, a window is specified that defines the rows over which the function is applied and in which order.
The result of a RANK, DENSE_RANK, or ROW_NUMBER specification is BIGINT. The result cannot be null.
- CUME_DIST
- FL 504 Returns a cumulative distribution of a row within an OLAP window, expressed as a value in the range of 0.0–1.0. The result is computed by dividing the number of rows preceding or equivalent to the current row in the OLAP window by the number of rows in the OLAP window.
The data type of the result is DECFLOAT(34). The result cannot be NULL.
Passthrough-only expression: This function is passthrough-only and cannot run on Db2 for z/OS without acceleration. For information about invoking this function, see Accelerating queries with passthrough-only expressions. - PERCENT_RANK
- FL 504 Returns a relative percentile rank of a row within an OLAP window, expressed as a value in the range 0.0–1.0. When the number of rows in the OLAP window is greater than 1, the result is computed by dividing the RANK of the current row in the OLAP window minus 1 by the number of rows in the OLAP window minus 1. Otherwise, the result is 0.0.Passthrough-only expression: This function is passthrough-only and cannot run on Db2 for z/OS without acceleration. For information about invoking this function, see Accelerating queries with passthrough-only expressions.
- RANK or DENSE_RANK
- Specifies that the ordinal rank of a row within the specified window is computed. Rows that are
not distinct with respect to the ordering within the specified window are assigned the same rank.
The results of the ranking can be defined with or without gaps in the numbers that result from
duplicate values.
- RANK
- Specifies that the rank of a row is defined as 1 plus the number of rows that strictly precede the row. Thus, if two or more rows are not distinct with respect to the ordering, there will be one or more gaps in the sequential rank numbering.
- DENSE_RANK
- Specifies that the rank of a row is defined as 1 plus the number of preceding rows that are distinct with respect to the ordering. Therefore, there will be no gaps in the sequential rank numbering.
- NTILE
- FL 504 Returns the quantile rank of a row.Passthrough-only expression: This function is passthrough-only and cannot run on Db2 for z/OS without acceleration. For information about invoking this function, see Accelerating queries with passthrough-only expressions.
- num-tiles
- An expression that specifies the number of quantiles. The expression must return a value that is a built-in numeric data type, CHAR, or VARCHAR data type. In a Unicode database, the expression can also be a GRAPHIC or VARGRAPHIC data type. CHAR, VARCHAR, GRAPHIC, and VARGRAPHIC are supported by using implicit casting. If the expression is not a SMALLINT, INTEGER, or BIGINT, it is cast to BIGINT before the function is evaluated. The value must be greater than 0. The expression must be a constant, a variable, or a cast of a constant or variable.
The data type of the result of NTILE is the same data type as the data type of num-tiles after any implicit casting. If the argument can be null, the result can be null. If the argument is null, the result is the null value.
The NTILE function computes the quantile rank of a row by dividing the ordered rows within the OLAP window into num-tiles quantiles and returns a value between 1 and MIN(n, num-tiles), where n is the number of rows within the OLAP window. If n is evenly divisible by num-tiles, the rows in the OLAP window are grouped into num-tiles quantiles, each containing (n / num-tiles) rows. Otherwise, each of the quantiles 1 through MOD(n, num-tiles) is assigned (n / num-tiles + 1) rows, and each of the quantiles (MOD(n, num-tiles) + 1) through num-tiles is assigned (n / num-tiles) rows. The result is the quantile rank which is associated with the current row.
Equivalent sort keys are not considered when rows are divided into quantiles. Rows with equivalent sort keys can be assigned to different quantiles based on the non-deterministic order of these sort keys. Therefore, NTILE is a non-deterministic function.
- LAG
- FL 504 Returns the expression value for the row at offset rows before the current row. The offset must be a positive integer constant. An offset value of 0 indicates the current row. If a window-partition-clause is specified, offset indicates offset rows before the current row and within the current partition. If offset is not specified, the value 1 is used. If default-value (which can be an expression) is specified, it will be returned if the offset goes beyond the scope of the current partition. Otherwise, the null value is returned. If 'IGNORE NULLS' is specified, all rows for which the expression value is the null value are not considered in the calculation. If 'IGNORE NULLS' is specified and all rows are null, default-value (or the null value if default-value was not specified) is returned.Passthrough-only expression: This function is passthrough-only and cannot run on Db2 for z/OS without acceleration. For information about invoking this function, see Accelerating queries with passthrough-only expressions.
- LEAD
- FL 504 Returns the expression value for the row at offset rows after the current row. The offset must be a positive integer constant. An offset value of 0 indicates the current row. If a window-partition-clause is specified, offset indicates offset rows after the current row and within the current partition. If offset is not specified, the value 1 is used. If default-value (which can be an expression) is specified, it will be returned if the offset goes beyond the scope of the current partition. Otherwise, the null value is returned. If 'IGNORE NULLS' is specified, all rows for which the expression value is the null value are not considered in the calculation. If 'IGNORE NULLS' is specified and all rows are null, default-value (or the null value if default-value was not specified) is returned.Passthrough-only expression: This function is passthrough-only and cannot run on Db2 for z/OS without acceleration. For information about invoking this function, see Accelerating queries with passthrough-only expressions.
- ROW_NUMBER
- Specifies that a sequential row number is computed for the row that is defined by the ordering, starting with 1 for the first row. If the ORDER BY clause is not specified in the window, the row numbers are assigned to the rows in an arbitrary order, as the rows are returned (not according to any ORDER BY clause in the select-statement).
- PARTITION BY partitioning-expression,...
- Defines the partition within which the OLAP operation is applied. A partitioning-expression is an expression that is used in defining the partitioning of the result table. Each column name that is referenced in a partitioning-expression must unambiguously reference a column of the result table of the subselect that contains the OLAP specification. A partitioning-expression cannot include a scalar-fullselect an XMLQUERY or XMLEXISTS expression or any function that is not deterministic or has an external action.
- ORDER BY sort-key-expression,...
- Defines the ordering of rows within a partition that is used to determine the value of the OLAP
specification. It does not define the ordering of the result table.
- sort-key-expression
- Specifies an expression to use in defining the ordering of the rows within a window partition. Each column name that is referenced in a sort-key-expression must unambiguously reference a column of the result table of the subselect, including the OLAP specification. A sort-key-expression cannot include a scalar fullselect, an XMLQUERY or XMLEXISTS expression, or any function that is not deterministic or that has an external action.
- ASC
- Specifies that the values of sort-key-expression are used in ascending order.
- DESC
- Specifies that the values of sort-key-expression are used in descending order.
- NULLS FIRST
- Specifies that the window ordering considers null values before all non-null values in the sort order.
- NULLS LAST
- Specifies that the window ordering considers null values after all non-null values in the sort order.
- FIRST_VALUE
- FL 504 Returns the expression value for the first row in an OLAP window. If 'IGNORE NULLS' is specified, all rows for which the expression value is the null value are not considered in the calculation. If 'IGNORE NULLS' is specified and all values in the OLAP window are null, FIRST_VALUE returns the null value.Passthrough-only expression: This function is passthrough-only and cannot run on Db2 for z/OS without acceleration. For information about invoking this function, see Accelerating queries with passthrough-only expressions.
- LAST_VALUE
- FL 504 Returns the expression value for the last row in an OLAP window. If 'IGNORE NULLS' is specified, all rows for which the expression value is the null value are not considered in the calculation. If 'IGNORE NULLS' is specified and all values in the OLAP window are null, LAST_VALUE returns the null value.Passthrough-only expression: This function is passthrough-only and cannot run on Db2 for z/OS without acceleration. For information about invoking this function, see Accelerating queries with passthrough-only expressions.
- NTH_VALUE
- FL 504 Returns the expression value for the nth-row row in an OLAP window.Passthrough-only expression: This function is passthrough-only and cannot run on Db2 for z/OS without acceleration. For information about invoking this function, see Accelerating queries with passthrough-only expressions.
- expression
- An expression that specifies the current row in an OLAP window. The expression must return a value that is a built-in data type.
- nth-row
- An expression that specifies which row of the OLAP window to return. The expression must return a value that is a built-in numeric data type, a CHAR, or a VARCHAR data type. In a Unicode database, the expression can also be a GRAPHIC or VARGRAPHIC data type. CHAR, VARCHAR, GRAPHIC, and VARGRAPHIC are supported using implicit casting. If the expression is not a SMALLINT, INTEGER, or BIGINT, it is cast to BIGINT before the function is evaluated. The value must be greater than 0. The expression must be a constant, a variable, or a cast of a constant or variable.
Db2 for z/OS does not support the FROM FIRST, FROM LAST, RESPECT NULLS, and IGNORE NULLS keywords for this function. Omitting these keywords in Db2 for Linux®, UNIX, and Windows results in the default behavior FROM FIRST RESPECT NULLS.
- RATIO_TO_REPORT
- FL 504 Returns the ratio of an argument to the sum of the arguments in an OLAP partition. The division is always performed using DECFLOAT(34). The result data type is DECFLOAT(34). If the argument can be null, the result can be null; if the argument is null, the result is the null value.Passthrough-only expression: This function is passthrough-only and cannot run on Db2 for z/OS without acceleration. For information about invoking this function, see Accelerating queries with passthrough-only expressions.
- window-aggregation-group-clause
- The aggregation group of a given row is a set of rows that is defined in relation to the given
row (in the ordering of the rows in the partition of the given row).
window-aggregation-group-clause specifies the aggregation group. If this
clause is not specified and a window-order-clause is also not specified,
the aggregation group consists of all rows of the window partition. The aggregation group of all
rows of the window partition can be explicitly specified using the RANGE or ROWS clauses.
If window-order-clause is specified, but window-aggregation-group-clause is not specified, the window aggregation group consists of all rows that precede a given row of the partition of the given row or all rows that are peers of the given row in the window ordering of the window partition that is defined by the window-order-clause
- ROW
- Specifies that the aggregation group is defined by counting rows.
- RANGE
- Specifies that the aggregation group is defined by an offset from a sort key.
- group-start
- Specifies the starting point for the aggregation group. The aggregation group end is the CURRENT ROW. Specifying group-start is equivalent to specifying group-between as BETWEEN group-start AND CURRENT ROW.
- group-between
- Specifies that the aggregation group start and end based on either ROWS or RANGE.
- group-end
- Specifies the ending point for the aggregation group. The aggregation group start is the CURRENT ROW. Specifying group-end is equivalent to specifying group-between as BETWEEN CURRENT ROW AND group-end.
- UNBOUNDED PRECEDING
- Specifies that the entire partition that precedes the current row is included in the aggregation group. This can be specified with either the ROWS or RANGE clauses. Including the entire partition that precedes the current row can also be specified with multiple sort-key-expressions in the window-order-clause.
- UNBOUNDED FOLLOWING
- Specifies that the entire partition that follows the current row is included in the aggregation group. This can be specified with either the ROWS or RANGE clauses. Including the entire partition that follows the current row can also be specified with multiple sort-key-expressions in the window-order-clause.
- CURRENT ROW
- Specifies that the aggregation group starts or ends based on the current row. IF ROWS is specified, the current row is the aggregation group boundary. If RANGE is specified, the aggregation group boundary includes the set of rows with the values specified for the sort-key-expression as the current row. This clause cannot be specified in group-bound-2 if group-bound-1 specifies unsigned-constant FOLLOWING.
- unsigned-constant PRECEDING
- Specifies either the range or the number of rows that precede the current row. If ROWS is specified, unsigned-constant must be zero or a positive integer that indicates a number of rows. If RANGE is specified, the data type of unsigned-constant must be comparable to the data type of the sort-key-expression of the window-order-clause. Only one sort-key-expression is allowed, and the data type of sort-key-expression must allow subtraction. This clause cannot be specified in group-bound-2 if group-bound-1 is CURRENT ROW or unsigned-constant FOLLOWING.
- unsigned-constant FOLLOWING
- Specifies either the range or the number of rows that follow the current row. If ROWS is specified, unsigned-constant must be zero or a positive integer that indicates a number of rows. If RANGE is specified, the data type of unsigned-constant must be comparable to the data type of the sort-key-expression of the window-order-clause. Only one sort-key-expression is allowed, and the data type of sort-key-expression must allow addition.
Notes for OLAP Specification
Using a column mask with an OLAP specification: If a column mask is used to mask the column values in the final result table and an OLAP specification is referenced in the select list that is used to derive the final result table, the column mask cannot be applied to the column that is specified in the partitioning-expression or the sort-key-expression in the OLAP specification.
Syntax alternatives and synonyms: For compatibility, the keywords DENSERANK and ROWNUMBER can be used as synonyms for DENSE_RANK and ROW_NUMBER respectively.
Examples for OLAP specification
SELECT EMPNO, LASTNAME, FIRSTNME, SALARY+BONUS AS TOTAL_SALARY,
RANK() OVER(ORDER BY SALARY+BONUS DESC) AS RANK_SALARY
FROM EMP WHERE SALARY+BONUS > 30000
ORDER BY LASTNAME;
If
the result is to be ordered by rank, ORDER BY LASTNAME
would be replaced with
ORDER BY RANK_SALARY
.SELECT WORKDEPT, AVG(SALARY+BONUS) AS AVG_TOTAL_SALARY,
RANK() OVER(ORDER BY AVG(SALARY+BONUS) DESC) AS RANK_AVG_SAL
FROM EMP
GROUP BY WORKDEPT
ORDER BY RANK_AVG_SAL;
SELECT WORKDEPT, EMPNO, LASTNAME, FIRSTNME, EDLEVEL,
DENSE_RANK() OVER
(PARTITION BY WORKDEPT ORDER BY EDLEVEL DESC) AS RANK_EDLEVEL
FROM EMP
ORDER BY WORKDEPT, LASTNAME;
SELECT ROW_NUMBER() OVER(ORDER BY WORKDEPT, LASTNAME) AS NUMBER,
LASTNAME, SALARY
FROM EMP
ORDER BY WORKDEPT, LASTNAME;
SELECT EMPNO, LASTNAME, FIRSTNME, TOTAL_SALARY, RANK_SALARY
FROM (SELECT EMPNO, LASTNAME, FIRSTNME, SALARY+BONUS AS TOTAL_SALARY,
RANK() OVER (ORDER BY SALARY+BONUS DESC) AS RANK_SALARY
FROM EMP) AS RANKED_EMPLOYEE
WHERE RANK_SALARY < 6
ORDER BY RANK_SALARY;
A
nested table expression is used to first compute the result, including the ranking, before the rank
can be used in the WHERE clause. A common table expression could also have
been used.CREATE VIEW V1 AS
SELECT SYMBOL, TRADINGDATE,
AVG(CLOSINGPRICE) OVER (PARTITION BY SYMBOL
ORDER BY TRADINGDATE
ROWS BETWEEN 29 PRECEDING AND CURRENT ROW)
FROM DAILYSTOCKDATA
WHERE SYMBOL IN ('ABC', 'XYZ')
AND TRADINGDATE BETWEEN DATE('2005-01-01') - 2 MONTHS AND '2005-12-31';
SELECT SYMBOL, TRADINGDATE, MOVINGAVG30DAY
FROM V1
WHERE TRADINGDATE BETWEEN '2005-01-01' AND '2005-12-31'
ORDER BY SYMBOL, TRADINGDATE;
SELECT EMPNO, SALARY, NTILE(4) OVER
(ORDER BY SALARY) AS QUARTILE
FROM EMPLOYEE
ORDER BY SALARY
The
result set
is:EMPNO SALARY QUARTILE
------ ----------- -----------
200340 31840.00 1
000290 35340.00 1
200330 35370.00 1
000310 35900.00 1
200310 35900.00 1
000280 36250.00 1
000270 37380.00 1
000300 37750.00 1
200240 37760.00 1
200120 39250.00 1
000320 39950.00 1
000230 42180.00 2
000340 43840.00 2
000170 44680.00 2
000330 45370.00 2
200280 46250.00 2
200010 46500.00 2
000260 47250.00 2
000240 48760.00 2
000250 49180.00 2
000120 49250.00 2
000220 49840.00 2
000190 50450.00 3
000180 51340.00 3
000150 55280.00 3
000200 57740.00 3
000160 62250.00 3
200170 64680.00 3
000110 66500.00 3
000210 68270.00 3
000140 68420.00 3
200140 68420.00 3
200220 69840.00 4
000060 72250.00 4
000130 73800.00 4
000050 80175.00 4
000100 86150.00 4
000090 89750.00 4
000020 94250.00 4
000070 96170.00 4
000030 98250.00 4
000010 152750.00 4
42 record(s) selected.
SELECT NTILE(3) OVER (ORDER BY MAX_SALARY) AS Bucket,
MAX_SALARY FROM GOSALESDW.EMP_POSITION_DIM;
A
portion of the output from the query is in the following table:
BUCKET | MAX_SALARY |
---|---|
1 | 0.00 |
... | ... |
1 | 35000.00 |
2 | 5000.00 |
... | ... |
2 | 12000.00 |
3 | 13000.00 |
... | ... |
3 | 301500.00 |