Hierarchical Clause
The Hierarchical clause sets the conditions for recursive queries on a table object in which a hierarchy of parent-child dependencies exists among the rows. SELECT statements that include this clause are called hierarchical queries.
The table object on which the hierarchical query operates must be specified in the FROM clause of the SELECT statement. The table object is typically a self-referencing table in which one or more columns acts as a foreign key constraint for another column (or for a subset of the columns) in the same table.
A hierarchical query operates on rows in which one or more column values correspond to nodes within a logical structure of parent-child relationships. If parent rows have multiple children, sibling relationships exist among child rows of the same parent. These relationships might reflect, for example, the reporting structure among employees and managers within the divisions and management levels of an organization.
The syntax that this clause supports is an extension to the ANSI/ISO standard for SQL. This syntax fragment is part of the SELECT statement.
Hierarchical Clause (2) |--+---------------------------+--| CONNECT BY Clause |--------| | (1) | '-| START WITH Clause |-----'
- A table or updateable view
- A temporary table
- A table in another database of the same Informix instance to which the session is connected
- A derived table that is the result of a query
- A table that is protected by a label-based access control (LBAC) security policy
- A table with column level encryption or row level encryption
- A synonym for any of the other table objects.
- A join of two or more tables
- A view that is not updatable
- A table in a database of a remote Informix instance
- An external table that the CREATE EXTERNAL TABLE statement defined
- A sequence object.
Informix supports sequence objects in the projection list of hierarchical queries, in the WHERE clause, and in other contexts where an expression is valid in SELECT statements, but not in the hierarchical query clause.
The hierarchical clause is valid in correlated subqueries and in uncorrelated subqueries.
- Join-order directives
- Join-method directives
Hierarchical queries do not support the Parallel Database Query (PDQ) feature of Informix.
- The hierarchy can be a simple hierarchy, such as the reporting structure of an organization in which every node that is not the root reports to a single node at higher level within the hierarchy. (In the LBAC security feature of Informix, a security label component of type TREE has the logical structure of a simple hierarchy.)
- The Hierarchical clause can query data hierarchies of more complex topologies, in which nodes have many-to-many relationships, and in which a child node can be the ancestor of its parent. For information about using the Hierarchical clause to query a table that has cycles within the data hierarchy, see CONNECT BY Clause.
SQL Syntax Specific to Hierarchical Queries
- CONNECT_BY_ROOT operator
This operator can return an expression for the root ancestor of its operand.
- PRIOR operator
This operator can reference a returned value from the previous recursive step (where "step" refers to an iteration of the recursive query).
- LEVEL pseudocolumn
This pseudocolumn returns an integer, indicating which step of the recursive query returned a row within the hierarchy.
- CONNECT_BY_ISCYCLE pseudocolumn
This pseudocolumn can indicate whether a row has a child row that is also its ancestor.
- CONNECT_BY_ISLEAF pseudocolumn
This pseudocolumn can indicate whether a row has any children among the rows that the query returns.
- SYS_CONNECT_BY_PATH function
This function can construct and return a string that represents the path from a specified row to the root of the hierarchy
- SIBLINGS keyword in the ORDER BY clause
The ORDER SIBLINGS BY clause can sort returned rows for siblings of the same parent at every level.
A pseudocolumn is a built-in identifier that the SQL parser can recognize in specific contexts, and that shares the same namespace as columns and variables. These pseudocolumns and the SYS_CONNECT_BY_PATH function are typically specified in the Projection clause of the SELECT statement, but the LEVEL pseudocolumn and the PRIOR operator can be specified in the Hierarchical clause.
For details of the syntax and semantics of these tokens that support only hierarchical queries, see Conditions in the CONNECT BY Clause and ORDER SIBLINGS BY Clause.
Overview of Hierarchical Queries
- FROM clause (for only a single table object in the current database)
- Hierarchical clause
- WHERE clause (without join predicates)
- GROUP BY clause
- HAVING clause
- Projection clause
- ORDER BY clause
The ORDER BY SIBLING option of the ORDER BY clause can order the set of rows that are children of the same parent.
A subquery that includes the Hierarchical clause returns the intermediate result set in a partial order, where the rows produced in iteration (n+1) for a specific hierarchy immediately follow the row in iteration (n) that produced them. However, specifying an ORDER BY clause, a GROUP BY or HAVING clause, or the DISTINCT or UNIQUE keyword in the Projection clause destroys that partial order.
The Hierarchical clause follows the WHERE clause in the lexical sequence of SELECT statement clauses, but the WHERE clause predicates are processed on the result of the Hierarchical clause. The WHERE clause cannot specify join predicates if the SELECT statement includes the Hierarchical clause, but the table object that is specified in the FROM clause can be the result set of a query that joins one or more tables.
- The optional START WITH clause can specify a condition. Any rows that satisfy this condition are returned as the first intermediate result set of the hierarchical query.
- The next step applies the condition that is specified in the CONNECT BY clause to the table. Any rows that satisfy that condition are returned as the second intermediate result set.
- The next step applies the CONNECT BY condition to the table. Any rows that are returned comprise the third intermediate result set.
- The CONNECT BY clause runs queries recursively to produce successive intermediate result sets, until an iteration yields an empty result set.
- The hierarchical SELECT statement then combines all of the intermediate result sets of the preceding recursive steps, producing the final result set of the Hierarchical clause.
- The predicates of the WHERE clause are then applied to this set of rows that the Hierarchical clause retrieved, and the remaining clauses of the SELECT statement are then applied in the order listed.
After the START WITH and CONNECT BY clauses return all of the intermediate result sets, you can use the ORDER SIBLINGS BY clause to sort the sibling rows that have the same parent for every level within the hierarchy. For more information, see ORDER SIBLINGS BY Clause.
You can use output from the SET EXPLAIN statement to view the execution path of a hierarchical query.
The Hierarchical clause provides an efficient alternative to using the Node database extension to retrieve information from hierarchical data sets
Example of a Hierarchical Data Set
In several topics that follow, SQL code examples that illustrate hierarchical queries are based on hierarchic data in the following employee table, whose rows contains information about employees within an organizational hierarchy. The mgrid column shows the employee identifier (empid) of the manager to whom the employee reports:
CREATE TABLE employee(
empid INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(10),
salary DECIMAL(9, 2),
mgrid INTEGER
);
Data values for the 17 rows in the employee table are these.
INSERT INTO employee VALUES ( 1, 'Jones', 30000, 10);
INSERT INTO employee VALUES ( 2, 'Hall', 35000, 10);
INSERT INTO employee VALUES ( 3, 'Kim', 40000, 10);
INSERT INTO employee VALUES ( 4, 'Lindsay', 38000, 10);
INSERT INTO employee VALUES ( 5, 'McKeough', 42000, 11);
INSERT INTO employee VALUES ( 6, 'Barnes', 41000, 11);
INSERT INTO employee VALUES ( 7, 'O''Neil', 36000, 12);
INSERT INTO employee VALUES ( 8, 'Smith', 34000, 12);
INSERT INTO employee VALUES ( 9, 'Shoeman', 33000, 12);
INSERT INTO employee VALUES (10, 'Monroe', 50000, 15);
INSERT INTO employee VALUES (11, 'Zander', 52000, 16);
INSERT INTO employee VALUES (12, 'Henry', 51000, 16);
INSERT INTO employee VALUES (13, 'Aaron', 54000, 15);
INSERT INTO employee VALUES (14, 'Scott', 53000, 16);
INSERT INTO employee VALUES (15, 'Mills', 70000, 17);
INSERT INTO employee VALUES (16, 'Goyal', 80000, 17);
INSERT INTO employee VALUES (17, 'Urbassek', 95000, NULL);
Each pair of empid and mgrid values express referential relationships that the recursive iterations of a query with an appropriate CONNECT BY condition can correctly assemble into a hierarchy.
Here the NULL value in the mgrid column in the last row shows that employee Urbassek, whose empid value is 17 is the root node of this reporting hierarchy.
The following diagram illustrates the four levels of the reporting hierarchy (with nodes that show the empid values) for the employee table data: