6- Hierachical Querys(级联查询)

 
relational database is based upon sets, with each table representing a set. However, there are some types of information that are not directly amenable to the set data structure. Think, for example, of an organization chart, a bill of material in a manufacturing and assembly plant, or a family tree. These types of information are hierarchical in nature, and most conveniently represented in a tree structure.
 
To represent hierarchical data, we need to make use of a relationship such as when one column of a table references another column of the same table. When such a relationship is implemented using a database constraint, it is known as self-referential integrity constraint( 当一个表中的某个字段引用了同一个表中的其他字段,就是我们常说的“自引用”完整性约束) . The corresponding CREATE TABLE statement will look as follows
 
CREATE TABLE EMPLOYEE
(
    EMP_ID          NUMBER (4) CONSTRAINT EMP_PK PRIMARY KEY,
    LNAME           VARCHAR2 (15)NOT NULL,
    DEPT_ID         NUMBER (2)NOT NULL,
    MANAGER_EMP_ID NUMBER (4) CONSTRAINT EMP_FK REFERENCES EMPLOYEE(EMP_ID),
)
The column MANAGER_EMP_ID stores the EMP_ID of the employee's manager.There is a foreign key constraint on the MANAGER_EMP_ID column. This enforces the rule that any value we put in the MANAGER_EMP_ID column must be the EMP_ID of a valid employee. Such a constraint is not mandatory when representing hierarchical information. However, it is a good practice to define database constraints to enforce such business rules.
(在字段 MANAGER_EMP_ID上有一个外键索引,该外键索引指向了同一个表中的另一个字段EMP_ID,这就强迫我们向该字段添加的任何值必须是employee表中一个有效的EMP_ID值,像这样的约束并不是强制必要的,但在执行某些商业逻辑上定义这样的约束是一个好习惯。)
 
1.简单的级联操作:
 
A .Finding the Root Node:
Finding the root of a hierarchy tree is easy; we look for the one node with no parent.
 
B .Finding a Node's Immediate Parent:
We may wish to link nodes to their immediate parents. For example, we might want to print a report showing each employee's manager. The name of each employee's manager can be derived by joining the EMPLOYEE table to itself. This type of join is a self join. The following query returns the desired result:
 
SELECT E.LNAME "Employee", M.LNAME "Manager"
 FROM EMPLOYEE E, EMPLOYEE M
WHERE E.MANAGER_EMP_ID = M.EMP_ID;
 
The reason that only 13 rows are returned from the self join is simple. This query lists employees and their managers. But since the uppermost employee KING doesn't have any manager, that row is not produced in the output. If we want all the employees to be produced in the result, we need an outer join, as in the following example:
 
SELECT E.LNAME "Employee", M.LNAME "Manager"
 FROM EMPLOYEE E, EMPLOYEE M
WHERE E.MANAGER_EMP_ID = M.EMP_ID (+);
 
C.Finding the leaf nodes
The opposite problem from finding the root node, which has no parent, is to find leaf nodes, which have no children. Employees who do not manage anyone are the leaf nodes in the hierarchy tree shown in Figure 8-1. At first glance, the following query seems like it should list all employees from the EMPLOYEE table who are not managers of any other employee:
 
SELECT * FROM EMPLOYEE
WHERE EMP_ID NOT IN (SELECT MANAGER_EMP_ID FROM EMPLOYEE);
 
However, when we execute this statement, we will see "No rows selected." Why? It is because the MANAGER_EMP_ID column contains a NULL value in one row (for the uppermost employee), and NULLs can't be compared to any data value.(然而,我们很奇怪地发现:没有一条记录返回!为什么?因为最顶层的节点其 MANAGER_EMP_ID列包含了空值,而空值是无法和其他类型的数据进行比较的。)
Therefore, to get the employees who don't manage anyone, we need to rewrite the query as follows:
 
SELECT EMP_ID, LNAME, DEPT_ID, MANAGER_EMP_ID, SALARY, HIRE_DATE
  FROM EMPLOYEE E
WHERE EMP_ID NOT IN (SELECT MANAGER_EMP_ID FROM EMPLOYEE
                                WHERE MANAGER_EMP_ID IS NOT NULL);
 
In this example, the subquery returns the EMP_IDs of all the managers. The outer query then returns all the employees, except the ones returned by the subquery. This query can also be written as a correlated subquery using EXISTS instead of IN:
 
SELECT EMP_ID, LNAME, DEPT_ID, MANAGER_EMP_ID, SALARY, HIRE_DATE
   FROM EMPLOYEE E
 WHERE NOT EXISTS (SELECT EMP_ID FROM EMPLOYEE E1
                            WHERE E.EMP_ID = E1.MANAGER_EMP_ID);
 
2. Oracle SQL扩展:
For example, let's say we want to list each employee with his manager. Using regular Oracle SQL, we can perform self outer joins on the EMPLOYEE table, as shown here:
 
SELECT E_TOP.LNAME, E_2.LNAME, E_3.LNAME, E_4.LNAME
  FROM EMPLOYEE E_TOP, EMPLOYEE E_2, EMPLOYEE E_3, EMPLOYEE E_4
WHERE E_TOP.MANAGER_EMP_ID IS NULL
   AND E_TOP.EMP_ID = E_2.MANAGER_EMP_ID (+)
   AND E_2.EMP_ID = E_3.MANAGER_EMP_ID (+)
   AND E_3.EMP_ID = E_4.MANAGER_EMP_ID (+);
  
The query returns eight rows, corresponding to the eight branches of the tree. To get those results, the query performs a self join on four instances of the EMPLOYEE table. Four EMPLOYEE table instances are needed in this statement because there are four levels to the hierarchy. Each level is represented by one copy of the EMPLOYEE table. The outer join is required because one employee (KING) has a NULL value in the MANAGER_EMP_ID column
 
This type query has several drawbacks. First of all, we need to know the number of levels in an organization chart when we write the query, and it's not realistic to assume that we will know that information. It's even less realistic to think that the number of levels will remain stable over time. Moreover, we need to join four instances of the EMPLOYEE table together for a four level hierarchy. Imagine an organization with 20 levels—we'd need to join 20 tables. This would cause a huge performance problem.
 
A .START WITH...CONNECT BY and PRIOR:
We can extract information in hierarchical form from a table containing hierarchical data by using the SELECT statement's START WITH...CONNECT BY clause. The syntax for this clause is:
 
[[START WITH condition1] CONNECT BY condition2]
The syntax elements are:
 
START WITH condition1
Specifies the root row(s) of the hierarchy. All rows that satisfy condition1 are considered root rows. If we don't specify the START WITH clause, all rows are considered root rows, which is usually not desirable. We can include a subquery in condition1.
(Start with 用于指定级联结构中的根节点,我们可以在Start with子句中使用子查询)
 
CONNECT BY condition2
Specifies the relationship between parent rows and child rows in the hierarchy. The relationship is expressed as a comparison expression, where columns from the current row are compared to corresponding parent columns. condition2 must contain the PRIOR operator, which is used to identify columns from the parent row. condition2 cannot contain a subquery.
(Connect by 用于指定级联结构中父记录和子记录之间的对应关系,condition2必须包含Prior操作符,该操作符用于表示父记录中的列,condition2不能含有子查询)
 
PRIOR is a built-in Oracle SQL operator that is used with hierarchical queries only. In a hierarchical query, the CONNECT BY clause specifies the relationship between parent and child rows. When we use the PRIOR operator in an expression in the CONNECT BY condition, the expression following the PRIOR keyword is evaluated for the parent row of the current row in the query. In the following example, PRIOR is used to connect each row to its parent by connecting MANAGER_EMP_ID in the child to EMP_ID in the parent:
(Prior 操作符是Oracle中专门用于级联查询的操作符,当我们在一个表达式中使用Prior操作符时,紧跟着Prior关键字的表达式将在查询中被当成父记录看待,等号右边的表达式将被看成是子记录)
 
例:
SELECT LNAME, EMP_ID, MANAGER_EMP_ID
 FROM EMPLOYEE
 START WITH MANAGER_EMP_ID IS NULL
CONNECT BY PRIOR EMP_ID = MANAGER_EMP_ID;
 
LNAME                   EMP_ID           MANAGER_EMP_ID
-------------------- ----------         ----------------
KING                       7839
JONES                      7566             7839
SCOTT                      7788             7566
ADAMS                    7876             7788
FORD                       7902             7566
 
The PRIOR column does not need to be listed first. The previous query could be restated as:
 
SELECT LNAME, EMP_ID, MANAGER_EMP_ID
 FROM EMPLOYEE
 START WITH MANAGER_EMP_ID IS NULL
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID;
 
Since the CONNECT BY condition specifies the parent-child relationship, it cannot contain a loop. If a row is both parent (direct ancestor) and child (direct descendent) of another row, then we have a loop. For example, if the EMPLOYEE table had the following two rows, they would represent a loop:
 
EMP_ID LNAME         DEPT_ID MANAGER_EMP_ID    SALARY HIRE_DATE
------ ---------- --------- -------------- --------- ---------
 9001 SMITH              20           9002      1800 15-NOV-61
 9002 ALLEN              30           9001     11600 16-NOV-61
***************************************************************************************
 
When a parent-child relationship involves two or more columns, we need to use the PRIOR operator before each parent column.( 当一个父子记录之间的关系受两个或多个字段影响时,我们就必须在每个父字段的前面都加上Prior关键字)
 
SELECT * FROM ASSEMBLY
  START WITH PARENT_ASSEMBLY_TYPE IS NULL AND PARENT_ASSEMBLY_ID IS NULL
CONNECT BY PARENT_ASSEMBLY_TYPE = PRIOR ASSEMBLY_TYPE
AND PARENT_ASSEMBLY_ID = PRIOR ASSEMBLY_ID;
 
B .The LEVEL Pseudocolumn:
In a hierarchy tree, the term level refers to one layer of nodes.Oracle provides a pseudocolumn, LEVEL, to represent these levels in a hierarchy tree. Whenever we use the START WITH...CONNECT BY clauses in a hierarchical query, we can use the pseudocolumn LEVEL to return the level number for each row returned by the query.(在一棵级联结构的树中, level用来反映某一层的节点,Oracle提供了一个伪列(pseudocolumn)-Level,用来表示级联树中的层次,不管我们在级联查询中如何使用Start with…Conect by子句,我们可以使用伪列“level”来返回查询结果中任何一条记录的层数)
 
The following example illustrates the use of the LEVEL pseudocolumn
SELECT LEVEL, LNAME, EMP_ID, MANAGER_EMP_ID
      FROM EMPLOYEE
      START WITH MANAGER_EMP_ID IS NULL
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID;
 
LEVEL LNAME         EMP_ID         MANAGER_EMP_ID
----- ----- -------------------- ---------------
1  KING         7839
   2  JONES       7566             7839
   3  SCOTT       7788             7566
   4  ADAMS       7876             778
 
3.复杂级联运算:
A .Finding the Number of Levels:
Previously we showed how the LEVEL pseudocolumn generates a level number for each record when we use the START WITH...CONNECT BY clause. We can use the following query to determine the number of levels in the hierarchy by counting the number of distinct level numbers returned by the LEVEL pseudocolumn:
 
SELECT COUNT(DISTINCT LEVEL)
   FROM EMPLOYEE
 START WITH MANAGER_EMP_ID IS NULL
CONNECT BY PRIOR EMP_ID = MANAGER_EMP_ID;
 
COUNT(DISTINCTLEVEL)
--------------------
                   4
 
B .Listing Records in Hierarchical Order
One of the very common programming challenges SQL programmers face is to list records in a hierarchy in their proper hierarchical order. For example, we might wish to list employees with their subordinates underneath them, as is in the following query
 
SELECT LEVEL, LPAD(' ',2*(LEVEL - 1)) || LNAME "EMPLOYEE",EMP_ID, MANAGER_EMP_ID
 FROM EMPLOYEE
 START WITH MANAGER_EMP_ID IS NULL
CONNECT BY PRIOR EMP_ID = MANAGER_EMP_ID;
 
LEVEL Employee         EMP_ID MANAGER_EMP_ID
--------- ------------ --------- --------------
        1 KING              7839               
        2   JONES           7566           7839
        3     SCOTT         7788           7566
        4       ADAMS       7876           7788
 
Notice that by using the expression LPAD(' ',2*(LEVEL - 1)), we are able to align employee names in a manner that corresponds to their level. As the level number increases, the number of spaces returned by the expression increases, and the employee name is further indented.
(注意:通过使用 LPAD(' ',2*(LEVEL - 1))表达式,我们可以用一种特定的方式来对齐各个级别的用户名,随着级别数目的增加,该表达式返回的空格数目也随着增加,各个用户名之间更加交错排列)
 
Instead of reporting out the whole organization chart, we may want to list only the subtree under a given employee, JONES for example. To do this, we can modify the START WITH condition so that it specifies JONES as the root of the query. For example
 
SELECT LEVEL, LPAD('     ',2*(LEVEL - 1)) || LNAME "EMPLOYEE", EMP_ID, MANAGER_EMP_ID, SALARY
 FROM EMPLOYEE
  START WITH LNAME = 'JONES'
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID;
 
LEVEL Employee         EMP_ID MANAGER_EMP_ID     SALARY
--------- ------------ --------- -------------- ---------
        1 JONES             7566           7839       2000
        2   SCOTT           7788           7566       3000
        3     ADAMS         7876           7788       1100
        2   FORD            7902           7566       3000
        3     SMITH         7369           7902        800
Notice that since we asked the query to consider JONES as the root of the hierarchy, it assigned level 1 to JONES, level 2 to employees directly reporting to him, and so forth. Be careful while using conditions such as LNAME = 'JONES' in hierarchical queries. In this case, if we have two JONES in our organization, the result returned by the hierarchy may be wrong. It is better to use primary or unique key columns, such as EMP_ID, as the condition in such situations
 
(注意:因为我们在查询中指定 JONES作为级联查询的起始点,它指定JONES的级别为1,其直接下属的级别为2,实际上这和我们原来查询得出的结果不同。当在级联查询中使用诸如LNAME = ‘JONES’的条件时要特别注意,在这种情况下,如果我们刚好有两个同名的JONES,由此返回的结果集将有可能错误。相比之下,使用主键索引或惟一索引,比如EMP_ID作为条件更加合适)
 
In this example, we listed the portion of the organization chart headed by a specific employee. There could be situations when we may need to print the organization chart headed by any employee that meets a specific condition. For example, we may want to list all employees under the employee who has been working in the company for the longest time. In this case, the starting point of the query (the root) is dependent on a condition. Therefore, we have to use a subquery to generate this information and pass it to the main query, as in the following example
 
SELECT LEVEL, LPAD('     ',2*(LEVEL - 1)) || LNAME "EMPLOYEE",EMP_ID, MANAGER_EMP_ID, SALARY
 FROM EMPLOYEE
START WITH HIRE_DATE = (SELECT MIN(HIRE_DATE) FROM EMPLOYEE)
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID;
(该级联查询的起始条件中使用了子查询,先查出入职日期最早的时间,作为起始条件,然后查出其下的所有子节点的记录 )
 
While using a subquery in the START WITH clause, be aware of how many rows will be returned by the subquery. If more than one row is returned when we are expecting just one row (indicated by the = sign), the query will generate an error. We can get around this by replacing = with the IN operator, but be warned that the hierarchical query may then end up dealing with multiple roots.
 
C .Checking for Ascendancy(检查节点之间的父子关系):
Another common operation on hierarchical data is to check for ascendancy. In an organization chart, we may ask whether one employee has authority over another. For example: "Does JONES have any authority over BLAKE?" To find out, we need to search for BLAKE in the subtree headed by JONES. If we find BLAKE in the subtree, then we know that BLAKE either directly or indirectly reports to JONES. If we don't find BLAKE in the subtree, then we know that JONES doesn't have any authority over BLAKE. The following query searches for BLAKE in the subtree headed by JONES:
SELECT * FROM EMPLOYEE
WHERE LNAME = 'BLAKE'
 START WITH LNAME = 'JONES'
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID;
 
The START WITH...CONNECT BY clause in this example generates the subtree headed by JONES, and the WHERE clause filters this subtree to find BLAKE. As we can see, no rows were returned. This means that BLAKE was not found in JONES' subtree, so we know that JONES has no authority over BLAKE
 
D .Deleting a Subtree(删除子树):
Let's assume that the organization we are dealing with splits, and JONES and all his subordinates form a new company. Therefore, we don't need to maintain JONES and his subordinates in our EMPLOYEE table. Furthermore, we need to delete the entire subtree headed by JONES, as shown in Figure 8-1, from our table. We can do this by using a subquery as in the following example:
 
DELETE FROM EMPLOYEE
 WHERE EMP_ID IN (SELECT EMP_ID FROM EMPLOYEE
                        START WITH LNAME = 'JONES'
                        CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID);
 
In this example, the subquery generates the subtree headed by JONES, and returns the EMP_IDs of the employees in that subtree, including JONES'. The outer query then deletes the records with these EMP_ID values from the EMPLOYEE table.
 
E .Listing Multiple Root Nodes(列出所有的根节点):
An interesting variation on the problem of listing the root node of a hierarchy is to find and list the root nodes from several hierarchies that are all stored in the same table. For example, we might consider department manager's to represent root nodes, and we might further wish to list all department managers found in the EMPLOYEE table
 
There are no constraints on the employees belonging to any department. However, we can assume that if A reports to B and B reports to C, and A and C belong to the same department, then B also belongs to the same department.
 
If an employee's manager belongs to another department, then that employee is the uppermost employee, or manager, of his department.Therefore, to find the uppermost employee in each department, we need to search the tree for those employees whose managers belong to a different department then their own.
( 假如一个雇员的上级隶属于另一个部门,那么该雇员肯定是其所在部门中的最顶端的雇员,或者说是该部门的经理。所以为了找出employee表中所有部门的最顶端的雇员,我们必须查找整棵树,找出那些雇员的上级隶属于另一个部门的节点)
   
    SELECT EMP_ID, LNAME, DEPT_ID, MANAGER_EMP_ID, SALARY, HIRE_DATE
      FROM EMPLOYEE
  START WITH MANAGER_EMP_ID IS NULL
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID
    AND DEPT_ID != PRIOR DEPT_ID;
 
EMP_ID LNAME     DEPT_ID MANAGER_EMP_ID SALARY HIRE_DATE
------ -------- -------- -------------- ------ ---------
 7839 KING            10                  5000 17-NOV-81
 7566 JONES           20           7839   2975 02-APR-81
 7698 BLAKE           30           7839   2850 01-MAY-81
 
F .Listing the Top Few Levels of a Hierarchy(列出级联记录中的若干层记录):
Another common task in dealing with hierarchical data is listing the top few levels of a hierarchy tree. For example, we may want to list top management employees in an organization. Let's assume that the top two levels in our organization chart constitute top management. We can then use the LEVEL pseudocolumn to identify those employees, as in the following example
 
    SELECT EMP_ID, LNAME, DEPT_ID, MANAGER_EMP_ID, SALARY, HIRE_DATE
   FROM EMPLOYEE
WHERE LEVEL <= 2
 START WITH MANAGER_EMP_ID IS NULL
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID;
 
G .Aggregating a Hierarchy(级联记录的统计):
Another challenging requirement on hierarchical data is to aggregate a hierarchy. For example, we may want to sum the salaries of all employees reporting to a specific employee. Or, we may want to consider each employee as a root, and for each employee report out the sum of the salaries of all subordinate employees
 
I.统计某个特定雇员及其所有下属的工资总和:
        SELECT SUM(SALARY) FROM EMPLOYEE
 START WITH LNAME = 'JONES'
CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID;
 
II.统计每个雇员及其所有下属的工资总和:
    相对于第一个问题,我们必须把所有的节点都看成是根节点,对每个节点都计算该员工及其所有下属的工资总和。明显地,我们必须不断重复执行第一个问题所使用地查询。
   SELECT LNAME, SALARY, (SELECT SUM(SALARY) FROM EMPLOYEE T1
                            START WITH LNAME = T2.LNAME
                            CONNECT BY MANAGER_EMP_ID = PRIOR EMP_ID) SUM_SALARY
FROM EMPLOYEE T2;
  
4.级联查询中的限制:
A. A hierarchical query can't use a join.
B. A hierarchical query cannot select data from a view that involves a join
C. We can use an ORDER BY clause within a hierarchical query; however, the ORDER BY clause takes precedence over the hierarchical ordering performed by the START WITH...CONNECT BY clause. Therefore, unless all we care about is the level number, it doesn't make sense to use ORDER BY in a hierarchical query
 
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值