oracle树状查询的原理,(转摘)Oracle 查询树状结构数据

Querying Hierarchies: Top-of-the-Line Support

By Jonathan Gennick

OTN Member since 2001

It's not widely known, but Oracle provides strong support for querying hierarchical, or tree-structured, data

It wasn't until college, I don't think, that I was formally introduced to the concepts of tree-structured, or hierarchical, data and the recursive processes used to manipulate such data. But college only provided theformalintroduction to those concepts.

Hierarchies abound. I remember, as a young child, playing with nested alphabet blocks, a form of hierarchical data and recursion common among the toddler set. And who hasn't been fascinated by Russian Matryoshka dolls, intrigued by the way each one opens to reveal yet another?

The first databases were hierarchical, not relational, a reflection of how common hierarchies are in the modern world. I live in a country divided into states, many of which are subdivided into counties, which are further divided into townships and cities. Many of us work for corporations with hierarchical governance: a president or CEO manages several vice presidents, who in turn each lead several senior managers; they in turn are set over managers, who oversee supervisors—and somewhere down the line, hopefully, are some people who do productive work for the organization. Even placing a simple book order from Amazon.com involves hierarchical data, because each order consists of one or more books. Indeed, such master-detail relationships likely represent the most common form of tree-structured data we work with today.

Hierarchical databases proved cumbersome, and relational databases, with their tables and columns, and their implementations of the common SQL interface, eventually won the day. However, hierarchical data is still with us, and will always be with us. Unfortunately, the relational model doesn't deal well with hierarchical data. Most implementations of SQL provide no support for querying tables recursively. But Oracle bucks this trend, recognizes the reality of hierarchical data, and provides powerful support for querying tables containing such data.

Fixed Vs. Variable Levels

For the purposes of this article, I'm going to divide hierarchies into two categories: those with a fixed number of levels and those without. The example of states, counties, townships, and cities I gave earlier is a good example of a hierarchy with a fixed number of levels. Figure 1a shows how that example can be implemented in one table recursively, but such an implementation would be a poor choice. Figure 1b shows the same example implemented in the master-detail style, using three tables. Each row in the state table corresponds to one or more rows in the county table. Likewise, each county row corresponds to one or more rows in the city and township tables. Master-detail data is easily queried and manipulated using standard, relational join operations, and is not the focus of this article.

otn_gennick_1.gif

Figure 2shows a table structure for a hierarchy with a variable number of levels. Thebill_of_materialstable contains information on the various components, or assemblies, that combine to make up a product such as an automobile or an airplane. An automobile, for example, may consist of an engine, a body, and an interior. In turn, the engine may consist of parts such as spark plugs and other assemblies such as a starter system. The starter system consists of yet more parts and assemblies, and so forth. Figure 3 shows an example, formatted as a tree, of the data found in thebill_of_materialstable.

otn_gennick_2.gif

otn_gennick_3.gif

Hierarchies really become interesting when the number of levels is not fixed, because the only possible implementation then is a recursive table relationship such as that shown in Figure 2. (Did I say "interesting?" If you're not using Oracle, "painful" might be a better word choice.) I've not seen a relational database other than Oracle that provides support for querying hierarchical data. If your database doesn't provide such support, you are forced to write recursive code yourself. For example, to get the first level of components that make up an airplane, you could issue the following query:

SELECT assembly_id, assembly_name

FROM bill_of_materials

WHERE parent_assembly = 200;

ASSEMBLY_ID ASSEMBLY_NAME

----------- ---------------

201 Jet Engine

202 Left Wing

203 Right Wing

204 Body

Four rows come back. Each of these four assemblies might (or might not) in turn be composed of other assemblies; thus, you must issue four more queries in the same form:

SELECT assembly_id, assembly_name

FROM bill_of_materials

WHERE parent_assembly = 201;

SELECT assembly_id, assembly_name

FROM bill_of_materials

WHERE parent_assembly = 202;

etc.

Painful! Think of all the network round trips required to execute these queries and bring back each level of results. Consider the performance impact of all those round trips. And imagine having to write and debug the code to do all this work.

Oracle Provides a Better Way

Oracle has long provided specific support for querying hierarchical data such as that illustrated in Figure 3. This support comes in the form of theSTART WITHandCONNECT BYclauses of theSELECTstatement, which allow you to easily query hierarchical data in which each child row contains a link back to its parent.

To issue a hierarchical query, you must know two things about your data: the conditions identifying root rows (those at the top of a hierarchy) and the column or columns in a child row that point to its parent.

In thebill_of_materialstable used for the examples in this article, theparent_assemblycolumn contains an assembly ID value identifying the parent of each row. For example, assembly 201 is a jet engine. Theparent_assemblyvalue for that row is 200, indicating that the jet engine is part of an airplane. Theparent_assemblyfor airplane isNULL, indicating that the airplane is the final product.

Follow along with the queries in this article. Download connect_by.zip, which contains a script to create and populate the sample tables and another script that executes all queries shown here.Begin writing a hierarchical query by using theSTART WITHclause to identify the root nodes of interest. The clause in the following query tells Oracle to select all rows frombill_of_materialshaving aNULL parent_assembly. Each of those rows will then be treated as the root node of a tree growing downward, and all the child nodes for each tree and subtree and so forth will also be retrieved.

SELECT assembly_id, assembly_name, parent_assembly

FROM bill_of_materials

START WITH parent_assembly IS NULL

Next, use theCONNECT BYclause to define the relationship between child and parent rows. In thebill_of_materialstable, theparent_assemblycolumn holds the parent row'sassembly_idvalue. This leads to theCONNECT BYclause shown here:

SELECT assembly_id, assembly_name, parent_assembly

FROM bill_of_materials

START WITH parent_assembly IS NULL

CONNECT BY parent_assembly = PRIOR assembly_id;

ThePRIORkeyword, which is really an operator, and not a keyword such asSELECTorCONNECT BY, is critical to the operation of the query.PRIORis designed specifically for use in hierarchical queries. Its purpose is to return a column value from a given row's parent row. In this case, theCONNECT BYclause specifies that for a given row, the parent row is the one for which the prospective parent'sassembly_idvalue matches the child'sparent_assemblyvalue. In effect,CONNECT BYspecifies a recursive join. Listing 1 shows the output from this query.

In Listing 1, "Automobile" is the first root node found while executing the query, so it's listed first in the output. Next comes "Automobile's" first child, the "Combustion Engine." You then see that "Piston," "Air Filter," and "Spark Plug" are components of the engine. "Airplane" is the second root node found, and its assemblies, subassemblies, and parts are also listed in hierarchical order.

Recursive Joins

Before going deeper into Oracle's support for hierarchical queries, I want to stop and talk more about the just what theCONNECT BYclause is, and why it's important. If you think about it, from a certain point of view theCONNECT BYclause is nothing more than a join specification. Imagine for a moment you were going to join thebill_of_materialstable to itself and return each row's immediate parent. To that end, you could write:

SELECT bom1.assembly_id, bom1.assembly_name, bom2.assembly_id parent

FROM bill_of_materials bom1 LEFT OUTER JOIN bill_of_materials bom2

ON bom1.parent_assembly = bom2.assembly_id;

ASSEMBLY_ID ASSEMBLY_NAME PARENT

----------- ----------------------- ----------

130 Interior 100

120 Body 100

110 Combustion Engine 100

115 Starter System 110

114 Block 110

...

Doesn't the output from this query look suspiciously like that of the query shown in the previous section? I'll save you the trouble of checking, and tell you that this query returns the same 42 rows as the previous query. The only difference is that those rows are ordered differently. But the order of the rows matters! The order in which rows are returned from the self-join solution does not reflect the hierarchy.

Many problems cannot be solved via a self-join. For example, if you were to use a self-join, you wouldn't be able to select just the assemblies and parts that make up an airplane. In fact, the self-join shown here is next to useless when querying hierarchical data. What you need is arecursiveself-join: The root row is joined to its children, each of those child rows is joined to its children, and so forth on down the hierarchy. The database can be aware of the hierarchical nature of the data, return results in a useful order, return useful information about a row's position in its hierarchy, and enable you to work with nodes in a top-down fashion.CONNECT BYgives you all these things.

To convert the preceding self-join to a recursive join, begin by specifying the table only once in theFROMclause:

FROM bill_of_materials

TheONclause goes away, and is replaced byCONNECT BY. The reference tobom1.parent_assemblybecomes simplyparent_assembly. The two references tobom2.assembly_id, which are references to a value in the parent table, becomePRIOR assembly_id. Add in the necessarySTART WITHclause, and you have the following:

SELECT assembly_id, assembly_name, PRIOR assembly_id parent

FROM bill_of_materials

START WITH parent_assembly IS NULL

CONNECT BY parent_assembly = PRIOR assembly_id;

This query returns the same 42 rows as the preceding self-join. However, the order of those rows reflects the hierarchy, and becauseCONNECT BYis used, you now have some very interesting hierarchical query features at your disposal, features I'll explore in the rest of this article.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值