left join 和 left outer join 的区别 ,full outer join

  通俗的讲:  
  A   
left     join    B   的连接的记录数与A表的记录数同  
  A   
right     join    B   的连接的记录数与B表的记录数同    
  A   
left     join    B   等价B    right     join    A  
   
   
  
table    A:  
Field_K,   Field_A  
   1                        a  
  
3                        b  
  
4                        c  
   
  
table    B:  
  Field_K,   Field_B  
  
1                        x  
  
2                        y  
  
4                        z  
   
  
select    a.Field_K,   a.Field_A,   b.Field_K,   b.Field_B  
  
from    a    left     join    b    on    a.Field_K = b.Field_K  
   
  Field_K         Field_A         Field_K         Field_B          
  
-- --------   ----------   ----------   ----------    
   1                      a                      1                      x                    
  
3                      b                      NULL                 NULL   
  
4                      c                      4                      z                    
   
  
select    a.Field_K,   a.Field_A,   b.Field_K,   b.Field_B  
  
from    a    right     join    b    on    a.Field_K = b.Field_K  
   
  Field_K         Field_A         Field_K         Field_B          
  
-- --------   ----------   ----------   ----------    
   1                      a                      1                      x                    
  
NULL                 NULL                 2                      y                    
  
4                      c                      4                      z       --
举个例子:  
  假设a表和b表的数据是这样的。  
  a                         b    
  id     name  id     stock   
  
1   a              1           15   
  
2          b              2           50   
  
3          c                  
   
  
select     *     from    a    inner     join    b    on    a.id = b.id  
  这个语法是连接查询中的内连接,它产生的结果是  
  两个表相匹配的记录出现在结果列表中。  
  根据上面的表,出现的结果是这样的  
  a.id     name     b.id     stock  
  
1        a              1           15   
  
2              b              2           50   
  
-- --------------------------  
   select     *     from    a,b    where    a.id = b.id  
  这个语法是内连接的另外一种写法,其执行结果与inner   
join    一样  
   
  
-- ------------------------------    
   
  
select     *     from    a    left / right     join    b    on    a.id = b.id  
  这个是外连接语法中的左外连接或右外连接  
  如果是左外连接的话,它将显示a表的所有记录,  
  
select    a. * ,b. *     from    a    left     join    b    on    a.id = b.id  
  查询的结果是这样的:  
  a.id     name     b.id     stock  
  
1          a          1               15   
  
2                b          2               50   
  
3                c        null           null    
  
-- ------------------------------------------  
  如果是右外连接的话,它将显示b表的所有记录,  
  
select    a. * ,b. *     from    a    right     join    b    on    a.id = b.id  
  查询的结果是这样的:  
  a.id     name     b.id     stock  
  
1          a          1               15   
  
2                b          2               50    


--

select    a. * ,b. *     from    a    left     join    b    on    a.k    =    b.k    
  
select    a. * ,b. *     from    a    left     outer     join    b    on    a.k    = b.k  
  
-- --------上面两种一样left   join是left   outer   join的简写  
   select    a. * ,b. *     from    a    left     inner     join    b    on    a.k    =    b.k    
  没有这种写法,错误的语句.


--

在你要使用多个left   join的时候  
  比如说10个  
  我们把10个全都写成left   join的形式  
  然后再SQL让他自动运行一下,它会把最后一次出现的left   join变成left   
outer     join   
  所以依此推理,最后一个left   join会以left   
outer    join的形式存在  
  当然,不管变不变对结果的显示没有任何影响  
  希望我的实验能对你有所帮助   


--

使用关系代数合并数据
1  关系代数
合并数据集合的理论基础是关系代数,它是由E.F.Codd于1970年提出的。
在关系代数的形式化语言中:
        用表、或者数据集合表示关系或者实体。
        用行表示元组。
        用列表示属性。
关系代数包含以下8个关系运算符
        选取――返回满足指定条件的行。
        投影――从数据集合中返回指定的列。
        笛卡尔积――是关系的乘法,它将分别来自两个数据集合中的行以所有可能的方式进行组合。
        并――关系的加法和减法,它可以在行的方向上合并两个表中的数据,就像把一个表垒在另一个表之上一样。
        交――返回两个数据集合所共有的行。
        差――返回只属于一个数据集合的行。
        连接――在水平方向上合并两个表,其方法是:将两个表中在共同数据项上相互匹配的那些行合并起来。
        除――返回两个数据集之间的精确匹配。
此外,作为一种实现现代关系代数运算的方法,SQL还提供了:
        子查询――类似于连接,但更灵活;在外部查询中,方式可以使用表达式、列表或者数据集合的地方都可以使用子查询的结果。
本章将主要讲述多种类型的连接、简单的和相关的子查询、几种类型的并、关系除以及其他的内容。
2  使用连接
2.1  连接类型
在关系代数中,连接运算是由一个笛卡尔积运算和一个选取运算构成的。首先用笛卡尔积完成对两个数据集合的乘运算,然后对生成的结果集合进行选取运算,确保只把分别来自两个数据集合并且具有重叠部分的行合并在一起。连接的全部意义在于在水平方向上合并两个数据集合(通常是表),并产生一个新的结果集合,其方法是将一个数据源中的行于另一个数据源中和它匹配的行组合成一个新元组。
SQL提供了多种类型的连接方式,它们之间的区别在于:从相互交叠的不同数据集合中选择用于连接的行时所采用的方法不同。
连接类型        定义
内连接        只连接匹配的行
左外连接        包含左边表的全部行(不管右边的表中是否存在与它们匹配的行),以及右边表中全部匹配的行
右外连接        包含右边表的全部行(不管左边的表中是否存在与它们匹配的行),以及左边表中全部匹配的行
全外连接        包含左、右两个表的全部行,不管另外一边的表中是否存在与它们匹配的行。
(H)(theta)连接        使用等值以外的条件来匹配左、右两个表中的行
交叉连接        生成笛卡尔积-它不使用任何匹配或者选取条件,而是直接将一个数据源中的每个行与另一个数据源的每个行都一一匹配
在INFORMIX中连接表的查询
如果FROM子句指定了多于一个表引用,则查询会连接来自多个表的行。连接条件指定各列之间(每个表至少一列)进行连接的关系。因为正在比较连接条件中的列,所以它们必须具有一致的数据类型。
SELECT语句的FROM子句可以指定以下几种类型的连接
FROM子句关键字        相应的结果集
CROSS   JOIN         笛卡尔乘积(所有可能的行对)
INNER   JOIN         仅对满足连接条件的CROSS中的列
LEFT   OUTER   JOIN         一个表满足条件的行,和另一个表的所有行
RIGHT   OUTER   JOIN         与LEFT相同,但两个表的角色互换
FULL   OUTER   JOIN          LEFT   OUTER  和  RIGHT  OUTER中所有行的超集

2.2  内连接( Inner   Join
内连接是最常见的一种连接,它页被称为普通连接,而E.FCodd最早称之为自然连接。
下面是ANSI SQL-92标准
select   *
from   t_institution i
inner   join  t_teller t
on  i.inst_no  =  t.inst_no
where  i.inst_no  =  " 5801 "
其中inner可以省略。
等价于早期的连接语法
select   *
from  t_institution i, t_teller t
where  i.inst_no  =  t.inst_no
and  i.inst_no  =  " 5801 "

2.3  外连接
2.3 . 1         左外连接( Left   Outer  Jion)
select   *
from   t_institution i
left   outer   join  t_teller t
on  i.inst_no  =  t.inst_no
其中outer可以省略。
2.3 . 2         右外连接(Rigt  Outer  Jion)
select   *
from   t_institution i
right   outer   join  t_teller t
on  i.inst_no  =  t.inst_no
2.3 . 3         全外连接( Full   Outer )
全外连接返回参与连接的两个数据集合中的全部数据,无论它们是否具有与之相匹配的行。在功能上,它等价于对这两个数据集合分别进行左外连接和右外连接,然后再使用消去重复行的并操作将上述两个结果集合并为一个结果集。
在现实生活中,参照完整性约束可以减少对于全外连接的使用,一般情况下左外连接就足够了。在数据库中没有利用清晰、规范的约束来防范错误数据情况下,全外连接就变得非常有用了,你可以使用它来清理数据库中的数据。
select   *
from   t_institution i
full   outer   join  t_teller t
on  i.inst_no  =  t.inst_no
2.3 . 4         外连接与条件配合使用
当在内连接查询中加入条件是,无论是将它加入到join子句,还是加入到where子句,其效果是完全一样的,但对于外连接情况就不同了。当把条件加入到 join子句时,SQL Server、Informix会返回外连接表的全部行,然后使用指定的条件返回第二个表的行。如果将条件放到where子句中,SQL Server将会首先进行连接操作,然后使用where子句对连接后的行进行筛选。下面的两个查询展示了条件放置位子对执行结果的影响:
条件在join子句
select   *
from   t_institution i
left   outer   join  t_teller t
on  i.inst_no  =  t.inst_no
and  i.inst_no  =  “ 5801
结果是:
inst_no    inst_name            inst_no    teller_no  teller_name
5801        天河区                5801         0001        tom
5801        天河区                5801         0002        david
5802        越秀区
5803        白云区
条件在where子句
select   *
from   t_institution i
left   outer   join  t_teller t
on  i.inst_no  =  t.inst_no
where  i.inst_no  =  “ 5801
结果是:
inst_no    inst_name            inst_no    teller_no  teller_name
5801        天河区                5801         0001        tom
5801        天河区                5801         0002        david

2.4  自身连接
自身连接是指同一个表自己与自己进行连接。这种一元连接通常用于从自反关系(也称作递归关系)中抽取数据。例如人力资源数据库中雇员与老板的关系。
下面例子是在机构表中查找本机构和上级机构的信息。
select  s.inst_no superior_inst, s.inst_name sup_inst_name, i.inst_no, i.inst_name
from  t_institution i
join  t_institution s
on  i.superior_inst  =  s.inst_no

结果是:
superior_inst sup_inst_name        inst_no    inst_name
800            广州市                5801        天河区
800            广州市                5802        越秀区
800            广州市                5803        白云区

2.5  交叉(无限制) 连接
交叉连接用于对两个源表进行纯关系代数的乘运算。它不使用连接条件来限制结果集合,而是将分别来自两个数据源中的行以所有可能的方式进行组合。数据集合中一的每个行都要与数据集合二中的每一个行分别组成一个新的行。例如,如果第一个数据源中有5个行,而第二个数据源中有4个行,那么在它们之间进行交叉连接就会产生20个行。人们将这种类型的结果集称为笛卡尔乘积。
大多数交叉连接都是由于错误操作而造成的;但是它们却非常适合向数据库中填充例子数据,或者预先创建一些空行以便为程序执行期间所要填充的数据保留空间。
select   *
from   t_institution i
cross   join  t_teller t
在交叉连接中没有on条件子句

3  APPENDIX
3.1  A 参考资料与资源
        《Microsoft SQL Server 
2000  Bile》Paul Nielsen
        Paul Nielsen的Web站点
[ url ] www.isnotnull.com [ /url ]
3.2  注文章所有SQL在IBM Informix Dynamic Server Version  9.40 .TC2E1测试通过


--

表A记录如下:
aID        aNum
1            a20050111
2            a20050112
3            a20050113
4            a20050114
5            a20050115

表B记录如下:
bID        bName
1              2006032401
2             2006032402
3             2006032403
4             2006032404
8             2006032408


实验如下:
1 . left   join

sql语句如下:
select   *   from  A
left   join  B
on  A.aID  =  B.bID

结果如下:
aID        aNum                   bID           bName
1             a20050111          1                 2006032401
2             a20050112          2                2006032402
3             a20050113          3                2006032403
4             a20050114          4                2006032404
5             a20050115          NULL         NULL
(所影响的行数为 
5  行)

结果说明:
        
left  join是以A表的记录为基础的,A可以看成左表,B可以看成右表, left  join是以左表为准的.
换句话说,左表(A)的记录将会全部表示出来,而右表(B)只会显示符合搜索条件的记录(例子中为: A.aID 
=  B.bID).
B表记录不足的地方均为NULL.

2 . right   join

sql语句如下:
select   *   from  A
right   join  B
on  A.aID  =  B.bID

结果如下:
aID        aNum                   bID           bName
1             a20050111          1                 2006032401
2             a20050112          2                2006032402
3             a20050113          3                2006032403
4             a20050114          4                2006032404
NULL      NULL                     8                2006032408
(所影响的行数为 
5  行)

结果说明:
        仔细观察一下,就会发现,和left join的结果刚好相反,这次是以右表(B)为基础的,A表不足的地方用NULL填充.

3 . inner   join

sql语句如下:
select   *   from  A
innerjoin B
on  A.aID  =  B.bID

结果如下:
aID        aNum                   bID           bName
1             a20050111          1                 2006032401
2             a20050112          2                2006032402
3             a20050113          3                2006032403
4             a20050114          4                2006032404

结果说明:
        很明显,这里只显示出了 A.aID 
=  B.bID的记录.这说明inner join并不以谁为基础,它只显示符合条件的记录.


-----------------[以下为网上的一点资料]------------------


LEFT  JOIN操作用于在任何的  FROM  子句中,组合来源表的记录。使用  LEFT   JOIN  运算来创建一个左边外部联接。左边外部联接将包含了从第一个(左边)开始的两个表中的全部记录,即使在第二个(右边)表中并没有相符值的记录。

语法:
FROM  table1  LEFT   JOIN  table2  ON  table1.field1 compopr table2.field2

说明:table1, table2参数用于指定要将记录组合的表的名称。
field1, field2参数指定被联接的字段的名称。且这些字段必须有相同的数据类型及包含相同类型的数据,但它们不需要有相同的名称。
compopr参数指定关系比较运算符:"
= ", " < ", " > ", " <= ", " >= " 或 " <> "。
如果在INNER JOIN操作中要联接包含Memo 数据类型或 OLE Object 数据类型数据的字段,将会发生错误。

 

 

 

As many of you know, I strongly recommend that you avoid using RIGHT OUTER JOINs, since they make your SQL code less readable and are easily rewritten as LEFT OUTER JOINs.  In addition, I have yet to find a situation where a FULL OUTER JOIN makes sense or is necessary -- I have found that in just about every case other techniques work better. (Feel free to suggest some FULL OUTER JOIN situations in the comments and we can discuss)

Let's take a common situation where you need to "merge" the data from two tables, and rows may or may not exist in either table.  For example, suppose we have budgets in one table and actuals in another and we wish to present them side by side. Many people would use a FULL OUTER JOIN to accomplish this, perhaps writing something like this:

SELECT
    coalesce(a.company,b.company) as company,
    coalesce(a.account, b.account) as account,
    coalesce(a.year,b.year) as year,
    coalesce(a.month, b.month) as month,
    coalesce(a.amount,0) as Actual,
    coalesce(b.amount,0) as Budget
FROM
    Actuals a
FULL OUTER JOIN
    Budgets b on
        a.company  = b.company and
        a.account = b.account and
        a.year = b.year and
        a.month = b.month


The above FULL OUTER JOIN  will effectively "merge" the two tables and allow you to see all actuals and budgets for every company/account/year/month, and it will show values from either table even if there is not a matching value in the other table.  Essentially, it gets the job done.

There are some important things to note in the above, however:

  • There is no relation between the Budgets and Actuals table (i.e., they do not have a 1:1 or 1:M relationship to each other) yet we are joining them together! To me, this does not make logical sense.

  • Suppose that the PK of the Budgets table allows for more than 1 budget row per Company/Account/Year/Month -- this would result in two Budget rows matching a single Actual row, duplicating the Actual amount. You must be absolutely sure when doing a FULL OUTER JOIN with any sort of totaling that the two tables will never have more than 1 matching row to the other table.

  • What is the "driving" table behind this SQL statement?  Even though we are clearly selecting FROM the Actuals table, the Actuals do not "drive" our results.  And even though we are joining FROM the Actuals TO the Budgets, some rows will have data only from the Budget table without matching Actual rows.  To me, this is difficult to read and logically interpret; it doesn't follow the standard "select from table A join to table B" logic that is clear and easy to understand and work with.  (note: this is basically my same argument against RIGHT OUTER JOINs.)

  • Every column that is returned from either table is potentially nullable. All of them. You cannot reference a column in either table without handling the case where that column might have a null value.  Thus, every column in either table must be wrapped in an ISNULL(), CASE, or COALESCE() expression.  Most importantly: this includes all non-nullable primary key columns in both tables!

  • Once you've wrapped every column in an expression, no further use of indexes from those columns can be used. For example, if I join the result of this FULL OUTER JOIN to other tables to show Account or Company names or information,  no existing indexes on the Actual or Budget table can be used on that join since every column is an expression!   Thus, as soon as you use a FULL OUTER JOIN, you completely eliminate all indexes from both tables involved.

  • Since no indexes are usable in the results, any sorting done on the results is done on expressions and it will not be optimally efficient as well.

So, based on the above, I feel there's two basic issues with FULL OUTER JOINS:  The code itself is difficult to interpret and isn't especially clear, and the result returned is just a big mess of nullable columns all wrapped in expressions which isn't a clean and efficient set of data to work with.
 
I have always felt that it is very important in any SELECT to clearly establish your FROM clause as the primary, "driving" data source, and then to join from that primary source to the auxiliary tables or SQL statements via JOINS.   However, in our example and with FULL OUTER JOINs in general, we don't really want to relate one table to another, we want to MERGE these two tables together.  When merging the data from two tables, doesn't a UNION make more sense?  Why express our intentions as a (FULL OUTER) JOIN when the true operation you are after is actually a UNION?

Consider this SQL statement as an alternative:

SELECT
    company,
    account,
    year,
    month,
    sum(actual) as actual,
    sum(budget) as budget
FROM
(
    SELECT
      company, account, year, month, amount as actual, 0 as budget
    FROM
      Actuals
    UNION ALL
    SELECT
      company, account, year, month, 0 as actual, amount as budget
    FROM
      Budgets
) x
GROUP BY
    company, account, year, month



(note: we are basically using the technique I described here).

Some notes:

  • Since there is technically no direct relation between Budgets and Actuals (i.e., there is not a 1:1 or 1:M relation between the two tables) this SQL make more sense because it is not implying or stating that we are joining these two tables together.

  • If there are multiple budget amounts that match a single Actual amount (or vice versa) for a given company/account/year/month, the correct totals are still calculated -- since we are not joining the two tables, the Actual row will not be duplicated if it matches multiple Budget rows.

  • The UNION makes it very clear that we are taking the results from two tables together, and merging them into 1 set of rows.   The FROM clause truly and accurately describes the primary data that this SELECT will use.

  • The primary key columns returned are not wrapped in a COALESCE() functions; they can never be NULL since they come directly from one of the two tables.

  • Joins or sorts on our indexed columns can now be used.

Even if the UNION ended up being a little longer or even in some cases slightly less efficient, I still feel it is clearer and more readable to use a UNION instead of a FULL OUTER JOIN and it is usually worth the cost.

Other ways of handling this situation would be to use a CROSS JOIN as well, which might even be necessary depending on the results that we want.  If we want to always return all combinations of Companies/Accounts/Months/Years even if there is no Budget or Actual data available for that combination, a CROSS JOIN is the answer you need:

SELECT
  All.Company, 
  All.Account,
  All.Year,
  All.Month,
  coalesce(Actuals.Amount,0) as Actual,
  coalesce( Budgets .Amount,0) as Budget
FROM
(
   SELECT
     C.Company, A.Account, M.Year, M.Month
   FROM
     Companies C
   CROSS JOIN
     Accounts A
   CROSS JOIN
     Months M
   WHERE
     M.Year = 2006   -- or whatever criteria you need here ....
) All
LEFT OUTER JOIN
  Actuals
          ON Actuals.company = All.Company and
             Actuals.Account = All.Account and
             Actuals.Year = All.Year and
             Actuals.Month = All.Month
LEFT OUTER JOIN
  Budgets
          ON Budgets.company = All.Company and
             Budgets.Account = All.Account and
             Budgets.Year = All.Year and
             Budgets.Month = All.Month


(The technique shown above is discussed further here.)

In the above, our primary "driving" data source is a series of CROSS JOINS that produces the combination of rows we need to return,  and then from there we do two separate LEFT OUTER JOINS to our transactional tables.   (Note that, as with a FULL OUTER JOIN, if the primary key of either the Budgets or the Actuals table isn't Co/Acct/Year/Month, you should join instead to summarized derived tables that group on those key columns to prevent duplicating rows in the result.)

Thus, even though it seems that maybe a FULL OUTER JOIN or UNION is necessary to "merge" or two tables, sometimes a CROSS JOIN/LEFT OUTER JOIN can work as well, and it indeed fulfills a different requirement that is not possible (or easily achieved) with those other techniques by guaranteeing the exact set of rows to be returned, even if no matching transactions exist. 

 

  • 2
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值