join语法在平时的使用中还是很常见的,首先来看下Join语法树。
join_table:
table_reference [INNER] JOIN table_factor [join_condition]
| table_reference {LEFT|RIGHT|FULL} [OUTER] JOIN table_reference join_condition
| table_reference LEFT SEMI JOIN table_reference join_condition
| table_reference CROSS JOIN table_reference [join_condition] (as of Hive 0.10)
首先来看下简单的练习。
--table1: 员工表
CREATE TABLE employee(
id int,
name string,
deg string,
salary int,
dept string
) row format delimited
fields terminated by ',';
--table2:员工家庭住址信息表
CREATE TABLE employee_address (
id int,
hno string,
street string,
city string
) row format delimited
fields terminated by ',';
--table3:员工联系方式信息表
CREATE TABLE employee_connection (
id int,
phno string,
email string
) row format delimited
fields terminated by ',';
--加载数据到表中
load data local inpath '/root/hivedata/employee.txt' into table employee;
load data local inpath '/root/hivedata/employee_address.txt' into table employee_address;
load data local inpath '/root/hivedata/employee_connection.txt' into table employee_connection;
select * from employee;
+--------------+----------------+---------------+------------------+----------------+
| employee.id | employee.name | employee.deg | employee.salary | employee.dept |
+--------------+----------------+---------------+------------------+----------------+
| 1201 | gopal | manager | 50000 | TP |
| 1202 | manisha | cto | 50000 | TP |
| 1203 | khalil | dev | 30000 | AC |
| 1204 | prasanth | dev | 30000 | AC |
| 1206 | kranthi | admin | 20000 | TP |
| 1201 | gopal | manager | 50000 | TP |
| 1202 | manisha | cto | 50000 | TP |
| 1203 | khalil | dev | 30000 | AC |
| 1204 | prasanth | dev | 30000 | AC |
| 1206 | kranthi | admin | 20000 | TP |
+--------------+----------------+---------------+------------------+----------------+
select * from employee_address;
select * from employee_connection;
常见的连接有六种,主要有:
- 内联结 inner join=join,返回左右两边满足条件的数据。
- 左连接left outer join= left join 左为准,左表全部显示,右表与之关联,满足条件返回,不满足条件显示null。
- 右连接right outer join=right join 右表为准,右表全部显示,左表与之关联,满足条件返回,不满足条件显示为null。
- 外连接full join=full outer join关键字结合了 LEFT JOIN 和 RIGHT JOIN 的结果。
- 左半连接left semi join 相当于inner join 但是只返回左表的全部是数据,只不过效率高一点。
- 交叉连接cross join 将会返回被连接的两个表的笛卡尔积,返回结果的行数等于两个表行数的乘积。