视图:
1、视图是一个逻辑表,本身不存储数据。数据来源于它依赖的物理表
2、视图只是简化复杂的查询,相当于一个sql数据集,对于一个大表来说,简化查询sql,并不能提高查询效率。反而增加资源开销,建议频繁查询的数据落地
1.视图如何创建
hive> create view if not exists test.test_view as
> select
> apptypeid,
> accid,
> dt
> from test.test1
>
> union all
> select
> apptypeid,
> accid,
> dt
> from test.test2;
OK
Time taken: 0.968 seconds
查看视图表:
hive> show create table test.test_view;
OK
CREATE VIEW `test.test_view` AS select
`test1`.`apptypeid`,
`test1`.`accid`,
`test1`.`dt`
from `test`.`test1`
union all
select
`test2`.`apptypeid`,
`test2`.`accid`,
`test2`.`dt`
from `test`.`test2`
Time taken: 0.028 seconds, Fetched: 12 row(s)
注:1.视图依赖物理表test1、test2,每次查询视图表,都会从test1/test2中取数据,如果是频繁查询或者使用的地方比较多,建议数据直接落地会比较好
2.视图表里面可以编写复杂的统计逻辑,这里只做了简单的union all
下面我们来看看执行计划,验证查询视图表是否是查询物理表的数据:
hive> explain select count(1) from test.test_view where dt='20210610';
OK
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
STAGE PLANS:
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: test1
properties:
insideView TRUE
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Union
Statistics: Num rows: 2 Data size: 3815370 Basic stats: COMPLETE Column stats: NONE
Select Operator
Statistics: Num rows: 2 Data size: 3815370 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count(1)
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
sort order:
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: bigint)
TableScan
alias: test2
properties:
insideView TRUE
Statistics: Num rows: 1 Data size: 3815370 Basic stats: PARTIAL Column stats: NONE
Select Operator
Statistics: Num rows: 1 Data size: 3815370 Basic stats: PARTIAL Column stats: NONE
Union
Statistics: Num rows: 2 Data size: 3815370 Basic stats: COMPLETE Column stats: NONE
Select Operator
Statistics: Num rows: 2 Data size: 3815370 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count(1)
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
sort order:
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: bigint)
Reduce Operator Tree:
Group By Operator
aggregations: count(VALUE._col0)
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
Time taken: 0.885 seconds, Fetched: 68 row(s)
从执行计划可以验证我们的判断,统计test.test_view表的条数时,是分别从test1和test2表中做count操作的
扩展:
2.删除视图:
drop view test.test_view;
3.更新视图:
create or replace view test.test_view2 as
select
apptypeid,
accid,
dt
from test.test3
union all
select
apptypeid,
accid,
dt
from test.test4;