Hive Study Notes (6): DDL

This article covers database operations in Hive, including the detailed steps for creating (CREATE), dropping (DROP), altering (ALTER), and using (USE) databases, with examples showing how each operation is run. It also covers setting database properties, changing the database owner, and switching to and inspecting databases.

Create/Drop/Alter/Use Databases

In Hive SQL, the DATABASE and SCHEMA keywords are interchangeable; they mean the same thing.

Creating a database

CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name

[COMMENT database_comment]

[LOCATION hdfs_path]

[WITH DBPROPERTIES (property_name=property_value, ...)];

A quick try:

hive> create database if not exists test;

OK

Time taken: 0.131 seconds

hive> create schema if not exists test_schema;

OK

Time taken: 0.07 seconds

hive> show databases;

OK

default

test

test_schema

Time taken: 0.255 seconds, Fetched: 3 row(s)

hive> create database if not exists stefan comment 'just for test';

OK

Time taken: 0.047 seconds

hive> desc database stefan;

OK

stefan just for test hdfs://jms-master-01:9000/user/hive/warehouse/stefan.db hadoop USER

Time taken: 0.068 seconds, Fetched: 1 row(s)

Dropping a database

When dropping a database, the default behavior is RESTRICT, which makes the drop fail if the database is not empty. To drop a database together with the tables in it, use the CASCADE keyword.

DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];

A quick try:

hive> drop database test;

OK

Time taken: 0.242 seconds

hive> drop database if exists test;

OK

Time taken: 0.011 seconds

hive> drop database if exists test_schema cascade;

OK

Time taken: 1.518 seconds

hive> show databases;

OK

default

prop_database

stefan

Altering a database

Note the Hive version in which each of the following statements became available.

ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...); -- (Note: SCHEMA added in Hive 0.14.0)

ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role; -- (Note: Hive 0.13.0 and later; SCHEMA added in Hive 0.14.0)

ALTER (DATABASE|SCHEMA) database_name SET LOCATION hdfs_path; -- (Note: Hive 2.2.1, 2.4.0 and later)

An alter database ... set location statement does not move the contents of the database's current directory to the new location. It does not change the locations of any tables or partitions that already exist in the database; it only changes the default parent directory where new tables in that database will be created.
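The transcripts below exercise the DBPROPERTIES and OWNER forms. The SET LOCATION form is not demonstrated here, but a minimal sketch (untested in this environment; requires Hive 2.2.1/2.4.0 or later, and the target path is hypothetical) would be:

-- existing tables under stefan keep their current locations;
-- only tables created afterwards default to the new path
alter database stefan set location 'hdfs://jms-master-01:9000/user/hive/warehouse2/stefan.db';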

A quick try:

Adding properties:

hive> show create database stefan;

OK

CREATE DATABASE `stefan`

COMMENT

'just for test'

LOCATION

'hdfs://jms-master-01:9000/user/hive/warehouse/stefan.db'

Time taken: 0.222 seconds, Fetched: 5 row(s)

hive> alter database stefan set dbproperties ('owner'='stefan', 'create_time'='20190403');

OK

Time taken: 0.07 seconds

hive> show create database stefan;

OK

CREATE DATABASE `stefan`

COMMENT

'just for test'

LOCATION

'hdfs://jms-master-01:9000/user/hive/warehouse/stefan.db'

WITH DBPROPERTIES (

'create_time'='20190403',

'owner'='stefan')

Time taken: 0.196 seconds, Fetched: 8 row(s)

Changing the owner:

hive> alter schema stefan set owner user stefan_test;

OK

Time taken: 0.084 seconds

hive> desc database stefan;

OK

stefan just for test hdfs://jms-master-01:9000/user/hive/warehouse/stefan.db stefan_test USER

Time taken: 0.145 seconds, Fetched: 1 row(s)

Switching databases

USE database_name;

USE DEFAULT;

hive> use stefan;

OK

Time taken: 0.019 seconds

Listing databases

show databases;

hive> show databases;

OK

default

Time taken: 5.987 seconds, Fetched: 1 row(s)

Showing the current database

select current_database();

hive> select current_database();

OK

default

Time taken: 0.852 seconds, Fetched: 1 row(s)

Viewing a database's properties

desc database extended default;

hive> desc database extended default;

OK

default Default Hive database hdfs://jms-master-01:9000/user/hive/warehouse public ROLE

Time taken: 0.361 seconds, Fetched: 1 row(s)

Showing the create-database statement

show create database default;

hive> show create database default;

OK

CREATE DATABASE `default`

COMMENT

'Default Hive database'

LOCATION

'hdfs://jms-master-01:9000/user/hive/warehouse'

Time taken: 0.525 seconds, Fetched: 5 row(s)

Creating a database with properties

create database if not exists prop_database comment 'test properties database' with dbproperties ('prop1'='aaa', 'prop2'='bbb');

hive> create database if not exists prop_database comment 'test properties database' with dbproperties('prop1'='aaa', 'prop2'='bbb');

OK

Time taken: 0.121 seconds

hive> show create database prop_database;

OK

CREATE DATABASE `prop_database`

COMMENT

'test properties database'

LOCATION

'hdfs://jms-master-01:9000/user/hive/warehouse/prop_database.db'

WITH DBPROPERTIES (

'prop1'='aaa',

'prop2'='bbb')

Time taken: 0.181 seconds, Fetched: 8 row(s)

Create/Drop/Truncate Tables

Creating tables

Creates a table with the given name. An error is thrown if a table with that name already exists; use IF NOT EXISTS to skip the error.

*Table and column comments are string literals, enclosed in single quotes.

*Table names and column names are case-insensitive.

*A table created with EXTERNAL is an external table; without it, the table is a managed table (which I tend to call an internal table). To check whether a table is managed or external, run describe extended table_name; and inspect the tableType attribute in the output.

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later)

[(col_name data_type [COMMENT col_comment], ... [constraint_specification])]

[COMMENT table_comment]

[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]

[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]

[SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)]

ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)

[STORED AS DIRECTORIES]

[

[ROW FORMAT row_format]

[STORED AS file_format]

| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)

]

[LOCATION hdfs_path]

[TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later)

[AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name

LIKE existing_table_or_view_name

[LOCATION hdfs_path];

data_type

: primitive_type

| array_type

| map_type

| struct_type

| union_type -- (Note: Available in Hive 0.7.0 and later)

primitive_type

: TINYINT

| SMALLINT

| INT

| BIGINT

| BOOLEAN

| FLOAT

| DOUBLE

| DOUBLE PRECISION -- (Note: Available in Hive 2.2.0 and later)

| STRING

| BINARY -- (Note: Available in Hive 0.8.0 and later)

| TIMESTAMP -- (Note: Available in Hive 0.8.0 and later)

| DECIMAL -- (Note: Available in Hive 0.11.0 and later)

| DECIMAL(precision, scale) -- (Note: Available in Hive 0.13.0 and later)

| DATE -- (Note: Available in Hive 0.12.0 and later)

| VARCHAR -- (Note: Available in Hive 0.12.0 and later)

| CHAR -- (Note: Available in Hive 0.13.0 and later)

array_type

: ARRAY < data_type >

map_type

: MAP < primitive_type, data_type >

struct_type

: STRUCT < col_name : data_type [COMMENT col_comment], ...>

union_type

: UNIONTYPE < data_type, data_type, ... > -- (Note: Available in Hive 0.7.0 and later)

row_format

: DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]

[MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]

[NULL DEFINED AS char] -- (Note: Available in Hive 0.13 and later)

| SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]

file_format:

: SEQUENCEFILE

| TEXTFILE -- (Default, depending on hive.default.fileformat configuration)

| RCFILE -- (Note: Available in Hive 0.6.0 and later)

| ORC -- (Note: Available in Hive 0.11.0 and later)

| PARQUET -- (Note: Available in Hive 0.13.0 and later)

| AVRO -- (Note: Available in Hive 0.14.0 and later)

| JSONFILE -- (Note: Available in Hive 4.0.0 and later)

| INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname

constraint_specification:

: [, PRIMARY KEY (col_name, ...) DISABLE NOVALIDATE ]

[, CONSTRAINT constraint_name FOREIGN KEY (col_name, ...) REFERENCES table_name(col_name, ...) DISABLE NOVALIDATE ]

Creating a managed table:

hive> create table manage_table (id int, name string);

OK

Time taken: 0.674 seconds

Creating an external table:

hive> create external table external_table (id int, name string) location '/user/hadoop/tmp/hive/default.db/external_table';

OK

Time taken: 0.449 seconds

Checking whether a table is managed or external:

As the output below shows, a managed table's tableType is MANAGED_TABLE, while an external table's tableType is EXTERNAL_TABLE.

hive> describe extended manage_table;

OK

id int

name string

Detailed Table Information Table(tableName:manage_table, dbName:default, owner:hadoop, createTime:1554341982, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://jms-master-01:9000/user/hive/warehouse/manage_table, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{totalSize=0, numRows=0, rawDataSize=0, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, numFiles=0, transient_lastDdlTime=1554341982}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, rewriteEnabled:false)

Time taken: 0.329 seconds, Fetched: 4 row(s)

hive> describe extended external_table;

OK

id int

name string

Detailed Table Information Table(tableName:external_table, dbName:default, owner:hadoop, createTime:1554342083, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://jms-master-01:9000/user/hadoop/tmp/hive/default.db/external_table, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{EXTERNAL=TRUE, transient_lastDdlTime=1554342083}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE, rewriteEnabled:false)

Time taken: 0.143 seconds, Fetched: 4 row(s)

Creating a partitioned table

hive> create table partition_table (id int, name string) partitioned by (dt string);

OK

Time taken: 0.393 seconds

hive> show create table partition_table;

OK

CREATE TABLE `partition_table`(

`id` int,

`name` string)

PARTITIONED BY (

`dt` string)

ROW FORMAT SERDE

'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'

STORED AS INPUTFORMAT

'org.apache.hadoop.mapred.TextInputFormat'

OUTPUTFORMAT

'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'

LOCATION

'hdfs://jms-master-01:9000/user/hive/warehouse/stefan.db/partition_table'

TBLPROPERTIES (

'transient_lastDdlTime'='1554344052')

Time taken: 0.275 seconds, Fetched: 15 row(s)

Create table as select (CTAS)

CTAS restrictions:

The target table cannot be an external table.

The target table cannot be a bucketed table.

CTAS can also be used to convert a table from one storage format to another.

hive> select * from partition_table;

OK

1 郭靖 20190402 china

2 黄蓉 20190402 china

3 杨康 20190402 china

4 穆念慈 20190402 china

5 东邪 20190402 china

6 西毒 20190402 china

7 黄老邪 20190402 china

8 杨铁心 20190402 china

Time taken: 1.499 seconds, Fetched: 8 row(s)

hive> create table new_key_value_store row format serde "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe" stored as RCFile as select (id * 10) new_key, concat(id, name) key_value_pair from partition_table sort by new_key, key_value_pair;

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.

Query ID = hadoop_20190404103533_902fb48c-2471-4e2d-bc79-5cba46b8c710

Total jobs = 1

Launching Job 1 out of 1

Number of reduce tasks not specified. Estimated from input data size: 1

In order to change the average load for a reducer (in bytes):

set hive.exec.reducers.bytes.per.reducer=

In order to limit the maximum number of reducers:

set hive.exec.reducers.max=

In order to set a constant number of reducers:

set mapreduce.job.reduces=

Starting Job = job_1552651623473_0007, Tracking URL = http://jms-master-01:8088/proxy/application_1552651623473_0007/

Kill Command = /home/hadoop/tools/hadoop-2.7.7/bin/hadoop job -kill job_1552651623473_0007

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1

2019-04-04 10:35:45,795 Stage-1 map = 0%, reduce = 0%

2019-04-04 10:35:50,537 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.39 sec

2019-04-04 10:35:56,978 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 5.27 sec

MapReduce Total cumulative CPU time: 5 seconds 270 msec

Ended Job = job_1552651623473_0007

Moving data to directory hdfs://jms-master-01:9000/user/hive/warehouse/new_key_value_store

MapReduce Jobs Launched:

Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 5.27 sec HDFS Read: 8487 HDFS Write: 248 SUCCESS

Total MapReduce CPU Time Spent: 5 seconds 270 msec

OK

Time taken: 24.897 seconds

hive> select * from new_key_value_store;

OK

10 1郭靖

20 2黄蓉

30 3杨康

40 4穆念慈

50 5东邪

60 6西毒

70 7黄老邪

80 8杨铁心

Time taken: 0.21 seconds, Fetched: 8 row(s)

hive> show create table new_key_value_store;

OK

CREATE TABLE `new_key_value_store`(

`new_key` int,

`key_value_pair` string)

ROW FORMAT SERDE

'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'

STORED AS INPUTFORMAT

'org.apache.hadoop.hive.ql.io.RCFileInputFormat'

OUTPUTFORMAT

'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'

LOCATION

'hdfs://jms-master-01:9000/user/hive/warehouse/new_key_value_store'

TBLPROPERTIES (

'transient_lastDdlTime'='1554345358')

Time taken: 0.299 seconds, Fetched: 13 row(s)

Cloning a table (create table ... like)

create table ... like creates an empty table whose structure is identical to the source table's.

hive> create table like_new_key_value_store like new_key_value_store;

OK

Time taken: 0.149 seconds

hive> show create table like_new_key_value_store;

OK

CREATE TABLE `like_new_key_value_store`(

`new_key` int,

`key_value_pair` string)

ROW FORMAT SERDE

'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'

STORED AS INPUTFORMAT

'org.apache.hadoop.hive.ql.io.RCFileInputFormat'

OUTPUTFORMAT

'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'

LOCATION

'hdfs://jms-master-01:9000/user/hive/warehouse/like_new_key_value_store'

TBLPROPERTIES (

'transient_lastDdlTime'='1554345937')

Time taken: 1.801 seconds, Fetched: 13 row(s)

hive> select * from like_new_key_value_store;

OK

Time taken: 0.252 seconds

Bucketed and sorted tables

Create a bucketed, sorted table:

hive> create table bucketed_table (id int, name string) partitioned by (dt string, country string) clustered by (id) sorted by (name) into 4 buckets row format delimited fields terminated by '\001' collection items terminated by '\002' map keys terminated by '\003' stored as sequencefile;

OK

Time taken: 0.241 seconds

hive> show create table bucketed_table;

OK

CREATE TABLE `bucketed_table`(

`id` int,

`name` string)

PARTITIONED BY (

`dt` string,

`country` string)

CLUSTERED BY (

id)

SORTED BY (

name ASC)

INTO 4 BUCKETS

ROW FORMAT SERDE

'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'

WITH SERDEPROPERTIES (

'colelction.delim'='',

'field.delim'='',

'mapkey.delim'='',

'serialization.format'='')

STORED AS INPUTFORMAT

'org.apache.hadoop.mapred.SequenceFileInputFormat'

OUTPUTFORMAT

'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'

LOCATION

'hdfs://jms-master-01:9000/user/hive/warehouse/stefan.db/bucketed_table'

TBLPROPERTIES (

'transient_lastDdlTime'='1554346570')

Time taken: 0.268 seconds, Fetched: 26 row(s)

Dynamically inserting into multiple partitions requires the following settings:

set hive.exec.dynamic.partition=true; -- enable dynamic partitioning (default false before Hive 0.9.0)

set hive.exec.dynamic.partition.mode=nonstrict; -- allow all partition columns to be dynamic; otherwise at least one static partition is required

hive> insert into bucketed_table partition(dt, country) select id, name, dt, country from default.partition_table;

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.

Query ID = hadoop_20190404110332_1e7c3bf3-e5e6-4ff1-af30-2a5bcf0278e6

Total jobs = 1

Launching Job 1 out of 1

Number of reduce tasks determined at compile time: 4

In order to change the average load for a reducer (in bytes):

set hive.exec.reducers.bytes.per.reducer=

In order to limit the maximum number of reducers:

set hive.exec.reducers.max=

In order to set a constant number of reducers:

set mapreduce.job.reduces=

Starting Job = job_1552651623473_0008, Tracking URL = http://jms-master-01:8088/proxy/application_1552651623473_0008/

Kill Command = /home/hadoop/tools/hadoop-2.7.7/bin/hadoop job -kill job_1552651623473_0008

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 4

2019-04-04 11:03:38,849 Stage-1 map = 0%, reduce = 0%

2019-04-04 11:03:43,157 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 sec

2019-04-04 11:03:48,331 Stage-1 map = 100%, reduce = 25%, Cumulative CPU 4.05 sec

2019-04-04 11:03:50,399 Stage-1 map = 100%, reduce = 75%, Cumulative CPU 9.22 sec

2019-04-04 11:03:51,435 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 12.12 sec

MapReduce Total cumulative CPU time: 12 seconds 120 msec

Ended Job = job_1552651623473_0008

Loading data to table stefan.bucketed_table partition (dt=null, country=null)

Loaded : 1/1 partitions.

Time taken to load dynamic partitions: 0.243 seconds

Time taken for adding to write entity : 0.0 seconds

MapReduce Jobs Launched:

Stage-Stage-1: Map: 1 Reduce: 4 Cumulative CPU: 12.12 sec HDFS Read: 21718 HDFS Write: 937 SUCCESS

Total MapReduce CPU Time Spent: 12 seconds 120 msec

OK

Time taken: 20.806 seconds

hive> select * from bucketed_table;

OK

8 杨铁心 20190402 china

4 穆念慈 20190402 china

5 东邪 20190402 china

1 郭靖 20190402 china

6 西毒 20190402 china

2 黄蓉 20190402 china

3 杨康 20190402 china

7 黄老邪 20190402 china

Time taken: 0.225 seconds, Fetched: 8 row(s)

Skewed Tables

I haven't used or studied these yet; a rough sketch based on the grammar above follows.
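A hedged sketch per the SKEWED BY clause in the CREATE TABLE grammar (untested here; the values 1 and 2 are assumed to be the heavily skewed keys, and STORED AS DIRECTORIES turns on list bucketing):

-- rows with id 1 or 2 are assumed to dominate the data
create table skewed_table (id int, name string)
skewed by (id) on (1, 2)
stored as directories;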

Temporary tables

A temporary table is visible only within the current session. Its data is stored in the user's scratch directory and is deleted when the session ends.

Temporary tables do not support partition columns or indexes.

If a temporary table has the same name as a permanent table, the permanent table is shadowed within the session; it becomes accessible again only after the temporary table is dropped or renamed.

Since Hive 1.1.0, temporary tables can be kept in memory or on SSD, configured via the hive.exec.temporary.table.storage parameter; valid values are memory, ssd, and default.
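For example, to keep temporary table data in memory (a session-level setting; I have not benchmarked the difference):

set hive.exec.temporary.table.storage=memory; -- valid values: memory, ssd, default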

create temporary table ...

A quick try

hive> create temporary table temp_table(id int, name string);

OK

Time taken: 6.14 seconds

hive> insert into temp_table select id, name from partition_table;

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.

Query ID = hadoop_20190408171623_2c787f42-e5e5-460d-8a21-90a5a5c9afc3

Total jobs = 3

Launching Job 1 out of 3

Number of reduce tasks is set to 0 since there's no reduce operator

Starting Job = job_1552651623473_0010, Tracking URL = http://jms-master-01:8088/proxy/application_1552651623473_0010/

Kill Command = /home/hadoop/tools/hadoop-2.7.7/bin/hadoop job -kill job_1552651623473_0010

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0

2019-04-08 17:16:32,625 Stage-1 map = 0%, reduce = 0%

2019-04-08 17:16:38,994 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.1 sec

MapReduce Total cumulative CPU time: 2 seconds 100 msec

Ended Job = job_1552651623473_0010

Stage-4 is selected by condition resolver.

Stage-3 is filtered out by condition resolver.

Stage-5 is filtered out by condition resolver.

Moving data to directory hdfs://jms-master-01:9000/tmp/hive-hadoop/hadoop/8dbf32dc-51a3-4caa-b5be-9d682ccdb29c/_tmp_space.db/09e10ccb-5d5f-45ab-b07a-afe5590b2352/.hive-staging_hive_2019-04-08_17-16-23_750_5995661068890136721-1/-ext-10000

Loading data to table default.temp_table

MapReduce Jobs Launched:

Stage-Stage-1: Map: 1 Cumulative CPU: 2.1 sec HDFS Read: 4605 HDFS Write: 155 SUCCESS

Total MapReduce CPU Time Spent: 2 seconds 100 msec

OK

Time taken: 16.73 seconds

hive> select * from temp_table;

OK

1 郭靖

2 黄蓉

3 杨康

4 穆念慈

5 东邪

6 西毒

7 黄老邪

8 杨铁心

Time taken: 0.161 seconds, Fetched: 8 row(s)

Note that the insert overwrite statement below fails with a parse error, but not because the table is temporary: insert overwrite requires the TABLE keyword (insert overwrite table temp_table select ...), and omitting it fails for permanent tables as well.

hive> insert overwrite temp_table select id, name from partition_table;

NoViableAltException(24@[])

at org.apache.hadoop.hive.ql.parse.HiveParser.destination(HiveParser.java:38762)

at org.apache.hadoop.hive.ql.parse.HiveParser.insertClause(HiveParser.java:38531)

at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:36478)

at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:35822)

at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:35710)

at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2284)

at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1333)

at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:208)

at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77)

at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70)

at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)

at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)

at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)

at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)

at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)

at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)

at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)

at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)

at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at org.apache.hadoop.util.RunJar.run(RunJar.java:226)

at org.apache.hadoop.util.RunJar.main(RunJar.java:141)

FAILED: ParseException line 1:17 cannot recognize input near 'temp_table' 'select' 'id' in destination specification

Once you quit the current session, the temporary table is dropped.

hive> quit;

[hadoop@jms-master-01 ~]$ hive

which: no hbase in (/usr/share/svensudo/bin:/home/hadoop/tools/scala-2.12.8/bin:/home/hadoop/tools/spark-2.4.0-bin-hadoop2.7/bin:/home/hadoop/tools/java/jdk1.8.0_191/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/go/bin/:/alc/tool/bin:/alc/tool/bin:/home/hadoop/tools/hadoop-2.7.7/bin:/home/hadoop/tools/apache-hive-2.3.4-bin/bin:/home/hadoop/bin)

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/home/hadoop/tools/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/home/hadoop/tools/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in file:/home/hadoop/tools/apache-hive-2.3.4-bin/conf/hive-log4j2.properties Async: true

Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.

hive> select * from temp_table;

FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'temp_table'

Transactional Tables

Hive has supported transactional tables since version 0.14.0. They provide ACID semantics and allow insert, update, and delete operations.

Transactional tables are a whole topic of their own; see the official Hive transactions documentation.
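As a preview, a minimal sketch of a transactional table per the official documentation (untested here; before Hive 3 the table must be bucketed and stored as ORC, and the session also needs hive.support.concurrency=true and hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager):

create table acid_table (id int, name string)
clustered by (id) into 4 buckets
stored as orc
tblproperties ('transactional'='true'); -- enables ACID insert/update/delete on this table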

Primary key constraints

Hive has supported constraints since version 2.1.0. Primary and foreign key constraints are unvalidated; note that because Hive does not enforce them, the correctness of the keys must be guaranteed by the upstream data.
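A sketch following the constraint_specification grammar from the CREATE TABLE section (Hive 2.1.0+; the table is purely illustrative):

create table pk_table (
  id int,
  name string,
  primary key (id) disable novalidate -- informational only: Hive will not reject duplicate ids
);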

Dropping a table

drop table [if exists] table_name [purge];

By default, dropping a managed table removes both its metadata and its data files. Removing the files is equivalent to running hadoop fs -rm: with HDFS trash enabled, the data files are moved to the trash, so after an accidental drop the data can still be recovered within the trash retention window. If the purge option is specified, the files skip the trash and are deleted permanently. Treat drop with care.
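For example, using the managed table created earlier (illustrative only):

drop table if exists manage_table; -- data files go to the HDFS trash if trash is enabled
drop table if exists manage_table purge; -- purge bypasses the trash and deletes the files permanently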

Truncating a table

truncate table table_name [partition partition_spec]

Here partition_spec means (partition_column=partition_col_value, partition_column=partition_col_value, ...).
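For example, using the partition_table from earlier (the dt value is assumed; note that truncate only applies to managed tables):

truncate table partition_table partition (dt='20190402'); -- clears one partition
truncate table partition_table; -- clears all rows in all partitions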

If you are familiar with SQL in relational databases, all of the operations above should be easy to follow.

Alter table/partition/column

Alter table

Renaming a table

alter table table_name rename to new_table_name;

hive> show tables;

OK

first_table

Time taken: 5.51 seconds, Fetched: 9 row(s)

hive> alter table first_table rename to second_table;

OK

Time taken: 0.406 seconds

hive> show tables;

OK

second_table

Time taken: 0.05 seconds, Fetched: 9 row(s)

Setting table properties

alter table table_name set tblproperties table_properties;

table_properties:

(property_name = property_value, property_name = property_value, ...)
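For example, tagging the table renamed above (the property name is arbitrary; TBLPROPERTIES stores free-form key/value strings):

alter table second_table set tblproperties ('note'='renamed from first_table');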
