搜集了几篇博客,结合我们自己生产上的实例,整理了一下:
hive 创建外部表语句实例:
CREATE EXTERNAL TABLE `xxxx`(
`mac` string,
`did` string,
`uid` string,
`sid` string,
`tc_version` string,
`app_version` string,
`province` string,
`city` string,
`model` string,
`chip` string,
`source` string,
`start_time` string,
`stop_time` string,
`dur` string,
`video_id` string,
`name` string,
`category` string,
`play_length` string,
`vip` string,
`partner` string,
`video_source` string,
`business_type` string)
PARTITIONED BY (
`partition_day` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
'hdfs://xxx/apps/external/hive/xxx/xxxx'
创建hive表的另外的方式:
方式二:CREATE TABLE IF NOT EXISTS default.weblog_20150923
LIKE default.weblog ;
方式三:CREATE TABLE default.weblog_comm
AS select ip, time, req_url from default.weblog;
向hive表里load数据:
load data local inpath '/opt/hive-0.13.1/weblog.txt' into table default.weblog ;
http://blog.csdn.net/weixin_36630761/article/details/78981210
hive提供了复合数据类型:
Structs: structs内部的数据可以通过DOT(.)来存取,例如,表中一列c的类型为STRUCT{a INT; b INT},我们可以通过c.a来访问域a
Maps(K-V对):访问指定域可以通过["指定域名称"]进行,例如,一个Map M包含了一个group-》gid的kv对,gid的值可以通过M['group']来获取
Arrays:array中的数据为相同类型,例如,假如array A中元素['a','b','c'],则A[1]的值为'b'
Struct使用
· hive> create table student_test(id INT, info struct<name:STRING, age:INT>)
· > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
· > COLLECTION ITEMS TERMINATED BY ':';
· OK
· Time taken: 0.446 seconds
'FIELDS TERMINATED BY' :字段与字段之间的分隔符
''COLLECTION ITEMS TERMINATED BY' :一个字段各个item的分隔符
· $ cat test5.txt
· 1,zhou:30
· 2,yan:30
· 3,chen:20
· 4,li:80
· hive> LOAD DATA LOCAL INPATH '/home/work/data/test5.txt' INTO TABLE student_test;
Array使用
· hive> create table class_test(name string, student_id_list array<INT>)
· > ROW FORMAT DELIMITED
· > FIELDS TERMINATED BY ','
· > COLLECTION ITEMS TERMINATED BY ':';
· OK
· Time taken: 0.099 seconds
· $ cat test6.txt
· 034,1:2:3:4
· 035,5:6
· 036,7:8:9:10
· hive> LOAD DATA LOCAL INPATH '/home/work/data/test6.txt' INTO TABLE class_test ;
Map使用
· hive> create table employee(id string, perf map<string, int>)
· > ROW FORMAT DELIMITED
· > FIELDS TERMINATED BY '\t'
· > COLLECTION ITEMS TERMINATED BY ','
· > MAP KEYS TERMINATED BY ':';
· OK
· Time taken: 0.144 seconds
‘MAP KEYS TERMINATED BY’ :key value分隔符
· $ cat test7.txt
· 1 job:80,team:60,person:70
· 2 job:60,team:80
· 3 job:90,team:70,person:100
· hive> LOAD DATA LOCAL INPATH '/home/work/data/test7.txt' INTO TABLE employee;