1 创建表
内部表
说明:hive会管理着数据的生命周期,删除表就会删除数据。(从db导过来的表一般都是内部表)
示例:内部表创建hql语句(array、struct、map用法)
CREATE TABLE IF NOT EXISTS employees (
name STRING COMMENT 'employee name' ,
salary FLOAT COMMENT 'employee salary',
subordinates ARRAY<STRING> COMMENT 'names of subordinates',
deductions MAP<STRING,FLOAT> COMMENT 'keys are deductions names,values are precentages',
address STRUCT<street:STRING,city:STRING,state:STRING,zip:INT> COMMENT 'home address'
)ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':';
[dp@YZSJHL19-53 xinyan.yang]$ cat ./employees.txt
yxy|1000000.0|zhangsan,lisi|federal taxes:.2,state taxes:.5|changanjie,bj,bj,00000
yangzhen|1000000.0|zhangsan,lisi|federal taxes:.2,state taxes:.5|changanjie,bj,chaoyang,00001
hive -e "load data local inpath './employees.txt' overwrite into table employees;"
[dp@YZSJHL19-53 xinyan.yang]$ hive -S -e "select name,salary,subordinates[0],address.state,deductions['state taxes'] from employees;"
yxy 1000000.0 zhangsan bj 0.5
yangzhen 1000000.0 zhangsan chaoyang 0.5
外部表
说明:删除该表并不会删除掉这份数据,不过描述表的元数据信息会被删除掉。(线上日志一般采用外部表)
示例:创建外部分区表hql语句
CREATE EXTERNAL TABLE comment_center(
time string,
server_ip string,
type int,
actor_id int,
method string,
entry_id bigint,
entry_owerid int,
comment_id bigint,
url_md5 string,
limit int, count int,
first_page int,
exec_time int,
error_code int)
PARTITIONED BY (
dt string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n' ;
2 删除表
DROP TABLE IF EXISTS employees
可以选择是否使用IF EXISTS关键字,如果没有这个关键字而且表不存在的话,就会抛出一个错误信息。
对于内部表,表的元数据信息和表的数据都会删除;对于外报表,表的元数据信息会被删除,但是表中的数据不会被删除。
3 修改表
表重命名
ALTER TABLE comment_center RENAME TO x_comment_center ;
增加、删除分区
##增加
use acorn_3g;alter table comment_center add if not exists partition(dt='2013-12-07') location '/xlog/ugc/comment/access/2013/2013-12-07.bz2';
##删除
hive> use acorn_3g;alter table comment_center drop if exists partition(dt='2013-12-07');
修改列信息
#修改列名称
alter table comment_center change column error_code result_code int comment 'rename clolumn';
#修改列类型
alter table comment_center change column result_code result_code float;
增加列
alter table comment_center add columns (error_msg string comment 'error msg');
4 快速查看属性
创建表sql语句(0.11后支持)
hive> show create table employees ;
OK
CREATE TABLE employees(
name string COMMENT 'employee name',
salary float COMMENT 'employee salary',
subordinates array<string> COMMENT 'names of subordinates',
deductions map<string,float> COMMENT 'keys are deductions names,values are precentages',
address struct<street:string,city:string,state:string,zip:int> COMMENT 'home address')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://YZSJHL19-42.opi.com/warehouse.test/employees'
TBLPROPERTIES (
'numPartitions'='0',
'numFiles'='1',
'transient_lastDdlTime'='1386489002',
'numRows'='0',
'totalSize'='177',
'rawDataSize'='0')
Time taken: 0.187 seconds, Fetched: 23 row(s)
表字段信息
desc employees ;
表属性详细信息
desc extended employees (显示创建表的详细信息)
desc formatted employees(显示创建表的详细信息,推荐使用,格式更规整)
5 数据操作
向表中添加数据
##本地拷贝到线上集群
load data local inpath './employees.txt' overwrite into table employees(overwrite会覆盖之前的数据,去掉overwrite则会追加到之前文件后面)
##集群文件移动
load data inpth '/user/xlog/yangzhen/employees.txt' overwrite into table employees