Hive Table Operations

Reposted from: http://xchunlnei.i.sohu.com/blog/view/161921349.htm


Logically, a Hive table is made up of the stored data plus the metadata that describes the table's layout. The data typically resides in HDFS, although it can live in any Hadoop-supported filesystem, including the local filesystem; the metadata is kept in a relational database (the metastore), not in HDFS.
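A quick way to see where managed table data will end up is to print the warehouse directory setting from the Hive CLI. This is a minimal sketch; the value shown is only the common default, not something stated in the original post.

hive> SET hive.metastore.warehouse.dir;
hive.metastore.warehouse.dir=/user/hive/warehouse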

Loading data into Hive is very fast, because it is essentially just a file move.
The load operation is very fast, since it is just a filesystem move. However,
bear in mind that Hive does not check that the files in the table
directory conform to the schema declared for the table, even for managed
tables. If there is a mismatch, then this will become apparent at
query time, often by the query returning NULL for a missing field. You
can check that the data is being parsed correctly by issuing a simple
SELECT statement to retrieve a few rows directly from the table.
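As the quote above suggests, once the table has been created and loaded (steps 1 and 6 below), a simple SELECT is enough to confirm that rows parse against the declared schema. A sketch, assuming the pv_detail table from step 1:

hive> select * from pv_detail limit 5;

Fields that cannot be parsed under the declared types come back as NULL in the output.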

1. Create a new table:
create table pv_detail(plate_id string, hour_id string, city_id int, content_type int, pv_num int) row format delimited fields terminated by '\t';

2. Show all tables:
hive> show tables;
OK
pv_detail
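If the warehouse holds many tables, show tables also accepts a quoted wildcard pattern. This is a sketch; the output shown simply assumes pv_detail is the only match:

hive> show tables 'pv*';
OK
pv_detail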

3. Show a table's definition:
hive> describe pv_detail;
OK
plate_id        string
hour_id         string
city_id         int
content_type    int
pv_num          int
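Beyond the plain column list, describe extended (or describe formatted) also reports the table's storage location, input/output formats, and other metadata; a sketch, output omitted:

hive> describe extended pv_detail;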

4. Modify a table:
  hive> ALTER TABLE pv_detail ADD COLUMNS (new_col INT);
  hive> ALTER TABLE pv_detail ADD COLUMNS (new_col2 INT COMMENT 'a comment');
  hive> ALTER TABLE pv_detail RENAME TO 3koobecaf;

5. Drop a table:
hive> DROP TABLE pv_detail;
This deletes the hdfs://user/hive/warehouse/pv_detail directory; because pv_detail is a managed table, the metadata and the data are removed together.
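This delete-data-and-metadata behavior is specific to managed tables such as pv_detail. For contrast, here is a hedged sketch of an external table (the table name and location are hypothetical, not from the original post): dropping it removes only the metastore entry and leaves the files under /data/external/pv_detail untouched.

hive> create external table pv_detail_ext(plate_id string, hour_id string, city_id int, content_type int, pv_num int) row format delimited fields terminated by '\t' location '/data/external/pv_detail';
hive> DROP TABLE pv_detail_ext;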

6. Load data from a local file:
hive> load data local inpath '/data/logs/data/backup/pv_detal.txt' overwrite into table pv_detail;
hive> create table uv_detail(cookie_id string,hour_id string,city_id int,plate_id string,content_type int,pv_num int) row format delimited fields terminated by '\t';
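A matching load for the uv_detail table just created would follow the same pattern; the file path below is hypothetical, used only to illustrate:

hive> load data local inpath '/data/logs/data/backup/uv_detail.txt' overwrite into table uv_detail;

Note that with local, Hive copies the file from the local filesystem into the table's directory rather than moving it.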

7. Load data from HDFS:
hive> load data inpath '/user/root/output/part-00000' into table pv_detail;
This moves the file hdfs://user/root/output/part-00000 into the hdfs://user/hive/warehouse/pv_detail directory.
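The move can be confirmed from inside the Hive CLI with the dfs command, which runs a hadoop fs shell command in-session; a sketch, output omitted. After the load, the source file should no longer appear under /user/root/output, while a part-00000 file shows up under the table directory.

hive> dfs -ls /user/root/output/;
hive> dfs -ls /user/hive/warehouse/pv_detail/;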
