手动在hive表插入数据时,需特别注意两点:
1、hive表不可以指定字段插入
2、如果hive表是分区表,插入数据时,需指定分区
下面以分区表为例,插入数据:
查看表结构
hive> show create table acct_wt_user_avg_stock_hold_days;
OK
CREATE EXTERNAL TABLE `acct_wt_user_avg_stock_hold_days`(
`init_date` string COMMENT '????',
`fund_account` string COMMENT '????',
`avg_hold_days` decimal(19,0) COMMENT '????????')
PARTITIONED BY (
`part_init_date` string,
`interval_type` string)
CLUSTERED BY (
fund_account)
SORTED BY (
fund_account ASC)
INTO 100 BUCKETS
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
WITH SERDEPROPERTIES (
'colelction.delim'='\u0002',
'field.delim'='\u0001',
'line.delim'='\n',
'mapkey.delim'='\u0003',
'serialization.format'='\u0001')
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
'hdfs://xxx.xxx.xxx.xxx:8020/user/hive/warehouse/hs_wt/acct_wt_user_avg_stock_hold_days'
TBLPROPERTIES (
'transient_lastDdlTime'='1533279332')
Time taken: 0.036 seconds, Fetched: 28 row(s)
插入数据
insert into acct_wt_user_avg_stock_hold_days partition (part_init_date='20190425',interval_type=1) select '20190425','4567890',30 from acct_wt_user_hold_stock_earnging_ratio limit 1;
查看结果:
hive> select * from acct_wt_user_avg_stock_hold_days where part_init_date = '20190425' and interval_type = 1;
OK
Time taken: 0.156 seconds
没有数据,考虑了一下,可能是上述表acct_wt_user_hold_stock_earnging_ratio
中无数据。
查看表acct_wt_user_hold_stock_earnging_ratio
是否有数据
hive> select * from acct_wt_user_hold_stock_earnging_ratio limit 1;
OK
Time taken: 0.069 seconds
确实无数据。
找一张存在数据的表,重新执行上述命令:
hive> select 1 from ods_origindb.hs_ods_secumain limit 1;
OK
1
Time taken: 0.04 seconds, Fetched: 1 row(s)
替换表名,重新执行:
insert into acct_wt_user_avg_stock_hold_days partition (part_init_date='20190425',interval_type=1) select '20190425','4567890',30 from ods_origindb.hs_ods_secumain limit 1;
查验数据是否插入:
hive> select * from ztx1.acct_wt_user_avg_stock_hold_days where part_init_date = '20190425' and interval_type = 1;
OK
20190425 4567890 30 20190425 1
Time taken: 0.074 seconds, Fetched: 1 row(s)
数据插入成功;
注意:插入语句中,采用了select … from table…limit … 的语句,后面的 limit 只是为了限定插入条数;如果 table 表里有 100w数据,没有什么冲突的话,又不加限制,可能会插入 10w条数据; 如果 table 表中 没有数据,那么 select 结果为空,数据插入为空;