hive的create table 问题

1) 使用like 关键字 

   在使用hive DDL语句时,遇上了一些问题:

create table if not exists temp.temp1 like vv_20111023;

    表创建成功,describe temp1也输出正常;

    然后执行数据插入操作:

insert OVERWRITE  table temp1 select a.* from vv.vv_20111023 a limit 10;

   执行正常,日志如下:

Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201110171502_22133, Tracking URL = http://haa22:50030/jobdetails.jsp?jobid=job_201110171502_22133
Kill Command = /opt/modules/hadoop/hadoop-0.20.203.0/bin/../bin/hadoop job  -Dmapred.job.tracker=haa22:9001 -kill job_201110171502_22133
2011-11-01 10:32:09,433 Stage-1 map = 0%,  reduce = 0%
2011-11-01 10:32:15,464 Stage-1 map = 8%,  reduce = 0%
2011-11-01 10:32:16,470 Stage-1 map = 10%,  reduce = 0%
2011-11-01 10:32:18,482 Stage-1 map = 25%,  reduce = 0%
2011-11-01 10:32:19,488 Stage-1 map = 37%,  reduce = 0%
2011-11-01 10:32:20,496 Stage-1 map = 47%,  reduce = 0%
2011-11-01 10:32:21,512 Stage-1 map = 53%,  reduce = 0%
2011-11-01 10:32:22,519 Stage-1 map = 54%,  reduce = 0%
2011-11-01 10:32:23,562 Stage-1 map = 59%,  reduce = 0%
2011-11-01 10:32:24,679 Stage-1 map = 64%,  reduce = 0%
2011-11-01 10:32:25,685 Stage-1 map = 76%,  reduce = 0%
2011-11-01 10:32:26,691 Stage-1 map = 80%,  reduce = 0%
2011-11-01 10:32:27,697 Stage-1 map = 90%,  reduce = 0%
2011-11-01 10:32:29,709 Stage-1 map = 98%,  reduce = 0%
2011-11-01 10:32:30,715 Stage-1 map = 100%,  reduce = 0%
2011-11-01 10:32:33,732 Stage-1 map = 100%,  reduce = 25%
2011-11-01 10:32:39,764 Stage-1 map = 100%,  reduce = 30%
2011-11-01 10:32:42,780 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201110171502_22133
Loading data to table temp.temp1
Deleted hdfs://haa22:9000/user/hive/warehouse/temp.db/temp1
Table temp.temp1 stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 2096]
10 Rows loaded to temp1
OK
Time taken: 47.768 seconds

    查看输出结果:

  select * from temp1; 

    输出:OK
Time taken: 0.096 seconds

    奇怪的是插入操作显示正常,但是在执行查询时却没有输出结果;

    查看该表的数据文件:

dfs -cat /user/hive/warehouse/temp.db/temp1/000000_0;

    输出:

20111023        18      219.139.149.24  2011-10-23 18:00:01     GET     /vvlog.html     {4A85BC70-9529-45B3-9B92-6F9DF2DB956E}  newbox      9709    13      123132  2       -1      0.0     0.0     newbox1/movie/detail.html       3.11.08.04      \N      中国    湖北省      宜昌市  电信
20111023        18      58.47.57.158    2011-10-23 18:00:01     GET     /vvlog.html     {6C05A61C-84AC-4F23-9211-08A637187F3A}  tre17640    13      207087  1       0       0.0     0.0     tree    3.11.10.12      \N      中国    湖南省  常德市  电信
20111023        18      113.111.164.148 2011-10-23 18:00:01     GET     /vvlog.html     {356C047F-0573-2ABC-8203-53B1913EF441}  bov20555    13      237069  1       0       0.0     0.0     bov     5.03.1001.1111  \N      中国    广东省  广州市  电信
20111023        18      60.189.194.51   2011-10-23 18:00:01     GET     /vvlog.html     {BBEF15BB-8FC9-4CB4-8413-9A4AB12D0F1B}  newbox      18248   13      212575  2       0       2490.665        2489.9822       newbox1/index/index_3.html      3.11.06.29      \N中国    浙江省  台州市  电信
20111023        18      111.79.192.55   2011-10-23 18:00:01     GET     /vvlog.html     {D2867EF8-B912-49E4-B557-9290567765A1}  tre12052    13      152434  1       0       0.0     0.0     tree    3.11.09.26      \N      中国    江西省  宜春市  电信
20111023        18      218.93.113.61   2011-10-23 18:00:01     GET     /vvlog.html     {DC353BFB-3A17-4599-9E28-A080F7D0FD33}  newbox      19367   13      228561  2       0       2401.314        2401.191        newbox1/movie/detail.html       3.10.12.01      \N中国    江苏省  常州市  电信
20111023        18      113.16.173.182  2011-10-23 18:00:01     GET     /vvlog.html     {4EEAC4F0-27AC-414F-BBAA-04220074F9F0}  newbox      15776   13      185594  1       0       0.0     0.0     newbox1/search/result.html      3.11.08.04      \N      中国    广西南宁市  电信
20111023        18      219.146.114.68  2011-10-23 18:00:01     GET     /vvlog.html     {210D0C4B-1BD8-4C9B-9B0D-064217EEF9B9}  newbox      20399   13      235465  2       0       60.371  21.960085       newbox1/movie/detail.html       3.11.10.12      \N      中国山东省  潍坊市  电信
20111023        18      180.113.39.152  2011-10-23 18:00:01     GET     /vvlog.html     {2A887575-077D-456B-941E-46E8AE962869}  newbox      18767   13      237061  1       0       0.0     0.0     newbox1/movie/detail.html       3.11.08.04      \N      中国    江苏省      无锡市  电信
20111023        18      113.106.102.77  2011-10-23 18:00:01     GET     /vvlog.html     {F91F94C2-20AD-0F58-DB1C-7A8AE3D775D4}  newbox      1007    24      1176    1       0       0.0     0.0     newbox1/movie/film/p2p/list-areaid-2-sid-1-p-3.shtml    5.01.0820.2122      \N      中国    广东省  珠海市  电信

很明显,数据已经成功的提取了出来,但是查询却无法输出结果,很有可能时定义表的时候产生的问题。

重新创建一张新表temp2(不使用like语法)

执行数据插入操作:

insert OVERWRITE  table temp2 select a.* from vv.vv_20111023 a limit 10;

查询:

select * from temp2 limit 3;

输出结果:

20111023        18      219.139.149.24  2011-10-23 18:00:01     GET     /vvlog.html     {4A85BC70-9529-45B3-9B92-6F9DF2DB956E}  newbox      9709    13      123132  2       -1      0.0     0.0     newbox1/movie/detail.html       3.11.08.04      \N      中国    湖北省      宜昌市  电信
20111023        18      58.47.57.158    2011-10-23 18:00:01     GET     /vvlog.html     {6C05A61C-84AC-4F23-9211-08A637187F3A}  tre17640    13      207087  1       0       0.0     0.0     tree    3.11.10.12      \N      中国    湖南省  常德市  电信
20111023        18      113.111.164.148 2011-10-23 18:00:01     GET     /vvlog.html     {356C047F-0573-2ABC-8203-53B1913EF441}  bov20555    13      237069  1       0       0.0     0.0     bov     5.03.1001.1111  \N      中国    广东省  广州市  电信



2)自定义列分隔符

     hive目前只支持单字符分割,如果想使用多字符分割,可以采用如下方式:

create table my_table(
col1 string,
col2 string,
col3 string,
col4 string,
col5 string)ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' 
WITH SERDEPROPERTIES ( "input.regex" = "正则表达式" ) STORED AS TEXTFILE;

注意:表的字段只能是string类型




  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值