mysql hive 建表_Hive 创建表

Hive 创建表时正则表达式与java代码中一致,RegexSerDe 中用\\来表示反转码

1. Hive 创建表时正则表达式与java代码中一致,RegexSerDe 中用\\来表示反转码

create table inst_cn_3 (

ip string,

time string,

mac string,

lan string,

ver string,

lc string,

pn string,

reg string,

vrf string,

line string)

ROW FORMAT SERDE 'org.apache.Hadoop.hive.contrib.serde2.RegexSerDe'

WITH SERDEPROPERTIES (

"input.regex" = "- *([\\d\\.]+) *\\[([\\d]{2}/[\\w]+/[\\d]{4}:[\\d]{2}:[\\d]{2}:[\\d]{2}\\s+\\+[\\d]+)\\] *GET */mx3/inst/([0-9a-f]{12})/ver=([\\.\\d]+),lan=(0x[\\w]+)(?:,lc=([\\w]+))(?:,pn=([\\w]+))(?:,reg=([0-1]))(?:,vrf=([\\w]+))?.*"

)

STORED AS TEXTFILE;

alter table inst_cn_3 add columns(line string);

ALTER TABLE inst_cn_3 SET SERDEPROPERTIES (

"input.regex" = "- ([\\d\\.]+) \\[([\\d]{2}/[\\w]+/[\\d]{4}:[\\d]{2}:[\\d]{2}:[\\d]{2}\\s+\\+[\\d]+)\\] GET /mx3/inst/([0-9a-f]{12})/ver=([\\.\\d]+),lan=(0x[\\w]+)(?:,lc=([\\w]+))(?:,pn=([\\w]+))(?:,reg=([0-1]))(?:,vrf=([\\w]+))?.*|(.*)"

);

select * from inst_cn_3 limit 100;

select

line

from inst_cn_3

where

1=1

and mac is null

and line is not null

and !(line rlike '.*unknowuser00.*')

;

hadoop fs -cp /mnt/nfs/log/statcn/inst/inst_cn_3.*.txt /hive/warehouse/inst_cn_3/

2011.06.09 hive时间处理

1.select

from_unixtime(unix_timestamp('02/May/2011:00:00:00 +0800','dd/MMMMM/yyyy:HH:mm:ss Z'),'yyyy-MM-dd HH:mm:ss') ,

from_unixtime(unix_timestamp('02/May/2011:23:59:59 +0800','dd/MMMMM/yyyy:HH:mm:ss Z'),'yyyy-MM-dd HH:mm:ss') ,

from_unixtime(unix_timestamp('03/May/2011:00:00:00 +0800','dd/MMMMM/yyyy:HH:mm:ss Z'),'yyyy-MM-dd HH:mm:ss') ,

from_unixtime(unix_timestamp('03/May/2011:23:59:59 +0800','dd/MMMMM/yyyy:HH:mm:ss Z'),'yyyy-MM-dd HH:mm:ss')

from

stat_c_log

where

1=1

and partkey = '20110503'

and logType = 'inst_cn_3'

and url rlike '/mx3/inst/.*'

limit

10

;

2.

select

from_unixtime(unix_timestamp('02/May/2011:00:00:00 +0800','dd/MMMMM/yyyy:HH:mm:ss Z'),'yyyy-MM-dd HH:mm:ss') ,

from_unixtime(unix_timestamp('02/May/2011:23:59:59 +0800','dd/MMMMM/yyyy:HH:mm:ss Z'),'yyyy-MM-dd HH:mm:ss') ,

from_unixtime(unix_timestamp('03/May/2011:00:00:00 +0800','dd/MMMMM/yyyy:HH:mm:ss Z'),'yyyy-MM-dd HH:mm:ss') ,

from_unixtime(unix_timestamp('03/May/2011:23:59:59 +0800','dd/MMMMM/yyyy:HH:mm:ss Z'),'yyyy-MM-dd HH:mm:ss') ,

round((unix_timestamp('02/May/2011:00:00:00 +0800','dd/MMMMM/yyyy:HH:mm:ss Z')-4*3600)/(3600*24))*(3600*24*1000),

round((unix_timestamp('02/May/2011:23:59:59 +0800','dd/MMMMM/yyyy:HH:mm:ss Z')-4*3600)/(3600*24))*(3600*24*1000)

from

stat_c_log

where

1=1

and partkey = '20110503'

and logType = 'inst_cn_3'

and url rlike '/mx3/inst/.*'

limit

10

;

2012.03.01

1. Hive 方法注册类 FunctionRegistry

2012.06.14

1. set hive.cli.print.header=true; 可以设置hive shell的输出.

2012.06.26

1. hive cdh4b2 使用arichive 对表归档后, 使用select line 对归档后的partition查询时,, 报FileNotFoundException 异常。

https://issues.apache.org/jira/browse/MAPREDUCE-2704 是因为CombineFileInputFormat constructs new Path objects by converting an existing path to a URI, and then only pulling out the "path" part of it. This drops the scheme and host, which makes CombineFileInputFormat fail if the paths are on a filesystem other than the default one.

2012.07.16

1. EXPLAIN EXTENDED hive_query; 查看运行.

2012.07.29

1. DESCRIBE FORMATTED mock; 显示columns, location, params等.

2. /src/ql/src/test/queries/clientpositive/

logo.gif

f68f2add0b68e4f9810432fce46917b7.png

本文原创发布php中文网,转载请注明出处,感谢您的尊重!

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值