Overview:
1. Create the Hive table
CREATE TABLE IF NOT EXISTS newsinfo.test (
    name STRING
)
CLUSTERED BY (name) INTO 3 BUCKETS
ROW FORMAT DELIMITED
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');
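2. ReplaceText is used here to generate the JSON data (in production, the data can be read directly from HDFS)
3. Use ConvertJSONToAvro to convert the JSON to Avro, with the following schema: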
{ "name": "dtu", "type": "record", "fields":[ { "name":"name","type": "string" } ] }
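To make the shape of the data concrete, here is a minimal Python sketch that checks a JSON record against the schema above before it goes through the flow. It is not part of the NiFi flow itself. The sample payload and the helper `matches_schema` are hypothetical, and the check is deliberately simplistic (exact field names and a few primitive types only), not Avro's real validation rules.

```python
import json

# Avro schema from step 3, as configured in ConvertJSONToAvro.
SCHEMA = json.loads(
    '{ "name": "dtu", "type": "record", '
    '"fields": [ { "name": "name", "type": "string" } ] }'
)

# Only the primitive types this sketch knows how to check (an assumption,
# far from the full Avro type system).
PYTHON_TYPES = {"string": str, "boolean": bool, "double": float}


def matches_schema(record, schema):
    """Return True if record is an object with exactly the schema's fields
    and each value has the expected primitive type."""
    if not isinstance(record, dict):
        return False
    fields = {f["name"]: f["type"] for f in schema["fields"]}
    if set(record) != set(fields):
        return False
    return all(
        isinstance(record[name], PYTHON_TYPES[avro_type])
        for name, avro_type in fields.items()
    )


# Hypothetical sample payload, similar to what ReplaceText could emit.
sample = json.dumps({"name": "alice"})
print(sample)
print(matches_schema(json.loads(sample), SCHEMA))
```

A record such as `{"name": 1}` or `{}` fails this check, which is a quick way to spot payloads that would not fit the single `name STRING` column of the table created in step 1.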
4. PutHiveStreaming
This article is reposted from the 疯吻IT blog on cnblogs. Original link: http://www.cnblogs.com/fengwenit/p/5928368.html. To repost it, please contact the original author.