Automating Business Workflows with Hive and Shell Scripts

weixin_40652340

Category: Big Data

Article link: https://blog.csdn.net/weixin_40652340/article/details/78788922

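This article shows how to combine Hive with shell scripts to automate data loading. It checks the partitions of a Hive table and uses a shell script to run the actual load_to_hive.h task, which makes data processing more efficient.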

[Case study] Loading data into a Hive partitioned table with a script
access_logs/20170610/2017061000.log
2017061001.log
2017061002.log
......
2017061023.log
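
Each file name in the listing above encodes both the day and the hour. A minimal sketch of splitting them apart, assuming every file follows the YYYYMMDDHH.log pattern; the function name `parse_log_name` is hypothetical and not part of the original article:

```shell
#!/bin/sh
# Hypothetical helper: split a log file name such as 2017061000.log
# into its day (YYYYMMDD) and hour (HH) parts.
# The body runs in a subshell so its variables do not leak to the caller.
parse_log_name() (
    name=$(basename "$1" .log)                # drop directory and .log suffix
    day=$(printf '%s' "$name" | cut -c1-8)    # characters 1-8: the day
    hour=$(printf '%s' "$name" | cut -c9-10)  # characters 9-10: the hour
    printf '%s %s\n' "$day" "$hour"
)
```

Calling `parse_log_name access_logs/20170610/2017061000.log` prints the day and the hour separated by a space. Those two values can then be used as the partition values.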

Two-level partitioning: day/hour
crontab + shell are used for automatic scheduling.
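
As a rough sketch of how the scheduled script might build its load statement: the partition column names `dt` and `hr`, the `LOG_ROOT` directory, and the script path in the crontab line are all assumptions made for illustration. Only the database and table names (`load_hive`, `load_h`) come from this article.

```shell
#!/bin/sh
# Sketch only: LOG_ROOT, the partition columns (dt, hr) and the script
# path in the crontab example are assumptions, not from the article.
LOG_ROOT=${LOG_ROOT:-/data/access_logs}

# Print the HiveQL that loads one hourly file into its day/hour partition.
# $1 = day (YYYYMMDD), $2 = hour (HH)
build_load_stmt() {
    printf "load data local inpath '%s/%s/%s%s.log' into table load_hive.load_h partition (dt='%s', hr='%s');\n" \
        "$LOG_ROOT" "$1" "$1" "$2" "$1" "$2"
}

# Example crontab entry (hypothetical path), run at minute 5 of every hour:
#   5 * * * * /bin/sh /opt/scripts/load_to_hive.sh
#
# Inside the scheduled script, the statement would be passed to the Hive CLI:
#   hive -e "$(build_load_stmt 20170610 00)"
```

Keeping the statement builder separate from the `hive -e` call makes it possible to print and inspect the generated HiveQL before it runs on a schedule.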

Create the database:
create database load_hive;

Create the table:
create table load_h(
id string,
url string,
referer string,
keyword string,
type string,
guid string,
pageId string,
moduleId string,
linkId string,
attachedInfo string,
sessionId string,
trackerU string,
trackerType string,
ip string,
trackerSrc string,
cookie string,
orderCode