新建hive表
hive>
>
> create table xxx_result(
> id string,
> zzz int,
> yyy string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> STORED AS TEXTFILE;
hive> create table post_result(id string, zzz int, yyy string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
查询表结构
hive> show create table www;
复制表数据（注意：insert...select 复制的是数据而非表结构；select 列顺序须与目标表 xxx_result(id, zzz, yyy) 按位置对应）
hive> insert into table xxx_result select id,zzz,yyy from www;
hdfs copy数据到本地
hdfs dfs -get hdfs://ns15/... .
sqoop导入数据
sqoop export --connect jdbc:mysql://ip/mysql_database_name --username jnlu -P --table post_result --fields-terminated-by '\t' --export-dir "hdfs://ns15/..."
导入elasticsearch
#!/usr/bin/env bash
# Launch the es_recovery.pig job for the previous day's partition,
# pushing Hive data into Elasticsearch.
set -x  # trace each command into the job log for debugging
set -e  # abort on the first failing command

# Previous calendar day, e.g. 2024-01-31. "yesterday" is the unambiguous
# GNU date spelling of the original "last-day"; $(...) replaces deprecated
# backticks (POSIX-recommended, nests cleanly).
target_date=$(date -d yesterday +%Y-%m-%d)
echo "target_date = " "$target_date"

# NOTE(review): the cache archive is unpacked under the symlink name
# 'yyy_data' (the part after '#'); es_recovery.pig presumably reads it
# via that relative path — confirm against the pig script.
pig \
-useHCatalog \
-Dmapreduce.job.cache.archives='/user/mart_tbi/xxx/data/jnlu_data.zip#yyy_data' \
-Dmapreduce.job.acl-view-job='*' \
-Dmapreduce.job.queuename=root.bdp_jmart_tbi_union.bdp_jmart_tbi_dev \
-Dmapred.create.symlink=yes \
-Dmapred.child.java.opts='-Xmx8192m ' \
-p start_date="$target_date" \
-p target_date="$target_date" \
es_recovery.pig 2>&1
echo 'DONE!'
-- Load a Hive table via HCatalog and upsert selected fields into Elasticsearch.
REGISTER datafu-pig-incubating-1.3.1.jar;
REGISTER elasticsearch-hadoop-pig-5.4.1.jar;

-- NOTE(review): 'es ip', 'hivedb.hiveTable' and 'esIndex/esType' look like
-- placeholders — substitute the real node address, table and index/type.
DEFINE EsStorage org.elasticsearch.hadoop.pig.EsStorage (
    'es.http.timeout = 5m',
    'es.index.auto.create = true',
    'es.nodes = es ip',
    'es.mapping.id = id',                          -- document _id taken from field "id"
    'es.mapping.pig.tuple.use.field.names = true',
    'es.write.operation = upsert');

-- Fixed: the original AS clause had a stray double comma after
-- album_id:CHARARRAY, which is a Pig parse error.
raw_data = LOAD 'hivedb.hiveTable' USING org.apache.hive.hcatalog.pig.HCatLoader() AS (
    album_id:CHARARRAY,
    dt:CHARARRAY);

-- NOTE(review): positional references $7/$16/$17 exceed the two-field schema
-- declared above; presumably the real table has 18+ columns and the AS clause
-- above is abbreviated — confirm against the Hive table definition.
es_data = FOREACH raw_data GENERATE
    (chararray)$7  AS id,
    (int)$16       AS popularity_new,
    (chararray)$17 AS norm_name;

-- Fixed: the original wrapped the index/type path in smart quotes plus an
-- unbalanced trailing quote (‘esIndex/esType’'), which breaks the parser.
STORE es_data INTO 'esIndex/esType' USING EsStorage;