Zeppelin: row-to-column conversion, window functions, and dynamic partition tables

Author: 就想做一条闲鱼

Category column: 本科-企业实训 (undergraduate enterprise training). Tags: hive, big data

Article link: https://blog.csdn.net/qq_43893755/article/details/120611451

Enterprise training course, session 7

One-command startup of all services (script)

cd /export/  # change into this directory
ll
mkdir onekey  # create a directory for the scripts
cd onekey/
vim onekey-start.sh  # open the file in the editor and enter the content below

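### content
# Start the Hadoop cluster (start-all.sh starts both HDFS and YARN)
echo "Starting the Hadoop cluster..."
/export/server/hadoop-3.1.4/sbin/start-all.sh

# Start the Hive MetaStore service in the background; the log goes to the current directory
echo "Starting the Hive MetaStore service..."
nohup /export/server/hive/bin/hive --service metastore > ./metastore.log 2>&1 &
sleep 5

# Start the Hive HiveServer2 service in the background
echo "Starting HiveServer2..."
nohup /export/server/hive/bin/hive --service hiveserver2 > ./hiveserver2.log 2>&1 &
sleep 5

# Start the Zeppelin server
echo "Starting the Zeppelin server..."
/export/server/zeppelin-0.8.2-bin-all/bin/zeppelin-daemon.sh start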


# One-command shutdown
vim onekey-stop.sh

## content
/export/server/zeppelin-0.8.2-bin-all/bin/zeppelin-daemon.sh stop
# Force-kill HiveServer2 and the MetaStore by PID; -r (GNU xargs) skips kill when nothing matches
jps -lm | grep -i 'server.HiveServer2' | awk '{print $1}' | xargs -r kill -s 9
jps -lm | grep -i 'metastore.HiveMetaStore' | awk '{print $1}' | xargs -r kill -s 9
/export/server/hadoop-3.1.4/sbin/stop-all.sh

Grant 777 permissions (run this from /export, the parent directory of onekey/)

chmod -R 777 onekey/
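
To make the `jps -lm | grep -i ... | awk '{print $1}'` step of the stop script easier to follow, here is a minimal Python sketch of what that pipeline selects. The function name `find_pids` and the sample `jps` output are made up for illustration; real `jps -lm` lines will differ in PIDs and arguments. Unlike grep, the sketch treats the pattern as a literal substring rather than a regular expression.

```python
def find_pids(jps_output, pattern):
    """Return the PIDs (first field) of lines that contain pattern, ignoring case."""
    pids = []
    for line in jps_output.splitlines():
        fields = line.split()
        # grep -i: case-insensitive match; awk '{print $1}': keep the first field
        if fields and pattern.lower() in line.lower():
            pids.append(int(fields[0]))
    return pids


# Hypothetical jps -lm output, for illustration only
sample = """\
2101 org.apache.hadoop.hdfs.server.namenode.NameNode
3050 org.apache.hadoop.util.RunJar hive-metastore.jar org.apache.hadoop.hive.metastore.HiveMetaStore
3120 org.apache.hadoop.util.RunJar hive-service.jar org.apache.hive.service.server.HiveServer2
"""

print(find_pids(sample, "server.HiveServer2"))
print(find_pids(sample, "metastore.HiveMetaStore"))
```

When a service is not running, the list is empty, which corresponds to the case where the stop script has nothing to pass to kill.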

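Hive built-in functions

1. Hands-on: understanding Hive built-in functions

Notes on row-to-column conversion

All of the following operations are performed in Zeppelin.

-- Create the table
create table emp( deptno int, ename string ) row format delimited fields terminated by '\t';

Experiment

[Figure: image not preserved]

In-class exercise
Convert the rows in the figure above into one row per deptno, with the ename values joined by |: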

select  deptno, concat_ws('|' , collect_set(ename)) from emp group by deptno;
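
The query groups rows by deptno, collects the distinct ename values of each group with collect_set, and joins them with concat_ws. As a rough model of that behavior, here is a small Python sketch. The function name `rows_to_column` and the sample rows are assumptions for illustration, not data from the course. Hive does not guarantee the order of elements inside collect_set; the sketch simply keeps first-seen order.

```python
def rows_to_column(rows, sep="|"):
    """Model of: select deptno, concat_ws(sep, collect_set(ename)) from emp group by deptno."""
    groups = {}
    for deptno, ename in rows:
        names = groups.setdefault(deptno, [])
        if ename not in names:  # collect_set drops duplicate values
            names.append(ename)
    # concat_ws joins the collected values with the separator
    return {deptno: sep.join(names) for deptno, names in groups.items()}


# Hypothetical emp rows: (deptno, ename)
emp = [(10, "CLARK"), (10, "KING"), (20, "SMITH"), (20, "SMITH"), (30, "ALLEN")]
result = rows_to_column(emp)
print(result)
```

If duplicate names should be kept, Hive's collect_list is the counterpart of collect_set; in the sketch that would mean dropping the `if ename not in names` check.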

2. Window functions

Hands-on: understanding window functions

Create the table

create table user_access (
 user_id string, 
 createtime string, --day 
 pv int 
) 
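row format DELIMITED FIELDS TERMINATED BY ',';

Ranking within groups
Window functions: in effect, a window function produces a new column (a new window) whose values are computed according to the logic and rules we specify.
Syntax: function(...) over(partition by xxxx [order by ...]). A function followed by an over clause forms a window function.

rank() dense_rank()  row_number()
When rows tie on the order by value, rank() gives them the same number and skips the numbers that follow, dense_rank() gives them the same number without skipping, and row_number() always numbers the rows 1, 2, 3, ... within the partition.

Q: Which user has the most pv? Note that the query below partitions by user_id, so what it actually returns is each user's own rows ranked by pv, highest first.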

select user_id,createtime,pv,
rank() over(partition by user_id order by pv desc) as rn1,
dense_rank() over(partition by user_id order by pv desc) as rn2,
row_number() over(partition by user_id order by pv desc) as rn3
from user_access
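
The difference between the three ranking functions is easiest to see on rows with ties. The following is a minimal Python sketch of the query above, under the assumption of a few made-up user_access rows; the function name `rank_rows` is hypothetical. In Hive the order of tied rows under row_number() is not guaranteed, while the sketch keeps them in input order.

```python
from itertools import groupby


def rank_rows(rows):
    """rows: (user_id, createtime, pv). Appends rn1 (rank), rn2 (dense_rank), rn3 (row_number)."""
    out = []
    # partition by user_id order by pv desc
    ordered = sorted(rows, key=lambda r: (r[0], -r[2]))
    for _, partition in groupby(ordered, key=lambda r: r[0]):
        rank = dense = 0
        prev_pv = None
        for row_number, row in enumerate(partition, start=1):
            if row[2] != prev_pv:
                rank = row_number  # rank jumps to the row position, leaving gaps after ties
                dense += 1         # dense_rank increases by one per distinct pv value
                prev_pv = row[2]
            out.append(row + (rank, dense, row_number))
    return out


# Hypothetical rows, for illustration only
access = [
    ("u1", "2021-10-01", 5),
    ("u1", "2021-10-02", 7),
    ("u1", "2021-10-03", 7),
    ("u1", "2021-10-04", 3),
    ("u2", "2021-10-01", 4),
]
ranked = rank_rows(access)
for r in ranked:
    print(r)
```

For user u1 the two rows with pv 7 both get rank 1 and dense rank 1; the row with pv 5 then gets rank 3 but dense rank 2.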

Hive built-in functions: aggregate functions used as window functions

sum, avg, min, max
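
The notes only list these functions, so as an illustration here is a minimal Python sketch of one common use: a running total such as `sum(pv) over(partition by user_id order by createtime)` on the user_access table. The function name `running_pv` and the sample rows are assumptions. The sketch also assumes createtime is unique per user; with duplicate createtime values, Hive's default window frame would give the tied rows the same total.

```python
from itertools import groupby


def running_pv(rows):
    """rows: (user_id, createtime, pv). Appends the cumulative pv per user, ordered by createtime."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _, partition in groupby(ordered, key=lambda r: r[0]):
        total = 0  # the running sum restarts for every partition
        for user_id, createtime, pv in partition:
            total += pv
            out.append((user_id, createtime, pv, total))
    return out


# Hypothetical rows, for illustration only
access = [
    ("u1", "2021-10-02", 7),
    ("u1", "2021-10-01", 5),
    ("u2", "2021-10-01", 4),
    ("u1", "2021-10-03", 3),
]
totals = running_pv(access)
for t in totals:
    print(t)
```

Without an order by inside over(), the aggregate is computed over the whole partition, so every row of a user would carry that user's overall sum.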

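Partitioned tables

Dynamic partitioning: loading data into a partitioned table
[Dynamic partitioning] means that the partition column values are inferred automatically from the query result, by column position.
[Core syntax] insert + select.
To enable dynamic partitioning, set two parameters in the Hive session.

-- Whether to enable dynamic partitioning
set hive.exec.dynamic.partition=true;
-- Dynamic partition mode: nonstrict or strict.
-- strict mode requires at least one partition column to be static.
set hive.exec.dynamic.partition.mode=nonstrict;

Hands-on: dynamic partition tables

Create the tables: first the plain source table, then the partitioned table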

create table t_all_hero(
   id int,
   name string,
   hp_max int,
   mp_max int,
   attack_max int,
   defense_max int,
   attack_range string,
   role_main string,
   role_assist string
)
row format delimited
fields terminated by "\t";


create table t_all_hero_part(
   id int,
   name string,
   hp_max int,
   mp_max int,
   attack_max int,
   defense_max int,
   attack_range string,
   role_main string,
   role_assist string
) partitioned by (role string) -- note: this is the partition column
row format delimited
fields terminated by "\t";

Static partitions: manual loading

Static loading

load data local inpath '/export/data/hivedata/archer.txt' into table t_all_hero_part partition(role='sheshou');
load data local inpath '/export/data/hivedata/assassin.txt' into table t_all_hero_part partition(role='cike');
load data local inpath '/export/data/hivedata/mage.txt' into table t_all_hero_part partition(role='fashi');
load data local inpath '/export/data/hivedata/support.txt' into table t_all_hero_part partition(role='fuzhu');
load data local inpath '/export/data/hivedata/tank.txt' into table t_all_hero_part partition(role='tanke');
load data local inpath '/export/data/hivedata/warrior.txt' into table t_all_hero_part partition(role='zhanshi');

Upload the 6 files with mobx to /export/data/hivedata/ (the path used by the load statements above), then start beeline

../server/hive/bin/beeline

One-command startup script

———————————————————

Dynamic partitions

Create the dynamic partition table

-- Create a new partitioned table, t_all_hero_part_dynamic
create table t_all_hero_part_dynamic(
    id int,
    name string,
    hp_max int,
    mp_max int,
    attack_max int,
    defense_max int,
    attack_range string,
    role_main string,
    role_assist string
) partitioned by (role string)
row format delimited
fields terminated by "\t";

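After creating the new partitioned table, run the dynamic partition insert.

In a dynamic partition insert, the partition value is inferred from the position of the columns returned by the query: the trailing column of the select list supplies the value of the partition column.

Core statement: dynamic partitioning

-- Run the dynamic partition insert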
insert into table t_all_hero_part_dynamic partition(role) select tmp.*,tmp.role_main from t_all_hero tmp;
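
In the statement above, `tmp.*` fills the nine data columns and the extra trailing `tmp.role_main` becomes the value of the partition column role. A minimal Python sketch of that routing follows. The function name `dynamic_partition_insert`, the shortened three-column rows, and the hero names are assumptions for illustration; the `role=value` keys mirror how Hive names partition directories.

```python
def dynamic_partition_insert(selected_rows):
    """Route each selected row to a partition named after its last value."""
    partitions = {}
    for row in selected_rows:
        *data, role = row  # the trailing select column is the partition value
        partitions.setdefault(f"role={role}", []).append(tuple(data))
    return partitions


# Hypothetical, shortened t_all_hero rows: (id, name, role_main)
heroes = [(1, "hero_a", "archer"), (2, "hero_b", "mage"), (3, "hero_c", "archer")]

# select tmp.*, tmp.role_main from t_all_hero tmp
selected = [h + (h[2],) for h in heroes]

parts = dynamic_partition_insert(selected)
for name, rows in parts.items():
    print(name, rows)
```

Compared with the static load statements, no partition value is written by hand: one insert creates as many partitions as there are distinct role_main values.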


Bucketed tables

[Figures: two images not preserved]
Syntax
