【Superset_01】

Problem encountered:

CentOS error: Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=stock32 error was 14: curl#6 - "Could not resolve host: mirrorlist.centos.org; Unknown error"

Solution:

https://www.cnblogs.com/maowenqiang/articles/7728685.html

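Requirements:
City table: MySQL
Product table: MySQL
{"product_status":1} self-operated
{"product_status":0} not self-operated
User behavior data: stored on HDFS

Goal:
1. Find the top 3 most popular products in each area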
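The product table keeps its self-operated flag inside a JSON string, as the two `product_status` samples show. Below is a minimal Python sketch of reading that flag, assuming the JSON lives in the `extend_info` column; the helper name `is_self_operated` is hypothetical and not part of the original notes.

```python
import json


def is_self_operated(extend_info):
    """Return True when the JSON text carries product_status == 1."""
    try:
        payload = json.loads(extend_info)
    except (TypeError, ValueError):
        # Missing or malformed JSON is treated as "not self-operated".
        return False
    return isinstance(payload, dict) and payload.get("product_status") == 1


print(is_self_operated('{"product_status":1}'))
print(is_self_operated('{"product_status":0}'))
```

Inside Hive itself, the built-in `get_json_object(extend_info, '$.product_status')` extracts the same field without leaving SQL.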

1. Create the tables
Hive:

create table city_info(
city_id int,
city_name string,
area string
)
row format delimited fields terminated by '\t';

create table product_info(
product_id int,
product_name string,
extend_info string
)
row format delimited fields terminated by '\t';

create table user_click(
user_id int,
user_session string,
dt date,
city_id int,
product_id int
)
row format delimited fields terminated by ',';

load data local inpath "/home/hadoop/tmp/user_click.txt" into table user_click;

City table: MySQL
Product table: MySQL
User behavior data
=> all loaded into Hive

sqoop import \
--connect jdbc:mysql://bigdata13:3306/bigdata \
--username root \
--password 123456 \
--mapreduce-job-name 'city' \
--delete-target-dir \
--fields-terminated-by '\t' \
-m 1 \
--columns "city_id,city_name,area" \
--table "city_info" \
--hive-import \
--hive-overwrite \
--hive-database bigdata_hive \
--hive-table city_info

sqoop import \
--connect jdbc:mysql://bigdata13:3306/bigdata \
--username root \
--password 123456 \
--mapreduce-job-name 'product_info' \
--delete-target-dir \
--target-dir /sqoop/product_info_tmp \
--fields-terminated-by '\t' \
-m 1 \
--columns "product_id,product_name,extend_info" \
--table "product_info" \
--hive-import \
--hive-overwrite \
--hive-database bigdata_hive \
--hive-table product_info

In MySQL:
create table user_click(
user_id int(10),
user_session varchar(50),
dt date,
city_id int(10),
product_id int(10)
);

sqoop export \
--connect jdbc:mysql://bigdata13:3306/bigdata \
--username root \
--password 123456 \
--table user_click \
--mapreduce-job-name 'hive2mysql' \
--input-fields-terminated-by ',' \
--export-dir /user/hive/warehouse/bigdata_hive.db/user_click/*

2. Data analysis
Left as an exercise for you.
3. MySQL => data visualization

1. Top 3 most popular products in each area
Tables:
city_info
product_info
user_click
Dimensions: area, product
Metric: popularity = product click count

1. Area, product, product click count
user_click:
city_id left join city_info => area
product_id left join product_info => product_name

-- Wide table the requirement depends on (create this first)
create table dws_user_click_area_product_name as
select
a.*,
area,
product_name
from
(
-- main table
select
*
from
user_click
) as a
left join (
-- city_info
select
city_id,
city_name,
area
from
city_info
) as b on a.city_id = b.city_id
left join (
-- product_info
select
product_id,
product_name,
extend_info
from
product_info
) as c on a.product_id = c.product_id;

-- Requirement: click count per area and product
create table rpt_area_product_name_cnt as
select
area,
product_name,
count(1) as cnt
from dws_user_click_area_product_name
group by
area,
product_name;
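The left joins that build dws_user_click_area_product_name keep every click even when its city or product has no match in the dimension tables. The Python sketch below imitates that behavior with plain dictionaries; the function name and all sample rows are made up for illustration and do not come from the real data set.

```python
def left_join_clicks(clicks, city_by_id, product_by_id):
    """Attach area and product_name to each click; unmatched keys yield None."""
    wide_rows = []
    for click in clicks:
        row = dict(click)  # copy so the input rows stay untouched
        row["area"] = city_by_id.get(click["city_id"], {}).get("area")
        row["product_name"] = product_by_id.get(
            click["product_id"], {}
        ).get("product_name")
        wide_rows.append(row)
    return wide_rows


# Hypothetical sample rows, for illustration only.
cities = {1: {"city_name": "city_a", "area": "north"}}
products = {10: {"product_name": "product_x"}}
clicks = [
    {"user_id": 7, "city_id": 1, "product_id": 10},
    {"user_id": 8, "city_id": 99, "product_id": 10},  # unknown city
]

for wide_row in left_join_clicks(clicks, cities, products):
    print(wide_row)
```

The second click survives the join with `area` set to `None`, which is what a SQL left join returns as NULL. Such rows would form their own NULL area group in the count table unless they are filtered out.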

2. Area, product, product click count: top 3

rpt_area_product_name_cnt

create table rpt_cnt_top3 as
select
area,
product_name,
cnt,
rk
from
(
select
area,
product_name,
cnt,
rank() over(partition by area order by cnt desc ) as rk
from rpt_area_product_name_cnt
) as a
where rk <=3;
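Because the query uses rank(), tied click counts share a rank, so an area can return more than three rows. The Python sketch below reproduces that ranking rule on hypothetical (area, product_name, cnt) tuples to make the tie behavior visible; the function name and the sample values are assumptions for illustration.

```python
from collections import defaultdict


def top_n_by_rank(rows, n=3):
    """Keep rows whose rank() within their area is <= n (ties share a rank)."""
    by_area = defaultdict(list)
    for area, product_name, cnt in rows:
        by_area[area].append((product_name, cnt))

    result = []
    for area in sorted(by_area):
        ordered = sorted(by_area[area], key=lambda item: item[1], reverse=True)
        rk = 0
        previous_cnt = None
        for position, (product_name, cnt) in enumerate(ordered, start=1):
            if cnt != previous_cnt:
                # A new count value takes the current position as its rank.
                rk = position
                previous_cnt = cnt
            if rk > n:
                break
            result.append((area, product_name, cnt, rk))
    return result


# Hypothetical counts with a three-way tie for second place.
sample = [
    ("east", "a", 10),
    ("east", "b", 8),
    ("east", "c", 8),
    ("east", "d", 8),
    ("east", "e", 1),
]
for row in top_n_by_rank(sample):
    print(row)
```

With this sample, four rows come back for the area: one with rank 1 and three with rank 2. If exactly three rows per area are required, row_number() is the usual replacement in the SQL; dense_rank() keeps ties but leaves no gaps in the numbering.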

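Notes:
Work out which table drives the requirement. Main table here: user_click
Dimension tables: city_info, product_info
1. Main table vs. secondary/dimension table
2. Fact table vs. dimension table

3. MySQL => data visualization
Hive rpt => MySQL

Create the table in MySQL: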
CREATE TABLE rpt_cnt_top3(
area varchar(20),
product_name varchar(50),
cnt bigint,
rk int
);

Sqoop:
sqoop export \
--connect jdbc:mysql://bigdata31:3306/bigdata \
--username root \
--password 123456 \
--table rpt_cnt_top3 \
--mapreduce-job-name 'hdfs2mysql' \
--input-fields-terminated-by '\001' \
--export-dir /user/hive/warehouse/bigdata.db/rpt_cnt_top3/*

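Superset => data visualization
1. Official site
superset.apache.org

OOTB => out of the box

2. Installing Superset:
1. Python
2. Superset:
Do not deploy it on the same host as the MySQL service
MariaDB

bigdata32
bigdata33
bigdata34 superset

1. Set up the Python environment
1. Anaconda => Python
2. Plain Python (recommended)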
Install the Python dependencies
[root@bigdata34 src]# pwd
/usr/local/src
Upload the Python source package

Extract it
[root@bigdata34 src]# tar -xvf Python-3.6.6.tgz
[root@bigdata34 src]# ll
total 22400
drwxr-xr-x. 17 501 501 4096 Jun 27 2018 Python-3.6.6
-rw-r--r--. 1 root root 22930752 Dec 5 13:43 Python-3.6.6.tgz

[root@bigdata34 src]# cd Python-3.6.6
[root@bigdata34 Python-3.6.6]# ./configure
[root@bigdata34 Python-3.6.6]# make && make install

Python installation complete

2. Create a Python virtual environment for Superset
[root@bigdata34 src]# pwd
/usr/local/src
[root@bigdata34 src]# python3 -m venv superset-py3
[root@bigdata34 src]# ll
total 22404
drwxr-xr-x. 18 501 501 4096 Dec 6 06:04 Python-3.6.6
-rw-r--r--. 1 root root 22930752 Dec 5 13:43 Python-3.6.6.tgz
drwxr-xr-x. 5 root root 4096 Dec 6 06:08 superset-py3
[root@bigdata34 src]# source superset-py3/bin/activate

3. Deploy Superset
1. Install Superset's extra dependencies
[root@bigdata34 src]# vim requirement.txt
2. Install Superset
3. Install the MySQL driver for the Superset metadata connection

4. Configure Superset
1. Configure the Superset metadata database
CREATE DATABASE superset /*!40100 DEFAULT CHARACTER SET utf8 */;

'mysql://root:123456@bigdata31/superset?charset=utf8'
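The metadata URI is normally set through `SQLALCHEMY_DATABASE_URI` in a `superset_config.py` file that Superset finds on `PYTHONPATH` or through the `SUPERSET_CONFIG_PATH` environment variable. The notes do not show that file, so the following is only a sketch of what it might contain; the helper `build_metadata_uri` is hypothetical and is there to URL-escape special characters in the credentials.

```python
# superset_config.py (sketch; the notes do not show the actual file)
from urllib.parse import quote_plus


def build_metadata_uri(user, password, host, database, charset="utf8"):
    """Assemble a MySQL SQLAlchemy URI, escaping user and password."""
    return "mysql://{}:{}@{}/{}?charset={}".format(
        quote_plus(user), quote_plus(password), host, database, charset
    )


# Values copied from the URI in the notes.
SQLALCHEMY_DATABASE_URI = build_metadata_uri(
    "root", "123456", "bigdata31", "superset"
)

print(SQLALCHEMY_DATABASE_URI)
```

Escaping matters once a password contains characters such as `@` or `/`, which would otherwise break URI parsing.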

5. Initialize Superset

6. Start Superset
superset run -h bigdata34 -p 8889
Data source connection URI:
mysql://root:123456@bigdata31/bigdata

Docker deployment:
1. Install Docker [pick one machine]
yum remove docker docker-client docker-client-latest docker-common docker-latest docker-latest-logrotate docker-logrotate docker-engine
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
Install the latest version:
yum install docker-ce docker-ce-cli containerd.io docker-compose-plugin
Or pin a specific version:
yum install docker-ce-20.10.9 docker-ce-cli-20.10.9 containerd.io docker-compose-plugin

2. Start Docker
systemctl start docker
systemctl status docker

3. Pull the Superset image
docker pull apache/superset
Check whether port 8080 is already in use on the Linux host:
netstat -nlp | grep 8080
docker run -d -p 8080:8088 --name superset apache/superset
docker exec -it superset superset fab create-admin --username admin --firstname Superset --lastname Admin --email admin@superset.com --password admin

docker exec -it superset superset db upgrade
docker exec -it superset superset load_examples
docker exec -it superset superset init

4. Superset web UI:
http://hostname:8080/login/
[admin/admin]

Homework:
1. Finish the data analysis case study + Sqoop
2. Deploy Superset
1. Try it out with some simple usage
