练习

2、把预处理之后的数据进行入库到hive中
2.1创建数据库和表
创建数据库名字为:video
创建原始数据表:
视频表:video_ori 用户表:video_user_ori
创建ORC格式的表:
视频表:video_orc 用户表:video_user_orc
给出创建原始表语句
创建video_ori视频表:
create table video_ori(
videoId string,
uploader string,
age int,
category array,
length int,
views int,
rate float,
ratings int,
comments int,
relatedId array)
row format delimited
fields terminated by “:”
collection items terminated by “,”
stored as textfile;
创建video_user_ori用户表:
create table video_user_ori(
uploader string,
videos int,
friends int)
row format delimited
fields terminated by “,”
stored as textfile;
请写出ORC格式的建表语句:
创建video_orc表
create table video_orc(
videoId string,
uploader string,
age int,
category array,
length int,
views int,
rate float,
ratings int,
comments int,
relatedId array)
row format delimited
fields terminated by “:”
collection items terminated by “,”
stored as orcfile;

创建video_user_orc表:
create table video_user_orc(
uploader string,
videos int,
friends int)
row format delimited
fields terminated by “,”
stored as orcfile;

2.2分别导入预处理之后的视频数据到原始表video_ori和导入原始用户表的数据到video_user_ori中
请写出导入语句:
video_ori:

video_user_ori:

2.3从原始表查询数据并插入对应的ORC表中
请写出插入语句:
video_orc:
insert into video.video_orc(videoid, uploader, age, category, length, views, rate, ratings, comments, relatedid) SELECT t.* FROM video.video_ori t;

video_user_orc:
insert into video.video_user_orc(uploader, videos, friends) SELECT * from video.video_user_ori;

3、对入库之后的数据进行hivesql查询操作
3.1从视频表中统计出视频评分为5分的视频信息,把查询结果保存到/export/rate.txt
请写出sql语句:

insert overwrite local directory ‘/export/rate.txt’ row format delimited fields terminated by ‘:’ collection items terminated by ‘,’ select * from video.video_orc where rate = 5;

3.2从视频表中统计出评论数大于100条的视频信息,把查询结果保存到/export/comments.txt
请写出sql语句:

insert overwrite local directory ‘/export/comments.txt’ row format delimited fields terminated by ‘:’ collection items terminated by ‘,’ select * from video.video_orc where comments>100;

4、把hive分析出的数据保存到hbase中
4.1创建hive对应的数据库外部表
请写出创建rate外部表的语句:
create external table rate(
videoId string,
uploader string,
age int,
category array,
length int,
views int,
rate float,
ratings int,
comments int,
relatedId array)
row format delimited
fields terminated by “:”
collection items terminated by “,”;

请写出创建comments外部表的语句:
create external table comments(
videoId string,
uploader string,
age int,
category array,
length int,
views int,
rate float,
ratings int,
comments int,
relatedId array)
row format delimited
fields terminated by “:”
collection items terminated by “,”;

4.2加载第3步的结果数据到外部表中
请写出加载语句到rate表:
load data local inpath ‘/export/rate.txt’ into table video.rate;

请写出加载语句到comments表:
load data local inpath ‘/export/comments.txt’ into table video.comments;

4.3创建hive管理表与HBase进行映射
给出此步骤的语句
Hive中的rate,comments两个表分别对应hbase中的hbase_rate,hbase_comments两个表
创建hbase_rate表并进行映射:
create table hbase_rate2( rowkey STRING,
videoId string,
uploader string,
age int,
category array,
length int,
views int,
rate float,
ratings int,
comments int,
relatedId array)
stored by ‘org.apache.hadoop.hive.hbase.HBaseStorageHandler’
with serdeproperties(“hbase.columns.mapping” = “cf:videoId,cf:uploader,cf:age,cf:category,cf:length,cf:views,cf:rate,cf:ratings,cf:comments,cf:relatedId”)
tblproperties(“hbase.table.name” = “hbase_rate2”);

创建hbase_comments表并进行映射:
create table hbase_comments( rowkey STRING,
videoId string,
uploader string,
age int,
category array,
length int,
views int,
rate float,
ratings int,
comments int,
relatedId array)
stored by ‘org.apache.hadoop.hive.hbase.HBaseStorageHandler’
with serdeproperties(“hbase.columns.mapping” = “cf:videoId,cf:uploader,cf:age,cf:category,cf:length,cf:views,cf:rate,cf:ratings,cf:comments,cf:relatedId”)
tblproperties(“hbase.table.name” = “hbase_comments”);

4.4请写出通过insert overwrite select,插入hbase_rate表的语句
insert overwrite table hbase_rate2 select row_number() over (),rate.* from rate;

请写出通过insert overwrite select,插入hbase_comments表的语句
insert overwrite table hbase_comments select row_number() over (),comments.* from comments;

  • 3
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值