Spark Project: A Simulated Real-Time Analysis System for Website User Behavior (Part 2)

1) Install HBase

https://blog.csdn.net/hailunw/article/details/119057361

2) Create the tables in HBase

[user@NewBieSlave1 hbase-2.3.5]$ hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/user/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/user/hbase-2.3.5/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.3.5, rfd3fdc08d1cd43eb3432a1a70d31c3aece6ecabe, Thu Mar 25 20:50:15 UTC 2021
Took 0.0014 seconds                                                                                                                                                                        
hbase(main):001:0> create 'course_clickcount','info'
Created table course_clickcount
Took 1.1887 seconds                                                                                                                                                                        
=> Hbase::Table - course_clickcount
hbase(main):002:0> create 'course_search_clickcount','info'
Created table course_search_clickcount
Took 0.6424 seconds                                                                                                                                                                        
=> Hbase::Table - course_search_clickcount
hbase(main):003:0> list
TABLE                                                                                                                                                                                      
category_clickcount                                                                                                                                                                        
course_clickcount                                                                                                                                                                          
course_search_clickcount                                                                                                                                                                   
helloWorld                                                                                                                                                                                 
4 row(s)
Took 0.0186 seconds                                                                                                                                                                        
=> ["category_clickcount", "course_clickcount", "course_search_clickcount", "helloWorld"]
hbase(main):004:0> describe 'course_clickcount'
Table course_clickcount is ENABLED                                                                                                                                                         
course_clickcount                                                                                                                                                                          
COLUMN FAMILIES DESCRIPTION                                                                                                                                                                
{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

1 row(s)
Quota is disabled
Took 0.1509 seconds                                                                                                                                                                        
hbase(main):005:0> scan 'course_clickcount'
ROW                                             COLUMN+CELL                                                                                                                                
0 row(s)
Took 0.0960 seconds                    

3) Create the entity classes ClickLog, CourseClickCount, and CourseSearchClickCount
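
The post does not show the class bodies, so the following is only a minimal sketch of what the three entities might look like as Scala case classes. All field names (ip, time, courseId, statusCode, referer, dayCourse, daySearchCourse, clickCount) are assumptions chosen for illustration. Only the class names come from the text above.

```scala
// One cleaned access-log record (field names are hypothetical).
final case class ClickLog(
  ip: String,         // client IP address
  time: String,       // access time, e.g. in compact form 20210725100530
  courseId: Int,      // id of the course page that was visited
  statusCode: Int,    // HTTP status code of the request
  referer: String     // referring URL, or "-" when absent
)

// Click total for one course on one day.
// dayCourse is assumed to be an HBase row key such as 20210725_128.
final case class CourseClickCount(dayCourse: String, clickCount: Long)

// Click total for one course on one day, arriving from one search engine.
// daySearchCourse is assumed to be a row key such as 20210725_www.baidu.com_128.
final case class CourseSearchClickCount(daySearchCourse: String, clickCount: Long)
```

Case classes give value equality and a readable toString for free, which is convenient when records are shuffled and logged inside a streaming job.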

 

4) Create a date-format conversion utility class (implemented in Scala)
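
As a rough illustration of such a utility, the sketch below converts a timestamp from one text format to another with java.time. The object name DateUtils, the method name toCompact, and both patterns (input "yyyy-MM-dd HH:mm:ss", output "yyyyMMddHHmmss") are assumptions. The post does not say which formats or which date library the author used.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object DateUtils {
  // Assumed format of the time field in the raw log.
  private val inputFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
  // Assumed compact format used later to build day-based row keys.
  private val targetFormat = DateTimeFormatter.ofPattern("yyyyMMddHHmmss")

  /** Converts e.g. "2021-07-25 10:05:30" to "20210725100530". */
  def toCompact(time: String): String =
    LocalDateTime.parse(time, inputFormat).format(targetFormat)
}
```

DateTimeFormatter instances are immutable and thread-safe, so they can be shared as fields of a singleton object that several executor threads call at once.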

5) Create the HBase DAO classes CourseClickCountDAO and CourseSearchClickCountDAO
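
One possible shape for CourseClickCountDAO is sketched below with the standard HBase client API. The table name course_clickcount and the column family info come from the shell session in step 2. The qualifier click_count, the method names save and count, and the CourseClickCount fields are assumptions. The sketch needs the hbase-client library and an hbase-site.xml on the classpath, and it is not the author's implementation.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes

// Hypothetical entity: row key (e.g. 20210725_128) plus a click count.
final case class CourseClickCount(dayCourse: String, clickCount: Long)

object CourseClickCountDAO {
  val tableName = "course_clickcount"   // created in step 2
  val columnFamily = "info"             // created in step 2
  val qualifier = "click_count"         // assumed column name

  private val cf = Bytes.toBytes(columnFamily)
  private val col = Bytes.toBytes(qualifier)

  // Connections are heavyweight, so one is created lazily and reused.
  private lazy val connection: Connection =
    ConnectionFactory.createConnection(HBaseConfiguration.create())

  /** Adds each clickCount to the counter stored under its row key. */
  def save(counts: Seq[CourseClickCount]): Unit = {
    val table = connection.getTable(TableName.valueOf(tableName))
    try counts.foreach { c =>
      table.incrementColumnValue(Bytes.toBytes(c.dayCourse), cf, col, c.clickCount)
    } finally table.close()
  }

  /** Returns the stored counter for a row key, or 0 if the row is absent. */
  def count(dayCourse: String): Long = {
    val table = connection.getTable(TableName.valueOf(tableName))
    try {
      val value = table.get(new Get(Bytes.toBytes(dayCourse))).getValue(cf, col)
      if (value == null) 0L else Bytes.toLong(value)
    } finally table.close()
  }
}
```

Using incrementColumnValue lets HBase add to the existing total on the server side, so each streaming batch only has to send its own delta. A CourseSearchClickCountDAO could follow the same pattern against the course_search_clickcount table.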

6) Modify the Spark Streaming class that reads from the Kafka cluster to add data-cleaning logic
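
The cleaning step can be pictured as a pure function from a raw log line to an optional record, which the streaming job would apply to each Kafka message. The sketch below assumes a tab-separated line of the form ip, time, request, status code, referer, with course pages under /class/<id>.html. That layout, the LogCleaner object, and the ClickLog fields are all assumptions, since the log format is defined in an earlier part of this series.

```scala
import scala.util.Try

// Hypothetical cleaned record.
final case class ClickLog(ip: String, time: String, courseId: Int,
                          statusCode: Int, referer: String)

object LogCleaner {
  /** Parses one raw line; returns None for malformed lines or non-course pages. */
  def parse(line: String): Option[ClickLog] = {
    // The -1 limit keeps trailing empty fields instead of dropping them.
    val fields = line.split("\t", -1)
    if (fields.length != 5) None
    else {
      // fields(2) is assumed to look like: GET /class/128.html HTTP/1.1
      val request = fields(2).split(" ")
      if (request.length < 2 || !request(1).startsWith("/class/")) None
      else for {
        courseId <- Try(request(1).stripPrefix("/class/").takeWhile(_ != '.').toInt).toOption
        status   <- Try(fields(3).toInt).toOption
      } yield ClickLog(fields(0), fields(1), courseId, status, fields(4))
    }
  }
}
```

Inside the job, something like flatMap over the message values with LogCleaner.parse would drop every line that yields None, leaving only course-page visits.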

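7) Modify the Spark Streaming class that reads from the Kafka cluster to add data cleaning plus the logic that writes the aggregated statistics to the database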
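
The per-batch statistics can be sketched on plain Scala collections, as a stand-in for the Spark map and reduceByKey calls the real job would use. The example below is not Spark code. It only shows how the two kinds of row key might be built and counted. The key layouts (day_courseId and day_host_courseId), the ClickStats object, and all field names are assumptions, and the time field is assumed to already be in compact yyyyMMddHHmmss form.

```scala
import java.net.URI
import scala.util.Try

// Hypothetical entities, repeated here so the sketch is self-contained.
final case class ClickLog(ip: String, time: String, courseId: Int,
                          statusCode: Int, referer: String)
final case class CourseClickCount(dayCourse: String, clickCount: Long)
final case class CourseSearchClickCount(daySearchCourse: String, clickCount: Long)

object ClickStats {
  // First 8 characters of the compact timestamp are the day, e.g. 20210725.
  private def day(log: ClickLog): String = log.time.take(8)

  /** Clicks per course per day for one batch of cleaned logs. */
  def courseCounts(batch: Seq[ClickLog]): Seq[CourseClickCount] =
    batch.groupBy(log => s"${day(log)}_${log.courseId}")
      .map { case (key, logs) => CourseClickCount(key, logs.size.toLong) }
      .toSeq

  /** Clicks per course per day per referring host; logs without a host are skipped. */
  def searchCounts(batch: Seq[ClickLog]): Seq[CourseSearchClickCount] =
    batch.flatMap { log =>
        // getHost returns null for values such as "-", hence the Option wrapper.
        Try(Option(new URI(log.referer).getHost)).toOption.flatten
          .map(host => s"${day(log)}_${host}_${log.courseId}")
      }
      .groupBy(identity)
      .map { case (key, hits) => CourseSearchClickCount(key, hits.size.toLong) }
      .toSeq
}
```

In the streaming job, the resulting per-batch counts would then be handed to the DAO classes from step 5, typically once per partition so that each partition reuses a single HBase table handle.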

 
