With this, I suppose I won't end up working on a factory floor after all.
I. Big Data Cluster Component Deployment
1. Base Environment Configuration
Hostname | IP Address | Account | Password |
---|---|---|---|
master | 192.168.1.100 | root | password |
slave1 | 192.168.1.101 | root | password |
slave2 | 192.168.1.102 | root | password |
Set the hostname on each of the three nodes:
[root@master ~]# hostnamectl set-hostname master
slave1:
[root@localhost ~]# hostnamectl set-hostname slave1
[root@localhost ~]# bash
[root@slave1 ~]#
slave2:
[root@localhost ~]# hostnamectl set-hostname slave2
[root@localhost ~]# bash
[root@slave2 ~]#
Configure a static IP:
master:
Command (the interface is ens35, as shown by ip a below):
[root@master ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens35
File contents:
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens35
DEVICE=ens35
ONBOOT=yes
IPADDR=192.168.1.100
NETMASK=255.255.255.0
GATEWAY=192.168.1.254
DNS1=114.114.114.114
DNS2=8.8.8.8
Restart the network service:
[root@master ~]# systemctl restart network
Check the IP address:
[root@master ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: ens35: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:0c:29:69:e9:e8 brd ff:ff:ff:ff:ff:ff inet 192.168.1.100/24 brd 192.168.1.255 scope global dynamic ens35 valid_lft 1111sec preferred_lft 1111sec inet6 fe80::2859:3941:9736:7030/64 scope link valid_lft forever preferred_lft forever 3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN qlen 1000 link/ether 52:54:00:00:28:5e brd ff:ff:ff:ff:ff:ff inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0 valid_lft forever preferred_lft forever 4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 1000 link/ether 52:54:00:00:28:5e brd ff:ff:ff:ff:ff:ff [root@master ~]#
slave1:
Command:
[root@slave1 ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens35
File contents:
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens35
DEVICE=ens35
ONBOOT=yes
IPADDR=192.168.1.101
NETMASK=255.255.255.0
GATEWAY=192.168.1.254
DNS1=114.114.114.114
DNS2=8.8.8.8
Restart the network service:
[root@slave1 ~]# systemctl restart network
Check the IP address:
[root@slave1 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: ens35: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:0c:29:75:1f:92 brd ff:ff:ff:ff:ff:ff inet 192.168.1.101/24 brd 192.168.1.255 scope global dynamic ens35 valid_lft 1507sec preferred_lft 1507sec inet6 fe80::f66b:758e:9c24:88b9/64 scope link valid_lft forever preferred_lft forever 3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN qlen 1000 link/ether 52:54:00:4b:05:cc brd ff:ff:ff:ff:ff:ff inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0 valid_lft forever preferred_lft forever 4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 1000 link/ether 52:54:00:4b:05:cc brd ff:ff:ff:ff:ff:ff [root@slave1 ~]#
slave2:
Command:
[root@slave2 ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens35
File contents:
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens35
DEVICE=ens35
ONBOOT=yes
IPADDR=192.168.1.102
NETMASK=255.255.255.0
GATEWAY=192.168.1.254
DNS1=114.114.114.114
DNS2=8.8.8.8
Restart the network service:
[root@slave2 ~]# systemctl restart network
Check the IP address:
[root@slave2 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: ens35: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:0c:29:4b:92:6c brd ff:ff:ff:ff:ff:ff inet 192.168.1.102/24 brd 192.168.1.255 scope global dynamic ens35 valid_lft 416sec preferred_lft 416sec inet6 fe80::2162:1337:a894:11ce/64 scope link valid_lft forever preferred_lft forever 3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN qlen 1000 link/ether 52:54:00:4b:05:cc brd ff:ff:ff:ff:ff:ff inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0 valid_lft forever preferred_lft forever 4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 1000 link/ether 52:54:00:4b:05:cc brd ff:ff:ff:ff:ff:ff [root@slave2 ~]#
Map the IPs to hostnames in /etc/hosts on the master node:
Command:
[root@master ~]# vi /etc/hosts
Contents:
192.168.1.100 master
192.168.1.101 slave1
192.168.1.102 slave2
Copy master's /etc/hosts file to the /etc directory on slave1 and slave2:
[root@master ~]# scp /etc/hosts
hosts        hosts.allow  hosts.deny
[root@master ~]# scp /etc/hosts slave1:/etc/
The authenticity of host 'slave1 (192.168.1.101)' can't be established.
ECDSA key fingerprint is SHA256:4lVQUkkZo5DlZodLDGbEP3NZKrLvXNW/qeIGRch1eNI.
ECDSA key fingerprint is MD5:a4:e8:a1:a8:45:d1:69:f5:94:d3:9c:99:1f:7d:c2:d5.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,192.168.1.101' (ECDSA) to the list of known hosts.
root@slave1's password:
hosts                                         100%  221    64.7KB/s   00:00
[root@master ~]# scp /etc/hosts slave2:/etc/
The authenticity of host 'slave2 (192.168.1.102)' can't be established.
ECDSA key fingerprint is SHA256:4lVQUkkZo5DlZodLDGbEP3NZKrLvXNW/qeIGRch1eNI.
ECDSA key fingerprint is MD5:a4:e8:a1:a8:45:d1:69:f5:94:d3:9c:99:1f:7d:c2:d5.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,192.168.1.102' (ECDSA) to the list of known hosts.
root@slave2's password:
hosts                                         100%  221   257.8KB/s   00:00
[root@master ~]#
Note: scp -r copies a directory; scp (without -r) copies a single file.
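With /etc/hosts distributed to all three nodes, a quick sanity check (not in the original transcript, using the hostnames configured above) is to ping each node by name from master; each host should answer from its 192.168.1.x address:
[root@master ~]# ping -c 1 slave1
[root@master ~]# ping -c 1 slave2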
Set up password-less SSH login among the three nodes:
master:
[root@master ~]# ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/root/.ssh/id_rsa): Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /root/.ssh/id_rsa. Your public key has been saved in /root/.ssh/id_rsa.pub. The key fingerprint is: SHA256:s3RzKkZ7vcTmll5uQW0ldRuil5NGgC34K/qe33+OgkI root@master The key's randomart image is: +---[RSA 2048]----+ | . o..o oo| | . o .o = =| | . .. * +.| | . o o o| | S + .. . | | E * * . | | o * + =... | | . o.+.=o+o. | | o+o. +*++. | +----[SHA256]-----+ [root@master ~]# ssh-copy-id master /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" The authenticity of host 'master (192.168.1.100)' can't be established. ECDSA key fingerprint is SHA256:2lG2/rO51PcX3P7MHU9/WjpTuuAVJS6yYZiLDA+gUZ4. ECDSA key fingerprint is MD5:c9:06:dc:69:31:20:01:4f:bb:26:db:2e:e0:da:92:94. Are you sure you want to continue connecting (yes/no)? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@master's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'master'" and check to make sure that only the key(s) you wanted were added. [root@master ~]# ssh-copy-id slave1 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@slave1's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'slave1'" and check to make sure that only the key(s) you wanted were added. [root@master ~]# ssh-copy-id slave2 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@slave2's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'slave2'" and check to make sure that only the key(s) you wanted were added. [root@master ~]#
slave1:
[root@slave1 ~]# ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/root/.ssh/id_rsa): Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /root/.ssh/id_rsa. Your public key has been saved in /root/.ssh/id_rsa.pub. The key fingerprint is: SHA256:wXPBu3R+MWJQv7fnyGhgvCrsNyiHbzzzJzQB/kkMHn4 root@slave1 The key's randomart image is: +---[RSA 2048]----+ | .... | | +. o. . | | + =+ .o . | | + E+o + o. | | +S= = ..o.| | = = . ...| | + o o o . ..| | o X + o .o o.| | *o*o= .. o .| +----[SHA256]-----+ [root@slave1 ~]# ssh-copy-id master /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" The authenticity of host 'master (192.168.1.100)' can't be established. ECDSA key fingerprint is SHA256:2lG2/rO51PcX3P7MHU9/WjpTuuAVJS6yYZiLDA+gUZ4. ECDSA key fingerprint is MD5:c9:06:dc:69:31:20:01:4f:bb:26:db:2e:e0:da:92:94. Are you sure you want to continue connecting (yes/no)? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@master's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'master'" and check to make sure that only the key(s) you wanted were added. [root@slave1 ~]# ssh-copy-id slave1 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" The authenticity of host 'slave1 (192.168.1.101)' can't be established. ECDSA key fingerprint is SHA256:4lVQUkkZo5DlZodLDGbEP3NZKrLvXNW/qeIGRch1eNI. ECDSA key fingerprint is MD5:a4:e8:a1:a8:45:d1:69:f5:94:d3:9c:99:1f:7d:c2:d5. Are you sure you want to continue connecting (yes/no)? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@slave1's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'slave1'" and check to make sure that only the key(s) you wanted were added. [root@slave1 ~]# ssh-copy-id slave2 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" The authenticity of host 'slave2 (192.168.1.102)' can't be established. ECDSA key fingerprint is SHA256:4lVQUkkZo5DlZodLDGbEP3NZKrLvXNW/qeIGRch1eNI. ECDSA key fingerprint is MD5:a4:e8:a1:a8:45:d1:69:f5:94:d3:9c:99:1f:7d:c2:d5. Are you sure you want to continue connecting (yes/no)? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@slave2's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'slave2'" and check to make sure that only the key(s) you wanted were added. [root@slave1 ~]#
slave2:
[root@slave2 ~]# ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/root/.ssh/id_rsa): Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /root/.ssh/id_rsa. Your public key has been saved in /root/.ssh/id_rsa.pub. The key fingerprint is: SHA256:RCXC1qIT61bJFnA1ycNG5ET4zaQEv6duUL7y7UvLq+4 root@slave2 The key's randomart image is: +---[RSA 2048]----+ | .o+@Xo. | | ..**Oo. | | * B+* | | + =.+.o | | . + oS . | | o . .o | | . .... | | ..o+ . | | *EoBo | +----[SHA256]-----+ [root@slave2 ~]# ssh-copy-id master /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" The authenticity of host 'master (192.168.1.100)' can't be established. ECDSA key fingerprint is SHA256:2lG2/rO51PcX3P7MHU9/WjpTuuAVJS6yYZiLDA+gUZ4. ECDSA key fingerprint is MD5:c9:06:dc:69:31:20:01:4f:bb:26:db:2e:e0:da:92:94. Are you sure you want to continue connecting (yes/no)? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@master's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'master'" and check to make sure that only the key(s) you wanted were added. [root@slave2 ~]# ssh-copy-id slave1 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" The authenticity of host 'slave1 (192.168.1.101)' can't be established. ECDSA key fingerprint is SHA256:4lVQUkkZo5DlZodLDGbEP3NZKrLvXNW/qeIGRch1eNI. ECDSA key fingerprint is MD5:a4:e8:a1:a8:45:d1:69:f5:94:d3:9c:99:1f:7d:c2:d5. Are you sure you want to continue connecting (yes/no)? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@slave1's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'slave1'" and check to make sure that only the key(s) you wanted were added. [root@slave2 ~]# ssh-copy-id slave2 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" The authenticity of host 'slave2 (192.168.1.102)' can't be established. ECDSA key fingerprint is SHA256:4lVQUkkZo5DlZodLDGbEP3NZKrLvXNW/qeIGRch1eNI. ECDSA key fingerprint is MD5:a4:e8:a1:a8:45:d1:69:f5:94:d3:9c:99:1f:7d:c2:d5. Are you sure you want to continue connecting (yes/no)? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@slave2's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'slave2'" and check to make sure that only the key(s) you wanted were added. [root@slave2 ~]#
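To confirm that password-less login works, a simple check (not part of the original transcript) is to run a remote command from master; each command should print the remote hostname without asking for a password, and the same check can be repeated from slave1 and slave2:
[root@master ~]# ssh slave1 hostname
[root@master ~]# ssh slave2 hostname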
master:
Stop the firewall:
[root@master ~]# systemctl stop firewalld
Disable the firewall from starting on boot:
[root@master ~]# systemctl disable firewalld
slave1:
Stop the firewall:
[root@slave1 ~]# systemctl stop firewalld
Disable the firewall from starting on boot:
[root@slave1 ~]# systemctl disable firewalld
slave2:
Stop the firewall:
[root@slave2 ~]# systemctl stop firewalld
Disable the firewall from starting on boot:
[root@slave2 ~]# systemctl disable firewalld
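As an optional verification (not in the original steps), check the firewall state on each node; is-active should report inactive and is-enabled should report disabled:
[root@master ~]# systemctl is-active firewalld
[root@master ~]# systemctl is-enabled firewalld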
2. Hadoop Cluster Deployment
2.1 JDK Installation and Deployment
Create the directory to extract into:
[root@master softwares]# mkdir /opt/module
[root@master softwares]# ls /opt/
module  softwares
[root@master softwares]#
Extract the JDK:
[root@master softwares]# tar -zxvf jdk-8u212-linux-x64.tar.gz -C /opt/module/
Rename the JDK directory:
[root@master softwares]# ls /opt/module/
jdk1.8.0_212
[root@master softwares]# cd /opt/module/
[root@master module]# ls
jdk1.8.0_212
[root@master module]# mv jdk1.8.0_212/ jdk
[root@master module]# ls
jdk
[root@master module]#
Edit the environment variable configuration:
[root@master module]# vi /etc/profile
Content to add:
# JAVA_HOME
export JAVA_HOME=/opt/module/jdk
export PATH="$JAVA_HOME/bin:$PATH"
Run source to make the configuration take effect:
[root@master module]# source /etc/profile
[root@master module]#
Run java -version to check the Java version; seeing the version means the configuration took effect:
[root@master module]# java -version
java version "1.8.0_212"
Java(TM) SE Runtime Environment (build 1.8.0_212-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.212-b10, mixed mode)
[root@master module]#
Create the module directory on the slave1 and slave2 nodes:
[root@slave1 ~]# mkdir /opt/module/
[root@slave1 ~]#
[root@slave2 ~]# mkdir /opt/module
Copy the JDK to the slave1 and slave2 nodes:
[root@master module]# scp -r /opt/module/jdk/ slave1:/opt/module/jdk
[root@master module]# scp -r /opt/module/jdk/ slave2:/opt/module/jdk
Configure the environment variables on slave1 and slave2:
[root@slave1 jdk]# vi /etc/profile
[root@slave1 jdk]# source /etc/profile
[root@slave1 jdk]# java -version
java version "1.8.0_212"
Java(TM) SE Runtime Environment (build 1.8.0_212-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.212-b10, mixed mode)
[root@slave1 jdk]#
[root@slave2 ~]# vi /etc/profile
[root@slave2 ~]# source /etc/profile
[root@slave2 ~]# java -version
java version "1.8.0_212"
Java(TM) SE Runtime Environment (build 1.8.0_212-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.212-b10, mixed mode)
[root@slave2 ~]#
2.2 Hadoop Cluster Configuration
- Extract Hadoop into the /opt/module/ directory:
[root@master softwares]# tar -zxvf /opt/softwares/hadoop-3.1.3.tar.gz -C /opt/module/
- Rename the directory and configure the Hadoop environment variables:
[root@master module]# mv hadoop-3.1.3/ hadoop
[root@master module]# ls
hadoop  jdk
[root@master module]# vi /etc/profile
Configuration content:
# HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop
export PATH="$HADOOP_HOME/bin:$PATH"
Run source and check the Hadoop version:
[root@master module]# hadoop version
Hadoop 3.1.3
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r ba631c436b806728f8ec2f54ab1e289526c90579
Compiled by ztang on 2019-09-12T02:47Z
Compiled with protoc 2.5.0
From source with checksum ec785077c385118ac91aadde5ec9799
This command was run using /opt/module/hadoop/share/hadoop/common/hadoop-common-3.1.3.jar
[root@master module]#
- Edit the Hadoop configuration files:
Core configuration file
Configure core-site.xml
[root@master hadoop]# vi /opt/module/hadoop/etc/hadoop/core-site.xml
File contents:
<configuration>
    <!-- NameNode address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <!-- Hadoop data storage directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop/data</value>
    </property>
</configuration>
HDFS configuration file:
Configure hdfs-site.xml
[root@master hadoop]# vim /opt/module/hadoop/etc/hadoop/hdfs-site.xml
File contents:
<configuration>
    <!-- NameNode web UI address -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>master:50070</value>
    </property>
    <!-- HDFS replication factor -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
YARN configuration file:
Configure yarn-site.xml
[root@master hadoop]# vim /opt/module/hadoop/etc/hadoop/yarn-site.xml
Configuration content:
<configuration>
    <!-- Site specific YARN configuration properties -->
    <!-- Use mapreduce_shuffle as the auxiliary service for MR -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- ResourceManager address -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
</configuration>
MapReduce configuration file:
Configure mapred-site.xml
[root@master hadoop]# vim /opt/module/hadoop/etc/hadoop/mapred-site.xml
Configuration content:
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- JobHistory server address -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <!-- JobHistory web UI address -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>
Configure workers
[root@master hadoop]# vim /opt/module/hadoop/etc/hadoop/workers
Configuration content:
master
slave1
slave2
Configure hadoop-env.sh
[root@master hadoop]# vi /opt/module/hadoop/etc/hadoop/hadoop-env.sh
Configuration content:
export JAVA_HOME=/opt/module/jdk
Configure mapred-env.sh
[root@master hadoop]# vi /opt/module/hadoop/etc/hadoop/mapred-env.sh
Configuration content:
export JAVA_HOME=/opt/module/jdk
Configure yarn-env.sh
[root@master hadoop]# vi /opt/module/hadoop/etc/hadoop/yarn-env.sh
Configuration content:
export JAVA_HOME=/opt/module/jdk
Copy Hadoop to the slave1 and slave2 nodes:
[root@master hadoop]# scp -r /opt/module/hadoop/ slave1:/opt/module/hadoop/
[root@master hadoop]# scp -r /opt/module/hadoop/ slave2:/opt/module/hadoop/
Format the NameNode:
[root@master hadoop]# hdfs namenode -format
Configure the HDFS and YARN users for Hadoop in /etc/profile:
[root@master hadoop]# vi /etc/profile
Configuration content:
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
Run source /etc/profile to make the configuration take effect:
[root@master hadoop]# source /etc/profile
Start the Hadoop cluster:
[root@master hadoop]# pwd
/opt/module/hadoop
[root@master hadoop]# ./sbin/start-all.sh
Use jps to check whether the Hadoop processes started:
[root@master hadoop]# jps
7472 DataNode
7297 NameNode
7682 SecondaryNameNode
7939 ResourceManager
8467 Jps
8091 NodeManager
[root@master hadoop]#
slave1:
[root@slave1 hadoop]# jps
5194 NodeManager
5325 Jps
5087 DataNode
[root@slave1 hadoop]#
slave2:
[root@slave2 hadoop]# jps
5065 NodeManager
5196 Jps
4958 DataNode
[root@slave2 hadoop]#
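As an extra smoke test (not in the original write-up), you can confirm that all DataNodes registered and that HDFS accepts writes; the paths used here are only examples:
[root@master hadoop]# hdfs dfsadmin -report | grep "Live datanodes"
[root@master hadoop]# hdfs dfs -mkdir -p /tmp/test
[root@master hadoop]# hdfs dfs -put /etc/hosts /tmp/test/
[root@master hadoop]# hdfs dfs -ls /tmp/test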
3. Hive Component Deployment
3.1 MySQL Database Deployment
Package location: /opt/softwares/Mysql
Account: root
Password: 123456
- Remove mariadb:
[root@master hadoop]# rpm -qa | grep mariadb
mariadb-libs-5.5.56-2.el7.x86_64
[root@master hadoop]# rpm -e --nodeps mari
mariadb-libs  marisa
[root@master hadoop]# rpm -e --nodeps mariadb-libs
[root@master hadoop]#
- Install the MySQL packages in order with rpm:
Package location:
[root@master softwares]# cd Mysql/
[root@master Mysql]# ls
01_mysql-community-common-5.7.16-1.el7.x86_64.rpm
02_mysql-community-libs-5.7.16-1.el7.x86_64.rpm
03_mysql-community-libs-compat-5.7.16-1.el7.x86_64.rpm
04_mysql-community-client-5.7.16-1.el7.x86_64.rpm
05_mysql-community-server-5.7.16-1.el7.x86_64.rpm
mysql-connector-java-5.1.27-bin.jar
[root@master Mysql]# pwd
/opt/softwares/Mysql
[root@master Mysql]#
Install the packages in order with rpm:
[root@master Mysql]# rpm -ivh 01_mysql-community-common-5.7.16-1.el7.x86_64.rpm
[root@master Mysql]# rpm -ivh 02_mysql-community-libs-5.7.16-1.el7.x86_64.rpm
[root@master Mysql]# rpm -ivh 03_mysql-community-libs-compat-5.7.16-1.el7.x86_64.rpm
[root@master Mysql]# rpm -ivh 04_mysql-community-client-5.7.16-1.el7.x86_64.rpm
[root@master Mysql]# rpm -ivh 05_mysql-community-server-5.7.16-1.el7.x86_64.rpm
Check whether the installation succeeded:
[root@master Mysql]# rpm -qa | grep mysql
mysql-community-server-5.7.16-1.el7.x86_64
mysql-community-libs-5.7.16-1.el7.x86_64
mysql-community-libs-compat-5.7.16-1.el7.x86_64
mysql-community-client-5.7.16-1.el7.x86_64
mysql-community-common-5.7.16-1.el7.x86_64
[root@master Mysql]#
Start MySQL:
[root@master Mysql]# service mysqld start
Redirecting to /bin/systemctl start mysqld.service
Find the temporary password and log in to MySQL with it:
[root@master Mysql]# grep 'temporary password' /var/log/mysqld.log 2022-11-14T13:46:27.335510Z 1 [Note] A temporary password is generated for root@localhost: +N_r2eOrOvk* [root@master Mysql]# mysql -uroot -p+N_r2eOrOvk* mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.7.16 Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql>
Change the MySQL password and enable remote access:
mysql> set global validate_password_policy=0;
Query OK, 0 rows affected (0.00 sec)
mysql> set global validate_password_length=1;
Query OK, 0 rows affected (0.00 sec)
mysql> alter user 'root'@'localhost' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
mysql> grant all privileges on *.* to 'root'@'%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> exit;
After exiting, log in to MySQL again with the new password to test it:
[root@master Mysql]# mysql -uroot -p123456 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 5 Server version: 5.7.16 MySQL Community Server (GPL) Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql>
3.2 Hive Deployment
Extract Hive, rename it to hive, and add the Hive environment variables to /etc/profile:
[root@master Mysql]# tar -zxvf /opt/softwares/apache-hive-3.1.2-bin.tar.gz -C /opt/module/
[root@master Mysql]# mv /opt/module/apache-hive-3.1.2-bin/ /opt/module/hive
[root@master Mysql]# ls /opt/module/
hadoop  hive  jdk
[root@master Mysql]# vi /etc/profile
[root@master Mysql]# source /etc/profile
[root@master Mysql]#
Environment variable configuration:
# HIVE_HOME
export HIVE_HOME=/opt/module/hive
export PATH="$HIVE_HOME/bin:$PATH"
Point the Hive metastore at MySQL:
Copy the JDBC driver into Hive's lib directory:
[root@master Mysql]# cp /opt/softwares/Mysql/mysql-connector-java-5.1.27-bin.jar /opt/module/hive/lib/
[root@master Mysql]#
In Hive's conf directory, configure hive-site.xml:
[root@master Mysql]# vi /opt/module/hive/conf/hive-site.xml
Configuration file contents:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://master:3306/metastore?useSSL=false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>master</value>
    </property>
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
</configuration>
Initialize the metastore database:
Log in to MySQL and create Hive's metastore database:
[root@master Mysql]# mysql -uroot -p123456 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 6 Server version: 5.7.16 MySQL Community Server (GPL) Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create database metastore; Query OK, 1 row affected (0.00 sec) mysql> quit; Bye [root@master Mysql]#
Initialize the Hive metastore:
[root@master Mysql]# schematool -initSchema -dbType mysql -verbose
Start the Hive client and list the databases:
[root@master Mysql]# hive which: no hbase in (/opt/module/hadoop/bin:/opt/module/hive/bin:/opt/module/jdk/bin:/usr/local/python3/bin:/opt/module/hadoop/bin:/opt/module/jdk/bin:/usr/local/python3/bin:/opt/module/hadoop/bin:/opt/module/jdk/bin:/usr/local/python3/bin:/opt/module/jdk/bin:/usr/local/python3/bin:/usr/local/python3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/module/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/module/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Hive Session ID = 69642fdc-0c17-4637-8e11-3415e32bfd4d Logging initialized using configuration in jar:file:/opt/module/hive/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. Hive Session ID = 1ee7f81d-bfad-4ab6-aa30-6c4c26976021 hive (default)> show databases; OK database_name default Time taken: 0.543 seconds, Fetched: 1 row(s) hive (default)>
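A quick end-to-end check of the metastore (a hypothetical example, not part of the original steps) is to create a small table, insert a row, and read it back; the database and table names below are illustrative only:
hive (default)> create database if not exists test_db;
hive (default)> create table test_db.t1(id int, name string);
hive (default)> insert into test_db.t1 values (1, 'a');
hive (default)> select * from test_db.t1;
If the insert runs (as a MapReduce job) and the select returns the row, Hive, the metastore, and HDFS are all working together.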
4. Spark Component Deployment
Extract the Spark package, rename it to spark, and configure the environment variables:
[root@master softwares]# tar -zxvf /opt/softwares/spark-3.1.3-bin-hadoop3.2.tgz -C /opt/module/
[root@master softwares]# mv /opt/module/spark-3.1.3-bin-hadoop3.2/ /opt/module/spark
[root@master softwares]# vi /etc/profile
Environment variable configuration:
# SPARK_HOME
export SPARK_HOME=/opt/module/spark
export PATH="$SPARK_HOME/bin:$PATH"
Run source to make the environment variables take effect:
[root@master softwares]# source /etc/profile
Rename the workers.template file and configure the worker nodes:
[root@master conf]# mv workers.template workers
[root@master conf]# vim workers
Configuration content:
master
slave1
slave2
Edit the spark-env.sh file:
[root@master conf]# cp spark-env.sh.template spark-env.sh
[root@master conf]# vim spark-env.sh
[root@master conf]#
Add the following content:
export JAVA_HOME=/opt/module/jdk
export HADOOP_HOME=/opt/module/hadoop
export SPARK_MASTER_IP=master
export SPARK_MASTER_PORT=7077
export SPARK_DIST_CLASSPATH=$(/opt/module/hadoop/bin/hadoop classpath)
export SPARK_YARN_USER_ENV="CLASSPATH=/opt/module/hadoop/etc/hadoop"
export HADOOP_CONF_DIR=/opt/module/hadoop/etc/hadoop
export YARN_CONF_DIR=/opt/module/hadoop/etc/hadoop
Copy Spark to the other two nodes:
[root@master conf]# scp -r /opt/module/spark/ slave1:/opt/module/
[root@master conf]# scp -r /opt/module/spark/ slave2:/opt/module/
Start the Spark cluster and check the processes:
[root@master spark]# /opt/module/spark/sbin/start-all.sh starting org.apache.spark.deploy.master.Master, logging to /opt/module/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out master: starting org.apache.spark.deploy.worker.Worker, logging to /opt/module/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-master.out slave1: starting org.apache.spark.deploy.worker.Worker, logging to /opt/module/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out slave2: starting org.apache.spark.deploy.worker.Worker, logging to /opt/module/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out [root@master spark]# jps 7472 DataNode 7297 NameNode 7682 SecondaryNameNode 7939 ResourceManager 11715 Master 11909 Jps 8091 NodeManager 11854 Worker [root@master spark]#
Run the SparkPi example to test Spark:
[root@master spark]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://master:7077 examples/jars/spark-examples_2.12-3.1.3.jar
Result:
2022-11-14 22:24:35,002 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 2022-11-14 22:24:35,010 INFO scheduler.DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 2.131 s 2022-11-14 22:24:35,013 INFO scheduler.DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job 2022-11-14 22:24:35,013 INFO scheduler.TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished 2022-11-14 22:24:35,014 INFO scheduler.DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 2.189453 s Pi is roughly 3.141475707378537 2022-11-14 22:24:35,046 INFO server.AbstractConnector: Stopped Spark@4fbb001b{HTTP/1.1, (http/1.1)}{0.0.0.0:4040} 2022-11-14 22:24:35,048 INFO ui.SparkUI: Stopped Spark web UI at http://master:4040 2022-11-14 22:24:35,050 INFO cluster.StandaloneSchedulerBackend: Shutting down all executors 2022-11-14 22:24:35,050 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down 2022-11-14 22:24:35,122 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! [root@master spark]#
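Because HADOOP_CONF_DIR and YARN_CONF_DIR are set in spark-env.sh, the same example can also be submitted to YARN instead of the standalone master (a sketch, assuming the YARN cluster started earlier is still running; the trailing 10 is just the number of partitions for the example):
[root@master spark]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client examples/jars/spark-examples_2.12-3.1.3.jar 10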
5. Flink Component Deployment
Extract Flink, rename the directory, and configure the environment variables:
[root@master spark]# tar -zxvf /opt/softwares/flink-1.14.6-bin-scala_2.12.tgz -C /opt/module/
[root@master spark]# mv /opt/module/flink-1.14.6/ /opt/module/flink/
[root@master spark]# vi /etc/profile
Configuration content:
# FLINK_HOME
export FLINK_HOME=/opt/module/flink
export PATH="$FLINK_HOME/bin:$PATH"
Run source /etc/profile:
[root@master spark]# source /etc/profile
[root@master spark]#
Edit the flink-conf.yaml configuration file:
[root@master conf]# vim /opt/module/flink/conf/flink-conf.yaml
Changed content:
jobmanager.rpc.address: master
Edit the workers configuration file:
[root@master conf]# vim /opt/module/flink/conf/workers
Changed content:
master
slave1
slave2
Distribute Flink to slave1 and slave2:
[root@master conf]# scp -r /opt/module/flink/ slave1:/opt/module/
[root@master conf]# scp -r /opt/module/flink/ slave2:/opt/module/
Start the Flink cluster and check the processes:
[root@master flink]# ./bin/start-cluster.sh Starting cluster. Starting standalonesession daemon on host master. Starting taskexecutor daemon on host master. Starting taskexecutor daemon on host slave1. Starting taskexecutor daemon on host slave2. [root@master flink]# jps 7472 DataNode 7297 NameNode 13617 TaskManagerRunner 7682 SecondaryNameNode 7939 ResourceManager 11715 Master 13732 Jps 8091 NodeManager 11854 Worker [root@master flink]#
Submit the test jar to verify Flink:
[root@master batch]# flink run -m master:8081 /opt/module/flink/examples/batch/WordCount.jar
Note: stop the Spark standalone cluster before running Flink.
Test result:
(under,1) (undiscover,1) (unworthy,1) (us,3) (we,4) (weary,1) (what,1) (when,2) (whether,1) (whips,1) (who,2) (whose,1) (will,1) (wish,1) (with,3) (would,2) (wrong,1) (you,1) [root@master batch]#
Flink web UI (http://192.168.1.100:8081/#/overview):
6. Flume Component Deployment
Extract Flume, rename the directory, and configure the environment variables:
[root@master module]# tar -zxvf /opt/softwares/apache-flume-1.9.0-bin.tar.gz -C /opt/module/
[root@master module]# mv /opt/module/apache-flume-1.9.0-bin/ /opt/module/flume
[root@master batch]# vim /etc/profile
Environment variable configuration:
# FLUME_HOME
export FLUME_HOME=/opt/module/flume
export PATH="$FLUME_HOME/bin:$PATH"
Run source /etc/profile:
[root@master batch]# source /etc/profile
[root@master batch]#
Delete the jar that conflicts between Flume and Hadoop:
cd /opt/module/flume/lib
[root@master lib]# rm -rf /opt/module/flume/lib/guava-11.0.2.jar
Edit the log4j configuration file:
[root@master lib]# vim /opt/module/flume/conf/log4j.properties
Changed content:
flume.log.dir=/opt/module/flume/logs
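To confirm the installation and the FLUME_HOME setting took effect (a check that is not in the original transcript), print the Flume version; it should report Apache Flume 1.9.0:
[root@master lib]# flume-ng version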
7. ZooKeeper Component Deployment
Extract ZooKeeper, rename the directory, and configure the environment variables:
[root@master lib]# tar -zxvf /opt/softwares/apache-zookeeper-3.5.7-bin.tar.gz -C /opt/module/
[root@master lib]# mv /opt/module/apache-zookeeper-3.5.7-bin/ /opt/module/zookeeper
[root@master lib]# vim /etc/profile
Environment variable changes:
# ZOOKEEPER_HOME
export ZOOKEEPER_HOME=/opt/module/zookeeper
export PATH="$ZOOKEEPER_HOME/bin:$PATH"
Run source /etc/profile to make the environment variables take effect:
[root@master lib]# source /etc/profile
[root@master lib]#
Configure the server ID:
Create a zkData directory under the zookeeper directory
[root@master zookeeper]# mkdir zkData
[root@master zookeeper]# cd zkData/
[root@master zkData]# vim myid
myid file contents:
2
Configure the zoo.cfg file
Rename zoo_sample.cfg to zoo.cfg and edit its contents:
cd ../conf
[root@master conf]# mv zoo_sample.cfg zoo.cfg
[root@master conf]# vim zoo.cfg
Changed content:
dataDir=/opt/module/zookeeper/zkData
# ZooKeeper internal communication and leader-election ports
server.2=master:2888:3888
server.3=slave1:2888:3888
server.4=slave2:2888:3888
Copy ZooKeeper to slave1 and slave2:
[root@master conf]# scp -r /opt/module/zookeeper/ slave1:/opt/module/
[root@master conf]# scp -r /opt/module/zookeeper/ slave2:/opt/module/
Change the myid contents on slave1 and slave2:
[root@slave1 module]# vim /opt/module/zookeeper/zkData/myid
3
[root@slave2 hadoop]# vim /opt/module/zookeeper/zkData/myid
4
Start the ZooKeeper cluster:
[root@slave2 hadoop]# /opt/module/zookeeper/bin/zkServer.sh start ZooKeeper JMX enabled by default Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg Starting zookeeper ... STARTED [root@slave2 hadoop]# [root@slave1 module]# /opt/module/zookeeper/bin/zkServer.sh start ZooKeeper JMX enabled by default Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg Starting zookeeper ... STARTED [root@slave1 module]# [root@master conf]# /opt/module/zookeeper/bin/zkServer.sh start ZooKeeper JMX enabled by default Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg Starting zookeeper ... STARTED [root@master conf]#
Check the ZooKeeper status on each node to verify the result:
[root@master conf]# /opt/module/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower
[root@master conf]#
[root@slave1 module]# /opt/module/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower
[root@slave1 module]#
[root@slave2 hadoop]# /opt/module/zookeeper/bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader
[root@slave2 hadoop]#
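An optional functional check (not in the original) is to connect with the CLI and create a test znode; the /test path and its value are illustrative only:
[root@master conf]# /opt/module/zookeeper/bin/zkCli.sh -server master:2181
[zk: master:2181(CONNECTED) 0] create /test "hello"
[zk: master:2181(CONNECTED) 1] get /test
[zk: master:2181(CONNECTED) 2] quit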
8. Kafka Component Deployment
Extract and rename Kafka, and configure the environment variables:
[root@master conf]# tar -zxvf /opt/softwares/kafka_2.12-3.0.0.tgz -C /opt/module/
[root@master conf]# mv /opt/module/kafka_2.12-3.0.0/ /opt/module/kafka
[root@master conf]# vim /etc/profile
Configuration content:
# KAFKA_HOME
export KAFKA_HOME=/opt/module/kafka
export PATH="$KAFKA_HOME/bin:$PATH"
Edit Kafka's server.properties configuration file:
[root@master config]# vim /opt/module/kafka/config/server.properties
Changed content:
# must be unique per broker
broker.id=0
# created automatically when Kafka starts
log.dirs=/opt/module/kafka/logs
# ZooKeeper connection string
zookeeper.connect=master:2181,slave1:2181,slave2:2181/kafka
Copy Kafka to the slave1 and slave2 nodes:
[root@master config]# scp -r /opt/module/kafka/ slave1:/opt/module/
[root@master config]# scp -r /opt/module/kafka/ slave2:/opt/module/
Change the broker.id value on the slave1 and slave2 nodes:
[root@slave1 module]# vim /opt/module/kafka/config/server.properties
broker.id=1
[root@slave2 hadoop]# vim /opt/module/kafka/config/server.properties
broker.id=2
Start Kafka on all three nodes:
[root@master config]# /opt/module/kafka/bin/kafka-server-start.sh -daemon /opt/module/kafka/config/server.properties
[root@slave1 module]# /opt/module/kafka/bin/kafka-server-start.sh -daemon /opt/module/kafka/config/server.properties
[root@slave2 hadoop]# /opt/module/kafka/bin/kafka-server-start.sh -daemon /opt/module/kafka/config/server.properties
Check the Kafka processes:
[root@master config]# jps 7472 DataNode 7297 NameNode 7682 SecondaryNameNode 7939 ResourceManager 15123 CliFrontend 18035 QuorumPeerMain 16388 StandaloneSessionClusterEntrypoint 18725 Kafka 16694 TaskManagerRunner 14170 CliFrontend 8091 NodeManager 18813 Jps [root@master config]# [root@slave1 module]# jps 10032 Jps 9383 QuorumPeerMain 8841 TaskManagerRunner 5194 NodeManager 9963 Kafka 5087 DataNode [root@slave1 module]# [root@slave2 hadoop]# jps 9202 QuorumPeerMain 9874 Jps 5065 NodeManager 8681 TaskManagerRunner 9804 Kafka 4958 DataNode [root@slave2 hadoop]#
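A simple produce/consume round trip (with a hypothetical topic named test; this check is not in the original transcript) confirms the brokers and the /kafka ZooKeeper chroot configured above are working:
[root@master config]# /opt/module/kafka/bin/kafka-topics.sh --bootstrap-server master:9092 --create --topic test --partitions 3 --replication-factor 2
[root@master config]# /opt/module/kafka/bin/kafka-console-producer.sh --bootstrap-server master:9092 --topic test
[root@master config]# /opt/module/kafka/bin/kafka-console-consumer.sh --bootstrap-server master:9092 --topic test --from-beginning
Messages typed into the producer should appear in the consumer.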
9. Maxwell Component Deployment
Extract and rename Maxwell, and configure the environment variables:
[root@master softwares]# tar -zxvf /opt/softwares/maxwell-1.29.2.tar.gz -C /opt/module/
[root@master softwares]# mv /opt/module/maxwell-1.29.2/ /opt/module/maxwell/
[root@master softwares]# vim /etc/profile
Changed content:
# MAXWELL_HOME
export MAXWELL_HOME=/opt/module/maxwell
export PATH="$MAXWELL_HOME/bin:$PATH"
Modify the MySQL configuration:
Enable the MySQL binlog; the binlog must be enabled before data can be synchronized.
Edit the MySQL configuration file /etc/my.cnf
[root@master ~]# vim /etc/my.cnf
Changed content:
# server id
server-id = 1
# enable the binlog; this value is used as the binlog file name prefix
log-bin=mysql-bin
# binlog format; Maxwell requires row format
binlog_format=row
# database to write to the binlog; adjust to your actual database
binlog-do-db=ds_pub
Create the ds_pub database in MySQL:
[root@master ~]# mysql -uroot -p123456 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 30 Server version: 5.7.16 MySQL Community Server (GPL) Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | metastore | | mysql | | performance_schema | | sys | +--------------------+ 5 rows in set (0.00 sec) mysql> create database ds_pub; Query OK, 1 row affected (0.00 sec) mysql> exit; Bye
Restart the MySQL service:
[root@master ~]# systemctl restart mysqld
[root@master ~]#
Create the database and user that Maxwell needs in MySQL:
[root@master ~]# mysql -uroot -p123456 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.7.16-log MySQL Community Server (GPL) Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> CREATE DATABASE maxwell; Query OK, 1 row affected (0.00 sec) mysql> set global validate_password_policy=0; Query OK, 0 rows affected (0.00 sec) mysql> set global validate_password_length=4; Query OK, 0 rows affected (0.00 sec) mysql> CREATE USER 'maxwell'@'%' IDENTIFIED BY 'maxwell'; Query OK, 0 rows affected (0.01 sec) mysql> GRANT ALL ON maxwell.* TO 'maxwell'@'%'; Query OK, 0 rows affected (0.00 sec) mysql> GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'%'; Query OK, 0 rows affected (0.00 sec) mysql> exit; Bye [root@master ~]#
[root@master redis]# cd /opt/module/maxwell
[root@master maxwell]# ls
bin  config.md  config.properties.example  kinesis-producer-library.properties.example  lib  LICENSE  log4j2.xml  quickstart.md  README.md
[root@master maxwell]# cp config.properties.example config.properties
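Before wiring Maxwell to Kafka (done in section 2.1.1 below), a quick way to check that it can read the binlog at all is to run it with the stdout producer (a sketch using the maxwell user and password created above); inserted or updated rows in the binlogged database should be printed as JSON on the console:
[root@master maxwell]# bin/maxwell --user maxwell --password maxwell --host master --producer=stdout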
10. Redis Component Deployment
Extract and rename Redis, and set up the environment:
[root@master softwares]# tar -zxvf redis-6.2.6.tar.gz -C /opt/module/
[root@master softwares]# mv /opt/module/redis-6.2.6/ /opt/module/redis
[root@master softwares]# cd /opt/module/
[root@master module]# ls
flink  flume  hadoop  hive  jdk  kafka  maxwell  redis  spark  zookeeper
[root@master module]#
(1) Check whether gcc and related tools are installed. Command: which gcc (here it reported that no such file was found)
(2) Install gcc. Command: yum -y install gcc automake autoconf libtool make
Check the gcc compiler version:
[root@master redis]# gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[root@master redis]#
Enter the redis directory and run make, then make install:
[root@master redis]# cd /opt/module/redis/
[root@master redis]# make
[root@master redis]# make install
Back up redis.conf to the /root directory:
[root@master redis]# cp redis.conf /root/
[root@master redis]#
Configure Redis to run in the background:
[root@master ~]# vim redis.conf
Changed content:
bind 192.168.1.100 -::1
# run as a daemon (in the background)
daemonize yes
Rename redis.conf:
[root@master ~]# mv redis.conf my_redis.conf
Start Redis in the background:
[root@master ~]# redis-server /root/my_redis.conf
[root@master ~]#
Connect to Redis with the client:
[root@master ~]# redis-cli -h 192.168.1.100 -p 6379
192.168.1.100:6379>
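A minimal functional check once the client is connected (the key k1 below is just an example, not from the original):
192.168.1.100:6379> ping
192.168.1.100:6379> set k1 "hello"
192.168.1.100:6379> get k1
ping should return PONG, and get should return the value that was set.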
11. HBase Component Deployment
Extract and rename HBase, and configure the environment:
[root@master hbase]# tar -zxvf hbase-2.3.3-bin.tar.gz -C /opt/module/
[root@master hbase]# mv hbase-2.3.3/ hbase
[root@master module]# ls
clickhouse  flink  flume  hadoop  hbase  hive  jdk  kafka  maxwell  Mysql  redis  spark  zookeeper
Configure the environment variables in /etc/profile:
[root@master module]# vi /etc/profile
# HBASE_HOME
export HBASE_HOME=/opt/module/hbase
export PATH="$HBASE_HOME/bin:$PATH"
[root@master module]# source /etc/profile
Edit the hbase-env.sh configuration file:
[root@master conf]# vi hbase-env.sh
export JAVA_HOME=/opt/module/jdk
export HBASE_MANAGES_ZK=false
Edit the hbase-site.xml file:
<configuration>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.tmp.dir</name>
        <value>./tmp</value>
    </property>
    <property>
        <name>hbase.unsafe.stream.capability.enforce</name>
        <value>false</value>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://master:9000/base</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>master,slave1,slave2</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>
    <property>
        <name>hbase.master.maxclockskew</name>
        <value>30000</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/opt/module/zookeeper</value>
    </property>
    <property>
        <name>zookeeper.session.timeout</name>
        <value>300000</value>
    </property>
</configuration>
Edit the regionservers file:
[root@master conf]# vi regionservers
master
slave1
slave2
Start HBase
[root@master hbase]# ./bin/start-hbase.sh
jps: the server on which the command above is run becomes the HMaster node; use jps to check the processes:
[root@master hbase]# jps
2465 ResourceManager
1844 NameNode
5652 HRegionServer
6308 Jps
2792 NodeManager
3144 Worker
3001 Master
2219 SecondaryNameNode
3211 QuorumPeerMain
5435 HMaster
2015 DataNode
3583 Kafka
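To verify reads and writes (with a hypothetical table that is not part of the original steps), open the HBase shell and run a small create/put/scan sequence:
[root@master hbase]# ./bin/hbase shell
hbase(main):001:0> create 'test', 'cf'
hbase(main):002:0> put 'test', 'row1', 'cf:a', 'value1'
hbase(main):003:0> scan 'test'
hbase(main):004:0> exit
The scan should show the row that was just written.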
12. ClickHouse Component Deployment
Install with rpm:
[root@master clickhouse]# rpm -ivh clickhouse-common-static-dbg-21.9.4.35-2.x86_64.rpm 警告:clickhouse-common-static-dbg-21.9.4.35-2.x86_64.rpm: 头V4 RSA/SHA1 Signature, 密钥 ID e0c56bd4: NOKEY 准备中... ################################# [100%] 正在升级/安装... 1:clickhouse-common-static-dbg-21.9################################# [100%] [root@master clickhouse]# rpm -ivh clickhouse-common-static-21.9.4.35-2.x86_64.rpm 警告:clickhouse-common-static-21.9.4.35-2.x86_64.rpm: 头V4 RSA/SHA1 Signature, 密钥 ID e0c56bd4: NOKEY 准备中... ################################# [100%] 正在升级/安装... 1:clickhouse-common-static-21.9.4.3################################# [100%] [root@master clickhouse]# rpm -ivh clickhouse-server-21.9.4.35-2.noarch.rpm 警告:clickhouse-server-21.9.4.35-2.noarch.rpm: 头V4 RSA/SHA1 Signature, 密钥 ID e0c56bd4: NOKEY 准备中... ################################# [100%] 正在升级/安装... 1:clickhouse-server-21.9.4.35-2 ################################# [100%] ClickHouse binary is already located at /usr/bin/clickhouse Symlink /usr/bin/clickhouse-server already exists but it points to /clickhouse. Will replace the old symlink to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-server to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-client to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-local to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-benchmark to /usr/bin/clickhouse. Symlink /usr/bin/clickhouse-copier already exists but it points to /clickhouse. Will replace the old symlink to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-copier to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-obfuscator to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-git-import to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-compressor to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-format to /usr/bin/clickhouse. Symlink /usr/bin/clickhouse-extract-from-config already exists but it points to /clickhouse. Will replace the old symlink to /usr/bin/clickhouse. Creating symlink /usr/bin/clickhouse-extract-from-config to /usr/bin/clickhouse. Creating clickhouse group if it does not exist. groupadd -r clickhouse Creating clickhouse user if it does not exist. useradd -r --shell /bin/false --home-dir /nonexistent -g clickhouse clickhouse Will set ulimits for clickhouse user in /etc/security/limits.d/clickhouse.conf. Creating config directory /etc/clickhouse-server/config.d that is used for tweaks of main server configuration. Creating config directory /etc/clickhouse-server/users.d that is used for tweaks of users configuration. Config file /etc/clickhouse-server/config.xml already exists, will keep it and extract path info from it. /etc/clickhouse-server/config.xml has /var/lib/clickhouse/ as data path. /etc/clickhouse-server/config.xml has /var/log/clickhouse-server/ as log path. Users config file /etc/clickhouse-server/users.xml already exists, will keep it and extract users info from it. chown --recursive clickhouse:clickhouse '/etc/clickhouse-server' Creating log directory /var/log/clickhouse-server/. Creating data directory /var/lib/clickhouse/. Creating pid directory /var/run/clickhouse-server. 
chown --recursive clickhouse:clickhouse '/var/log/clickhouse-server/' chown --recursive clickhouse:clickhouse '/var/run/clickhouse-server' chown clickhouse:clickhouse '/var/lib/clickhouse/' groupadd -r clickhouse-bridge useradd -r --shell /bin/false --home-dir /nonexistent -g clickhouse-bridge clickhouse-bridge chown --recursive clickhouse-bridge:clickhouse-bridge '/usr/bin/clickhouse-odbc-bridge' chown --recursive clickhouse-bridge:clickhouse-bridge '/usr/bin/clickhouse-library-bridge' Enter password for default user: Password for default user is saved in file /etc/clickhouse-server/users.d/default-password.xml. Setting capabilities for clickhouse binary. This is optional. ClickHouse has been successfully installed. Start clickhouse-server with: sudo clickhouse start Start clickhouse-client with: clickhouse-client --password Created symlink from /etc/systemd/system/multi-user.target.wants/clickhouse-server.service to /etc/systemd/system/clickhouse-server.service. [root@master clickhouse]# ls clickhouse-client-21.9.4.35-2.noarch.rpm clickhouse-common-static-dbg-21.9.4.35-2.x86_64.rpm clickhouse-common-static-21.9.4.35-2.x86_64.rpm clickhouse-server-21.9.4.35-2.noarch.rpm [root@master clickhouse]# rpm -ivh clickhouse-client-21.9.4.35-2.noarch.rpm 警告:clickhouse-client-21.9.4.35-2.noarch.rpm: 头V4 RSA/SHA1 Signature, 密钥 ID e0c56bd4: NOKEY 准备中... ################################# [100%] 正在升级/安装... 1:clickhouse-client-21.9.4.35-2 ################################# [100%] [root@master clickhouse]#
Edit the configuration file:
[root@master module]# vi /etc/clickhouse-server/config.xml
<log>/data/clickhouse/log/clickhouse-server/clickhouse-server.log</log>
<errorlog>/data/clickhouse/log/clickhouse-server/clickhouse-server.err.log</errorlog>
<!-- Path to data directory, with trailing slash. -->
<path>/data/clickhouse/</path>
<!-- Path to temporary data for processing hard queries. -->
<tmp_path>/data/clickhouse/tmp/</tmp_path>
<!-- Directory with user provided files that are accessible by 'file' table function. -->
<user_files_path>/data/clickhouse/user_files/</user_files_path>
Start the ClickHouse server in the foreground:
clickhouse-server --config-file=/etc/clickhouse-server/config.xml
Check the process after startup: ps -aux | grep click
Start in the background:
nohup clickhouse-server --config-file=/etc/clickhouse-server/config.xml >null 2>&1 &
Connect to the server with the client:
clickhouse-client
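A quick query confirms the client can reach the server (a sketch; use the default-user password that was set during the rpm installation):
clickhouse-client --password
:) select version();
The query should return the installed server version (21.9.4.35 here).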
II. Data Extraction
2.1 Real-Time Data Extraction
2.1.1 Data Extraction with Maxwell
Edit the Maxwell configuration file config.properties
[root@master ~]# cd /opt/module/maxwell/
[root@master maxwell]# ls
bin  config.md  config.properties.example  kinesis-producer-library.properties.example  lib  LICENSE  log4j2.xml  quickstart.md  README.md
[root@master maxwell]# cp config.properties.example config.properties
[root@master maxwell]# vim config.properties
Changed content:
log_level=info
producer=kafka
kafka.bootstrap.servers=master:9092,slave1:9092
kafka_topic=maxwell
# mysql login info
host=master
user=maxwell
password=maxwell
jdbc_options=useSSL=false&serverTimezone=Asia/Shanghai
Start Maxwell:
[root@master maxwell]# /opt/module/maxwell/bin/maxwell --config /opt/module/maxwell/config.properties --daemon
Redirecting STDOUT to /opt/module/maxwell/bin/../logs/MaxwellDaemon.out
Using kafka version: 1.0.0
[root@master maxwell]#
Create the topic in Kafka:
[root@master maxwell]# kafka-topics.sh --bootstrap-server master:9092 --partitions 1 --replication-factor 1 --create --topic maxwell
Created topic maxwell.
[root@master maxwell]#
Start a Kafka console consumer to consume the data:
[root@master maxwell]# kafka-console-consumer.sh --bootstrap-server master:9092 --topic maxwell
In MySQL, run the source command to load data into the database:
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| ds_pub             |
| maxwell            |
| metastore          |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
7 rows in set (0.00 sec)

mysql> use ds_pub;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> source /opt/data/ds_pub.sql;
Result (the Kafka console consumer receives the rows imported into MySQL):
{"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":15,"data":{"id":17,"name":"辽宁","region_id":"3","area_code":"210000","iso_code":"CN-21"}} {"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":16,"data":{"id":18,"name":"陕西","region_id":"7","area_code":"610000","iso_code":"CN-61"}} {"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":17,"data":{"id":19,"name":"甘肃","region_id":"7","area_code":"620000","iso_code":"CN-62"}} {"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":18,"data":{"id":20,"name":"青海","region_id":"7","area_code":"630000","iso_code":"CN-63"}} {"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":19,"data":{"id":21,"name":"宁夏","region_id":"7","area_code":"640000","iso_code":"CN-64"}} {"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":20,"data":{"id":22,"name":"新疆","region_id":"7","area_code":"650000","iso_code":"CN-65"}} {"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":21,"data":{"id":23,"name":"河南","region_id":"4","area_code":"410000","iso_code":"CN-41"}} {"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":22,"data":{"id":24,"name":"湖北","region_id":"4","area_code":"420000","iso_code":"CN-42"}} {"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":23,"data":{"id":25,"name":"湖南","region_id":"4","area_code":"430000","iso_code":"CN-43"}} {"database":"ds_pub","table":"base_province","type":"insert","ts":1668443406,"xid":1326,"xoffset":24,"data":{"id":26,"name":"广东","region_id":"5","area_code":"440000","iso_code":"CN-44"}}
2.1.2 Port Log Data Extraction
Write the Flume configuration file and configure the port to collect from
[root@master conf]# vim /opt/module/flume/conf/read_socket_write_kafka.conf   (this attempt ran into problems)
[root@master ~]# cd /opt/flume/conf
[root@master conf]# ls
flume-conf.properties.template  flume-env.sh.template  read_socket_write_kafka.conf
flume-env.ps1.template          log4j.properties       socket_to_kafka.conf
[root@master conf]# vim socket_to_kafka.conf
File contents:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = nc master 26001

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = master:9092
a1.sinks.k1.kafka.topic = order
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.acks = 1

# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Start Flume to collect the port log data (in a second terminal on master):
[root@master conf]# flume-ng agent --name a1 --conf /opt/module/flume/conf/ --conf-file /opt/module/flume/conf/read_socket_write_kafka.conf -Dflume.root.logger=INFO,console
flume-ng agent --name a1 --conf /opt/flume/conf/ --conf-file /opt/flume/conf/socket_to_kafka.conf -Dflume.root.logger=INFO,console
Create the topic in Kafka (created earlier):
[root@master ~]# kafka-topics.sh --bootstrap-server master:9092 --partitions 1 --replication-factor 1 --create --topic order
Created topic order.
[root@master ~]#
Start a Kafka console consumer to consume the data from the command line (in a third terminal on master; the data shows up here directly):
[root@master ~]# kafka-console-consumer.sh --bootstrap-server master:9092,slave1:9092,slave2:9092 --from-beginning --topic order
Run the log script to generate data to the port in real time (in the first terminal on master):
[root@master ~]# cd /opt/data/
[root@master data]# ls
ds_pub.sql  order.sh  order.txt
[root@master data]# ll -a
total 304
drwxr-xr-x  2 root root     57 Nov 15 00:53 .
drwxr-xr-x. 5 root root     49 Nov 15 00:52 ..
-rw-r--r--  1 root root 187698 Nov 15 00:53 ds_pub.sql
-rw-r--r--  1 root root     72 Nov 15 00:52 order.sh
-rw-r--r--  1 root root 115027 Nov 15 00:53 order.txt
[root@master data]# chmod -R 777 /opt/data/
[root@master data]# sh order.sh | nc -lk 26001
Result:
nc master 26001
D,"8815","3632","14","Dior迪奥口红唇膏送女友老婆礼物生日礼物 烈艳蓝金999+888两支装礼盒","http://kAXllAQEzJWHwiExxVmyJIABfXyzbKwedeofMqwh","496","2","2020-4-26 18:55:16","2402","56" I, D,"8816","3633","14","Dior迪奥口红唇膏送女友老婆礼物生日礼物 烈艳蓝金999+888两支装礼盒","http://kAXllAQEzJWHwiExxVmyJIABfXyzbKwedeofMqwh","496","1","2020-4-26 18:55:16","2401", I, D,"8817","3634","12","联想(Lenovo)拯救者Y7000 英特尔酷睿i7 2019新款 15.6英寸发烧游戏本笔记本电脑(i7-9750H 8GB 512GB SSD GTX1650 4G 高色域","http://kAXllAQEzJWHwiExxVmyJIABfXyzbKwedeofMqwh","6699","2","2020-4-26 18:55:16","2401", I, D,"8818","3635","4","小米Play 流光渐变AI双摄 4GB+64GB 梦幻蓝 全网通4G 双卡双待 小水滴全面屏拍照游戏智能手机","http://SXlkutIjYpDWWTEpNUiisnlsevOHVElrdngQLgyZ","1442","1","2020-4-26 18:55:16","2402","33" I, D,"8819","3636","3","小米(MI)电视 55英寸曲面4K智能WiFi网络液晶电视机4S L55M5-AQ 小米电视4S 55英寸 曲面","http://LzpblavcZQeYEbwbSjsnmsgAjtpudhDradqsRgdZ","3100","3","2020-4-26 18:55:16","2401", I, D,"8820","3637","15","迪奥(Dior)烈艳蓝金唇膏 口红 3.5g 999号 哑光-经典正红","http://kAXllAQEzJWHwiExxVmyJIABfXyzbKwedeofMqwh","252","2","2020-4-26 18:55:16","2401",
3. Offline Data Extraction
Tutorial references:
Hive (episodes 19-86): 尚硅谷大数据Hive教程(基于hive3.x丨hive3.1.2)_哔哩哔哩_bilibili
Spark SQL (episodes 153-184): 尚硅谷大数据Spark教程从入门到精通_哔哩哔哩_bilibili
Data is extracted from MySQL into Hive using Spark; IDEA is used as the Spark development tool.
Create a Maven project in IDEA:
Optionally configure the Aliyun mirror for Maven to speed up dependency downloads:
Open Maven's settings.xml from within the IDEA project:
Add the following entry inside settings.xml:
<mirror>
  <id>nexus-aliyun</id>
  <mirrorOf>central</mirrorOf>
  <name>Nexus aliyun</name>
  <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
If there is no settings.xml file, create one yourself and write the following configuration into it:
<mirrors>
  <!-- mirror
   | Specifies a repository mirror site to use instead of a given repository. The repository that
   | this mirror serves has an ID that matches the mirrorOf element of this mirror. IDs are used
   | for inheritance and direct lookup purposes, and must be unique across the set of mirrors.
   |
  <mirror>
    <id>mirrorId</id>
    <mirrorOf>repositoryId</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://my.repository.com/repo/path</url>
  </mirror>
   -->
  <mirror>
    <id>nexus-aliyun</id>
    <mirrorOf>central</mirrorOf>
    <name>Nexus aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
  </mirror>
</mirrors>
After the project has been created, configure its pom file:
<properties> <scala.version>2.12</scala.version> <mysqlconnect.version>5.1.47</mysqlconnect.version> <spark.version>3.1.1</spark.version> <hive.version>2.3.6</hive.version> </properties> <dependencies> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-core_${scala.version}</artifactId> <version>${spark.version}</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-sql_${scala.version}</artifactId> <version>${spark.version}</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-hive_${scala.version}</artifactId> <version>${spark.version}</version> </dependency> <dependency> <groupId>mysql</groupId> <artifactId>mysql-connector-java</artifactId> <version>${mysqlconnect.version}</version> </dependency> <dependency> <groupId>org.apache.hive</groupId> <artifactId>hive-exec</artifactId> <version>2.3.7</version> </dependency> </dependencies>
Copy Hive's hive-site.xml into the project's resources directory:
If the Hadoop environment for Windows has not been set up yet, configure the Windows-related Hadoop pieces first:
Reference: Windows配置本地Hadoop运行环境 - 走看看
Once that is configured, start writing the Spark project code:
Create the package edu.jl.ods under the scala directory, and create hello.scala inside edu.jl.ods.
Select the Scala SDK for the file; if none is available, install the Scala plugin under IDEA's Plugins:
Write the following into hello.scala to verify that the Scala environment runs correctly:
package edu.jl.ods

object hello {
  def main(args: Array[String]): Unit = {
    println("hello")
  }
}
Run result:
If the Scala version is wrong, reselect it in the project's Project Structure settings:
Use Spark to read the order_info table from the MySQL database ds_pub.
Create SparkReadMySQL in edu.jl.ods:
Write the following code in the file:
package edu.jl.ods import org.apache.spark.sql.SparkSession import java.util.Properties object SparkReadMySQL { def main(args: Array[String]): Unit = { // 构建sparkSession val sparkSession: SparkSession = SparkSession.builder() .master("local[*]") .appName("SparkReadMySQL") .getOrCreate() // 设置连接mysql相关配置 val MYSQLDBURL:String = "jdbc:mysql://192.168.1.100:3306/ds_pub?useUnicode=true&characterEncoding=utf-8&useSSL=false" val properties = new Properties() properties.setProperty("user","root") properties.setProperty("password","123456") properties.setProperty("driver","com.mysql.jdbc.Driver") sparkSession.read.jdbc(MYSQLDBURL,"order_info",properties).createTempView("order_info") val selectId = "select id,consignee from order_info" sparkSession.sql(selectId).show() } }
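The read above pulls the whole order_info table even though only two columns are queried. If you want MySQL to do the trimming, Spark's JDBC reader also accepts a parenthesized subquery in place of the table name. The sketch below is an optional variant of the class above; the object name, the alias t and the column list are only illustrative:

package edu.jl.ods

import org.apache.spark.sql.SparkSession
import java.util.Properties

object SparkReadMySQLProjection {
  def main(args: Array[String]): Unit = {
    val sparkSession: SparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("SparkReadMySQLProjection")
      .getOrCreate()

    val MYSQLDBURL: String = "jdbc:mysql://192.168.1.100:3306/ds_pub?useUnicode=true&characterEncoding=utf-8&useSSL=false"
    val properties = new Properties()
    properties.setProperty("user", "root")
    properties.setProperty("password", "123456")
    properties.setProperty("driver", "com.mysql.jdbc.Driver")

    // Passing a parenthesized subquery as the "table" pushes the projection down to MySQL,
    // so only id and consignee cross the network instead of every column of order_info.
    val pushdown = "(select id, consignee from order_info) t"
    sparkSession.read.jdbc(MYSQLDBURL, pushdown, properties).show(10)

    sparkSession.close()
  }
}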
Spark reads MySQL and writes to Hive:
Create SparkReadMySQLToHive.scala in edu.jl.ods:
package edu.jl.ods import org.apache.spark.sql.SparkSession import java.util.Properties object SparkReadMySQLtoHive { def main(args: Array[String]): Unit = { // 设置hadoop的用户名为root System.setProperty("HADOOP_USER_NAME","root") // 构建sparkSession val sparkSession: SparkSession = SparkSession.builder() .master("local[*]") .config("spark.sql.warehouse.dir", "hdfs://192.168.1.100:9000/user/hive/warehouse/") .appName("SparkReadMySQL") .enableHiveSupport() .getOrCreate() // 设置连接mysql相关配置 val MYSQLDBURL: String = "jdbc:mysql://192.168.1.100:3306/ds_pub?useUnicode=true&characterEncoding=utf-8&useSSL=false" // MySQL properties配置 val properties = new Properties() properties.setProperty("user", "root") properties.setProperty("password", "123456") properties.setProperty("driver", "com.mysql.jdbc.Driver") // spark 读取mysqljdbc配置 读取mysql的 order_info 这个表 并创建临时视图 order_info sparkSession.read.jdbc(MYSQLDBURL, "order_info", properties).createTempView("order_info") // 从视图order_info 中查询数据 val selectId = "select * from order_info" // 在hive的ods库中创建order_info表 val create_order_info: String = """create table ods.order_info( | id int, | consignee string, | consignee_tel string, | final_total_amount string, | order_status string, | user_id string, | delivery_address string, | order_comment string, | out_trade_no string, | trade_body string, | create_time date, | operate_time date, | expire_time date, | tracking_no String, | parent_order_id int, | img_url string, | province_id int, | benefit_reduce_amount string, | original_total_amount string, | feight_fee string |) row format delimited fields terminated by ',' stored as textfile; |""".stripMargin // 将从order_info视图中查到的数据加载到 hive的ods库的order_info表中 val loadDataToHive: String = """ |insert overwrite ods.order_info |select * from order_info |""".stripMargin // 从hive的ods库的order_info 查询数据(验证数据是否写入到hive中了) val ods_order_info = "select * from ods.order_info" // 在sparkSession中执行上面的sql语句 // sparkSession.sql(create_order_info).show() // sparkSession.sql("desc ods.order_info").show() sparkSession.sql(loadDataToHive) sparkSession.sql("select * from ods.order_info limit 5").show() // sparkSession.sql(selectId).show() // sql运行完毕关闭sparkSession连接 sparkSession.close() } }
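As an aside, the hand-written DDL plus INSERT above can also be replaced by the DataFrame writer, which derives the Hive table schema from the JDBC read. This is only a sketch under the same connection settings, assuming the ods database already exists; the object name is illustrative:

package edu.jl.ods

import org.apache.spark.sql.SparkSession
import java.util.Properties

object SparkReadMySQLtoHiveDF {
  def main(args: Array[String]): Unit = {
    // Run HDFS/Hive operations as root, matching the examples above.
    System.setProperty("HADOOP_USER_NAME", "root")

    val sparkSession: SparkSession = SparkSession.builder()
      .master("local[*]")
      .config("spark.sql.warehouse.dir", "hdfs://192.168.1.100:9000/user/hive/warehouse/")
      .appName("SparkReadMySQLtoHiveDF")
      .enableHiveSupport()
      .getOrCreate()

    val MYSQLDBURL: String = "jdbc:mysql://192.168.1.100:3306/ds_pub?useUnicode=true&characterEncoding=utf-8&useSSL=false"
    val properties = new Properties()
    properties.setProperty("user", "root")
    properties.setProperty("password", "123456")
    properties.setProperty("driver", "com.mysql.jdbc.Driver")

    // Read order_info from MySQL, then let Spark create/overwrite the Hive table
    // from the DataFrame schema instead of a hand-written CREATE TABLE statement.
    val orderInfoDF = sparkSession.read.jdbc(MYSQLDBURL, "order_info", properties)
    orderInfoDF.write
      .mode("overwrite")
      .format("hive")
      .saveAsTable("ods.order_info")

    // Quick sanity check that the rows arrived.
    sparkSession.sql("select count(*) from ods.order_info").show()
    sparkSession.close()
  }
}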
Write into a Hive partitioned table, with the partition set to yesterday's date (a sketch for computing that date follows the code below):
Create SparkReadMySQLToHivePartitionTable.scala in edu.jl.ods:
package edu.jl.ods import org.apache.spark.sql.SparkSession import java.util.Properties object SparkReadMySQLtoHivePartitionTable { def main(args: Array[String]): Unit = { // 设置hadoop的用户名为root System.setProperty("HADOOP_USER_NAME","root") // 构建sparkSession val sparkSession: SparkSession = SparkSession.builder() .master("local[*]") .config("spark.sql.warehouse.dir", "hdfs://192.168.1.100:9000/user/hive/warehouse/") .appName("SparkReadMySQL") .enableHiveSupport() .getOrCreate() // 设置连接mysql相关配置 val MYSQLDBURL: String = "jdbc:mysql://192.168.1.100:3306/ds_pub?useUnicode=true&characterEncoding=utf-8&useSSL=false" // MySQL properties配置 val properties = new Properties() properties.setProperty("user", "root") properties.setProperty("password", "123456") properties.setProperty("driver", "com.mysql.jdbc.Driver") // spark 读取mysqljdbc配置 读取mysql的 order_info 这个表 并创建临时视图 order_info sparkSession.read.jdbc(MYSQLDBURL, "order_info", properties).createTempView("order_info") // 从视图order_info 中查询数据 val selectId = "select * from order_info" // 在hive的ods库中创建order_info表 val create_order_info: String = """create table ods.order_info_par( | id int, | consignee string, | consignee_tel string, | final_total_amount string, | order_status string, | user_id string, | delivery_address string, | order_comment string, | out_trade_no string, | trade_body string, | create_time date, | operate_time date, | expire_time date, | tracking_no String, | parent_order_id int, | img_url string, | province_id int, | benefit_reduce_amount string, | original_total_amount string, | feight_fee string |) partitioned by (etl_date string) row format delimited fields terminated by ',' stored as textfile; |""".stripMargin // 将从order_info视图中查到的数据加载到 hive的ods库的order_info_par表中 val loadDataToHive: String = """ |insert overwrite ods.order_info_par partition(etl_date="20221114") |select * from order_info |""".stripMargin // 从hive的ods库的order_info 查询数据(验证数据是否写入到hive中了) val ods_order_info = "select * from ods.order_info_par limit 5" // 在sparkSession中执行上面的sql语句 // sparkSession.sql(create_order_info).show() // sparkSession.sql("desc ods.order_info").show() sparkSession.sql(loadDataToHive) sparkSession.sql(ods_order_info).show() // sparkSession.sql(selectId).show() // sql运行完毕关闭sparkSession连接 sparkSession.close() } }
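Note that the partition value above is hardcoded to 20221114 while the requirement is yesterday's date. One way to make it dynamic, sketched below under the same yyyyMMdd convention, is to compute the date with java.time and splice it into the INSERT statement inside the same main method:

import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Yesterday's date as a yyyyMMdd string, e.g. "20221114" when the job runs on 2022-11-15.
val etlDate: String = LocalDate.now()
  .minusDays(1)
  .format(DateTimeFormatter.ofPattern("yyyyMMdd"))

// Same INSERT as above, but with the computed partition value instead of a literal.
val loadDataToHive: String =
  s"""
     |insert overwrite ods.order_info_par partition(etl_date="$etlDate")
     |select * from order_info
     |""".stripMargin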
Query the partitions of the partitioned table to check the result:
hive (ods)> show partitions order_info_par;
OK
partition
etl_date=20221114
Time taken: 0.048 seconds, Fetched: 1 row(s)
hive (ods)>
View the first five rows of the partitioned table:
hive (ods)> select * from order_info_par limit 5; OK order_info_par.id order_info_par.consignee order_info_par.consignee_tel order_info_par.final_total_amount order_info_par.order_status order_info_par.user_id order_info_par.delivery_address order_info_par.order_comment order_info_par.out_trade_no order_info_par.trade_body order_info_par.create_time order_info_par.operate_time order_info_par.expire_time order_info_par.tracking_no order_info_par.parent_order_id order_info_par.img_url order_info_par.province_id order_info_par.benefit_reduce_amount order_info_par.original_total_amount order_info_par.feight_fee order_info_par.etl_date 3443 严致 13207871570 1449.00 1005 2790 第4大街第5号楼4单元464门 描述345855 214537477223728 小米Play 流光渐变AI双摄 4GB+64GB 梦幻蓝 全网通4G 双卡双待 小水滴全面屏拍照游戏智能手机等1件商品 2020-04-25 2020-04-26 2020-04-25 NULL NULL http://img.gmall.com/117814.jpg 20 0.00 1442.00 7.00 20221114 3444 慕容亨 13028730359 17805.00 1005 2015 第9大街第26号楼3单元383门 描述948496 226551358533723 Apple iPhoneXSMax (A2104) 256GB 深空灰色 移动联通电信4G手机 双卡双待等2件商品 2020-04-25 2020-04-26 2020-04-25 NULL NULL http://img.gmall.com/353392.jpg 11 0.00 17800.00 5.00 20221114 3445 姚兰凤 13080315675 16180.00 1005 8263 第5大街第1号楼7单元722门 描述148518 754426449478474 联想(Lenovo)拯救者Y7000 英特尔酷睿i7 2019新款 15.6英寸发烧游戏本笔记本电脑(i7-9750H 8GB 512GB SSD GTX1650 4G 高色域等3件商品 2020-04-25 2020-04-26 2020-04-25 NULLNULL http://img.gmall.com/478856.jpg 26 3935.00 20097.00 18.00 20221114 3446 柏锦黛 13487267342 4922.00 1005 7031 第17大街第40号楼2单元564门 描述779464 262955273144195 十月稻田 沁州黄小米 (黄小米 五谷杂粮 山西特产 真空装 大米伴侣 粥米搭档) 2.5kg等4件商品 2020-04-25 2020-04-26 2020-04-25 NULL NULL http://img.gmall.com/144444.jpg 30 0.00 4903.00 19.00 20221114 3447 计娴瑾 13208002474 6665.00 1005 5903 第4大街第25号楼6单元338门 描述396659 689816418657611 荣耀10青春版 幻彩渐变 2400万AI自拍 全网通版4GB+64GB 渐变蓝 移动联通电信4G全面屏手机 双卡双待等3件商品 2020-04-25 2020-04-25 2020-04-25 NULL NULL http://img.gmall.com/793265.jpg 29 0.00 6660.00 5.00 20221114 Time taken: 0.145 seconds, Fetched: 5 row(s) hive (ods)>
4. Real-Time Data Analysis
4.1 Linux Environment Preparation
The port that generates data needs to be opened; run the port-opening command in the /opt/data directory:
[root@master data]# pwd
/opt/data
[root@master data]# sh order.sh | nc -lk 26001
If the port is already occupied (Ncat: bind to :::26001: Address already in use. QUITTING.), use netstat -ntlp to find the PID of the process holding the port, then stop it with kill -9 followed by that PID:
[root@master data]# netstat -ntlp
[root@master data]# kill -9 xxx
After killing the process that occupied the port, restart the data-generation script:
[root@master data]# sh order.sh | nc -lk 26001
Flume needs to be configured and started:
[root@master ~]# flume-ng agent --name a1 --conf /opt/module/flume/conf/ --conf-file /opt/module/flume/conf/socket_to_kafka.conf -Dflume.root.logger=INFO,console
Contents of socket_to_kafka.conf:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = nc master 26001

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = master:9092
a1.sinks.k1.kafka.topic = order
a1.sinks.k1.kafka.producer.acks = 1
# a1.sinks.k1.type = logger

# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
If a Flume agent is still running in the background, find its process ID with jps and stop it with kill -9:
[root@master conf]# kill -9 12618
[root@master conf]# jps
10720 Kafka
3315 NodeManager
2901 SecondaryNameNode
3157 ResourceManager
11669 ConsoleConsumer
3688 QuorumPeerMain
8008 RunJar
15080 Jps
2489 NameNode
2653 DataNode
11279 ConsoleConsumer
[1]+  Killed    flume-ng agent --name a1 --conf /opt/module/flume/conf/ --conf-file /opt/module/flume/conf/socket_to_kafka.conf -Dflume.root.logger=INFO,console  (wd: ~)
(wd now: /opt/module/flume/conf)
[root@master conf]#
The Kafka configuration needs to be modified and Kafka restarted.
Edit kafka/config/server.properties:
[root@master config]# vim /opt/module/kafka/config/server.properties
Change the following line:
advertised.listeners=PLAINTEXT://192.168.1.100:9092
Restart Kafka:
[root@master config]# jps
10720 Kafka
15106 Application
3315 NodeManager
2901 SecondaryNameNode
3157 ResourceManager
11669 ConsoleConsumer
3688 QuorumPeerMain
8008 RunJar
2489 NameNode
15370 Jps
2653 DataNode
11279 ConsoleConsumer
[root@master config]# kill -9 10720
[root@master config]# /opt/module/kafka/bin/kafka-server-start.sh -daemon /opt/module/kafka/config/server.properties
Note: perform the steps above on all three Linux nodes; on slave1 and slave2, advertised.listeners should point to that node's own IP (192.168.1.101 and 192.168.1.102 respectively).
On the master node, start kafka-console-consumer.sh to receive data and verify that the configuration works:
[root@master config]# kafka-console-consumer.sh --bootstrap-server master:9092,slave1:9092,slave2:9092 --topic order I,"3578","张娣","13586813843","26718","1005","8708","第12大街第14号楼3单元391门","描述732297","347956132393657","Apple iPhoneXSMax (A2104) 256GB 深空灰色 移动联通电信4G手机 双卡双待等3件商品","2020-4-26 18:55:01","2020-4-26 18:59:01","2020-4-26 19:10:01",,,"http://img.gmall.com/991685.jpg","21","0","26700","18" D,"8756","3573","1","荣耀10青春版 幻彩渐变 2400万AI自拍 全网通版4GB+64GB 渐变蓝 移动联通电信4G全面屏手机 双卡双待","http://AOvKmfRQEBRJJllwCwCuptVAOtBBcIjWeJRsmhbJ","2220","3","2020-4-26 18:55:01","2401", I,"3579","伏彩春","13316165573","7374","1005","1515","第16大街第14号楼7单元619门","描述349867","152178625166735","荣耀10 GT游戏加速 AIS手持夜景 6GB+64GB 幻影蓝全网通 移动联通电信等3件商品","2020-4-26 18:55:01","2020-4-26 19:03:49","2020-4-26 19:10:01",,,"http://img.gmall.com/648845.jpg","15","0","7356","18"
4.2 Windows Environment Preparation
Create the project for Flink real-time data analysis:
Add the pom dependencies required for Flink development to the project:
<properties> <flink.version>1.14.0</flink.version> <scala.version>2.12</scala.version> <hive.version>2.3.7</hive.version> <mysqlconnect.version>5.1.47</mysqlconnect.version> <!-- <hdfs.version>2.7.2</hdfs.version>--> <spark.version>3.0.0</spark.version> </properties> <dependencies> <!-- <dependency>--> <!-- <groupId>org.scalanlp</groupId>--> <!-- <artifactId>jblas</artifactId>--> <!-- <version>1.2.1</version>--> <!-- </dependency>--> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-runtime-web_2.12</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-clients_2.12</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-streaming-scala_2.12</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-connector-kafka_2.12</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-planner_2.12</artifactId> <version>${flink.version}</version> </dependency> <!-- <dependency>--> <!-- <groupId>org.apache.flink</groupId>--> <!-- <artifactId>flink-table-planner-blink_2.12</artifactId>--> <!-- <version>${flink.version}</version>--> <!-- </dependency>--> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-api-scala-bridge_2.12</artifactId> <version>${flink.version}</version> <!--<scope>provided</scope>--> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-common</artifactId> <version>${flink.version}</version> <type>pom</type> </dependency> <!-- <dependency>--> <!-- <groupId>org.apache.flink</groupId>--> <!-- <artifactId>flink-jdbc_2.12</artifactId>--> <!-- <version>${flink.version}</version>--> <!-- </dependency>--> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-csv</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.bahir</groupId> <artifactId>flink-connector-redis_2.12</artifactId> <version>1.1.0</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-avro</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-connector-filesystem_2.12</artifactId> <version>1.10.2</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-planner-blink_2.12</artifactId> <version>1.10.2</version> </dependency> <dependency> <groupId>org.apache.hive</groupId> <artifactId>hive-exec</artifactId> <version>${hive.version}</version> <scope>provided</scope> </dependency> <dependency> <groupId>mysql</groupId> <artifactId>mysql-connector-java</artifactId> <version>${mysqlconnect.version}</version> </dependency> <!--spark处理离线--> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-core_${scala.version}</artifactId> <version>${spark.version}</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-sql_${scala.version}</artifactId> <version>${spark.version}</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-mllib_${scala.version}</artifactId> <version>${spark.version}</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-hive_${scala.version}</artifactId> <version>${spark.version}</version> </dependency> <dependency> 
<groupId>org.apache.logging.log4j</groupId> <artifactId>log4j-core</artifactId> <version>2.8.2</version> </dependency> <!-- https://mvnrepository.com/artifact/org.dom4j/dom4j --> <dependency> <groupId>org.dom4j</groupId> <artifactId>dom4j</artifactId> <version>2.1.3</version> </dependency> </dependencies> <build> <!--<sourceDirectory>src/main/scala</sourceDirectory>--> <resources> <resource> <directory>src/main/scala</directory> </resource> <resource> <directory>src/main/java</directory> </resource> </resources> <plugins> <plugin> <groupId>net.alchim31.maven</groupId> <artifactId>scala-maven-plugin</artifactId> <version>3.2.2</version> <configuration> <recompileMode>incremental</recompileMode> </configuration> <executions> <execution> <goals> <goal>compile</goal> <goal>testCompile</goal> </goals> </execution> </executions> </plugin> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-compiler-plugin</artifactId> <configuration> <source>8</source> <target>8</target> </configuration> </plugin> </plugins> </build>
If imports cannot be resolved, check whether the corresponding jars actually exist in the Maven repository; if they do, try switching the Maven configuration.
Turn off the Windows firewall (so that the port data is not blocked by it).
4.3 Real-Time Data Analysis Code Development
Reference tutorial (episodes 1-114): 【尚硅谷】Flink1.13教程(Scala版)_哔哩哔哩_bilibili
Create the package edu.jl and create hello.scala under it:
Set the Scala SDK in hello.scala:
package edu.jl

object hello {
  def main(args: Array[String]): Unit = {
    println("hello")
  }
}
Run hello to confirm that the Scala environment works.
Create readKafka.scala under edu.jl:
Write the following into readKafka.scala:
package edu.jl import org.apache.flink.api.common.serialization.SimpleStringSchema import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment import org.apache.flink.streaming.api.scala._ import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer import java.util.Properties object readKafka { def main(args: Array[String]): Unit = { // 获取flink流式执行环境 val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment env.setParallelism(1) // 设置代码并行度为1 //设置kafka的properties val properties = new Properties() properties.setProperty("bootstrap.servers","192.168.1.100:9092") // 设置Flink的数据源为kafka,创建数据流 val ds: DataStream[String] = env.addSource(new FlinkKafkaConsumer[String]("order", new SimpleStringSchema(), properties).setStartFromLatest()) // 打印从kafka获取到的数据 ds.print() // 执行流式环境 env.execute() } }
After restarting the port script and Flume, run the code; the messages appear in IDEA's output window, which shows that the Flink environment on Windows can consume the data.
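FlinkKafkaConsumer works with the dependencies above, but it is marked deprecated in Flink 1.14 in favor of the KafkaSource builder. If you prefer the newer API, a rough equivalent of readKafka looks like the sketch below (same broker address and topic; the object name and group id are illustrative):

package edu.jl

import org.apache.flink.api.common.eventtime.WatermarkStrategy
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.connector.kafka.source.KafkaSource
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer
import org.apache.flink.streaming.api.scala._

object readKafkaSource {
  def main(args: Array[String]): Unit = {
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)

    // Build a KafkaSource that reads the order topic starting from the latest offsets.
    val source: KafkaSource[String] = KafkaSource.builder[String]()
      .setBootstrapServers("192.168.1.100:9092")
      .setTopics("order")
      .setGroupId("flink-order")
      .setStartingOffsets(OffsetsInitializer.latest())
      .setValueOnlyDeserializer(new SimpleStringSchema())
      .build()

    // No event-time processing here, so no watermarks are needed.
    val ds: DataStream[String] =
      env.fromSource(source, WatermarkStrategy.noWatermarks[String](), "kafka-order")

    ds.print()
    env.execute()
  }
}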
4.4 Aggregate the Total Amount per SKU and Store the Values in Redis
Start Redis:
[root@master ~]# redis-server /root/my_redis.conf
Create readKafkaToRedis.scala and write the following into it:
package edu.jl import org.apache.flink.api.common.serialization.SimpleStringSchema import org.apache.flink.streaming.api.scala.{StreamExecutionEnvironment, _} import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer import org.apache.flink.streaming.connectors.redis.RedisSink import org.apache.flink.streaming.connectors.redis.common.config.FlinkJedisPoolConfig import org.apache.flink.streaming.connectors.redis.common.mapper.{RedisCommand, RedisCommandDescription, RedisMapper} import java.util.Properties object readKafkaToRedis { def main(args: Array[String]): Unit = { // 获取flink流式执行环境 val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment env.setParallelism(1) // 设置代码并行度为1 //设置kafka的properties val properties = new Properties() properties.setProperty("bootstrap.servers","192.168.1.100:9092") // 设置Flink的数据源为kafka,创建数据流 val ds: DataStream[String] = env.addSource(new FlinkKafkaConsumer[String]("order", new SimpleStringSchema(), properties).setStartFromLatest()) // 过滤掉表头的数据 生成新的数据流filter_ds val filter_ds: DataStream[String] = ds.filter(!_.split(",")(1).equals("\"id\"")) // 从filter_ds中提取出需要分析的数据 // 取出 sku_id相同的 sku_price 的和 val map_ds = filter_ds.map(line => { val words: Array[String] = line.split(",") val dat: String = words(0) if (dat.trim == "D") { val sku_price: Double = words(6).split('"')(1).toDouble val sku_id: String = words(3).split('"')(1) (sku_id, sku_price) } else { ("报错", 1.0) } }).keyBy(_._1).sum(1) map_ds.print() map_ds.addSink(new RedisSink[(String, Double)](flinkJedisPoolConfig,new MyRedisMapper)) // 执行流式环境 env.execute() } class MyRedisMapper extends RedisMapper[Tuple2[String,Double]] { // 定义保存数据到redis的命令,HSET 表名 key value override def getCommandDescription: RedisCommandDescription = new RedisCommandDescription(RedisCommand.SET) // 将id转换为key override def getKeyFromData(t: (String, Double)): String = t._1 // 将price值指定为value override def getValueFromData(t: (String, Double)): String = t._2.toString } private val flinkJedisPoolConfig: FlinkJedisPoolConfig = new FlinkJedisPoolConfig.Builder() .setHost("192.168.1.100") .setPort(6379) .build() }
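One detail worth noticing: the comment in getCommandDescription talks about HSET, but the code actually uses RedisCommand.SET, so every sku_id becomes a top-level string key (which is why keys * below returns the ids themselves). If you would rather keep all SKU totals inside a single hash, the Bahir mapper can be switched to HSET with an additional key; a sketch, assuming a hash named sku_total, that could replace MyRedisMapper in the same file:

import org.apache.flink.streaming.connectors.redis.common.mapper.{RedisCommand, RedisCommandDescription, RedisMapper}

// Alternative mapper: store every (sku_id, total) pair in one Redis hash named "sku_total",
// i.e. the equivalent of HSET sku_total <sku_id> <total>.
class MyRedisHashMapper extends RedisMapper[(String, Double)] {
  override def getCommandDescription: RedisCommandDescription =
    new RedisCommandDescription(RedisCommand.HSET, "sku_total")

  // The hash field is the sku_id ...
  override def getKeyFromData(t: (String, Double)): String = t._1

  // ... and the hash value is the running total for that sku_id.
  override def getValueFromData(t: (String, Double)): String = t._2.toString
}

With this mapper the totals can be read back in one go with HGETALL sku_total.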
Code run result:
Query in Redis (keys * lists all keys; get key returns the value of a specific key):
keys *     // list all keys
get key    // get the value for a key, for example:
192.168.1.100:6379> get 4
"1442.0"
Result:
192.168.1.100:6379> keys *
 1) "10"
 2) "\"11\""
 3) "\"6\""
 4) "\"14\""
 5) "6"
 6) "\"7\""
 7) "\"4\""
 8) "\"8\""
 9) "\"15\""
10) "\"13\""
11) "\"12\""
12) "\"9\""
13) "\"2\""
14) "\"1\""
15) "\"5\""
16) "15"
17) "\"16\""
18) "\"10\""
19) "\xe6\x8a\xa5\xe9\x94\x99"
20) "14"
21) "3"
192.168.1.100:6379> get 3
"3100.0"
192.168.1.100:6379> get 14
"496.0"
192.168.1.100:6379> get 4
"1442.0"
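The same check can also be done from code. The Bahir Redis connector already pulls in the Jedis client as a transitive dependency (treat that as an assumption and add Jedis to the pom explicitly if it is not on the classpath), so a tiny sketch like the one below, with host and key values taken from the output above, reads the stored totals back:

import redis.clients.jedis.Jedis

object ReadSkuTotals {
  def main(args: Array[String]): Unit = {
    // Connect to the Redis instance used by the Flink job.
    val jedis = new Jedis("192.168.1.100", 6379)

    // Read back a few of the totals written by readKafkaToRedis.
    println(s"sku 3  -> ${jedis.get("3")}")
    println(s"sku 14 -> ${jedis.get("14")}")
    println(s"sku 4  -> ${jedis.get("4")}")

    jedis.close()
  }
}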
That brings this article to a close; thanks for reading.
It has been a long, long time since the last update; I have been busy with exams. Keep going! /(ㄒoㄒ)/~~
🥇Summary
That covers everything for this installment on deploying the big-data components and building the data extraction pipelines. Thanks for the support; there are surely shortcomings and even mistakes, and corrections are very welcome. (ง •_•)ง
我非轻舟
Second installment of 2024; keep it up!
If you have good opinions or suggestions, feel free to message me and we can improve together.