My first post, so it's a bit rough.
I've been using Flink for some data analysis recently; below are the installation steps.
- Three servers in total:
Server 1: JDK, Kafka, Zookeeper, Flink, ES
Server 2: JDK, Kafka, Zookeeper, Flink, ES, Redis
Server 3: JDK, Flume, Kafka, Zookeeper, Flink (master), ES
- Upload each archive to the corresponding server per the layout above;
- Extract all the archives
- Disable the firewall
Check the firewall state: firewall-cmd --state
Stop it: systemctl stop firewalld
Check the service status: systemctl status firewalld
Disable it at boot: systemctl disable firewalld
- Configure server hostnames (map them to your actual IPs; wherever the component configs below reference a hostname, substitute your actual one)
vi /etc/hosts ##add one entry per server, in the format shown below
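The entries look like this, one line per server; the soc-3/soc-4/soc-5 hostnames match the ones used in the configs below, but the IPs here are placeholders, so substitute your real addresses:
192.168.1.3 soc-3 ##placeholder IP, use your own
192.168.1.4 soc-4
192.168.1.5 soc-5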
- Configure the JDK environment variables
Edit the /etc/profile file
vi /etc/profile ##append the following 3 lines
export SOC_JAVA=/soc/jdk1.8.0_211/bin/java
export JAVA_HOME=/soc/jdk1.8.0_211
export PATH=$JAVA_HOME/bin:$PATH
##after adding them, save and exit, then run the following command
source /etc/profile
##once that's done, you can verify the install with java -version
- Give everything under /soc/ 775 permissions: chmod -R 775 /soc/
- Install ES
- Grant permissions on the elasticsearch directory:
chmod -R 777 elasticsearch
- Adjust the elasticsearch configuration:
(1) Startup may warn: max file descriptors [4096] for elasticsearch process likely too low, consider increasing to at least [65536]
vi /etc/security/limits.conf and append the following 4 lines
* soft nofile 65536
* hard nofile 131072
* soft nproc 2048
* hard nproc 4096
(2) vi /etc/sysctl.conf
vm.max_map_count=655360
##after adding it, run sysctl -p
(3) vi /soc/elasticsearch/config/jvm.options, find the heap settings and change them to:
-Xms32g
-Xmx32g
(4) Create an elsearch user for elasticsearch (ES refuses to start as root):
useradd -d /soc/elasticsearch -m elsearch
(5) vi /soc/elasticsearch/config/elasticsearch.yml
a). Find cluster.name and set it to:
cluster.name: LAB
b). Find node.name and set it to:
node.name: MD-1
c). Find path.logs and path.data and set them to:
path.logs: /soc/elasticsearch/logs
path.data: /soc/elasticsearch/data
d). Find network.host and set it to the local server's IP/hostname:
network.host: soc-4
network.publish_host: soc-4
e). discovery.zen.ping.unicast.hosts: ["127.0.0.1","127.0.0.2","127.0.0.3"] (replace with the actual addresses of all three nodes)
f).discovery.zen.minimum_master_nodes: 2
(6) Start es: (the process name in jps is Elasticsearch)
su - elsearch
cd /soc/elasticsearch/bin
./elasticsearch &
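Once all three nodes are up, a quick way to confirm the cluster formed (hostname per the network.host setting above):
curl http://soc-4:9200 ##should report cluster_name LAB
curl "http://soc-4:9200/_cat/nodes?v" ##should list all three nodes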
- Install zookeeper:
- Edit /soc/zookeeper/conf/zoo.cfg
(1) Find dataDir and dataLogDir and change them to:
dataDir=/soc/zookeeper/data
dataLogDir=/soc/zookeeper/logs
(2) Find, or append at the end of the file:
server.1=soc-3:2888:3888
server.2=soc-4:2888:3888
server.3=soc-5:2888:3888
- Edit /soc/zookeeper/data/myid, creating it if it does not exist
The file contains a single number: 1 (matching the server.N lines set in zoo.cfg above; the machine listed as server.1 gets 1, and 2 and 3 likewise)
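For example, on the machine configured as server.1 (soc-3 in the zoo.cfg above):
echo 1 > /soc/zookeeper/data/myid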
- Start zookeeper: (the process name in jps is QuorumPeerMain)
nohup /soc/zookeeper/bin/zkServer.sh start >/dev/null 2>&1 &
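After starting all three nodes, you can confirm the quorum is healthy; one node should report Mode: leader and the other two Mode: follower:
/soc/zookeeper/bin/zkServer.sh status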
- Install kafka:
- Edit /soc/kafka/config/server.properties
(1) Find broker.id and set it (it must be unique per broker, e.g. 1, 2, 3 across the three servers):
broker.id=1
(2) Find zookeeper.connect and point it at the zookeeper ensemble (multiple addresses comma-separated):
zookeeper.connect=soc-3:2181,soc-4:2181,soc-5:2181
- Edit /soc/kafka/config/zookeeper.properties
(1) Find dataDir and change it to:
dataDir=/soc/kafka/data/zookeeper
- Start kafka: (the process name in jps is Kafka)
nohup /soc/kafka/bin/kafka-server-start.sh /soc/kafka/config/server.properties >/dev/null 2>&1 &
- Create the kafka topic (multiple zookeeper addresses comma-separated):
/soc/kafka/bin/kafka-topics.sh --create --replication-factor 3 --partitions 6 --topic md01 --zookeeper soc-3:2181,soc-4:2181,soc-5:2181
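To smoke-test the topic, list it and then push a message through with the console producer/consumer (same older --zookeeper-style CLI as above; ports assume the defaults):
/soc/kafka/bin/kafka-topics.sh --list --zookeeper soc-3:2181
/soc/kafka/bin/kafka-console-producer.sh --broker-list soc-3:9092 --topic md01
/soc/kafka/bin/kafka-console-consumer.sh --bootstrap-server soc-3:9092 --topic md01 --from-beginning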
- Install flume:
- Copy /soc/flume/conf/flume-conf.properties to flume-conf-log.properties
Replace its contents with the following; the addresses in a1.sinks.k1.brokerList are the kafka brokers (comma-separated), and a1.sinks.k1.topic is the syslog topic created in kafka.
a1.sources=r1
a1.sinks=k1
a1.channels=c1
#Describe/configure the source
a1.sources.r1.type=syslogudp
a1.sources.r1.channels=c1
a1.sources.r1.host=soc-3
a1.sources.r1.port=514
a1.sinks.k1.type=org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic=md01
a1.sinks.k1.brokerList=127.0.0.1:9092,127.0.0.2:9092,127.0.0.3:9092
a1.sinks.k1.batchSize=2
a1.sinks.k1.requiredAcks=1
a1.sinks.k1.channel=c1
a1.channels.c1.type=memory
a1.channels.c1.capacity=1000000
a1.channels.c1.transactionCapacity=100000
- Copy /soc/flume/conf/flume-conf.properties to flume-conf-flow.properties
Replace its contents with the following; the addresses in a2.sinks.k2.brokerList are the kafka brokers (comma-separated), and a2.sinks.k2.topic is the flow topic created in kafka.
a2.sources=r2 r3 r4 r5
a2.sinks=k2
a2.channels=c2
a2.sources.r2.type=org.apache.flume.source.FlowSource
a2.sources.r2.channels=c2
a2.sources.r2.host=0.0.0.0
a2.sources.r2.port=9996
a2.sources.r2.ip=127.0.0.1
a2.sources.r2.rate=192.168.140.1-1-1,10.2.1.169-1-1
a2.sources.r2.bufferSize=102400
a2.sources.r3.type=org.apache.flume.source.FlowSource
a2.sources.r3.channels=c2
a2.sources.r3.host=0.0.0.0
a2.sources.r3.port=9995
a2.sources.r3.ip=127.0.0.1
a2.sources.r3.rate=192.168.140.1-1-1,10.2.1.169-1-1
a2.sources.r3.bufferSize=102400
a2.sources.r4.type=org.apache.flume.source.FlowSource
a2.sources.r4.channels=c2
a2.sources.r4.host=0.0.0.0
a2.sources.r4.port=9991
a2.sources.r4.ip=127.0.0.1
a2.sources.r4.rate=192.168.140.1-1-1,10.2.1.169-1-1
a2.sources.r4.bufferSize=102400
a2.sources.r5.type=org.apache.flume.source.FlowSource
a2.sources.r5.channels=c2
a2.sources.r5.host=0.0.0.0
a2.sources.r5.port=6343
a2.sources.r5.ip=127.0.0.1
a2.sources.r5.rate=192.168.140.1-1-1,10.2.1.169-1-1
a2.sources.r5.bufferSize=102400
a2.sinks.k2.type=org.apache.flume.sink.kafka.KafkaSink
a2.sinks.k2.topic=flow01
a2.sinks.k2.brokerList=soc-3:9092,soc-4:9092,soc-2:9092
a2.sinks.k2.batchSize=10
a2.sinks.k2.requiredAcks=1
a2.sinks.k2.channel=c2
a2.channels.c2.type=memory
a2.channels.c2.capacity=10000000
a2.channels.c2.transactionCapacity=10000
- Edit /soc/flume/conf/log4j.properties
Find flume.root.logger, flume.log.dir, and flume.log.file and change them to:
flume.root.logger=INFO,LOGFILE
flume.log.dir=/soc/flume/logs
flume.log.file=flume.log
- Edit /soc/flume/bin/flume-ng
Find JAVA_OPTS and change it to:
JAVA_OPTS="-Xmx2048m"
- Start flume: (the process name in jps is Application)
nohup /soc/flume/bin/flume-ng agent -n a1 -c /soc/flume/conf -f /soc/flume/conf/flume-conf-log.properties >/dev/null 2>&1 &
nohup /soc/flume/bin/flume-ng agent -n a2 -c /soc/flume/conf -f /soc/flume/conf/flume-conf-flow.properties >/dev/null 2>&1 &
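To check the syslog agent end to end, you can fire a UDP test line at the source and watch it come out of the kafka topic; the logger flags below are from util-linux (-d for UDP datagram, -n/-P for server and port), so adjust if your distro's logger differs:
logger -d -n soc-3 -P 514 "flume syslog test"
/soc/kafka/bin/kafka-console-consumer.sh --bootstrap-server soc-3:9092 --topic md01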
- Install flink
- Edit /soc/flink/conf/masters and add the flink master server's hostname and port:
SOC-2:8081
- Edit /soc/flink/conf/slaves and add the flink worker hostnames:
SOC-2
SOC-3
SOC-4
- Edit /soc/flink/conf/flink-conf.yaml and set the JobManager address and default parallelism:
jobmanager.rpc.address: SOC-2
taskmanager.numberOfTaskSlots: 4 # changed from 32 to 4
parallelism.default: 4 # changed from 16 to 4
- Start/stop flink (run on the master server)
cd /soc/flink/bin
./start-cluster.sh
./stop-cluster.sh
Visit the web UI at http://10.176.62.42:8081/
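As a final smoke test, you can submit the WordCount example bundled with the flink distribution and list running jobs (paths assume the layout above):
cd /soc/flink
./bin/flink run examples/streaming/WordCount.jar
./bin/flink list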