Deployment Manual for Zookeeper, Kafka, Hadoop, Spark, Elasticsearch, and Other Open-Source Components

Open-Source Component Deployment Manual

Java

  • Installation

Install the JDK with yum:

sudo yum install -y java-1.8.0-openjdk-1.8.0.262
  • Configure JAVA_HOME and related environment variables

Path: /etc/profile.d

java.sh

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-1.el7_9.x86_64
export PATH=$JAVA_HOME/bin:$PATH
  • Re-source the profile so the new variables take effect in the current shell
source /etc/profile
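
To confirm that the installation and the environment variables took effect, a quick check can be run (the exact patch version in the output depends on the package installed):

java -version    # should report openjdk version 1.8.0_262
echo $JAVA_HOME  # should print the JDK path configured above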

zookeeper

  • Requirements

Java is installed, with JAVA_HOME and related variables configured.

  • Installation

Configure a local yum repository, then install the matching ZooKeeper version from it (the repo must be specified explicitly)

yum install zookeeper-3.4.14  --enablerepo=[username]
  • Configure ZooKeeper

Go to the ZooKeeper installation directory; conf holds the configuration files. Enter it and edit the configuration:

cd /opt/zookeeper/conf/ 
sudo cp zoo_sample.cfg zoo.cfg 
sudo vim zoo.cfg 
  • Configuration file
    Path: /opt/zookeeper/conf
    zoo.cfg
# zookeeper的基础时间单位,心跳检测和最小session超时时间是tickTime的两倍
tickTime=2000
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5

# 快照存储路径
dataDir=/data/zookeeper

# WAL日志路径
dataLogDir=/data/zookeeper/datalog

# 客户端连接端口
clientPort=2181

# 允许的最大客户端连接数,可适当提升此值
maxClientCnxns=100
# 允许与客户端通信的最大session超时时间
maxSessionTimeout=120000

# 从版本3.4.0开始,zookeeper支持数据的自动清理,
# 在负载较高时会消耗资源,因此可以使用操作系统的定时任务进行定期清理

# 当开启自动清理时,dataDir和dataLogDir保存的文件数量
autopurge.snapRetainCount=10
# 自动清理的间隔时间,单位是小时,默认值为0,表示不进行自动清理
autopurge.purgeInterval=48

#zookeeper集群配置,2888是Leader和Follower交互的端口,3888是Leader Election使用的端口
server.1=hostname-1:2888:3888
server.2=hostname-2:2888:3888
server.3=hostname-3:2888:3888

Logging configuration
Path: /opt/zookeeper/conf
log4j.properties

# Define some default values that can be overridden by system properties
# 默认使用的日志输出
zookeeper.root.logger=INFO, ROLLINGFILE
zookeeper.console.threshold=INFO
zookeeper.log.dir=/data/zookeeper/logs
zookeeper.log.file=zookeeper.log
zookeeper.log.threshold=DEBUG
zookeeper.tracelog.dir=/data/zookeeper/logs
zookeeper.tracelog.file=zookeeper_trace.log

#
# ZooKeeper Logging Configuration
#

# Format is "<default threshold> (, <appender>)+

# DEFAULT: console appender only
log4j.rootLogger=${zookeeper.root.logger}

# Example with rolling log file
#log4j.rootLogger=DEBUG, CONSOLE, ROLLINGFILE

# Example with rolling log file and tracing
#log4j.rootLogger=TRACE, CONSOLE, ROLLINGFILE, TRACEFILE

#
# Log INFO level and above messages to the console
#
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=${zookeeper.console.threshold}
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n

#
# Add ROLLINGFILE to rootLogger to get log file output
#    Log DEBUG level and above messages to a log file
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=${zookeeper.log.threshold}
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/${zookeeper.log.file}

# Max log file size of 100MB
log4j.appender.ROLLINGFILE.MaxFileSize=100MB
# uncomment the next line to limit number of backup files
log4j.appender.ROLLINGFILE.MaxBackupIndex=10

log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n


#
# Add TRACEFILE to rootLogger to get log file output
#    Log DEBUG level and above messages to a log file
log4j.appender.TRACEFILE=org.apache.log4j.FileAppender
log4j.appender.TRACEFILE.Threshold=TRACE
log4j.appender.TRACEFILE.File=${zookeeper.tracelog.dir}/${zookeeper.tracelog.file}

log4j.appender.TRACEFILE.layout=org.apache.log4j.PatternLayout
### Notice we are including log4j's NDC here (%x)
log4j.appender.TRACEFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L][%x] - %m%n

JVM settings
Path: /opt/zookeeper/conf
java.env

JVMFLAGS='-Xms512M -Xmx512M'

ZooKeeper node id:
Path: /data/zookeeper
myid

Each node must have a unique id within the cluster, and the id must match the corresponding server.N entry in zoo.cfg; for example, the myid file on the first node contains 1.
  • Commands:
mkdir -p /data/zookeeper
echo 1 > /data/zookeeper/myid
  • Start ZooKeeper
# Managed via systemd:
# Start
sudo systemctl start zookeeper.service
# Check status
systemctl status zookeeper.service
# Stop
sudo systemctl stop zookeeper.service
# Enable at boot
sudo systemctl enable zookeeper.service
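
Once all nodes are running, the ensemble state can be checked. A minimal sketch, assuming the ZooKeeper scripts live under /opt/zookeeper/bin and the four-letter-word commands are enabled (the default in 3.4.x):

# Ask the local server whether it is healthy; it replies "imok"
echo ruok | nc localhost 2181
# Show whether this node is the leader or a follower
/opt/zookeeper/bin/zkServer.sh status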

kafka

  • Requirements

Java is installed, with JAVA_HOME and related environment variables configured

  • Installation

Configure a local yum repository, then install the matching Kafka version from it (the repo must be specified explicitly)

yum install kafka-2.11-1.1.0  --enablerepo=[username]
  • Configuration file
    Path: /opt/kafka/config
    server.properties
############################# Server Basics #############################

# 集群中的每个实例要配置属于自己的id,这里配置的是master节点
broker.id=1

############################# Socket Server Settings #############################

# 列出kafka监听的URI列表
listeners=PLAINTEXT://0.0.0.0:9092

# 对zookeeper发布的监听器
advertised.listeners=PLAINTEXT://172.20.3.247:9092

############################# Log Basics #############################

# log日志存储的目录,逗号分隔的多个路径,建议每个目录挂载到不同的磁盘上,能够提升读写性能,并且在1.1版本及以上可以支持故障转移
log.dirs=/data/kafka/data

# topic的默认分区数
num.partitions=3

# 在kafka重启恢复数据时使用的线程数,默认值为1
num.recovery.threads.per.data.dir=2

############################# Internal Topic Settings  #############################

# 默认的副本数量
default.replication.factor=3
# 默认的分区数量
num.partitions=3

############################# Log Flush Policy #############################

############################# Log Retention Policy #############################
# 定义数据的清理策略

# 设置数据的保存时间
log.retention.hours=168

# 限制在broker中保存的数据量,默认值为-1,不做限制
#log.retention.bytes=1073741824

############################# Zookeeper #############################

# 指定zookeeper的连接地址
zookeeper.connect=hostname-1:2181,hostname-2:2181,hostname-3:2181
# 等待zookeeper连接的超时时间
zookeeper.connection.timeout.ms=6000

############################# Group Coordinator Settings #############################

############################# User-defined Settings #############################

# 是否自动创建topic,目前没有需要自动创建topic的场景,设置为false,便于管理topic
auto.create.topics.enable=false 
# 当acks设置为all或者-1时,这个配置将规定消息写入节点的数量,如果没有达到指定的数量,生产者将会收到一个exception
min.insync.replicas=2
# 不允许不在ISR列表的副本节点选举为leader,否则可能导致数据丢失
unclean.leader.election.enable=false
# 默认值为true,会定期自动对leader partition进行重新选举,造成不必要的消耗。
auto.leader.rebalance.enable=false
# broker能够接受消息的最大字节数,默认值为1000012字节,在低于0.10.2的版本中需要对应修改消费者的fetch size,允许消费者能够拉取数据
message.max.bytes=5242880
# 要大于message.max.bytes的配置,否则broker之间无法同步数据
replica.fetch.max.bytes=10485760

Commands

Create a topic:

kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 3 --topic flume-sink
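
To verify the cluster end to end, the topic can be inspected and a test message produced and consumed with the console tools shipped with Kafka (script names match the kafka-topics.sh call above and are assumed to be on the PATH):

# Check partition leaders and the ISR of the new topic
kafka-topics.sh --describe --zookeeper localhost:2181 --topic flume-sink
# Produce a few test messages (type lines, then Ctrl+C)
kafka-console-producer.sh --broker-list localhost:9092 --topic flume-sink
# Consume them from the beginning in another terminal
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic flume-sink --from-beginning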

Logging configuration
Path: /opt/kafka/config
log4j.properties

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Unspecified loggers and loggers with additivity=true output to server.log and stdout
# Note that INFO only applies to unspecified loggers, the log level of the child logger is used otherwise
log4j.rootLogger=INFO, stdout, kafkaAppender

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.kafkaAppender.File=${kafka.logs.dir}/server.log
log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.stateChangeAppender.File=${kafka.logs.dir}/state-change.log
log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.requestAppender.File=${kafka.logs.dir}/kafka-request.log
log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.cleanerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.cleanerAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.cleanerAppender.File=${kafka.logs.dir}/log-cleaner.log
log4j.appender.cleanerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.cleanerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.controllerAppender.File=${kafka.logs.dir}/controller.log
log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.appender.authorizerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.authorizerAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.authorizerAppender.File=${kafka.logs.dir}/kafka-authorizer.log
log4j.appender.authorizerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.authorizerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Change the line below to adjust ZK client logging
log4j.logger.org.apache.zookeeper=INFO

# Change the two lines below to adjust the general broker logging level (output to server.log and stdout)
log4j.logger.kafka=INFO
log4j.logger.org.apache.kafka=INFO

# Change to DEBUG or TRACE to enable request logging
log4j.logger.kafka.request.logger=WARN, requestAppender
log4j.additivity.kafka.request.logger=false

# Uncomment the lines below and change log4j.logger.kafka.network.RequestChannel$ to TRACE for additional output
# related to the handling of requests
#log4j.logger.kafka.network.Processor=TRACE, requestAppender
#log4j.logger.kafka.server.KafkaApis=TRACE, requestAppender
#log4j.additivity.kafka.server.KafkaApis=false
log4j.logger.kafka.network.RequestChannel$=WARN, requestAppender
log4j.additivity.kafka.network.RequestChannel$=false

log4j.logger.kafka.controller=TRACE, controllerAppender
log4j.additivity.kafka.controller=false

log4j.logger.kafka.log.LogCleaner=INFO, cleanerAppender
log4j.additivity.kafka.log.LogCleaner=false

log4j.logger.state.change.logger=TRACE, stateChangeAppender
log4j.additivity.state.change.logger=false

# Access denials are logged at INFO level, change to DEBUG to also log allowed accesses
log4j.logger.kafka.authorizer.logger=INFO, authorizerAppender
log4j.additivity.kafka.authorizer.logger=false

JVM settings
Path: /opt/kafka/config
java.env

KAFKA_HEAP_OPTS="-Xms4G -Xmx4G"

Start

# Managed via systemd:
# Start
sudo systemctl start kafka.service
# Check status
systemctl status kafka.service
# Stop
sudo systemctl stop kafka.service
# Enable at boot
sudo systemctl enable kafka.service

hadoop

  • Requirements

JDK and ZooKeeper are already installed

  • Installation and deployment
  1. Create the hadoop user and set its password
$ useradd hadoop
$ passwd hadoop
  2. Create the directories Hadoop needs at runtime
$ mkdir -p /data/hadoop/data/dfs/namenode
$ mkdir -p /data/hadoop/data/dfs/datanode
$ mkdir -p /data/hadoop/data/dfs/jn
$ mkdir -p /data/hadoop/data/tmp
  3. Download and extract the Hadoop package, then rename the extracted directory
$ wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.7/hadoop-2.7.7.tar.gz
$ tar -zxvf hadoop-2.7.7.tar.gz -C /opt/
$ mv /opt/hadoop-2.7.7 /opt/hadoop
  4. Change the owner of the directories
$ chown -R hadoop /data/hadoop/
$ chown -R hadoop /opt/hadoop
  5. Configure environment variables:

Add hadoop.sh under /etc/profile.d/

$ vim /etc/profile.d/hadoop.sh

Add the following content:

export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_PID_DIR=/var/run/hadoop
export HADOOP_LOG_DIR=/data/hadoop/logs
export HADOOP_YARN_HOME=$HADOOP_HOME
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_LOG_DIR=/data/hadoop/logs
export YARN_PID_DIR=/var/run/hadoop
export HADOOP_YARN_USER=hadoop
export HADOOP_ROOT_LOGGER=DEBUG,console
export HADOOP_SECURITY_LOGGER=DEBUG,DRFAS
export HADOOP_DATANODE_OPTS="-Xms1G -Xmx1G"
export HADOOP_NAMENODE_OPTS="-Xms1G -Xmx1G"
export HADOOP_JOURNALNODE_OPTS="-Xms1G -Xmx1G"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
  6. Switch to the hadoop user
$ su hadoop

Note: from step 6 onward, all operations are performed as the hadoop user on Linux

  7. Generate an SSH key pair
$ ssh-keygen  # press Enter to accept all defaults
  8. Configure passwordless SSH between the servers

Tip: run ssh-copy-id hadoop@<hostname of the Hadoop server> once for every server on which Hadoop is deployed


$ ssh-copy-id hadoop@hostname-1
$ ssh-copy-id hadoop@hostname-2
$ ssh-copy-id hadoop@hostname-3
  9. Configure the environment variables Hadoop needs (in $HADOOP_HOME/etc/hadoop/hadoop-env.sh)
$ vim $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Add the following to hadoop-env.sh:

export JAVA_HOME=/opt/java  # set JAVA_HOME to the actual JDK path

  10. Configure the slave node hostnames (in $HADOOP_HOME/etc/hadoop/slaves)

$ vim $HADOOP_HOME/etc/hadoop/slaves

Add the following to slaves:

hostname-1 
hostname-2 
hostname-3
  11. Configure Hadoop HA (in $HADOOP_HOME/etc/hadoop/core-site.xml)
$ vim $HADOOP_HOME/etc/hadoop/core-site.xml

Replace the contents of core-site.xml with the following:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
	<!--Yarn 需要使用 fs.defaultFS 指定NameNode URI -->
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop-namenode</value>
        </property>
        
         <!--指定hadoop临时目录, hadoop.tmp.dir 是hadoop文件系统依赖的基础配置,很多路径都依赖它。如果hdfs-site.xml中不配置namenode和datanode的存放位置,默认就放在这个路径中 -->
        <property>   
            <name>hadoop.tmp.dir</name>
            <value>/data/hadoop/data/tmp</value>
        </property>

         <!-- 指定zookeeper地址 -->
        <property>
            <name>ha.zookeeper.quorum</name>
            <value>hostname-1:2181,hostname-2:2181,hostname-3:2181</value>
        </property>

      <property>
		  <name>io.compression.codecs</name>
		  <value>org.apache.hadoop.io.compress.GzipCodec,
			org.apache.hadoop.io.compress.DefaultCodec,
			org.apache.hadoop.io.compress.BZip2Codec,
			org.apache.hadoop.io.compress.SnappyCodec
		  </value>
      </property>
</configuration>
  12. Configure HDFS (in $HADOOP_HOME/etc/hadoop/hdfs-site.xml)
$ vim $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Replace the contents of hdfs-site.xml with the following:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!--HDFS超级用户组 -->
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
  </property>

  <!--指定HDFS的nameservice,需要和core-site.xml中 fs.defaultFS配置项内 hdfs://{nameservices} 的 nameservices保持一致 -->
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-namenode</value>
  </property>
  <property>
    <!--设置NameNode 节点ID -->
    <name>dfs.ha.namenodes.hadoop-namenode</name>
    <value>nn1,nn2</value>
  </property>

  <!-- HDFS HA配置:HDFS rpc 通信地址 -->
  <!-- key的格式:dfs.namenode.rpc-address.[nameservice ID].[name node ID]  -->
  <property>
    <name>dfs.namenode.rpc-address.hadoop-namenode.nn1</name>
    <value>hostname-1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-namenode.nn2</name>
    <value>hostname-2:9000</value>
  </property>

  <!-- HDFS HA配置:HDFS http 通信地址 -->
  <!-- key的格式:dfs.namenode.http-address.[nameservice ID].[name node ID] -->
  <property>
    <name>dfs.namenode.http-address.hadoop-namenode.nn1</name>
    <value>hostname-1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-namenode.nn2</name>
    <value>hostname-2:50070</value>
  </property>

  <!-- NameNode 存放fsimage本地目录 -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/hadoop/data/dfs/namenode</value>
  </property>
  <!-- DataNode 存放 block本地目录 -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hadoop/data/dfs/datanode</value>
  </property>
  <!--JournalNode存放数据本地目录 -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/data/hadoop/data/dfs/jn</value>
  </property>
  <!--HDFS 文件副本数 -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!-- 块大小128M (默认128M) -->
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>

  <!-- Namenode editlog同步: 设置JournalNode服务器地址 -->
  <!-- 格式:qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId> 端口同journalnode.rpc-address -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hostname-1:8485;hostname-2:8485;hostname-3:8485/hadoop-namenode</value>
  </property>

  <!-- 配置失败自动切换实现方式。即Client连接Namenode识别选择Active NameNode策略 -->
  <property>
    <name>dfs.client.failover.proxy.provider.hadoop-namenode</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

  <!-- Namenode fencing -->
  <!--Failover后防止停掉的Namenode启动,造成两个服务 -->
  <!--Doc: https://hadoop.apache.org/docs/r2.7.7/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#Configuring_automatic_failover -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <!-- sshfence选项通过SSH连接到目标节点,因此需要保证Hadoop服务之间需要能够ssh免密登录 -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <!--多少milliseconds 认为fencing失败 -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>

  <!-- 是否启用自动故障转移,默认为false -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!--动态许可datanode连接namenode列表 -->
   <property>
     <name>dfs.hosts</name>
     <value>/opt/hadoop/etc/hadoop</value>
   </property>
   
   <!-- 关闭HDFS权限检查 -->
   <property>
     <name>dfs.permissions</name>
     <value>false</value>
   </property>
</configuration>
  13. Configure MapReduce (in $HADOOP_HOME/etc/hadoop/mapred-site.xml)
$ vim $HADOOP_HOME/etc/hadoop/mapred-site.xml

Replace the contents of mapred-site.xml with the following:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
       Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <!-- 指定mr框架为yarn方式 -->
  <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
  </property>
  <property>
      <name>mapred.job.tracker.http.address</name>
      <value>0.0.0.0:50030</value>
  </property>
  <property>
      <name>mapred.task.tracker.http.address</name>
      <value>0.0.0.0:50060</value>
  </property>
</configuration>
  14. Configure YARN (in $HADOOP_HOME/etc/hadoop/yarn-site.xml)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- 节点最大可用内存 -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
  </property>
  <!-- 单个任务可申请最少内存,默认1024MB。如果请求的内存小于该值,那么将会将请求的内存数,更改为YARN配置的单个任务最小内存数 -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
   </property>

    <!-- 单个任务可申请的最大内存,默认为8192MB -->
    <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>4096</value>
   </property>

  <!-- nodemanager 配置 -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

  <!-- HA 配置 -->
  <!-- Resource Manager Configs -->
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- 启用嵌入式自动故障转移。在HA环境下启用,与 ZKRMStateStore 配合处理fencing -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <!-- 集群名称,确保HA选举时对应的集群 -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
    <property>
      <name>yarn.resourcemanager.hostname.rm1</name>
      <value>hostname-1</value>
    </property>
        <property>
      <name>yarn.resourcemanager.hostname.rm2</name>
      <value>hostname-2</value>
    </property>
    <!-- RM web application 地址 -->
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hostname-1:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hostname-2:8088</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
  </property>
  <!-- ZKRMStateStore 配置 -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hostname-1:2181,hostname-2:2181,hostname-3:2181</value>
  </property>

  <!-- 执行结束后,收集各个namenode的container本地的日志 -->
  <property>
     <name>yarn.log-aggregation-enable</name>
     <value>true</value>
  </property>
  <!-- 聚合日志后在hdfs的存放地址 -->
  <property>
     <name>yarn.nodemanager.remote-app-log-dir</name>
     <value>/app-logs</value>
  </property>
  <!-- log server的地址 -->
  <property>
     <name>yarn.log.server.url</name>
     <value>http://hostname-1:19888/jobhistory/logs</value>
  </property>

  <!-- 节点服务器上yarn可以使用的虚拟CPU个数,默认是8,推荐将值配置与物理核心个数相同,如果节点CPU核心不足8个,要调小这个值,yarn不会智能的去检测物理核心数 -->
   <property>
       <name>yarn.nodemanager.resource.cpu-vcores</name>
       <value>4</value>
    </property>
    
    <property>
      <name>yarn.nodemanager.pmem-check-enabled</name>
      <value>false</value>
  </property>
  <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
  </property>
</configuration>
  • Start / stop the Hadoop cluster

When starting the Hadoop cluster for the first time, run the following command on one machine in the cluster:

$ hdfs namenode -format  
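
Note that with the HA layout configured above, formatting a single NameNode is not sufficient on its own. A minimal sketch of the usual first-time sequence (run as the hadoop user; hostnames follow the configuration above):

# 1. Start the JournalNodes on hostname-1/2/3 before formatting
$ hadoop-daemon.sh start journalnode
# 2. Format HDFS on the first NameNode (hostname-1) and start it
$ hdfs namenode -format
$ hadoop-daemon.sh start namenode
# 3. On the second NameNode (hostname-2), copy the metadata from nn1
$ hdfs namenode -bootstrapStandby
# 4. Initialize the HA state in ZooKeeper (once, from either NameNode)
$ hdfs zkfc -formatZK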

Startup scripts:

Run the following on any machine in the Hadoop cluster:

$ start-dfs.sh  # start the HDFS cluster

Note: run this command on the machine that hosts the job history server:

$ mr-jobhistory-daemon.sh start historyserver # start the job history service

Note: run this command on one of the YARN ResourceManager machines:

$ start-yarn.sh # start the YARN cluster

Note: start-yarn.sh only starts the ResourceManager on the machine where it is run. If YARN is configured for HA, run the following on the other ResourceManager machine:

$ yarn-daemon.sh start resourcemanager # start the ResourceManager service

Shutdown scripts:

Note: run this command on every machine that runs a YARN ResourceManager:

$ yarn-daemon.sh stop resourcemanager # stop the ResourceManager service

Note: run this command on the machine that hosts the job history server:

$ mr-jobhistory-daemon.sh stop historyserver # stop the job history service

Run the following on any machine in the Hadoop cluster:

$ stop-yarn.sh # stop the YARN services
$ stop-dfs.sh # stop the HDFS services
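
After startup, cluster health can be spot-checked with the standard admin commands; the web UIs are at hostname-1:50070 (HDFS) and hostname-1:8088 (YARN) per the configuration above:

$ hdfs dfsadmin -report                # DataNode status and capacity
$ hdfs haadmin -getServiceState nn1    # prints active or standby
$ yarn node -list                      # registered NodeManagers
$ yarn rmadmin -getServiceState rm1    # ResourceManager HA state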

spark

  • Requirements

JDK is already installed

  • Installation and deployment

1. Create the required directories and change their owner to hadoop

mkdir /data/spark
mkdir /data/spark/logs 
mkdir /data/spark/tmpdir 
chown -R hadoop /data/spark

2. Deploy Spark

Download Spark, extract the archive, and change the owner of the Spark directory to hadoop

wget https://www.apache.org/dyn/closer.lua/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz
tar -zxvf spark-2.4.7-bin-hadoop2.7.tgz -C /opt/
mv /opt/spark-2.4.7-bin-hadoop2.7 /opt/spark
chown -R hadoop /opt/spark

3. Configure Spark environment variables

Path: /etc/profile.d/spark.sh

$ vim /etc/profile.d/spark.sh

Add the following to spark.sh:

export PATH=/opt/spark/bin:$PATH

4. Configure Spark default parameters

Path: ${SPARK_HOME}/conf/spark-defaults.conf

$ vim ${SPARK_HOME}/conf/spark-defaults.conf

Add the following to spark-defaults.conf:

Note: the HDFS address configured in spark.eventLog.dir and spark.history.fs.logDirectory is the HDFS nameservice name, without a port number.

 #==============================================================================
 # Spark Application Properties 配置
 # http://spark.apache.org/docs/2.4.7/configuration.html#application-properties
 #==============================================================================
 # 用于存储Spark Shuffle、Cache等过程的数据. Default: /tmp
 spark.local.dir                 /data/spark/tmpdir
 
 #==============================================================================
 # Spark UI配置
 # http://spark.apache.org/docs/2.4.7/configuration.html#spark-ui
 #==============================================================================
 spark.eventLog.enabled          true
  # Spark记录日志的目录,子目录为appid. Default:file:/tmp/spark-events
 spark.eventLog.dir              hdfs://hadoop-namenode/spark/spark-logs
 
 #==============================================================================
 # Spark Monitoring配置
 # http://spark.apache.org/docs/2.4.7/monitoring.html#spark-history-server-configuration-options
 #==============================================================================
 # 读取Spark历史记录的路径. Default:file:/tmp/spark-events
 spark.history.fs.logDirectory   hdfs://hadoop-namenode/spark/spark-logs
 # history-server日志生命周期,当检查到某个日志文件的生命周期超过30d时,则会删除该日志文件
 spark.history.fs.cleaner.maxAge 30d

5. Configure Spark environment dependencies

Path: ${SPARK_HOME}/conf/spark-env.sh

$ vim ${SPARK_HOME}/conf/spark-env.sh

Add the following to spark-env.sh:

 #==============================================================================
 # Spark ON YARN配置
 # http://spark.apache.org/docs/2.4.7/running-on-yarn.html#launching-spark-on-yarn
 #==============================================================================
 # 指定Hadoop配置文件路径。用于数据写入HDFS以及YARN的ResourceManager
 export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
 export SPARK_DIST_CLASSPATH=$(/opt/hadoop/bin/hadoop classpath)
 export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
 # Spark日志目录
 export SPARK_LOG_DIR=/data/spark/logs

6. Add log4j configuration

Path: ${SPARK_HOME}/conf

(1) Add the Spark driver-side log configuration
vim $SPARK_HOME/conf/log4j-driver.properties

Add the following to log4j-driver.properties:

# 默认使用的日志输出
spark.root.logger=INFO, console, ROLLING, DAILYROLLING
spark.log.dir=/data/spark/logs
spark.log.file=log-parser-driver.log
spark.log.errorFile=log-parser-driver-error.log

# Spark driver端log4j配置
log4j.rootCategory=${spark.root.logger}

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %p [%t] %c{1}:%L - %m%n

# Set the default spark-shell log level to WARN. When running the spark-shell, the
# log level for this class is used to overwrite the root logger's log level, so that
# the user can have different defaults for the shell and regular Spark apps.
log4j.logger.org.apache.spark.repl.Main=WARN

# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark_project.jetty=WARN
log4j.logger.org.spark_project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR

log4j.appender.ROLLING=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLING.Threshold=INFO
log4j.appender.ROLLING.file=${spark.log.dir}/${spark.log.file}
log4j.appender.ROLLING.MaxBackupIndex=10
log4j.appender.ROLLING.MaxFileSize=128MB
log4j.appender.ROLLING.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLING.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %p [%t] %c{1}:%L - %m%n

log4j.appender.DAILYROLLING=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DAILYROLLING.Threshold=ERROR
log4j.appender.DAILYROLLING.file=${spark.log.dir}/${spark.log.errorFile}
log4j.appender.DAILYROLLING.DatePattern='.'yyyy-MM-dd
log4j.appender.DAILYROLLING.layout=org.apache.log4j.PatternLayout
log4j.appender.DAILYROLLING.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %p [%t] %c{1}:%L - %m%n
(2) Add the Spark executor-side log configuration
vim $SPARK_HOME/conf/log4j-executor.properties

Add the following to log4j-executor.properties:

# 默认使用的日志输出
spark.root.logger=INFO, console, ROLLING, DAILYROLLING
spark.log.dir=/data/spark/logs
spark.log.file=log-parser-executor.log
spark.log.errorFile=log-parser-executor-error.log

# Spark executor端log4j配置
log4j.rootCategory=${spark.root.logger}

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %p [%t] %c{1}:%L - %m%n

# Set the default spark-shell log level to WARN. When running the spark-shell, the
# log level for this class is used to overwrite the root logger's log level, so that
# the user can have different defaults for the shell and regular Spark apps.
log4j.logger.org.apache.spark.repl.Main=WARN

# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark_project.jetty=WARN
log4j.logger.org.spark_project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
log4j.logger.org.apache.parquet=ERROR
log4j.logger.parquet=ERROR

log4j.appender.ROLLING=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLING.Threshold=INFO
log4j.appender.ROLLING.file=${spark.log.dir}/${spark.log.file}
log4j.appender.ROLLING.MaxBackupIndex=10
log4j.appender.ROLLING.MaxFileSize=128MB
log4j.appender.ROLLING.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLING.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %p [%t] %c{1}:%L - %m%n

log4j.appender.DAILYROLLING=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DAILYROLLING.Threshold=ERROR
log4j.appender.DAILYROLLING.file=${spark.log.dir}/${spark.log.errorFile}
log4j.appender.DAILYROLLING.DatePattern='.'yyyy-MM-dd
log4j.appender.DAILYROLLING.layout=org.apache.log4j.PatternLayout
log4j.appender.DAILYROLLING.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %p [%t] %c{1}:%L - %m%n
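
Note that log4j-driver.properties and log4j-executor.properties are not picked up automatically; they have to be passed to the driver and executor JVMs when an application is submitted. A minimal sketch of one common way to wire them in on YARN (the application jar and arguments are placeholders):

spark-submit \
  --master yarn \
  --files /opt/spark/conf/log4j-executor.properties \
  --driver-java-options "-Dlog4j.configuration=file:/opt/spark/conf/log4j-driver.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j-executor.properties" \
  <application jar and arguments>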

7. Create the required directories in HDFS

# Create the directory
$ hdfs dfs -mkdir -p /spark/spark-logs
# If the Spark history server is started as the hadoop user, this step is not needed.
# If it is started as root, be sure to change the directory permissions.
$ hdfs dfs -chmod -R 777 /spark
  • Start the Spark history server

It only needs to be started on one machine.

# Start the Spark history server
$SPARK_HOME/sbin/start-history-server.sh
  • Stop the Spark history server
# Stop the Spark history server
$SPARK_HOME/sbin/stop-history-server.sh
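
A simple way to verify the whole Spark-on-YARN setup is to run the bundled SparkPi example as the hadoop user and then confirm that the finished application appears in the history server UI (port 18080 by default). The jar path below assumes the stock spark-2.4.7-bin-hadoop2.7 layout:

spark-submit --master yarn --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  /opt/spark/examples/jars/spark-examples_2.11-2.4.7.jar 100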

elasticsearch

Configuration file

Notes:

  1. Use a different cluster.name for each cluster
  2. Use a different node.name for each machine
  3. discovery.zen.ping.unicast.hosts must list all nodes of this cluster

Path: /opt/elasticsearch/config/

elasticsearch.yml

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
cluster.name: elastic-username
#
# ------------------------------------ Node ------------------------------------
#
node.name: hostname-1
node.master: true
node.data: true
#
# ----------------------------------- Paths ------------------------------------
#
path.data: /data/elasticsearch/data
path.logs: /data/elasticsearch/logs
#
# ----------------------------------- Memory -----------------------------------
#
bootstrap.memory_lock: true
#
# ---------------------------------- Network -----------------------------------
#
network.host: 0.0.0.0
http.port: 9200
transport.tcp.port: 9300
#
# --------------------------------- Discovery ----------------------------------
#
discovery.zen.ping.unicast.hosts: ["hostname-1", "hostname-2", "hostname-3"]
discovery.zen.minimum_master_nodes: 2
#
# ------------------------------------ End -------------------------------------

log4j configuration file:

Path: /opt/elasticsearch/config/

log4j.properties

status = error

# log action execution errors for easier debugging
logger.action.name = org.elasticsearch.action
logger.action.level = debug

appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] %marker%m%n

appender.rolling.type = RollingFile
appender.rolling.name = rolling
appender.rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}.log
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] %marker%.-10000m%n
appender.rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}-%d{yyyy-MM-dd}.log
appender.rolling.policies.type = Policies
appender.rolling.policies.time.type = TimeBasedTriggeringPolicy
appender.rolling.policies.time.interval = 1
appender.rolling.policies.time.modulate = true

rootLogger.level = info
rootLogger.appenderRef.rolling.ref = rolling

appender.deprecation_rolling.type = RollingFile
appender.deprecation_rolling.name = deprecation_rolling
appender.deprecation_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_deprecation.log
appender.deprecation_rolling.layout.type = PatternLayout
appender.deprecation_rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] %marker%.-10000m%n
appender.deprecation_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_deprecation-%i.log.gz
appender.deprecation_rolling.policies.type = Policies
appender.deprecation_rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.deprecation_rolling.policies.size.size = 1GB
appender.deprecation_rolling.strategy.type = DefaultRolloverStrategy
appender.deprecation_rolling.strategy.max = 4

logger.deprecation.name = org.elasticsearch.deprecation
logger.deprecation.level = warn
logger.deprecation.appenderRef.deprecation_rolling.ref = deprecation_rolling
logger.deprecation.additivity = false

appender.index_search_slowlog_rolling.type = RollingFile
appender.index_search_slowlog_rolling.name = index_search_slowlog_rolling
appender.index_search_slowlog_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_search_slowlog.log
appender.index_search_slowlog_rolling.layout.type = PatternLayout
appender.index_search_slowlog_rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c] %marker%.-10000m%n
appender.index_search_slowlog_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_search_slowlog-%d{yyyy-MM-dd}.log
appender.index_search_slowlog_rolling.policies.type = Policies
appender.index_search_slowlog_rolling.policies.time.type = TimeBasedTriggeringPolicy
appender.index_search_slowlog_rolling.policies.time.interval = 1
appender.index_search_slowlog_rolling.policies.time.modulate = true

logger.index_search_slowlog_rolling.name = index.search.slowlog
logger.index_search_slowlog_rolling.level = trace
logger.index_search_slowlog_rolling.appenderRef.index_search_slowlog_rolling.ref = index_search_slowlog_rolling
logger.index_search_slowlog_rolling.additivity = false

appender.index_indexing_slowlog_rolling.type = RollingFile
appender.index_indexing_slowlog_rolling.name = index_indexing_slowlog_rolling
appender.index_indexing_slowlog_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_indexing_slowlog.log
appender.index_indexing_slowlog_rolling.layout.type = PatternLayout
appender.index_indexing_slowlog_rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c] %marker%.-10000m%n
appender.index_indexing_slowlog_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_indexing_slowlog-%d{yyyy-MM-dd}.log
appender.index_indexing_slowlog_rolling.policies.type = Policies
appender.index_indexing_slowlog_rolling.policies.time.type = TimeBasedTriggeringPolicy
appender.index_indexing_slowlog_rolling.policies.time.interval = 1
appender.index_indexing_slowlog_rolling.policies.time.modulate = true

logger.index_indexing_slowlog.name = index.indexing.slowlog.index
logger.index_indexing_slowlog.level = trace
logger.index_indexing_slowlog.appenderRef.index_indexing_slowlog_rolling.ref = index_indexing_slowlog_rolling
logger.index_indexing_slowlog.additivity = false
  • Start the service
systemctl start elasticsearch.service 
systemctl enable elasticsearch.service
  • Stop the service
systemctl stop elasticsearch.service
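
Once all nodes are up, cluster formation can be verified over the HTTP port configured above:

# Health should report 3 nodes and status green (or yellow while replicas are still allocating)
curl http://localhost:9200/_cluster/health?pretty
# List the nodes and the elected master
curl http://localhost:9200/_cat/nodes?v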

Elastalert

1. Install Python 3 and pip

2. Install elastalert

sudo pip3 install elastalert==0.2.1

3. Install the elasticsearch-py dependency

sudo pip3 install elasticsearch==7.0.0

Note: this exact version is required, see https://github.com/Yelp/elastalert/issues/2725

4. Edit the elastalert configuration file

Configuration file path: /path/to/elastalert/config.yaml

# 规则文件地址
rules_folder: /home/elastalert/rules

# Elastalert 查询 Elasticsearch 的频率
run_every:
  seconds: 10
# 每次查询时,从当前时间往前的时间窗口
buffer_time:
  seconds: 15
# 失败重试的时间限制
alert_time_limit:
  days: 2

# Elasticsearch 地址和端口
es_host: hostname-3
es_port: 9200

# Elastalert 在 Elasticsearch 中使用的索引
writeback_index: elastalert_status
writeback_alias: elastalert_alerts

# Elastalert 使用的时间戳
timestamp_field: timestamp

5. Create the indices used by elastalert

Run the following command:

elastalert-create-index

Expected output:

New index name (Default elastalert_status)
Name of existing index to copy (Default None)
New index elastalert_status created
Done!

6. Test the elastalert configuration

Run the following command:

elastalert-test-rule --config config.yaml example_rules/example_frequency.yaml

Note: the --config parameter cannot be omitted here, see https://github.com/Yelp/elastalert/issues/2391

Expected output:

INFO:elastalert:Note: In debug mode, alerts will be logged to console but NOT actually sent.
            To send them but remain verbose, use --verbose instead.
Didn't get any results.
INFO:elastalert:Note: In debug mode, alerts will be logged to console but NOT actually sent.
                To send them but remain verbose, use --verbose instead.
1 rules loaded
INFO:apscheduler.scheduler:Adding job tentatively -- it will be properly scheduled when the scheduler starts
....

Would have written the following documents to writeback index (default is elastalert_status):

elastalert_status - {'rule_name': 'Example frequency rule', 'endtime': datetime.datetime(2020, 12, 21, 7, 35, 31, 599435, tzinfo=tzutc()), 'starttime': datetime.datetime(2020, 12, 21, 3, 33, 7, 599435, tzinfo=tzutc()), 'matches': 0, 'hits': 0, '@timestamp': datetime.datetime(2020, 12, 21, 7, 35, 33, 378891, tzinfo=tzutc()), 'time_taken': 1.7594046592712402}

7. Create the elastalert user

  1. Create the elastalert user and group; set the password to the username
  2. Grant the elastalert user sudo privileges
  3. Allow the elastalert user to log in over SSH with username/password

8. Create the directory for Elastalert rule files

Create the rules directory under the elastalert user's home directory

mkdir /home/elastalert/rules
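
Rules placed in this directory use the standard ElastAlert rule format. A minimal sketch of a frequency rule is shown below; the index pattern, threshold, and filter are placeholders to adapt to the actual data:

# /home/elastalert/rules/example_frequency.yaml
name: Example frequency rule
type: frequency
index: logstash-*
num_events: 50
timeframe:
  minutes: 5
filter:
- term:
    level: "ERROR"
# The debug alerter only writes matches to the ElastAlert log; replace it with a real alerter in production
alert:
- "debug"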

9. Add the systemd unit file

Add the following systemd unit file: /usr/lib/systemd/system/elastalert.service

[Unit]
Description=Elastalert Service

[Service]
Type=simple
User=username
Group=username

Environment=WORKON_HOME=/opt/Envs
Environment=VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
EnvironmentFile=-/usr/local/bin/virtualenvwrapper.sh
EnvironmentFile=-/opt/Envs/esalert/esalert.env
Environment=VIRTUAL_ENV=${WORKON_HOME}/esalert

WorkingDirectory=/opt/Envs/esalert

# ExecStartPre=workon esalert
ExecStart=/opt/Envs/esalert/bin/elastalert --config /opt/Envs/esalert/config.yaml --verbose
Restart=on-abort

[Install]
WantedBy=multi-user.target

10. Start elastalert

Switch to the elastalert user

Run the following command:

sudo systemctl start elastalert

Check the status with:

sudo systemctl status elastalert

A status of active (running) in the output indicates a successful start:

● elastalert.service - Elastalert Service
   Loaded: loaded (/usr/lib/systemd/system/elastalert.service; static; vendor preset: disabled)
   Active: active (running) since Mon 2020-12-21 15:46:02 CST; 7s ago
 Main PID: 4881 (elastalert)
    Tasks: 2
   Memory: 37.6M
   CGroup: /system.slice/elastalert.service
           └─4881 /bin/python3 /usr/local/bin/elastalert --config /opt/elastalert/config.yaml --verbose

mysql

Install MySQL

Copy the MySQL 5.7 bundle package to the server and extract it to the target directory

$ tar tvf mysql-5.7.32-1.el7.x86_64.rpm-bundle.tar 
-rw-r--r-- bteam/common 26460548 2020-09-25 12:48 mysql-community-client-5.7.32-1.el7.x86_64.rpm
-rw-r--r-- bteam/common   314936 2020-09-25 12:48 mysql-community-common-5.7.32-1.el7.x86_64.rpm
-rw-r--r-- bteam/common  3918236 2020-09-25 12:48 mysql-community-devel-5.7.32-1.el7.x86_64.rpm
-rw-r--r-- bteam/common 47479624 2020-09-25 12:48 mysql-community-embedded-5.7.32-1.el7.x86_64.rpm
-rw-r--r-- bteam/common 23263144 2020-09-25 12:48 mysql-community-embedded-compat-5.7.32-1.el7.x86_64.rpm
-rw-r--r-- bteam/common 130933732 2020-09-25 12:48 mysql-community-embedded-devel-5.7.32-1.el7.x86_64.rpm
-rw-r--r-- bteam/common   2457204 2020-09-25 12:48 mysql-community-libs-5.7.32-1.el7.x86_64.rpm
-rw-r--r-- bteam/common   1260336 2020-09-25 12:48 mysql-community-libs-compat-5.7.32-1.el7.x86_64.rpm
-rw-r--r-- bteam/common 181712536 2020-09-25 12:49 mysql-community-server-5.7.32-1.el7.x86_64.rpm
-rw-r--r-- bteam/common 124941892 2020-09-25 12:49 mysql-community-test-5.7.32-1.el7.x86_64.rpm

$ tar xf mysql-5.7.32-1.el7.x86_64.rpm-bundle.tar -C /target_path

Install common, libs, libs-compat, client, and server in that order. The server package depends on net-tools, which must be installed in advance. A non-root user needs sudo privileges for these steps.

$ sudo rpm -ivh mysql-community-common-5.7.32-1.el7.x86_64.rpm 
warning: mysql-community-common-5.7.32-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-common-5.7.32-1.e################################# [100%]
$ sudo rpm -ivh mysql-community-libs-5.7.32-1.el7.x86_64.rpm 
warning: mysql-community-libs-5.7.32-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-libs-5.7.32-1.el7################################# [100%]
$ sudo rpm -ivh mysql-community-libs-compat-5.7.32-1.el7.x86_64.rpm 
warning: mysql-community-libs-compat-5.7.32-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-libs-compat-5.7.3################################# [100%]
$ sudo rpm -ivh mysql-community-client-5.7.32-1.el7.x86_64.rpm 
warning: mysql-community-client-5.7.32-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-client-5.7.32-1.e################################# [100%]
$ sudo rpm -ivh mysql-community-server-5.7.32-1.el7.x86_64.rpm 
warning: mysql-community-server-5.7.32-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-server-5.7.32-1.e################################# [100%]

After installation, start the service to verify it works

$ sudo systemctl start mysqld
$ systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-11-17 14:37:29 CST; 2h 24min ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
  Process: 24025 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
  Process: 24007 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
 Main PID: 24029 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─24029 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
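
On MySQL 5.7 the RPM installation generates a temporary root password on first start; it has to be retrieved and changed before the replication accounts below can be created:

# Find the temporary root password generated at first start
$ sudo grep 'temporary password' /var/log/mysqld.log
# Log in and set a new root password (it must satisfy the default password policy)
$ mysql -u root -p
mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY 'YourNewPassw0rd!';  -- placeholder password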

Configure MySQL master-slave replication

GRANT ALL PRIVILEGES ON *.* TO 'username'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON *.* TO 'repl'@'%' IDENTIFIED BY 'LogManager';
flush privileges;

Then test from another node that you can log in with mysql -u username -p -h <server IP or hostname>; if entering the password brings up the mysql prompt, the grants work.

$ sudo vim /etc/my.cnf
...
log-bin=mysql-bin
server-id=1
...
$ sudo systemctl restart mysqld
$ sudo systemctl status mysqld

On the master node, check the master status; these values are needed when configuring the slave.

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      245 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

Configure the slave node

mysql> change master to
    -> master_host='主节点地址',
    -> master_user='用户名',
    -> master_password='密码',
    -> master_port=3306,     
    -> master_log_file='mysql-bin.000001',
    -> master_log_pos=245;
Query OK, 0 rows affected, 2 warnings (0.50 sec)
mysql> start slave;
Query OK, 0 rows affected (0.03 sec)
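
Replication health can then be confirmed on the slave; both threads should report Yes and Seconds_Behind_Master should stay low:

mysql> show slave status\G
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...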

Note: if the master is not empty, take a backup and restore it on the slave before setting up replication.

flume

  • Requirements

OpenJDK is already deployed on the server, with JAVA_HOME and related environment variables configured

  • Installation

Upload the package and extract it to the target directory

tar zxf apache-flume-1.7.0.tgz -C /opt/
mv /opt/flume-1.7.0 /opt/flume
  • Configuration file
    Path: /opt/flume/conf
    flume-conf.properties
# example.conf: A single-node Flume configuration

# Name the components on this agent a1
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# ===================r1:tail dir recursive source1=================
# 定义使用的channel管道
a1.sources.r1.channels = c1
# 定义使用的组件类型
a1.sources.r1.type = pi.dev.flume.source.TaildirRecursiveSource 
# 空格分隔的文件组列表,每个分组代表一系列的文件
a1.sources.r1.filegroups = fg
# 文件组的绝对路径,支持文件名的正则表达式
a1.sources.r1.filegroups.fg = /opt/flume/test/tailDirRecursiveSource/.*.log
# 定义是否使用递归方式读取文件
a1.sources.r1.custom.recursive.read = true
# 为了方便清理测试数据
# 以json格式记录读取文件的inode和对应文件的最后读取位置
a1.sources.r1.positionFile = /opt/flume/test/tailDirRecursiveSource/taildir_position.json

# ==========================k1:kafka sink========================
# 定义使用的channel管道
a1.sinks.k1.channel = c1
# 定义使用的组件类型
a1.sinks.k1.type = pi.dev.flume.sink.KafkaLightSink
# 定义连接的kafka broker的列表,建议使用两个作为高可用,以逗号隔开
a1.sinks.k1.kafka.bootstrap.servers = localhost:9092
# 定义向kafka发送信息的topic
a1.sinks.k1.kafka.topic = flume-sink
# 等待ISR列表中所有的副本完成同步后才算发送成功
a1.sinks.k1.kafka.producer.acks = all

# ========================c1:file channel=========================
# 定义管道类型
a1.channels.c1.type = file
# 为了方便清理测试数据
# checkpoint文件的存储位置
a1.channels.c1.checkpointDir = /opt/flume/test/checkpointDir
# 用逗号分隔的存储文件的目录,使用不同磁盘上的不同目录可以提升性能
a1.channels.c1.dataDirs = /opt/flume/test/fileChannel

Logging configuration
Path: /opt/flume/conf
log4j.properties

# Define some default values that can be overridden by system properties.
#
# For testing, it may also be convenient to specify
# -Dflume.root.logger=DEBUG,console when launching flume.

#flume.root.logger=DEBUG,console
# 默认使用的日志输出
flume.root.logger=INFO,LOGFILE
flume.log.dir=/data/flume/logs
flume.log.file=flume.log

log4j.logger.org.apache.flume.lifecycle = INFO
log4j.logger.org.jboss = WARN
log4j.logger.org.mortbay = INFO
log4j.logger.org.apache.avro.ipc.NettyTransceiver = WARN
log4j.logger.org.apache.hadoop = INFO
log4j.logger.org.apache.hadoop.hive = ERROR

# Define the root logger to the system property "flume.root.logger".
log4j.rootLogger=${flume.root.logger}


# Stock log4j rolling file appender
# Default log rotation configuration
log4j.appender.LOGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.LOGFILE.MaxFileSize=100MB
log4j.appender.LOGFILE.MaxBackupIndex=10
log4j.appender.LOGFILE.File=${flume.log.dir}/${flume.log.file}
log4j.appender.LOGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.LOGFILE.layout.ConversionPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%t] (%C.%M:%L) %x - %m%n


# Warning: If you enable the following appender it will fill up your disk if you don't have a cleanup job!
# This uses the updated rolling file appender from log4j-extras that supports a reliable time-based rolling policy.
# See http://logging.apache.org/log4j/companions/extras/apidocs/org/apache/log4j/rolling/TimeBasedRollingPolicy.html
# Add "DAILY" to flume.root.logger above if you want to use this
log4j.appender.DAILY=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.DAILY.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.DAILY.rollingPolicy.ActiveFileName=${flume.log.dir}/${flume.log.file}
log4j.appender.DAILY.rollingPolicy.FileNamePattern=${flume.log.dir}/${flume.log.file}.%d{yyyy-MM-dd}
log4j.appender.DAILY.layout=org.apache.log4j.PatternLayout
log4j.appender.DAILY.layout.ConversionPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%t] (%C.%M:%L) %x - %m%n


# console
# Add "console" to flume.root.logger above if you want to use this
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d (%t) [%p - %l] %m%n

JVM settings
Path: /opt/flume/conf
flume-env.sh

export JAVA_OPTS="-Xms512M -Xmx512M -Dcom.sun.management.jmxremote"

Startup script
Path: /opt/flume/start.sh

nohup bin/flume-ng agent -n a1 -c conf -f conf/flume-conf.properties >/dev/null 2>&1 &

Note: the startup script must be run from the Flume root directory; otherwise log4j.properties does not take effect
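
With the test configuration above, the pipeline can be exercised end to end by appending a line to a file under the tailed directory and reading it back from the Kafka topic (the file name is arbitrary as long as it matches the .*.log pattern; kafka-console-consumer.sh is assumed to be on the PATH):

# Generate a test log line under the tailed directory
echo "hello flume $(date)" >> /opt/flume/test/tailDirRecursiveSource/app.log
# Confirm it arrives in Kafka
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic flume-sink --from-beginning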

logstash

Requirements

OpenJDK is already deployed on the server, with JAVA_HOME and related environment variables configured

Configuration files

Path: /opt/logstash/config
logstash.yml
Description: settings for the Logstash process itself

# ------------  Node identity ------------
# Use a descriptive name for the node:
# 节点的名字
node.name: logmanager-filebeat
#
# ------------ Data path ------------------

# logstash及插件持久化数据的目录
path.data: /data/logstash/data/
#
# ------------ Pipeline Settings --------------
# Set the pipeline event ordering. Options are "auto" (the default), "true" or "false".
# "auto" will  automatically enable ordering if the 'pipeline.workers' setting
# is also set to '1'.
# "true" will enforce ordering on the pipeline and prevent logstash from starting
# if there are multiple workers.
# "false" will disable any extra processing necessary for preserving ordering.
#
pipeline.ordered: auto
#
# ------------ Pipeline Configuration Settings --------------
#
# ------------ Queuing Settings --------------
#
# Internal queuing model, "memory" for legacy in-memory based queuing and
# "persisted" for disk-based acked queueing. Defaults is memory
# 用持久化在磁盘上的队列存储数据
queue.type: persisted
#
#
# If using queue.type: persisted, the maximum number of written events before forcing a checkpoint
# Default is 1024, 0 for unlimited
# checkpoint之前最大的数据量,设置为1能够使每一条写入落盘,但是会增加开销
queue.checkpoint.writes: 256
#
#
# log.level: info
# logstash日志存储的位置
path.logs: /data/logstash/logs
#
# ------------ Other Settings --------------
#
# Flag to output log lines of each pipeline in its separate log file. Each log filename contains the pipeline.name
# Default is false
# 每个pipline都在单独的日志文件中
pipeline.separate_logs: true
#
# ------------ X-Pack Settings (not applicable for OSS build)--------------

jvm.options
Description: JVM parameters for Logstash

## JVM configuration

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
# 设置jvm初始堆大小和最大堆大小
-Xms1g
-Xmx1g

################################################################
## Expert settings
################################################################
# 设置jvm使用的gc类型
## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly

## Locale
# Set the locale language
#-Duser.language=en

## basic

# set the I/O temp directory
#-Djava.io.tmpdir=$HOME

# set to headless, just in case
-Djava.awt.headless=true

# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8

# use our provided JNA always versus the system one
#-Djna.nosys=true

# Turn on JRuby invokedynamic
-Djruby.compile.invokedynamic=true
# Force Compilation
-Djruby.jit.threshold=0
# Make sure joni regexp interruptability is enabled
-Djruby.regexp.interruptible=true

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps
# ensure the directory exists and has sufficient space
#-XX:HeapDumpPath=${LOGSTASH_HOME}/heapdump.hprof

## GC logging
#-XX:+PrintGCDetails
#-XX:+PrintGCTimeStamps
#-XX:+PrintGCDateStamps
#-XX:+PrintClassHistogram
#-XX:+PrintTenuringDistribution
#-XX:+PrintGCApplicationStoppedTime

# log GC status to a file with time stamps
# ensure the directory exists
#-Xloggc:${LS_GC_LOG_FILE}

# Entropy source for randomness
-Djava.security.egd=file:/dev/urandom

# Copy the logging context from parent threads to children
-Dlog4j2.isThreadContextMapInheritable=true

pipelines.yml
Description: the pipelines loaded by Logstash

# List of pipelines to be loaded by Logstash

# Example of two pipelines:
# 设置管道
- pipeline.id: logmanager-filebeat
  path.config: "/opt/logstash/config/conf.d/filebeat.yml"

filebeat.yml
Description: the pipeline configuration referenced above

input {
  beats {
    port => 5044
  }
}

filter {

  mutate {
      add_field => { "log" => "%{log}" }
  }

  json {
    source => "@log"
  }

  mutate {
      add_field => { "@file" => "%{file}" }
      add_field => { "@host" => "%{host}" }
  }

  json {
    source => "@file"
  }

  json {
    source => "@host"
  }

  mutate {
    remove_field => [ "tags", "input", "agent", "@version", "ecs", "log", "@log", "file", "@file", "host", "@host" ]
  }

  mutate {
    rename => { "path" => "file_path" }
    rename => { "name" => "hostname" }
  }

}

output {
  kafka {
    id => "logmanager-filebeat" # 设置id,方便后续logstash的监控api定位
    codec => json
    topic_id => "logmanager-filebeat"
    bootstrap_servers => "hostname-1:9092,hostname-2:9092,hostname-3:9092"
    acks => "all"
    compression_type => "lz4"
    max_request_size => 5242880
  }
}
Verification

The pipeline configuration can be validated with:

bin/logstash -f config/conf.d/filebeat.yml --config.test_and_exit
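
Once Logstash is running, the pipeline id set above (logmanager-filebeat) can be used to check event throughput through the monitoring API on its default port 9600:

curl -XGET 'http://localhost:9600/_node/stats/pipelines/logmanager-filebeat?pretty'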

filebeat

Deployment steps

Extract the package

In FILEBEAT_HOME (the Filebeat installation directory),
edit the configuration file filebeat.yml

filebeat:
  inputs:
    - type: log
      enabled: true
      paths:
        - D:\logs\test.log*  
      encoding: utf-8 
      scan_frequency: 1s
      recursive_glob.enabled: true
      backoff: 1s # 当读到文件末尾时,检查文件是否有新内容的间隔时间
      close_inactive: 10m 
      close_renamed: false
      close_removed: true 
      clean_inactive: 0
      clean_removed: true
      fields:
        log_identifier: hostname:arvin##ip:127.0.0.1
        rule_name: test
        store_name: test
      fields_under_root: true
      tail_files: true # 读取新文件时,会在文件的末尾开始读取
      max_bytes: 102400 # 单条日志的最大值100KB
  
output.logstash:
  hosts: [172.20.3.248:5044] # 配置对应的logstash的ip和端口
  bulk_max_size: 512 # 一个单独的logstash请求中事件的最大数量
  slow_start: true # 如果启用,每个事务只传输一批事件中的事件子集。如果没有发生错误,事件的数量将会增长到bulk_max_size.如果发生错误,将会减少。
  
# 日志
logging.level: info
logging.to_file: true
logging.files: 
  name: filebeat.log
  keepfiles: 7
  permissions: 0644

# 内存队列
queue.mem:
  events: 512
  flush.min_events: 512
  flush.timeout: 1s
  
# 设置同时执行的cpu数量
max_procs: 1

# filebeat关闭前等待的时间,可以发送内存中的数据并且接受相应写入registry中
filebeat.shutdown_timeout: 5s

Create the Windows service filebeat
Open cmd.exe as administrator and run:

sc create filebeat binPath= "FILEBEAT_HOME\filebeat.exe -c FILEBEAT_HOME\filebeat.yml" start= delayed-auto
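
The configuration can be checked and the service started from the same elevated cmd.exe session; test config and test output are built-in Filebeat subcommands:

FILEBEAT_HOME\filebeat.exe test config -c FILEBEAT_HOME\filebeat.yml
FILEBEAT_HOME\filebeat.exe test output -c FILEBEAT_HOME\filebeat.yml
sc start filebeat
sc query filebeat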

Logmanager-API

mysql
Version: 5.7
Default port: 3306
Create database: itoa
	Character set: utf8mb4
	Collation: utf8mb4_0900_ai_ci
	useSSL: false
	useUnicode: true
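
A sketch of the corresponding DDL follows; note that the utf8mb4_0900_ai_ci collation listed above only exists in MySQL 8.0, so on the MySQL 5.7 server installed earlier a utf8mb4 collation such as utf8mb4_general_ci would be used instead:

mysql> CREATE DATABASE itoa CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;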

eureka (used as the registry)
Latest version
Port: 8761
An Elasticsearch index needs to be created (it may not be needed later; for now it is used for the project deployment environment)
Index name: xxl-job-admin
