Installing Logstash 7.13 on CentOS 7 and Syncing MSSQL Data to Elasticsearch with Logstash

林深时见鹿10years

Article link: https://blog.csdn.net/tt91091500/article/details/118492446

Preface

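Logstash is a free and open server-side data processing pipeline that ingests data from multiple sources, transforms it, and then sends it to your favorite "stash".

To sync the data of certain tables in a SQL Server database to Elasticsearch, we use Logstash to extract and load the data.
My virtual machine runs CentOS 7. SQL Server 2019 is installed on the physical host (Windows 10).
Installation guides for the environment and software above:
Installing CentOS 7 on Hyper-V
Installing Elasticsearch 7.8 on CentOS 7 (rpm package)
Installing SQL Server 2019 on Windows 10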

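Installing Logstash

Download page for the latest Logstash release: https://www.elastic.co/cn/downloads/logstash
Download the rpm package (this article uses logstash-7.13.2-x86_64.rpm) and save it to the /root directory on CentOS.
Install it with yum

[root@localhost conf]# cd ~
[root@localhost ~]# yum install ./logstash-7.13.2-x86_64.rpm -y

The default installation directory is /usr/share/logstash
The default configuration directory is /etc/logstash
The default log directory is /var/log/logstash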

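Connecting to SQL Server and syncing to ES

1. Prepare some test data

Create a new database named develop_db in SQL Server.
Create two tables in develop_db.dbo:
company, project
Both tables must include an id column (int, bigint, or varchar all work) and an update_time column (datetime2). The remaining columns can be designed freely.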
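
As a starting point for the test data, the sketch below prints T-SQL that creates both tables with the two required columns. The `name` column and the sample rows are assumptions added for illustration; only `id` and `update_time` come from the requirements above.

```python
# Sketch: print T-SQL that creates the two test tables.
# Assumption: the `name` column and the sample rows are illustrative only;
# the article only requires `id` and `update_time` (datetime2).

TABLES = ("company", "project")


def make_test_table_sql(table: str) -> str:
    """Return T-SQL that creates `table` and inserts two sample rows."""
    return (
        f"CREATE TABLE dbo.{table} (\n"
        f"    id INT NOT NULL PRIMARY KEY,\n"
        f"    name NVARCHAR(100) NOT NULL,      -- hypothetical extra column\n"
        f"    update_time DATETIME2 NOT NULL DEFAULT SYSDATETIME()\n"
        f");\n"
        f"INSERT INTO dbo.{table} (id, name) VALUES\n"
        f"    (1, N'{table}-1'),\n"
        f"    (2, N'{table}-2');\n"
    )


if __name__ == "__main__":
    print("USE develop_db;")
    for t in TABLES:
        print(make_test_table_sql(t))
```

The printed statements can be pasted into a query window connected to the SQL Server instance. Whenever a row changes, its `update_time` must also be updated, otherwise the incremental query configured later will not pick it up.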

2. Add two indices in ES

Use curl to create the following two indices in ES

curl --location --request PUT 'http://localhost:9200/company' \
--header 'Content-Type: application/json' \
--data '{}'

curl --location --request PUT 'http://localhost:9200/project' \
--header 'Content-Type: application/json' \
--data '{}'
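
The same two requests can be scripted, which may be handy when the setup has to be repeated. This is a minimal sketch using only the Python standard library; `ES_URL` and the helper name `build_create_index_request` are assumptions of the sketch, and the index names follow the `company` and `project` names used in the Logstash configuration.

```python
import urllib.request

ES_URL = "http://localhost:9200"  # assumed: same host/port as the curl commands


def build_create_index_request(index: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that creates `index` with default settings."""
    return urllib.request.Request(
        url=f"{ES_URL}/{index}",
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    for name in ("company", "project"):
        req = build_create_index_request(name)
        try:
            with urllib.request.urlopen(req, timeout=5) as resp:
                print(name, resp.status)
        except OSError as exc:  # HTTPError and URLError are OSError subclasses
            print(name, "failed:", exc)
```

If an index already exists, Elasticsearch rejects the request with an HTTP error, which the script prints instead of aborting.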

3. Create directories for the configuration files, dependencies, etc.

Create the following directories on CentOS

[root@localhost ~]# mkdir /opt/logstash
[root@localhost ~]# mkdir /opt/logstash/lastrun
[root@localhost ~]# mkdir /opt/logstash/lib
[root@localhost ~]# mkdir /opt/logstash/conf

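4. Download the SQL Server JDBC jar

Open the mssql-jdbc page on Maven Repository: https://mvnrepository.com/artifact/com.microsoft.sqlserver/mssql-jdbc
Download the jar that matches your Java version (this article uses mssql-jdbc-9.2.1.jre8.jar).
Place the downloaded jar in the /opt/logstash/lib directory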

5. Configuration file

Create a configuration file named sqlserver2es.conf in the /opt/logstash/conf directory

[root@localhost ~]# vi /opt/logstash/conf/sqlserver2es.conf

input {
    jdbc {
        # JDBC data source settings
        jdbc_driver_library => "/opt/logstash/lib/mssql-jdbc-9.2.1.jre8.jar"
        jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
        jdbc_connection_string => "jdbc:sqlserver://192.168.255.1:1433;DatabaseName=develop_db;"
        jdbc_user => "sa"
        jdbc_password => "123456"
        # Time zone of the database timestamps (important)
        jdbc_default_timezone => "Asia/Shanghai"
        # Enable paging
        jdbc_paging_enabled => true
        # 1000 rows per page
        jdbc_page_size => 1000
        # Sync schedule in cron syntax: minute hour day-of-month month day-of-week ("* * * * *" runs every minute)
        schedule => "* * * * *"
        # Query statement
        statement => "SELECT * FROM company WHERE update_time > :sql_last_value "
        # When true, the value of tracking_column is used as :sql_last_value. When false, :sql_last_value reflects the time of the last query run
        use_column_value => true
        # Type of the tracking column. Only "numeric" and "timestamp" are supported
        tracking_column_type => "timestamp"
        # Name of the tracking column
        tracking_column => "update_time"
        # The value of :sql_last_value is persisted in this file
        last_run_metadata_path => "/opt/logstash/lastrun/.company_last_run"
        # Whether to discard the record in .company_last_run; if true, every run queries all rows from the beginning
        clean_run => false
        # Whether to convert column names to lowercase; false keeps the original case
        lowercase_column_names => false
        # type is used in the output section to select the target index
        type => "company"
    }
}

input {
    jdbc {
        # JDBC data source settings
        jdbc_driver_library => "/opt/logstash/lib/mssql-jdbc-9.2.1.jre8.jar"
        jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
        jdbc_connection_string => "jdbc:sqlserver://192.168.255.1:1433;DatabaseName=develop_db;"
        jdbc_user => "sa"
        jdbc_password => "123456"
        # Time zone of the database timestamps (important)
        jdbc_default_timezone => "Asia/Shanghai"
        # Enable paging
        jdbc_paging_enabled => true
        # 1000 rows per page
        jdbc_page_size => 1000
        # Sync schedule in cron syntax: minute hour day-of-month month day-of-week ("* * * * *" runs every minute)
        schedule => "* * * * *"
        # Query statement
        statement => "SELECT * FROM project WHERE update_time > :sql_last_value "
        # When true, the value of tracking_column is used as :sql_last_value. When false, :sql_last_value reflects the time of the last query run
        use_column_value => true
        # Type of the tracking column. Only "numeric" and "timestamp" are supported
        tracking_column_type => "timestamp"
        # Name of the tracking column
        tracking_column => "update_time"
        # The value of :sql_last_value is persisted in this file
        last_run_metadata_path => "/opt/logstash/lastrun/.project_last_run"
        # Whether to discard the record in .project_last_run; if true, every run queries all rows from the beginning
        clean_run => false
        # Whether to convert column names to lowercase; false keeps the original case
        lowercase_column_names => false
        # type is used in the output section to select the target index
        type => "project"
    }
}

output {
    if [type] == "company" {
        elasticsearch {
            # Host of the target Elasticsearch instance
            hosts => "127.0.0.1:9200"
            # Name of the target Elasticsearch index
            index => "company"
            # Document ID (similar to a database primary key)
            document_id => "%{id}"
        }
    } else if [type] == "project" {
        elasticsearch {
            # Host of the target Elasticsearch instance
            hosts => "127.0.0.1:9200"
            # Name of the target Elasticsearch index
            index => "project"
            # Document ID (similar to a database primary key)
            document_id => "%{id}"
        }
    }
    stdout {
        # Print events as JSON lines
        codec => json_lines
    }
}
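
To make the incremental logic of this configuration easier to follow, here is a small simulation of what each scheduled run does conceptually. It is a sketch under stated assumptions, not how Logstash is implemented: an in-memory SQLite table stands in for SQL Server, a Python dict stands in for the Elasticsearch index, and the starting marker value is chosen for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here; only the query shape matters.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT, update_time TEXT)")

index = {}                           # stands in for the ES index: document id -> document
last_value = "1970-01-01 00:00:00"   # early starting marker (assumption for this sketch)


def run_once() -> int:
    """One scheduled run: fetch rows newer than last_value, upsert them, advance the marker."""
    global last_value
    rows = conn.execute(
        "SELECT id, name, update_time FROM company WHERE update_time > ? ORDER BY update_time",
        (last_value,),
    ).fetchall()
    for row_id, name, update_time in rows:
        # Same idea as document_id => "%{id}": the same id overwrites the old document.
        index[str(row_id)] = {"id": row_id, "name": name, "update_time": update_time}
        last_value = max(last_value, update_time)
    return len(rows)


conn.execute("INSERT INTO company VALUES (1, 'A', '2021-07-05 10:00:00')")
conn.execute("INSERT INTO company VALUES (2, 'B', '2021-07-05 10:01:00')")
first = run_once()      # both rows are new

conn.execute("UPDATE company SET name = 'A2', update_time = '2021-07-05 10:05:00' WHERE id = 1")
second = run_once()     # only the updated row is fetched; it overwrites document "1"
third = run_once()      # nothing changed since the last run
```

The sketch shows why both settings matter together: the tracking column keeps each run small, and the fixed document ID turns repeated syncs of the same row into updates rather than duplicates. Rows whose `update_time` is not changed on update are never picked up again.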

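After saving the file, start Logstash from the command line

[root@localhost conf]# /usr/share/logstash/bin/logstash -f /opt/logstash/conf

If the console prints no errors, startup succeeded. Wait a minute, then use curl to check the data in the company index in ES

[root@localhost ~]# curl -X POST  http://localhost:9200/company/_search

The response shows that the data has been synced to ES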
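
The raw `_search` response is verbose, so a short helper that pulls out the hit count and document IDs can make the check quicker. The sketch below assumes the Elasticsearch 7.x response shape, where `hits.total` is an object with a `value` field; the sample response is hypothetical and trimmed down, used only to exercise the function.

```python
import json


def summarize_search(response_text: str):
    """Return (total hit count, list of document ids) from an ES 7.x _search response body."""
    body = json.loads(response_text)
    hits = body["hits"]
    total = hits["total"]["value"]      # ES 7.x reports total as {"value": n, "relation": ...}
    ids = [hit["_id"] for hit in hits["hits"]]
    return total, ids


# Hypothetical, trimmed-down response used only to exercise the function.
sample = json.dumps({
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_index": "company", "_id": "1", "_source": {"id": 1}},
            {"_index": "company", "_id": "2", "_source": {"id": 2}},
        ],
    }
})
print(summarize_search(sample))
```

In practice the output of the curl command would be passed to `summarize_search` in place of `sample`. The IDs should match the `id` values of the rows in SQL Server.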

6. Running Logstash as a service

Edit the startup.options file

[root@localhost ~]# vi /etc/logstash/startup.options

Find the line LS_OPTS="--path.settings ${LS_SETTINGS_DIR}", change it to LS_OPTS="--path.settings ${LS_SETTINGS_DIR} -f /opt/logstash/conf", and save

################################################################################
# These settings are ONLY used by $LS_HOME/bin/system-install to create a custom
# startup script for Logstash and is not used by Logstash itself. It should
# automagically use the init system (systemd, upstart, sysv, etc.) that your
# Linux distribution uses.
#
# After changing anything here, you need to re-run $LS_HOME/bin/system-install
# as root to push the changes to the init script.
################################################################################

# Override Java location
#JAVACMD=/usr/bin/java

# Set a home directory
LS_HOME=/usr/share/logstash

# logstash settings directory, the path which contains logstash.yml
LS_SETTINGS_DIR=/etc/logstash

# Arguments to pass to logstash
LS_OPTS="--path.settings ${LS_SETTINGS_DIR}  -f /opt/logstash/conf"

# Arguments to pass to java
LS_JAVA_OPTS=""

# pidfiles aren't used the same way for upstart and systemd; this is for sysv users.
LS_PIDFILE=/var/run/logstash.pid

# user and group id to be invoked as
LS_USER=logstash
LS_GROUP=logstash

# Enable GC logging by uncommenting the appropriate lines in the GC logging
# section in jvm.options
LS_GC_LOG_FILE=/var/log/logstash/gc.log

# Open file limit
LS_OPEN_FILES=16384

# Nice level
LS_NICE=19

# Change these to have the init script named and described differently
# This is useful when running multiple instances of Logstash on the same
# physical box or vm
SERVICE_NAME="logstash"
SERVICE_DESCRIPTION="logstash"

# If you need to run a command or script before launching Logstash, put it
# between the lines beginning with `read` and `EOM`, and uncomment those lines.
###
## read -r -d '' PRESTART << EOM
## EOM
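
The manual edit of the LS_OPTS line can also be scripted, which avoids typing mistakes in the option string. This is a minimal sketch: the helper name `add_conf_path` is an assumption, and it works on the file contents as a string rather than touching /etc/logstash/startup.options directly.

```python
import re

CONF_FLAG = "-f /opt/logstash/conf"


def add_conf_path(options_text: str) -> str:
    """Append the -f flag to the LS_OPTS line unless one is already present."""
    def patch(match):
        value = match.group(1).rstrip()
        if "-f " in value:
            return match.group(0)        # already configured, leave untouched
        return f'LS_OPTS="{value} {CONF_FLAG}"'

    return re.sub(r'^LS_OPTS="([^"]*)"$', patch, options_text, flags=re.MULTILINE)


original = 'LS_HOME=/usr/share/logstash\nLS_OPTS="--path.settings ${LS_SETTINGS_DIR}"\n'
patched = add_conf_path(original)
print(patched)
```

Applying the function twice leaves the text unchanged, so it is safe to re-run. Reading and writing the real file requires root privileges.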

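As the comment at the top of the file says, re-run $LS_HOME/bin/system-install as root after editing so that the change reaches the init script. The service runs as the logstash user (LS_USER above), so that user needs read access to /opt/logstash/conf and /opt/logstash/lib and write access to /opt/logstash/lastrun, all of which were created as root. Then start Logstash with systemctl

[root@localhost conf]# systemctl start  logstash.service

View the log

[root@localhost conf]# tail -100f /var/log/logstash/logstash-plain.log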
