The previous article showed how to sync data from SQL Server to Elasticsearch with Logstash, but Logstash can do more than that: as a data collection tool it supports many kinds of inputs and outputs. If two different databases are used as Logstash's input and output respectively, then with the right configuration you can sync data between them.
Environment

| Software | Version |
|---|---|
| Operating system | Windows 10 Professional |
| Container engine | Docker Desktop for Windows |
| Container GUI | Kitematic |
| Command line | PowerShell |
Prerequisites
A Logstash pipeline has three stages: Inputs, Filters, and Outputs.
(1) Inputs: fetch data from a source; common plugins include file, syslog, redis, beats, and jdbc.
(2) Filters: process the data (format conversion, field derivation, etc.); common plugins include grok, mutate, drop, clone, and geoip.
(3) Outputs: write the data out; common plugins include elasticsearch, file, graphite, and statsd.
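A minimal pipeline wiring the three stages together might look like this (the filter stage is optional; the field name here is illustrative):

```
input { stdin { } }
filter {
  mutate { add_field => { "source" => "console" } }
}
output { stdout { codec => rubydebug } }
```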
1 Install logstash-input-jdbc and logstash-output-jdbc
Create a Dockerfile with the following content:
FROM elastic/logstash:6.8.2
MAINTAINER myname<mymail@mail.com>
#install the input plugin
RUN logstash-plugin install logstash-input-jdbc
#install the output-elasticsearch plugin
RUN logstash-plugin install logstash-output-elasticsearch
#install the output-jdbc plugin
RUN logstash-plugin install logstash-output-jdbc
Open PowerShell, build the image, then create and start the container.
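Assuming the Dockerfile above is saved in the current directory, the build and run steps look roughly as follows (the image tag, container name, and mounted path are placeholders, not values from the original article):

```
# build a custom image containing the jdbc plugins
docker build -t logstash-jdbc:6.8.2 .
# start a container, mounting a local config directory that holds
# logstash.conf, the JDBC driver jar, and query.sql
docker run -d --name logstash-sync -v ${PWD}/config:/usr/share/logstash/config logstash-jdbc:6.8.2
```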
2 Modify logstash.conf
Configure the input and output:
input {
jdbc {
jdbc_driver_library => "/usr/share/logstash/config/mssql-jdbc-7.0.0.jre8.jar"
jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
jdbc_connection_string => "jdbc:sqlserver://127.0.0.1:1433;DatabaseName=SourceDB;"
jdbc_user => "admin"
jdbc_password => "123456"
# schedule => minute hour day-of-month month day-of-week; this runs once per minute
schedule => "* * * * *"
jdbc_paging_enabled => true
jdbc_page_size => 100
clean_run => false
use_column_value => true
#column tracked for incremental sync
tracking_column => "Id"
#whether to record the last-run value
record_last_run => true
#file where the last-run value is stored
last_run_metadata_path => "/usr/share/logstash/config/data.txt"
#whether to lowercase column names
lowercase_column_names => false
#path to the SQL query script
statement_filepath => "/usr/share/logstash/config/query.sql"
#event type, used as the index type
type => "my-info"
}
}
output {
jdbc {
driver_jar_path => "/usr/share/logstash/config/mssql-jdbc-7.0.0.jre8.jar"
driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
connection_string => "jdbc:sqlserver://127.0.0.1:1433;DatabaseName=TargetDB;User=admin;Password=123456;"
#statement_filepath => "/usr/share/logstash/config/sync.sql"
statement => [ "insert into SystemLogs(ControllerName,ActionName,TerminalIp,TerminalName,BrowserInfo,CustomData) values(?,?,?,?,?,?)","ControllerName","ActionName","TerminalIp","TerminalName","BrowserInfo","CustomData" ]
}
}
Restart the container.
Note: logstash-input-jdbc and logstash-output-jdbc are two different plugins, and their connection-string option names and formats differ.
The query script (query.sql) is:
SELECT Id,BrowserInfo,ClientIpAddress as TerminalIp,ClientName as TerminalName,CustomData,MethodName as ActionName,ServiceName as ControllerName FROM SystemLogs where Id > :sql_last_value
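The incremental query above relies on tracking_column and :sql_last_value: on each scheduled run, Logstash substitutes the stored last value into the query and fetches only newer rows. A rough Python sketch of that bookkeeping, using an in-memory SQLite table named SystemLogs (an illustration of the idea, not Logstash internals):

```python
import sqlite3

# in-memory stand-in for the source database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SystemLogs (Id INTEGER PRIMARY KEY, CustomData TEXT)")
conn.executemany("INSERT INTO SystemLogs (Id, CustomData) VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])

def fetch_new_rows(last_value):
    # mirrors: SELECT ... FROM SystemLogs WHERE Id > :sql_last_value
    rows = conn.execute(
        "SELECT Id, CustomData FROM SystemLogs WHERE Id > ?", (last_value,)
    ).fetchall()
    # like last_run_metadata_path: remember the highest Id seen so far
    return rows, (rows[-1][0] if rows else last_value)

rows, last = fetch_new_rows(0)      # first run: all rows
rows2, last = fetch_new_rows(last)  # second run: nothing new
```

Because the last value is persisted (in Logstash's case to data.txt), rows are not re-sent after a restart unless clean_run is enabled.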
3 logstash-output-jdbc connection configuration for various databases
3.1 CockroachDB
input
{
stdin { }
}
output {
jdbc {
driver_jar_path => '/opt/postgresql-9.4.1209.jre6.jar'
connection_test => false
connection_string => 'jdbc:postgresql://127.0.0.1:26257/test?user=root'
statement => [ "INSERT INTO log (host, timestamp, message) VALUES(?, CAST (? AS timestamp), ?)", "host", "@timestamp", "message" ]
}
}
3.2 MariaDB
input
{
stdin { }
}
output {
jdbc {
connection_string => "jdbc:mariadb://HOSTNAME/DATABASE?user=USER&password=PASSWORD"
statement => [ "INSERT INTO log (host, timestamp, message) VALUES(?, ?, ?)", "host", "@timestamp", "message" ]
}
}
3.3 MySQL
input
{
stdin { }
}
output {
jdbc {
driver_class => "com.mysql.jdbc.Driver"
connection_string => "jdbc:mysql://HOSTNAME/DATABASE?user=USER&password=PASSWORD"
statement => [ "INSERT INTO log (host, timestamp, message) VALUES(?, CAST(? AS timestamp), ?)", "host", "@timestamp", "message" ]
}
}
3.4 Oracle
input
{
stdin { }
}
output {
jdbc {
connection_string => "jdbc:oracle:thin:USER/PASS@HOST:PORT:SID"
statement => [ "INSERT INTO log (host, timestamp, message) VALUES(?, CAST (? AS timestamp), ?)", "host", "@timestamp", "message" ]
}
}
3.5 SQLServer
input
{
stdin { }
}
output {
jdbc {
driver_jar_path => '/opt/sqljdbc42.jar'
connection_string => "jdbc:sqlserver://server:1433;databaseName=databasename;user=username;password=password"
statement => [ "INSERT INTO log (host, timestamp, message, comment) VALUES(?, ?, ?, 'static string')", "host", "@timestamp", "message" ]
}
}
3.6 SQLite3
input
{
stdin { }
}
output {
stdout { }
jdbc {
driver_class => "org.sqlite.JDBC"
connection_string => 'jdbc:sqlite:test.db'
statement => [ "INSERT INTO log (host, timestamp, message) VALUES(?, ?, ?)", "host", "@timestamp", "message" ]
}
}
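In every example above, the statement array pairs one prepared statement with the event fields that are bound, in order, to its ? placeholders. A sketch of that mapping in Python with the sqlite3 driver (the log table and event fields mirror the examples above; this illustrates the pattern, not the plugin's implementation):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (host TEXT, timestamp TEXT, message TEXT)")

# a Logstash event is essentially a dict of fields
event = {"host": "web01", "@timestamp": "2019-08-01T00:00:00Z", "message": "hello"}

# statement => [ "INSERT ...", "host", "@timestamp", "message" ]
statement = "INSERT INTO log (host, timestamp, message) VALUES(?, ?, ?)"
fields = ["host", "@timestamp", "message"]

# bind each listed field to the matching ? placeholder, in order
conn.execute(statement, [event[f] for f in fields])
row = conn.execute("SELECT host, message FROM log").fetchone()
```

This is also why the SQL Server example can mix in a literal ('static string') in the VALUES list: only the ? placeholders consume field names from the array.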