1. Download the official Logstash release
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.3.2.tar.gz
2. Set up the environment for the logstash-input-jdbc plugin
# Check whether gem is available
gem
RubyGems is a sophisticated package manager for Ruby. This is a
basic help message containing pointers to more information.
  Usage:
    gem -h/--help
    gem -v/--version
    gem command [arguments...] [options...]
  Examples:
    gem install rake
    gem list --local
    gem build package.gemspec
    gem help install
  Further help:
    gem help commands            list all 'gem' commands
    gem help examples            show some examples of usage
    gem help platforms           show information about platforms
    gem help <COMMAND>           show help on COMMAND
                                 (e.g. 'gem help install')
    gem server                   present a web page at
                                 http://localhost:8808/
                                 with info about installed gems
  Further information:
    http://guides.rubygems.org
If the gem help message is displayed, RubyGems is already installed.
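The same check can be scripted instead of read off the help text. This is a minimal sketch; the helper name have_cmd is made up for illustration and is not part of RubyGems or Logstash.

```shell
# Return success if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd gem; then
    gem -v    # print the installed RubyGems version
else
    echo "gem is not installed; continue with step 3"
fi
```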
3. If gem is not installed
yum install rubygems -y
4. Change the gem source
gem sources --add https://gems.ruby-china.org/ --remove https://rubygems.org/
5. List the gem sources
gem sources -l
6. Edit the Gemfile in the Logstash directory so that its source line reads as follows
vim Gemfile
source "https://gems.ruby-china.org/"
7. Edit the Gemfile.lock in the Logstash directory so that its remote line reads as follows
vim Gemfile.lock
remote: https://gems.ruby-china.org/
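Steps 6 and 7 can also be done without opening an editor. The sketch below assumes both files still point at the default https://rubygems.org source; the function name switch_gem_source is hypothetical, and a .bak copy of each file is kept in case the substitution is not what you want.

```shell
# Replace the default RubyGems source with the mirror in Gemfile and Gemfile.lock.
# $1 is the Logstash directory.
switch_gem_source() {
    dir="$1"
    old="https://rubygems.org"
    new="https://gems.ruby-china.org"
    for f in "$dir/Gemfile" "$dir/Gemfile.lock"; do
        # Skip files that do not exist; sed keeps a .bak copy of each edited file
        [ -f "$f" ] || continue
        sed -i.bak "s#${old}/*#${new}/#g" "$f"
    done
}
```

It would be called with the Logstash directory, for example switch_gem_source /data/app/logstash-6.3.2.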
8. Install bundler with gem
gem install bundler
9. Install the logstash-input-jdbc plugin
bin/logstash-plugin install logstash-input-jdbc
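To confirm that the installation worked, the plugin list can be searched for the plugin name. A minimal sketch, assuming bin/logstash-plugin list prints one plugin name per line; the function name plugin_installed is made up for illustration.

```shell
# Succeed if plugin $2 appears in the plugin list of the Logstash directory $1.
plugin_installed() {
    "$1/bin/logstash-plugin" list | grep -q "^$2\$"
}
```

For example, plugin_installed /data/app/logstash-6.3.2 logstash-input-jdbc && echo "installed".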
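10. Upload the database driver file mysql-connector-java-5.1.46.jar to the mysql directory under the Logstash directory (the path must match jdbc_driver_library in the configuration below).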
11. Write the logstash-input-jdbc-mysql.conf file
input {
  jdbc {
    # MySQL JDBC connection settings
    jdbc_connection_string => "jdbc:mysql://localhost:3306/pgeniusdb_new"
    jdbc_user => "root"
    jdbc_password => "123456"
    # MySQL driver; use an absolute path
    jdbc_driver_library => "/data/app/logstash-6.3.2/mysql/mysql-connector-java-5.1.46.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
    # The SQL can be written inline (statement) or in a file (statement_filepath); use an absolute path
    statement_filepath => "/data/app/logstash-6.3.2/mysql/jdbc-news.sql"
    # Cron-like schedule (minute hour day-of-month month day-of-week); here the sync runs every 10 minutes
    schedule => "*/10 * * * *"
    # When true, :sql_last_value holds the last value of tracking_column;
    # when false (the default), it holds the timestamp of the last run
    use_column_value => "true"
    # Required when use_column_value is true: the column to track.
    # It must be monotonically increasing, typically the MySQL primary key
    tracking_column => "objectId"
    # Tag events with a type so the output section can route them
    type => "news"
    # When true, the state saved at last_run_metadata_path is discarded,
    # so every run queries all records from scratch
    clean_run => "false"
    # Whether to convert column names to lowercase
    lowercase_column_names => "false"
  }
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/pgeniusdb_new"
    jdbc_user => "root"
    jdbc_password => "123456"
    jdbc_driver_library => "/data/app/logstash-6.3.2/mysql/mysql-connector-java-5.1.46.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
    statement_filepath => "/data/app/logstash-6.3.2/mysql/jdbc-news-industry.sql"
    schedule => "*/10 * * * *"
    use_column_value => "true"
    tracking_column => "objectId"
    type => "news_industry"
  }
}
# The filter section is not really needed here
filter {
  json {
    source => "message"
    remove_field => ["message"]
  }
}
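output {
  if [type] == "news" {
    elasticsearch {
      # Use the HTTP address, not the TCP transport address; this is an easy mistake to make
      hosts => ["localhost:8607"]
      # Index name
      index => "index_news"
      # Field references are case-sensitive; the name must match the column name as emitted by the jdbc input
      document_id => "%{objectid}"
      # Type name
      document_type => "news"
    }
  }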
  if [type] == "news_industry" {
    elasticsearch {
      hosts => ["localhost:8607"]
      # Separate index for the second table; the name index_news_industry is only an example
      index => "index_news_industry"
      document_id => "%{objectid}"
      document_type => "news"
    }
  }
  stdout {
    codec => json_lines
  }
}
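When syncing several tables, write each one to its own index in the output rather than distinguishing them by document_type: in Elasticsearch 6.0 an index can contain only one type, and indexing a second type raises an error.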
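The configuration points statement_filepath at jdbc-news.sql, but the contents of that file are not shown. A minimal sketch of an incremental query follows, assuming a table named news (the table name is hypothetical) with the objectId column used as tracking_column; :sql_last_value is the placeholder the jdbc input replaces with the last tracked value.

```shell
# Write a hypothetical statement file to the path given as $1.
# The table name `news` is an assumption; objectId matches tracking_column.
write_statement_file() {
    cat > "$1" <<'EOF'
SELECT * FROM news
WHERE objectId > :sql_last_value
ORDER BY objectId ASC
EOF
}
```

For example, write_statement_file /data/app/logstash-6.3.2/mysql/jdbc-news.sql. With this query each scheduled run fetches only rows whose objectId is greater than the value recorded by the previous run.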
12. Run the sync
bin/logstash -f logstash-input-jdbc-mysql.conf
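Two quick checks can help around this step: validating the configuration before starting, and counting the documents in the index afterwards. The sketch below wraps both in helper functions whose names (check_config, count_docs) are made up for illustration; it assumes Elasticsearch answers HTTP on localhost:8607 as configured in the output section.

```shell
# Parse the configuration file $2 with the Logstash in directory $1 and exit
# without starting the pipeline.
check_config() {
    "$1/bin/logstash" -f "$2" --config.test_and_exit
}

# Ask Elasticsearch how many documents the index $1 currently holds.
count_docs() {
    curl -s "http://localhost:8607/$1/_count?pretty"
}
```

For example, check_config /data/app/logstash-6.3.2 logstash-input-jdbc-mysql.conf before the first run, and count_docs index_news after a scheduled run has completed.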
Pitfalls
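1. In the configuration file, adding a type setting under the jdbc input adds a type field to every document in that index. So the SQL query should not return a column named type; if the table has one, alias it to another name with AS, otherwise the [type] conditionals in the output will misbehave.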
2. When syncing multiple tables to Elasticsearch 6.0 or later, be sure to use a separate index for each table.
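The first pitfall can be avoided in the statement file itself. A minimal sketch, assuming a hypothetical news table that has its own type column; the title column and the alias news_type are also made up for illustration.

```shell
# Print a hypothetical statement in which the table's own `type` column is
# renamed, so it cannot clash with the type set in the jdbc input.
aliased_statement() {
    cat <<'EOF'
SELECT objectId, title, type AS news_type
FROM news
WHERE objectId > :sql_last_value
EOF
}
```

Redirecting the output, for example aliased_statement > jdbc-news.sql, produces a statement file whose results carry news_type instead of type.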