配置文件如下:
input {
jdbc {
jdbc_driver_library => "../lib/orcl-lib/ojdbc6-11.2.0.3.jar"
jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
jdbc_connection_string => "jdbc:oracle:thin:@192.168.200.155:1521:orcl"
jdbc_paging_enabled => "true"
jdbc_page_size => "50000"
jdbc_user => "xxxxx"
jdbc_password => "xxxxx"
schedule => "*/3 * * * *"
statement => "select * from FINANCE_BOOKKEEPING_RECORD WHERE DATA_SYNC_TIME > :sql_last_value + 0 + INTERVAL '8' HOUR"
use_column_value => "false"
tracking_column_type => "timestamp"
last_run_metadata_path => "../config/conf.d/finance_bookkeeping_record_last_id"
clean_run => "false"
}
}
filter {
mutate {
convert => { "price" => "float_eu"}
convert => { "amount" => "float_eu"}
}
}
output {
elasticsearch {
hosts => ["192.168.200.237:9200","192.168.200.238:9200"]
index => "yky_es_db_finance_bookkeeping_record"
document_id => "%{id}"
}
}
需要注意的几个地方:
①最好是先在ES上创建好索引。默认的索引模板未应用中文分词器,如果不想创建索引,可以参考以下的索引模板:
####(适用于ES6.8.0,如果是ES7.4.0,需要去掉"_default_"):
{
"order": 0,
"index_patterns": ["yky_es_db_*"],
"settings": {
"index": {
"refresh_interval": "30s"
}
},
"mappings": {
"_default_": {
"dynamic_templates": [
{
"string_fields": {
"mapping": {
"type": "text",
"analyzer": "ik_max_word",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"match_mapping_type": "string",
"match": "*"
}
}
],
"properties": {
"@timestamp": {
"type": "date"
},
"@version": {
"type": "keyword"
}
}
}
}
}
② last_run_metadata_path => "../config/conf.d/finance_bookkeeping_record_last_id",报错上次更新的时间的文件路劲。
tracking_column_type => "timestamp"表示该文件中保存的内容类型。
③clean_run => "false",设为增量同步,true表示全量同步。
④logstash时区问题:默认是UTC,比北京时间所属的东8区早8个小时,我是在sql中解决该问题的:
select * from FINANCE_BOOKKEEPING_RECORD WHERE DATA_SYNC_TIME > :sql_last_value + 0 + INTERVAL '8' HOUR
:sql_last_value + 0 + INTERVAL '8' HOUR:
:sql_last_value + 0 是把TIMESTAMP 转为date类型,+ INTERVAL '8' HOUR是将默认UTC变为北京时间所在的东8区。
TIMESTAMP '2020-03-16 08:03:00.072000 +00:00' + 0 + INTERVAL '8' HOUR