Importing Multiple MySQL Tables into ES (Elasticsearch) with Logstash

Latest recommended article published 2023-06-08 15:04:52

XC_Aaron

Tags: Logstash MySql

Article link: https://blog.csdn.net/cd420928908/article/details/93846478

This is a short record of how I imported MySQL data into ES. If you want background on Logstash or instructions for installing it, see:
Logstash official documentation:
https://www.elastic.co/guide/en/logstash-versioned-plugins/current/index.html
Logstash installation: https://www.cnblogs.com/dyh004/p/9638675.html
https://www.cnblogs.com/cjsblog/p/9459781.html

Now to the main topic:
The most important part is the configuration in the XXX.conf file that you will run.
1. Create mysql.conf and place it at /usr/share/logstash/bin/config-mysql/mysql.conf
2. Its contents:

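input {
    stdin {
    }
    jdbc {
      # Event type, used in the output section to pick the index
      type => "chatroomInfo"
      # Database connection string (note the "?" before the URL parameters)
      jdbc_connection_string => "jdbc:mysql://192.168.20.520:3306/saas?useUnicode=true&serverTimezone=CTT"
      # Username and password
      jdbc_user => "XXX"
      jdbc_password => "XXXXXX"
      # Location of the jar (wherever you downloaded the MySQL driver)
      jdbc_driver_library => "/home/logstash/logstash/logstash-mysql/mysql/mysql-connector-java-8.0.16.jar"
      # MySQL driver class
      jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
      jdbc_paging_enabled => "true"
      jdbc_page_size => "50000"
      # SQL file to execute
      statement_filepath => "/home/logstash/logstash/config-mysql/chatroom.sql"
      #statement => "SELECT * from tb_chatroom_msg WHERE MSG_TIME >= :sql_last_value"
      # Cron-style schedule; "* * * * *" runs once a minute
      schedule => "* * * * *"
      # Keep column names as the database returns them (no lowercasing)
      lowercase_column_names => false
      # Record the state of the last run
      record_last_run => true
      # Track a column value rather than the time of the last run
      use_column_value => true
      # Name of the tracking column
      tracking_column => "IndexInfo"
      # Type of the tracking column
      #tracking_column_type => "timestamp"
      # Where the last-run metadata is saved
      last_run_metadata_path => "/home/logstash/logstash/config-mysql/chatroomTime"
      # Whether to discard the saved last-run state on startup
      clean_run => false
    }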
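    jdbc {
      # Event type, used in the output section to pick the index
      type => "pubiccontentInfo"
      # Database connection string
      jdbc_connection_string => "jdbc:mysql://192.168.20.520:3306/saas?useUnicode=true&serverTimezone=CTT"
      # Username and password
      jdbc_user => "XXX"
      jdbc_password => "XXXXXX"
      # Location of the jar
      jdbc_driver_library => "/home/logstash/logstash/logstash-mysql/mysql/mysql-connector-java-8.0.16.jar"
      # MySQL driver class
      jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
      jdbc_paging_enabled => "true"
      jdbc_page_size => "50000"
      # SQL file to execute
      statement_filepath => "/home/logstash/logstash/config-mysql/public.sql"
      #statement => "SELECT * from tb_public_content WHERE PUBLISH_TIME >= :sql_last_value"
      # Cron-style schedule; "* * * * *" runs once a minute
      schedule => "* * * * *"
      # Keep column names as the database returns them (no lowercasing)
      lowercase_column_names => false
      # Record the state of the last run
      record_last_run => true
      # Track a column value rather than the time of the last run
      use_column_value => true
      # Name of the tracking column
      tracking_column => "IndexInfo"
      # Type of the tracking column
      #tracking_column_type => "timestamp"
      # Where the last-run metadata is saved
      last_run_metadata_path => "/home/logstash/logstash/config-mysql/pulicTime"
      # Whether to discard the saved last-run state on startup
      clean_run => false
    }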
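    jdbc {
      # Event type, used in the output section to pick the index
      type => "taskInfo"
      # Database connection string
      jdbc_connection_string => "jdbc:mysql://192.168.20.520:3306/saas?useUnicode=true&serverTimezone=CTT"
      # Username and password
      jdbc_user => "XXX"
      jdbc_password => "XXXXXX"
      # Location of the jar
      jdbc_driver_library => "/home/logstash/logstash/logstash-mysql/mysql/mysql-connector-java-8.0.16.jar"
      # MySQL driver class
      jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
      jdbc_paging_enabled => "true"
      jdbc_page_size => "50000"
      # SQL file to execute
      statement_filepath => "/home/logstash/logstash/config-mysql/task.sql"
      # SQL statement to execute (inline alternative to the file above)
      #statement => "SELECT * from tb_nbtask_result WHERE FIRST_UPDATETIME > :sql_last_value"
      # Cron-style schedule, like a scheduled job; "* * * * *" runs once a minute
      schedule => "* * * * *"
      # Keep column names as the database returns them (no lowercasing)
      lowercase_column_names => false
      # Record the state of the last run
      record_last_run => true
      # Track a column value rather than the time of the last run
      use_column_value => true
      # Name of the tracking column
      tracking_column => "IndexInfo"
      # Type of the tracking column
      #tracking_column_type => "timestamp"
      # Where the last-run metadata is saved
      last_run_metadata_path => "/home/logstash/logstash/config-mysql/taskTime"
      # Whether to discard the saved last-run state on startup
      clean_run => false
    }
}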

output {
  if [type] == "chatroomInfo" {
    elasticsearch {
        hosts => "192.168.50.207:9200"
        # Index name
        index => "chatroom"
        # The source table needs an id column; its value becomes the document id in the index
        document_id => "%{MSG_CODE}"
    }
  }
  if [type] == "pubiccontentInfo" {
    elasticsearch {
        hosts => "192.168.50.207:9200"
        # Index name
        index => "pubiccontent"
        # The source table needs an id column; its value becomes the document id in the index
        #document_id => "%{PUBLIC_CODE}"
    }
  }
  if [type] == "taskInfo" {
    elasticsearch {
        hosts => "192.168.50.207:9200"
        # Index name
        index => "taskresult"
        # The source table needs an id column; its value becomes the document id in the index
        #document_id => "%{PUBLIC_CODE}"
    }
  }
}
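
While getting a config like this to work, it can help to see the events Logstash produces before they reach Elasticsearch. The fragment below is a minimal sketch of one common way to do that, and is not part of my original config: it adds a temporary stdout output using the rubydebug codec. Merge the stdout block into the existing output section while debugging, and remove it afterwards.

```
output {
  # Debugging aid (an addition, not in the original config):
  # print every event to the console in a readable form,
  # so you can check the type field and the column names.
  stdout {
    codec => rubydebug
  }
}
```

Because the config sets lowercase_column_names => false, the printed field names should match the column names in the SQL, which is what references such as %{MSG_CODE} rely on.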

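Notes:

There are two ways to supply the SQL to execute: read it from a SQL file (statement_filepath), or write it directly in the config (statement).
Add serverTimezone=CTT to the database connection string. Without it, time fields can come out shifted by several hours (5-6 hours or so in my case). Keep in mind that UTC is 8 hours behind our local time.
schedule sets how often the query runs, using cron syntax. schedule => "* * * * *" runs it once a minute; if no schedule is given, the statement runs only once.
There are two ways to import data: full and incremental (the rows the database produced between the end of your last import and the start of this run). For an incremental import you must pick a column to track. Two types are supported, numeric and timestamp, and the default is numeric; if you track a time column you must add tracking_column_type => "timestamp". Then set the path where the tracked value is saved (last_run_metadata_path). Note: the tracking column should be one that only increases, such as the MySQL primary key id or the row's insertion time.
If you choose a time column as the tracking column, the value Logstash records can be off, so new rows are not imported promptly. This is because Logstash works in UTC throughout. You can work around the problem as follows:
Example task.sql: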

SELECT NBTASK_ID,PEOPLE_CODE,MSG_ID,MSG_FLAG,MSG_TEXT,MSG_PIC,MSG_VEDIO,
TASK_EXETIME,TASK_EXEC_NUMBER,MSG_OTHER,PEOPLE_HEAD_PIC,PEOPLE_NICKNAME,
PEOPLE_SIGNATURE,COLLECT_ACCOUNT,DISTANCE_FROM_EQUIP,EQUIP_SITE_X,EQUIP_SITE_Y,
MSG_SEND_TIME,EQUIP_CODE,PEOPLE_SEX,
FIRST_UPDATETIME,LAST_UPDATETIME,FIRST_UPDATETIME+'' as IndexInfo
from tb_nbtask_result
WHERE FIRST_UPDATETIME > :sql_last_value
ORDER BY FIRST_UPDATETIME

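This is the SQL in the task.sql file to be executed. I use FIRST_UPDATETIME as the tracking column: the expression FIRST_UPDATETIME+'' as IndexInfo converts the time value so that Logstash does not treat it as a timestamp, and tracking that alias gives the near-real-time updates I needed. Adapt this to your own requirements. (Strictly speaking, + is arithmetic in MySQL, so assuming FIRST_UPDATETIME is a DATETIME column the result is a number of the form YYYYMMDDhhmmss rather than a true string, which fits the default numeric tracking type.) I select FIRST_UPDATETIME twice; the copy without an alias is kept for later range queries. With this approach, the line
tracking_column_type => "timestamp" has to stay commented out.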
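
For comparison, here is a sketch of a different way to deal with the UTC offset, which I have not used in the setup above: track the time column directly as a timestamp and tell the jdbc input which timezone the database values are in through its jdbc_default_timezone option. The value "Asia/Shanghai", the inline statement, and the metadata file name taskTimestamp are assumptions chosen for illustration; the other settings are copied from the config above. How this interacts with serverTimezone in the connection string is something to check in your own environment.

```
input {
  jdbc {
    # Connection settings as in the main config (placeholders)
    jdbc_connection_string => "jdbc:mysql://192.168.20.520:3306/saas?useUnicode=true&serverTimezone=CTT"
    jdbc_user => "XXX"
    jdbc_password => "XXXXXX"
    jdbc_driver_library => "/home/logstash/logstash/logstash-mysql/mysql/mysql-connector-java-8.0.16.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    # Assumption: the database stores local China time
    jdbc_default_timezone => "Asia/Shanghai"
    # Inline statement without the +'' alias; the time column is tracked as-is
    statement => "SELECT * FROM tb_nbtask_result WHERE FIRST_UPDATETIME > :sql_last_value ORDER BY FIRST_UPDATETIME"
    schedule => "* * * * *"
    # Keep the upper-case column name so tracking_column matches it
    lowercase_column_names => false
    record_last_run => true
    use_column_value => true
    tracking_column => "FIRST_UPDATETIME"
    # Required when the tracking column is a time value
    tracking_column_type => "timestamp"
    # Hypothetical file name; use a fresh file so an old numeric value is not reused
    last_run_metadata_path => "/home/logstash/logstash/config-mysql/taskTimestamp"
  }
}
```

If you try this variant, check the value written to the metadata file after the first run against the newest FIRST_UPDATETIME in the table before relying on it.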

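For the other parameters and their details, see the official documentation.
There are surely many other ways to solve these problems; comments and discussion are welcome.
If you want to see how to integrate ES (Elasticsearch) with Spring Boot / Cloud, see:
https://blog.csdn.net/cd420928908/article/details/93606175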
