Syncing Oracle Data to Elasticsearch with Logstash (Full and Incremental Sync)

I. Required tools

  1. ES server: elasticsearch-7.6.1 (I'm using version 7.6.1 here).
  2. ES data browsing tool: elasticsearch-head-master. Needs a Node.js environment and isn't very friendly, but it's tiny and starts quickly.
  3. ES data browsing tool: kibana-7.6.1-windows-x86_64. A very powerful tool; fairly large, so it starts a bit more slowly.
  4. ES data sync tool: logstash-7.6.1
  5. All of the above can be downloaded for free from the official site: link
  6. Netdisk with all the ES tools: https://pan.baidu.com/s/1yh7yYb0alsMQBlHU4E1nag   extraction code: 4n52

II. Configuration

  1. Open D:\Environment\elasticsearch\elasticsearch-7.6.1\config\elasticsearch.yml and add these two lines:
http.cors.enabled: true   # enable cross-origin access
http.cors.allow-origin: "*"   # origins allowed for cross-origin access

Once configured, double-click elasticsearch.bat to start the server. If startup succeeds, http://127.0.0.1:9200 will respond in the browser.
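If you'd rather verify from code than the browser, here is a minimal sketch using the 7.6.1 Java high-level REST client (the same client the service code later in this post uses); the class name is mine, and the host/port are the defaults from this setup:

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class EsPingCheck {
	public static void main(String[] args) throws Exception {
		// connect to the local node started above (default HTTP port 9200)
		try (RestHighLevelClient client = new RestHighLevelClient(
				RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")))) {
			// ping() returns true when the cluster responds
			System.out.println("ES reachable: " + client.ping(RequestOptions.DEFAULT));
		}
	}
}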

2. Logstash configuration
First, place the logstash-7.6.1 folder under D:\Environment\elasticsearch, then go to
D:\Environment\elasticsearch\logstash-7.6.1\config and create two files: logstash_alldata.conf (full sync) and logstash_updatedata.conf (incremental sync).
If you place the files anywhere else, every related path inside logstash_alldata.conf must be updated accordingly, and the paths in logstash_updatedata.conf adjusted to match.

logstash_alldata.conf (full sync) configuration:

input{
    jdbc{
        # path to the JDBC driver jar
        jdbc_driver_library => "D:/Environment/elasticsearch/ojdbc8-19.3.0.0.jar"
        # JDBC driver class
        jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
        # database connection string
        jdbc_connection_string => "jdbc:oracle:thin:@192.168.0.115:1521:jzkj"
        # database user
        jdbc_user => "jzkjx"
        # database password
        jdbc_password => "123456"

        # number of reconnect attempts
        connection_retry_attempts => "3"

        # validate the connection before use (default false)
        jdbc_validate_connection => "true"

        # connection validation timeout (default 3600s)
        jdbc_validation_timeout => "3600"

        # enable paged queries (default false)
        jdbc_paging_enabled => "true"

        # rows per page (default 100000; with many columns or frequent updates, lower this)
        jdbc_page_size => "10000"

        # statement holds the query SQL; for complex SQL, point statement_filepath at a SQL file instead
        statement_filepath => "D:/Environment/elasticsearch/queryall_archives_record.sql"

        # sql_last_value is a built-in variable holding the tracking_column value of the last row from the previous run, here the ID
        #statement => "SELECT * FROM EA.ARCHIVES_RECORD WHERE ID > :sql_last_value"

        # lowercase column names? default true (set to false if the data is serialized/deserialized elsewhere)
        lowercase_column_names => false

        # record the last run? true saves the last tracking_column value to the file at last_run_metadata_path
        #record_last_run => true

        # set true to track a result column's value; otherwise tracking_column defaults to the timestamp
        # i.e. whether a column value is used as the incremental marker
        #use_column_value => true

        # data type of the tracked column, numeric or timestamp only (default numeric)
        #tracking_column_type => numeric

        # the column to track for incremental sync; must be a database column
        #tracking_column => "ID"

        # where to store the last-run metadata
        #last_run_metadata_path => "D:/Environment/elasticsearch/last_recordid.txt"

        # clear the last_run_metadata_path record? must be false for incremental sync
        #clean_run => false

        # sync schedule (minute hour day month weekday); the default is once per minute
        schedule => "* * * * *"

        # avoid garbled Chinese text
        codec => plain { charset => "UTF-8"}

        # event type, used in the output section to route to the right ES index
        type => "archivesrecord"
    }

    # to sync more tables, copy the jdbc{} block above and adjust the relevant settings
}

filter {
  # remove fields we don't need
  mutate {
          remove_field => "@timestamp"
          remove_field => "@version"
  }
}
 
output{
    #elasticsearch{
        # ES address; for a cluster use an array: hosts => ["192.168.1.1:9200", "192.168.1.2:9200"]
    #   hosts => "127.0.0.1:9200"
        # index name
    #   index => "es_archives"
        # unique document id (best mapped to the table's unique ID)
    #   document_id => "%{ID}"
    #}

    # the pattern for syncing multiple tables
    if [type] == "archivesrecord" {
        elasticsearch{
            # ES address; for a cluster use an array: hosts => ["192.168.1.1:9200", "192.168.1.2:9200"]
            hosts => "127.0.0.1:9200"
            # index name
            index => "es_archives"
            # unique document id (best mapped to the table's unique ID)
            document_id => "%{ID}"
        }
    }
    if [type] == "archivesfile" {
        elasticsearch{
            hosts => "127.0.0.1:9200"
            index => "es_file"
            document_id => "%{ID}"
        }
    }
    if [type] == "archivesscanfile" {
        elasticsearch{
            hosts => "127.0.0.1:9200"
            index => "es_scanfile"
            document_id => "%{ID}"
        }
    }

    stdout{
        # JSON-lines output to the console
        codec => "json_lines"
    }
}
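Once the full-sync pipeline has run, one way to sanity-check it is to count the documents in the target index and compare against the table's row count. A minimal sketch with the same high-level client (connection details as above; the index name es_archives comes from the output block):

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.core.CountRequest;

public class FullSyncCheck {
	public static void main(String[] args) throws Exception {
		try (RestHighLevelClient client = new RestHighLevelClient(
				RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")))) {
			// count everything Logstash wrote into es_archives
			long count = client.count(new CountRequest("es_archives"), RequestOptions.DEFAULT).getCount();
			System.out.println("docs in es_archives: " + count);
		}
	}
}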

logstash_updatedata.conf (incremental sync) configuration:


input{
    jdbc{
        # path to the JDBC driver jar
        jdbc_driver_library => "D:/Environment/elasticsearch/ojdbc8-19.3.0.0.jar"
        # JDBC driver class
        jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
        # database connection string
        jdbc_connection_string => "jdbc:oracle:thin:@192.168.0.115:1521:jzkj"
        # database user
        jdbc_user => "jzkjx"
        # database password
        jdbc_password => "123456"

        # number of reconnect attempts
        connection_retry_attempts => "3"

        # validate the connection before use (default false)
        jdbc_validate_connection => "true"

        # connection validation timeout (default 3600s)
        jdbc_validation_timeout => "3600"

        # enable paged queries (default false)
        jdbc_paging_enabled => "true"

        # rows per page (default 100000; with many columns or frequent updates, lower this)
        jdbc_page_size => "10000"

        # statement holds the query SQL; for complex SQL, point statement_filepath at a SQL file instead
        statement_filepath => "D:/Environment/elasticsearch/queryupdate_archives_record.sql"

        # sql_last_value is a built-in variable holding the tracking_column value of the last row from the previous run, here the ID
        #statement => "SELECT * FROM EA.ARCHIVES_RECORD WHERE ID > :sql_last_value"

        # lowercase column names? default true (set to false if the data is serialized/deserialized elsewhere)
        lowercase_column_names => false

        # record the last run: saves the last tracking_column value to the file at last_run_metadata_path
        record_last_run => true

        # use a column value as the incremental marker (otherwise tracking_column defaults to the timestamp)
        use_column_value => true

        # data type of the tracked column, numeric or timestamp only (default numeric)
        tracking_column_type => numeric

        # the column to track for incremental sync; must be a database column
        tracking_column => "ID"

        # where to store the last-run metadata
        last_run_metadata_path => "D:/Environment/elasticsearch/last_recordid.txt"

        # clear the last_run_metadata_path record? must be false for incremental sync
        clean_run => false

        # sync schedule; the default 5-field form runs once per minute, while the 6-field form below adds a seconds field and runs every 2 seconds
        schedule => "*/2 * * * * *"

        # event type, used in the output section to route to the right ES index
        type => "archivesrecord"
    }
    # to sync more tables, copy the jdbc{} block above and adjust the relevant settings
}

filter {
  # remove fields we don't need
  mutate {
          remove_field => "@timestamp"
          remove_field => "@version"
  }
}
 
output{
    #elasticsearch{
        # ES address; for a cluster use an array: hosts => ["192.168.1.1:9200", "192.168.1.2:9200"]
    #   hosts => "127.0.0.1:9200"
        # index name
    #   index => "es_archives"
        # unique document id (best mapped to the table's unique ID)
    #   document_id => "%{ID}"
    #}

    # the pattern for syncing multiple tables
    if [type] == "archivesrecord" {
        elasticsearch{
            # ES address; for a cluster use an array: hosts => ["192.168.1.1:9200", "192.168.1.2:9200"]
            hosts => "127.0.0.1:9200"
            # index name
            index => "es_archives"
            # unique document id (best mapped to the table's unique ID)
            document_id => "%{ID}"
        }
    }
    if [type] == "archivesfile" {
        elasticsearch{
            hosts => "127.0.0.1:9200"
            index => "es_file"
            document_id => "%{ID}"
        }
    }
    if [type] == "archivesscanfile" {
        elasticsearch{
            hosts => "127.0.0.1:9200"
            index => "es_scanfile"
            document_id => "%{ID}"
        }
    }

    stdout{
        # JSON-lines output to the console
        codec => "json_lines"
    }
}
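A note on document_id => "%{ID}": because the document id is taken from the table's primary key, re-running either pipeline overwrites the same documents instead of duplicating them. The equivalent operation in the Java client is an index request with an explicit id; a small sketch (class name and field values here are placeholders of mine):

import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;

public class IndexById {
	public static void main(String[] args) throws Exception {
		try (RestHighLevelClient client = new RestHighLevelClient(
				RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")))) {
			// indexing with an explicit id: a second run with the same id replaces
			// the document, which is what document_id => "%{ID}" gives Logstash
			IndexRequest request = new IndexRequest("es_archives")
					.id("2671")
					.source("{\"ID\":2671,\"ARCHIVES_NAME\":\"test\",\"DEL_FLAG\":\"0\"}", XContentType.JSON);
			// prints CREATED on the first run, UPDATED afterwards
			System.out.println(client.index(request, RequestOptions.DEFAULT).getResult());
		}
	}
}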

Open pipelines.yml in the same directory and point it at both config files. Logstash started with no -f argument (just bin\logstash.bat) then loads every pipeline listed here, which makes startup convenient:

# List of pipelines to be loaded by Logstash
#
# This document must be a list of dictionaries/hashes, where the keys/values are pipeline settings.
# Default values for omitted settings are read from the `logstash.yml` file.
# When declaring multiple pipelines, each MUST have its own `pipeline.id`.
#
# Example of two pipelines:
#
# - pipeline.id: test
#   pipeline.workers: 1
#   pipeline.batch.size: 1
#   config.string: "input { generator {} } filter { sleep { time => 1 } } output { stdout { codec => dots } }"
# - pipeline.id: another_test
#   queue.type: persisted
#   path.config: "/tmp/logstash/*.config"
#
# Available options:
#
#   # name of the pipeline
#   pipeline.id: mylogs
#
#   # The configuration string to be used by this pipeline
#   config.string: "input { generator {} } filter { sleep { time => 1 } } output { stdout { codec => dots } }"
#
#   # The path from where to read the configuration text
#   path.config: "/etc/conf.d/logstash/myconfig.cfg"
#
#   # How many worker threads execute the Filters+Outputs stage of the pipeline
#   pipeline.workers: 1 (actually defaults to number of CPUs)
#
#   # How many events to retrieve from inputs before sending to filters+workers
#   pipeline.batch.size: 125
#
#   # How long to wait in milliseconds while polling for the next event
#   # before dispatching an undersized batch to filters+outputs
#   pipeline.batch.delay: 50
#
#   # Internal queuing model, "memory" for legacy in-memory based queuing and
#   # "persisted" for disk-based acked queueing. Defaults is memory
#   queue.type: memory
#
#   # If using queue.type: persisted, the page data files size. The queue data consists of
#   # append-only data files separated into pages. Default is 64mb
#   queue.page_capacity: 64mb
#
#   # If using queue.type: persisted, the maximum number of unread events in the queue.
#   # Default is 0 (unlimited)
#   queue.max_events: 0
#
#   # If using queue.type: persisted, the total capacity of the queue in number of bytes.
#   # Default is 1024mb or 1gb
#   queue.max_bytes: 1024mb
#
#   # If using queue.type: persisted, the maximum number of acked events before forcing a checkpoint
#   # Default is 1024, 0 for unlimited
#   queue.checkpoint.acks: 1024
#
#   # If using queue.type: persisted, the maximum number of written events before forcing a checkpoint
#   # Default is 1024, 0 for unlimited
#   queue.checkpoint.writes: 1024
#
#   # If using queue.type: persisted, the interval in milliseconds when a checkpoint is forced on the head page
#   # Default is 1000, 0 for no periodic checkpoint.
#   queue.checkpoint.interval: 1000
#
#   # Enable Dead Letter Queueing for this pipeline.
#   dead_letter_queue.enable: false
#
#   If using dead_letter_queue.enable: true, the maximum size of dead letter queue for this pipeline. Entries
#   will be dropped if they would increase the size of the dead letter queue beyond this setting.
#   Default is 1024mb
#   dead_letter_queue.max_bytes: 1024mb
#
#   If using dead_letter_queue.enable: true, the directory path where the data files will be stored.
#   Default is path.data/dead_letter_queue
#
#   path.dead_letter_queue:

 - pipeline.id: logstash_alldata
   pipeline.workers: 1
   pipeline.batch.size: 1
   path.config: "D:\\Environment\\elasticsearch\\logstash-7.6.1\\config\\logstash_alldata.conf" 
 - pipeline.id: logstash_updatedata
   pipeline.workers: 1
   path.config: "D:\\Environment\\elasticsearch\\logstash-7.6.1\\config\\logstash_updatedata.conf"


Then create the corresponding SQL files.

Here is the incremental SQL (the full-sync SQL is typically the same query without the :sql_last_value filter):

SELECT
	t.*,
	t1.type_name,
	t3.username AS create_person_name,
	t4.name AS dept_name,
	t5.label AS data_status_name,
	t6.type_name AS secret_name 
FROM
	ea.ARCHIVES_FILE t
	LEFT JOIN ea.archives_file_type t1 ON t.FILE_TYPE = t1.type_code
	LEFT JOIN jzkjx.sys_user t3 ON t.create_person = t3.user_id
	LEFT JOIN jzkjx.sys_dept t4 ON t4.dept_id = t.dept_id
	LEFT JOIN jzkjx.sys_dict_item t5 ON t5.value = t.data_status 
	AND t5.TYPE = 'file_state'
	LEFT JOIN ea.archives_secret_level t6 ON t6.type_code = t.secret_leval
WHERE t.ID > :sql_last_value

3. ES data browsing tools
First place the elasticsearch-head-master folder under D:\Environment\elasticsearch. It needs a Node.js/npm environment (optional, since Kibana works too).
Then open a cmd prompt in D:\Environment\elasticsearch\elasticsearch-head-master
and run npm install once, followed by npm run start. If it starts cleanly, open it in the browser and you can inspect and manipulate the indices.

If you don't have a Node.js environment, use the kibana-7.6.1-windows-x86_64 tool instead: place it under D:\Environment\elasticsearch, go to D:\Environment\elasticsearch\kibana-7.6.1-windows-x86_64\bin, and double-click kibana.bat.
When it finishes starting, open it in the browser.

At this point the ES software is fully configured.

Now the Java side. First, the module's pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <parent>
    <artifactId>jzkj</artifactId>
    <groupId>com.jzkj</groupId>
    <version>3.8.0</version>
  </parent>
  <modelVersion>4.0.0</modelVersion>

  <artifactId>jzkj-elasticsearch</artifactId>


  <description>jzkj search</description>

  <dependencies>
    <!-- upms api and model modules -->
    <dependency>
      <groupId>com.jzkj</groupId>
      <artifactId>jzkj-upms-api</artifactId>
    </dependency>
    <!-- logging -->
    <dependency>
      <groupId>com.jzkj</groupId>
      <artifactId>jzkj-common-log</artifactId>
    </dependency>
    <dependency>
      <groupId>com.jzkj</groupId>
      <artifactId>jzkj-common-data</artifactId>
    </dependency>
    <!--swagger-->
    <dependency>
      <groupId>com.jzkj</groupId>
      <artifactId>jzkj-common-swagger</artifactId>
    </dependency>
    <!-- file storage -->
    <dependency>
      <groupId>com.jzkj</groupId>
      <artifactId>jzkj-common-minio</artifactId>
    </dependency>
    <!-- service registry client -->
    <dependency>
      <groupId>com.alibaba.cloud</groupId>
      <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
    </dependency>
    <!-- config center client -->
    <dependency>
      <groupId>com.alibaba.cloud</groupId>
      <artifactId>spring-cloud-starter-alibaba-nacos-config</artifactId>
    </dependency>
    <!-- spring security, oauth, jwt -->
    <dependency>
      <groupId>com.jzkj</groupId>
      <artifactId>jzkj-common-security</artifactId>
    </dependency>
    <!-- dynamic route configuration -->
    <dependency>
      <groupId>com.jzkj</groupId>
      <artifactId>jzkj-common-gateway</artifactId>
    </dependency>
    <!-- sentinel -->
    <dependency>
      <groupId>com.jzkj</groupId>
      <artifactId>jzkj-common-sentinel</artifactId>
    </dependency>
    <!-- routing control (gray release) -->
    <dependency>
      <groupId>com.jzkj</groupId>
      <artifactId>jzkj-common-gray</artifactId>
    </dependency>
    <!--mybatis-->
    <dependency>
      <groupId>com.baomidou</groupId>
      <artifactId>mybatis-plus-boot-starter</artifactId>
    </dependency>
    <!-- druid connection pool -->
    <dependency>
      <groupId>com.alibaba</groupId>
      <artifactId>druid-spring-boot-starter</artifactId>
    </dependency>
    <!-- database driver -->
    <dependency>
      <groupId>com.oracle.ojdbc</groupId>
      <artifactId>ojdbc8</artifactId>
    </dependency>
    <!-- prevents mybatis errors when using an Oracle database -->
    <dependency>
      <groupId>cn.easyproject</groupId>
      <artifactId>orai18n</artifactId>
      <version>12.1.0.2.0</version>
    </dependency>
    <!-- web module -->
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- undertow container -->
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-undertow</artifactId>
    </dependency>
    <!-- search (Elasticsearch) dependencies -->
    <dependency>
      <groupId>org.elasticsearch.client</groupId>
      <artifactId>elasticsearch-rest-high-level-client</artifactId>
      <version>7.6.1</version>
    </dependency>

    <dependency>
      <groupId>org.elasticsearch</groupId>
      <artifactId>elasticsearch</artifactId>
      <version>7.6.1</version>
    </dependency>

    <dependency>
      <groupId>org.elasticsearch.client</groupId>
      <artifactId>elasticsearch-rest-client</artifactId>
      <version>7.6.1</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-maven-plugin</artifactId>
      </plugin>
      <plugin>
        <groupId>io.fabric8</groupId>
        <artifactId>docker-maven-plugin</artifactId>
      </plugin>
    </plugins>
  </build>
</project>

The controller:

package com.jzkj.elasticsearch.controller;

import com.baomidou.mybatisplus.extension.plugins.pagination.Page;
import com.jzkj.elasticsearch.service.EsSearchService;
import com.jzkj.jzkj.common.core.util.R;
import io.swagger.annotations.Api;
import io.swagger.annotations.ApiOperation;
import lombok.AllArgsConstructor;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * @author hyh
 */

@RestController
@AllArgsConstructor
@RequestMapping("/EsSearch" )
@Api(value = "EsSearch", tags = "search")
public class EsSearchController {

	private final EsSearchService esSearchService;

	@ApiOperation(value = "TEST", notes = "TEST")
	@GetMapping("/test" )
	public R test() {
		return R.ok("Search Success");
	}



	/**
	 * Fuzzy search across all fields
	 * @param keyword search keyword
	 * @param page paging parameters
	 * @return matching records, paged
	 */
	@ApiOperation(value = "simpleSearch", notes = "simpleSearch")
	@GetMapping("/simpleSearch" )
	public R simpleSearch(String keyword, Page page) {
		return R.ok(esSearchService.simpleSearch(keyword,page.getCurrent(),page.getSize()));
	}


	@ApiOperation(value = "search all", notes = "search all")
	@GetMapping("/allSearch" )
	public R allSearch(Page page) {
		return R.ok(esSearchService.searchAll(page.getCurrent(),page.getSize()));
	}

	@ApiOperation(value = "real-time search", notes = "real-time search")
	@GetMapping("/realTimeSearch" )
	public R realTimeSearch(String keyword) {
		return R.ok(esSearchService.realTimeSearch(keyword));
	}
}
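The implementation below injects a RestHighLevelClient bean qualified as "client". That bean isn't shown in this post; a minimal configuration sketch (package name hypothetical, host/port from the local setup above) could look like this:

package com.jzkj.elasticsearch.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class EsClientConfig {

	// the bean name "client" matches the @Qualifier("client") in EsSearchServiceImpl
	@Bean(name = "client", destroyMethod = "close")
	public RestHighLevelClient client() {
		return new RestHighLevelClient(
				RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")));
	}
}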

The service implementation:

package com.jzkj.elasticsearch.service.impl;

import cn.hutool.core.bean.BeanUtil;
import cn.hutool.core.bean.copier.CopyOptions;
import cn.hutool.core.collection.CollUtil;
import cn.hutool.core.net.URLDecoder;
import com.baomidou.mybatisplus.extension.plugins.pagination.Page;
import com.jzkj.elasticsearch.common.EsEnum;
import com.jzkj.elasticsearch.service.EsSearchService;
import com.jzkj.jzkj.admin.api.eaentity.ArchivesRecord;
import lombok.AllArgsConstructor;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.nio.charset.Charset;
import java.util.*;
import java.util.concurrent.TimeUnit;

/**
 * @author hyh
 */
@AllArgsConstructor
@Service
public class EsSearchServiceImpl implements EsSearchService {


	@Qualifier("client")
	private final RestHighLevelClient restHighLevelClient;

	/**
	 * Fuzzy match across all fields (with highlighting)
	 *
	 * @param keyword search keyword
	 * @param pageNo page number
	 * @param pageSize page size
	 * @return matching records, paged
	 */
	@Override
	public Page<ArchivesRecord> simpleSearchHeightLight(String keyword, long pageNo, long pageSize) {
		// decode the keyword as UTF-8
		keyword = URLDecoder.decode(keyword, Charset.defaultCharset());
		// build the search request
		SearchRequest searchRequest = new SearchRequest(EsEnum.INDEX_NAME.getValue());
		SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

		// paging
		searchSourceBuilder.from((int) pageNo);
		searchSourceBuilder.size((int) pageSize);

		// sort
		searchSourceBuilder.sort("CREATE_TIME", SortOrder.DESC);

		QueryBuilder simpleQueryBuilder = QueryBuilders.simpleQueryStringQuery(keyword);
		searchSourceBuilder.query(simpleQueryBuilder);
		searchSourceBuilder.timeout(new TimeValue(10, TimeUnit.SECONDS));

		// highlighting
		HighlightBuilder highlightBuilder = new HighlightBuilder();
		highlightBuilder.field("ARCHIVES_NAME").field("ARCHIVES_CODE")
				.field("BELONG_PERSON").field("BELONG_TEL").field("DEPT_NAME")
				.field("DATA_STATUS_NAME").field("SECRET_NAME").field("ARCHIVES_TYPE")
				.field("CREATE_PERSON_NAME").field("TYPE_NAME");
		// must be false to highlight more than one field
		highlightBuilder.requireFieldMatch(false);
		highlightBuilder.preTags("<span style='color:red'>");
		highlightBuilder.postTags("</span>");
		searchSourceBuilder.highlighter(highlightBuilder);

		// execute the search
		searchRequest.source(searchSourceBuilder);

		SearchResponse searchResponse = null;
		List<Map<String, Object>> list = new ArrayList<>();
		Page<ArchivesRecord> archivesRecordPage = new Page<>();
		try {
			searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
			SearchHits hits = searchResponse.getHits();
			// total number of hits
			int count = (int) hits.getTotalHits().value;
			archivesRecordPage.setTotal(count);
			if (count != 0) {
				for (SearchHit hit : hits.getHits()) {
					Map<String, Object> sourceAsMap = hit.getSourceAsMap();
					Map<String, HighlightField> highlightFieldMap = hit.getHighlightFields();

					// replace source fields with their highlighted fragments
					for (Map.Entry<String, HighlightField> entry : highlightFieldMap.entrySet()) {
						Text[] texts = highlightFieldMap.get(entry.getKey()).getFragments();
						String title = "";
						for (Text text : texts) {
							title += text;
						}
						sourceAsMap.put(entry.getKey(), title);
					}
					list.add(sourceAsMap);
				}
			}
			//restHighLevelClient.close();
		} catch (IOException e) {
			e.printStackTrace();
		}
		archivesRecordPage.setRecords(formatBean(list));
		return archivesRecordPage;

	}


	/**
	 * Fuzzy match across all fields
	 *
	 * @param keyword search keyword
	 * @param pageNo page number
	 * @param pageSize page size
	 * @return matching records, paged
	 */
	@Override
	public Page<ArchivesRecord> simpleSearch(String keyword, long pageNo, long pageSize) {
		// decode the keyword as UTF-8
		//keyword = URLDecoder.decode(keyword, Charset.defaultCharset());
		// build the search request
		SearchRequest searchRequest = new SearchRequest(EsEnum.INDEX_NAME.getValue());
		SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

		// paging
		//searchSourceBuilder.from((int) pageNo);
		//searchSourceBuilder.size((int) pageSize);

		// sort
		//searchSourceBuilder.sort("CREATE_TIME", SortOrder.DESC);

		// required condition
		TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("DEL_FLAG", "0");
		// multi-field fuzzy match
		BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
		boolQueryBuilder.must(QueryBuilders.multiMatchQuery(keyword, "ARCHIVES_NAME", "ARCHIVES_CODE",
				"BELONG_PERSON", "BELONG_TEL", "DEPT_NAME", "DATA_STATUS_NAME","REAL_NAME",
				"SECRET_NAME", "ARCHIVES_TYPE", "CREATE_PERSON_NAME", "TYPE_NAME"));

		boolQueryBuilder.must(termQueryBuilder);

		searchSourceBuilder.query(boolQueryBuilder);
		searchSourceBuilder.timeout(new TimeValue(10, TimeUnit.SECONDS));

		// execute the search
		searchRequest.source(searchSourceBuilder);

		SearchResponse searchResponse = null;
		List<Map<String, Object>> list = new ArrayList<>();
		Page<ArchivesRecord> archivesRecordPage = new Page<>();
		try {
			searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
			SearchHits hits = searchResponse.getHits();
			// total number of hits
			int count = (int) hits.getTotalHits().value;
			archivesRecordPage.setTotal(count);
			if (count != 0) {
				for (SearchHit hit : hits.getHits()) {
					Map<String, Object> sourceAsMap = hit.getSourceAsMap();
					list.add(sourceAsMap);
				}
			}
			//restHighLevelClient.close();
		} catch (IOException e) {
			e.printStackTrace();
		}
		//archivesRecordPage.setSize((int) pageSize);
		archivesRecordPage.setRecords(formatBean(list));
		return archivesRecordPage;

	}

	/**
	 * Real-time (as-you-type) search
	 * @param keyword search keyword
	 * @return matching documents
	 */
	@Override
	public Set<Map<String, Object>> realTimeSearch(String keyword) {

		// build the search request
		SearchRequest searchRequest = new SearchRequest(EsEnum.INDEX_NAME.getValue());
		SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
		// paging
		//searchSourceBuilder.from(1);
		//searchSourceBuilder.size(100);

		// return only the specified field
		searchSourceBuilder.fetchSource("ARCHIVES_NAME",null);

		// required condition
		TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("DEL_FLAG", "0");
		// wildcard match on ARCHIVES_NAME
		BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
		boolQueryBuilder.must(termQueryBuilder);
		boolQueryBuilder.must(QueryBuilders.wildcardQuery("ARCHIVES_NAME.keyword","*" + keyword + "*"));
		//boolQueryBuilder.should(QueryBuilders.matchPhrasePrefixQuery("ARCHIVES_CODE",keyword));
		searchSourceBuilder.query(boolQueryBuilder);
		searchSourceBuilder.timeout(new TimeValue(10, TimeUnit.SECONDS));

		// execute the search
		searchRequest.source(searchSourceBuilder);

		SearchResponse searchResponse = null;
		Set<Map<String, Object>> set = new HashSet<>();
		try {
			searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

			SearchHits hits = searchResponse.getHits();
			// total number of hits
			int count = (int) hits.getTotalHits().value;
			if (count != 0) {
				for (SearchHit hit : hits.getHits()) {
					Map<String, Object> sourceAsMap = hit.getSourceAsMap();
					set.add(sourceAsMap);
				}
			}
			//restHighLevelClient.close();
		} catch (IOException e) {
			e.printStackTrace();
		}
		return set;

	}


	/**
	 * Query all documents
	 *
	 * @param pageNo page number
	 * @param pageSize page size
	 * @return all records, paged
	 */
	@Override
	public Page<ArchivesRecord> searchAll(long pageNo, long pageSize) {
		// decode as UTF-8 (not needed here)
		//keyword = URLDecoder.decode(keyword, Charset.defaultCharset());
		// build the search request
		SearchRequest searchRequest = new SearchRequest(EsEnum.INDEX_NAME.getValue());
		SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

		// paging
		searchSourceBuilder.from((int) pageNo);
		searchSourceBuilder.size((int) pageSize);

		// sort
		searchSourceBuilder.sort("CREATE_TIME", SortOrder.DESC);

		QueryBuilder matchAllQueryBuilder = QueryBuilders.matchAllQuery();
		searchSourceBuilder.query(matchAllQueryBuilder);
		searchSourceBuilder.timeout(new TimeValue(10, TimeUnit.SECONDS));

		// execute the search
		searchRequest.source(searchSourceBuilder);

		SearchResponse searchResponse = null;
		List<Map<String, Object>> list = new ArrayList<>();
		Page<ArchivesRecord> archivesRecordPage = new Page<>();
		try {
			searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
			SearchHits hits = searchResponse.getHits();
			// total number of hits
			int count = (int) hits.getTotalHits().value;
			archivesRecordPage.setTotal(count);
			if (count != 0) {
				for (SearchHit hit : hits.getHits()) {
					Map<String, Object> sourceAsMap = hit.getSourceAsMap();
					list.add(sourceAsMap);
				}
			}
			//restHighLevelClient.close();
		} catch (IOException e) {
			e.printStackTrace();
		}
		archivesRecordPage.setRecords(formatBean(list));
		return archivesRecordPage;

	}


	/**
	 * Convert the result Maps into ArchivesRecord beans
	 *
	 * @param mapList raw _source maps from ES
	 * @return list of beans
	 */
	public List<ArchivesRecord> formatBean(List<Map<String, Object>> mapList) {
		// alias mapping from ES field names to bean property names
		Map<String, String> mapping = CollUtil.newHashMap();
		ArrayList<ArchivesRecord> beanList = new ArrayList<>();
		mapping.put("ID", "id");
		mapping.put("ARCHIVES_CODE", "archivesCode");
		mapping.put("ARCHIVES_NAME", "archivesName");
		mapping.put("ARCHIVES_TYPE", "archivesType");
		mapping.put("BELONG_PERSON", "belongPerson");
		mapping.put("BELONG_PERSON_ID_CARD", "belongPersonIdCard");
		mapping.put("BELONG_TEL", "belongTel");
		mapping.put("DEL_FLAG", "delFlag");
		mapping.put("CREATE_PERSON", "createPerson");
		mapping.put("CREATE_TIME", "createTime");
		mapping.put("DEPT_ID", "deptId");
		mapping.put("DATA_STATUS", "dataStatus");
		mapping.put("SECRET_LEVAL", "secretLeval");
		mapping.put("TYPE_NAME", "typeName");
		mapping.put("BELONG_PERSON_NAME", "belongPersonName");
		mapping.put("CREATE_PERSON_NAME", "createPersonName");
		mapping.put("DEPT_NAME", "deptName");
		mapping.put("DATA_STATUS_NAME", "dataStatusName");
		mapping.put("SECRET_NAME", "secretName");
		mapping.put("ENTITY_FLAG", "entityFlag");
		mapping.put("FILE_NUM", "fileNum");
		mapping.put("ORIGIN", "origin");
		mapping.put("SCAN_FILE_NUMBER", "scanFileNumber");
		mapping.put("CHANGE_STATUS", "changeStatus");
		mapping.put("REAL_NAME", "realName");
		mapList.forEach(item -> {
			ArchivesRecord archivesRecord = BeanUtil.mapToBean(item, ArchivesRecord.class, CopyOptions.create().setFieldMapping(mapping));
			beanList.add(archivesRecord);
		});
		return beanList;
	}


	/**
	 * Search by time range
	 *
	 * @param startTime range start
	 * @param endTime range end
	 * @return the raw search hits
	 * @throws IOException on transport errors
	 */
	public SearchHits searchByTimeRange(String startTime, String endTime) throws IOException {
		SearchRequest searchRequest = new SearchRequest(EsEnum.INDEX_NAME.getValue());
		SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
		BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
		//queryBuilder.must(QueryBuilders.rangeQuery("CREATE_TIME").gt(startTime).lt(endTime));
		queryBuilder.must(QueryBuilders.rangeQuery("CREATE_TIME").from(startTime).to(endTime));

		searchSourceBuilder.query(queryBuilder);

		searchRequest.source(searchSourceBuilder);
		SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
		SearchHits searchHits = searchResponse.getHits();
		return searchHits;
	}

	/**
	 * Compound (bool) query example
	 */
//	public void searchByBool() {
//		//Bool Query
//		BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
//
//		//the movie name must contain the analyzed tokens of "我不是药神", e.g. 我, 不, 是, 药, 神
//		boolQueryBuilder.must(QueryBuilders.matchQuery(MovieSearch.NAME, "我不是药神"));
//
//		//exclude movies directed by 张三
//		boolQueryBuilder.mustNot(QueryBuilders.termQuery(MovieSearch.DIRECTORS, "张三"));
//
//		//the alias should contain the analyzed tokens of "药神", e.g. 药, 神
//		boolQueryBuilder.should(QueryBuilders.matchQuery(MovieSearch.ALIAS, "药神"));
//
//		//the score must be greater than 9 (ES caches filters intelligently, so filter is recommended)
//		boolQueryBuilder.filter(QueryBuilders.rangeQuery(MovieSearch.SCORE).gt(9));
//
//		//match "药神" against name, actors, introduction, alias, label (OR semantics)
//		boolQueryBuilder.filter(QueryBuilders.multiMatchQuery("药神", MovieSearch.NAME, MovieSearch.ACTORS, MovieSearch.INTRODUCTION, MovieSearch.ALIAS, MovieSearch.LABEL));
//
//		String[] includes = {MovieSearch.NAME, MovieSearch.ALIAS, MovieSearch.SCORE, MovieSearch.ACTORS, MovieSearch.DIRECTORS, MovieSearch.INTRODUCTION};
//		SearchRequestBuilder searchRequestBuilder = transportClient.prepareSearch(INDEX).setTypes(TYPE).setQuery(boolQueryBuilder).addSort(MovieSearch.SCORE, SortOrder.DESC).setFrom(0).setSize(10).setFetchSource(includes, null);
//		SearchResponse searchResponse = searchRequestBuilder.get();
//		if (!RestStatus.OK.equals(searchResponse.status())) {
//			return;
//		}
//		for (SearchHit searchHit : searchResponse.getHits()) {
//			String name = (String) searchHit.getSource().get(MovieSearch.NAME);
//			//TODO
//		}
//	}
}
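For completeness: the EsSearchService interface and the EsEnum used above aren't shown in this post either. Minimal sketches consistent with the calls the controller and implementation make (package locations and the index-name value are assumptions):

package com.jzkj.elasticsearch.common;

// holds the index name so it isn't hard-coded in the service
public enum EsEnum {
	INDEX_NAME("es_archives");

	private final String value;

	EsEnum(String value) {
		this.value = value;
	}

	public String getValue() {
		return value;
	}
}

package com.jzkj.elasticsearch.service;

import com.baomidou.mybatisplus.extension.plugins.pagination.Page;
import com.jzkj.jzkj.admin.api.eaentity.ArchivesRecord;

import java.util.Map;
import java.util.Set;

public interface EsSearchService {
	Page<ArchivesRecord> simpleSearch(String keyword, long pageNo, long pageSize);
	Page<ArchivesRecord> simpleSearchHeightLight(String keyword, long pageNo, long pageSize);
	Page<ArchivesRecord> searchAll(long pageNo, long pageSize);
	Set<Map<String, Object>> realTimeSearch(String keyword);
}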

That's all of the code; feel free to ask if anything is unclear. During sync, ES creates the index automatically with default mappings, but you can define the mapping yourself to make searches more efficient.

Here is a test mapping I wrote (run these in the Kibana console):

GET _search
{
  "query": {
    "match_all": {}
  }
}
GET es_archives/_mapping

PUT es_archives/
{
  "mappings": {
    "properties": {
       "ARCHIVES_CODE" : {
          "type" : "text",
          "copy_to": "recordkeyword"
        },
        "ARCHIVES_NAME" : {
          "type" : "keyword",
          "copy_to": "recordkeyword"
        },
        "ARCHIVES_TYPE" : {
          "type" : "keyword"
        },
        "BELONG_PERSON" : {
          "type" : "keyword"
        },
        "BELONG_PERSON_ID_CARD" : {
          "type" : "keyword"
        },
        "BELONG_PERSON_NAME" : {
          "type" : "keyword",
          "copy_to": "recordkeyword"
        },
        "BELONG_TEL" : {
          "type" : "keyword",
          "copy_to": "recordkeyword"
        },
        "BUSINESS_SERIAL_NUMBER" : {
          "type" : "keyword"
        },
        "CHANGE_STATUS" : {
          "type" : "keyword"
        },
        "CREATE_PERSON" : {
          "type" : "keyword"
        },
        "CREATE_PERSON_NAME" : {
          "type" : "keyword",
          "copy_to": "recordkeyword"
        },
        "CREATE_TIME" : {
          "type" : "date"
        },
        "DATA_STATUS" : {
          "type" : "keyword"
        },
        "DATA_STATUS_NAME" : {
          "type" : "keyword",
          "copy_to": "recordkeyword"
        },
        "DEL_FLAG" : {
          "type" : "keyword"
        },
        "DEPT_ID" : {
          "type" : "long"
        },
        "DEPT_NAME" : {
          "type" : "keyword",
          "copy_to": "recordkeyword"
        },
        "ENTITY_FLAG" : {
          "type" : "long"
        },
        "FILE_NORMAL" : {
          "type" : "keyword"
        },
        "FILE_NUM" : {
          "type" : "long"
        },
        "ID" : {
          "type" : "long"
        },
        "ITEM_NUMBER" : {
          "type" : "keyword"
        },
        "ORIGIN" : {
          "type" : "long"
        },
        "SCAN_FILE_NUMBER" : {
          "type" : "long"
        },
        "SECRET_LEVAL" : {
          "type" : "keyword"
        },
        "SECRET_NAME" : {
          "type" : "keyword",
          "copy_to": "recordkeyword"
        },
        "TENANT_ID" : {
          "type" : "long"
        },
        "TYPE_NAME" : {
          "type" : "keyword",
          "copy_to": "recordkeyword"
        },
        "VERSION_NUMBER" : {
          "type" : "keyword"
        },
        "type" : {
          "type" : "keyword"
        },
        "recordkeyword":{
          "type" : "keyword"
        }
      
    }
  }
}

DELETE es_archives/

GET es_archives/_search
{
  "query": {
    "match": {
      "ARCHIVES_NAME": "刘一提取档案"
    }
  }
}

GET es_archives/_search
{
  "query": {
    "wildcard": {
      "ARCHIVES_NAME": "*刘*"
    }
  }
}


GET es_archives/_search
{
  "query": {
    "bool" : {
    "must" : [
      {
        "term" : {
          "DEL_FLAG" : {
            "value" : "0",
            "boost" : 1.0
          }
        }
      },
      {
        "wildcard" : {
          "ARCHIVES_NAME" : {
            "wildcard" : "刘*",
            "boost" : 1.0
          }
        }
      }
      ]
    }
  }
}


GET es_archives/_search
{
  "query": {
    "match_all": {}
  }
}

POST es_archives/_doc/2671
{
  "BELONG_PERSON_NAME" : null,
          "PERSON_NUMBER" : null,
          "SECRET_LEVAL" : 263,
          "BELONG_PERSON" : null,
          "ARCHIVES_NAME" : "刘一提取档案",
          "VERSION_NUMBER" : "1",
          "FILE_NORMAL" : "0         ",
          "ENTITY_FLAG" : 1,
          "CREATE_PERSON" : "3120",
          "CREATE_TIME" : "2020-12-05T10:12:34.000Z",
          "DEPT_NAME" : "档案采集部",
          "SCAN_FILE_NUMBER" : 1,
          "type" : "archivesrecord",
          "DEPT_ID" : 221,
          "ORIGIN" : 1,
          "ARCHIVES_CODE" : "WS-2020-Y-0000000884",
          "CREATE_PERSON_NAME" : "admin1",
          "FILE_NUM" : 1,
          "ID" : 2671,
          "BUSINESS_SERIAL_NUMBER" : null,
          "ARCHIVES_TYPE" : "462389bd4b614e1d",
          "BELONG_PERSON_ID_CARD" : null,
          "ITEM_NUMBER" : null,
          "TENANT_ID" : 1,
          "BELONG_TEL" : null,
          "DATA_STATUS_NAME" : "预审通过",
          "SECRET_NAME" : null,
          "CERTIFICATE_TYPE" : null,
          "CHANGE_STATUS" : null,
          "TYPE_NAME" : "测试档案2",
          "DATA_STATUS" : "5",
          "LOAN_NUMBER" : null,
          "DEL_FLAG" : "0"
}




POST es_archives/_doc/2672
{
        "BELONG_PERSON_NAME" : null,
          "PERSON_NUMBER" : null,
          "SECRET_LEVAL" : 263,
          "BELONG_PERSON" : null,
          "ARCHIVES_NAME" : "实体档案",
          "VERSION_NUMBER" : "1",
          "FILE_NORMAL" : "0         ",
          "ENTITY_FLAG" : 0,
          "CREATE_PERSON" : "1",
          "CREATE_TIME" : "2020-12-07T00:55:50.000Z",
          "DEPT_NAME" : null,
          "SCAN_FILE_NUMBER" : 2,
          "type" : "archivesrecord",
          "DEPT_ID" : 181,
          "ORIGIN" : 1,
          "ARCHIVES_CODE" : "00161-3f608dad42e94b71.2020-Y-181-0000000526",
          "CREATE_PERSON_NAME" : "admindoc",
          "FILE_NUM" : 1,
          "ID" : 2672,
          "BUSINESS_SERIAL_NUMBER" : null,
          "ARCHIVES_TYPE" : "3f608dad42e94b71",
          "BELONG_PERSON_ID_CARD" : null,
          "ITEM_NUMBER" : null,
          "TENANT_ID" : 1,
          "BELONG_TEL" : null,
          "DATA_STATUS_NAME" : "交接流程中",
          "SECRET_NAME" : null,
          "CERTIFICATE_TYPE" : null,
          "CHANGE_STATUS" : null,
          "TYPE_NAME" : "实体档案",
          "DATA_STATUS" : "8",
          "LOAN_NUMBER" : null,
          "DEL_FLAG" : "0"
}