Spring Boot + Elasticsearch: setting up IK and Traditional Chinese analysis, plus a solution for uploading SHP files

Author: 一个奋斗的小白


Article link: https://blog.csdn.net/weixin_40334693/article/details/103226253


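This document describes in detail how to deploy Elasticsearch via Docker with IK, pinyin, and Traditional Chinese analysis integrated. It also addresses es-head being unable to access data with ES 6 and later, and provides a solution for uploading SHP data from Spring Boot. The steps covered include the Dockerfile, the Elasticsearch YAML settings, a custom entity class and mapping, and changes to Gruntfile.js.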


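Note: this article will be useful if you have run into any of the following problems:

Deploying Elasticsearch with the IK, pinyin, and Traditional Chinese analysis plugins via Docker
After installing es-head, data cannot be viewed in es-head with ES 6 and later
How to apply pinyin analysis to data uploaded to ES
How to upload SHP data with Spring Boot + ES

If any of these apply to you, read on.

Further down I will post my GitHub documents for ES and for uploading SHP data with Spring Boot + ES. If you are in a hurry you can go straight to GitHub; the files that need to be changed so that elasticsearch-head can work with Elasticsearch can also be downloaded from there.

Deploying Elasticsearch with the IK, pinyin, and Traditional Chinese analysis plugins via Docker

Without further ado, here is the Dockerfile. I used IDEA to build and deploy it.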

# Change the version numbers to the release you need
FROM elasticsearch:6.8.4
LABEL maintainer="w741069229@163.com"
COPY elasticsearch-analysis-ik-6.8.4.zip /opt/
COPY elasticsearch-analysis-pinyin-6.8.4.zip /opt/
COPY elasticsearch-analysis-stconvert-6.8.4.zip /opt/
COPY config/elasticsearch.yml /usr/share/elasticsearch/config/elasticsearch.yml
# --batch answers the plugin permission prompts without user input
RUN bin/elasticsearch-plugin install --batch file:///opt/elasticsearch-analysis-ik-6.8.4.zip
RUN bin/elasticsearch-plugin install --batch file:///opt/elasticsearch-analysis-pinyin-6.8.4.zip
RUN bin/elasticsearch-plugin install --batch file:///opt/elasticsearch-analysis-stconvert-6.8.4.zip
RUN rm -rf /opt/*.zip
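
Once a container built from this Dockerfile is running, one way to confirm that the three plugins were picked up is to request `GET /_cat/plugins` and look at the plugin column. The sketch below is not part of the original project: `PluginCheck` is a hypothetical helper, and the plugin names `analysis-ik`, `analysis-pinyin`, and `analysis-stconvert` are assumed to be the names these three zips register under. It only parses text that you have already fetched, so it runs without a live cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the original post): inspects `_cat/plugins` text.
class PluginCheck {
    // Assumed plugin names for the three zips installed in the Dockerfile.
    static final List<String> EXPECTED =
            Arrays.asList("analysis-ik", "analysis-pinyin", "analysis-stconvert");

    // Returns the expected plugins that do NOT appear in the given output.
    static List<String> missingPlugins(String catPluginsOutput) {
        List<String> missing = new ArrayList<>(EXPECTED);
        for (String line : catPluginsOutput.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            // Default columns are: node name, plugin name, version.
            if (cols.length >= 2) {
                missing.remove(cols[1]);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample text in the shape `_cat/plugins` returns; stconvert is left out on purpose.
        String sample = "master analysis-ik 6.8.4\n"
                + "master analysis-pinyin 6.8.4\n";
        System.out.println("Missing plugins: " + missingPlugins(sample));
    }
}
```

If the returned list is not empty, the image build log is probably the first place to look for the failed `elasticsearch-plugin install` step.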

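If you are not familiar with Dockerfiles, I suggest learning the basics first; likewise, if you do not know Linux, take a look at the common Linux commands.

Here is a screenshot of my ES Dockerfile project structure:

[Image: ES project structure]

The ES yml configuration is as follows: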

http.cors.enabled: true
http.cors.allow-origin: "*"
network.host: 0.0.0.0
node.name: master
transport.tcp.port: 9300
cluster.name: es
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
transport.tcp.compress: true

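These are the main settings.

You can copy a full configuration file from elsewhere and replace these values in it. If that is too much trouble, below is the stock ES yaml with the settings above appended at the end; just create elasticsearch.yml and paste it in.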

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
#cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
#node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
#network.host: 192.168.0.1
#
# Set a custom port for HTTP:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
network.host: 0.0.0.0
node.name: master
transport.tcp.port: 9300
cluster.name: es
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
transport.tcp.compress: true

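By the way, if you run into a message like the following during deployment:

bootstrap.....

then the problem is of the same kind as "Too many open files": a system limit is set too low. Edit /etc/sysctl.conf, add the vm.max_map_count line, and reload the settings with sysctl -p:

vi /etc/sysctl.conf
vm.max_map_count=655360
sysctl -p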
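
To check whether the new value actually took effect, you can read the kernel setting back. The following is a minimal sketch, assuming a Linux host where `/proc/sys/vm/max_map_count` is readable; `MaxMapCountCheck` is a hypothetical name, and 262144 is the minimum that Elasticsearch's bootstrap check documents for this setting (the 655360 used above is comfortably higher).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not from the original post).
class MaxMapCountCheck {
    // Minimum vm.max_map_count documented for Elasticsearch's bootstrap check.
    static final long REQUIRED_MIN = 262144L;

    // Parses the content of /proc/sys/vm/max_map_count (a single number plus newline).
    static long parse(String procFileContent) {
        return Long.parseLong(procFileContent.trim());
    }

    static boolean isSufficient(long maxMapCount) {
        return maxMapCount >= REQUIRED_MIN;
    }

    public static void main(String[] args) throws IOException {
        Path procFile = Paths.get("/proc/sys/vm/max_map_count");
        if (!Files.isReadable(procFile)) {
            System.out.println("Cannot read " + procFile + " (not a Linux host?)");
            return;
        }
        long current = parse(new String(Files.readAllBytes(procFile), StandardCharsets.UTF_8));
        System.out.println("vm.max_map_count = " + current
                + (isSufficient(current) ? " (ok)" : " (too low for Elasticsearch)"));
    }
}
```

Note that this is a kernel setting, so with Docker it has to be changed on the host rather than inside the container.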

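es-head cannot access data after deployment

The paths here are for es-head deployed in Docker; if you are not using Docker, adjust them accordingly.

In vendor.js under /usr/src/app/_site:

On line 6888, change

	contentType: "(original value; I forgot it, so it is not written here)",

to the following:

	contentType: "application/json;charset=UTF-8",

On line 7573, change

	var inspectData = s.contentType === "(original value; I forgot it, so it is not written here)" &&
		( typeof s.data === "string" );

to the following:

	var inspectData = s.contentType === "application/json;charset=UTF-8" &&
		( typeof s.data === "string" );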
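
The reason this edit helps is that Elasticsearch 6 checks the Content-Type of requests that carry a body and rejects ones it does not support, so es-head has to send JSON explicitly. The same rule applies to any client you write yourself. Below is a minimal sketch using the JDK 11 `java.net.http` API; `JsonSearchRequest`, the base URL, and the index name `wkt` are illustrative assumptions, and the request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper (not from the original post): builds a search request
// that declares its body as JSON, as ES 6 and later expect.
class JsonSearchRequest {
    static final String JSON_CONTENT_TYPE = "application/json;charset=UTF-8";

    static HttpRequest build(String baseUrl, String index, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", JSON_CONTENT_TYPE)
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // Assumed local node and index name; adjust to your own setup.
        HttpRequest request = build("http://localhost:9200", "wkt",
                "{\"query\":{\"match_all\":{}}}");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

To actually send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.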

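After this you may still find that ES starts normally but es-head cannot reach it.

In that case you also need to modify Gruntfile.js, which is located in /usr/src/app/.

Change the block at line 90 to the following: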

connect: {
    server: {
        options: {
            hostname: '0.0.0.0',
            port: 9100,
            base: '.',
            keepalive: true
        }
    }
}

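Once the changes are made, use the docker cp command to copy the modified files back into the container, then restart es-head and it will be accessible.

How to apply pinyin analysis to data uploaded to ES

I use a custom mapping and custom settings here.

My entity class is as follows: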

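import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;
import org.springframework.data.elasticsearch.annotations.Mapping;

// The mapping is loaded from the classpath JSON file instead of being derived from the fields
@Mapping(mappingPath = "/json/wkt-mapping.json")
@Document(
    indexName = "wkt",
    type = "wkt",
    shards = 5,
    replicas = 0,
    refreshInterval = "1s",
    indexStoreType = "fs")
@NoArgsConstructor
@AllArgsConstructor
@Getter
@Setter
public class WktBean {
  @Id private String id;

  @Field(index = false, type = FieldType.Object, store = true)
  private Wkt wkt;

  // Holds the shape as a type plus its coordinates
  @Getter
  @Setter
  public static class Wkt {
    private String type;
    private Object coordinates;
  }
}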

The mapping for this class is as follows (the shape field uses the geo_shape type):

{
  "wkt": {
    "_all": {
      "enabled": false
    },
    "properties": {
      "id": {
        "type": "keyword"
      },
      "wkt": {
        "type": "geo_shape"
      }
    }
  }
}
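
For reference, a document indexed into a geo shape field (`geo_shape` in Elasticsearch) carries a GeoJSON-style object with a `type` and `coordinates`, which is what the `Wkt` inner class models. The sketch below builds such a document by hand for a single point; `GeoShapeJson` and the sample id are hypothetical, and the id is assumed to contain no characters that need JSON escaping. Coordinates go in longitude, latitude order.

```java
// Hypothetical helper (not from the original post): hand-built JSON for a
// document whose `wkt` field is a geo shape.
class GeoShapeJson {
    // GeoJSON-style point; the order is [longitude, latitude].
    static String point(double lon, double lat) {
        return "{\"type\":\"point\",\"coordinates\":[" + lon + "," + lat + "]}";
    }

    // Wraps a shape into a document with the same field names as WktBean.
    // Assumes `id` needs no JSON escaping.
    static String document(String id, String shapeJson) {
        return "{\"id\":\"" + id + "\",\"wkt\":" + shapeJson + "}";
    }

    public static void main(String[] args) {
        System.out.println(document("demo-1", point(116.4, 39.9)));
    }
}
```

In the real project the serialization would be done by Spring Data from a `WktBean` instance; the hand-built string is only meant to show the shape of the data.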

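To adapt this to your own data, just follow the same pattern.

The entity class that configures the analyzer plus pinyin analysis is as follows: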

@Setting(settingPath = "/json/address-setting.json")
