Installing Elasticsearch and Chinese analyzers, with a client connection example

小树叶子

Article link: https://blog.csdn.net/xxgwo/article/details/51235149

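This article records how to install Elasticsearch and its Chinese analyzers on Linux, and how to connect to the server through spring-data-elasticsearch to index and search documents.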

1. Download Elasticsearch

I downloaded elasticsearch-2.2.0.tar.gz from the following address:

https://www.elastic.co/downloads/elasticsearch

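2. Install Elasticsearch

Installation is very simple. The official site describes it in three steps: 1. download the package and extract it; 2. run the elasticsearch script in the bin directory of the extracted package; 3. installation is complete, and opening http://localhost:9200/ in a browser shows the information returned by the server.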
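The browser check in step 3 can also be done from code. The following is a minimal sketch using only the JDK; the class name PingElasticsearch and the helper rootUrl are hypothetical, and it assumes the node listens on localhost:9200 as in the default configuration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical helper class, not part of Elasticsearch.
class PingElasticsearch {

    // Builds the root REST URL of a node from its host and HTTP port.
    static String rootUrl(String host, int port) {
        return "http://" + host + ":" + port + "/";
    }

    public static void main(String[] args) {
        // Assumption: a node is running locally with the default http.port.
        String url = rootUrl("localhost", 9200);
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(3000);
            conn.setReadTimeout(3000);
            System.out.println("HTTP status: " + conn.getResponseCode());
            // Read the whole response body (the same JSON the browser shows).
            try (InputStream in = conn.getInputStream();
                 Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name())) {
                scanner.useDelimiter("\\A");
                System.out.println(scanner.hasNext() ? scanner.next() : "");
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            // Typically means the node is not started or is bound to another address.
            System.out.println("Could not reach " + url + ": " + e.getMessage());
        }
    }
}
```

If the node was bound to a different address through network.host, pass that address to rootUrl instead of localhost.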

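To change the default configuration, edit the elasticsearch.yml file in the config directory of the extracted package:

node.name: the node name.

path.data: the directory where data is stored.

path.logs: the directory where log files are stored.

network.host: the address to bind to. It can be a specific address of the machine, such as 192.168.0.100, 127.0.0.1, or a public IP. The special value _global_ binds to the machine's globally scoped (public) addresses; use 0.0.0.0 to listen on every interface.

http.port: the port of the REST (HTTP) service, 9200 by default.

discovery.zen.ping.unicast.hosts: ["host1", "host2"]: if the node should join an existing cluster after startup, list a few machines of that cluster here. They are used to discover the other members so that the node can join.

discovery.zen.minimum_master_nodes: a majority of the master-eligible nodes in the cluster, that is, half of them (rounded down) plus one. This guards against split-brain when the network is partitioned.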
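The "half plus one" rule for discovery.zen.minimum_master_nodes is easy to get wrong for even node counts, so here is a small illustrative sketch of the arithmetic. The class and method names are hypothetical; only the formula comes from the setting's description above.

```java
// Hypothetical helper, shown only to illustrate the quorum arithmetic.
class MinimumMasterNodes {

    // Majority of the master-eligible nodes: integer half, plus one.
    static int quorum(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " master-eligible node(s) -> minimum_master_nodes = " + quorum(n));
        }
    }
}
```

With two master-eligible nodes the value is 2, so either node going down leaves the cluster without a master; three nodes (value 2) is the smallest setup that tolerates the loss of one node.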

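3. Install the analyzers

Run the following commands to install the ik analyzer. The paths assume that Elasticsearch was extracted to /mnt/elasticsearch-2.2.0 and that the analyzer source was cloned under /mnt/setupFiles.

git clone https://github.com/medcl/elasticsearch-analysis-ik    # clone the analyzer source

cd elasticsearch-analysis-ik    # enter the source directory

mvn clean package    # build the source with Maven

mkdir /mnt/elasticsearch-2.2.0/plugins/ik    # create an ik directory under Elasticsearch's plugins directory

cd /mnt/elasticsearch-2.2.0/plugins/ik/    # enter the ik directory, then unzip the Maven-built zip file into it

unzip /mnt/setupFiles/elasticsearch-analysis-ik/target/releases/elasticsearch-analysis-ik-*.zip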

Add the following line to elasticsearch.yml, then restart Elasticsearch:

index.analysis.analyzer.default.type: ik

Run the following commands to install the mmseg analyzer.

The procedure is the same as for the ik analyzer; only the directory and file names differ.

git clone https://github.com/medcl/elasticsearch-analysis-mmseg.git

cd elasticsearch-analysis-mmseg/

mvn clean package

mkdir /mnt/elasticsearch-2.2.0/plugins/mmseg

cd /mnt/elasticsearch-2.2.0/plugins/mmseg/

unzip /mnt/setupFiles/elasticsearch-analysis-mmseg/target/releases/elasticsearch-analysis-mmseg-1.8.0.zip

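Add the following line to elasticsearch.yml, then restart Elasticsearch:

index.analysis.analyzer.default.type: mmseg_complex

The index.analysis.analyzer.default.type setting selects the default analyzer, which here is the one used for Chinese text. Set it to either ik or mmseg_complex, not both.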
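After the restart, one way to confirm that the analyzer is active is the _analyze REST endpoint, which the Java example later in this article references in a comment. The sketch below only builds that URL with the text parameter percent-encoded; the class name AnalyzeUrl and the localhost base URL are assumptions, while the index name test_index6 and the sample text 中文内容 are taken from the article's own example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that assembles an _analyze URL for a quick manual check.
class AnalyzeUrl {

    static String build(String baseUrl, String index, String analyzer, String text) {
        try {
            return baseUrl + "/" + index + "/_analyze?analyzer="
                    + URLEncoder.encode(analyzer, "UTF-8")
                    + "&pretty&text=" + URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The escapes spell the article's four-character sample text.
        String sample = "\u4e2d\u6587\u5185\u5bb9";
        // Assumption: the node is reachable on localhost:9200 and the index exists.
        System.out.println(build("http://localhost:9200", "test_index6", "mmseg_complex", sample));
    }
}
```

Opening the printed URL in a browser returns the tokens the analyzer produces for the sample text, which makes it easy to compare ik with mmseg_complex.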

4. Call Elasticsearch from code

The Maven dependencies are as follows:

<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-elasticsearch</artifactId>
    <version>1.3.4.RELEASE</version>
    <exclusions>
        <exclusion>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>2.2.0</version>
    <exclusions>
        <!--<exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
        </exclusion>-->
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-backward-codecs</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-analyzers-common</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queries</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-memory</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-highlighter</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queryparser</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-suggest</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-join</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-spatial</artifactId>
        </exclusion>
    </exclusions>
</dependency>

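The Lucene dependencies are excluded here (all except lucene-core, whose exclusion is commented out) because those jars are not needed when the client connects through TransportClient.

A sample of the Java calling code follows: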

import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.index.query.QueryBuilders;

import java.io.IOException;
import java.net.InetSocketAddress;

public class App {
    // Connects to the node's transport port (9300), not the HTTP port (9200).
    private static TransportClient transportClient = TransportClient.builder().build().addTransportAddress(
            new InetSocketTransportAddress(new InetSocketAddress("123.56.179.51", 9300)));

    public static void main(String[] args) {
        try {
            // To see how the analyzer tokenizes a string, open:
            // http://123.56.179.51:9200/test_index6/_analyze?analyzer=mmseg_complex&pretty&text=中文内容

            // Index document 1.
            IndexResponse indexResponse = transportClient.prepareIndex("test_index6", "testType", "1")
                    .setSource(XContentFactory.jsonBuilder()
                            .startObject()
                            .field("id", 1)
                            .field("type", 2)
                            .field("title", "hello world")
                            .field("content", "hello world content")
                            .field("content2", "中文和英文内容")
                            .field("content4", "中文和英文内容")
                            .endObject()).execute().actionGet();
            // true when the document was newly created, false when it replaced an existing one.
            System.out.println(indexResponse.isCreated());

            // Index document 2.
            indexResponse = transportClient.prepareIndex("test_index6", "testType", "2")
                    .setSource(XContentFactory.jsonBuilder()
                            .startObject()
                            .field("id", 2)
                            .field("type", 3)
                            .field("title", "hello world")
                            .field("content", "hello world content")
                            .field("content2", "中文内容和其他内容")
                            .field("content4", "英文内容")
                            .endObject()).execute().actionGet();
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Note: newly indexed documents become searchable only after the next index refresh,
        // so searches issued immediately after indexing may not return them yet.

        // termQuery: exact match on the value of the "type" field.
        SearchResponse searchResponse = transportClient.prepareSearch("test_index6").setTypes("testType")
                .setQuery(QueryBuilders.termQuery("type", 2)).execute().actionGet();
        System.out.println(searchResponse);
        System.out.println("termQuery: exact match");

        // matchPhraseQuery: matches only when the analyzed terms are adjacent.
        searchResponse = transportClient.prepareSearch("test_index6").setTypes("testType")
                .setQuery(QueryBuilders.matchPhraseQuery("content4", "英文内容")).execute().actionGet();
        System.out.println(searchResponse);
        System.out.println("matchPhraseQuery: matches only adjacent terms");

        // matchQuery: also matches documents where the terms are not adjacent.
        searchResponse = transportClient.prepareSearch("test_index6").setTypes("testType")
                .setQuery(QueryBuilders.matchQuery("content4", "中文内容")).execute().actionGet();
        System.out.println(searchResponse);
        System.out.println("matchQuery: also matches non-adjacent terms");

        // Release the client's threads and connections.
        transportClient.close();
    }
}
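
To make the difference between the last two queries concrete, here is a toy model in plain Java. It is not how Elasticsearch implements these queries: it splits text on whitespace, whereas the real tokens depend on the configured analyzer (ik or mmseg_complex), and all names in it are hypothetical. It only illustrates that a match query with the default OR operator needs at least one shared term, while a phrase query needs the terms adjacent and in order.

```java
import java.util.Arrays;
import java.util.List;

// Toy illustration of match vs. match_phrase semantics on token lists.
class MatchVsPhrase {

    // Stand-in for an analyzer: lowercases and splits on whitespace.
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }

    // match (default OR operator): at least one query token occurs in the document.
    static boolean matchAny(List<String> doc, List<String> query) {
        for (String q : query) {
            if (doc.contains(q)) {
                return true;
            }
        }
        return false;
    }

    // match_phrase (no slop): the query tokens occur consecutively, in the same order.
    static boolean matchPhrase(List<String> doc, List<String> query) {
        if (query.isEmpty()) {
            return false;
        }
        for (int start = 0; start + query.size() <= doc.size(); start++) {
            if (doc.subList(start, start + query.size()).equals(query)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc = tokens("hello world content");
        List<String> split = tokens("hello content");   // both terms present, not adjacent
        List<String> joined = tokens("world content");  // adjacent in the document
        System.out.println("match 'hello content':        " + matchAny(doc, split));
        System.out.println("match_phrase 'hello content': " + matchPhrase(doc, split));
        System.out.println("match_phrase 'world content': " + matchPhrase(doc, joined));
    }
}
```

The same reasoning applies to the Chinese fields in the example above once the text has been split into terms by the analyzer.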