Elasticsearch

Official site: https://www.elastic.co/cn/elasticsearch/

Chinese documentation: https://www.elastic.co/guide/cn/elasticsearch/guide/current/getting-started.html

I. Basic Concepts

1. Index

As a verb, "to index" means inserting data, analogous to MySQL's INSERT.

As a noun, an index is analogous to a MySQL database.

2. Type

Analogous to a table in a MySQL database; documents of the same type are stored together.

3. Document: analogous to a single row in MySQL, stored as JSON; its attributes are analogous to MySQL columns.

Elasticsearch builds an inverted index and uses relevance scoring to make queries fast.
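As a rough sketch of what an inverted index looks like (simplified; a real index also stores term positions and frequencies, which feed relevance scoring):

```
Documents:                    Inverted index:
  doc 1: "red apple"            term   -> doc ids
  doc 2: "red car"              red    -> [1, 2]
                                apple  -> [1]
                                car    -> [2]
```

A search for "red" looks the term up directly and finds docs 1 and 2 without scanning every document.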

4. Installing Elasticsearch with Docker: sudo docker pull elasticsearch:7.4.2

Install Kibana, the visualization and query tool for Elasticsearch: sudo docker pull kibana:7.4.2

Check memory usage: free -m

Configure Elasticsearch under Docker:

 

sudo mkdir -p /mydata/elasticsearch/config

sudo mkdir -p /mydata/elasticsearch/data

echo "http.host: 0.0.0.0">>/mydata/elasticsearch/config/elasticsearch.yml

docker run  -p 9200:9200  -p 9300:9300 --name elasticsearch  \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v  /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v  /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d  elasticsearch:7.4.2

Change the folder permissions (do this before starting the container so Elasticsearch can write to the mounted directories):
chmod -R 777 /mydata/elasticsearch/

docker ps -a lists all containers, including ones that are not running.



If startup fails because the elasticsearch user's memory-map limit is too small (it must be at least 262144):

Fix:

Switch to the root user

and run:

sysctl -w vm.max_map_count=262144

Verify the result:

sysctl -a | grep vm.max_map_count

which should show:

vm.max_map_count = 262144

 

This change is lost when the virtual machine restarts, so to make it permanent:

append the following line to the end of /etc/sysctl.conf:

vm.max_map_count=262144

5. Installing Kibana, the visualization tool

docker run --name kibana -e ELASTICSEARCH_HOSTS=http://192.168.56.10:9200 -p 5601:5601 \
-d kibana:7.4.2

Set the container to restart automatically:

sudo docker update 0a824ddf52d0 --restart=always

Once installed, Kibana is available at http://192.168.56.10:5601

6. Basic Exploration

GET /_cat/nodes    list all nodes

GET /_cat/health   show cluster health

GET /_cat/indices  list all indices, analogous to SHOW DATABASES;

7. Indexing a Document

To save a document you specify which index and type it goes under, and which unique id identifies it.

PUT customer/external/1 saves a document with id 1 under the external type of the customer index, e.g. { "name": "JD" }

POST request: without an id, a new document is created with an auto-generated id; with an id, the document with that id is updated (or created if absent).

PUT request: an id is required; omitting it is an error. A new id creates the document, an existing id updates it.
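To illustrate (requests in the Kibana Dev Tools style used throughout; the ids are arbitrary examples):

```
# POST without an id: always creates, with an auto-generated id
POST customer/external
{ "name": "JD" }

# POST with an id: creates id 2 if it does not exist, otherwise updates it
POST customer/external/2
{ "name": "JD" }

# PUT with an id: creates on first use, updates afterwards; omitting the id is an error
PUT customer/external/2
{ "name": "JD2" }
```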

8. Retrieving a Document

GET customer/external/39CzmHYBZcKtI7qwXzJa

{
    "_index": "customer", // which index
    "_type": "external", // which type
    "_id": "39CzmHYBZcKtI7qwXzJa", // document id
    "_version": 3, // version number
    "_seq_no": 2, // concurrency-control field, incremented on every update; used for optimistic locking
    "_primary_term": 1, // also used for optimistic locking; changes when the primary shard is reassigned, e.g. after a restart
    "found": true,
    "_source": { // the actual content
        "name": "JD",
        "ren": "乔丹"
    }
}

9. Optimistic Locking: append the following parameters to the request

?if_seq_no=0&if_primary_term=1

For example: http://192.168.56.10:9200/customer/external/39CzmHYBZcKtI7qwXzJa?if_seq_no=3&if_primary_term=1

If _seq_no is not the latest, the request fails with a 409:

{
    "error": {
        "root_cause": [
            {
                "type": "version_conflict_engine_exception",
                "reason": "[39CzmHYBZcKtI7qwXzJa]: version conflict, required seqNo [4], primary term [1]. current document has seqNo [5] and primary term [1]",
                "index_uuid": "QyKTy5LkRLm_2-zhNM7RSg",
                "shard": "0",
                "index": "customer"
            }
        ],
        "type": "version_conflict_engine_exception",
        "reason": "[39CzmHYBZcKtI7qwXzJa]: version conflict, required seqNo [4], primary term [1]. current document has seqNo [5] and primary term [1]",
        "index_uuid": "QyKTy5LkRLm_2-zhNM7RSg",
        "shard": "0",
        "index": "customer"
    },
    "status": 409
}

10. Updating a Document

POST customer/external/1/_update

{
    "doc": {
        "name": "JohnDoew"
    }
}

With this form, if the content is unchanged, neither the version number nor the sequence number changes.

Or:

POST customer/external/1

{
    "name": "JohnDoew2"
}

With this form, the version and sequence numbers increase even if the content is unchanged.

Or:

PUT customer/external/1

{
    "name": "JohnDoew2"
}

With this form, too, the version and sequence numbers increase even if the content is unchanged.

11. Deleting a Document

DELETE customer/external/1

DELETE customer

Indices and individual documents can be deleted; types cannot.

12. Bulk API

POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"JD"}
{"index":{"_id":"2"}}
{"name":"JD"}

Each "index" action line indexes the document on the following line under the given _id. Note that the body is newline-delimited JSON: an action line and its source line must be adjacent, with no blank lines in between.

POST /_bulk
{"delete":{"_index":"website","_type": "blog","_id":"123"}}
{"create":{"_index":"website","_type": "blog","_id":"123"}}
{"title": "My First blog"}
{"index": {"_index":"website","_type": "blog"}}
{"title": "My Second blog"}
{"update": {"_index":"website","_type": "blog","_id":"123"}}
{"doc": {"title": "My update blog"}}
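A point that is easy to get wrong when calling _bulk over plain HTTP: the body is newline-delimited JSON sent with Content-Type: application/x-ndjson, and every action except delete is immediately followed by its source line. A minimal sketch of assembling and sanity-checking such a body locally (the file name bulk.json is arbitrary):

```shell
# Assemble a bulk body: action line, then source line, repeated
cat > bulk.json <<'EOF'
{"index":{"_id":"1"}}
{"name":"JD"}
{"index":{"_id":"2"}}
{"name":"JD"}
EOF

# Every index action pairs with one source line, so this body
# should contain an even number of lines
test $(( $(wc -l < bulk.json) % 2 )) -eq 0 && echo "body looks well-formed"
```

It could then be sent with curl --data-binary @bulk.json -H 'Content-Type: application/x-ndjson' against the _bulk endpoint; --data-binary matters because plain -d strips the newlines the bulk format depends on.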

13. match_all queries everything; sort orders the results; _source limits which fields are returned

GET bank/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "account_number": "asc"
    },
    {
      "balance": "desc"
    }
  ],
  "_source": ["balance","firstname"]
}

14. match: analyzed full-text matching; match_phrase: matches only documents containing the exact phrase; multi_match: matches one query string against multiple fields

GET bank/_search
{
  "query": {
    "match": {
      "account_number": 20
    }
  }
}
GET bank/_search
{
  "query": {
    "match": {
      "address": "282 Kings Place"
    }
  }
}
GET bank/_search
{
    "query": {
        "match_phrase" : {
            "address": "Kings Hwy"
        }
    }
}
GET bank/_search
{
    "query": {
        "multi_match":{
          "query": "Roberts",
          "fields": [ "firstname", "lastname" ]
        }
    }
}

15. Compound bool queries: must clauses must all match; must_not clauses must not match; should clauses are optional and only affect the relevance score; filter restricts the results without affecting the score

GET bank/_search
{
    "query": {
        "bool":{
         "must": [
           {
             "match": {
               "address": "mill"
             }
           },
           {"match":{
             "gender": "M"
           }}
         ],
         "must_not": [
           {"match": {
             "age": "18"
           }}
         ],
         "should": [
           {"match": {
             "lastname": "Hines"
           }}
         ],
         "filter": {
           "range": {
             "age": {
               "gte": 10,
               "lte": 28
             }
           }
         }
        }
    }
}

16. term queries: suitable only for exact values such as prices, numbers, and ids (not for analyzed text)

GET bank/_search
{
  "query":{
    "term": {
      "age": {
        "value": 28
      }
    }
  }
}

17. Exact matching on field.keyword: returns only documents whose value matches the whole string exactly

GET bank/_search
{
  "query":{
    "match": {
      "address.keyword":"694 Jefferson"
    }
  }
}
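A related pitfall worth noting: term does not analyze its input, while text fields are analyzed at index time, so a term query against a full text value usually matches nothing. Illustrative queries against the same bank data:

```
# Usually no hits: "address" is a text field, so what is indexed are
# lowercase tokens like "694" and "jefferson", never the whole string
GET bank/_search
{
  "query": {
    "term": {
      "address": "694 Jefferson"
    }
  }
}

# Matches: the .keyword sub-field holds the exact, unanalyzed value
GET bank/_search
{
  "query": {
    "term": {
      "address.keyword": "694 Jefferson"
    }
  }
}
```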

18. Aggregations: the age distribution and average age of everyone whose address contains "mill" (aggs defines aggregations; terms buckets by value; avg, sum, max, min compute metrics; "size": 0 suppresses the hits so only the aggregations are returned)

GET bank/_search
{
  "query": {
    "match": {
      "address": "mill"
    }
  },
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 10
      }
    },
    "ageAvg":{
      "avg": {
        "field": "age"
      }
    },
    "ageSum":{
      "sum": {
        "field": "age"
      }
    },
    "ageMax":{
      "max": {
        "field": "age"
      }
    },
    "ageMin":{
      "min": {
        "field": "age"
      }
    }
  },
    "size":0
}

19. Bucket by age, and compute the average balance within each age bucket (a sub-aggregation)

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageagg": {
      "terms": {
        "field": "age",
        "size": 10
      },
      "aggs": {
        "ageAvg": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

20. The full age distribution, and within each age bucket the average balance per gender as well as the bucket's overall average balance

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageagg": {
      "terms": {
        "field": "age",
        "size": 10
      },
      "aggs": {
        "genderAgg": {
          "terms": {
            "field": "gender.keyword",
            "size": 10
          },
          "aggs": {
            "balanceAvg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        },
        "agebanlanceAvg":{
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

21. Mappings: view an index's mapping

GET bank/_mapping

22. Mappings: create an index with an explicit mapping

PUT /my_index
{
  "mappings": {
    "properties": {
      "age":{
        "type":"integer"
      },
      "email":{
        "type":"keyword"
      }
    }
  }
}

## Add a new field to an existing mapping. Existing field mappings cannot be modified. "index": false means the field is stored but not indexed, i.e. not searchable.
PUT /my_index/_mapping
{
    "properties": {
      "employ_id":{
        "type":"keyword",
        "index":false 
      }
    }
}
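Because employ_id is mapped with "index": false, it is kept in _source but is not searchable; querying it fails. For example (the exact error wording may vary by version):

```
GET /my_index/_search
{
  "query": {
    "match": { "employ_id": "1" }
  }
}
```

This returns an error along the lines of "Cannot search on field [employ_id] since it is not indexed."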

23. To change an existing field's mapping, the only option is to migrate the data to a new index.

Create a new index with the mapping you want, then verify it:
PUT /newbank
{
     "mappings" : {
      "properties" : {
        "account_number" : {
          "type" : "long"
        },
        "address" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "age" : {
          "type" : "integer"
        },
        "balance" : {
          "type" : "long"
        },
        "city" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "email" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "employer" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "firstname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "gender" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "lastname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "state" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  
}
GET /newbank/_mapping


### Data migration
POST _reindex
{
  "source": {
    "index": "bank",    ## the old index whose mapping you want to change
    "type": "account"   ## types are deprecated and removed in newer (8.x) versions; omit there
  },
  "dest": {
    "index": "newbank"  ## the newly created index
  }
}
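Once the reindex finishes, the migrated data can be checked in the new index; documents copied this way land under the default _doc type rather than the old account type:

```
GET /newbank/_search
```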

24. IK Analyzer: Basics

# The default (standard) analyzer is mainly suited to English text
POST _analyze
{
  "analyzer": "standard",
  "text":      "我是中国人"
}

# For Chinese, install the IK analyzer: download the matching release from GitHub, unzip it into an ik directory under the Elasticsearch plugins directory, and restart Elasticsearch. Download: https://github.com/medcl/elasticsearch-analysis-ik/releases/tag/v7.4.2
#### Smart segmentation (coarse-grained)
POST _analyze
{
  "analyzer": "ik_smart",
  "text":      "我是中国人"
}

#### Maximum-word segmentation (fine-grained, exhaustive)
POST _analyze
{
  "analyzer": "ik_max_word",
  "text":      "我是中国人"
}

25. IK Analyzer: Custom Dictionary

Install nginx (used here to serve the custom dictionary file over HTTP). Run these in /mydata:

mkdir /mydata/nginx
# Start a throwaway nginx container just to copy its default config out
docker run -p 80:80 --name nginx -d nginx:1.10
# Copy the config directory out of the container into the current directory
docker container cp nginx:/etc/nginx .
# Stop the container
docker stop nginx
# Remove it
docker rm nginx
# Rename the copied nginx directory to conf
mv nginx conf
# Recreate the /mydata/nginx directory and move conf into it
mkdir nginx
mv conf nginx/
# Run nginx again with html, logs, and conf mounted from the host
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf:/etc/nginx \
-d nginx:1.10

Under nginx/html, create an es directory containing a fenci.txt file listing the phrases you want, one per line.

Edit the IKAnalyzer.cfg.xml configuration file under /mydata/elasticsearch/plugins/ik/config

to point at the URL of your custom dictionary,

then restart Elasticsearch.

Subsequent queries will segment text using your custom phrases.
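For reference, the relevant part of IKAnalyzer.cfg.xml then looks roughly like this (the URL is an assumption based on the nginx setup above; adjust host and path to your own):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- remote extension dictionary, served by the nginx container above -->
    <entry key="remote_ext_dict">http://192.168.56.10/es/fenci.txt</entry>
</properties>
```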

26. Operating ES from Java

Port 9300 (the TCP transport protocol) is not recommended; it is deprecated and removed after 8.x.

Use the HTTP port 9200 instead.

Ways to send requests to ES:

1) JestClient: unofficial, slow to update

2) RestTemplate: hand-built HTTP requests; many ES operations must be wrapped yourself, which is tedious

3) HttpClient: same drawback

4) Elasticsearch-Rest-Client: the official RestClient; it wraps the ES operations with a clear, layered API and is easy to pick up

Usage reference: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.4/index.html

a. Add the dependency. Note that Spring Boot manages its own Elasticsearch version, so override it by adding the following under the <properties> tag:

<elasticsearch.version>7.4.2</elasticsearch.version>

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.4.2</version>
        </dependency>

b. Configuration

Register a RestHighLevelClient bean in the container:

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GulimailElasticSearchConfig {

    @Bean
    public RestHighLevelClient esRestClient() {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("192.168.56.10", 9200, "http")));
        return client;
    }

}

c. Test

import org.elasticsearch.client.RestHighLevelClient;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

@RunWith(SpringRunner.class)
@SpringBootTest
public class GulimailSearchApplicationTests {
    @Autowired
private RestHighLevelClient client;
    @Test
    public  void contextLoads() {
        System.out.println(client);
    }

}

d. Calling ES

Configure common request options (e.g. headers) in the configuration class:

import org.apache.http.HttpHost;
import org.elasticsearch.client.HttpAsyncResponseConsumerFactory;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GulimailElasticSearchConfig {

    public static final RequestOptions COMMON_OPTIONS;

    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
//        builder.addHeader("Authorization", "Bearer " + TOKEN);
//        builder.setHttpAsyncResponseConsumerFactory(
//                new HttpAsyncResponseConsumerFactory
//                        .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024 * 1024));
        COMMON_OPTIONS = builder.build();
    }

    @Bean
    public RestHighLevelClient esRestClient() {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("192.168.56.10", 9200, "http")));
        return client;
    }

}

e. Test: index a document

import com.alibaba.fastjson.JSON;
import com.zzw.gulimail.search.config.GulimailElasticSearchConfig;
import lombok.Data;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

import java.io.IOException;

@RunWith(SpringRunner.class)
@SpringBootTest
public class GulimailSearchApplicationTests {
    @Autowired
private RestHighLevelClient client;
    @Test
    public  void contextLoads() {
        System.out.println(client);
    }
    @Test
    /* Save and update in one call */
    public  void indexData() throws IOException {
       IndexRequest request = new IndexRequest("users");
        User user=new User();
        user.setUsername("张山");
        user.setAge(11);
        user.setGender("男");
        request.source(JSON.toJSONString(user), XContentType.JSON);
        IndexResponse index=    client.index(request, GulimailElasticSearchConfig.COMMON_OPTIONS);
        System.out.println(index);
    }
    @Data
    class  User{
        private String username;
        private String gender;
        private Integer  age;
    }
}

f. Complex Search

  @Test
    /* Complex search */
    public  void searchData() throws IOException {
        // Create the search request
        SearchRequest sr = new SearchRequest();
        // Specify the index
        sr.indices("bank");
        // Specify the DSL query conditions
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        // Build the query
        sourceBuilder.query(QueryBuilders.matchQuery("address", "mill"));
        /* Bucket by age value */
        TermsAggregationBuilder ageAgg = AggregationBuilders.terms("ageAgg").field("age").size(10);
        sourceBuilder.aggregation(ageAgg);
        // Compute the average balance
        AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance");
        sourceBuilder.aggregation(balanceAvg);
        // sourceBuilder.from();
        // sourceBuilder.size();
        sr.source(sourceBuilder);
        System.out.println(sourceBuilder);
        // Execute the search
        SearchResponse searchResponse = client.search(sr, GulimailElasticSearchConfig.COMMON_OPTIONS);
        // Analyze the result
        System.out.println(searchResponse.toString());
        // Get the matching documents
        SearchHits hits = searchResponse.getHits();
        SearchHit[] hitsHits = hits.getHits();
        for (SearchHit hit : hitsHits) {
            String sourceAsString = hit.getSourceAsString();
            // Account o = JSON.parseObject(sourceAsString, Account.class);
            // System.out.println("o=" + o);
        }

        // Get the aggregation results
        Aggregations aggregations = searchResponse.getAggregations();
       /* for (Aggregation aggregation : aggregations.asList()) {
            System.out.println(aggregation.getName());
        } */
        Terms ageAgg1 = aggregations.get("ageAgg");
        for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
            String keyAsString = bucket.getKeyAsString();
            System.out.println("age: " + keyAsString + ", count: " + bucket.getDocCount());
        }
        Avg balanceAvg1 = aggregations.get("balanceAvg");
        System.out.println("average balance: " + balanceAvg1.getValue());

    }

 
