Elasticsearch
1. Installation
- Install docker-compose on Linux
## Download the binary
sudo curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
## Grant execute permission
sudo chmod +x /usr/local/bin/docker-compose
- Write the docker-compose file
version: '2'
services:
  elasticsearch:
    container_name: elasticsearch
    image: daocloud.io/library/elasticsearch:6.5.4
    ports:
      - 9200:9200
      - 9300:9300
    environment:
      - ES_JAVA_OPTS=-Xms64m -Xmx128m
      - discovery.type=single-node
      - COMPOSE_PROJECT_NAME=elasticsearch-server
    restart: always
  kibana:
    container_name: kibana
    image: daocloud.io/library/kibana:6.5.4
    ports:
      - 5601:5601
    restart: always
    environment:
      - ELASTICSEARCH_HOSTS=IP:9200
    depends_on:
      - elasticsearch
- Install the IK analyzer plugin
root@..:~# docker exec -it bf84 bash
[root@bf84b051d8dd elasticsearch]# ls
LICENSE.txt NOTICE.txt README.textile bin config data lib logs modules plugins
[root@bf84b051d8dd elasticsearch]# cd bin/
[root@bf84b051d8dd bin]# ./elasticsearch-plugin install http://tomcat01.qfjava.cn:81/elasticsearch-analysis-ik-6.5.4.zip
- Test the IK analyzer
GET _analyze
{
"analyzer": "ik_max_word",
"text": "杰杰怪"
}
2. Basic ES Operations
2.1 ES Structure
2.1.1 Index, shards, and replicas
One ES service can hold multiple indices.
Each index is split into 5 shards by default.
Each shard has at least one replica shard.
Replica shards do not serve searches by default; only when the index is under heavy pressure do replicas help with searches.
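As a sketch of why the shard count is fixed at index creation: a document is assigned to a shard by hashing its routing value (the `_id` by default) modulo `number_of_shards`. ES actually uses murmur3; `String.hashCode` below is only illustrative.

```java
public class ShardRouting {
    // Sketch of ES routing: shard = hash(routing) % number_of_shards.
    // ES uses murmur3 on the _id; String.hashCode here is illustrative only.
    public static int shardFor(String routing, int numberOfShards) {
        // floorMod keeps the result non-negative even for negative hashes
        return Math.floorMod(routing.hashCode(), numberOfShards);
    }

    public static void main(String[] args) {
        // The same id always routes to the same shard, which is why
        // number_of_shards cannot change after the index is created.
        System.out.println(shardFor("QAPt5HUBy0sFrTkyTMnB", 5));
    }
}
```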
2.1.2 Type
Multiple types can be created under one index.
Note: how types are created differs between ES versions.
2.1.3 Document (Doc)
A type can contain multiple documents, comparable to rows in MySQL.
2.1.4 Field
A document contains multiple fields, comparable to columns in MySQL.
2.2 RESTful Syntax
2.2.1 Create an index
# Create an index
# number_of_shards sets the shard count
# number_of_replicas sets the replica count
PUT /person
{
"settings": {
"number_of_shards": 5,
"number_of_replicas": 1
}
}
2.2.2 View indices
1. View in the Kibana UI
2. View via REST
# List existing indices
GET _cat/indices?v
# View a specific index
GET /person
2.2.3 Delete an index
# Delete an index
DELETE /person
2.3 Field Data Types
2.3.1 Create an index with an explicit schema
# Create an index and specify its data structure
# mappings defines the data structure
# novel -- the type
# properties defines the fields
# analyzer sets the analyzer
# index defaults to true; if false, the field cannot be used as a search condition
# store defaults to false: whether to keep an extra stored copy of the field.
# Usually unnecessary, because the value can be retrieved from _source
PUT /book
{
"settings": {
"number_of_shards": 5,
"number_of_replicas": 1
},
"mappings": {
"novel":{
"properties":{
"name":{
"type":"text",
"analyzer":"ik_max_word",
"index":true,
"store":false
},
"author":{
"type":"keyword"
},
"count":{
"type":"long"
},
"onSale":{
"type":"date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
},
"descr":{
"type":"text",
"analyzer":"ik_max_word"
}
}
}
}
}
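The onSale format chain means ES tries each format left to right and uses the first that parses. A sketch of that fallback logic in plain Java (UTC is assumed here purely for illustration):

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class DateFallback {
    // Mimics the mapping's "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis":
    // try each format left to right, use the first one that parses.
    public static long toEpochMillis(String value) {
        try {
            return LocalDateTime.parse(value, DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"))
                    .toInstant(ZoneOffset.UTC).toEpochMilli();
        } catch (DateTimeParseException ignored) { }
        try {
            return LocalDate.parse(value, DateTimeFormatter.ofPattern("yyyy-MM-dd"))
                    .atStartOfDay().toInstant(ZoneOffset.UTC).toEpochMilli();
        } catch (DateTimeParseException ignored) { }
        return Long.parseLong(value); // epoch_millis
    }
}
```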
2.4 Document Operations
A document's unique identifier in an ES service is the combination of three values: _index, _type, and _id. Together they locate a single document and determine whether an operation is an add or an update.
2.4.1 添加文档
- 自动生成Id
# 添加文档,自动生成Id
POST /book/novel
{
"name":"元尊",
"author":"天蚕土豆",
"count":100000,
"onSale":"2020-10-12",
"descr":"周元成神的历程!!"
}
- Specified id
# Add a document with a specified id
POST /book/novel/1
{
"name":"元尊111",
"author":"天蚕土豆111",
"count":100000,
"onSale":"2020-10-12",
"descr":"周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!周元成神的历程!!"
}
2.4.2 Update a document
- Full replacement
# Replace the document
PUT /book/novel/1
{
"name":"元尊",
"author":"天蚕土豆",
"count":100000,
"onSale":"2020-10-12",
"descr":"周元成神的历程!!"
}
GET /book/novel/1
- Partial update
POST /book/novel/1/_update
{
"doc":{
"count" : 10
}
}
GET /book/novel/1
2.4.3 Delete a document
# Delete a document
DELETE /book/novel/QAPt5HUBy0sFrTkyTMnB
2.5 Bulk Operations
Bulk operations are not atomic: each action in the batch succeeds or fails independently.
# Bulk operations
# index (add), delete, update
PUT /book/novel/_bulk
{"index":{"_id":"556665"}}
{"name":"元尊","author":"天蚕土豆","count":100000,"onSale":"2020-10-12","descr":"周元成神的历程!!"}
{"delete":{"_id":"4"}}
{"update":{"_id":"556665"}}
{"doc":{"name":"武威"}}
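The body format above is NDJSON: an action line, then a source line for index/update (delete has none), and every line, including the last, must end with a newline. A sketch of building that body in Java (the field subset is trimmed for brevity):

```java
public class BulkBody {
    // Build an NDJSON _bulk body: one action line per operation,
    // followed by a source line for index/update (delete has none).
    // Every line, including the last, must end with '\n'.
    public static String build() {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"index\":{\"_id\":\"556665\"}}\n");
        sb.append("{\"name\":\"元尊\",\"count\":100000}\n");
        sb.append("{\"delete\":{\"_id\":\"4\"}}\n");
        sb.append("{\"update\":{\"_id\":\"556665\"}}\n");
        sb.append("{\"doc\":{\"name\":\"武威\"}}\n");
        return sb.toString();
    }
}
```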
2.6 QueryString
GET /ems/emp/_search?q=*&sort=age:asc
_search  the search API
q=*      match all documents
sort     sort results by the given field
# QueryString
GET /ems/emp/_search?q=*&sort=age:asc
# Pagination: from = (pageNum - 1) * size
GET /ems/emp/_search?q=*&sort=age:asc&size=5&from=0
# _source selects which fields to return
GET /ems/emp/_search?q=*&sort=age:asc&size=5&from=0&_source=name,age
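The pagination comment above can be checked with a one-liner; pageNum is assumed to be 1-based:

```java
public class PageMath {
    // from = (pageNum - 1) * size, with pageNum starting at 1.
    public static int from(int pageNum, int size) {
        return (pageNum - 1) * size;
    }
}
```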
2.7 QueryDSL
- Test data
DELETE /ems
PUT /ems
{
"mappings":{
"emp":{
"properties":{
"name":{
"type":"text"
},
"age":{
"type":"integer"
},
"bir":{
"type":"date"
},
"content":{
"type":"text"
},
"address":{
"type":"keyword"
}
}
}
}
}
# Insert test data
PUT /ems/emp/_bulk
{"index":{}}
{"name":"小黑","age":23,"bir":"2012-12-12","content":"为开发团队选择一款优秀的MVC框架是件难事儿,在众多可行的方案中决择需要很高的经验和水平","address":"北京"}
{"index":{}}
{"name":"王小黑","age":24,"bir":"2012-12-12","content":"Spring 框架是一个分层架构,由 7 个定义良好的模块组成。Spring 模块构建在核心容器之上,核心容器定义了创建、配置和管理 bean 的方式","address":"上海"}
{"index":{}}
{"name":"张小五","age":8,"bir":"2012-12-12","content":"Spring Cloud 作为Java 语言的微服务框架,它依赖于Spring Boot,有快速开发、持续交付和容易部署等特点。Spring Cloud 的组件非常多,涉及微服务的方方面面,井在开源社区Spring 和Netflix 、Pivotal 两大公司的推动下越来越完善","address":"无锡"}
{"index":{}}
{"name":"win7","age":9,"bir":"2012-12-12","content":"Spring的目标是致力于全方位的简化Java开发。 这势必引出更多的解释, Spring是如何简化Java开发的?","address":"南京"}
{"index":{}}
{"name":"梅超风","age":43,"bir":"2012-12-12","content":"Redis是一个开源的使用ANSI C语言编写、支持网络、可基于内存亦可持久化的日志型、Key-Value数据库,并提供多种语言的API","address":"杭州"}
{"index":{}}
{"name":"张无忌","age":59,"bir":"2012-12-12","content":"ElasticSearch是一个基于Lucene的搜索服务器。它提供了一个分布式多用户能力的全文搜索引擎,基于RESTful web接口","address":"北京"}
2.7.1 match_all
# match_all: query all documents
GET /ems/emp/_search
{
"query": {
"match_all": {
}
}
}
2.7.2 sort
# Analyzed (tokenized) fields cannot be sorted on
# sort
GET /ems/emp/_search
{
"query": {
"match_all": {
}
},
"sort": [
{
"bir": {
"order": "desc"
},
"age": {
"order": "desc"
}
}
]
}
2.7.3 _source, from, size
# from
# size
GET /ems/emp/_search
{
"query": {
"match_all": {
}
},
"size": 3,
"from": 0
}
# _source
GET /ems/emp/_search
{
"query": {
"match_all": {
}
},
"size": 3,
"from": 0,
"_source": ["age","name"]
}
2.7.4 term
# term
# text fields are analyzed; other types are not
# ES uses the standard analyzer by default
# standard: Chinese is split into single characters; English is split into words
GET /ems/emp/_search
{
"query": {
"term": {
"content": {
"value": "良好"
}
}
}
}
Only text fields are analyzed; other types are indexed as-is.
2.7.5 range
# range query
# gte: greater than or equal
# lte: less than or equal
GET /ems/emp/_search
{
"query": {
"range": {
"age": {
"gte": 20,
"lte": 30
}
}
}
}
2.7.6 prefix
# prefix: query by term prefix (applied to the indexed tokens, however the field was analyzed)
GET /ems/emp/_search
{
"query": {
"prefix": {
"name": {
"value": "张"
}
}
}
}
2.7.7 wildcard
# wildcard query: ? matches exactly one character, * matches zero or more
GET /ems/emp/_search
{
"query": {
"wildcard": {
"name": {
"value": "张*"
}
}
}
}
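The `?` and `*` semantics can be sketched by translating the pattern to a regex. This is illustrative only; ES evaluates wildcards against indexed terms, not via Java regex:

```java
import java.util.regex.Pattern;

public class WildcardMatch {
    // Translate ES wildcard syntax to a regex:
    // '?' -> exactly one character, '*' -> zero or more characters.
    public static boolean matches(String pattern, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') regex.append(".*");
            else if (c == '?') regex.append('.');
            else regex.append(Pattern.quote(String.valueOf(c)));
        }
        return term.matches(regex.toString());
    }
}
```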
2.7.8 ids
# ids: query by multiple ids
GET /ems/emp/_search
{
"query": {
"ids": {
"values": ["bQPJ6HUBy0sFrTkyq8nm","agPJ6HUBy0sFrTkyo8mx"]
}
}
}
2.7.9 fuzzy
# fuzzy: approximate matching; the edit distance ranges from 0 to 2
# A term of length up to 2 allows no fuzziness, length 3-5 allows 1 edit, length over 5 allows at most 2 edits
GET /ems/emp/_search
{
"query": {
"fuzzy": {
"content": "elasticsearch11"
}
}
}
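The length rules in the comment match what a helper like this would compute (a sketch of AUTO-style fuzziness):

```java
public class AutoFuzziness {
    // AUTO-style fuzziness: terms up to 2 characters allow 0 edits,
    // 3-5 characters allow 1 edit, longer terms allow 2 edits.
    public static int maxEdits(String term) {
        int len = term.length();
        if (len <= 2) return 0;
        if (len <= 5) return 1;
        return 2;
    }
}
```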
2.7.10 bool
# bool query -- combine multiple conditions
# must: like &&
# should: like ||
# must_not: none of these clauses may match
# ---- Find records with age 20-30 whose address is not 北京
GET /ems/emp/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"age": {
"gte": 20,
"lte": 30
}
}
}
],
"must_not": [
{
"term": {
"address": {
"value": "北京"
}
}
}
]
}
}
}
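The bool query above reads as a plain boolean predicate over each document; a sketch that hard-codes this example's conditions:

```java
public class BoolQuerySketch {
    // The bool query above as a predicate: must clauses are AND-ed,
    // and every must_not clause has to fail.
    public static boolean matches(int age, String address) {
        boolean must = age >= 20 && age <= 30;    // range: age 20..30
        boolean mustNot = "北京".equals(address);  // term: address = 北京
        return must && !mustNot;
    }
}
```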
2.7.11 highlight
# highlight: re-render matched terms in the query results
# fields specifies which fields to highlight
# pre_tags / post_tags customize the highlight tags
GET /ems/emp/_search
{
"query": {
"term": {
"content": {
"value": "redis"
}
}
},
"highlight": {
"pre_tags": ["<span style='color:red'>"],
"post_tags":["</span>"],
"fields": {
"content": {}
}
}
}
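In miniature, highlighting is a string rewrite of each hit's matched field; a sketch of what pre_tags/post_tags do (real ES highlights analyzed terms, not raw substrings):

```java
public class Highlighter {
    // Wrap every occurrence of the matched term in pre/post tags,
    // mimicking what the highlight section does to each hit.
    public static String highlight(String text, String term,
                                   String preTag, String postTag) {
        return text.replace(term, preTag + term + postTag);
    }
}
```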
2.7.12 multi_match
# multi_match: query across multiple fields; may analyze the query
# 1. If the target field is analyzed, the query string is analyzed first, then searched
# 2. If the target field is not analyzed, the whole query string is matched against the field as-is
GET /ems/emp/_search
{
"query": {
"multi_match": {
"query": "小Redis",
"fields": ["address","content"]
}
}
}
2.7.13 query_string
# query_string: analyzed query across multiple fields
# The query string is always analyzed first, regardless of whether the field is analyzed
GET /ems/emp/_search
{
"query": {
"query_string": {
"fields": ["name","content"],
"query": "小Redis"
}
}
}
2.8 How the Index Store Works Internally
- Each index has one type, and each type has one mapping (6.x < version < 7.x)
- A search first hits the index area, then follows the addresses found there into the metadata area
- While reading the metadata area, results are ranked by relevance
2.9 Using the IK Analyzer
- Specify the analyzer when creating the index
2.9.1 Configure extension words
IK supports custom **extension dictionaries** and **stopword dictionaries**. An extension dictionary holds words that are not ordinary terms but that you still want ES to index as search terms. A stopword dictionary holds words that are terms but that, for business reasons, you do not want to be searchable. Both are configured in the IKAnalyzer.cfg.xml file under the analyzer's config directory. NOTE: dictionary files must be UTF-8 encoded, or they will not take effect.
1. Edit IKAnalyzer.cfg.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- Configure your own extension dictionary here -->
<entry key="ext_dict">ext_dict.dic</entry>
<!-- Configure your own extension stopword dictionary here -->
<entry key="ext_stopwords">ext_stopword.dic</entry>
</properties>
2. Create ext_dict.dic under the analyzer's config directory; the file must be UTF-8 encoded to take effect
vim ext_dict.dic and add the extension words
3. Create ext_stopword.dic under the analyzer's config directory
vim ext_stopword.dic and add the stopwords
4. Restart ES for the changes to take effect
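Since the dictionaries must be UTF-8, it is safest to write them with an explicit charset; a sketch (the file name and word list are examples):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class DictWriter {
    // Write a dictionary file explicitly as UTF-8, one word per line,
    // since IK ignores dictionaries in any other encoding.
    public static Path write(Path file, List<String> words) throws IOException {
        return Files.write(file, words, StandardCharsets.UTF_8);
    }
}
```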
2.9.2 Remote dictionaries
Edit IKAnalyzer.cfg.xml: the same file also accepts remote dictionary entries that point to an HTTP URL, which IK polls for updates.
2.10 FilterQuery (filter queries)
The filter runs first: it narrows the candidate set before the scored query and does not affect relevance scores.
# filter+range
GET /ems/emp/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"content": {
"value": "框架"
}
}
}
],
"filter": {
"range": {
"age": {
"gte": 0,
"lte": 23
}
}
}
}
}
}
# filter+terms
# match any of several terms
GET /ems/emp/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"name": {
"value": "小黑"
}
}
}
],
"filter": {
"terms": {
"content": ["架构","redis"]
}
}
}
}
}
# filter+exists+field
# find records that have an address field
GET /ems/emp/_search
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"filter": {
"exists": {
"field": "address"
}
}
}
}
}
# filter+ids
# search for matches among the given ids
GET /ems/emp/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"name": {
"value": "小五"
}
}
}
],
"filter": {
"ids": {
"values": [
"Dgbi8nUBO8drTpWlNJwF","EAbi8nUBO8drTpWlNJwF","Fwbj8nUBO8drTpWlaJyE"
]
}
}
}
}
}
3. ElasticSearch Exercises (Java)
3.1 How the ES Service Is Used
When a user searches, the application queries the ES service for matching information,
then uses the result (the information you want) to look up the database.
When data changes, the relevant data can also be pushed into the ES service.
3.2 Example
3.2.1 pom.xml
<!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>6.5.4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.elasticsearch.client/transport -->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>transport</artifactId>
<version>6.5.4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.elasticsearch.plugin/transport-netty4-client -->
<dependency>
<groupId>org.elasticsearch.plugin</groupId>
<artifactId>transport-netty4-client</artifactId>
<version>6.5.4</version>
</dependency>
3.2.2 Create a connection
The cluster.name value must match the one in the cluster's configuration file.
public class TestESClient {
private TransportClient transportClient;
@Before
public void before() throws UnknownHostException {
Settings setting = Settings.builder().put("cluster.name",
"docker-cluster").build();
//create the client
this.transportClient = new PreBuiltTransportClient(setting);
transportClient.addTransportAddress(new TransportAddress(InetAddress.getByName("116.62.131.109"),9300));
}
@After
public void after(){
transportClient.close();
}
}
3.2.3 Create an index
@Test
public void testCreateIndex(){
//create the index
CreateIndexResponse createIndexResponse = transportClient.admin().indices().prepareCreate("dangdang").get();
//read the response
boolean acknowledged = createIndexResponse.isAcknowledged();
System.out.println(acknowledged);
}
3.2.4 Create the mapping
@Test
public void testMapper() throws ExecutionException, InterruptedException {
CreateIndexRequest dangdangIndex = new CreateIndexRequest("dangdang");
String json = "{\"properties\":{\"name\":{\"type\":\"text\",\"analyzer\":\"ik_max_word\",\"search_analyzer\":\"ik_max_word\"},\"age\":{\"type\":\"integer\"},\"bir\":{\"type\":\"date\"},\"content\":{\"type\":\"text\",\"analyzer\":\"ik_max_word\",\"search_analyzer\":\"ik_max_word\"},\"address\":{\"type\":\"keyword\"}}}";
//set the type and mapping
dangdangIndex.mapping("dangdang",json, XContentType.JSON);
//execute the create
CreateIndexResponse createIndexResponse = transportClient.admin().indices().create(dangdangIndex).get();
System.out.println(createIndexResponse.isAcknowledged());
}
3.2.5 Add documents
@Test
public void testCreateDocOptionId(){
//specified id
Book book = new Book("12", "我的未来", 12, new Date(), "未来可期,钱多多!", "广西");
String toJSONString = JSONObject.toJSONString(book);
//add the document
IndexResponse indexResponse = transportClient.prepareIndex("dangdang", "dangdang", "1").setSource(toJSONString, XContentType.JSON).get();
System.out.println(indexResponse.status());
}
@Test
public void testCreateDocAutoId(){
//auto-generated id
Book book = new Book("12", "我喜欢谁?谁喜欢我?", 12, new Date(), "对爱情的迷茫!", "广东");
String toJSONString = JSONObject.toJSONString(book);
//add the document
IndexResponse indexResponse = transportClient.prepareIndex("dangdang", "dangdang").setSource(toJSONString, XContentType.JSON).get();
System.out.println(indexResponse.status());
}
3.2.6 Update a document
@Test
public void testUpdateDoc(){
Book book = new Book("1", "我的未来-update", 12, new Date(), "未来可期,钱多多!-update", "广西-update");
String toJSONString = JSONObject.toJSONStringWithDateFormat(book,"yyyy-MM-dd");
//update the document
UpdateResponse updateResponse = transportClient.prepareUpdate("dangdang", "dangdang", book.getId()).setDoc(toJSONString, XContentType.JSON).get();
System.out.println(updateResponse.status());
}
3.2.7 Delete a document
@Test
public void testDeleteDoc(){
//delete the document
DeleteResponse deleteResponse = transportClient.prepareDelete("dangdang", "dangdang", "1").get();
System.out.println(deleteResponse.status());
}
3.2.8 Get a document
@Test
public void testGetDocOne(){
//get a single document
GetResponse getResponse = transportClient.prepareGet("dangdang", "dangdang", "12").get();
String string = getResponse.getSourceAsString();
Book book = JSONObject.parseObject(string, Book.class);
System.out.println(book);
}
3.2.9 Conditional Queries
3.2.9.1 Query all
@Test
public void testGetDocList() {
//query all documents
MatchAllQueryBuilder matchAllQueryBuilder = QueryBuilders.matchAllQuery();
SearchResponse searchResponse = transportClient.prepareSearch("dangdang").setTypes("dangdang")
.setQuery(matchAllQueryBuilder).get();
System.out.println("Total hits: " + searchResponse.getHits().getTotalHits());
System.out.println("Max score: " + searchResponse.getHits().getMaxScore());
SearchHit[] hits = searchResponse.getHits().getHits();
for (SearchHit hit :hits) {
Book book = JSONObject.parseObject(hit.getSourceAsString(), Book.class);
System.out.println(book);
}
}
3.2.9.2 term query
@Test
public void testGetDocListByTerm() {
//term query
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("content","未来");
SearchResponse searchResponse = transportClient.prepareSearch("dangdang").setTypes("dangdang")
.setQuery(termQueryBuilder).get();
System.out.println("Total hits: " + searchResponse.getHits().getTotalHits());
System.out.println("Max score: " + searchResponse.getHits().getMaxScore());
SearchHit[] hits = searchResponse.getHits().getHits();
for (SearchHit hit :hits) {
Book book = JSONObject.parseObject(hit.getSourceAsString(), Book.class);
System.out.println(book);
}
}
3.2.9.3 Bulk operations
3.2.9.4 Other queries