Elasticsearch Basics: Hands-On Practice

Latest recommended article published on 2024-06-20 15:10:22

HLng-Z

Tags: elasticsearch, java, big data

Article link: https://blog.csdn.net/m0_61066780/article/details/129686098

ElasticSearch

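ELK: Elasticsearch + Logstash + Kibana

Lucene is an information-retrieval toolkit shipped as a jar package; it is not a complete search engine system.

It provides index structures, tools for reading and writing indexes, sorting and search rules, and related utility classes.

The relationship between Lucene and Elasticsearch

Elasticsearch wraps and enhances Lucene, which makes it much easier to get started with.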

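Installation

Note: JDK 1.8 is the minimum requirement. You will also need an ES client and a UI tool.

For Java development, the ES version must match the version of the core Java jars used later on. Make sure the JDK environment works properly.

Download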

https://www.elastic.co/cn/downloads/elasticsearch

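Directory structure

bin: startup scripts

config: configuration files

log4j2: logging configuration file

jvm.options: JVM-related settings

elasticsearch.yml: the Elasticsearch configuration file (default port 9200, CORS settings)

lib: dependent jars

logs: log files

modules: functional modules

plugins: plugins, for example the IK analyzer

Installing the elasticsearch-head visualization plugin

Download: https://github.com/mobz/elasticsearch-head

Start it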

npm i

npm run start

Configure CORS (in elasticsearch.yml)

http.cors.enabled: true

http.cors.allow-origin: '*'

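Restart ES and connect again.

Installing Kibana

Download: https://www.elastic.co/cn/downloads/kibana

Inverted index, score, weight

IK analyzer

Tokenization means splitting a piece of Chinese (or any other) text into individual keywords. When we search, our input is tokenized, the data in the database or index is tokenized as well, and the two sets of tokens are then matched. The default Chinese tokenization treats every character as a separate word: for example, "我爱狂神" is split into "我", "爱", "狂", "神", which is clearly not what we want, so we install the Chinese analyzer IK to solve this. IK provides two analyzers: ik_smart, which produces the coarsest split (fewest tokens), and ik_max_word, which produces the finest-grained split. We will test both shortly.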
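To make the difference concrete, here is a minimal sketch in plain Java. It is not the real IK or Elasticsearch code: TokenizerSketch, perCharacter and dictionaryBased are hypothetical names, and the greedy longest-match strategy is only an assumed stand-in for a dictionary-based analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy tokenizers for illustration; NOT the real IK or Elasticsearch implementation. */
class TokenizerSketch {

    /** Mimics the default behaviour described above: every character becomes its own token. */
    static List<String> perCharacter(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(String.valueOf(text.charAt(i)));
        }
        return tokens;
    }

    /** Greedy forward longest-match against a word list (an assumed, simplified strategy). */
    static List<String> dictionaryBased(String text, Set<String> dictionary, int maxWordLength) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(text.length(), pos + maxWordLength);
            // Shrink the candidate until it is a dictionary word or a single character.
            while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                end--;
            }
            tokens.add(text.substring(pos, end));
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("狂神"));
        System.out.println(perCharacter("我爱狂神"));
        System.out.println(dictionaryBased("我爱狂神", dictionary, 4));
    }
}
```

With "狂神" in the word list, the dictionary-based version keeps it as one token, which is the effect a custom dictionary entry is meant to have.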

Installation

Download: https://github.com/medcl/elasticsearch-analysis-ik

After downloading, place it in the ES plugins directory and restart ES.

GET _analyze
{
  "analyzer":"ik_smart",
  "text":"我爱你"
}

GET _analyze
{
  "analyzer":"ik_max_word",
  "text":"我爱你"
}

Adding your own dictionary to the IK analyzer

Create your own xx.dic file and register it in the IK configuration file.

Basic index operations

PUT /index_name/type_name/document_id
{request body}

PUT /test1/type/1
{
  "name":"张三",
  "age":13
}
// Create an index with explicit mapping rules
PUT /test2
{
  "mappings":{
    "properties":{
      "name":{
        "type":"text"
      },
      "age":{
        "type":"long"
      },
      "birthday":{
        "type":"date"
      }
    }
  }
}
-----------------
// GET retrieves the index details
GET test2

-----------------
// Index a document without a mapping, then inspect the default field types
PUT /test3/_doc/1
{
  "name":"张",
  "age":"13"
}

GET test3

// Extra: the _cat API
GET _cat/indices?v

// Update
POST /test3/_doc/1/_update
{
  "doc":{
      "name":"张三"
  }
}
// Delete
DELETE test1

The index is created automatically, and the data is added successfully.

Basic document operations

PUT /test1/user/1
{
  "name":"张三",
  "age":13,
  "desc":"哈哈哈哈",
  "tags":["技术宅","旅游"]
}
PUT /test1/user/2
{
  "name":"李四",
  "age":16,
  "desc":"嘻嘻嘻嘻",
  "tags":["技术宅","运动"]
}

GET test1/user/1
// Update by overwriting the whole document (any field left out is lost)
PUT /test1/user/2
{
  "name":"李四2323",
  "age":16,
  "desc":"嘻嘻嘻嘻",
  "tags":["技术宅","运动"]
}
// Update only the given fields (recommended)
POST /test1/user/2/_update
{
  "doc":{
      "name":"李四"
  }
}

Simple conditional query

GET test1/user/_search?q=name:李四

Complex conditional queries

GET test1/user/_search
{
  "query":{
    "match":{
        "name":"李"
    }
  },
  "_source":["name","desc"]
}

// Sort by age in descending order; from and size paginate the results (like LIMIT)
GET test1/user/_search
{
  "query":{
    "match":{
        "name":"李"
    }
  },
  "sort":[
    {
      "age":{
          "order":"desc"
      }
    }
  ],
  "from":0,
  "size":5
}
// Multi-condition query: must is the equivalent of AND
GET test1/user/_search
{
  "query":{
    "bool":{
      "must":[
        {
          "match":{
             "name":"李四"
          }
        },
        {
          "match":{
             "age":13
          }
        }
      ]
    }
  }
}
// Multi-condition query: should is the equivalent of OR
GET test1/user/_search
{
  "query":{
    "bool":{
      "should":[
        {
          "match":{
             "name":"李四"
          }
        },
        {
          "match":{
             "age":13
          }
        }
      ]
    }
  }
}
// Multi-condition query: must_not is the equivalent of NOT / !=
GET test1/user/_search
{
  "query":{
    "bool":{
      "must_not":[
        {
          "match":{
             "age":13
          }
        }
      ]
    }
  }
}
// Multi-condition query: filter narrows the results; range supports gt, gte, lt, lte
GET test1/user/_search
{
  "query":{
    "bool":{
      "must":[
        {
          "match":{
             "name":"李四"
          }
        }
      ],
      "filter":{
         "range": {
            "age":{
              "gte":10,
              "lte":20 
            }
          }
      }
    }
  }
}
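The four bool clauses can be sketched as plain predicates over an in-memory list. This is only an illustration under stated assumptions: BoolQuerySketch, its User class and the helper predicates are hypothetical names, scoring is ignored entirely, and "match" is approximated by a substring or equality check. The one rule taken from Elasticsearch itself is that should clauses become mandatory (at least one must match) only when there is no must or filter clause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

/** In-memory imitation of bool query clause semantics; not an Elasticsearch API. */
class BoolQuerySketch {

    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    /** Sample data mirroring the two documents indexed above. */
    static final List<User> USERS =
            Arrays.asList(new User("张三", 13), new User("李四", 16));

    static Predicate<User> nameContains(String text) {
        return u -> u.name.contains(text);
    }

    static Predicate<User> ageEquals(int age) {
        return u -> u.age == age;
    }

    static Predicate<User> ageBetween(int gte, int lte) {
        return u -> u.age >= gte && u.age <= lte;
    }

    /** Returns the names of the users that satisfy the combined clauses. */
    static List<String> search(List<User> users,
                               List<Predicate<User>> must,
                               List<Predicate<User>> should,
                               List<Predicate<User>> mustNot,
                               List<Predicate<User>> filter) {
        List<String> names = new ArrayList<>();
        // should is only mandatory when no must/filter clause is present
        boolean shouldIsRequired = must.isEmpty() && filter.isEmpty() && !should.isEmpty();
        for (User u : users) {
            boolean ok = must.stream().allMatch(p -> p.test(u))        // AND
                    && filter.stream().allMatch(p -> p.test(u))        // AND, without scoring
                    && mustNot.stream().noneMatch(p -> p.test(u));     // NOT
            if (shouldIsRequired) {
                ok = ok && should.stream().anyMatch(p -> p.test(u));   // OR
            }
            if (ok) {
                names.add(u.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Predicate<User>> none = Collections.emptyList();
        // must: name matches 李四 AND age in [10, 20] via filter
        System.out.println(search(USERS,
                Arrays.asList(nameContains("李四")), none, none,
                Arrays.asList(ageBetween(10, 20))));
    }
}
```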

// Multiple search terms in one match query, separated by spaces
GET test1/user/_search
{
  "query":{
    "match":{
     "tags":"男 技术"
    }
  }
}

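Exact lookup

A term query looks up the specified term directly in the inverted index, without analyzing it.

About tokenization

term: searches for the exact term as given.

match: runs the query text through the analyzer first, then searches with the resulting tokens.

Two string field types: text and keyword

A keyword field is not analyzed, so it can only be matched by its exact value.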
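The distinction is easier to see on a toy inverted index. The sketch below is an assumption-laden illustration, not Elasticsearch code: TermVsMatchSketch is a hypothetical class, and its "analyzer" simply splits text into single characters, the way the default handling of Chinese was described earlier. An analyzed instance stands in for a text field, a non-analyzed one for a keyword field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy single-field inverted index contrasting term and match lookups. */
class TermVsMatchSketch {
    // token -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final boolean analyzed; // true ~ text field, false ~ keyword field

    TermVsMatchSketch(boolean analyzed) {
        this.analyzed = analyzed;
    }

    /** Analyzed fields split into single characters; keyword-like fields keep the whole value. */
    private List<String> tokens(String value) {
        if (!analyzed) {
            return Collections.singletonList(value);
        }
        List<String> result = new ArrayList<>();
        for (char c : value.toCharArray()) {
            if (!Character.isWhitespace(c)) {
                result.add(String.valueOf(c));
            }
        }
        return result;
    }

    void index(int docId, String value) {
        for (String token : tokens(value)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** term: look the value up as-is, with no analysis. */
    Set<Integer> term(String value) {
        return new TreeSet<>(postings.getOrDefault(value, Collections.emptySet()));
    }

    /** match: analyze the query text first, then collect documents containing any token. */
    Set<Integer> match(String query) {
        Set<Integer> result = new TreeSet<>();
        for (String token : tokens(query)) {
            result.addAll(term(token));
        }
        return result;
    }

    public static void main(String[] args) {
        TermVsMatchSketch textField = new TermVsMatchSketch(true);
        textField.index(1, "李四");
        textField.index(2, "张三");
        System.out.println(textField.term("李四"));  // whole value is not a token here
        System.out.println(textField.match("李四")); // tokens 李 and 四 are looked up
    }
}
```

This also hints at why a term query for a multi-character value against an analyzed text field can come back empty, while the same query against a keyword field finds the document.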

Highlighting

// Custom highlight tags via pre_tags and post_tags; highlight the field the query matches
GET test1/user/_search
{
  "query":{
    "match":{
     "tags":"男 技术"
    }
  },
  "highlight":{
    "pre_tags":"<p class='key' style='color:red'>",
    "post_tags":"</p>",
    "fields":{
        "tags":{}
    }
  }
}
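Conceptually, highlighting wraps each matched term in the configured pre and post tags. The following is a rough approximation using plain string replacement; HighlightSketch and highlight are hypothetical names, and the real highlighter works on analyzed tokens and returns fragments rather than doing a literal replace.

```java
/** Approximates the visible effect of pre_tags/post_tags; not how Elasticsearch implements it. */
class HighlightSketch {

    /** Wraps every literal occurrence of term in fieldValue with the given tags. */
    static String highlight(String fieldValue, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return fieldValue; // nothing to highlight
        }
        return fieldValue.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        String pre = "<p class='key' style='color:red'>";
        String post = "</p>";
        System.out.println(highlight("技术宅", "技术", pre, post));
    }
}
```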

Spring Boot integration

Maven

<dependency>
  <groupId>org.elasticsearch.client</groupId>
  <artifactId>elasticsearch-rest-high-level-client</artifactId>
  <version>7.6.2</version>
</dependency>

Creating the client object

RestHighLevelClient client = new RestHighLevelClient(
    RestClient.builder(
      new HttpHost("localhost", 9200, "http"),
      new HttpHost("localhost", 9201, "http")));

client.close();

Examine the methods the client class provides

Basic project configuration

<properties>
  <!-- Override the managed dependency version so that it matches the ES version in use -->
    <elasticsearch.version>7.6.2</elasticsearch.version>
</properties>

config

@Configuration
public class ElasticSearchClientConfig {
  
  @Bean
  public RestHighLevelClient restHighLevelClient(){
    RestHighLevelClient client = new RestHighLevelClient(
      RestClient.builder(
        new HttpHost("localhost",9200,"http")
       )
    );
    return client;
  }
}

@Autowired
private RestHighLevelClient client;

// Create an index
@Test
void test() throws IOException {
    CreateIndexRequest request = new CreateIndexRequest("zhang_index");
    // Execute the request
    CreateIndexResponse rep = client.indices().create(request, RequestOptions.DEFAULT);
    System.out.println(rep);
}
// Check whether the index exists
@Test
void test1() throws IOException {
    GetIndexRequest request = new GetIndexRequest("zhang_index");

    boolean b = client.indices().exists(request, RequestOptions.DEFAULT);
    System.out.println(b);
}
// Delete the index
@Test
void test3() throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest("zhang_index");

    client.indices().delete(request, RequestOptions.DEFAULT);
}

// Add a document
@Test
void test4() throws IOException {
    // Entity object
    User user = new User();
    // Build the index request
    IndexRequest request = new IndexRequest("zhang_index");
    // Equivalent to: PUT /zhang_index/_doc/1
    request.id("1");
    request.timeout("1s");
    // request.timeout(TimeValue.timeValueSeconds(1));
    // Put the data into the request as JSON
    request.source(JSON.toJSONString(user), XContentType.JSON);
    // Send the request through the client
    IndexResponse index = client.index(request, RequestOptions.DEFAULT);
}

// Check whether a document exists
@Test
void test5() throws IOException {
    // Build the get request
    GetRequest request = new GetRequest("zhang_index", "1");
    // Do not fetch the _source content
    request.fetchSourceContext(new FetchSourceContext(false));
    request.storedFields("_none_");

    // Send the request through the client
    boolean b = client.exists(request, RequestOptions.DEFAULT);
}

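// Get a document
@Test
void test6() throws IOException {
    // Build the get request
    GetRequest request = new GetRequest("zhang_index", "1");
    // Send the request through the client
    GetResponse res = client.get(request, RequestOptions.DEFAULT);
    // Read the document source as a JSON string
    res.getSourceAsString();
}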

// Update a document
@Test
void test7() throws IOException {
    // Build the update request
    UpdateRequest request = new UpdateRequest("zhang_index", "1");
    request.timeout("1s");

    User user = new User("zhang", 18);
    // Only the fields present in the JSON are updated
    request.doc(JSON.toJSONString(user), XContentType.JSON);
    // Send the request through the client
    UpdateResponse res = client.update(request, RequestOptions.DEFAULT);
    // Read the response status
    res.status();
}
// Delete a document
@Test
void test8() throws IOException {
    // Build the delete request
    DeleteRequest request = new DeleteRequest("zhang_index", "3");
    request.timeout("1s");

    // Send the request through the client
    DeleteResponse res = client.delete(request, RequestOptions.DEFAULT);
    // Read the response status
    res.status();
}

// Bulk insert
@Test
void test9() throws IOException {
    BulkRequest bulkRequest = new BulkRequest();
    bulkRequest.timeout("10s");

    ArrayList<User> userList = new ArrayList<>();
    userList.add(new User("张三", 18));
    userList.add(new User("张三1", 12));
    userList.add(new User("张三2", 13));

    for (int i = 0; i < userList.size(); i++) {
      bulkRequest.add(
        new IndexRequest("zhang_index")
        .id("" + (i + 1))
        .source(JSON.toJSONString(userList.get(i)), XContentType.JSON)
      );
    }

    // Send the request through the client
    BulkResponse res = client.bulk(bulkRequest, RequestOptions.DEFAULT);
    // hasFailures() returns false when every operation succeeded
    res.hasFailures();
}

// Search
// SearchRequest: the search request
// SearchSourceBuilder: builds the search conditions
// -- HighlightBuilder: highlighting
// -- TermQueryBuilder: exact (term) query
// -- xxxQueryBuilder: other query types
@Test
void test10() throws IOException {
    SearchRequest request = new SearchRequest("zhang_index");
    // Build the search conditions
    SearchSourceBuilder sourceBuild = new SearchSourceBuilder();
    // Exact (term) query
    TermQueryBuilder build = QueryBuilders.termQuery("name", "张三1");
    // To match everything instead: QueryBuilders.matchAllQuery()
    sourceBuild.query(build);
    // request.timeout("60s")
    sourceBuild.timeout(new TimeValue(60, TimeUnit.SECONDS));
    // sourceBuild.from();
    // sourceBuild.size();
    request.source(sourceBuild);

    // Send the request through the client
    SearchResponse res = client.search(request, RequestOptions.DEFAULT);
    // Read the hits
    res.getHits();
    res.getHits().getHits();
}

Crawler

jsoup

 <dependency> 
   <groupId>org.jsoup</groupId> 
   <artifactId>jsoup</artifactId> 
   <version>1.7.3</version> 
</dependency>

@Component
public class HtmlParseUtil {

  public List<Content> parseJD(String keywords) throws Exception {
    String url = "https://search.jd.com/Search?keyword=" + keywords;
    // Fetch and parse the page with a 30-second timeout
    Document document = Jsoup.parse(new URL(url), 30000);
    Element element = document.getElementById("J_goodsList");
    // Get all li tags
    Elements elements = element.getElementsByTag("li");
    ArrayList<Content> list = new ArrayList<>();
    // Extract the content of each item
    for (Element el : elements) {
      // Images are lazy-loaded, so read the source-data-lazy-img attribute
      String img = el.getElementsByTag("img").eq(0).attr("source-data-lazy-img");

      String price = el.getElementsByClass("p-price").eq(0).text();

      String title = el.getElementsByClass("p-name").eq(0).text();
      Content content = new Content();
      content.setTitle(title);
      content.setImg(img);
      content.setPrice(price);
      list.add(content);
    }
    return list;
  }
}

@Data
public class Content {
  private String title;
  private String img;
  private String price;
}

@Service
public class ContentService {
  @Autowired
  private RestHighLevelClient restHighLevelClient;

  // Parse the search page and bulk-insert the results
  public Boolean paresContent(String keywords) throws Exception {
    List<Content> contents = new HtmlParseUtil().parseJD(keywords);

    BulkRequest bulkRequest = new BulkRequest();
    bulkRequest.timeout("2m");

    for (int i = 0; i < contents.size(); i++) {
      bulkRequest.add(
        new IndexRequest("jd_goods")
              .source(JSON.toJSONString(contents.get(i)), XContentType.JSON));
    }

    BulkResponse res = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);

    return !res.hasFailures();
  }

  // Search with highlighting
  public List<Map<String,Object>> searchPage(String keywords, int pageNo, int pageSize) throws Exception {
    if (pageNo <= 1) {
      pageNo = 1;
    }
    // Conditional search
    SearchRequest searchRequest = new SearchRequest("jd_goods");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // Pagination: from is a document offset, not a page number
    sourceBuilder.from((pageNo - 1) * pageSize);
    sourceBuilder.size(pageSize);
    // Exact match: title = keywords
    TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keywords);
    sourceBuilder.query(termQueryBuilder);

    // HighlightBuilder: highlight the title field
    HighlightBuilder highLight = new HighlightBuilder();
    highLight.field("title");
    highLight.requireFieldMatch(false); // do not require the highlighted field to match the queried field
    highLight.preTags("<span style='color:red'>");
    highLight.postTags("</span>");
    sourceBuilder.highlighter(highLight);

    sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));

    // Execute the search
    searchRequest.source(sourceBuilder);

    SearchResponse res = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    ArrayList<Map<String,Object>> list = new ArrayList<>();
    for (SearchHit doc : res.getHits().getHits()) {

      // Parse the highlighted fields
      Map<String,HighlightField> map = doc.getHighlightFields();
      HighlightField title = map.get("title");

      Map<String,Object> result = doc.getSourceAsMap();
      // Replace the original field with the highlighted one; the front end binds it with v-html
      if (title != null) {
        Text[] fragment = title.fragments();
        String newTitle = "";
        for (Text text : fragment) {
          newTitle += text;
        }
        result.put("title", newTitle); // replace
      }

      list.add(result);
    }
    return list;
  }

}
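One detail worth spelling out: from in a search request is a document offset, whereas pageNo in searchPage is a page number. A minimal sketch of the conversion follows, assuming 1-based page numbers; PagingSketch, fromOffset and slice are hypothetical helpers that work on an in-memory list, not part of the Elasticsearch client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** In-memory illustration of from/size paging; not an Elasticsearch API. */
class PagingSketch {

    /** Converts a 1-based page number into the 0-based "from" offset. */
    static int fromOffset(int pageNo, int pageSize) {
        int safePageNo = Math.max(pageNo, 1); // treat page 0 or negative pages as page 1
        return (safePageNo - 1) * pageSize;
    }

    /** Returns at most size items starting at offset from, the way from/size selects hits. */
    static <T> List<T> slice(List<T> hits, int from, int size) {
        if (from >= hits.size()) {
            return Collections.emptyList();
        }
        int end = Math.min(hits.size(), from + size);
        return new ArrayList<>(hits.subList(from, end));
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        System.out.println(slice(hits, fromOffset(1, 5), 5)); // first page
        System.out.println(slice(hits, fromOffset(2, 5), 5)); // second page
    }
}
```

Passing the page number straight through as from would shift each page by a single document instead of a full page.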

@RestController
public class ContentController {

  @Autowired
  private ContentService contentService;

  @GetMapping("/parse/{keyword}")
  public Boolean parse(@PathVariable("keyword") String keyword) throws Exception {
    return contentService.paresContent(keyword);
  }

  @GetMapping("/search/{keyword}/{pageNo}/{pageSize}")
  public List<Map<String,Object>> searchPage(
    @PathVariable("keyword") String keyword,
    @PathVariable("pageNo") int pageNo,
    @PathVariable("pageSize") int pageSize
  ) throws Exception {
    return contentService.searchPage(keyword, pageNo, pageSize);
  }
}

Course: https://space.bilibili.com/95256449/video