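Elasticsearch
I. Introduction
Lucene is an open-source library for full-text indexing and search, supported and distributed by the Apache Software Foundation. It offers a simple yet powerful API for building full-text indexes and searching them.
Solr is a high-performance full-text search server written in Java and built on Lucene. It extends Lucene with a richer query language, is configurable and extensible, optimizes query performance, and ships with a full-featured administration UI.
Elasticsearch (ES for short), created by Shay Banon, is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine exposed through a RESTful web interface.
Elasticsearch is developed alongside Logstash, a data collection and log-parsing engine, and Kibana, an analytics and visualization platform. The three products are designed to work as one integrated solution known as the ELK Stack.
According to the DB-Engines ranking, Elasticsearch is the most popular enterprise search engine, followed by Apache Solr.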
II. Installation
Mirror download addresses:
Elasticsearch: https://mirrors.huaweicloud.com/elasticsearch/?C=N&O=D
Logstash: https://mirrors.huaweicloud.com/logstash/?C=N&O=D
Visualization tools:
Kibana: https://mirrors.huaweicloud.com/kibana/?C=N&O=D
elasticsearch-head: https://github.com/mobz/elasticsearch-head
Elasticsearch requires JDK 1.8 or later.
Install the Windows version by extracting the archive:
Configure cross-origin (CORS) access:
Add the following to elasticsearch.yml:
# Allow cross-origin requests
http.cors.enabled: true
http.cors.allow-origin: "*"
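Edit the jvm.options file:
Start Elasticsearch:
Open it in a browser (Elasticsearch listens on http://localhost:9200 by default):
Install the elasticsearch-head visualization UI:
Go to the elasticsearch-head-master directory:
Install the dependencies first: npm install
Run it: npm run start
Open it in a browser (elasticsearch-head serves on http://localhost:9100 by default):
Install the Kibana visualization UI (its version must match the Elasticsearch version):
Kibana's UI is in English by default. To switch it to Chinese:
Add the following to kibana.yml:
# Set the UI locale to Chinese
i18n.locale: "zh-CN"
Go to the kibana-xxx-windows-x86_64\bin directory:
Open it in a browser (Kibana listens on http://localhost:5601 by default):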
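III. IK Analyzer
The IK analyzer version must match the Elasticsearch version.
Download: https://github.com/medcl/elasticsearch-analysis-ik/releases?after=v5.6.3
Start Elasticsearch and check whether the IK analyzer is loaded:
Use elasticsearch-plugin to check whether the IK analyzer is loaded:
Test the IK analyzer in Kibana:
ik_smart performs the coarsest-grained split; the resulting tokens do not overlap:
Result:
ik_max_word performs the finest-grained split and produces as many tokens as possible:
Result:
Tokenizing with a custom dictionary:
Default tokenization:
Custom dictionary:
In the elasticsearch-xxx\plugins\ik\config directory:
Create a dictionary:
Result after restarting Elasticsearch: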
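IV. Basic Concepts and Operations
1. Index
An index is similar to a database in a relational system.
Elasticsearch uses an inverted index to speed up searches.
For example, the table below maps primary keys to content: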
id | content |
---|---|
0001 | My name is Jack |
0002 | My name is Marry |
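With a conventional lookup, finding the ids whose content contains "Jack" means scanning every row, which is slow.
Elasticsearch instead builds a mapping from each keyword to the ids that contain it (an inverted index) and uses that mapping for full-text search:
keyword | id |
---|---|
My | 0001, 0002 |
name | 0001, 0002 |
is | 0001, 0002 |
Jack | 0001 |
Marry | 0002 |
To find the ids whose content contains "Jack", a single keyword lookup returns 0001 directly.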
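The keyword-to-id table above can be reproduced with a few lines of plain Java. This is only a minimal sketch of the idea, not how Elasticsearch or Lucene implement it: the class name `InvertedIndexSketch` and the `build` method are hypothetical, and tokens are produced by a simple whitespace split with no lowercasing or other analysis.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class InvertedIndexSketch {

    /** Builds a map from each keyword to the sorted set of ids whose content contains it. */
    public static Map<String, SortedSet<String>> build(Map<String, String> docs) {
        Map<String, SortedSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            // Assumption: whitespace splitting stands in for a real analyzer.
            for (String token : doc.getValue().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(doc.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("0001", "My name is Jack");
        docs.put("0002", "My name is Marry");

        Map<String, SortedSet<String>> index = build(docs);
        System.out.println(index.get("Jack")); // [0001]
        System.out.println(index.get("name")); // [0001, 0002]
    }
}
```

Looking up a keyword is a single map access, while the row-by-row approach has to read every document's content.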
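Create an index:
View an index:
List all indices:
Delete an index:
2. Document
A document is similar to a row in a relational database. It consists of keys and values (key-value pairs).
Create a document:
Creating a document with a PUT request requires an explicit id.
Creating a document with a POST request accepts an explicit id or none; without one, Elasticsearch generates a UUID-based id automatically.
With an id:
Without an id:
Query documents:
Get the document with a specific id:
Get all documents:
Update a document:
Send a PUT request for a full replacement. The request body replaces the whole document, so any field that is not sent is dropped:
When the name field is not sent, it is no longer present afterwards:
Send a POST request for a partial update:
Delete a document: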
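3. Field
A field is similar to a column in a relational database.
4. Complex Queries
Pass query parameters after a question mark in the URL:
Query by condition through the request body:
Query all documents through the request body:
Paginate through the request body:
from is the offset of the first hit to return, and size is the page capacity.
Formula: from = (page number - 1) * page size
The following returns 1 hit starting from offset 0: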
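The pagination formula can be checked with a small helper. This is a sketch under the assumption that page numbers are 1-based, as the formula implies; the class `PaginationSketch` and the method `fromOffset` are hypothetical names used only for illustration.

```java
public class PaginationSketch {

    /** Converts a 1-based page number and a page size into the zero-based from offset. */
    public static int fromOffset(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 1));  // 0
        System.out.println(fromOffset(3, 10)); // 20
    }
}
```

Page 1 with a size of 1 yields from = 0, which is the request described above. Page 3 with a size of 10 skips the first 20 hits.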
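Use the _source property to return only the specified fields:
The following returns only the name field:
Use the sort property to sort the results:
The following sorts by the age field in ascending order:
Use the bool property to combine several conditions:
must combines the conditions with AND (all of them have to match)
should combines the conditions with OR (at least one of them has to match)
Use the filter property for a range query:
The following runs a range query on the age field and returns documents where age >= 20 and age <= 40:
Full-text search:
match analyzes the query text into terms and looks each term up in the inverted index of the name field; by default, a document matches if it contains any of the terms
match_phrase also analyzes the query text, but treats the result as one phrase: a document matches only if the terms appear in the name field next to each other and in the same order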
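The difference between the two query types can be imitated in plain Java. This is a toy model rather than Elasticsearch code: `MatchSketch`, `analyze`, `match`, and `matchPhrase` are hypothetical names, and the "analyzer" only lowercases and splits on whitespace. It assumes the default behavior of match (any term may match) and of match_phrase (terms adjacent and in order).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class MatchSketch {

    /** Hypothetical stand-in for an analyzer: lowercases and splits on whitespace. */
    public static List<String> analyze(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    /** match-like: true if the field contains at least one of the query terms. */
    public static boolean match(String field, String query) {
        Set<String> fieldTerms = new HashSet<>(analyze(field));
        for (String term : analyze(query)) {
            if (fieldTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    /** match_phrase-like: true if the query terms occur consecutively and in order. */
    public static boolean matchPhrase(String field, String query) {
        List<String> queryTerms = analyze(query);
        if (queryTerms.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(analyze(field), queryTerms) >= 0;
    }

    public static void main(String[] args) {
        String name = "My name is Jack";
        System.out.println(match(name, "Jack Tom"));       // true
        System.out.println(matchPhrase(name, "name is"));  // true
        System.out.println(matchPhrase(name, "is name"));  // false
        System.out.println(matchPhrase(name, "Jack Tom")); // false
    }
}
```

In this model "Jack Tom" satisfies match because one term is present, but it fails matchPhrase because the two terms never appear side by side in the field.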
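Use the highlight property to highlight matches (wrap the matched keywords in styling tags):
Aggregation queries:
Compute statistics over the age field:
Maximum age:
Minimum age:
Average age:
V. Spring Boot Integration
Add the dependency: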
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
Check the Elasticsearch version that Spring Boot manages by default:
<!-- If it differs from the server version, override elasticsearch.version -->
<properties>
<!-- Override the Elasticsearch version -->
<elasticsearch.version>7.9.2</elasticsearch.version>
</properties>
Configure the Elasticsearch RestHighLevelClient (the high-level REST client):
@Configuration
public class ElasticSearchConfig {
@Bean
public RestHighLevelClient restHighLevelClient() {
return new RestHighLevelClient(
RestClient.builder(new HttpHost("localhost", 9200, "http"))
);
}
}
API reference: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.9/_search_apis.html
Tests:
Index operations:
@Autowired
private RestHighLevelClient client;
/**
* Test creating an index
*
* @throws IOException
*/
@Test
public void testCreateIndex() throws IOException {
CreateIndexRequest request = new CreateIndexRequest("xyz");
CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
System.out.println(response);
//org.elasticsearch.client.indices.CreateIndexResponse@156038
}
/**
* Test retrieving index information
*
* @throws IOException
*/
@Test
public void testGetIndex() throws IOException {
GetIndexRequest request = new GetIndexRequest("xyz");
GetIndexResponse response = client.indices().get(request, RequestOptions.DEFAULT);
System.out.println(response.getAliases()); //{xyz=[]}
System.out.println(response.getMappings()); //{xyz=org.elasticsearch.cluster.metadata.MappingMetadata@a6b251b7}
System.out.println(response.getSettings());
//{xyz{"index.creation_date":"1653876942129","index.number_of_replicas":"1","index.number_of_shards":"1","index.provided_name":"xyz","index.uuid":"N9dgDmclRtOJeUeuNVFVmA","index.version.created":"7090299"}}
}
/**
* Test deleting an index
*
* @throws IOException
*/
@Test
public void testDeleteIndex() throws IOException {
DeleteIndexRequest request = new DeleteIndexRequest("xyz");
AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);
System.out.println(response.isAcknowledged()); //true
}
Document operations:
@Autowired
private RestHighLevelClient client;
/**
* Test adding a document
*/
@Test
public void testAddDocument() throws IOException {
User user = new User("张三", 18);
IndexRequest request = new IndexRequest("xyz");
//Set the id; if it is not set, Elasticsearch generates one automatically
request.id("1");
request.timeout("1s");
//Convert the user object to JSON
request.source(JSON.toJSONString(user), XContentType.JSON);
IndexResponse response = client.index(request, RequestOptions.DEFAULT);
System.out.println(response); //IndexResponse[index=xyz,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]
System.out.println(response.status()); //CREATED
}
/**
* Test whether a document exists
*
* @throws IOException
*/
@Test
public void testDocumentExist() throws IOException {
GetRequest request = new GetRequest("xyz", "1");
boolean exists = client.exists(request, RequestOptions.DEFAULT);
System.out.println(exists); //true
}
/**
* Test retrieving a document
*
* @throws IOException
*/
@Test
public void testGetDocument() throws IOException {
GetRequest request = new GetRequest("xyz", "1");
GetResponse response = client.get(request, RequestOptions.DEFAULT);
System.out.println(response.getSourceAsString()); //{"age":18,"name":"张三"}
System.out.println(response);
//{"_index":"xyz","_type":"_doc","_id":"1","_version":1,"_seq_no":0,"_primary_term":1,"found":true,"_source":{"age":18,"name":"张三"}}
}
/**
* Test updating a document
*
* @throws IOException
*/
@Test
public void testUpdateDocument() throws IOException {
UpdateRequest request = new UpdateRequest("xyz", "1");
request.timeout("1s");
//Option 1: pass a serialized object
User user = new User();
user.setName("张三");
user.setAge(18);
request.doc(JSON.toJSONString(user), XContentType.JSON);
//Option 2: pass field/value pairs
//request.doc(XContentType.JSON, "age", 30);
UpdateResponse response = client.update(request, RequestOptions.DEFAULT);
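System.out.println(response); //UpdateResponse[index=xyz,type=_doc,id=1,version=1,seqNo=0,primaryTerm=1,result=noop,shards=ShardInfo{total=0, successful=0, failures=[]}]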
System.out.println(response.status());
}
/**
* Test deleting a document
*
* @throws IOException
*/
@Test
public void testDeleteDocument() throws IOException {
DeleteRequest request = new DeleteRequest("xyz", "1");
request.timeout("1s");
DeleteResponse response = client.delete(request, RequestOptions.DEFAULT);
System.out.println(response);
System.out.println(response.status()); //OK
}
Bulk operations:
@Autowired
private RestHighLevelClient client;
/**
* Test adding documents in bulk
*
* @throws IOException
*/
@Test
public void testBatchDocument() throws IOException {
BulkRequest request = new BulkRequest("xyz");
request.timeout("10s");
ArrayList<User> users = new ArrayList<>();
users.add(new User("tom", 18));
users.add(new User("kate", 10));
users.add(new User("bob", 20));
users.add(new User("alice", 30));
users.add(new User("marry", 40));
for (int i = 0; i < users.size(); i++) {
request.add(new IndexRequest("xyz")
.id(String.valueOf(i + 1))
.source(JSON.toJSONString(users.get(i)),XContentType.JSON));
}
BulkResponse response = client.bulk(request, RequestOptions.DEFAULT);
System.out.println(response); //org.elasticsearch.action.bulk.BulkResponse@47a7c93e
System.out.println(response.status()); //OK
}
Complex queries:
@Autowired
private RestHighLevelClient client;
@Test
public void testSearch() throws IOException {
SearchRequest request = new SearchRequest("xyz");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
//matchAllQuery: match all documents
MatchAllQueryBuilder queryBuilder = QueryBuilders.matchAllQuery();
//termQuery: exact (term-level) match
//TermQueryBuilder queryBuilder = QueryBuilders.termQuery("age", 18);
//boolQuery: combine several conditions
//must: AND, every condition has to match
//BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
//queryBuilder.must(QueryBuilders.matchQuery("name", "张三"));
//queryBuilder.must(QueryBuilders.matchQuery("age", 18));
//should: OR, at least one condition has to match
//queryBuilder.should(QueryBuilders.matchQuery("name", "张三"));
//queryBuilder.should(QueryBuilders.matchQuery("age", 18));
//rangeQuery: range query
//Find documents whose age is >= 20 and <= 40
//RangeQueryBuilder queryBuilder = QueryBuilders.rangeQuery("age")
// .gte(20)
// .lte(40);
//fuzzyQuery: fuzzy query
//FuzzyQueryBuilder queryBuilder = QueryBuilders.fuzzyQuery("name", "to");
//Fuzziness.ZERO: the term has to match exactly
//Fuzziness.ONE: allow an edit distance of at most 1
//Fuzziness.TWO: allow an edit distance of at most 2
//Fuzziness.AUTO: choose the allowed edit distance (0, 1, or 2) from the length of the term
//queryBuilder.fuzziness(Fuzziness.ONE);
sourceBuilder.query(queryBuilder);
//Aggregations
//Group by age
sourceBuilder.aggregation(AggregationBuilders.terms("ageGroup").field("age"));
//Maximum age
sourceBuilder.aggregation(AggregationBuilders.max("maxAge").field("age"));
//Minimum age
sourceBuilder.aggregation(AggregationBuilders.min("minAge").field("age"));
//Average age
sourceBuilder.aggregation(AggregationBuilders.avg("averageAge").field("age"));
//Pagination: start at offset 0 and return 2 hits
sourceBuilder.from(0);
sourceBuilder.size(2);
//Sort by age in descending order
sourceBuilder.sort("age", SortOrder.DESC);
//Choose which fields to return
//Fields to include
String[] include = {"name", "age"};
//Fields to exclude
String[] exclude = {};
sourceBuilder.fetchSource(include, exclude);
//Highlighting: wrap matched keywords in styling tags
HighlightBuilder highlightBuilder = new HighlightBuilder()
.field("name")
.preTags("<font color='red'>")
.postTags("</font>");
sourceBuilder.highlighter(highlightBuilder);
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
System.out.println("Aggregations: " + response.getAggregations());
//Aggregations: {ageGroup=org.elasticsearch.search.aggregations.bucket.terms.ParsedLongTerms@5fed9976, maxAge=org.elasticsearch.search.aggregations.metrics.ParsedMax@3fdcde7a, averageAge=org.elasticsearch.search.aggregations.metrics.ParsedAvg@4f363abd, minAge=org.elasticsearch.search.aggregations.metrics.ParsedMin@7302ff13}
System.out.println("Total hits: " + response.getHits().getTotalHits());
for (SearchHit hit : response.getHits()) {
System.out.println(hit);
}
}
VI. A Simple JD.com Search
Decoupled frontend and backend: Vue + Spring Boot + Elasticsearch + Jsoup
Frontend:
//Search.vue
<template>
<div id="search">
<div id="o-header-2013">
<div id="header-2013" style="display: none"></div>
</div>
<div class="w">
<div id="logo-2014">
<a href="https://www.jd.com/" class="logo">京东</a>
</div>
<div id="search-2014">
<div class="form">
<input v-model="keyword" type="text" class="text blurcolor" /><span
class="photo-search-btn"
>
<form
id="search-img-upload"
clstag="h|keycount|2016|03d"
method="post"
action="//search.jd.com/image?op=upload"
enctype="multipart/form-data"
target="search_upload"
></form>
</span>
<button @click="searchGood" class="button cw-icon"><i></i>Search</button>
</div>
</div>
<div id="settleup-2014" class="dorpdown">
<div class="dorpdown-layer">
<div class="spacer"></div>
<div id="settleup-content"><span class="loading"></span></div>
</div>
</div>
<span class="clr"></span>
</div>
<div id="J_searchWrap" class="w">
<div id="J_crumbsBar" class="crumbs-bar">
<div class="crumbs-nav">
<div class="crumbs-nav-main clearfix">
<div class="crumbs-nav-item">
<div class="crumbs-first">
<span>All results</span>
</div>
</div>
<i class="crumbs-arrow">&gt;</i>
<div class="crumbs-nav-item">
<strong v-if="isShow" class="search-key">{{keyword}}</strong>
</div>
</div>
</div>
</div>
<div id="J_container" class="container">
<div
id="J_main"
class="g-main2"
source-data-lazy-advertisement-install="1"
data-lazy-img-install="1"
>
<div class="m-list">
<div class="ml-wrap">
<div
id="J_goodsList"
class="goods-list-v2 gl-type-4 J-goods-list"
>
<ul class="gl-warp clearfix" data-tpl="2">
<li
v-for="(good, index) in goodList"
:key="index"
data-sku="10043840179876"
data-spu=""
ware-type="0"
class="gl-item"
>
<div class="gl-i-wrap">
<div class="p-img">
<a
target="_blank"
title=""
:href="good.image"
>
<img :src="good.image">
</a>
</div>
<div class="p-price">
<strong
class="J_10043840179876"
data-presale="0"
data-done="1"
stock-done="1"
>
<em>¥</em><i data-price="10043840179876">{{good.price}}</i>
</strong>
</div>
<div class="p-name">
<a
target="_blank"
title=""
href="https://item.jd.com/10043840179876.html"
>
<div v-html="good.name"></div>
<i class="promo-words" id="J_AD_10043840179876"></i>
</a>
</div>
<div
class="p-shopnum"
data-dongdong=""
data-selfware="0"
data-score="0"
data-reputation="40"
data-done="1"
>
<a
class="curr-shop hd-shopname"
target="_blank"
:title="good.shopName"
>
{{good.shopName}}
</a>
</div>
<div
class="p-icons"
id="J_pro_10043840179876"
data-done="1"
></div>
<div class="p-operate">
<a
class="p-o-btn focus J_focus"
data-sku="10043840179876"
href="javascript:;"
><i></i>Follow</a
>
<a
class="p-o-btn addcart"
data-stocknew="10043840179876"
data-limit="0"
><i></i>Add to cart</a
>
</div>
<div
class="p-stock hide"
data-stocknew="10043840179876"
data-province="湖北"
></div>
<span class="p-promo-flag">Ad</span>
<img
source-data-lazy-advertisement="done"
style="display: none"
src="assets/blank.gif"
class="err-poster"
/>
</div>
</li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</template>
<script>
import axios from 'axios'
export default {
name: "Search",
data() {
return {
keyword: "",
goodList: [],
isShow: false
}
},
methods: {
searchGood(){
if(!this.keyword || !this.keyword.trim())
{
alert("Please enter a keyword to search!")
}
else
{
//Send the request to the backend
axios.get(`http://localhost:8081/goods/${encodeURIComponent(this.keyword)}/0/10`).then(
response=>{
this.goodList = response.data
this.isShow = true
}
)
}
}
},
};
</script>
<style>
</style>
Backend:
Add the dependencies:
<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.15.1</version>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.80</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<scope>runtime</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-configuration-processor</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
Product entity class:
@Data
@NoArgsConstructor
@AllArgsConstructor
public class Good {
private String name;
private Double price;
private String image;
private String shopName;
}
Jsoup utility class:
public class JsoupUtil {
private final String DEFAULT_URL = "https://search.jd.com/Search?keyword=";
private final String DEFAULT_KEYWORD = "java";
private String url = DEFAULT_URL;
private String keyword = DEFAULT_KEYWORD;
public JsoupUtil(String url, String keyword) {
if(StringUtils.hasLength(url)) {
this.url = url;
}
if(StringUtils.hasLength(keyword)) {
this.keyword = keyword;
}
}
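//Parse the JD.com search result HTML and return the list of products
public List<Good> getContent() throws IOException {
//URL-encode the keyword so that non-ASCII search terms produce a valid URL
Document document = Jsoup.parse(new URL(url + java.net.URLEncoder.encode(keyword, "UTF-8")), 3000);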
Element content = document.getElementsByClass("gl-warp").get(0);
Elements goods = content.getElementsByClass("gl-item");
ArrayList<Good> list = new ArrayList<>();
for (Element good : goods) {
String name = good.getElementsByClass("p-name").get(0).getElementsByTag("em").get(0).text();
String price = good.getElementsByClass("p-price").get(0).getElementsByTag("i").get(0).html();
String image = good.getElementsByClass("p-img").get(0).getElementsByTag("img").get(0).attr("data-lazy-img");
String shopName = good.getElementsByClass("curr-shop").get(0).text();
Good g = new Good();
g.setName(name);
g.setPrice(Double.parseDouble(price));
g.setImage(image);
g.setShopName(shopName);
list.add(g);
}
return list;
}
}
Service:
@Service
public class ContentService {
@Autowired
private RestHighLevelClient client;
public boolean addContent(String keyword) throws IOException {
JsoupUtil jsoupUtil = new JsoupUtil(null, keyword);
List<Good> content = jsoupUtil.getContent();
BulkRequest request = new BulkRequest("qingsong");
request.timeout("10s");
for (int i = 0; i < content.size(); i++) {
request.add(new IndexRequest("qingsong")
.source(JSON.toJSONString(content.get(i)), XContentType.JSON));
}
BulkResponse response = client.bulk(request, RequestOptions.DEFAULT);
return !response.hasFailures();
}
public List<Map<String, Object>> getContent(String keyword, int pageNum, int pageSize) throws IOException {
if(!StringUtils.hasText(keyword))
{
keyword = "java";
}
if(pageSize <= 0)
{
pageSize = 10;
}
SearchRequest request = new SearchRequest("qingsong");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
MatchQueryBuilder query = QueryBuilders.matchQuery("name", keyword);
sourceBuilder.query(query);
//Set the timeout
sourceBuilder.timeout(TimeValue.timeValueSeconds(3));
//Highlight the matched text
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.field("name");
highlightBuilder.preTags("<font style='color:#f84a4a;font-size:16px'>");
highlightBuilder.postTags("</font>");
sourceBuilder.highlighter(highlightBuilder);
//Pagination (pageNum is zero-based here)
sourceBuilder.from(pageNum * pageSize);
sourceBuilder.size(pageSize);
request.source(sourceBuilder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
ArrayList<Map<String, Object>> list = new ArrayList<>();
for (SearchHit hit : response.getHits().getHits()) {
Map<String, HighlightField> highlightFields = hit.getHighlightFields();
HighlightField name = highlightFields.get("name");
Map<String, Object> source = hit.getSourceAsMap(); //the original document source
if(name != null)
{
Text[] fragments = name.fragments();
StringBuilder highLightName = new StringBuilder();
for (Text text : fragments) {
highLightName.append(text);
}
//Replace the original name field with the highlighted name
source.put("name", highLightName.toString());
}
list.add(source);
}
return list;
}
}
Controller:
@CrossOrigin
@RestController
public class ContentController {
@Autowired
private ContentService contentService;
@RequestMapping({"/goods", "/goods/{keyword}"})
public boolean addContent(@PathVariable(value = "keyword", required = false) @Nullable String keyword) throws IOException {
return contentService.addContent(keyword);
}
@RequestMapping("/goods/{keyword}/{pageNum}/{pageSize}")
public List<Map<String, Object>> getContent(@PathVariable("keyword") @Nullable String keyword, @PathVariable("pageNum") int pageNum, @PathVariable("pageSize") int pageSize) throws IOException {
return contentService.getContent(keyword, pageNum, pageSize);
}
}
Screenshot of the running application: