ES is an open-source search engine built on Lucene, and the keywords of its query syntax largely match Lucene's:
paging: from/size; fields: fields; sorting: sort; querying: query
filtering: filter; highlighting: highlight; statistics: facet (superseded by aggregations in later versions)
ES has four search types (the notes below are based on elasticsearch 2.3):
query and fetch (fastest; returns N times the requested amount of data; protected, only usable before 5.3)
query then fetch (the default search type)
DFS query and fetch (removed)
DFS query then fetch (allows more precise control over scoring and ranking)
DFS: the D is probably short for Distributed, the F for frequency, and the S perhaps for Scatter; the whole phrase presumably abbreviates the scattering of distributed term frequencies and document frequencies.
Initial scatter: from the official ES site one can see that the "initial scatter" simply means that, before the real query runs, the term frequencies and document frequencies of every shard are collected first; during the term search, each shard then scores and ranks against these global term and document frequencies. Clearly this makes DFS_QUERY_THEN_FETCH the least efficient query mode, since one search may issue three requests to the shards, but the DFS method should also give the highest search accuracy.
To sum up, in terms of performance:
QUERY_AND_FETCH is the fastest and DFS_QUERY_THEN_FETCH the slowest.
In terms of search accuracy:
DFS is more accurate than non-DFS.
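As a sketch, the search type is selected through setSearchType on the Java TransportClient; client and indices here refer to the fields set up in the full code further down, and the query field is only illustrative:

SearchResponse response = client.prepareSearch(indices)
        .setSearchType(SearchType.DFS_QUERY_THEN_FETCH) // trade speed for scoring accuracy
        .setQuery(QueryBuilders.matchQuery("address", "Avenue"))
        .get();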
For each query clause, we can combine QueryBuilders via the must, should and mustNot methods to form multi-condition queries (must => AND, should => OR, mustNot => NOT), as sketched below.
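A minimal sketch, reusing the client and indices fields from the full code further down; the field names are only illustrative:

BoolQueryBuilder bool = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("state", "NM"))         // must    => AND
        .should(QueryBuilders.matchQuery("address", "Avenue")) // should  => OR
        .mustNot(QueryBuilders.termQuery("age", 40));          // mustNot => NOT
SearchResponse response = client.prepareSearch(indices)
        .setQuery(bool)
        .get();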
Lucene supports the term-based TermQuery, RangeQuery, PrefixQuery, BooleanQuery, PhraseQuery, WildcardQuery and FuzzyQuery:
- TermQuery and QueryParser
A single word used as a query expression is treated as a single term; when the expression consists of just one word, QueryParser's parse() method returns a TermQuery.
For example, for the expression content:hello, QueryParser returns a TermQuery whose field is content and whose value is hello:
Query query = new TermQuery(new Term("content", "hello"));
- RangeQuery and QueryParser
QueryParser builds a RangeQuery from a [start TO end] or {start TO end} expression (square brackets are inclusive, curly braces exclusive).
For example, for the expression time:[20181010 TO 20181210], QueryParser returns a RangeQuery on the time field with lower bound 20181010 and upper bound 20181210:
Term t1 = new Term("time", "20181010");
Term t2 = new Term("time", "20181210");
Query query = new RangeQuery(t1, t2, true);
- PrefixQuery and QueryParser
When a term in the query expression ends with an asterisk (*), QueryParser creates a PrefixQuery.
For example, for the expression content:luc*, QueryParser returns a PrefixQuery on the content field with the prefix luc:
Query query = new PrefixQuery(new Term("content", "luc"));
- BooleanQuery and QueryParser
When the expression contains several terms, QueryParser makes it easy to build a BooleanQuery. QueryParser groups with parentheses and uses -, +, AND, OR and NOT to shape the resulting BooleanQuery, as sketched below.
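For instance, an expression such as +content:hello -content:world maps onto a BooleanQuery roughly as follows (a sketch using the same classic Lucene API as the other snippets here):

BooleanQuery query = new BooleanQuery();
query.add(new TermQuery(new Term("content", "hello")), BooleanClause.Occur.MUST);     // +content:hello
query.add(new TermQuery(new Term("content", "world")), BooleanClause.Occur.MUST_NOT); // -content:world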
- PhraseQuery and QueryParser
Terms enclosed in double quotes in a QueryParser expression are turned into a PhraseQuery. The slop factor defaults to 0 and can be set in the expression with ~n.
For example, for the expression content:"hello world"~3, QueryParser returns a phrase query on the content field for "hello world" with a slop of 3:
PhraseQuery query = new PhraseQuery();
query.setSlop(3);
query.add(new Term("content", "hello"));
query.add(new Term("content", "world"));
- WildcardQuery and QueryParser
Lucene uses two standard wildcards: * matches zero or more characters, and ? matches zero or one character. When the query expression contains * or ?, QueryParser returns a WildcardQuery. Note, however, that a * at the end of the expression is optimized into a PrefixQuery, and the first character of the expression must not be a wildcard: this stops users from entering patterns with a leading *, which would force Lucene to enumerate every term at enormous cost.
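As a sketch in the same classic Lucene API, an expression such as content:h?llo* would correspond roughly to:

Query query = new WildcardQuery(new Term("content", "h?llo*"));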
- FuzzyQuery and QueryParser
QueryParser supports FuzzyQuery fuzzy matching by appending "~" to a term.
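As a sketch, an expression such as content:hadop~ would correspond roughly to:

Query query = new FuzzyQuery(new Term("content", "hadop"));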
Code examples:
Simple query, printing all returned content:
@Test
public void testQuery1(){
    SearchResponse response = client
        .prepareSearch(indices) // specify the indices to search
        /**
         * Set the search type:
         *   QUERY_AND_FETCH:      available before 5.3, protected after that
         *   QUERY_THEN_FETCH:     the default
         *   DFS_QUERY_AND_FETCH:  removed outright; gone in newer versions
         *   DFS_QUERY_THEN_FETCH:
         */
        .setSearchType(SearchType.DEFAULT)
        /**
         * Set what to search for.
         * Whether a given search type can find the data you want is what the
         * SEO (search engine optimization) profession gradually grew out of.
         */
//      .setQuery(QueryBuilders.matchPhrasePrefixQuery("firstname", "V*")) // match firstname values starting with V
//      .setQuery(QueryBuilders.matchQuery("state", "NM"))
        .setQuery(QueryBuilders.termQuery("age", 40))
        // paging: to show page N with M hits per page, use setFrom((N - 1) * M).setSize(M)
        .setFrom(1) // offset of the first hit to return
        .setSize(5) // number of hits per page
        .get();
    // the results come back wrapped in a SearchHits object
    SearchHits searchHits = response.getHits();
    long totalHits = searchHits.totalHits;
    System.out.println("Found " + totalHits + " results");
    /**
     * "hits": [
     *   {
     *     "_index": "product",
     *     "_type": "bigdata",
     *     "_id": "5",
     *     "_score": 1,
     *     "_source": {
     *       "name": "redis",
     *       "author": "redis",
     *       "version": "5.0.0"
     *     }
     *   }
     */
    SearchHit[] hits = searchHits.getHits();
    for (SearchHit hit : hits){
        System.out.println("--------------------------------------------");
        String index = hit.getIndex();
        String type = hit.getType();
        String id = hit.getId();
        float score = hit.getScore();
        System.out.println("index: " + index);
        System.out.println("type: " + type);
        System.out.println("id: " + id);
        System.out.println("score: " + score);
        Map<String, Object> source = hit.getSourceAsMap();
        source.forEach((field, value) -> {
            System.out.println(field + "--->" + value);
        });
    }
}
Highlighting part of a queried field:
@Test
public void testHightLight(){
    SearchResponse response = client
        .prepareSearch(indices) // specify the indices to search
        .setSearchType(SearchType.DEFAULT)
        .setQuery(QueryBuilders.matchQuery("address", "Avenue"))
        .highlighter( // configure highlighting
            SearchSourceBuilder.highlight()
                .field("address")
                .preTags("<font color='red' size='16px'>")
                .postTags("</font>")
        )
        .setFrom(0) // offset of the first hit to return
        .setSize(5) // number of hits per page
        .get();
    // the results come back wrapped in a SearchHits object
    SearchHits searchHits = response.getHits();
    long totalHits = searchHits.totalHits;
    System.out.println("Found " + totalHits + " results");
    SearchHit[] hits = searchHits.getHits();
    for (SearchHit hit : hits){ // read back the highlighted content
        System.out.println("-------------------------------------------");
        // highlighted field fragments
        Map<String, HighlightField> highlightFields = hit.getHighlightFields();
        highlightFields.forEach((key, highlightField) -> {
            System.out.println("key: " + key);
            String address = "";
            Text[] fragments = highlightField.fragments();
            for (Text fragment : fragments){
                address += fragment.toString();
            }
            System.out.println("address: " + address);
        });
    }
}
Sorting results by a field:
@Test
public void testSort(){
    SearchResponse response = client
        .prepareSearch(indices) // specify the indices to search
        .setSearchType(SearchType.DEFAULT)
        .setQuery(QueryBuilders.matchQuery("address", "Avenue"))
        .highlighter( // configure highlighting
            SearchSourceBuilder.highlight()
                .field("address")
                .preTags("<font color='red' size='16px'>")
                .postTags("</font>")
        )
        .addSort("age", SortOrder.ASC)
//      .addSort("age", SortOrder.DESC)
        .setFrom(0) // offset of the first hit to return
        .setSize(5) // number of hits per page
        .get();
    // the results come back wrapped in a SearchHits object
    SearchHits searchHits = response.getHits();
    long totalHits = searchHits.totalHits;
    System.out.println("Found " + totalHits + " results");
    SearchHit[] hits = searchHits.getHits();
    for (SearchHit hit : hits){ // read back the highlighted content
        System.out.println("-------------------------------------------");
        Map<String, Object> source = hit.getSourceAsMap();
        Object firstname = source.get("firstname");
        Object age = source.get("age");
        System.out.println("firstname: " + firstname);
        System.out.println("age: " + age);
        // highlighted field fragments
        Map<String, HighlightField> highlightFields = hit.getHighlightFields();
        highlightFields.forEach((key, highlightField) -> {
            System.out.println("key: " + key);
            String address = "";
            Text[] fragments = highlightField.fragments();
            for (Text fragment : fragments){
                address += fragment.toString();
            }
            System.out.println("address: " + address);
        });
    }
}
Aggregation test:
@Test
public void testAggr(){
    SearchResponse response = client
        .prepareSearch(indices) // specify the indices to search
        .setSearchType(SearchType.DEFAULT)
        .setQuery(QueryBuilders.matchQuery("address", "Avenue"))
        .addAggregation(
            AggregationBuilders
                .avg("avg_age") // like: select avg(age) avg_age --> the name is the alias the result is reported under
                .field("age")   // like: select avg(age)        --> the field is the age column, i.e. a field of the index
        )
        .get();
    Aggregations aggrs = response.getAggregations(); // a collection
//  System.out.println(aggrs);
    for (Aggregation aggr : aggrs){
//      System.out.println(aggr.getName());
//      System.out.println(aggr.getType());
        InternalAvg avg = (InternalAvg) aggr;
        double value = avg.getValue();
        System.out.println(avg.getName() + "-->" + value);
    }
}
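The same pattern should extend to the other metric aggregations; roughly (a sketch, with the same client as above):

    .addAggregation(AggregationBuilders.max("max_age").field("age")) // like: select max(age) max_age
    .addAggregation(AggregationBuilders.min("min_age").field("age")) // like: select min(age) min_age

The results would then come back as InternalMax / InternalMin rather than InternalAvg.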
Filtering by a field range:
@Test
public void testFilter(){
    SearchResponse response = client
        .prepareSearch(indices) // specify the indices to search
        .setSearchType(SearchType.DEFAULT)
        .setQuery(QueryBuilders.matchQuery("address", "Avenue"))
        .highlighter( // configure highlighting
            SearchSourceBuilder.highlight()
                .field("address")
                .preTags("<font color='red' size='16px'>")
                .postTags("</font>")
        )
        // keep only hits whose age is between 30 and 35
        .setPostFilter(
            QueryBuilders.rangeQuery("age").gte(30).lte(35)
        )
        .addSort("age", SortOrder.ASC)
        .setFrom(0) // offset of the first hit to return
        .setSize(5) // number of hits per page
        .get();
    // the results come back wrapped in a SearchHits object
    SearchHits searchHits = response.getHits();
    long totalHits = searchHits.totalHits;
    System.out.println("Found " + totalHits + " results");
    SearchHit[] hits = searchHits.getHits();
    for (SearchHit hit : hits){ // read back the highlighted content
        System.out.println("-------------------------------------------");
        Map<String, Object> source = hit.getSourceAsMap();
        Object firstname = source.get("firstname");
        Object age = source.get("age");
        System.out.println("firstname: " + firstname);
        System.out.println("age: " + age);
        // highlighted field fragments
        Map<String, HighlightField> highlightFields = hit.getHighlightFields();
        highlightFields.forEach((key, highlightField) -> {
            System.out.println("key: " + key);
            String address = "";
            Text[] fragments = highlightField.fragments();
            for (Text fragment : fragments){
                address += fragment.toString();
            }
            System.out.println("address: " + address);
        });
    }
}
Full code:
elasticsearch.conf:
cluster.name=rk-ES
cluster.host.port=hadoop01:9300,hadoop02:9300,hadoop03:9300
package rk.constants;

/**
 * @Author rk
 * @Date 2018/12/10 15:14
 * @Description:
 **/
public interface Constants {
    String CLUSTER_NAME = "cluster.name";
    String CLUSTER_HOST_PORT = "cluster.host.port";
}
package rk.elastic;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.aggregations.Aggregation;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.metrics.avg.InternalAvg;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.search.sort.SortOrder;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import rk.constants.Constants;

import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.Properties;

/**
 * @Author rk
 * @Date 2018/12/10 15:06
 * @Description:
 **/
public class ElasticSearchTest2 {
    private TransportClient client;

    @Before
    public void setUp() throws IOException {
        Properties properties = new Properties();
        InputStream in = ElasticSearchTest2.class.getClassLoader().getResourceAsStream("elasticsearch.conf");
        properties.load(in);
        Settings setting = Settings.builder()
                .put(Constants.CLUSTER_NAME, properties.getProperty(Constants.CLUSTER_NAME))
                .build();
        client = new PreBuiltTransportClient(setting);
        String hostAndPorts = properties.getProperty(Constants.CLUSTER_HOST_PORT);
        for (String hostAndPort : hostAndPorts.split(",")){
            String[] fields = hostAndPort.split(":");
            String host = fields[0];
            int port = Integer.valueOf(fields[1]);
            TransportAddress ts = new TransportAddress(new InetSocketAddress(host, port));
            client.addTransportAddresses(ts);
        }
        System.out.println("cluster.name = " + client.settings().get("cluster.name"));
    }

    String[] indices = {"product", "test"};

    // ... the five test methods shown above (testQuery1, testHightLight,
    // testSort, testAggr and testFilter) go here, unchanged ...

    @After
    public void cleanUp(){
        client.close();
    }
}
Code as it looked in the elasticsearch 2.3 era:
package rk.elastic;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.junit.Before;
import org.junit.Test;
import rk.constants.Constants;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.util.*;

/**
 * @Author rk
 * @Date 2018/12/10 15:06
 * @Description:
 *
 * Sample data:
 * <doc>
 *   <url>http://gongyi.sohu.com/20120730/n349358066.shtml</url>
 *   <docno>fdaa73d52fd2f0ea-34913306c0bb3300</docno>
 *   <contenttitle>失独父母中年遇独子夭折 称不怕死亡怕养老生病</contenttitle>
 *   <content></content>
 * </doc>
 **/
class Article {
    private String url;
    private String docno;
    private String content;
    private String contenttitle;

    public Article() {
    }

    public Article(String url, String docno, String content, String contenttitle) {
        this.url = url;
        this.docno = docno;
        this.content = content;
        this.contenttitle = contenttitle;
    }

    public String getUrl() { return url; }
    public void setUrl(String url) { this.url = url; }
    public String getDocno() { return docno; }
    public void setDocno(String docno) { this.docno = docno; }
    public String getContent() { return content; }
    public void setContent(String content) { this.content = content; }
    public String getContenttitle() { return contenttitle; }
    public void setContenttitle(String contenttitle) { this.contenttitle = contenttitle; }

    @Override
    public String toString() {
        return "Article{" +
                "url='" + url + '\'' +
                ", docno='" + docno + '\'' +
                ", content='" + content + '\'' +
                ", contenttitle='" + contenttitle + '\'' +
                '}';
    }
}

// parser: take the first 20 records from the dump
class XmlParser {
    public static List<Article> getArticle() {
        List<Article> list = new ArrayList<Article>();
        SAXReader reader = new SAXReader();
        Document document;
        try {
            document = reader.read(new File("news_sohusite_xml"));
            Element root = document.getRootElement();
            Iterator<Element> iterator = root.elementIterator("doc");
            Article article = null;
            int count = 0;
            while (iterator.hasNext()) {
                Element doc = iterator.next();
                String url = doc.elementTextTrim("url");
                String docno = doc.elementTextTrim("docno");
                String content = doc.elementTextTrim("content");
                String contenttitle = doc.elementTextTrim("contenttitle");
                article = new Article();
                article.setContent(content);
                article.setDocno(docno);
                article.setContenttitle(contenttitle);
                article.setUrl(url);
                if (++count > 20) {
                    break;
                }
                list.add(article);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return list;
    }
}

public class ElasticSearchTest2_3 {
    private TransportClient client;

    @Before
    public void setUp() throws IOException {
        Properties properties = new Properties();
        InputStream in = ElasticSearchTest2_3.class.getClassLoader().getResourceAsStream("elasticsearch.conf");
        properties.load(in);
        Settings setting = Settings.builder()
                .put(Constants.CLUSTER_NAME, properties.getProperty(Constants.CLUSTER_NAME))
                .build();
        client = new PreBuiltTransportClient(setting);
        String hostAndPorts = properties.getProperty(Constants.CLUSTER_HOST_PORT);
        System.out.println(hostAndPorts);
        for (String hostAndPort : hostAndPorts.split(",")){
            String[] fields = hostAndPort.split(":");
            String host = fields[0];
            int port = Integer.valueOf(fields[1]);
            TransportAddress ts = new TransportAddress(new InetSocketAddress(host, port));
            client.addTransportAddresses(ts);
        }
        System.out.println("cluster.name = " + client.settings().get("cluster.name"));
    }

    String index = "search";

    // bulk-load the parsed articles into ES
    @Test
    public void bulkInsert() throws Exception {
        List<Article> list = XmlParser.getArticle();
        ObjectMapper oMapper = new ObjectMapper();
        BulkRequestBuilder bulkRequestBuilder = client.prepareBulk();
        for (int i = 0; i < list.size(); i++) {
            Article article = list.get(i);
            String val = oMapper.writeValueAsString(article);
            bulkRequestBuilder.add(new IndexRequest(index, "news", article.getDocno()).source(val));
        }
        BulkResponse response = bulkRequestBuilder.get();
    }

    // query
    @Test
    public void testSearch() {
        String indices = "bigdata"; // the index to search
        SearchRequestBuilder builder = client.prepareSearch(indices)
                .setSearchType(SearchType.DEFAULT)
                .setFrom(0)
                .setSize(5) // paging
                /**
                 * The newer equivalent:
                 *   .highlighter(
                 *       SearchSourceBuilder.highlight()
                 *           .field("address")
                 *           .preTags("<font color='red' size='16px'>")
                 *           .postTags("</font>")
                 *   )
                 */
                .addHighlightedField("name") // field to highlight
                .setHighlighterPreTags("<font color='blue'>")
                .setHighlighterPostTags("</font>"); // highlight style
        builder.setQuery(QueryBuilders.fuzzyQuery("name", "hadoop"));
        SearchResponse searchResponse = builder.get();
        SearchHits searchHits = searchResponse.getHits();
        SearchHit[] hits = searchHits.getHits();
        long total = searchHits.getTotalHits();
        System.out.println("Total hits: " + total);
        for (SearchHit searchHit : hits) {
            Map<String, Object> source = searchHit.getSource(); // newer API: searchHit.getSourceAsMap()
            Map<String, HighlightField> highlightFields = searchHit.getHighlightFields();
            System.out.println("---------------------------");
            String name = source.get("name").toString();
            String author = source.get("author").toString();
            System.out.println("name=" + name);
            System.out.println("author=" + author);
            HighlightField highlightField = highlightFields.get("name");
            if (highlightField != null) {
                Text[] fragments = highlightField.fragments();
                name = "";
                for (Text text : fragments) {
                    name += text.toString();
                }
            }
            System.out.println("name: " + name);
            System.out.println("author: " + author);
        }
    }
}
Much like SQL uses LIMIT to control the size of a "page", Elasticsearch uses the two parameters from and size:
from: which result to start from; defaults to 0
size: how many results to return per request; defaults to 10
With 5 results per page, the requests for pages 1 through 3 are:
GET /_search?size=5
GET /_search?size=5&from=5
GET /_search?size=5&from=10
Note: do not request too many results at once or pages too deep in the result set, as this puts heavy pressure on the servers. Results are sorted before they are returned, and a request passes through multiple shards: each shard produces its own sorted results, which are then merged centrally to guarantee the final ordering is correct.
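As a sketch with the Java TransportClient used earlier (fetchPage is a hypothetical helper; client and indices are the fields from the full code above):

SearchResponse fetchPage(int page, int pageSize) { // page is 1-based
    return client.prepareSearch(indices)
            .setQuery(QueryBuilders.matchAllQuery())
            .setFrom((page - 1) * pageSize) // offset of the first hit
            .setSize(pageSize)              // hits per page
            .get();
}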
Copyright notice: this is an original article by the CSDN blogger 「R_记忆犹新」, licensed under CC 4.0 BY-SA; reproduction must include the original source link and this notice.
Original: https://blog.csdn.net/qq_28844767/article/details/84946433