Mapping between Elasticsearch and MySQL concepts
MySQL | Elasticsearch
database | index
table | type
row | document
column | field
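To make the analogy concrete, here is how the sample document used in the Java examples later in these notes maps onto the MySQL terms:

```
index:    index1                                ≈ database
type:     type1                                 ≈ table
document: id "1", {"name":"xiaoli","age":17}    ≈ row
fields:   name, age                             ≈ columns
```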
Installing Elasticsearch
1. Download Elasticsearch
https://www.elastic.co/downloads/past-releases/elasticsearch-1-4-4
(or version 2.2.0: https://www.elastic.co/downloads/past-releases/elasticsearch-2-2-0)
and unpack the archive.
2. Start it
Foreground: ES_HOME/bin/elasticsearch
Background: ES_HOME/bin/elasticsearch -d
Verify that startup succeeded:
http://localhost:9200
An ES cluster is identified by its cluster name: to form a cluster from several machines, give every node the same cluster name and a unique node name.
Edit ES_HOME/config/elasticsearch.yml:
cluster.name: my-application   (start at column 0 with no leading spaces; all settings follow this rule)
node.name: node-2
network.host: 10.xx.xx.xx      (the node's IP address)
If the nodes still cannot discover each other, configure unicast discovery:
discovery.zen.ping.unicast.hosts: ["10.xxx.xxx.xxx", "10.xxx.xxx.xxx"]
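Pulling the settings above together, a two-node cluster could be configured as sketched below (the IPs and node names are placeholders; only node.name and network.host differ between the machines, while cluster.name is identical):

```yaml
# Node 1: ES_HOME/config/elasticsearch.yml
cluster.name: my-application
node.name: node-1
network.host: 10.0.0.1
discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.2"]

# Node 2: the same file on the second machine, changing only:
#   node.name: node-2
#   network.host: 10.0.0.2
```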
Plugin installation
1. servicewrapper
The service directory contains an elasticsearch control script:
sh elasticsearch console   run Elasticsearch in the foreground
sh elasticsearch start     run Elasticsearch in the background
sh elasticsearch stop      stop a running Elasticsearch
sh elasticsearch install   register Elasticsearch as a system service (init.d/service)
sh elasticsearch remove    remove Elasticsearch from the system services (init.d/service)
2. bigdesk
http://localhost:9200/_plugin/bigdesk
3. head
http://localhost:9200/_plugin/head
Basic operations
GET
Update
Delete
Java programming
<!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>1.4.4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/junit/junit -->
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.8.2</version>
</dependency>
Obtaining a TransportClient
@Test
public void test1() {
    // ES 1.4.4 API: construct the client directly
    TransportClient transportClient = new TransportClient();
    // Add one entry per node in the cluster
    transportClient.addTransportAddress(
            new InetSocketTransportAddress("192.168.133.101", 9300));
    ImmutableList<DiscoveryNode> nodes = transportClient.connectedNodes();
    for (DiscoveryNode node : nodes) {
        System.out.println(node.getHostName());
    }
}

/**
 * Discover ES nodes through the cluster name.
 */
@Test
public void test2() {
    // Set some configuration properties
    Settings settings = ImmutableSettings.settingsBuilder()
            .put("cluster.name", "elasticsearch").build();
    // Obtain a Transport client object
    TransportClient transportClient = new TransportClient(settings);
    transportClient.addTransportAddress(
            new InetSocketTransportAddress("192.168.133.101", 9300));
    ImmutableList<DiscoveryNode> nodes = transportClient.connectedNodes();
    for (DiscoveryNode node : nodes) {
        System.out.println(node.getHostName());
    }
}

/**
 * Discover ES nodes by turning on automatic sniffing.
 */
@Test
public void test3() {
    // Set some configuration properties
    Settings settings = ImmutableSettings.settingsBuilder()
            .put("client.transport.sniff", true).build();
    // Obtain a Transport client object
    TransportClient transportClient = new TransportClient(settings);
    transportClient.addTransportAddress(
            new InetSocketTransportAddress("192.168.133.101", 9300));
    ImmutableList<DiscoveryNode> nodes = transportClient.connectedNodes();
    for (DiscoveryNode node : nodes) {
        System.out.println(node.getHostName());
    }
}
// ES 2.x-style client initialization
TransportClient client;

@Before
public void before() throws Exception {
    Settings settings = Settings.settingsBuilder()
            .put("client.transport.sniff", true).build();
    client = TransportClient.builder().settings(settings).build()
            .addTransportAddress(new InetSocketTransportAddress(
                    InetAddress.getByName("10.202.8.122"), 9300));
}

Inserting data

/**
 * Insert a document from a JSON string.
 */
@Test
public void test4() {
    String json = "{\"name\":\"xiaoli\",\"age\":17}";
    IndexResponse response = client.prepareIndex("index1", "type1", "1")
            .setSource(json).execute().actionGet();
    System.out.println(response.getId());
}

/**
 * Insert a document from a Map.
 */
@Test
public void test5() {
    Map<String, Object> map = new HashMap<String, Object>();
    map.put("name", "xiaoying");
    map.put("age", 18);
    IndexResponse response = client.prepareIndex("index1", "type1", "2")
            .setSource(map).execute().actionGet();
    System.out.println(response.getId());
}

/**
 * Insert a document from a bean, serialized with Jackson.
 * Requires the dependency:
 * <dependency>
 *     <groupId>com.fasterxml.jackson.core</groupId>
 *     <artifactId>jackson-databind</artifactId>
 *     <version>2.1.3</version>
 * </dependency>
 */
@Test
public void test6() throws Exception {
    Person person = new Person();
    person.setName("xiaoli2");
    person.setAge(3);
    // Instantiate a JSON mapper (create once, reuse)
    ObjectMapper mapper = new ObjectMapper();
    // Generate the JSON
    String json = mapper.writeValueAsString(person);
    IndexResponse response = client.prepareIndex("index1", "type1", "2")
            .setSource(json).execute().actionGet();
    System.out.println(response.getId());
}

/**
 * Insert a document built with the ES XContentBuilder helper.
 */
@Test
public void test7() throws Exception {
    XContentBuilder builder = XContentFactory.jsonBuilder().startObject()
            .field("name", "xiaofeng").field("age", 4).endObject();
    IndexResponse response = client.prepareIndex("index1", "type1", "3")
            .setSource(builder).execute().actionGet();
    System.out.println(response.getId());
}

Querying data

// Get a document by id
@Test
public void test8() throws Exception {
    GetResponse response = client.prepareGet("index1", "type1", "3")
            .execute().actionGet();
    System.out.println(response.getSourceAsString());
}

Updating data

// Update, variant 1: UpdateRequest
@Test
public void test9() throws Exception {
    UpdateRequest request = new UpdateRequest("index1", "type1", "3");
    XContentBuilder builder = XContentFactory.jsonBuilder().startObject()
            .field("age", 18).endObject();
    request.doc(builder);
    UpdateResponse response = client.update(request).actionGet();
    System.out.println(response.getVersion());
}

// Update, variant 2: prepareUpdate
@Test
public void test10() throws Exception {
    XContentBuilder builder = XContentFactory.jsonBuilder().startObject()
            .field("age", 20).endObject();
    UpdateResponse response = client.prepareUpdate("index1", "type1", "3")
            .setDoc(builder).get();
    System.out.println(response.getVersion());
}

// Upsert: if a document with id 4 exists, update its name to "name44";
// if not, insert {"name":"name4444","age":90}
@Test
public void test11() throws Exception {
    UpdateRequest request = new UpdateRequest("index1", "type1", "4");
    XContentBuilder builder = XContentFactory.jsonBuilder().startObject()
            .field("name", "name44").endObject();
    request.doc(builder);
    request.upsert(XContentFactory.jsonBuilder().startObject()
            .field("name", "name4444").field("age", 90).endObject());
    UpdateResponse response = client.update(request).get();
    System.out.println(response.getVersion());
}

Deleting data

// Delete a document by id
@Test
public void test12() throws Exception {
    /*DeleteResponse response = client.prepareDelete("index1", "type1", "4")
            .execute().get();
    System.out.println(response.getVersion());*/
    DeleteRequest request = new DeleteRequest();
    request.index("index1").type("type1").id("4");
    DeleteResponse response = client.delete(request).get();
    System.out.println(response.getVersion());
}

// Count the documents in an index, similar to SELECT COUNT(*)
@Test
public void test13() throws Exception {
    long num = client.prepareCount("index1").execute().get().getCount();
    System.out.println(num);
}
GROUP BY in SQL corresponds to aggregations in ES.
/**
 * Bulk: the bulk API executes batched adds, updates, and deletes in a single
 * request. (Note: the transport-client bulk API shown here runs over TCP; it
 * was the separate, deprecated "bulk UDP" API that used UDP and therefore
 * could not guarantee that no data was lost on the way to the server.)
 */
@Test
public void test14() throws Exception {
    BulkRequestBuilder bulkRequestBuilder = client.prepareBulk();
    IndexRequest indexRequest = new IndexRequest("wuke", "person", "8")
            .source(XContentFactory.jsonBuilder().startObject()
                    .field("name", "中国").field("age", 22).endObject());
    UpdateRequest updateRequest = new UpdateRequest("wuke", "person", "9")
            .doc(XContentFactory.jsonBuilder().startObject()
                    .field("name", "人名").field("age", 80).endObject());
    updateRequest.upsert(XContentFactory.jsonBuilder().startObject()
            .field("name", "华人").field("age", 12).endObject());
    BulkResponse bulkResponse = bulkRequestBuilder.add(indexRequest)
            .add(updateRequest).execute().get();
    if (bulkResponse.hasFailures()) {
        BulkItemResponse[] bulkItemResponses = bulkResponse.getItems();
        for (BulkItemResponse bulkItemResponse : bulkItemResponses) {
            System.out.println(bulkItemResponse.getFailureMessage());
        }
    } else {
        System.out.println("execute success");
    }
    List<DiscoveryNode> nodes = client.connectedNodes();
    for (DiscoveryNode node : nodes) {
        System.out.println(node.getHostAddress());
    }
    client.close();
}

/**
 * Search all documents.
 */
@Test
public void selectTest() throws Exception {
    SearchResponse searchResponse = client.prepareSearch("wuke").setTypes("person")
            .setSearchType(SearchType.QUERY_THEN_FETCH).execute().get();
    SearchHits searchHits = searchResponse.getHits();
    System.out.println(searchHits.getTotalHits());
    SearchHit[] hits = searchResponse.getHits().getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getSourceAsString());
    }
}

/**
 * Search with query, filter, sort, highlighting, and paging,
 * roughly: select * from wuke where name = 'name1' order by age
 */
@Test
public void test15() throws Exception {
    QueryBuilder qb = QueryBuilders.rangeQuery("age").gt(20).lt(50);
    SearchResponse searchResponse = client.prepareSearch("wuke").setTypes("person")
            .setSearchType(SearchType.QUERY_THEN_FETCH)
            .setQuery(QueryBuilders.matchQuery("name", "name1"))
            // ES 1.4: .setPostFilter(FilterBuilders.rangeFilter("age").gte(20).lte(25))
            .setPostFilter(qb)              // ES 2.2.1
            .addSort("age", SortOrder.ASC)
            .addHighlightedField("name")
            .setHighlighterPreTags("<font color='red'>")
            .setHighlighterPostTags("</font>")
            .setFrom(0).setSize(10).execute().get();
    SearchHits searchHits = searchResponse.getHits();
    System.out.println(searchHits.getTotalHits());
    SearchHit[] hits = searchHits.getHits();
    for (SearchHit hit : hits) {
        HighlightField names = hit.getHighlightFields().get("name");
        Text[] fragments = names.getFragments();
        for (Text text : fragments) {
            System.out.println(text);
        }
        System.out.println(hit.getSourceAsString());
    }
}

/**
 * Group by name and count how many documents share each name.
 */
@Test
public void test20() throws Exception {
    SearchResponse response = client.prepareSearch("wuke").setTypes("person")
            .addAggregation(AggregationBuilders.terms("group").field("name"))
            // By default only the first ten groups are returned;
            // setting the size to 0 returns all of the data
            .setSize(0)
            .execute().actionGet();
    Terms terms = response.getAggregations().get("group");
    List<Terms.Bucket> buckets = terms.getBuckets();
    for (Terms.Bucket bucket : buckets) {
        System.out.println(bucket.getKey() + "------" + bucket.getDocCount());
    }
}

/**
 * (The business scenario here is contrived.)
 * Group by name and sum the ages within each group,
 * roughly: select name, sum(age) from index group by name
 */
@Test
public void test21() throws Exception {
    SearchResponse response = client.prepareSearch("wuke").setTypes("person")
            .addAggregation(AggregationBuilders.terms("group").field("name")
                    .subAggregation(AggregationBuilders.sum("sum").field("age")))
            .setSize(0)
            .execute().actionGet();
    Terms terms = response.getAggregations().get("group");
    List<Terms.Bucket> buckets = terms.getBuckets();
    for (Terms.Bucket bucket : buckets) {
        Sum sum = bucket.getAggregations().get("sum");
        System.out.println(bucket.getKey() + "------" + sum.getValue());
    }
}
/**
 * Create an index.
 * @param indexName the index name
 */
public static void createIndex(String indexName) {
    try {
        CreateIndexResponse indexResponse = client
                .admin()
                .indices()
                .prepareCreate(indexName)
                .get();
        System.out.println(indexResponse.isAcknowledged()); // true means the index was created
    } catch (ElasticsearchException e) {
        e.printStackTrace();
    }
}

/**
 * Add a mapping to an index. Every field here is a not_analyzed string.
 * @param index the index name
 * @param type  the type the mapping applies to
 */
public static void addMapping(String index, String type) {
    String[] fields = {
            "orderNo", "operateTime", "expectGetTime", "actualGetTime",
            "expectTime", "actualTime", "expectGrabTime", "actualGrabTime",
            "resourceCode", "deliveryEmpId", "deliveryEmpName", "employType",
            "sendStoreCode", "sendStoreName", "sendTradingAreaCode",
            "sendTradingAreaName", "cityCode", "orderStatus", "cancelStatus",
            "grabStatus", "deliverStatus", "timelydeliverStatus",
            "pickupStatus", "incDay"
    };
    try {
        // Build the mapping with XContentBuilder: each field becomes
        // {"index": "not_analyzed", "type": "string"}
        XContentBuilder builder = XContentFactory.jsonBuilder()
                .startObject()
                .startObject("properties");
        for (String field : fields) {
            builder.startObject(field)
                    .field("index", "not_analyzed")
                    .field("type", "string")
                    .endObject();
        }
        builder.endObject().endObject();
        System.out.println(builder.string());
        PutMappingRequest mappingRequest =
                Requests.putMappingRequest(index).source(builder).type(type);
        client.admin().indices().putMapping(mappingRequest).actionGet();
    } catch (ElasticsearchException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
query and fetch (fastest; returns N times the requested amount of data: to get the top 10, every shard returns its own top 10, so 10*N documents come back, where N is the number of shards)
query then fetch (the default; to get the top 10, it queries every shard as in query and fetch, then sorts the per-shard results and returns only the global top 10)
DFS query and fetch (more accurate scoring and ranking: before the query runs, term and document statistics are gathered from all shards so that each shard scores its results against global statistics rather than purely local ones; like query and fetch, it still returns 10*N documents)
DFS query then fetch (builds on DFS query and fetch by merging and sorting the per-shard results and returning only the sorted top 10)
DFS explained: see the notes
To summarize: in terms of performance, QUERY_AND_FETCH is the fastest and DFS_QUERY_THEN_FETCH the slowest; in terms of accuracy, the DFS variants score more accurately than the non-DFS ones.
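The difference in how many results the two non-DFS modes return can be sketched with a toy model (everything here is invented for illustration: each "shard" is just a list of scores sorted in descending order, and k is the requested page size):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class SearchTypeDemo {

    // query and fetch: every shard returns its own top k, so up to k * N
    // results come back in total (N = number of shards)
    static List<Double> queryAndFetch(List<List<Double>> shards, int k) {
        return shards.stream()
                .flatMap(s -> s.stream().limit(k))
                .collect(Collectors.toList());
    }

    // query then fetch: merge all per-shard top-k lists, sort globally,
    // and keep only the global top k
    static List<Double> queryThenFetch(List<List<Double>> shards, int k) {
        return queryAndFetch(shards, k).stream()
                .sorted(Comparator.reverseOrder())
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Double>> shards = Arrays.asList(
                Arrays.asList(9.0, 7.0, 1.0),
                Arrays.asList(8.0, 6.0, 2.0),
                Arrays.asList(9.5, 3.0, 0.5));
        System.out.println(queryAndFetch(shards, 2).size()); // 2 per shard * 3 shards = 6
        System.out.println(queryThenFetch(shards, 2));       // [9.5, 9.0]
    }
}
```

The DFS variants change only how the scores are computed (against global statistics), not how many results each phase returns.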
Paging and sorting:
To fetch the second page with 5 documents per page: search?size=5&from=5
Do not request too many results at once or pages that are too deep; this puts heavy pressure on the servers, because results must be sorted before being returned: a request fans out to multiple shards, each shard produces its own sorted results, and these are then merged centrally to guarantee the correctness of the final ordering.
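The from value for a given page follows from = (page - 1) * size with 1-based page numbers; a minimal sketch (the helper name is invented):

```java
public class Paging {

    // from = (page - 1) * size, with 1-based page numbers
    static int from(int page, int size) {
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        // Second page, 5 documents per page
        System.out.println("search?size=5&from=" + from(2, 5)); // search?size=5&from=5
    }
}
```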
timeout
A timeout can be set so that the query returns within the specified time.
For example, with a 10 ms timeout, ES returns whatever it has found within 10 ms.
Setting a timeout does not abort the query: ES merely returns the data collected so far within the time you specified and then closes the connection. In the background, the rest of the query may well continue even though the results have already been returned.
Query preference (which shard copies a search reads from):
random selection (the default): data is read from randomly chosen shard copies.
_local: the query runs preferentially on shards held by the local node, falling back to other nodes for shards it lacks.
_primary: the query runs only on primary shards.
_primary_first: the query prefers primary shards and falls back to replicas when a primary is unavailable (e.g. down).
_only_node: the query runs only on the node with the given id; if that node holds only some of the queried index's shards, only those shards are searched, so the result may be incomplete. For example, _only_node:123 queries only the node whose id is 123.
_prefer_node:nodeid  the query runs preferentially on the specified node.
_shards:0,1,2,3,4  query only the data in the specified shards.
Custom: _only_nodes  query against several specified nodes.
Split brain
http://blog.csdn.net/cnweike/article/details/39083089
discovery.zen.minimum_master_nodes
Controls the minimum number of cluster nodes that must be visible for a master election to take place. A value greater than 1 is recommended, since a master node is only meaningful in a cluster of two or more nodes.
Split brain can be caused by network problems, or by a node that is so overloaded (for example, acting as both a master node and a data node) that it appears dead.
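The value commonly recommended for discovery.zen.minimum_master_nodes is a quorum of the master-eligible nodes, i.e. (master-eligible nodes / 2) + 1, which prevents two halves of a partitioned cluster from each electing a master. A small sketch of that formula:

```java
public class Quorum {

    // Common recommendation: minimum_master_nodes = (masterEligibleNodes / 2) + 1
    static int minimumMasterNodes(int masterEligibleNodes) {
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(minimumMasterNodes(3)); // 2
        System.out.println(minimumMasterNodes(5)); // 3
    }
}
```

With 3 master-eligible nodes the setting would be 2: a partitioned minority of 1 node can then never elect its own master.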
ES optimization
Raise the open-file-descriptor limit:
ulimit -a        (inspect the current limits)
ulimit -n 32000  (set the open-files limit)
Adjust the ES JVM heap size in the configuration files:
1. Set ES_MIN_MEM and ES_MAX_MEM in bin/elasticsearch.in.sh. Setting them to the same value is recommended to avoid repeated heap resizing; size them to the server's memory, typically around 60% of it (the default is 256M).
2. If ES is started through the servicewrapper plugin, edit bin/service/elasticsearch.conf instead (default 1024M).
Use mlockall to lock the process's memory
and avoid swapping, which improves performance.
Edit config/elasticsearch.yml:
bootstrap.mlockall: true
More shards improve indexing throughput; 5-20 is a reasonable range.
Too few or too many shards both slow down retrieval: too many shards means more files open at search time and more communication between servers, while too few shards makes each shard's index too large and therefore slow to search. It is advisable to keep at most about 20 GB of index data per shard, so: number of shards = total data volume / 20 GB.
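The shard-count rule of thumb above can be sketched as follows (the 20 GB ceiling is the notes' own guideline, not a hard limit; rounding up keeps every shard at or below the ceiling):

```java
public class ShardCount {

    static final long GB = 1L << 30;

    // Rule of thumb: at most ~20 GB of index data per shard,
    // so shards = ceil(totalBytes / 20 GB)
    static long shardsFor(long totalBytes) {
        return (totalBytes + 20 * GB - 1) / (20 * GB);
    }

    public static void main(String[] args) {
        System.out.println(shardsFor(100 * GB)); // 5
        System.out.println(shardsFor(205 * GB)); // 11
    }
}
```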
More replicas improve search capacity, but many replicas also put extra load on the servers because the data has to be kept in sync; 2-3 replicas are usually enough.
Optimize indexes periodically; otherwise, the more segments accumulate, the worse query performance becomes.
If the index volume is not very large, the segments can be merged down to one:
curl -XPOST 'http://localhost:9200/crxy/_optimize?max_num_segments=1'
Java code: client.admin().indices().prepareOptimize("crxy").setMaxNumSegments(1).get();
To merge out only deleted documents:
curl -XPOST 'http://localhost:9200/crxy/_optimize?only_expunge_deletes=true'
Java code: client.admin().indices().prepareOptimize("crxy").setOnlyExpungeDeletes(true).get();
If a large amount of data has to be bulk-loaded at the start of a project, it is advisable to set the replica count to 0,
because when ES indexes a document and replicas exist, the data is immediately synchronized to them, which adds load on ES. After the bulk load completes, raise the replica count back to what is needed. This improves indexing efficiency.
Disable the _all field when it is not needed:
"_all": {"enabled": "false"}
The default log level is trace: any query slower than 500 ms counts as a slow query and gets logged, which drives CPU, memory, and I/O load up. Changing the log level to info reduces the pressure on the server.
Edit ES_HOME/config/logging.yml
or ES_HOME/config/elasticsearch.yml