Elasticsearch

kunpengku

Article link: https://blog.csdn.net/u012063703/article/details/52601749

1. What is ES?
Elasticsearch is a distributed, scalable, real-time search and analytics engine. It enables you to search, analyze, and explore your data, often in ways that you did not anticipate at the start of a project. It exists because raw data sitting on a hard drive is just not useful.
In short, ES is a distributed, scalable, real-time search and analytics engine that can be used to search, analyze, and explore data.

2. What can ES do?
Full-text search and real-time analytics of structured data.
It also handles the complexities of dealing with human language, geolocation, and relationships.

3. From basic setup to improving the search experience

4. Going deeper
We will also discuss how best to model your data to take advantage of the horizontal scalability of Elasticsearch, and how to configure and monitor your cluster when moving to production.
Topics: horizontal scaling, and configuring and monitoring a cluster.

5. Information-retrieval concepts, distributed systems, and the query DSL

6. Reference documentation explains how to use features.
A definitive guide explains why and when to use features.

7. Instead of hoping that the black box will do what you want, understanding gives you certainty and clarity.

8. Goal
The best results are on the first page.

9. Aggregations and analytics: ways to summarize and group your data to show overall trends.

10. Who uses ES
Wikipedia uses Elasticsearch to provide full-text search with highlighted search snippets, and search-as-you-type and did-you-mean suggestions.
The Guardian uses Elasticsearch to combine visitor logs with social-network data to provide real-time feedback to its editors about the public's response to new articles.
Stack Overflow combines full-text search with geolocation queries and uses more-like-this to find related questions and answers.
GitHub uses Elasticsearch to query 130 billion lines of code.

11. Performance
ES can scale out to hundreds of servers and petabytes of data.

12. Lucene is complex; ES hides that complexity.

13. What ES is
A distributed real-time document store where every field is indexed and searchable
A distributed search engine with real-time analytics
Capable of scaling to hundreds of servers and petabytes of structured and unstructured data

14. ES node
A node is a running instance of Elasticsearch.
A cluster is a group of nodes with the same cluster.name that are working together to share data and to provide failover and scale.

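15. Installing Sense
1. Download Kibana from https://www.elastic.co/downloads/kibana
2. Unpack the archive.
3. Run ./bin/kibana plugin --install elastic/sense (this is slow, about 10 minutes).
4. Start Kibana with ./bin/kibana
5. Open http://10.60.0.130:5601/app/sense and type DSL requests into the page, which is very helpful for debugging.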

16. How to talk to ES
Java clients use port 9300.
Other languages use port 9200: a RESTful API with JSON over HTTP.
https://www.elastic.co/guide/en/elasticsearch/guide/current/_talking_to_elasticsearch.html
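
As a rough illustration of the REST side, here is a minimal Python sketch that builds a JSON-over-HTTP request with the standard library. The host `localhost:9200` and the helper names `build_request` and `send` are assumptions made for this example; `send` only works if a node is actually running there.

```python
import json
import urllib.request

# Assumption: a node is listening on localhost at the REST port 9200.
ES_URL = "http://localhost:9200"


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        ES_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send the request and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_request("GET", "/_cluster/health")
print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes it easy to inspect the verb, path, and body before touching a cluster.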

17. Elasticsearch is document oriented.

18. Indices
An index is like a database in a traditional relational database.
Full-text search relies on an inverted index.
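
To make the idea of an inverted index concrete, here is a toy Python sketch that maps each term to the documents containing it. It is only an illustration under simple assumptions (whitespace tokenization, lowercasing); Lucene's real structure is far more elaborate, and the function name is made up for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


docs = {
    1: "The quick brown fox",
    2: "The lazy brown dog",
}
index = build_inverted_index(docs)
print(index["brown"])  # [1, 2]
print(index["fox"])    # [1]
```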

19. Indexing data
https://www.elastic.co/guide/en/elasticsearch/guide/current/_indexing_employee_documents.html

20. Retrieving a document by ID
GET /megacorp/employee/1

21. Searching for everything
GET /megacorp/employee/_search
By default, a search will return the top 10 results.

22. Query-string search
GET /megacorp/employee/_search?q=last_name:Smith

23. DSL: domain-specific language
24. match query
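
The two search styles in these notes can be compared side by side. The sketch below builds the query-string path and the equivalent query DSL body for a match query; the helper names are hypothetical, and nothing is sent to a cluster.

```python
import json
from urllib.parse import urlencode


def query_string_path(index, doc_type, field, value):
    """Path for a query-string search; the colon is percent-encoded as %3A."""
    query = urlencode({"q": "{}:{}".format(field, value)})
    return "/{}/{}/_search?{}".format(index, doc_type, query)


def match_body(field, value):
    """Request body for the same search expressed as a match query in the DSL."""
    return {"query": {"match": {field: value}}}


path = query_string_path("megacorp", "employee", "last_name", "Smith")
body = match_body("last_name", "Smith")
print("GET " + path)
print(json.dumps(body))
```

The DSL form is more verbose, but it is plain JSON, so it can be built and combined programmatically.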

25. Cluster health
GET /_cluster/health
green
All primary and replica shards are active.
yellow
All primary shards are active, but not all replica shards are active.
red
Not all primary shards are active.
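
The three colors follow directly from the state of the shards. As a small sketch of that rule (the function name and the tuple representation are assumptions for illustration, not an ES API):

```python
def cluster_status(shards):
    """Derive a health color from (kind, active) pairs.

    kind is "primary" or "replica"; active is True when the shard is active.
    """
    primaries_ok = all(active for kind, active in shards if kind == "primary")
    replicas_ok = all(active for kind, active in shards if kind == "replica")
    if not primaries_ok:
        return "red"
    if not replicas_ok:
        return "yellow"
    return "green"


print(cluster_status([("primary", True), ("replica", True)]))   # green
print(cluster_status([("primary", True), ("replica", False)]))  # yellow
print(cluster_status([("primary", False), ("replica", True)]))  # red
```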

26. Index
In reality, an index is just a logical namespace that points to one or more physical shards.

27. Shard
A shard is a low-level worker unit that holds just a slice of all the data in the index.
A shard is a single instance of Lucene, and a complete search engine in its own right.
Our documents are stored and indexed in shards, but our applications don’t talk to them directly. Instead, they talk to an index.

A shard can be either a primary shard or a replica shard.
The number of primary shards in an index is fixed at the time that an index is created, but the number of replica shards can be changed at any time.

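By default, an index has 5 primary shards.

With only one node, the cluster can still serve requests, but it is not safe: the status is yellow because the replica shards have nowhere to be placed.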
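
The single-node case can be worked through with a simplified model: copies of the same shard never share a node, so each primary can have at most (nodes - 1) replicas assigned. The sketch below assumes 1 replica per primary, which these notes do not state explicitly, and the function name is hypothetical.

```python
def unassigned_replicas(nodes, primaries, replicas_per_primary):
    """Count replica shards that cannot be placed on any node.

    Simplified model: a replica is never placed on the node that already
    holds another copy of the same shard.
    """
    placeable = min(replicas_per_primary, max(nodes - 1, 0))
    return primaries * (replicas_per_primary - placeable)


# Assumption: 5 primary shards with 1 replica each.
print(unassigned_replicas(nodes=1, primaries=5, replicas_per_primary=1))  # 5
print(unassigned_replicas(nodes=2, primaries=5, replicas_per_primary=1))  # 0
```

With one node all 5 replicas stay unassigned (yellow); adding a second node lets every replica be placed.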
28. Only nodes on the same machine form a cluster automatically.
Nodes on different machines need unicast to discover each other:

discovery.zen.ping.unicast.hosts: ["host1", "host2:port"]

29. The more nodes, the higher the throughput
Read requests (searches or document retrieval) can be handled by a primary or a replica shard, so the more copies of data that you have, the more search throughput you can handle.

30. In ES, a document is the serialization of a top-level (root) object.

31. Document metadata
_index
Where the document lives.
An _index name must be lowercase, cannot begin with an underscore, and cannot contain commas.
_type
The class of object that the document represents.
A _type name can be lowercase or uppercase, but shouldn't begin with an underscore or period. It also may not contain commas, and is limited to a length of 256 characters.
_id
The unique identifier for the document.
The ID is a string.
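
The naming rules above can be written down as simple checks. This sketch covers only the rules listed in these notes; the function names are made up for the example, and a real cluster may enforce additional restrictions.

```python
def is_valid_index_name(name):
    """Lowercase, no leading underscore, no commas."""
    return (
        bool(name)
        and name == name.lower()
        and not name.startswith("_")
        and "," not in name
    )


def is_valid_type_name(name):
    """No leading underscore or period, no commas, at most 256 characters."""
    return (
        bool(name)
        and not name.startswith(("_", "."))
        and "," not in name
        and len(name) <= 256
    )


print(is_valid_index_name("megacorp"))  # True
print(is_valid_index_name("Megacorp"))  # False
print(is_valid_type_name("Employee"))   # True
print(is_valid_type_name(".employee"))  # False
```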

32. Indexing a document
1. Using your own ID
https://www.elastic.co/guide/en/elasticsearch/guide/current/index-doc.html

PUT /{index}/{type}/{id}
{
  "field": "value",
  ...
}

2. Autogenerated ID: use POST instead of PUT

POST /{index}/{type}/
{
  "field": "value",
  ...
}

Autogenerated IDs are 20 characters long, URL-safe, Base64-encoded GUID strings.
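
The choice between the two verbs, and the shape of a 20-character URL-safe Base64 string, can be sketched as follows. The helper names are hypothetical, and `random_url_safe_id` is only an illustration of the encoding length (15 bytes encode to exactly 20 characters); it is not the generator ES itself uses.

```python
import base64
import os


def index_request_line(index, doc_type, doc_id=None):
    """PUT when the caller supplies an ID, POST when ES should generate one."""
    if doc_id is None:
        return "POST /{}/{}/".format(index, doc_type)
    return "PUT /{}/{}/{}".format(index, doc_type, doc_id)


def random_url_safe_id():
    """Illustration only: 15 random bytes give 20 URL-safe Base64 characters."""
    return base64.urlsafe_b64encode(os.urandom(15)).decode("ascii")


print(index_request_line("website", "blog", 123))  # PUT /website/blog/123
print(index_request_line("website", "blog"))       # POST /website/blog/
print(len(random_url_safe_id()))                   # 20
```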

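33. Updating a document
An update marks the old document as deleted and puts a new document in its place. The old document is not removed immediately; ES cleans it up on its own at a suitable time as you continue to index more documents.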

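34. Creating a document only if it does not already exist: if no document with that ID exists, it is created; if one already exists, it is left unchanged.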
https://www.elastic.co/guide/en/elasticsearch/guide/current/create-doc.html

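35. Concurrency control
1. Pessimistic concurrency control: lock the data before reading it.
2. Optimistic concurrency control: no locking, but if the data changes between the read and the write, the update fails.
ES uses optimistic concurrency control.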

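36. Versions: an external number can also be used as the version. On each update, the version must be greater than the previous one.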
https://www.elastic.co/guide/en/elasticsearch/guide/current/optimistic-concurrency-control.html
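
The create-only, optimistic-update, and external-version rules from items 34 to 36 can be imitated with a toy in-memory store. This is a sketch of the idea only: `ToyStore` and `VersionConflict` are invented for this example and are not part of any ES client.

```python
class VersionConflict(Exception):
    """Raised when a write is rejected because of a version mismatch."""


class ToyStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def create(self, doc_id, source):
        """Create only if absent; an existing document is left unchanged."""
        if doc_id in self._docs:
            raise VersionConflict("document already exists")
        self._docs[doc_id] = (1, source)
        return 1

    def get(self, doc_id):
        return self._docs[doc_id]

    def update(self, doc_id, source, expected_version):
        """Succeed only if nobody changed the document since it was read."""
        current, _ = self._docs[doc_id]
        if current != expected_version:
            raise VersionConflict("expected {} but found {}".format(expected_version, current))
        self._docs[doc_id] = (current + 1, source)
        return current + 1

    def index_external(self, doc_id, source, external_version):
        """Accept the write only if the external version is greater than the stored one."""
        if doc_id in self._docs and external_version <= self._docs[doc_id][0]:
            raise VersionConflict("external version is not greater")
        self._docs[doc_id] = (external_version, source)
        return external_version


store = ToyStore()
store.create("1", {"title": "first"})
version, _ = store.get("1")
print(store.update("1", {"title": "second"}, expected_version=version))  # 2
try:
    store.update("1", {"title": "stale"}, expected_version=version)
except VersionConflict as error:
    print("conflict:", error)  # conflict: expected 1 but found 2
```

No lock is ever taken: the second writer simply fails and has to re-read the document and decide what to do.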
