Recommended Articles on Elasticsearch

[Updated 2017.1 - 12]

Elasticsearch: The Definitive Guide [master]


[Updated 2016.4 - 9]

How we reindexed 36 billion documents in 5 days within the same Elasticsearch cluster

  • _cache=false
  • G1 GC over CMS GC
  • The indexer was shard aware.
  • Use Logstash for reindexing.

Elasticsearch on Azure Guidance

  1. Transient settings changes survive when a single node restarts or a new node joins the cluster, because the master node syncs them automatically. They are lost only on a full cluster restart.
  2. Force the allocation of an unassigned shard with a reason. POST /_cluster/reroute?explain {...}
  3. If a node reaches high JVM heap usage, you can call the clear-cache API (POST /_cache/clear) as an immediate action at the node level to make Elasticsearch drop its caches. This hurts performance, but it can save you from an OOM (Out Of Memory) error.
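
The three calls above can be sketched as plain HTTP requests. This is a minimal sketch, not the guidance's own code: `ES_URL`, the helper names, and the example setting `cluster.routing.allocation.enable` are assumptions chosen for illustration. The reroute body is elided in the note above, so here the caller supplies it. The helpers only build requests, so nothing is sent unless you pass the result to `request.urlopen`.

```python
import json
from urllib import request

ES_URL = "http://localhost:9200"  # assumption: a locally reachable node


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the Elasticsearch REST API."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = request.Request(ES_URL + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def transient_setting(key, value):
    # Transient settings are synced by the master to restarted or newly
    # joined nodes, but are dropped after a full cluster restart.
    return build_request("PUT", "/_cluster/settings", {"transient": {key: value}})


def reroute_explain(commands):
    # `commands` comes from the caller; the note above elides the body as {...}.
    return build_request("POST", "/_cluster/reroute?explain", {"commands": commands})


def clear_caches():
    # Asks Elasticsearch to drop its caches; expect slower queries
    # until the caches warm up again.
    return build_request("POST", "/_cache/clear")


if __name__ == "__main__":
    # Example setting name is illustrative, not taken from the guidance.
    req = transient_setting("cluster.routing.allocation.enable", "none")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To actually send it: request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to log or review a call before it touches a production cluster.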

In Search of Agile Time Series Database: https://www.gitbook.com/book/taowen/tsdb/details

How to Install the ELK Stack on Azure


[Original post]

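We have been using Elasticsearch as the back-end search engine in our project for almost a year, moving from version 1.1.1 at the start of the year to 1.2.2 and now to the latest, 1.4.1. Elasticsearch itself has kept improving step by step over the past year or so, and our understanding of it has deepened along with it. Frankly, open-source software is cheap to get started with, but the cost of learning it and building up the operational experience needed to run it well in production is still quite high. We hit plenty of bumps over the year, and running Elasticsearch well on the Microsoft Azure cloud platform was an even bigger challenge.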

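There are already many Elasticsearch resources online. The two blog posts by Alex Brasetvik below (1 and 2) are, in my view, among the more comprehensive introductions to Elasticsearch. They cover both theory and practice and include many useful related links, so I recommend them here: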

  1. Elasticsearch from the Bottom Up, Part 1
  2. Elasticsearch from the Top Down - Tracing a Request Down to the Bits
  3. Elasticsearch Indexing Performance Cheatsheet:
    1. Consider separating data nodes (that actually store and index data) from “aggregator nodes” (used only for querying). When aggregator nodes handle search queries and only contact data nodes as needed, they take load off the data nodes which will then have more capacity for handling indexing requests.
    2. By default, an index shard uses a refresh interval of one second, i.e., new documents become available for search after one second. Even though refreshing is a more lightweight operation than one may think, it comes at a cost. Thus, depending on your search requirements, you may consider setting the refresh interval to something higher than one second. It can even make sense to temporarily turn off refreshing completely for an index (by setting the interval to -1), e.g., during a bulk indexing run, and trigger it manually at the end.
  4. Elasticsearch in Production
  5. 10 Elasticsearch metrics to watch
  6. Performance Monitoring Essentials - Elasticsearch Edition : 'stored' fields
  7. How we optimized 100 sec elasticsearch queries to be under a sub second.
  8. Elasticsearch vs. Hadoop For Advanced Analytics
  9. Why do people use Hadoop or Spark when there is ElasticSearch?
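
The refresh-interval advice in item 3 (turn refreshing off during a bulk indexing run, then trigger it manually at the end) can be sketched as a small context manager. This is a minimal sketch under stated assumptions: `send` is a hypothetical transport callable supplied by the caller, the index name `logs` is made up, and the restore value `1s` is the default interval mentioned above. The REST paths used are the index `_settings` and `_refresh` endpoints.

```python
from contextlib import contextmanager


@contextmanager
def refresh_paused(send, index, restore_interval="1s"):
    """Disable refresh for `index` during the block, then restore and refresh once.

    `send(method, path, body)` is a hypothetical transport callable supplied
    by the caller, for example a thin wrapper around an HTTP client.
    """
    settings_path = "/%s/_settings" % index
    send("PUT", settings_path, {"index": {"refresh_interval": "-1"}})
    try:
        yield
    finally:
        # Restore the interval even if the bulk run fails, then make the
        # newly indexed documents searchable with one manual refresh.
        send("PUT", settings_path, {"index": {"refresh_interval": restore_interval}})
        send("POST", "/%s/_refresh" % index, None)


if __name__ == "__main__":
    calls = []

    def record(method, path, body):
        # Stand-in transport that records calls instead of sending them.
        calls.append((method, path, body))

    with refresh_paused(record, "logs"):
        record("POST", "/logs/_bulk", "...bulk payload...")
    for call in calls:
        print(call)
```

Using `try`/`finally` means the index does not stay at -1 if the bulk run raises, which would otherwise leave new documents unsearchable until someone notices.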
