Elasticsearch Memory Settings

Based on Elasticsearch: The Definitive Guide.

Elasticsearch ships with a default heap of 1 GB, which is clearly too small for real-world use; don't run with the default configuration.

There are two ways to set Elasticsearch's heap size. The simplest is to set the ES_HEAP_SIZE environment variable. When the server process starts, it reads this variable and sizes the heap accordingly:

export ES_HEAP_SIZE=10g
The other option is to pass the heap size on the command line when starting ES:

./bin/elasticsearch -Xmx10g -Xms10g

The first method is recommended.
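Either way, once the node is running you can confirm the heap ES actually picked up through the nodes info API (a sketch, assuming a local node on the default port 9200):

curl -s 'http://localhost:9200/_nodes/jvm?pretty' | grep heap_max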

The recommendation is to give ES half of the machine's physical memory, but never more than 32 GB (the English text below explains why). To judge whether the heap setting is appropriate, check this line in the ES startup log:

        [2017-02-04T16:23:25,710][INFO ][o.e.e.NodeEnvironment    ] [n3zKowe] heap size [1.9gb], compressed ordinary object pointers [true]

If it shows [true], you're fine. It will generally become [false] once the heap exceeds 32 GB; if so, lower the heap setting step by step and try again.

Also, disable swap on Linux!


If the machine has a large amount of physical memory, consider running several nodes on it (each with a heap under 32 GB). I haven't tested this myself, so I won't say more; if you have, please share your results in the comments. Thanks!


The original text follows; corrections are welcome:

Heap: Sizing and Swapping

The default installation of Elasticsearch is configured with a 1 GB heap. For just about every deployment, this number is usually too small. If you are using the default heap values, your cluster is probably configured incorrectly.

There are two ways to change the heap size in Elasticsearch. The easiest is to set an environment variable called ES_HEAP_SIZE. When the server process starts, it will read this environment variable and set the heap accordingly. As an example, you can set it via the command line as follows:

export ES_HEAP_SIZE=10g

Alternatively, you can pass in the heap size via a command-line argument when starting the process, if that is easier for your setup:

./bin/elasticsearch -Xmx10g -Xms10g (1)
  1. Ensure that the min (Xms) and max (Xmx) sizes are the same to prevent the heap from resizing at runtime, a very costly process.

Generally, setting the ES_HEAP_SIZE environment variable is preferred over setting explicit -Xmx and -Xms values.
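If you installed Elasticsearch from a deb or rpm package, a natural place to set the variable persistently is the service defaults file (a sketch; the paths below assume the standard package layout of that era and may differ on your system):

# Debian/Ubuntu: /etc/default/elasticsearch
# RHEL/CentOS:   /etc/sysconfig/elasticsearch
ES_HEAP_SIZE=10g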

Give (less than) Half Your Memory to Lucene

A common problem is configuring a heap that is too large. You have a 64 GB machine, and by golly, you want to give Elasticsearch all 64 GB of memory. More is better!

Heap is definitely important to Elasticsearch. It is used by many in-memory data structures to provide fast operation. But with that said, there is another major user of memory that is off heap: Lucene.

Lucene is designed to leverage the underlying OS for caching in-memory data structures. Lucene segments are stored in individual files. Because segments are immutable, these files never change. This makes them very cache friendly, and the underlying OS will happily keep hot segments resident in memory for faster access. These segments include both the inverted index (for fulltext search) and doc values (for aggregations).

Lucene’s performance relies on this interaction with the OS. But if you give all available memory to Elasticsearch’s heap, there won’t be any left over for Lucene. This can seriously impact the performance.

The standard recommendation is to give 50% of the available memory to the Elasticsearch heap, while leaving the other 50% free. It won’t go unused; Lucene will happily gobble up whatever is left over.

If you are not aggregating on analyzed string fields (e.g. you won’t be needing fielddata) you can consider lowering the heap even more. The smaller you can make the heap, the better performance you can expect from both Elasticsearch (faster GCs) and Lucene (more memory for caching).

Don’t Cross 32 GB!

There is another reason to not allocate enormous heaps to Elasticsearch. As it turns out, the HotSpot JVM uses a trick to compress object pointers when heaps are less than around 32 GB.

In Java, all objects are allocated on the heap and referenced by a pointer. Ordinary object pointers (OOP) point at these objects, and are traditionally the size of the CPU’s native word: either 32 bits or 64 bits, depending on the processor. The pointer references the exact byte location of the value.

For 32-bit systems, this means the maximum heap size is 4 GB. For 64-bit systems, the heap size can get much larger, but the overhead of 64-bit pointers means there is more wasted space simply because the pointer is larger. And worse than wasted space, the larger pointers eat up more bandwidth when moving values between main memory and various caches (LLC, L1, and so forth).

Java uses a trick called compressed oops to get around this problem. Instead of pointing at exact byte locations in memory, the pointers reference object offsets. This means a 32-bit pointer can reference four billion objects, rather than four billion bytes. Ultimately, this means the heap can grow to around 32 GB of physical size while still using a 32-bit pointer.
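A quick sanity check on that number (my note, not from the book): HotSpot aligns heap objects on 8-byte boundaries by default, so each 32-bit offset addresses 8 bytes, giving 2^32 offsets × 8 bytes = 32 GB of addressable heap.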

Once you cross that magical ~32 GB boundary, the pointers switch back to ordinary object pointers. The size of each pointer grows, more CPU-memory bandwidth is used, and you effectively lose memory. In fact, it takes until around 40–50 GB of allocated heap before you have the same effective memory of a heap just under 32 GB using compressed oops.

The moral of the story is this: even when you have memory to spare, try to avoid crossing the 32 GB heap boundary. It wastes memory, reduces CPU performance, and makes the GC struggle with large heaps.

Just how far under 32gb should I set the JVM?

Unfortunately, that depends. The exact cutoff varies by JVMs and platforms. If you want to play it safe, setting the heap to 31 GB is likely safe. Alternatively, you can verify the cutoff point for the HotSpot JVM by adding -XX:+PrintFlagsFinal to your JVM options and checking that the value of the UseCompressedOops flag is true. This will let you find the exact cutoff for your platform and JVM.

For example, here we test a Java 1.7 installation on Mac OS X and see the max heap size is around 32600mb (~31.83gb) before compressed pointers are disabled:

$ JAVA_HOME=`/usr/libexec/java_home -v 1.7` java -Xmx32600m -XX:+PrintFlagsFinal 2> /dev/null | grep UseCompressedOops
     bool UseCompressedOops   := true
$ JAVA_HOME=`/usr/libexec/java_home -v 1.7` java -Xmx32766m -XX:+PrintFlagsFinal 2> /dev/null | grep UseCompressedOops
     bool UseCompressedOops   = false

In contrast, a Java 1.8 installation on the same machine has a max heap size around 32766mb (~31.99gb):

$ JAVA_HOME=`/usr/libexec/java_home -v 1.8` java -Xmx32766m -XX:+PrintFlagsFinal 2> /dev/null | grep UseCompressedOops
     bool UseCompressedOops   := true
$ JAVA_HOME=`/usr/libexec/java_home -v 1.8` java -Xmx32767m -XX:+PrintFlagsFinal 2> /dev/null | grep UseCompressedOops
     bool UseCompressedOops   = false

The moral of the story is that the exact cutoff to leverage compressed oops varies from JVM to JVM, so take caution when taking examples from elsewhere and be sure to check your system with your configuration and JVM.

Beginning with Elasticsearch v2.2.0, the startup log will actually tell you if your JVM is using compressed OOPs or not. You’ll see a log message like:

[2015-12-16 13:53:33,417][INFO ][env] [Illyana Rasputin] heap size [989.8mb], compressed ordinary object pointers [true]

Which indicates that compressed object pointers are being used. If they are not, the message will say [false].

I Have a Machine with 1 TB RAM!

The 32 GB line is fairly important. So what do you do when your machine has a lot of memory? It is becoming increasingly common to see super-servers with 512–768 GB of RAM.

First, we would recommend avoiding such large machines (see [hardware]).

But if you already have the machines, you have three practical options:

  • Are you doing mostly full-text search? Consider giving 4-32 GB to Elasticsearch and letting Lucene use the rest of memory via the OS filesystem cache. All that memory will cache segments and lead to blisteringly fast full-text search.

  • Are you doing a lot of sorting/aggregations? Are most of your aggregations on numerics, dates, geo_points and not_analyzed strings? You’re in luck, your aggregations will be done on memory-friendly doc values! Give Elasticsearch somewhere from 4-32 GB of memory and leave the rest for the OS to cache doc values in memory.

  • Are you doing a lot of sorting/aggregations on analyzed strings (e.g. for word-tags, or SigTerms, etc)? Unfortunately that means you’ll need fielddata, which means you need heap space. Instead of one node with a huge amount of RAM, consider running two or more nodes on a single machine. Still adhere to the 50% rule, though.

    So if your machine has 128 GB of RAM, run two nodes each with just under 32 GB. This means that less than 64 GB will be used for heaps, and more than 64 GB will be left over for Lucene.

    If you choose this option, set cluster.routing.allocation.same_shard.host: true in your config. This will prevent a primary and a replica shard from colocating to the same physical machine (since this would remove the benefits of replica high availability).
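As a minimal sketch of what the per-node configuration might look like (node names, data paths, and the 31g heap are illustrative assumptions, not from the original text):

# elasticsearch.yml for the first of two nodes on one 128 GB machine
node.name: node-1
path.data: /data/es-node-1
cluster.routing.allocation.same_shard.host: true

# start each node with its own just-under-32 GB heap, e.g.:
# ES_HEAP_SIZE=31g ./bin/elasticsearch

The second node would get its own node.name and path.data so the two processes do not share a data directory.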

Swapping Is the Death of Performance

It should be obvious, but it bears spelling out clearly: swapping main memory to disk will crush server performance. Think about it: an in-memory operation is one that needs to execute quickly.

If memory swaps to disk, a 100-microsecond operation becomes one that takes 10 milliseconds. Now repeat that increase in latency for all other 10us operations. It isn’t difficult to see why swapping is terrible for performance.

The best thing to do is disable swap completely on your system. This can be done temporarily:

sudo swapoff -a

To disable it permanently, you’ll likely need to edit your /etc/fstab. Consult the documentation for your OS.
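On most Linux distributions this amounts to commenting out the swap entry in /etc/fstab (the device name below is an illustrative placeholder):

# in /etc/fstab, comment out the swap line so it is not activated at boot:
# /dev/sda2  none  swap  sw  0  0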

If disabling swap completely is not an option, you can try to lower swappiness. This value controls how aggressively the OS tries to swap memory. A low value prevents swapping under normal circumstances, but still allows the OS to swap under emergency memory situations.

For most Linux systems, this is configured using the sysctl value:

vm.swappiness = 1 (1)
  1. A swappiness of 1 is better than 0, since on some kernel versions a swappiness of 0 can invoke the OOM-killer.
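To apply it, something like the following works on most distributions (the location of the persistent config file can vary):

# apply immediately
sudo sysctl -w vm.swappiness=1
# persist across reboots
echo 'vm.swappiness = 1' | sudo tee -a /etc/sysctl.conf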

Finally, if neither approach is possible, you should enable mlockall. This allows the JVM to lock its memory and prevent it from being swapped by the OS. In your elasticsearch.yml, set this:

bootstrap.mlockall: true
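After a restart, you can check whether the lock actually took effect through the nodes info API (a sketch, assuming a local node; if mlockall reports false, the memlock ulimit for the ES user usually needs to be raised):

curl -s 'http://localhost:9200/_nodes/process?pretty' | grep mlockall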
