EHCache: Applying and Choosing Caches

Latest recommended article published on 2022-03-19 22:52:24

weixin_34242509

Tags: java, runtime, development tools

Original link: https://segmentfault.com/a/1190000010910630


http://www.ehcache.org/docume...

Ehcache Tiering Options
Introduction
Ehcache supports the concept of tiered caching. This section covers the different available configuration options. It also explains rules and best practices to benefit the most from tiered caching.

Moving out of heap
As soon as a cache has a tier other than heap, two things happen:

Adding a mapping to the cache means that the key and value have to be serialized.

Reading a mapping from the cache means that the key and value may have to be deserialized.

Given these two points, the binary representation of the data, and how the data is converted to and from it, plays a significant role in caching performance. Make sure you know about the options available in Ehcache 3. It also means that some configurations, while making sense on paper, may not offer the best performance for the real use case of the application.
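
To make the serialization cost concrete, here is a minimal sketch using plain java.io serialization. It does not use Ehcache or its serializers; the class and method names (RoundTrip, toBytes, fromBytes) are hypothetical and chosen only for illustration. It shows that a value leaving the heap becomes bytes, and that reading it back produces a distinct copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class RoundTrip {
    /** Serializes a value to bytes, as a non-heap tier has to do on write. */
    public static byte[] toBytes(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** Deserializes bytes back into an object, as may happen on read. */
    public static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        String value = "cached value";
        byte[] binary = toBytes(value);
        Object copy = fromBytes(binary);
        System.out.println("serialized size: " + binary.length + " bytes");
        System.out.println("equal content: " + value.equals(copy));
        System.out.println("same instance: " + (value == copy));
    }
}
```

The size of the binary form and the time spent in these two methods are exactly what a more compact or faster serializer would reduce.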

Single tier setups
All tiering options can be used in isolation. That means you can have caches with data only in offheap or only clustered for example.

The following possibilities are valid configurations:

heap

offheap

disk

clustered

For this, simply define the single resource in the cache configuration.

CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
    ResourcePoolsBuilder.newResourcePoolsBuilder().offheap(2, MemoryUnit.GB)).build();

Start by defining the key and value types in the configuration builder.
Then specify the resource (the tier) you want to use. Here we use off-heap only.
Heap

Heap is the starting point of every cache and also the fastest tier, since no serialization is necessary. You can optionally use copiers to pass keys and values by value; the default is by reference.
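
The difference between by-reference and by-value storage can be illustrated without Ehcache. The sketch below is an assumption-laden analogy: a plain HashMap stands in for the heap tier, and Arrays.copyOf stands in for what a copier would do. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class ByReferenceVsByValue {
    /**
     * Stores an array in a map, mutates the caller's array afterwards,
     * and returns the first element as seen through the map.
     */
    public static int firstElementAfterMutation(boolean copyOnPut) {
        Map<String, int[]> store = new HashMap<>();
        int[] value = {1, 2, 3};
        // By value: store a defensive copy. By reference: store the same instance.
        store.put("key", copyOnPut ? Arrays.copyOf(value, value.length) : value);
        value[0] = 99; // the caller keeps mutating its own object
        return store.get("key")[0];
    }

    public static void main(String[] args) {
        System.out.println("by reference: " + firstElementAfterMutation(false));
        System.out.println("by value:     " + firstElementAfterMutation(true));
    }
}
```

With by-reference storage the later mutation is visible through the map (99), while the copied entry keeps its original content (1). That isolation is what copiers buy, at the price of an extra copy on each access.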

A heap tier can be sized by entries or by size.

ResourcePoolsBuilder.newResourcePoolsBuilder().heap(10, EntryUnit.ENTRIES);
// or
ResourcePoolsBuilder.newResourcePoolsBuilder().heap(10);
// or
ResourcePoolsBuilder.newResourcePoolsBuilder().heap(10, MemoryUnit.MB);

The first form allows only 10 entries on heap. Eviction will occur when full.
The second form is a shortcut to specify 10 entries.
The third form allows only 10 MB. Eviction will occur when full.
Byte-sized heap

For every tier except the heap, calculating the size of the cache is fairly easy: you more or less sum the sizes of all the byte buffers containing the serialized entries.

When heap is limited by size instead of entries, it is a bit more complicated.

Byte sizing has a runtime performance impact that depends on the size and graph complexity of the data cached.
CacheConfiguration<Long, String> usesConfiguredInCacheConfig = CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
        ResourcePoolsBuilder.newResourcePoolsBuilder()
            .heap(10, MemoryUnit.KB)
            .offheap(10, MemoryUnit.MB))
    .withSizeOfMaxObjectGraph(1000)
    .withSizeOfMaxObjectSize(1000, MemoryUnit.B)
    .build();

CacheConfiguration<Long, String> usesDefaultSizeOfEngineConfig = CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
        ResourcePoolsBuilder.newResourcePoolsBuilder()
            .heap(10, MemoryUnit.KB))
    .build();

CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder()
    .withDefaultSizeOfMaxObjectSize(500, MemoryUnit.B)
    .withDefaultSizeOfMaxObjectGraph(2000)
    .withCache("usesConfiguredInCache", usesConfiguredInCacheConfig)
    .withCache("usesDefaultSizeOfEngine", usesDefaultSizeOfEngineConfig)
    .build(true);

heap(10, MemoryUnit.KB) limits the amount of memory used by the heap tier for storing key-value pairs. There is a cost associated with sizing objects.
The sizing settings are only used by the heap tier, so the off-heap tier won’t use them at all.
The sizing can be further restrained by two additional configuration settings: withSizeOfMaxObjectGraph specifies the maximum number of objects to traverse while walking the object graph, and withSizeOfMaxObjectSize defines the maximum size of a single object. If the sizing goes above either of these two limits, the entry won’t be stored in the cache.
A default configuration can be provided at the CacheManager level (withDefaultSizeOfMaxObjectSize and withDefaultSizeOfMaxObjectGraph). Caches use it unless they define their own settings explicitly.
Off-heap

If you wish to use off-heap, you’ll have to define a resource pool, giving the memory size you want to allocate.

ResourcePoolsBuilder.newResourcePoolsBuilder().offheap(10, MemoryUnit.MB);
Only 10 MB allowed off-heap. Eviction will occur when full.
The example above allocates a very small amount of off-heap. You will normally use a much bigger space.

Remember that data stored off-heap will have to be serialized and deserialized - and is thus slower than heap.

You should thus favor off-heap for large amounts of data where on-heap would have too severe an impact on garbage collection.

Do not forget to set the -XX:MaxDirectMemorySize option in the Java options, according to the off-heap size you intend to use.
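
One way to catch a missing or too-small -XX:MaxDirectMemorySize early is a startup check. The following is only a sketch under stated assumptions: the class DirectMemoryCheck, its methods, and the 2g target size are hypothetical, and the parser handles just the k, m and g suffixes. It reads the JVM arguments through the standard RuntimeMXBean.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Locale;

public class DirectMemoryCheck {
    /** Parses a JVM size such as "512m" or "2g" into bytes (k, m, g suffixes only). */
    public static long parseSize(String text) {
        String s = text.trim().toLowerCase(Locale.ROOT);
        long multiplier = 1L;
        char last = s.charAt(s.length() - 1);
        if (last == 'k') multiplier = 1024L;
        else if (last == 'm') multiplier = 1024L * 1024;
        else if (last == 'g') multiplier = 1024L * 1024 * 1024;
        if (multiplier != 1L) s = s.substring(0, s.length() - 1);
        return Long.parseLong(s) * multiplier;
    }

    /** Returns the configured limit in bytes, or -1 if the flag is absent. */
    public static long configuredLimit(List<String> jvmArgs) {
        String prefix = "-XX:MaxDirectMemorySize=";
        long limit = -1L;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                limit = parseSize(arg.substring(prefix.length()));
            }
        }
        return limit;
    }

    public static void main(String[] args) {
        long intendedOffHeap = parseSize("2g"); // hypothetical off-heap pool size
        long limit = configuredLimit(ManagementFactory.getRuntimeMXBean().getInputArguments());
        if (limit < 0) {
            System.out.println("-XX:MaxDirectMemorySize is not set");
        } else if (limit < intendedOffHeap) {
            System.out.println("limit is smaller than the intended off-heap size");
        } else {
            System.out.println("limit covers the intended off-heap size");
        }
    }
}
```

Launched with, for example, java -XX:MaxDirectMemorySize=3g DirectMemoryCheck, the check reports that the limit covers the intended size.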

Disk

As you might have guessed, disk tier means the data is stored on disk. The faster and more dedicated the disk is, the faster accessing the data will be.

PersistentCacheManager persistentCacheManager = CacheManagerBuilder.newCacheManagerBuilder()
    .with(CacheManagerBuilder.persistence(new File(getStoragePath(), "myData")))
    .withCache("persistent-cache", CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
        ResourcePoolsBuilder.newResourcePoolsBuilder().disk(10, MemoryUnit.MB, true))
    )
    .build(true);

persistentCacheManager.close();

To use disk storage, you have to provide a location where data should be stored, which is what CacheManagerBuilder.persistence(...) does.
Doing this returns a PersistentCacheManager, which is a normal CacheManager with the additional ability to destroy caches.
The disk(...) call defines a resource pool for the disk that will be used by the cache. The third parameter is a boolean that sets whether the disk pool is persistent: when set to true, the pool is persistent. When the two-parameter version disk(long, MemoryUnit) is used, the pool is not persistent.
The example above allocates a very small amount of disk storage. You will normally use much more.

Persistence means the cache will survive a JVM restart. Everything that was in the cache will still be there after restarting the JVM and creating a CacheManager with disk persistence at the same location.

A disk tier can’t be shared between cache managers. A persistence directory is dedicated to one cache manager at a time.
Remember that data stored on disk has to be serialized / deserialized and written to / read from disk, and is thus slower than heap and off-heap. So disk storage is interesting if:

You have a large amount of data that can’t fit off-heap

Your disk is much faster than the storage it is caching

You are interested in persistence

Ehcache 3 only offers persistence in the case of clean shutdowns (close() was called). If the JVM crashes there is no data integrity guarantee. At restart, Ehcache will detect that the CacheManager wasn’t cleanly closed and will wipe the disk storage before using it.
Clustered

A clustered tier means the client connects to a remote Terracotta server where the cached data is stored. It is also a way to share a cache between JVMs. Clustering opens up so many new possibilities that it has its own section in the documentation.

Multiple tiers setups
The moment you want to use more than one tier, you have to observe some constraints.

1. There must always be a heap tier in a multi-tier setup.

2. You cannot combine disk and clustered tiers.

3. Tiers should be sized in a pyramidal fashion.

For 1, this is a limitation of the current implementation.

For 2, this is a design decision, as having two tiers with content that can outlive a single JVM raises all kinds of interesting consistency questions on restart.

Figure 1. Tiers hierarchy
For 3, the idea is that tiers are related to each other. If you picture the fastest tier, heap, on top and the slower tiers below it, you can see the pyramid. It comes from the fact that heap is more constrained than the total memory of the machine, and memory in turn is more constrained than disk or the memory available on the cluster. The Ehcache implementation takes this into account.

This means that when tiers are sized in the same units, that is, memory quantity, validation of the configuration will fail if an upper tier is larger than or equal to a lower tier. While we cannot verify that a count-based sizing for heap will not be larger than a byte sizing for another tier, you should make sure that it is not during testing.
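
The pyramid rule can be sketched in a few lines of plain Java. This is not Ehcache's actual validation code; PyramidCheck and validate are hypothetical names, and the sketch assumes every tier is sized in bytes and listed from the top tier down.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PyramidCheck {
    /** Rejects a configuration where an upper tier is larger than or equal to a lower one. */
    public static void validate(Map<String, Long> tiersTopDown) {
        String upperName = null;
        long upperSize = 0L;
        for (Map.Entry<String, Long> tier : tiersTopDown.entrySet()) {
            if (upperName != null && upperSize >= tier.getValue()) {
                throw new IllegalArgumentException(
                    upperName + " (" + upperSize + " B) must be smaller than "
                        + tier.getKey() + " (" + tier.getValue() + " B)");
            }
            upperName = tier.getKey();
            upperSize = tier.getValue();
        }
    }

    public static void main(String[] args) {
        Map<String, Long> tiers = new LinkedHashMap<>(); // insertion order = top to bottom
        tiers.put("heap", 10L * 1024 * 1024);
        tiers.put("offheap", 100L * 1024 * 1024);
        tiers.put("disk", 1024L * 1024 * 1024);
        validate(tiers);
        System.out.println("configuration is pyramidal");
    }
}
```

A heap sized in entries cannot be checked this way, which is why the comparison with byte-sized tiers has to be verified by testing.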

Taking the above into account, the following possibilities are valid configurations:

heap + offheap

heap + offheap + disk

heap + offheap + clustered

heap + disk

heap + clustered

Here is an example using heap, offheap and clustered.

PersistentCacheManager persistentCacheManager = CacheManagerBuilder.newCacheManagerBuilder()
    .with(cluster(CLUSTER_URI).autoCreate())
    .withCache("threeTierCache",
        CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
            ResourcePoolsBuilder.newResourcePoolsBuilder()
                .heap(10, EntryUnit.ENTRIES)
                .offheap(1, MemoryUnit.MB)
                .with(ClusteredResourcePoolBuilder.clusteredDedicated("primary-server-resource", 2, MemoryUnit.MB))
        )
    ).build(true);

cluster(CLUSTER_URI).autoCreate(): cluster-specific information telling how to connect to the Terracotta cluster.
heap(...): the heap tier, our closest caching tier.
offheap(...): the off-heap tier, next in line as caching tier.
clusteredDedicated(...): the clustered tier, the authoritative tier for this cache.
Resource pools
Tiers are configured using resource pools, most of the time through a ResourcePoolsBuilder. Let’s revisit an example from the Getting Started guide.

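PersistentCacheManager persistentCacheManager = CacheManagerBuilder.newCacheManagerBuilder()
    .with(CacheManagerBuilder.persistence(new File(getStoragePath(), "myData")))
    .withCache("threeTieredCache",
        CacheConfigurationBuilder.newCacheConfigurationBuilder(Long.class, String.class,
            ResourcePoolsBuilder.newResourcePoolsBuilder()
                .heap(10, EntryUnit.ENTRIES)
                .offheap(1, MemoryUnit.MB)
                .disk(20, MemoryUnit.MB, true)
        )
    ).build(true);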
This is a cache using 3 tiers (heap, offheap, disk). They are created and chained using the ResourcePoolsBuilder. The declaration order doesn’t matter (e.g. offheap can be declared before heap) because each tier has a height. The greater the height, the closer the tier is to the client.

It is really important to understand that a resource pool only specifies a configuration. It is not an actual pool that can be shared between caches. Consider this code, for instance:

ResourcePools pool = ResourcePoolsBuilder.newResourcePoolsBuilder().heap(10).build();

CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder()
    .withCache("test-cache1", CacheConfigurationBuilder.newCacheConfigurationBuilder(Integer.class, String.class, pool))
    .withCache("test-cache2", CacheConfigurationBuilder.newCacheConfigurationBuilder(Integer.class, String.class, pool))
    .build(true);

You will end up with two caches that can contain 10 entries each, not a shared pool of 10 entries. Pools are never shared between caches. The exception is clustered caches, which can be shared or dedicated.

Update ResourcePools

Limited size adjustment can be performed on a live cache.

updateResourcePools() only allows you to change the heap tier sizing, not the pool type. Thus you can’t change the sizing of off-heap or disk tiers.

ResourcePools pools = ResourcePoolsBuilder.newResourcePoolsBuilder().heap(20L, EntryUnit.ENTRIES).build();
cache.getRuntimeConfiguration().updateResourcePools(pools);
assertThat(cache.getRuntimeConfiguration().getResourcePools()
    .getPoolForResource(ResourceType.Core.HEAP).getSize(), is(20L));

First create a new ResourcePools object with resources of the required size, using ResourcePoolsBuilder.
Then pass that object to the updateResourcePools(ResourcePools) method of RuntimeConfiguration to trigger the update.
Destroy persistent tiers
The disk and clustered tiers are the two persistent tiers. This means that when the JVM is stopped, all the created caches and their data still exist on disk or on the cluster.

Once in a while, you might want to delete them. That’s why the cache managers for both are PersistentCacheManager instances, which add two useful methods:

destroy()
Destroys everything related to the cache manager (including caches, of course) on the persistent storage. The cache manager must be closed or uninitialized to call this method. Also, for a clustered tier, no other cache manager should currently be connected to the same cache manager server entity.

destroyCache(String cacheName)
Destroys a given cache. The cache shouldn’t be in use by another cache manager.

Architecture
To understand what happens for different cache operations when using multiple tiers, here are two examples. They oversimplify the actual sequence diagrams but still show what is important.

Figure 2. Put
Figure 3. Get
You should then notice the following:

When putting a value into the cache, it goes straight to the authoritative tier

A subsequent get will push the value up into the caching tiers

Of course, as soon as a value is put in the authoritative tier, all caching tiers are invalidated

A full cache miss (the value isn’t on any tier) will always go all the way down to the authoritative tier

The slower your authoritative tier, the slower your puts will be. For normal cache usage, this usually doesn’t matter, since gets are much more frequent than puts. The opposite would mean you probably shouldn’t be using a cache in the first place.
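
The put and get behaviour described above can be mimicked with two maps. This is a deliberately simplified sketch, not Ehcache's implementation: TwoTierSketch and its methods are hypothetical names, there is a single caching tier, and sizing, eviction and serialization are ignored.

```java
import java.util.HashMap;
import java.util.Map;

public class TwoTierSketch<K, V> {
    private final Map<K, V> cachingTier = new HashMap<>();       // fast, on top
    private final Map<K, V> authoritativeTier = new HashMap<>(); // slow, at the bottom

    /** A put goes straight to the authoritative tier and invalidates the caching tier. */
    public void put(K key, V value) {
        authoritativeTier.put(key, value);
        cachingTier.remove(key);
    }

    /** A get that misses the caching tier falls through and pushes the value up. */
    public V get(K key) {
        V value = cachingTier.get(key);
        if (value == null) {
            value = authoritativeTier.get(key);
            if (value != null) {
                cachingTier.put(key, value);
            }
        }
        return value;
    }

    /** Exposed only so the behaviour can be observed from outside. */
    public boolean inCachingTier(K key) {
        return cachingTier.containsKey(key);
    }

    public static void main(String[] args) {
        TwoTierSketch<Long, String> cache = new TwoTierSketch<>();
        cache.put(1L, "one");
        System.out.println("after put, in caching tier: " + cache.inCachingTier(1L));
        System.out.println("get: " + cache.get(1L));
        System.out.println("after get, in caching tier: " + cache.inCachingTier(1L));
    }
}
```

In this sketch every put pays the cost of the bottom map, which mirrors why a slow authoritative tier makes puts slow while repeated gets stay fast.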
