Apache Ignite Persistence

Ignite persistence and durable memory
1. How persistence works
The key points of Ignite persistence are as follows:

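Persistence prevents data loss when memory overflows;
Persistence is configurable, so it can be enabled only where it is needed;
Persistence solves the slow startup of Ignite nodes that hold a large amount of cached data;
With persistence enabled, Ignite can store massive amounts of data;
Once persistence is enabled, the cluster has to be activated manually;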
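A key element of persistence is the WAL, the write-ahead log, whose purpose is to keep write performance high while persistence is on. It works as follows:

In Ignite, an operation on in-memory data is not synchronized to the persistence files (partition files) immediately. It is first recorded in the write-ahead log (WAL); the checkpointing thread then writes the dirty pages held in memory to the partition files and deletes the WAL data that is no longer needed.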

dirty page: a page whose update is recorded in the WAL but has not yet been written to the partition files. When dirty pages reach 2/3 of the in-memory data, a checkpoint is triggered.

checkpoint: the process that synchronizes in-memory data to the partition files. After a checkpoint finishes, the WAL is archived and a new WAL file is started.
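The write path described above can be sketched as a toy model. The class below is not Ignite code: `WalSketch`, its fields, and the in-memory maps standing in for the WAL and the partition files are all assumptions made for illustration, and the 2/3 trigger simply mirrors the ratio quoted above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of write-ahead logging with checkpoints; not Ignite's implementation. */
public class WalSketch {
    private final int totalPages;                                        // number of pages held in memory
    private final Map<Integer, String> memory = new HashMap<>();         // in-memory pages
    private final Map<Integer, String> partitionFile = new HashMap<>();  // stands in for the partition files
    private final List<String> wal = new ArrayList<>();                  // stands in for the current WAL file
    private final Set<Integer> dirtyPages = new HashSet<>();             // updated but not yet checkpointed
    private int checkpoints = 0;

    public WalSketch(int totalPages) {
        this.totalPages = totalPages;
    }

    /** Logs the update first, then changes memory and marks the page dirty. */
    public void put(int pageId, String value) {
        wal.add(pageId + "=" + value);
        memory.put(pageId, value);
        dirtyPages.add(pageId);
        // Trigger a checkpoint once dirty pages reach 2/3 of all pages.
        if (dirtyPages.size() * 3 >= totalPages * 2) {
            checkpoint();
        }
    }

    /** Copies dirty pages to the partition file, then starts with an empty WAL. */
    public void checkpoint() {
        for (int pageId : dirtyPages) {
            partitionFile.put(pageId, memory.get(pageId));
        }
        dirtyPages.clear();
        wal.clear();
        checkpoints++;
    }

    public int walSize() { return wal.size(); }
    public int dirtyCount() { return dirtyPages.size(); }
    public int checkpointCount() { return checkpoints; }
    public String persisted(int pageId) { return partitionFile.get(pageId); }

    public static void main(String[] args) {
        WalSketch store = new WalSketch(6);
        for (int i = 0; i < 3; i++) {
            store.put(i, "v" + i);
        }
        System.out.println("dirty=" + store.dirtyCount() + " wal=" + store.walSize());
        store.put(3, "v3"); // 4 of 6 pages are now dirty, so a checkpoint runs
        System.out.println("checkpoints=" + store.checkpointCount() + " wal=" + store.walSize());
    }
}
```

In this model nothing reaches the partition file until the threshold is hit, which is why a crash before the checkpoint would have to be recovered by replaying the log.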

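The WAL prevents data loss in extreme cases such as a power failure or a process crash. If the WAL holds too much data, however, Ignite has to read it back at startup, which slows startup down, because reading from the WAL is far slower than reading from the partition files. The 'writeThrottlingEnabled' option of the data storage configuration can mitigate this. Beyond that, the number of checkpoint threads and the checkpoint frequency can be tuned to make the WAL more efficient. The relevant configuration is shown below: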

......
 <!-- Checkpointing frequency (ms): the minimal interval at which dirty pages are written to the persistent store. -->
 <property name="checkpointFrequency" value="180000"/>

 <!-- Number of threads for checkpointing. -->
 <property name="checkpointThreads" value="4"/>

 <!-- Number of checkpoints to be kept in WAL after a checkpoint is finished. -->
 <property name="walHistorySize" value="20"/>
......
</bean>

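The WAL has several modes to choose from: it can be turned off, or run in a strict synchronous mode that guarantees no data loss even under harsh conditions. It is set as follows:

<!-- Set the WAL mode for persistence. -->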
<property name="walMode">
    <util:constant static-field="org.apache.ignite.configuration.WALMode.DEFAULT"/>
</property>
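  1. Enabling persistence through configuration:
    Ignite's storage has the concept of a data region. Every cache uses Default_Region by default, but you can define custom data regions and assign one when defining a cache, which makes persistence selective. For example, caches with little data do not need persistence, while tables that are large and still growing need it to avoid running out of memory. Defining custom data regions separates the two kinds of cache and gives each its own persistence setting.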
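The region split described above might look like the following in Java. This is a minimal sketch rather than the author's configuration: it needs ignite-core on the classpath, and the class name `RegionConfigSketch`, the region name "Persistent_Region", the cache names, and the sizes are assumptions chosen for illustration ("500MB_Region" is the region name this article refers to later).

```java
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class RegionConfigSketch {
    static IgniteConfiguration build() {
        // Region for large, growing data: persistence switched on.
        DataRegionConfiguration persistent = new DataRegionConfiguration()
            .setName("Persistent_Region")              // hypothetical name
            .setMaxSize(4L * 1024 * 1024 * 1024)       // 4 GB, illustrative
            .setPersistenceEnabled(true);

        // Region for small data sets: kept in memory only.
        DataRegionConfiguration inMemory = new DataRegionConfiguration()
            .setName("500MB_Region")
            .setMaxSize(500L * 1024 * 1024)
            .setPersistenceEnabled(false);

        DataStorageConfiguration storageCfg = new DataStorageConfiguration()
            .setDataRegionConfigurations(persistent, inMemory);

        // Each cache is bound to a region by name (cache names are hypothetical).
        CacheConfiguration<Long, String> bigCache =
            new CacheConfiguration<Long, String>("bigCache")
                .setDataRegionName("Persistent_Region");
        CacheConfiguration<Long, String> smallCache =
            new CacheConfiguration<Long, String>("smallCache")
                .setDataRegionName("500MB_Region");

        return new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg)
            .setCacheConfiguration(bigCache, smallCache);
    }
}
```

Only the cache bound to the persistent region is written to disk; the other one behaves as a pure in-memory cache.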

XML configuration:

    <!-- Set the WAL mode for persistence. -->
    <property name="walMode">
        <util:constant static-field="org.apache.ignite.configuration.WALMode.DEFAULT"/>
    </property>

    <!-- Storage path for the persistence files. -->
    <!-- <property name="storagePath" value="D:\\Test\\db" /> -->
    <property name="storagePath" value="/data/local/db" />

    <!-- WAL storage path. -->
    <!-- <property name="walPath" value="D:\\Test\\db\\wal" /> -->
    <property name="walPath" value="/data/local/db/wal" />

    <!-- WAL archive path. -->
    <!-- <property name="walArchivePath" value="D:\\Test\\db\\wal\\archive" /> -->
    <property name="walArchivePath" value="/data/local/db/wal/archive" />

</bean>
Java configuration:
private static final String usrDir = System.getProperty("user.dir");
private static final String separator = File.separator;
private static final String DB = "db";
private static final String WAL = "wal";
private static final String ARCHIVE = "archive";

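/** Set the consistent ID. */
igniteCfg.setConsistentId("ABC");

/* Ignite persistence configuration. */
// Assumes igniteCfg already carries a DataStorageConfiguration (for example one loaded from XML);
// getDataStorageConfiguration() returns null otherwise.
DataStorageConfiguration dcfg = igniteCfg.getDataStorageConfiguration();
dcfg.getDefaultDataRegionConfiguration()
    .setMaxSize(4L * 1024 * 1024 * 1024) // maximum memory available to the default region
    .setPersistenceEnabled(true);        // enable persistence for the default region
// Set the persistence paths.
dcfg.setStoragePath(String.format("%s%s%s", usrDir, separator, DB));
dcfg.setWalPath(String.format("%s%s%s%s%s", usrDir, separator, DB, separator, WAL));
dcfg.setWalArchivePath(String.format("%s%s%s%s%s%s%s", usrDir, separator, DB, separator, WAL, separator, ARCHIVE));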
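As a possible alternative to the chained String.format calls, the same three directories can be assembled with java.nio.file.Paths, which inserts the platform separator itself. The class `StoragePaths` and the helper `under` are hypothetical names introduced for this sketch; only the directory names db, wal and archive come from the configuration above.

```java
import java.nio.file.Paths;

public class StoragePaths {
    /** Joins a base directory and any number of sub-directories with the platform separator. */
    public static String under(String base, String... parts) {
        return Paths.get(base, parts).toString();
    }

    public static void main(String[] args) {
        String usrDir = System.getProperty("user.dir");
        // The three locations used above: storage, WAL and WAL archive.
        System.out.println(under(usrDir, "db"));
        System.out.println(under(usrDir, "db", "wal"));
        System.out.println(under(usrDir, "db", "wal", "archive"));
    }
}
```

The returned strings could then be passed to setStoragePath, setWalPath and setWalArchivePath in place of the formatted ones.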
Notes:

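1.1 Why set consistentId:

By default, when a node restarts, Ignite generates a random, globally unique consistentId, and the persistence directory on disk is keyed by consistentId. After a restart the node can therefore no longer read the persistence files in its previous directory. With a fixed consistentId, the node keeps using the same directory and its earlier persistence files.

If several nodes are started on one host, each node process keeps its persistence files in a predefined unique subdirectory, such as ${IGNITE_HOME}/work/db/node{IDX}-{UUID}, where IDX and UUID are computed automatically by Ignite at node startup. If the persistence hierarchy already contains several node{IDX}-{UUID} subdirectories, they are assigned to nodes in first-in, first-out order. To make sure a node keeps a dedicated directory and dedicated data partitions even across restarts, set a cluster-wide unique value with IgniteConfiguration.setConsistentId; this unique ID is mapped to the UUID in the node{IDX}-{UUID} string.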

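1.2 Using a custom data region:

With the default configuration, a cache uses the default data region (defaultDataRegionConfiguration). You can also define a custom data region, such as the "500MB_Region" defined in the configuration file above. This separates data that needs persistence from data that does not, but a cache that uses a custom data region needs one extra property:

... ...
// Creating a cache configuration.
CacheConfiguration cacheCfg = new CacheConfiguration();
// Binding the cache to the earlier defined region.
cacheCfg.setDataRegionName("500MB_Region");

1.3 The cluster must be activated manually once persistence is enabled: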

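Cluster activation
Note that when Ignite persistence is enabled, the cluster is inactive by default and no CRUD operation can be performed. The user has to activate the cluster manually, as described below.

Ways to activate the cluster:

a. From code: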

    // Activating the cluster once all the cluster nodes are up and running.
    if(!ignite.active()) {
        ignite.active(true);  // activate the cluster if it is not active yet
    }

b. Web console:

[Image: activating the cluster from the web console]

c. From the command line:

On the command line, use the control.sh|bat script in the $IGNITE_HOME/bin folder, for example:

.sh:

./control.sh --activate

.bat:

control.bat --activate
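1.4 Ignite's destroyCache() method also removes the persistence files.

destroyCache deletes the cache's persistence files, but the persisted cache configuration is not removed, so an empty cache shows up after a restart. To change a cache's configuration dynamically, destroy the cache first and then make the adjustment.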

  1. Persistence tests:
    Disk space taken by persistence, and how much persistence speeds up node startup:

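| Data volume | Disk usage | Startup without persistence | Startup with persistence |
| --- | --- | --- | --- |
| 3.5 million (production data) | partition files 5.6 GB, WAL 7.0 GB | 2 min | 39 s |
| 24 million (local data) | partition files 6.76 GB, WAL 12.1 GB | … (out of memory) | 12 s |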
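To put the measurements above side by side, the short sketch below adds up the disk usage for each data set and computes the startup speedup for the production data. The class and method names are hypothetical; the input numbers are the ones reported in the table.

```java
import java.util.Locale;

public class StartupNumbers {
    /** Total disk footprint: partition files plus WAL, in GB. */
    public static double totalGb(double partitionGb, double walGb) {
        return partitionGb + walGb;
    }

    /** How many times faster startup is with persistence. */
    public static double speedup(double beforeSeconds, double afterSeconds) {
        return beforeSeconds / afterSeconds;
    }

    public static void main(String[] args) {
        // Production data: 5.6 GB partition files + 7.0 GB WAL, 2 min vs 39 s.
        System.out.println(String.format(Locale.ROOT, "production: %.2f GB on disk, %.2fx faster startup",
            totalGb(5.6, 7.0), speedup(120, 39)));
        // Local data: 6.76 GB partition files + 12.1 GB WAL; no baseline, the non-persistent run ran out of memory.
        System.out.println(String.format(Locale.ROOT, "local: %.2f GB on disk", totalGb(6.76, 12.1)));
    }
}
```

For the production data this works out to about 12.6 GB on disk and a startup roughly three times faster; in both rows the WAL takes more space than the partition files.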
