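This is the live launch event for the 12c In-Memory Option that Larry Ellison gave on June 10 (US time).
In-memory databases are now widely accepted. Anyone technical knows that reading and writing data on disk is far slower than doing so in memory, and many in-memory databases already have mature deployment experience behind them. Oracle, the current leader in the database market, was never going to ignore this space: it has long been promoting how powerful the in-memory component of Oracle 12c is, with "incredibly fast" a phrase it uses often.
On July 22 this year, Oracle finally released version 12.1.0.2, and the feature drawing the most attention is of course the In-Memory component. Download: http://www.oracle.com/technetwork/database/enterprise-edition/downloads/database12c-linux-download-2240591.html
In the 12c In-Memory Option (IMO), data is stored in columnar format in a dedicated memory area and is kept compressed. Memory access combined with columnar compression can greatly improve query performance. The figure below is a schematic of the IMO: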
------------------------------------------------------------------
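All rights reserved. Reposting is permitted, but the source address must be credited with a link; otherwise legal liability will be pursued.
If you are reading a repost, please visit the original link for the latest ArcGIS technical articles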
Blog: http://blog.csdn.net/linghe301
------------------------------------------------------------------
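Test environment 1: virtual machine, Linux 5.5, 4 GB memory, Oracle 12.1.0.2. Data: non-spatial data
1: Connect as the sys user and check the values of the in-memory initialization parameters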
[oracle@oracle12c ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Wed Jul 30 23:25:42 2014
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> show parameter inm
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default string
inmemory_force string DEFAULT
inmemory_max_populate_servers integer 0
inmemory_query string ENABLE
inmemory_size big integer 0
inmemory_trickle_repopulate_servers_ integer 1
percent
optimizer_inmemory_aware boolean TRUE
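2: By default the inmemory_size parameter is 0, so the user has to set it manually. Because the change is made with scope=spfile, the instance is restarted for it to take effect.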
SQL> alter system set inmemory_size=2G scope=spfile;
System altered.
SQL> alter system set inmemory_max_populate_servers=2 scope=spfile;
System altered.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 4294967296 bytes
Fixed Size 2932632 bytes
Variable Size 603979880 bytes
Database Buffers 1526726656 bytes
Redo Buffers 13844480 bytes
In-Memory Area 2147483648 bytes
Database mounted.
Database opened.
SQL> show parameter inm
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default string
inmemory_force string DEFAULT
inmemory_max_populate_servers integer 2
inmemory_query string ENABLE
inmemory_size big integer 2G
inmemory_trickle_repopulate_servers_ integer 1
percent
optimizer_inmemory_aware boolean TRUE
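Since the In-Memory Area is reported as part of the SGA in the startup output, it can be worth confirming after the restart that the area was really reserved and that the SGA settings leave room for it. A minimal sketch, assuming a session that can read the V$ views; the LIKE filter on the component name is an assumption based on the label printed in the startup banner above:

```sql
-- Compare the configured in-memory size with the overall SGA settings.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('inmemory_size', 'sga_target', 'sga_max_size');

-- Confirm the area was actually allocated at startup.
-- The filter assumes the same label as the startup banner ("In-Memory Area").
SELECT name, value
  FROM v$sga
 WHERE name LIKE 'In-Memory%';
```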
3: Create an ordinary table and run a baseline test
SQL> create table t1 as select * from dba_objects;
Table created.
SQL> select bytes/1024/1024 from user_segments where segment_name='T1';
BYTES/1024/1024
---------------
12.5
SQL> set timing on
SQL> set time on
01:53:00 SQL> set autot traceonly
01:53:07 SQL> select * from t1;
92177 rows selected.
Elapsed: 00:00:03.51
Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 92177 | 10M| 429 (1)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T1 | 92177 | 10M| 429 (1)| 00:00:01 |
--------------------------------------------------------------------------
Statistics
----------------------------------------------------------
2 recursive calls
0 db block gets
7596 consistent gets
1546 physical reads
0 redo size
12280356 bytes sent via SQL*Net to client
68146 bytes received via SQL*Net from client
6147 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
92177 rows processed
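The statistics show 7596 logical reads (consistent gets) and 1546 physical reads. The table occupies about 12.5 MB in the database.
4: Create the same table and test it with the In-Memory component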
SQL> create table t2 as select * from dba_objects;
Table created.
SQL> set line 200
SQL> alter table t2 inmemory;
Table altered.
SQL> select * from v$inmemory_area;
POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL 1710227456 4194304 DONE 3
64KB POOL 419430400 51314688 DONE 3
SQL> select count(*) from t2;
COUNT(*)
----------
92178
SQL> select * from v$inmemory_area;
POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL 1710227456 8388608 DONE 3
64KB POOL 419430400 51445760 DONE 3
SQL> set timing on
SQL> set time on
01:59:12 SQL> set autot traceonly
01:59:21 SQL> select * from t2;
92178 rows selected.
Elapsed: 00:00:03.42
Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 92178 | 10M| 31 (17)| 00:00:01 |
| 1 | TABLE ACCESS INMEMORY FULL| T2 | 92178 | 10M| 31 (17)| 00:00:01 |
-----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
9 consistent gets
0 physical reads
0 redo size
5016356 bytes sent via SQL*Net to client
68146 bytes received via SQL*Net from client
6147 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
92178 rows processed
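5: Conclusion
From the above we can see that marking a table INMEMORY does not by itself put its data in memory; this can be checked in v$inmemory_area. We need to run some SQL, such as select count(*) from table, to get the table read into memory, after which v$inmemory_area shows the growth. Probably because I had run related tests earlier, both the 1MB pool and the 64KB pool already held some data. The before and after figures let us compare the size of row storage and columnar storage.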
02:12:21 SQL> select (51445760 +8388608 -51314688 -4194304)/1024/1024 MB from dual;
MB
----------
4.125
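For comparison: about 12.5 MB in row format versus about 4 MB in columnar format. With larger data volumes the difference would be even more evident.
We can also see that with the In-Memory component the query needed only 9 logical reads (consistent gets) and no physical reads, which is where the component's efficiency comes from.
Attentive readers will notice a problem, though: the ordinary query took 3.51 seconds and the In-Memory query took 3.42 seconds. The latter's statistics look far better, yet performance did not improve as much as one might expect. Why?
After asking around, the likely explanation is this: In-Memory storage is columnar and compressed, and its strength is analytic operations over particular columns. If you simply fetch all the data, the database has to stitch the column data back into rows, which is extra work compared with ordinary row storage, so it gains little and can even be slower.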
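The kind of statement the column store is designed for touches a few columns and aggregates many rows, rather than returning every column of every row. A sketch of such a comparison on the two test tables follows. OWNER and OBJECT_TYPE are columns of dba_objects; no timings are claimed here because this variant was not part of the test above.

```sql
SET TIMING ON
SET AUTOTRACE TRACEONLY

-- Row-format baseline: T1 was never marked INMEMORY.
SELECT owner, object_type, COUNT(*)
  FROM t1
 GROUP BY owner, object_type;

-- Same aggregate against T2, which should show TABLE ACCESS INMEMORY FULL
-- in the plan once the table has been populated.
SELECT owner, object_type, COUNT(*)
  FROM t2
 GROUP BY owner, object_type;
```

Comparing elapsed time and consistent gets for these two statements is a fairer test of the column store than select * on the whole table.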
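PS: Because this is a virtual machine, results may vary, although both tests were run under the same conditions. Hopefully they still illustrate the point.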
6: Other
Oracle also provides In-Memory views to help users with analysis
v$im_segments
SQL> desc v$im_segments
Name Null? Type
----------------------------------------- -------- ----------------------------
OWNER VARCHAR2(128)
SEGMENT_NAME VARCHAR2(128)
PARTITION_NAME VARCHAR2(128)
SEGMENT_TYPE VARCHAR2(18)
TABLESPACE_NAME VARCHAR2(30)
INMEMORY_SIZE NUMBER
BYTES NUMBER
BYTES_NOT_POPULATED NUMBER
POPULATE_STATUS VARCHAR2(9)
INMEMORY_PRIORITY VARCHAR2(8)
INMEMORY_DISTRIBUTE VARCHAR2(15)
INMEMORY_DUPLICATE VARCHAR2(13)
INMEMORY_COMPRESSION VARCHAR2(17)
CON_ID NUMBER
SQL> select inmemory_size/1024/1024,bytes/1024/1024 from v$im_segments where segment_name='T2';
INMEMORY_SIZE/1024/1024 BYTES/1024/1024
----------------------- ---------------
4.125 12.5
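v$im_segments also shows whether population has finished, which is more direct than comparing two snapshots of v$inmemory_area. The sketch below uses only columns from the description above. The PRIORITY clause is standard In-Memory syntax but was not used in this test, so its effect here is something to verify rather than a measured result.

```sql
-- How much of the in-memory segment is still waiting to be populated?
SELECT segment_name,
       populate_status,
       inmemory_priority,
       bytes_not_populated
  FROM v$im_segments
 WHERE segment_name = 'T2';

-- With the default priority (NONE) a table is populated only after it is
-- first scanned. A higher priority asks Oracle to populate it without
-- waiting for a query.
ALTER TABLE t2 INMEMORY PRIORITY HIGH;
```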
user_tables also gains several INMEMORY-related columns
SQL> desc user_tables
Name Null? Type
----------------------------------------- -------- ----------------------------
TABLE_NAME NOT NULL VARCHAR2(128)
TABLESPACE_NAME VARCHAR2(30)
CLUSTER_NAME VARCHAR2(128)
IOT_NAME VARCHAR2(128)
STATUS VARCHAR2(8)
PCT_FREE NUMBER
PCT_USED NUMBER
INI_TRANS NUMBER
MAX_TRANS NUMBER
INITIAL_EXTENT NUMBER
NEXT_EXTENT NUMBER
MIN_EXTENTS NUMBER
MAX_EXTENTS NUMBER
PCT_INCREASE NUMBER
FREELISTS NUMBER
FREELIST_GROUPS NUMBER
LOGGING VARCHAR2(3)
BACKED_UP VARCHAR2(1)
NUM_ROWS NUMBER
BLOCKS NUMBER
EMPTY_BLOCKS NUMBER
AVG_SPACE NUMBER
CHAIN_CNT NUMBER
AVG_ROW_LEN NUMBER
AVG_SPACE_FREELIST_BLOCKS NUMBER
NUM_FREELIST_BLOCKS NUMBER
DEGREE VARCHAR2(10)
INSTANCES VARCHAR2(10)
CACHE VARCHAR2(5)
TABLE_LOCK VARCHAR2(8)
SAMPLE_SIZE NUMBER
LAST_ANALYZED DATE
PARTITIONED VARCHAR2(3)
IOT_TYPE VARCHAR2(12)
TEMPORARY VARCHAR2(1)
SECONDARY VARCHAR2(1)
NESTED VARCHAR2(3)
BUFFER_POOL VARCHAR2(7)
FLASH_CACHE VARCHAR2(7)
CELL_FLASH_CACHE VARCHAR2(7)
ROW_MOVEMENT VARCHAR2(8)
GLOBAL_STATS VARCHAR2(3)
USER_STATS VARCHAR2(3)
DURATION VARCHAR2(15)
SKIP_CORRUPT VARCHAR2(8)
MONITORING VARCHAR2(3)
CLUSTER_OWNER VARCHAR2(128)
DEPENDENCIES VARCHAR2(8)
COMPRESSION VARCHAR2(8)
COMPRESS_FOR VARCHAR2(30)
DROPPED VARCHAR2(3)
READ_ONLY VARCHAR2(3)
SEGMENT_CREATED VARCHAR2(3)
RESULT_CACHE VARCHAR2(7)
CLUSTERING VARCHAR2(3)
ACTIVITY_TRACKING VARCHAR2(23)
DML_TIMESTAMP VARCHAR2(25)
HAS_IDENTITY VARCHAR2(3)
CONTAINER_DATA VARCHAR2(3)
INMEMORY VARCHAR2(8)
INMEMORY_PRIORITY VARCHAR2(8)
INMEMORY_DISTRIBUTE VARCHAR2(15)
INMEMORY_COMPRESSION VARCHAR2(17)
INMEMORY_DUPLICATE VARCHAR2(13)
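These new columns make it easy to see which tables are marked for the column store. A short sketch against the two tables from test environment 1; the values mentioned in the comment are what one would expect, not output captured from this test:

```sql
-- List the In-Memory attributes of the two test tables.
-- T1 is expected to show INMEMORY = DISABLED and T2 ENABLED.
SELECT table_name,
       inmemory,
       inmemory_priority,
       inmemory_compression
  FROM user_tables
 WHERE table_name IN ('T1', 'T2');
```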
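Test environment 2: IBM W500 laptop, Linux 6.4, 8 GB memory, Oracle 12.1.0.2, ArcSDE 10.3. Data: spatial data stored as ST_Geometry; in-memory size set to 3 GB
1: First look at the data. The polygon dataset subdltb has about 3 million records. The query dataset, query, is also polygon data and contains one large feature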
11:40:28 SQL> select count(*) from subdltb;
COUNT(*)
----------
2999999
Elapsed: 00:00:11.09
11:40:53 SQL> select sde.st_area(shape) from query where objectid=3;
SDE.ST_AREA(SHAPE)
------------------
4.0640E+10
Elapsed: 00:00:00.42
2: Query using the ST_Intersects function of ArcSDE for Oracle, then sum the result
11:41:16 SQL> select sum(a.db2gse_st_) from subdltb a,query b where sde.st_intersects(a.shape,b.shape)=1 and b.objectid=3;
SUM(A.DB2GSE_ST_)
-----------------
4451543224
Elapsed: 00:03:01.04
Execution Plan
----------------------------------------------------------
Plan hash value: 2821153078
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows |Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |4648 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 |4648 | | |
| 2 | NESTED LOOPS | | 27712 |122M | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID | QUERY | 1 |2324 | 0 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | R7_SDE_ROWID_UK | 1 | | 0 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID | SUBDLTB | 27712 |61M | 2 (0)| 00:00:01 |
|* 6 | DOMAIN INDEX (Sel: Default - No Stats)| SHAPE_92247_4_SIDX | | | 18E (0)| |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("B"."OBJECTID"=3)
6 - access("SDE"."ST_INTERSECTS"("A"."SHAPE","B"."SHAPE")=1)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Statistics
----------------------------------------------------------
733910 recursive calls
0 db block gets
835491 consistent gets
88615 physical reads
0 redo size
559 bytes sent via SQL*Net to client
552 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
677 sorts (memory)
0 sorts (disk)
1 rows processed
3: Test with the In-Memory component
Mark the relevant feature classes INMEMORY
SQL> select * from v$inmemory_area;
POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL 2565865472 4194304 DONE 3
64KB POOL 637534208 51642368 DONE 3
SQL> alter table query inmemory;
Table altered.
SQL> select count(*) from query;
COUNT(*)
----------
4
SQL> select * from v$inmemory_area;
POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL 2565865472 4194304 DONE 3
64KB POOL 637534208 51642368 DONE 3
SQL> select bytes/1024 KB from user_segments where segment_name='QUERY';
KB
----------
64
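We find that the Query feature class was not loaded into memory. The Oracle documentation notes: Objects that are smaller than 64KB are not populated into memory. The Query data is exactly 64 KB.
Next, put the subdltb data into memory.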
SQL> alter table subdltb inmemory;
Table altered.
SQL> select count(*) from subdltb;
COUNT(*)
----------
2999999
SQL> select * from v$inmemory_area;
POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL 2565865472 849346560 POPULATING 3
64KB POOL 637534208 52297728 POPULATING 3
SQL> /
POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID
-------------------------- ----------- ---------- -------------------------- ----------
1MB POOL 2565865472 2554331136 DONE 3
64KB POOL 637534208 52953088 DONE 3
As we can see, populate_status in v$inmemory_area initially shows POPULATING: writing 3 million records into memory takes some time, so the status changes to DONE only after a short wait. Next, look at v$im_segments
SQL> select INMEMORY_SIZE/1024/1024,bytes/1024/1024 from v$im_segments where segment_name='SUBDLTB';
INMEMORY_SIZE/1024/1 BYTES/1024/1024
-------------------- ---------------
1182.25 1025
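We find that for data stored as ST_Geometry, the compressed columnar copy is actually larger than the row-format data (1182.25 MB versus 1025 MB), which is hard to understand.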
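One way to investigate is to try a different compression level and compare the sizes again. The MEMCOMPRESS clause below is standard In-Memory syntax. Whether it helps for ST_Geometry data is not known from this test, and the table has to be repopulated before the new size means anything.

```sql
-- Ask for the highest compression level instead of the default
-- (MEMCOMPRESS FOR QUERY LOW).
ALTER TABLE subdltb INMEMORY MEMCOMPRESS FOR CAPACITY HIGH;

-- Trigger population again with a scan, as in the test above, then wait
-- until population has finished before reading the sizes.
SELECT COUNT(*) FROM subdltb;

-- Compare in-memory size with on-disk size, in MB.
SELECT inmemory_compression,
       populate_status,
       inmemory_size / 1024 / 1024 AS inmemory_mb,
       bytes / 1024 / 1024         AS disk_mb
  FROM v$im_segments
 WHERE segment_name = 'SUBDLTB';
```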
Run the actual query
SQL> set timing on
SQL> set time on
12:36:09 SQL> set autot on
12:36:16 SQL> select sum(a.db2gse_st_) from subdltb a,query b where sde.st_intersects(a.shape,b.shape)=1 and b.objectid=3;
SUM(A.DB2GSE_ST_)
-----------------
4451543224
Elapsed: 00:03:00.14
Execution Plan
----------------------------------------------------------
Plan hash value: 2821153078
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4648 | 2 (0) | 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 4648 | | |
| 2 | NESTED LOOPS | | 27712 | 122M| 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID | QUERY | 1 | 2324 | 0 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | R7_SDE_ROWID_UK | 1 | | 0 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID | SUBDLTB | 27712 | 61M| 2 (0)| 00:00:01 |
|* 6 | DOMAIN INDEX (Sel: Default - No Stats)| SHAPE_92247_4_SIDX | | | 18E (0) | |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("B"."OBJECTID"=3)
6 - access("SDE"."ST_INTERSECTS"("A"."SHAPE","B"."SHAPE")=1)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Statistics
----------------------------------------------------------
734531 recursive calls
0 db block gets
835836 consistent gets
87819 physical reads
0 redo size
559 bytes sent via SQL*Net to client
551 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
455 sorts (memory)
0 sorts (disk)
1 rows processed
12:39:25 SQL> select sum(a.db2gse_st_) from subdltb a,query b where sde.st_intersects(b.shape,a.shape)=1 and b.objectid=3;
SUM(A.DB2GSE_ST_)
-----------------
4451543224
Elapsed: 00:12:46.23
Execution Plan
----------------------------------------------------------
Plan hash value: 209829830
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4648 | 1851 (29)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 4648 | | |
| 2 | NESTED LOOPS | | 27712 | 122M| 1851 (29)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| QUERY | 1 | 2324 | 0 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | R7_SDE_ROWID_UK | 1 | | 0 (0)| 00:00:01 |
|* 5 | TABLE ACCESS INMEMORY FULL | SUBDLTB | 27712 | 61M| 1851 (29)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("B"."OBJECTID"=3)
5 - filter("SDE"."ST_INTERSECTS"("B"."SHAPE","A"."SHAPE")=1)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Statistics
----------------------------------------------------------
1801418 recursive calls
0 db block gets
1124 consistent gets
17 physical reads
0 redo size
559 bytes sent via SQL*Net to client
551 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
1 rows processed
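Seeing this result, I wondered whether putting the spatial index table in memory would help. It turns out the spatial index table does not support In-Memory and raises ORA-64358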
SQL> select index_id from st_geometry_index where table_name='SUBDLTB';
INDEX_ID
----------
2
SQL> alter table s2_idx$ inmemory;
alter table s2_idx$ inmemory
*
ERROR at line 1:
ORA-64358: in-memory column store feature not supported for IOTs
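The error can be anticipated by checking how the table is organized before trying to mark it INMEMORY. A small sketch using the IOT_TYPE column of user_tables, assuming the index table lives in the current schema as in the statement above:

```sql
-- IOT_TYPE is non-NULL for index-organized tables,
-- which the column store rejects with ORA-64358.
SELECT table_name, iot_type
  FROM user_tables
 WHERE table_name = 'S2_IDX$';
```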
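4: Conclusion
As is well known, st_intersects(a,b) takes two arguments: the a argument goes through the spatial index while the b argument is read by a full table scan, so the larger dataset should go in the a position wherever possible. I tested the most efficient form first and found no difference from the ordinary query even with the table in memory: the plan still went through the domain index, and logical reads, physical reads and elapsed time (3:00.14 versus 3:01.04) were all about the same. When I swapped the argument order to take the less efficient path, the plan did show an in-memory full table scan, and logical and physical reads dropped sharply (1124 and 17), yet execution was far slower (12:46.23).
On the in-memory front, Oracle did not adopt the "all in memory" architecture of SAP HANA. Data is handled differently according to its "temperature" across three tiers (traditional disk, flash and memory) rather than all being placed in memory. As Andy Mendelsohn explained, in Oracle Database In-Memory the most active, or hottest, data is placed in memory for analysis; less active data goes to flash (Oracle Database was in fact among the earliest products to embrace flash, which is already used extensively in Exadata); and the coldest, least active data stays on traditional disk. Applying different strategies to data with different needs means customers can get the best performance gain without buying large amounts of memory, which lowers total cost and improves return on investment.
For now, our work with the Oracle In-Memory component is still at the research stage and material on it is scarce. This problem is still being investigated, and I hope to find some effective solutions.
Of course, as a newly released component the Oracle In-Memory Option has not yet been proven in real projects, and bug reports about it are already appearing.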
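To separate the effect of the column store from the effect of argument order, the slower statement can be run again with In-Memory access switched off for the session. A sketch, assuming the inmemory_query parameter listed earlier can be changed at session level; given the timings above, expect the statement to run for many minutes:

```sql
SET TIMING ON
SET AUTOTRACE ON

-- Disable In-Memory scans for this session only.
ALTER SESSION SET inmemory_query = DISABLE;

-- Slow argument order from the test, now using row-format access.
SELECT SUM(a.db2gse_st_)
  FROM subdltb a, query b
 WHERE sde.st_intersects(b.shape, a.shape) = 1
   AND b.objectid = 3;

-- Restore the default afterwards.
ALTER SESSION SET inmemory_query = ENABLE;
```

If the elapsed time is similar with and without the column store, the cost is dominated by evaluating st_intersects row by row rather than by how the table is scanned.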
Oracle: That BUG in our In-Memory Option will be fixed in October
http://www.theregister.co.uk/2014/07/31/oracle_in_memory_bug_fix