Tablespace与Datafile的关系是一对多的关系。一个表空间Tablespace可以对应多个Datafile,Datafile则是只能归属在一个Tablespace里。传统的Oracle管理概念中,倡导一个表空间中创建多个数据文件,特别是创建分布在多个存储磁盘上,以期分散I/O。但是,Oracle10g推出的BigFile Tablespace大文件表空间,将这个概念有所变化。
1、 Bigfile tablespace
在Oracle10g中,推出了Bigfile tablespace的概念。表空间Tablespace从Oracle10g以后就分为两个类型,smallfile tablespace和bigfile tablespace。过去一个表空间对应多个数据文件我们成为Smallfile Tablespace。
10g 数据库在创建的时候,会指定默认的表空间类型。如果不特殊指定的话,默认为 SMALLFILE 类型的表空间。
======================================================================================
SYS@ orcl> select * from database_properties where property_name = 'DEFAULT_TBS_TYPE';
PROPERTY_NAME PROPERTY_VALU DESCRIPTION
------------------------------ ------------- --------------------------
DEFAULT_TBS_TYPE SMALLFILE Default tablespace type
## 可以设置默认的表空间类型。
SQL> ALTER DATABASE SET DEFAULT bigfile TABLESPACE;
======================================================================================
下面我们通过一连串的实验来观察Bigfile Tablespace。
此处,我们通过create bigfile tablespace语句建立Bigfile Tablespace。注意两方面的问题,其一是Bigfile Tablespace必须使用local本地extent管理方式,不允许使用DMT(Dictionary Managed Tablespace)。另一方面是段segment空间使用auto自动方式,不要使用manual。
不过这两个条件在Undo或者临时Bigfile表空间的时候,是允许例外的。
Small Tablespace和Bigfile Tablespace是可以并存的。
===================================================================
SYS@ orcl> select tablespace_name, bigfile from dba_tablespaces;TABLESPACE_NAME BIG
------------------------------ ---
SYSTEM NO
SYSAUX NO
UNDOTBS1 NO
TEMP NO
USERS NO
BTTEST YES
===================================================================
在创建bigfile tablespace的时候,就已经指定了数据文件。如果此时我们尝试加入一个新的数据文件,Oracle系统会报错。
=================================================================================================
SYS@ orcl> alter tablespace bttest add datafile '/oradata/orcl/bttest2.dbf' size 10M autoextend off;
alter tablespace bttest add datafile '/oradata/orcl/bttest2.dbf' size 10M autoextend off
*
ERROR at line 1:
ORA-32771: cannot add file to bigfile tablespace
=================================================================================================
2、Bigfile的相对文件编号
此处我们观察一下Bigfile Tablespace体系下的一些特征。
=======================================================================
SYS@ orcl> select file_name, file_id, relative_fno from dba_data_files;FILE_NAME FILE_ID RELATIVE_FNO
------------------------------ ---------- ------------
/oradata/orcl/users01.dbf 4 4
/oradata/orcl/undotbs01.dbf 3 3
/oradata/orcl/sysaux01.dbf 2 2
/oradata/orcl/system01.dbf 1 1
/oradata/orcl/bttest.dbf 5 1024
=======================================================================
每个文件都包括两个编号,一个是绝对文件编号file_id,另一个是相对文件编号relative_fno。在文件数量较少,或者创建删除文件比较少情况下的数据库,file_id 和relative_fno 是相同的。Relative_fno 是一个循环周期,以1024为一个循环。当file_id依次递增到1024整数倍之后,file_id会继续增加,而relative_fno会形成一个内部循环。
这种机制让我们联想起了Oracle Small Tablespace 的数据文件上限限制,也是1023个文件。我们设想一种极端的情况,如果一个表空间中的数据文件个数超过了1024,也就是一个表空间内的数据文件中,存在相同的relative_fno。这种情况首先是不被允许的。
设置1024的relative_fno 是进行Oracle 数据寻址的需要。这里我们要关注到Oracle定位数据行的rowid 信息,rowid 是Oracle 内部唯一标注一行记录的地址。Rowid 包括四部分组成,对象号 + 文件号 + 块号 + slot行号。Rowid长度固定,所以四部分的长度都是固定的,这里的文件号fno是相对文件编号。这里的查找顺序是,对象号获取所在表空间的编号,在表空间内部使用相对文件号来定位到文件。如果一个表空间内出现相同的relative_fno 文件,定位就不可能了。所以,small tablepsace 的内部文件上限必然是1023。
但是,Bigfile Tablespace存在一些不同之处。如果一个表空间只有一个数据文件且只能拥有一个数据文件,那么relative_fno就失去了存在的意义。所以在dba_data_files 中的relative_fno 列上,bigfile tablespace 对应的datafile 是直接1024。
==========================================================================================
SYS@ orcl> create table t_big tablespace bttest as select * from dba_tables;
Table created.
SYS@ orcl> select dbms_rowid.rowid_object(t.rowid) obj_id, dbms_rowid.rowid_relative_fno(t.rowid,'BIGFILE') relative_fno,
2 dbms_rowid.rowid_block_number(t.rowid) block_num, dbms_rowid.rowid_row_number(t.rowid) rownums,
3 dbms_rowid.rowid_to_absolute_fno(t.rowid,'SYS','T_BIG') abso_fno
4 from t_BIG t where rownum < 4;
OBJ_ID RELATIVE_FNO BLOCK_NUM ROWNUMS ABSO_FNO
---------- ------------ ---------- ---------- ----------
74519 1024 131 0 5
74519 1024 131 1 5
74519 1024 131 2 5
==========================================================================================
借助dbms_rowid包的相应方法,我们也可以看清bigfile tablespace上rowid行的relative_fno信息。
3、Bigfile Datafile大小探索
一个数据文件最大能有多大,在Oracle中有两个因素控制,一个是Oracle内部寻址能够寻找到的最大范围多少个块,另一个是数据库每个块大小是多大。
我们首先来看看Small Datafile的情况。Rowid是一个用16进制表示的18位长度类字符串。其中,前6位表示object_id,之后3位表示文件编号,之后6位表示块号,最后3位表示row的slot编号。
========================================
--语句摘自MOS[ID 1057891.6]
SQL> select substr(rowid,1,6) "object",substr(rowid,7,3) "file",
substr(rowid,10,6) "block",
substr(rowid,16,3) "row"
from test;
object fil block row
------ --- ------ ---
AAABPW AAF AAAAv1 AAA
========================================
在small datafile中,对数据块有6×4=24位二进制表示。Oracle利用中间的22位进行块地址存储。这样,22位二进制可以表示的数据块编号范围为(2^22-1)=4M个数据块。如果按照每个数据块8K的普遍大小计算,这样每个small datafile大小上限就是32G。如果按照每个数据块上限32K计算,这样每个small datafile大小上限就是128G。
在bigfile datafile中,事情有所不同。由于没有relative_fno的问题,这样rowid中就不需要保存relative_fno 的最多1024的数值。这样就节省出10位二进制位给数据块定位,相同长度的rowid 就可以进行32位二进制长度的数据块寻址。每个bigfile datafile 中,最多可以包括到(2^32-1)=4G 个数据块。在数据块大小8K的环境下,最大为32TB数据。如果是数据块大小为32K环境下,文件最大可以到128TB大小。
4、结论和讨论
在本文第一部分中,我们讨论了Bigfile Tablespace 和Small Tablespace 的理论分析,两者理论上最大容量值是一样的。只是Bigfile的单个文件容量上限很大,我们如何看待这种新类型呢?
首先,实际生产环境下,Bigfile Tablespace 在应对大数据量存储的时候略有优势。因为Small Tablespace 在实现相同的存储大小时,要消耗大量的多文件管理开销。同时控制文件为了维护多位置上的文件,协同检查点和文件头SCN等内容,要消耗很大精力。
其次,Small Tablespace的存储量只是理论,很少有系统真的建立上千个文件在一个表空间中。从这个角度看,Bigfile吸引力是不小的。
最后,Oracle对于Bigfile Tablespace支持是全面的。存储层面上,ASM、Logical Volumn Manager/RAID、Dynamically Extensible Logical Volumes 和OMF都提供了比较全面的支持。
但是,Bigfile Tablespace 并不是万灵药。Bigfile 是Oracle 层面的Bigfile,并不涉及到OS层面的支持。使用Bigfile的时候,要选择合适的底层存储系统提供支持。此外,Bigfile的理念违背了我们过去宣称的I/O 分散理论。所以,最好在有底层强大存储支持(如ASM)的情况下使用Bigfile,达到最优的性能平衡点。
根据 Oracle 官方的文档,DB_FILES 和 MAXDATAFILES 这两个参数的值给 SGA 带来的压力会减轻。数据库中最大数据文件数是有限的 (64K files),BFT的出现的确对海量数据库有一定的积极意义。从一定程度上来说,BFT简化了管理多个数据文件的复杂性。
当然在备份大数据文件的时候要采取合理的备份策略,比如我们可以采用multisection backup:
=====================================================================================
run {
allocate channel c1 device type disk;
allocate channel c2 device type disk;
backup section size 100M database format '/u01/backup/full_%D_%s_%p_%T'
plus archivelog format '/u01/backup/arc_%D_%s_%p_%T';
}
released channel: ORA_DISK_1
allocated channel: c1
channel c1: SID=17 device type=DISK
allocated channel: c2
channel c2: SID=38 device type=DISK
Starting backup at 12-JUN-13
current log archived
channel c1: starting archived log backup set
channel c1: specifying archived log(s) in backup set
input archived log thread=1 sequence=38 RECID=72 STAMP=817911054
channel c1: starting piece 1 at 12-JUN-13
channel c2: starting archived log backup set
channel c2: specifying archived log(s) in backup set
input archived log thread=1 sequence=39 RECID=73 STAMP=817911119
input archived log thread=1 sequence=40 RECID=74 STAMP=817911158
channel c2: starting piece 1 at 12-JUN-13
channel c1: finished piece 1 at 12-JUN-13
piece handle=/u01/backup/a_12_85_1_20130612 tag=TAG20130612T133238 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:01
channel c2: finished piece 1 at 12-JUN-13
piece handle=/u01/backup/a_12_86_1_20130612 tag=TAG20130612T133238 comment=NONE
channel c2: backup set complete, elapsed time: 00:00:01
Finished backup at 12-JUN-13
Starting backup at 12-JUN-13
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00001 name=/oradata/orcl/system01.dbf
backing up blocks 1 through 12800
channel c1: starting piece 1 at 12-JUN-13
channel c2: starting full datafile backup set
channel c2: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata/orcl/sysaux01.dbf
backing up blocks 1 through 12800
channel c2: starting piece 1 at 12-JUN-13
channel c1: finished piece 1 at 12-JUN-13
piece handle=/u01/backup/a_12_87_1_20130612 tag=TAG20130612T133240 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:16
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00001 name=/oradata/orcl/system01.dbf
backing up blocks 12801 through 25600
channel c1: starting piece 2 at 12-JUN-13
channel c2: finished piece 1 at 12-JUN-13
piece handle=/u01/backup/a_12_88_1_20130612 tag=TAG20130612T133240 comment=NONE
channel c2: backup set complete, elapsed time: 00:00:16
channel c2: starting full datafile backup set
channel c2: specifying datafile(s) in backup set
input datafile file number=00001 name=/oradata/orcl/system01.dbf
backing up blocks 25601 through 38400
channel c2: starting piece 3 at 12-JUN-13
channel c1: finished piece 2 at 12-JUN-13
piece handle=/u01/backup/a_12_87_2_20130612 tag=TAG20130612T133240 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:16
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00001 name=/oradata/orcl/system01.dbf
backing up blocks 38401 through 51200
channel c1: starting piece 4 at 12-JUN-13
channel c2: finished piece 3 at 12-JUN-13
piece handle=/u01/backup/a_12_87_3_20130612 tag=TAG20130612T133240 comment=NONE
channel c2: backup set complete, elapsed time: 00:00:15
channel c2: starting full datafile backup set
channel c2: specifying datafile(s) in backup set
input datafile file number=00001 name=/oradata/orcl/system01.dbf
backing up blocks 51201 through 64000
channel c2: starting piece 5 at 12-JUN-13
channel c1: finished piece 4 at 12-JUN-13
piece handle=/u01/backup/a_12_87_4_20130612 tag=TAG20130612T133240 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:15
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00001 name=/oradata/orcl/system01.dbf
backing up blocks 64001 through 76800
channel c1: starting piece 6 at 12-JUN-13
channel c2: finished piece 5 at 12-JUN-13
piece handle=/u01/backup/a_12_87_5_20130612 tag=TAG20130612T133240 comment=NONE
channel c2: backup set complete, elapsed time: 00:00:16
channel c2: starting full datafile backup set
channel c2: specifying datafile(s) in backup set
input datafile file number=00001 name=/oradata/orcl/system01.dbf
backing up blocks 76801 through 87040
channel c2: starting piece 7 at 12-JUN-13
channel c1: finished piece 6 at 12-JUN-13
piece handle=/u01/backup/a_12_87_6_20130612 tag=TAG20130612T133240 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:15
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata/orcl/sysaux01.dbf
backing up blocks 12801 through 25600
channel c1: starting piece 2 at 12-JUN-13
channel c2: finished piece 7 at 12-JUN-13
piece handle=/u01/backup/a_12_87_7_20130612 tag=TAG20130612T133240 comment=NONE
channel c2: backup set complete, elapsed time: 00:00:15
channel c2: starting full datafile backup set
channel c2: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata/orcl/sysaux01.dbf
backing up blocks 25601 through 38400
channel c2: starting piece 3 at 12-JUN-13
channel c1: finished piece 2 at 12-JUN-13
piece handle=/u01/backup/a_12_88_2_20130612 tag=TAG20130612T133240 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:16
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata/orcl/sysaux01.dbf
backing up blocks 38401 through 51200
channel c1: starting piece 4 at 12-JUN-13
channel c2: finished piece 3 at 12-JUN-13
piece handle=/u01/backup/a_12_88_3_20130612 tag=TAG20130612T133240 comment=NONE
channel c2: backup set complete, elapsed time: 00:00:16
channel c2: starting full datafile backup set
channel c2: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata/orcl/sysaux01.dbf
backing up blocks 51201 through 64000
channel c2: starting piece 5 at 12-JUN-13
channel c1: finished piece 4 at 12-JUN-13
piece handle=/u01/backup/a_12_88_4_20130612 tag=TAG20130612T133240 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:15
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00002 name=/oradata/orcl/sysaux01.dbf
backing up blocks 64001 through 70400
channel c1: starting piece 6 at 12-JUN-13
channel c2: finished piece 5 at 12-JUN-13
piece handle=/u01/backup/a_12_88_5_20130612 tag=TAG20130612T133240 comment=NONE
channel c2: backup set complete, elapsed time: 00:00:15
channel c2: starting full datafile backup set
channel c2: specifying datafile(s) in backup set
input datafile file number=00003 name=/oradata/orcl/undotbs01.dbf
channel c2: starting piece 1 at 12-JUN-13
channel c1: finished piece 6 at 12-JUN-13
piece handle=/u01/backup/a_12_88_6_20130612 tag=TAG20130612T133240 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:04
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00005 name=/oradata/orcl/bttest.dbf
input datafile file number=00006 name=/oradata/orcl/users02.dbf
input datafile file number=00004 name=/oradata/orcl/users01.dbf
channel c1: starting piece 1 at 12-JUN-13
channel c2: finished piece 1 at 12-JUN-13
piece handle=/u01/backup/a_12_100_1_20130612 tag=TAG20130612T133240 comment=NONE
channel c2: backup set complete, elapsed time: 00:00:03
channel c2: starting full datafile backup set
channel c2: specifying datafile(s) in backup set
channel c1: finished piece 1 at 12-JUN-13
piece handle=/u01/backup/a_12_101_1_20130612 tag=TAG20130612T133240 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:03
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
including current SPFILE in backup set
channel c1: starting piece 1 at 12-JUN-13
including current control file in backup set
channel c2: starting piece 1 at 12-JUN-13
channel c1: finished piece 1 at 12-JUN-13
piece handle=/u01/backup/a_12_103_1_20130612 tag=TAG20130612T133240 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:01
channel c2: finished piece 1 at 12-JUN-13
piece handle=/u01/backup/a_12_102_1_20130612 tag=TAG20130612T133240 comment=NONE
channel c2: backup set complete, elapsed time: 00:00:02
Finished backup at 12-JUN-13
Starting backup at 12-JUN-13
current log archived
channel c1: starting archived log backup set
channel c1: specifying archived log(s) in backup set
input archived log thread=1 sequence=41 RECID=75 STAMP=817911263
channel c1: starting piece 1 at 12-JUN-13
channel c1: finished piece 1 at 12-JUN-13
piece handle=/u01/backup/a_12_104_1_20130612 tag=TAG20130612T133423 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:00
Finished backup at 12-JUN-13
released channel: c1
released channel: c2
=====================================================================================