rely constraint

Stuck in a rut. I’m on constraints this week. It all started with “something about nothing” – complete misunderstandings about NULL values, especially in indexes. Now I’m stuck on constraints. Just today I was giving a seminar session – and one of the topics was “Effective Schema” – all about constraints, using the right types, telling the CBO more information. This is an example I’ve used in that section.

I’ve been asked in the past “Why should I use a primary key? Why not just use a unique index?” Well, the answer is that you could, but doesn’t the fact you used a primary key say something over and above just using a unique index? In fact it does – it can say a lot. The same goes for the use of foreign keys,  NOT NULL constraints and others – they not only protect the data, they add information  about the data into the data dictionary. Using this additional information Oracle is able to perform a query rewrite more often, in many complex cases.

Consider the following small example. We will copy the EMP and DEPT tables from the SCOTT schema and create a materialized view that pre-joins the tables together for us.
ops$tkyte@ORA10GR2> create table emp
2  as
3  select * from scott.emp;
Table created.

ops$tkyte@ORA10GR2> create table dept
2  as
3  select * from scott.dept;
Table created.

ops$tkyte@ORA10GR2> create materialized view emp_dept
2  build immediate
3  refresh on demand
4  enable query rewrite
5  as
6  select dept.deptno, dept.dname, count (*)
7    from emp, dept
8   where emp.deptno = dept.deptno
9   group by dept.deptno, dept.dname
10  /
Materialized view created.

Now, we have withheld a lot of information from Oracle here. It does not understand the relationship between EMP and DEPT, does not know what columns are primary keys, and so on. Now, let’s run a query and see what happens:
ops$tkyte@ORA10GR2> set autotrace on
ops$tkyte@ORA10GR2> select count(*) from emp;

COUNT(*)
----------
     14

Execution Plan
-------------------------------------------...
Plan hash value: 2083865914

---------------------------------------------...
| Id  | Operation          | Name | Rows  | C...
---------------------------------------------...
|   0 | SELECT STATEMENT   |      |     1 |  ...
|   1 |  SORT AGGREGATE    |      |     1 |  ...
|   2 |   TABLE ACCESS FULL| EMP  |    14 |  ...
---------------------------------------------...

Now, you and I know that the  count(*) could easily, and more efficiently (especially if the number of employees in each department was large and there were lots of departments), have been answered from the materialized view. There we have all of the information we need to get the count of employees. We know that because we know things about the data that we kept from Oracle:

  • DEPTNO is the primary key of DEPT. That means that each EMP record will join to at most one DEPT record.
  • DEPTNO in EMP is a foreign key to DEPTNO in DEPT. If the DEPTNO in EMP is not a null value, then it will be joined to a row in DEPT (we won’t lose any non-null EMP records during a join)
  • DEPTNO in EMP is NOT NULL – this coupled with the foreign key constraint tells us we won’t lose any EMP records.

These three facts imply that if we join EMP to DEPT – each EMP row will be observed in the result set AT LEAST once and AT MOST once. Since we never told Oracle these facts, it was not able to make use of the materialized view. So, let’s make Oracle aware of them:
ops$tkyte@ORA10GR2> alter table dept
2  add constraint
3  dept_pk
4  primary key(deptno);
Table altered.

ops$tkyte@ORA10GR2> alter table emp
2  add constraint
3  emp_fk_dept
4  foreign key(deptno)
5  references dept(deptno);
Table altered.

ops$tkyte@ORA10GR2> alter table emp
2  modify deptno not null;
Table altered.

ops$tkyte@ORA10GR2> set autotrace on
ops$tkyte@ORA10GR2> select count(*) from emp;

COUNT(*)
----------
     14


Execution Plan
---------------------------------------------------...
Plan hash value: 155013515

---------------------------------------------------...
| Id  | Operation                     | Name     | ...
---------------------------------------------------...
|   0 | SELECT STATEMENT              |          | ...
|   1 |  SORT AGGREGATE               |          | ...
|   2 |   MAT_VIEW REWRITE ACCESS FULL| EMP_DEPT | ...
---------------------------------------------------...

And now Oracle is able to rewrite the query using the materialized view. Anytime that you know Oracle could use a materialized view, but it is not doing so (and you have verified you can use materialized views in general) – take a closer look at the data and ask yourself “what piece of information have I withheld from Oracle?” Nine times out of ten, you'll find a missing piece of metadata that, when included, allows Oracle to rewrite the query.

So, what happens if this is a true data warehouse and there are tens of millions of records in the above tables? You don’t really want the additional effort of verifying a foreign key relationship – you already did that in your data scrubbing routine didn’t you? In which case, you can create a non-validated constraint – one that is used to inform the database about a relationship but that has not been validated by the database itself. Let’s look at the above example again but this time we’ll simulate a load of data into an existing data warehouse (our example above is our data warehouse). We’ll drop our constraints, load the data, refresh the materialized views and add our constraints back. We’ll start with dropping the constraints:
ops$tkyte@ORA10GR2> alter table emp drop constraint emp_fk_dept;
Table altered.

ops$tkyte@ORA10GR2> alter table dept drop constraint dept_pk;
Table altered.

ops$tkyte@ORA10GR2> alter table emp modify deptno null;
Table altered.

Now, in order to simulate our load, I will insert a single new row into EMP (not much of a load but enough to demonstrate with). Then, we will refresh our materialized view:
ops$tkyte@ORA10GR2> insert into emp (empno,deptno) values ( 1, 1 );
1 row created.

ops$tkyte@ORA10GR2> exec dbms_mview.refresh( 'EMP_DEPT' );
PL/SQL procedure successfully completed.


purposely inserted a row into  EMP that will violate the constraint we are about to put back on the EMP. This is to show you what can happen if you tell Oracle to trust your data and the data is invalid. It shows that Oracle will trust you but - Oracle will give you the wrong answer – though through no fault of its own. When you use these constraints on a large warehouse, you had better be very sure of the data. Now we tell Oracle about the relationships between EMP and DEPT:
ops$tkyte@ORA10GR2> alter table dept
2  add constraint dept_pk primary key(deptno)
3  RELY enable NOVALIDATE
4  /
Table altered.

ops$tkyte@ORA10GR2> alter table emp
2  add constraint emp_fk_dept
3  foreign key(deptno) references dept(deptno)
4  RELY enable NOVALIDATE
5  /
Table altered.

ops$tkyte@ORA10GR2> alter table emp modify deptno not null NOVALIDATE;
Table altered.

So here we have told Oracle that there is a foreign key from  EMP to  DEPT as before, but this time we have told Oracle to trust that if it joins  EMP to  DEPT by  DEPTNO – every row in  EMP will be retrieved at least once and at most once. Oracle will perform no validating checks. That is the purpose of the NOVALIDATE and RELY options. The NOVALIDATE bypasses the checking of existing data we loaded, RELY tells Oracle to “rely” on the integrity of the data. We are now ready to query:
ops$tkyte@ORA10GR2> alter session set query_rewrite_integrity=enforced;
Session altered.

ops$tkyte@ORA10GR2> select count(*) from emp;

COUNT(*)
----------
     15

Execution Plan
----------------------------------------------...
Plan hash value: 2083865914

----------------------------------------------...
| Id  | Operation          | Name | Rows  | Co...
----------------------------------------------...
|   0 | SELECT STATEMENT   |      |     1 |   ...
|   1 |  SORT AGGREGATE    |      |     1 |   ...
|   2 |   TABLE ACCESS FULL| EMP  |    15 |   ...
----------------------------------------------...


This time QUERY_REWRITE_INTEGRITY=ENFORCED is not good enough to cause a rewrite of our query as you can see. Oracle did not rewrite the query to use the materialized view because of this. We must go down a level in query integrity. We need Oracle to “trust” us:
ops$tkyte@ORA10GR2> alter session set query_rewrite_integrity=trusted;
Session altered.

ops$tkyte@ORA10GR2> select count(*) from emp;

COUNT(*)
----------
     14


Execution Plan
--------------------------------------------------...
Plan hash value: 155013515

------------------------------------------------------...
| Id  | Operation                     | Name     | Row...
------------------------------------------------------...
|   0 | SELECT STATEMENT              |          |    ...
|   1 |  SORT AGGREGATE               |          |    ...
|   2 |   MAT_VIEW REWRITE ACCESS FULL| EMP_DEPT |    ...
------------------------------------------------------...

So, here Oracle did in fact rewrite the query but as you can see – the side effect of that in this case is the wrong answer is returned. The reason we got the wrong answer is because the “fact” that each row in EMP would be preserved in a join to DEPT is not a fact given the data we loaded. When the materialized view refreshed, it did not get the newly added EMP row. The data we told Oracle to rely on was not reliable. This demonstration points out two things:

  • You can use materialized views in a large data warehouse very efficiently, without having to perform lots of extra, typically redundant, verifications of the data.
  • BUT, you had better be 100% sure that your data is scrubbed if you ask Oracle to rely on it.
POST A COMMENT 

22 COMMENTS:
Blogger  Howard J. Rogers said....

Nice demo. I intend to steal it shamelessly whenever I next get blank stares when I start mentioning the RELY flag!

MON JAN 30, 06:41:00 PM EST   
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值