大数据下union all的视图分页解决方案

原创 2015年11月18日 20:47:52

最近的一个项目需要用到大数据分页,问题还是比较复杂的,下面分享一些自己的解决方法,与大家一起学习。

在对于union all的视图中,如果采用传统的采用rownum分页方式的时候,如
WHERE rownum < M)
WHERE linenum >=N
因为,Oracle执行到linenum >=N的时候,将不知所措,导致执行计划乱掉。如,假定bwm_users就是一个union all的视图。
代码如下:
select *
from mv_bmw_users_db1
union all
select  *
from mv_bmw_users_db2

如果我们在该视图上执行如下操作,可以看到
SQL> select * from
  2  (select rownum linenum,id,nick from
  3  (select id,nick from bmw_users  where nick ='test' order by id)
  4  where rownum < 50)
  5  where linenum >=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=20385 Card=49 Bytes=2401)
   1    0   VIEW (Cost=20385 Card=49 Bytes=2401)
   2    1     COUNT (STOPKEY)
   3    2       VIEW (Cost=20385 Card=1728633 Bytes=62230788)
   4    3         SORT (ORDER BY STOPKEY) (Cost=20385 Card=1728633 Bytes=62230788)
   5    4           VIEW OF 'BMW_USERS' (Cost=9278 Card=1728633 Bytes=62230788)
   6    5             UNION-ALL
   7    6               TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB1' (Cost=4639 Card=864090 Bytes=38884050)
   8    6               TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB2' (Cost=4639 Card=864543 Bytes=38904435)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      97298  consistent gets
      20770  physical reads

          0  redo size
        518  bytes sent via SQL*Net to client
        504  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed
         
一个非常简单的查询,在nick上是有索引的,而且表与索引都是分析过的,居然是全表扫描,耗费非常大的资源,这个时候,Oracle已经不能正确的判断使用索引了,所以错误的使用了全表,从统计信息也可以看到,该查询产生了大量的cr读与磁盘读。这些在前出塞项目里用到,这个时候,就是强行指定hint也不能改变oracle的执行计划,当然,这样是行不通的,我们必须找到一个行之有效的办法。

这样的问题怎么解决呢?有两个办法,一个是仍然使用union all语句在查询中,直接查询基表而不是视图。如以上语句改造为:
SQL> select * from
  2  (select rownum linenum,id,nick from
  3  (select * from
  4  (select id,nick from MV_BMW_USERS_DB1 where nick ='test'
  5  union all
  6  select id,nick from MV_BMW_USERS_DB1 where nick ='test')
  7  order by id)
  8  where rownum < 50)
  9  where linenum >=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=2 Bytes=98)
   1    0   VIEW (Cost=17 Card=2 Bytes=98)
   2    1     COUNT (STOPKEY)
   3    2       VIEW (Cost=17 Card=2 Bytes=72)
   4    3         SORT (ORDER BY STOPKEY) (Cost=17 Card=2 Bytes=72)
   5    4           VIEW (Cost=8 Card=2 Bytes=72)
   6    5             UNION-ALL
   7    6               TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
   8    7                 INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
   9    6               TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
  10    9                 INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        553  bytes sent via SQL*Net to client
        504  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          2  rows processed

语句基本上是一样的,只是这次查询了基表,而不是视图,执行计划马上发生了改变,这次能使用了索引,而且成本有了很大的减少,可以看到cr读减少到只有8个块,而且磁盘读为0。

我们采用第二种方法,分析函数的办法,把语句改写为
SQL>select * from
 1 (select row_number() over(order by id) rn,id,nick from bmw_users where nick ='test')
 2 where rn <50 and rn >=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=1 Bytes=49)
   1    0   VIEW (Cost=13 Card=1 Bytes=49)
   2    1     WINDOW (SORT PUSHED RANK) (Cost=13 Card=1 Bytes=45)
   3    2       VIEW OF 'BMW_USERS' (Cost=4 Card=1 Bytes=45)
   4    3         UNION-ALL (PARTITION)
   5    4           TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
   6    5             INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
   7    4           TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB2' (Cost=4 Card=1 Bytes=45)
   8    7             INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK2' (NON-UNIQUE) (Cost=3 Card=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads

          0  redo size
        513  bytes sent via SQL*Net to client
        504  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed
可以看到,同样的功能,分析函数的方法是最简单的,同样也能正确的使用索引。

以上是一个简单的例子,我们再分析一个复杂的实际语句。
原始语句为:
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM
(SELECT * FROM
(SELECT T1.*, rownum as linenum
FROM
(SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,t.topic_distillate,
t.topic_vote,t.topic_status, t.topic_moved_id,TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id, t.topic_views,t.topic_title, t.topic_replies, t.topic_poster
FROM forum_topics t
WHERE t.forum_id = ?
AND t.topic_type < 2
AND t.topic_status <> 3
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC ) T1
WHERE rownum < ?)
WHERE linenum >=?) T2,
forum_posts p2,
bmw_users u1,bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id 
AND u2.user_id = p2.poster_id

因为其中bmw_users是union all的视图,所以,该查询也使用了基表的全表扫描。如果把它改写为union all的语句,也将是异常的复杂,如,该写成union all将是这个样子
select * from (
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS')
post_time FROM (
SELECT *FROM (
SELECT T1.*, rownum as linenum FROM(
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,
t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t WHERE t.forum_id = :bind0 
AND t.topic_type < 2 AND t.topic_status <> 3  ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind1)
WHERE linenum >=:bind2
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id 
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick,
u1.user_id, u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownumas linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title,
t.topic_replies, t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind3 
AND t.topic_type < 2 AND t.topic_status <> 3 ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind4)
WHERE linenum >=:bind5
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB2 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id 
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2, u2.id as userid2,
p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/ t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind6  AND t.topic_type < 2 AND t.topic_status <> 3 
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind7)
WHERE linenum >=:bind8
) T2,
forum_posts p2,
MV_BMW_USERS_DB2 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND   T2.topic_last_post_id = p2.post_id 
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topicst WHERE t.forum_id = :bind9 
AND t.topic_type < 2 AND t.topic_status <> 3 
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind10)
WHERE linenum >=:bind11
) T2, forum_posts p2,
MV_BMW_USERS_DB2 u1,MV_BMW_USERS_DB2 u2 WHERE T2.topic_poster =
u1.user_id AND p2.post_id = T2.topic_last_post_id  AND u2.user_id = p2.poster_id
)
order by topic_type DESC,topic_last_post_id desc

 

但是,我们利用分析函数,将显得非常简单,而且正确的使用索引
SELECT /*+ ordered use_nl(u1,p2,u2)*/ T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/
row_number() over(order by t.topic_type DESC, t.topic_last_post_id DESC) rn,
t.topic_id,t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = ?  AND t.topic_type < 2 AND t.topic_status <> 3 
) T1
WHERE rn < ? and rn >= ?
) T2,
forum_posts p2,
bmw_users u1,
bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id 
AND u2.user_id = p2.poster_id



当分页语句遇到union all

SELECT * FROM (SELECT pubformdat0_.id id332_, pubformdat0_.domain_id domain2_332_, ...

UNION ALL的排序分页问题

SQLSERVER中分页语句通常需要排序,如果同时要对UNION ALL子句排序的话就会遇到一些问题,看看如何来解决....
  • Sai_1
  • Sai_1
  • 2015年09月02日 11:03
  • 1777

在union all的视图中怎么解决分页问题

在对于union all的视图中,如果采用传统的采用rownum分页方式的时候,如 WHERE rownum WHERE linenum >=N 因为,Oracle执行到linenum >=N...
  • yztezhl
  • yztezhl
  • 2015年11月27日 16:15
  • 280

DevExpress的GridControl的实时加载数据解决方案(取代分页)

传统的Winform(不使用第三方控件)针对DataGridView的实时加载数据的文章请看这里 http://www.cnblogs.com/liulun/archive/2009/09/28...

mysql分页加排序数据重复问题解决方案

mysql分页加排序数据重复问题解决方案

大数据量报表APPLET打印分页传输方案

大数据量报表APPLET打印分页传输方案

Oracle医疗大数据解决方案参考

  • 2015年09月27日 22:51
  • 3.23MB
  • 下载
内容举报
返回顶部
收藏助手
不良信息举报
您举报文章:大数据下union all的视图分页解决方案
举报原因:
原因补充:

(最多只允许输入30个字)