大数据下union all的视图分页解决方案

最新推荐文章于 2024-07-28 02:51:25 发布

小发猫

最新推荐文章于 2024-07-28 02:51:25 发布

阅读量1.5k

点赞数

文章标签：数据库分页解决方案

本文链接：https://blog.csdn.net/i_like_cpp/article/details/49912817

版权

最近的一个项目需要用到大数据分页，问题还是比较复杂的，下面分享一些自己的解决方法，与大家一起学习。

在对于union all的视图中，如果采用传统的采用rownum分页方式的时候，如
WHERE rownum < M)
WHERE linenum >=N
因为，Oracle执行到linenum >=N的时候，将不知所措，导致执行计划乱掉。如，假定bwm_users就是一个union all的视图。
代码如下：
select *
from mv_bmw_users_db1
union all
select *
from mv_bmw_users_db2

如果我们在该视图上执行如下操作，可以看到
SQL> select * from
2 (select rownum linenum,id,nick from
3 (select id,nick from bmw_users where nick ='test' order by id)
4 where rownum < 50)
5 where linenum >=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=20385 Card=49 Bytes=2401)
   1    0   VIEW (Cost=20385 Card=49 Bytes=2401)
   2    1     COUNT (STOPKEY)
   3    2       VIEW (Cost=20385 Card=1728633 Bytes=62230788)
   4    3         SORT (ORDER BY STOPKEY) (Cost=20385 Card=1728633 Bytes=62230788)
   5    4           VIEW OF 'BMW_USERS' (Cost=9278 Card=1728633 Bytes=62230788)
   6    5             UNION-ALL
   7    6               TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB1' (Cost=4639 Card=864090 Bytes=38884050)
   8    6               TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB2' (Cost=4639 Card=864543 Bytes=38904435)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      97298 consistent gets
      20770 physical reads
          0 redo size
        518 bytes sent via SQL*Net to client
        504 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          1 sorts (memory)
          0 sorts (disk)
          1 rows processed

一个非常简单的查询，在nick上是有索引的，而且表与索引都是分析过的，居然是全表扫描，耗费非常大的资源，这个时候，Oracle已经不能正确的判断使用索引了，所以错误的使用了全表，从统计信息也可以看到，该查询产生了大量的cr读与磁盘读。这些在前出塞项目里用到，这个时候，就是强行指定hint也不能改变oracle的执行计划，当然，这样是行不通的，我们必须找到一个行之有效的办法。

这样的问题怎么解决呢？有两个办法，一个是仍然使用union all语句在查询中，直接查询基表而不是视图。如以上语句改造为：
SQL> select * from
2 (select rownum linenum,id,nick from
3 (select * from
4 (select id,nick from MV_BMW_USERS_DB1 where nick ='test'
5 union all
6 select id,nick from MV_BMW_USERS_DB1 where nick ='test')
7 order by id)
8 where rownum < 50)
9 where linenum >=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=2 Bytes=98)
   1    0   VIEW (Cost=17 Card=2 Bytes=98)
   2    1     COUNT (STOPKEY)
   3    2       VIEW (Cost=17 Card=2 Bytes=72)
   4    3         SORT (ORDER BY STOPKEY) (Cost=17 Card=2 Bytes=72)
   5    4           VIEW (Cost=8 Card=2 Bytes=72)
   6    5             UNION-ALL
   7    6               TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
   8    7                 INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
   9    6               TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
10    9                 INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          8 consistent gets
          0 physical reads
          0 redo size
        553 bytes sent via SQL*Net to client
        504 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          1 sorts (memory)
          0 sorts (disk)
          2 rows processed

语句基本上是一样的，只是这次查询了基表，而不是视图，执行计划马上发生了改变，这次能使用了索引，而且成本有了很大的减少，可以看到cr读减少到只有8个块，而且磁盘读为0。

我们采用第二种方法，分析函数的办法，把语句改写为
SQL>select * from
1 (select row_number() over(order by id) rn,id,nick from bmw_users where nick ='test')
2 where rn <50 and rn >=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=1 Bytes=49)
   1    0   VIEW (Cost=13 Card=1 Bytes=49)
   2    1     WINDOW (SORT PUSHED RANK) (Cost=13 Card=1 Bytes=45)
   3    2       VIEW OF 'BMW_USERS' (Cost=4 Card=1 Bytes=45)
   4    3         UNION-ALL (PARTITION)
   5    4           TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
   6    5             INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
   7    4           TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB2' (Cost=4 Card=1 Bytes=45)
   8    7             INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK2' (NON-UNIQUE) (Cost=3 Card=1)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          7 consistent gets
          0 physical reads
          0 redo size
        513 bytes sent via SQL*Net to client
        504 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          1 sorts (memory)
          0 sorts (disk)
          1 rows processed
可以看到，同样的功能，分析函数的方法是最简单的，同样也能正确的使用索引。

以上是一个简单的例子，我们再分析一个复杂的实际语句。
原始语句为：
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM
(SELECT * FROM
(SELECT T1.*, rownum as linenum
FROM
(SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,t.topic_distillate,
t.topic_vote,t.topic_status, t.topic_moved_id,TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS') topic_time,
t.topic_last_post_id, t.topic_views,t.topic_title, t.topic_replies, t.topic_poster
FROM forum_topics t
WHERE t.forum_id = ?
AND t.topic_type < 2
AND t.topic_status <> 3
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC ) T1
WHERE rownum < ?)
WHERE linenum >=?) T2,
forum_posts p2,
bmw_users u1,bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id
AND u2.user_id = p2.poster_id

因为其中bmw_users是union all的视图，所以，该查询也使用了基表的全表扫描。如果把它改写为union all的语句，也将是异常的复杂，如，该写成union all将是这个样子
select * from (
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS')
post_time FROM (
SELECT *FROM (
SELECT T1.*, rownum as linenum FROM(
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,
t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS') topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t WHERE t.forum_id = :bind0
AND t.topic_type < 2 AND t.topic_status <> 3 ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind1)
WHERE linenum >=:bind2
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick,
u1.user_id, u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownumas linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS') topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title,
t.topic_replies, t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind3
AND t.topic_type < 2 AND t.topic_status <> 3 ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind4)
WHERE linenum >=:bind5
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB2 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2, u2.id as userid2,
p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/ t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS') topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind6 AND t.topic_type < 2 AND t.topic_status <> 3
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind7)
WHERE linenum >=:bind8
) T2,
forum_posts p2,
MV_BMW_USERS_DB2 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND T2.topic_last_post_id = p2.post_id
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS') topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topicst WHERE t.forum_id = :bind9
AND t.topic_type < 2 AND t.topic_status <> 3
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind10)
WHERE linenum >=:bind11
) T2, forum_posts p2,
MV_BMW_USERS_DB2 u1,MV_BMW_USERS_DB2 u2 WHERE T2.topic_poster =
u1.user_id AND p2.post_id = T2.topic_last_post_id AND u2.user_id = p2.poster_id
)
order by topic_type DESC,topic_last_post_id desc

但是，我们利用分析函数，将显得非常简单，而且正确的使用索引
SELECT /*+ ordered use_nl(u1,p2,u2)*/ T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/
row_number() over(order by t.topic_type DESC, t.topic_last_post_id DESC) rn,
t.topic_id,t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS') topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = ? AND t.topic_type < 2 AND t.topic_status <> 3
) T1
WHERE rn < ? and rn >= ?
) T2,
forum_posts p2,
bmw_users u1,
bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id
AND u2.user_id = p2.poster_id