测试数据:
表名 :by_blogreply2 记录 :7389147 条
说明 :其他测试条件以及MySQL优化配置未计算在内,采取默认配置。 测试1:
select * from by_blogreply2 order by rand() limit 1; 结果1: 1 min 26.08 sec
测试2:# max() * rand()
select * from by_blogreply2 where replyid >= (select floor(rand() * ((select max(replyid) from by_blogreply2) - (select min(replyid) from by_blogreply2)) + (select min(replyid) from by_blogreply2))) order by replyid limit 1; 结果2: 0.01 sec
测试3:# join
select * from by_blogreply2 as t1 join(select round(rand() * ((select max(replyid) from by_blogreply2) - (select min(replyid) from by_blogreply2)) + (select min(replyid) from by_blogreply2)) as replyid) as t2 where t1.replyid >= t2.replyid order by t1.replyid limit 1;
结果3: 0.03 sec
总结:一般来说都喜欢用第一条sql语句,可以看出大数据量时候速度是爆慢无比,不过也没谁喜欢在几百万数据里随机一个出来把。第二和第三条语句看似很复杂,不过效率确实提升了不止一个档次可言。第二条是用floor下rand * 最大ID - 最小ID再加上最小ID,实际是取一个ID的随机数出来。第三条是了join。
表名 :by_blogreply2 记录 :7389147 条
说明 :其他测试条件以及MySQL优化配置未计算在内,采取默认配置。 测试1:
select * from by_blogreply2 order by rand() limit 1; 结果1: 1 min 26.08 sec
测试2:# max() * rand()
select * from by_blogreply2 where replyid >= (select floor(rand() * ((select max(replyid) from by_blogreply2) - (select min(replyid) from by_blogreply2)) + (select min(replyid) from by_blogreply2))) order by replyid limit 1; 结果2: 0.01 sec
测试3:# join
select * from by_blogreply2 as t1 join(select round(rand() * ((select max(replyid) from by_blogreply2) - (select min(replyid) from by_blogreply2)) + (select min(replyid) from by_blogreply2)) as replyid) as t2 where t1.replyid >= t2.replyid order by t1.replyid limit 1;
结果3: 0.03 sec
总结:一般来说都喜欢用第一条sql语句,可以看出大数据量时候速度是爆慢无比,不过也没谁喜欢在几百万数据里随机一个出来把。第二和第三条语句看似很复杂,不过效率确实提升了不止一个档次可言。第二条是用floor下rand * 最大ID - 最小ID再加上最小ID,实际是取一个ID的随机数出来。第三条是了join。