假设存在下面二张表
t_media:
mediaid flags
11111 1<<17
22222 1<<18
33333 1<<17
44444 1<<2
t_media_locations:
mediaid locationid
11111 999999
22222 999999
33333 999999
现在有一个需求需要统计locationid=999999下面flags=1<<17和flag=1<<18的media数量,最初的丑陋实现如下:
SELECT a_tmp.locationid,a_tmp.attraction_count,r_tmp.restructant_count FROM ( SELECT COUNT(1)AS attraction_count, tmp.locationid FROM ( SELECT A . ID, b.locationid FROM t_media A, t_media_locations b WHERE A . ID = b.mediaid AND b.locationid IN(999999) AND A .flags & 134217728 > 0 GROUP BY A . ID, b.locationid )tmp GROUP BY tmp.locationid )AS a_tmp, ( SELECT COUNT(1)AS restructant_count, tmp.locationid FROM ( SELECT A . ID, b.locationid FROM t_media A, t_media_locations b WHERE A . ID = b.mediaid AND b.locationid IN(999999) AND A .flags & 268435456 > 0 GROUP BY A . ID, b.locationid )tmp GROUP BY tmp.locationid )AS r_tmp WHERE a_tmp.locationid = r_tmp.locationid
如此多的联表,性能肯定很差,改进如下:
SELECT sum ( CASE WHEN (flags & 268435456 > 0) THEN 1 ELSE 0 END ),sum ( CASE WHEN (flags & 134217728 > 0) THEN 1 ELSE 0 END ),b.locationid FROM t_media a, t_media_locations b WHERE a.id= b.mediaid AND b.locationid IN(99999) and a.flags is not null group by b.locationid
小结:
1.如果不需要locationid,group by的操作也可以省略,注意,聚合函数本身针对多记录操作并不一定是group by之后才使用,group by分组记录也是多记录。所以聚合函数使用的字段不需要出现在group by里。但是如果是直接select字段,postgres就需要group by该字段,有的db是不需要的比如sql lite.
2.postgres对于oracle中的decode函数的替代:CASE WHEN .....
3.flag这样的type类型值在db中直接存储为int,postgres的位与运算符&可以直接计算。
4.count(expr)如果expr是一个boolean值,count会被误用。
错误的使用方式:
SELECT --错误的统计行数的方式 count((flags & 134217728 > 0) ) FROM t_media a, t_media_locations b WHERE a.id= b.mediaid AND b.locationid IN(999999)
正确的方式就是case when使用之后sum:
SELECT sum( CASE WHEN (flags & 134217728 > 0) THEN 1 ELSE 0 END ) FROM t_media a, t_media_locations b WHERE a.id= b.mediaid AND b.locationid IN(999999)