工作中遇到个问题:要求所有投资人首月投资的金额。举例说明:求fff4d7d1752f4950b088c5b550a82ac9的首月投资金额即求2015-12月所有的投资金额。
为实现上述需求,原本采用笛卡尔集的形式,详细SQL如下:
#统计首月投资金额
SELECT
C.INVESTOR AS INVEST,
SUM(C.INVESTAMOUNT) AS F_M_INVESTAMOUNT,
C.FLAG
FROM
(
SELECT
B.*, A.*
FROM
DW_ODS.ODS_HZCF_T_INVEST AS B,
(
SELECT
INVESTOR AS MININVESTOR,
MIN(
DATE_FORMAT(INVESTTIME, '%Y%m')
) AS MINTIME,
FLAG AS MINFLAG
FROM
DW_ODS.ODS_HZCF_T_INVEST
GROUP BY
MININVESTOR
HAVING
MINFLAG = '30007001002'
) AS A
) AS C
WHERE
C.INVESTOR = C.MININVESTOR
AND DATE_FORMAT(INVESTTIME, '%Y%m') = C.MINTIME
GROUP BY
INVESTOR
HAVING
C.FLAG = '30007001002';
这样做的话非常非常非常非常慢。因此,需要对SQL进行优化。采用什么方式进行优化呐?答:临时表。详见下面的SQL:
#统计首月投资金额
DROP table TEM_1;
CREATE TEMPORARY TABLE TEM_1 SELECT
INVESTOR AS MININVESTOR,
MIN(
DATE_FORMAT(INVESTTIME, '%Y%m')
) AS MINTIME,
FLAG AS MINFLAG
FROM
DW_ODS.ODS_HZCF_T_INVEST
GROUP BY
MININVESTOR
HAVING
MINFLAG = '30007001002';
SELECT
C.INVESTOR,
C.INVESTAMt AS FIRST_MONTH_INVEST_MONEY,
C.INVESTTIME
FROM
(
SELECT
INVESTOR,
INVESTTIME,
SUM(INVESTAMOUNT) AS INVESTAMt
FROM
DW_ODS.ODS_HZCF_T_INVEST AS A
JOIN TEM_1 AS B ON A.INVESTOR = B.MININVESTOR
AND A.FLAG = '30007001002'
AND DATE_FORMAT(A.INVESTTIME, '%Y%m') = B.MINTIME
GROUP BY
A.INVESTOR
) AS C
GROUP BY C.INVESTOR;
结语:优化后的SQL执行起来要比原来的SQL快N倍,因此,对表里的数据进行统计时,尽量别用笛卡尔集。