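The way I try to solve SQL problems is to work through them step by step.
> You want the maximum revision for the maximum minor version corresponding to the maximum major version for each product.
The maximum major number for each product is given by: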
    SELECT Name, MAX(Major) AS Major FROM CA GROUP BY Name;
The maximum minor number corresponding to the maximum major number for each product is therefore given by:
    SELECT CA.Name, CA.Major, MAX(CA.Minor) AS Minor
      FROM CA
      JOIN (SELECT Name, MAX(Major) AS Major
              FROM CA
             GROUP BY Name
           ) AS CB
        ON CA.Name = CB.Name AND CA.Major = CB.Major
     GROUP BY CA.Name, CA.Major;
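And the maximum revision (for the maximum minor version number corresponding to the maximum major number for each product) is therefore given by: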
    SELECT CA.Name, CA.Major, CA.Minor, MAX(CA.Revision) AS Revision
      FROM CA
      JOIN (SELECT CA.Name, CA.Major, MAX(CA.Minor) AS Minor
              FROM CA
              JOIN (SELECT Name, MAX(Major) AS Major
                      FROM CA
                     GROUP BY Name
                   ) AS CB
                ON CA.Name = CB.Name AND CA.Major = CB.Major
             GROUP BY CA.Name, CA.Major
           ) AS CC
        ON CA.Name = CC.Name AND CA.Major = CC.Major AND CA.Minor = CC.Minor
     GROUP BY CA.Name, CA.Major, CA.Minor;
Tested: it works and produces the same answer as Andomar's query.
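As a quick way to watch the three steps build on each other, here is a minimal sketch that runs them against a tiny in-memory table. It assumes SQLite through Python's built-in `sqlite3` module purely to get something runnable (the benchmark below used Informix, not SQLite). The sample rows and the helper name `run_steps` are made up for illustration; they are not the data from the question.

```python
import sqlite3

# Made-up sample rows (Name, Major, Minor, Revision); NOT the question's data.
ROWS = [
    ("alpha", 1, 9, 9),   # older major: loses despite high minor/revision
    ("alpha", 2, 0, 7),   # latest major, older minor
    ("alpha", 2, 1, 3),
    ("alpha", 2, 1, 5),   # expected winner for alpha
    ("beta", 1, 7, 9),
    ("beta", 3, 2, 8),    # latest major, older minor, higher revision
    ("beta", 3, 4, 1),    # expected winner for beta
]

# Step 1: maximum major number per product.
STEP1 = "SELECT Name, MAX(Major) AS Major FROM CA GROUP BY Name"

# Step 2: maximum minor number within that maximum major number.
STEP2 = """
SELECT CA.Name, CA.Major, MAX(CA.Minor) AS Minor
  FROM CA
  JOIN (SELECT Name, MAX(Major) AS Major
          FROM CA
         GROUP BY Name
       ) AS CB
    ON CA.Name = CB.Name AND CA.Major = CB.Major
 GROUP BY CA.Name, CA.Major
"""

# Step 3: maximum revision within that (major, minor) pair.
STEP3 = """
SELECT CA.Name, CA.Major, CA.Minor, MAX(CA.Revision) AS Revision
  FROM CA
  JOIN (SELECT CA.Name, CA.Major, MAX(CA.Minor) AS Minor
          FROM CA
          JOIN (SELECT Name, MAX(Major) AS Major
                  FROM CA
                 GROUP BY Name
               ) AS CB
            ON CA.Name = CB.Name AND CA.Major = CB.Major
         GROUP BY CA.Name, CA.Major
       ) AS CC
    ON CA.Name = CC.Name AND CA.Major = CC.Major AND CA.Minor = CC.Minor
 GROUP BY CA.Name, CA.Major, CA.Minor
"""


def run_steps(rows):
    """Load rows into an in-memory table CA and run the three queries in turn."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CA (Name TEXT, Major INTEGER, Minor INTEGER, Revision INTEGER)"
    )
    conn.executemany("INSERT INTO CA VALUES (?, ?, ?, ?)", rows)
    # Sort in Python because the queries themselves carry no ORDER BY.
    results = [sorted(conn.execute(sql).fetchall()) for sql in (STEP1, STEP2, STEP3)]
    conn.close()
    return results


if __name__ == "__main__":
    for label, result in zip(("major", "minor", "revision"), run_steps(ROWS)):
        print(label, result)
```

With these rows the last step returns `('alpha', 2, 1, 5)` and `('beta', 3, 4, 1)`: rows with a higher minor or revision number under an older major version are correctly ignored.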
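Performance
I created a larger volume of data (11616 rows) and ran benchmark timings of Andomar's query against mine. The target DBMS was IBM Informix Dynamic Server (IDS) version 11.70.FC2 running on Mac OS X 10.7.2. I used the first of Andomar's two queries, since IDS does not support the comparison notation in the second one. I loaded the data, updated statistics, and ran the queries in both orders: mine followed by Andomar's, and Andomar's followed by mine. I also recorded the basic costs reported by the IDS optimizer. The result data from both queries were the same (so the queries are both accurate, or equally inaccurate).
Table unindexed: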
    Andomar's query                           Jonathan's query
    Time: 22.074129                           Time: 0.085803
    Estimated Cost: 2468070                   Estimated Cost: 22673
    Estimated # of Rows Returned: 5808        Estimated # of Rows Returned: 132
    Temporary Files Required For: Order By    Temporary Files Required For: Group By
Table with a unique index on (name, major, minor, revision):
    Andomar's query                           Jonathan's query
    Time: 0.768309                            Time: 0.060380
    Estimated Cost: 31754                     Estimated Cost: 2329
    Estimated # of Rows Returned: 5808        Estimated # of Rows Returned: 139
                                              Temporary Files Required For: Group By
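As you can readily see, the index dramatically improves the performance of Andomar's query, but it still seems to be more expensive on this system than my query. The index gives roughly a 30% time saving for my query (0.085803 down to 0.060380). I'd be curious to see comparable figures for the two versions of Andomar's query on comparable volumes of data, with and without the index. (My test data can be supplied if you need it; there were 132 products: the 3 listed in the question and 129 new ones, and each new product had the same 90 version entries.)
The reason for the difference is that the sub-query in Andomar's query is a correlated sub-query, which is a relatively expensive process (dramatically so when the index is missing).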
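To make the distinction concrete, the sketch below contrasts a correlated sub-query with the join-based query from this answer. The correlated form is a generic `NOT EXISTS` formulation written for illustration only; it is an assumption and not necessarily the query Andomar posted. The example again assumes SQLite through Python's `sqlite3` module, with synthetic data and hypothetical names (`build_table`, `compare`, `ca_version_idx`). SQLite's planner differs from Informix's, so the timings it prints should not be expected to match the figures above.

```python
import sqlite3
import time

# Join-based query from this answer: each derived table is computed once.
JOIN_QUERY = """
SELECT CA.Name, CA.Major, CA.Minor, MAX(CA.Revision) AS Revision
  FROM CA
  JOIN (SELECT CA.Name, CA.Major, MAX(CA.Minor) AS Minor
          FROM CA
          JOIN (SELECT Name, MAX(Major) AS Major
                  FROM CA
                 GROUP BY Name
               ) AS CB
            ON CA.Name = CB.Name AND CA.Major = CB.Major
         GROUP BY CA.Name, CA.Major
       ) AS CC
    ON CA.Name = CC.Name AND CA.Major = CC.Major AND CA.Minor = CC.Minor
 GROUP BY CA.Name, CA.Major, CA.Minor
"""

# Illustrative correlated form (an assumption, not Andomar's actual query):
# the inner SELECT refers to C1, so it is logically re-evaluated per outer row.
CORRELATED_QUERY = """
SELECT C1.Name, C1.Major, C1.Minor, C1.Revision
  FROM CA AS C1
 WHERE NOT EXISTS (
       SELECT 1
         FROM CA AS C2
        WHERE C2.Name = C1.Name
          AND (C2.Major > C1.Major
               OR (C2.Major = C1.Major AND C2.Minor > C1.Minor)
               OR (C2.Major = C1.Major AND C2.Minor = C1.Minor
                   AND C2.Revision > C1.Revision)))
"""


def build_table(n_products=20, index=False):
    """Create CA with synthetic data: 3 * 5 * 6 = 90 versions per product."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CA (Name TEXT, Major INTEGER, Minor INTEGER, Revision INTEGER)"
    )
    rows = [
        (f"product{p:03d}", major, minor, revision)
        for p in range(n_products)
        for major in range(1, 4)
        for minor in range(5)
        for revision in range(6)
    ]
    conn.executemany("INSERT INTO CA VALUES (?, ?, ?, ?)", rows)
    if index:
        conn.execute(
            "CREATE UNIQUE INDEX ca_version_idx ON CA (Name, Major, Minor, Revision)"
        )
    return conn


def timed(conn, sql):
    """Run a query and return (sorted rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = sorted(conn.execute(sql).fetchall())
    return rows, time.perf_counter() - start


def compare(index):
    """Run both queries on the same table, with or without the unique index."""
    conn = build_table(index=index)
    join_rows, join_time = timed(conn, JOIN_QUERY)
    corr_rows, corr_time = timed(conn, CORRELATED_QUERY)
    conn.close()
    return join_rows, corr_rows, join_time, corr_time


if __name__ == "__main__":
    for index in (False, True):
        join_rows, corr_rows, join_time, corr_time = compare(index)
        print(f"index={index}: same result={join_rows == corr_rows}, "
              f"join={join_time:.6f}s, correlated={corr_time:.6f}s")
```

Both queries should return the same rows on this data; only the elapsed times differ, and how much they differ depends on the engine, the data volume, and whether the index is present.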