目录
一步一步的免费OLAP
在本文中,我将展示如何完全免费实现OLAP。您不必为MSAS、Cognos等花费一分钱。如果您查看OLAP系统的价格,您会发现它们将花费数万美元,不包括每个报告用户的年费。另一方面,硬件现在很便宜,并且有很多免费的数据库。在许多情况下,对于中型项目,购买RAD或HDD并以牺牲硬件资源为代价来解决问题会更容易。
OLAP多维数据集生成器将表格数据作为输入,并生成具有聚合数据的特殊格式的多维数据集文件。如果我们采用表格数据,将它们聚合在SQL Server数据库中并将它们反向放入SQL Server表中,该怎么办?我将使用Tableau superstore文件作为示例的数据源。该文件可在Tableau论坛上访问。超级文件
我改变了一下。我用`替换了引号和省略号。并替换了标题中下划线符号的空格和短划线。我还添加了2年的发货日期,只需2018年和2019年。所有OLAP查询都返回大量数据集,因此我只提供一段摘录而不是所有数据来了解数据结构。如果您想要所有数据,请自行运行。我把Year,Month和Quarter分别作为一个单独的列。我还在日期格式中添加了Shipment月份。所有这些都使SQL查询更简单。
源数据样本(Tableau Superstore摘录)
您可以使用SQL脚本生成我使用的完整表。文章开头代码中的insert_script脚本
最原始的分析查询看起来像这样......
简单的分析查询
SELECT
State,
Ship_Year,
Ship_Month,
Category ,
sum(Sales) as SumSales,
sum(Quantity) as SumQuantity,
sum(Profit) as SumProfit
FROM tblSales
GROUP BY State, Category , Ship_Year, Ship_Month
此查询的结果是按州按类别按月销售集合。
查询1结果(不包括所有数据)
State | Ship_Year | Ship_Month | Category | SumSales | SumQuantity | SumProfit |
Alabama | 2018 | 9 | Furniture | 350.98 | 1 | 84.2352 |
Alabama | 2019 | 10 | Furniture | 10.16 | 2 | 3.4544 |
Alabama | 2019 | 12 | Furniture | 141.96 | 2 | 35.49 |
Alabama | 2018 | 5 | Office Supplies | 4.98 | 1 | 2.4402 |
Alabama | 2018 | 6 | Office Supplies | 247.65 | 9 | 75.6134 |
Alabama | 2018 | 8 | Office Supplies | 197.05 | 7 | 59.115 |
Alabama | 2018 | 9 | Office Supplies | 913.16 | 6 | 123.0272 |
Alabama | 2018 | 10 | Office Supplies | 88.86 | 12 | 37.9872 |
Alabama | 2018 | 11 | Office Supplies | 269.77 | 22 | 117.3024 |
Alabama | 2019 | 1 | Office Supplies | 56.37 | 16 | 23.1341 |
Alabama | 2019 | 4 | Office Supplies | 492.18 | 7 | 16.4172 |
Alabama | 2019 | 5 | Office Supplies | 26.16 | 4 | 12.8184 |
Alabama | 2019 | 6 | Office Supplies | 16.24 | 4 | 6.7064 |
Alabama | 2019 | 12 | Office Supplies | 728.8 | 10 | 349.281 |
Alabama | 2018 | 6 | Technology | 979.95 | 5 | 274.386 |
Alabama | 2018 | 7 | Technology | 29 | 2 | 7.25 |
Alabama | 2018 | 10 | Technology | 209.97 | 3 | 71.3898 |
Alabama | 2018 | 11 | Technology | 4359.96 | 12 | 1987.184 |
Alabama | 2019 | 4 | Technology | 25.98 | 2 | 1.5588 |
Alabama | 2019 | 11 | Technology | 239.92 | 8 | 23.992 |
Alabama | 2019 | 12 | Technology | 90.48 | 2 | 23.5248 |
Arizona | 2018 | 2 | Furniture | 14.368 | 2 | 3.9512 |
Arizona | 2018 | 4 | Furniture | 2229.726 | 15 | -702.3924 |
Arizona | 2018 | 5 | Furniture | 111.888 | 7 | 22.3776 |
Arizona | 2018 | 9 | Furniture | 393.165 | 3 | -204.4458 |
Arizona | 2018 | 10 | Furniture | 477.488 | 7 | -34.641 |
Arizona | 2018 | 12 | Furniture | 455.97 | 6 | -218.8656 |
Arizona | 2019 | 1 | Furniture | 83.413 | 3 | -19.9054 |
Arizona | 2019 | 2 | Furniture | 455.614 | 11 | -169.3996 |
Arizona | 2019 | 5 | Furniture | 209.979 | 7 | -356.9643 |
Arizona | 2019 | 6 | Furniture | 280.792 | 1 | 35.099 |
Arizona | 2019 | 7 | Furniture | 1275.144 | 10 | -481.6947 |
Arizona | 2019 | 8 | Furniture | 120.576 | 8 | 33.1584 |
Arizona | 2019 | 11 | Furniture | 1126.592 | 8 | -141.8228 |
Arizona | 2019 | 12 | Furniture | 1033.47 | 9 | -27.8676000000001 |
要按月计算OLAP多维数据集,我们必须稍微更改此查询。源多维数据集包含层次结构。
- 地理层次结构:国家 ->地区 ->州 ->城市
- 产品层次结构:分类-> 子类-> 产品
- 期间:月,季度
让我们将层次级别字段添加到每个维度(geography,product,period)中,以明确标识层次级别。
查询2
SELECT
'State' as GeoLevel,
State as Geo,
Ship_Year as SaleYear,
Ship_Month as SalePeriod,
max(Ship_MonthDate) as SalePeriodDate,
DATEADD(month, -1, max(Ship_MonthDate)) as PrevSalePeriodDate,
'M' as PeriodType,
'M' as PeriodSubType,
'Category' as ProductLevel,
Category as ProductName,
sum(Sales) as SumSales,
sum(Quantity) as SumQuantity,
sum(Profit) as SumProfit
FROM tblSales
GROUP BY State, Category , Ship_Year, Ship_Month
查询2结果(不包括所有数据)
Geo | Geo | Sale | Sale | Sale | Prev | Per | Per | Product | Product | Sum | Sum | Sum |
State | Alabama | 2018 | 9 | 01.09.2018 | 01.08.2018 | M | M | Category | Furniture | 350.98 | 1 | 84.2352 |
State | Alabama | 2019 | 10 | 01.10.2019 | 01.09.2019 | M | M | Category | Furniture | 10.16 | 2 | 3.4544 |
State | Alabama | 2019 | 12 | 01.12.2019 | 01.11.2019 | M | M | Category | Furniture | 141.96 | 2 | 35.49 |
State | Alabama | 2018 | 5 | 01.05.2018 | 01.04.2018 | M | M | Category | Office Supplies | 4.98 | 1 | 2.4402 |
State | Alabama | 2018 | 6 | 01.06.2018 | 01.05.2018 | M | M | Category | Office Supplies | 247.65 | 9 | 75.6134 |
State | Alabama | 2018 | 8 | 01.08.2018 | 01.07.2018 | M | M | Category | Office Supplies | 197.05 | 7 | 59.115 |
State | Alabama | 2018 | 9 | 01.09.2018 | 01.08.2018 | M | M | Category | Office Supplies | 913.16 | 6 | 123.0272 |
State | Alabama | 2018 | 10 | 01.10.2018 | 01.09.2018 | M | M | Category | Office Supplies | 88.86 | 12 | 37.9872 |
State | Alabama | 2018 | 11 | 01.11.2018 | 01.10.2018 | M | M | Category | Office Supplies | 269.77 | 22 | 117.3024 |
State | Alabama | 2019 | 1 | 01.01.2019 | 01.12.2018 | M | M | Category | Office Supplies | 56.37 | 16 | 23.1341 |
State | Alabama | 2019 | 4 | 01.04.2019 | 01.03.2019 | M | M | Category | Office Supplies | 492.18 | 7 | 16.4172 |
State | Alabama | 2019 | 5 | 01.05.2019 | 01.04.2019 | M | M | Category | Office Supplies | 26.16 | 4 | 12.8184 |
State | Alabama | 2019 | 6 | 01.06.2019 | 01.05.2019 | M | M | Category | Office Supplies | 16.24 | 4 | 6.7064 |
State | Alabama | 2019 | 12 | 01.12.2019 | 01.11.2019 | M | M | Category | Office Supplies | 728.8 | 10 | 349.281 |
State | Alabama | 2018 | 6 | 01.06.2018 | 01.05.2018 | M | M | Category | Technology | 979.95 | 5 | 274.386 |
State | Alabama | 2018 | 7 | 01.07.2018 | 01.06.2018 | M | M | Category | Technology | 29 | 2 | 7.25 |
State | Alabama | 2018 | 10 | 01.10.2018 | 01.09.2018 | M | M | Category | Technology | 209.97 | 3 | 71.3898 |
State | Alabama | 2018 | 11 | 01.11.2018 | 01.10.2018 | M | M | Category | Technology | 4359.96 | 12 | 1987.184 |
State | Alabama | 2019 | 4 | 01.04.2019 | 01.03.2019 | M | M | Category | Technology | 25.98 | 2 | 1.5588 |
State | Alabama | 2019 | 11 | 01.11.2019 | 01.10.2019 | M | M | Category | Technology | 239.92 | 8 | 23.992 |
State | Alabama | 2019 | 12 | 01.12.2019 | 01.11.2019 | M | M | Category | Technology | 90.48 | 2 | 23.5248 |
State | Arizona | 2018 | 2 | 01.02.2018 | 01.01.2018 | M | M | Category | Furniture | 14.368 | 2 | 3.9512 |
State | Arizona | 2018 | 4 | 01.04.2018 | 01.03.2018 | M | M | Category | Furniture | 2229.726 | 15 | -702.3924 |
State | Arizona | 2018 | 5 | 01.05.2018 | 01.04.2018 | M | M | Category | Furniture | 111.888 | 7 | 22.3776 |
State | Arizona | 2018 | 9 | 01.09.2018 | 01.08.2018 | M | M | Category | Furniture | 393.165 | 3 | -204.4458 |
State | Arizona | 2018 | 10 | 01.10.2018 | 01.09.2018 | M | M | Category | Furniture | 477.488 | 7 | -34.641 |
State | Arizona | 2018 | 12 | 01.12.2018 | 01.11.2018 | M | M | Category | Furniture | 455.97 | 6 | -218.8656 |
State | Arizona | 2019 | 1 | 01.01.2019 | 01.12.2018 | M | M | Category | Furniture | 83.413 | 3 | -19.9054 |
State | Arizona | 2019 | 2 | 01.02.2019 | 01.01.2019 | M | M | Category | Furniture | 455.614 | 11 | -169.3996 |
State | Arizona | 2019 | 5 | 01.05.2019 | 01.04.2019 | M | M | Category | Furniture | 209.979 | 7 | -356.9643 |
State | Arizona | 2019 | 6 | 01.06.2019 | 01.05.2019 | M | M | Category | Furniture | 280.792 | 1 | 35.099 |
State | Arizona | 2019 | 7 | 01.07.2019 | 01.06.2019 | M | M | Category | Furniture | 1275.144 | 10 | -481.6947 |
State | Arizona | 2019 | 8 | 01.08.2019 | 01.07.2019 | M | M | Category | Furniture | 120.576 | 8 | 33.1584 |
State | Arizona | 2019 | 11 | 01.11.2019 | 01.10.2019 | M | M | Category | Furniture | 1126.592 | 8 | -141.8228 |
State | Arizona | 2019 | 12 | 01.12.2019 | 01.11.2019 | M | M | Category | Furniture | 1033.47 | 9 | -27.8676000000001 |
State | Arizona | 2018 | 3 | 01.03.2018 | 01.02.2018 | M | M | Category | Office Supplies | 272.318 | 9 | 17.1947 |
State | Arizona | 2018 | 4 | 01.04.2018 | 01.03.2018 | M | M | Category | Office Supplies | 396.813 | 20 | -32.6021 |
State | Arizona | 2018 | 7 | 01.07.2018 | 01.06.2018 | M | M | Category | Office Supplies | 95.424 | 13 | -30.1926 |
在进一步的操作中,我将使用JavaScript生成的动态SQL。让我们参数化查询2以展开所有级别的层次结构(地理位置,产品和期间)。我使用JavaScript here-docs进行模板化和变量插值。感谢上帝,它现在支持here-docs,并且没有必要在这里处理Perl文档。我们必须生成所有层次结构成员的笛卡尔积,并计算其度量聚合。
月季度
要生成SQL,请将JavaScript代码复制到HTML文件并使用Chrome打开它。在SQL Server Management Studio中运行它之前,请为季度算术创建一个标量函数dbo.prevQuarter。
月季度生成脚本sales_M_Q.html
月季度(res_mq.zip)查询结果生成脚本sales_M_Q.html。
在SQL Server management studio 中创建由此脚本生成的视图(最后UNION ALL应删除),它为您提供了一个事实数据表形式的OLAP多维数据集,其中包含所有层次结构的层次结构。请注意,如果您遇到同一项目city同时处于两个具有相同名称的状态的情况,则必须像在脚本中那样使其独一无二。销售数据的性质不允许在多个类别中拥有相同的产品。所以当我们扁平化层次结构时没有问题。否则,我们必须存储具有分组路径的单独字段。
对于季度期间,我将季度存储在date的month字段中。为了计算上一季度,我使用UDF dbo.prevQuarter进行季度数学运算。
年初至今的月份季度
要计算YTD期间,我们必须在额外周期中循环处理所有12个月。为了删除大于当前时间段的期间,where部分datefromparts( Ship_Year, ${xmonth}, 1) <= datefromparts(year(GETDATE()), month(GETDATE()),1)中有一个额外的条件。
YTD月季度生成脚本sales_YTD_M_Q.zip YTD脚本结果(res_ytd.zip)
移动年度总月季度
为了生成MAT,我们的数据库必须有3年的趋势。在我的脚本中,我只取了一个去年的数据。我遍历列表中的所有时段并计算12个月的聚合。这种情况与YTD脚本${xperiod} <= datefromparts(year(GETDATE()), month(GETDATE()),1)中的情况相同,可以缩短未来的周期。和以前的脚本一样,我在month字段上存储季度。这将允许进一步计算PPG%。MAT脚本sales_MAT_M_Q.zip
MAT脚本结果(res_mat.zip)。
结果多维数据集
生成所有3个查询后,您可以创建结果查询并执行SELECT INTO tblCube。然后,您可以在维度上添加一些索引以提高性能。
select * from [cubeq].[CUBE_Sales_M_Q]
UNION ALL
select * from [cubeq].[CUBE_Sales_YTD_M_Q]
UNION ALL
select * from [cubeq].[CUBE_Sales_MAT_M_Q]
结论
这种方法允许使用昂贵的系统免费计算销售分析。
源INSERT脚本大约3MB。生成的多维数据集为31MB。这是10倍的增长。这是因为组合数据爆炸。
在其他项目中,50 MB平面文件打开500MB SQL Server数据库。对于现代数据库而言,这不是一个重要的卷。
在此阶段,您可以在SSRS或MS Report Builder中基于这些数据绘制仪表板。当然,在没有层次结构元信息的情况下,您将受到平面过滤Excel自动过滤方式的限制,但数据将适用于分析。您可以在SQLite、MySQL或任何免费数据库中执行相同的操作。
我在两个销售分析项目中使用了这种方法,效果很好。首先是制药公司的KPI计算,第二个是B2B应用的报告门户。在以后的文章中,我将展示如何构建免费的OLAP Web应用程序。
原文地址:https://www.codeproject.com/Articles/5139754/OLAP-for-Free