Query Performance regarding DISTINCT COUNT Measure

最新推荐文章于 2024-06-20 01:44:41 发布

wantfly2006

最新推荐文章于 2024-06-20 01:44:41 发布

阅读量1.1k

点赞数

分类专栏： BI(Analysis Service) 文章标签： performance query parallel processing service buffer

本文链接：https://blog.csdn.net/wantfly2006/article/details/713734

版权

BI(Analysis Service) 专栏收录该内容

6 篇文章 0 订阅

订阅专栏

Generally, we need to take are when we created DISTINCE COUNT(DC) measure in our cube as this kind of measure will cause performance issue.

As to performance issue in Anlaysis Service (2000 & 2005), it includes the process performance and query performance.

----Process performance with DC measure
Cubes with distinct count measures take longer to process, for two reasons: because the processing query against the relational database uses an ORDER BY clause, and because distinct counts dramatically increase the size of each aggregation. Large aggregations take longer to calculate because of their size, and because they may require Analysis Services to use temporary files to supplement the process buffer memory.

To resovle this issue,
A. The OLTP source table of the target cube shoud at least have an index on the distinct count column as this column will be used in the ORDER BY clause during process.This may avoid a large sort operation on the relational side and so should improve processing.

B. Processing memory usage should be considered when a distinct count measure is present. Please follow the following performance guide to tune the process buffer size so that we can aovoid avoid temp file usage as much as possible. However, it's normal that we cannot completely avoid temp file usage sometimes.
Appendix D: How to Tune the Process Buffer Size
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ansvcspg.mspx#E6TAG

C. Partitioning cube so that the partitions can be processed parallelly. In Analysis Services 2000, the partitions were processed serially unless a custom DSO program explicitly forced parallel processing, such as the Parallel Process utility (available as a free download is http://www.microsoft.com/downloads/details.aspx?FamilyID=a2eef773-6df7-4688-8211-e02cf13cbdb4&DisplayLang=en). In SSAS (Analysis Service 2005, the partitions are processed parallelly by default.

Also, indexes supporting the partition boundaries may help avoid table scans during process.

----Query performance with DC measure
Cubes with distinct count measures are also slower to query because Analysis Services must query each partition serially rather than querying partitions in parallel, which it normally does.

To solve this problem,
1. Using appropriate aggregations.

2. In Anlaysis Service 2000, create a separate physical cube for the distinct measure with the same dimensions as the initial cube, and then use a virtual cube that links these two cubes together to give the appearance to the user of one seamless cube. Remember that you can only have one distinct count measure per physical cube. Analysis Services resolves user queries that do not use the distinct count measure against the initial cube in parallel with queries that require the distinct count measure.

In Analysis Service 2005, we need to create a separate measure group for the DC measure and the other non-DC measures should be in different measure group.

Thanks and regards,
Michael

wantfly2006

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
Query Performance regarding DISTINCT COUNT Measure

Generally, we need to take are when we created DISTINCE COUNT(DC) measure in our cube as this kind of measure will cause performance issue. As to performance issue in Anlaysis Service (2000 & 2005),
复制链接

扫一扫

专栏目录