Query Performance regarding DISTINCT COUNT Measure

Generally, we need to take are when we created DISTINCE COUNT(DC) measure in our cube as this kind of measure will cause performance issue.

As to performance issue in Anlaysis Service (2000 & 2005), it includes the process performance and query performance.

----Process performance with DC measure
Cubes with distinct count measures take longer to process, for two reasons: because the processing query against the relational database uses an ORDER BY clause, and because distinct counts dramatically increase the size of each aggregation. Large aggregations take longer to calculate because of their size, and because they may require Analysis Services to use temporary files to supplement the process buffer memory.

To resovle this issue,
A. The OLTP source table of the target cube shoud at least have an index on the distinct count column as this column will be used in the ORDER BY clause during process.This may avoid a large sort operation on the relational side and so should improve processing.

B. Processing memory usage should be considered when a distinct count measure is present. Please follow the following performance guide to tune the process buffer size so that we can aovoid avoid temp file usage as much as possible. However, it's normal that we cannot completely avoid temp file usage sometimes.
Appendix D: How to Tune the Process Buffer Size
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ansvcspg.mspx#E6TAG

C. Partitioning cube so that the partitions can be processed parallelly. In Analysis Services 2000, the partitions were processed serially unless a custom DSO program explicitly forced parallel processing, such as the Parallel Process utility (available as a free download is http://www.microsoft.com/downloads/details.aspx?FamilyID=a2eef773-6df7-4688-8211-e02cf13cbdb4&DisplayLang=en). In SSAS (Analysis Service 2005, the partitions are processed parallelly by default.

Also, indexes supporting the partition boundaries may help avoid table scans during process.

----Query performance with DC measure
Cubes with distinct count measures are also slower to query because Analysis Services must query each partition serially rather than querying partitions in parallel, which it normally does.


To solve this problem,
1. Using appropriate aggregations.

2. In Anlaysis Service 2000, create a separate physical cube for the distinct measure with the same dimensions as the initial cube, and then use a virtual cube that links these two cubes together to give the appearance to the user of one seamless cube. Remember that you can only have one distinct count measure per physical cube. Analysis Services resolves user queries that do not use the distinct count measure against the initial cube in parallel with queries that require the distinct count measure.

In Analysis Service 2005, we need to create a separate measure group for the DC measure and the other non-DC measures should be in different measure group.

Thanks and regards,
Michael

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值