数仓六西格玛(Six Sigma)

About Six Sigma

https://docs.oracle.com/cd/B31080_01/doc/owb.102/b28223/concept_data_quality.htm

 

Warehouse Builder provides Six Sigma results embedded within the other data profiling results to provide a standardized approach to data quality.

What is Six Sigma?

Six Sigma is a methodology that attempts to standardize the concept of quality in business processes. It achieves this goal by statistically analyzing the performance of business processes. The goal of Six Sigma is to improve the performance of these processes by identifying the defects, understanding them, and eliminating the variables that cause these defects.

Six Sigma metrics give a quantitative number for the number of defects in each 1,000,000 opportunities. The term "opportunities" can be interpreted as the number of records. The perfect score is 6.0. The score of 6.0 is achieved when there are only 3.4 defects in each 1,000,000 opportunities. The score is calculated using the following formula:

  • Defects Per Million Opportunities (DPMO) = (Total Defects / Total Opportunities) * 1,000,000

  • Defects (%) = (Total Defects / Total Opportunities)* 100%

  • Yield (%) = 100 - %Defects

  • Process Sigma = NORMSINV(1-((Total Defects) / (Total Opportunities))) + 1.5

    where NORMSINV is the inverse of the standard normal cumulative distribution.

Six Sigma Metrics for Data Profiling

Six Sigma metrics are also provided for data profiling in Warehouse Builder. When you perform data profiling, the number of defects and anomalies discovered are shown as Six Sigma metrics. For example, if data profiling finds that a table has a row relationship with a second table, the number of records in the first table that do not adhere to this row-relationship can be described using the Six Sigma metric.

Six Sigma metrics are calculated for the following measures in the Data Profile Editor:

  • Aggregation: For each column, the number of null values (defects) to the total number of rows in the table (opportunities).

  • Data Types: For each column, the number of values that do not comply with the documented data type (defects) to the total number of rows in the table (opportunities).

  • Data Types: For each column, the number of values that do not comply with the documented length (defects) to the total number of rows in the table (opportunities).

  • Data Types: For each column, the number of values that do not comply with the documented scale (defects) to the total number of rows in the table (opportunities).

  • Data Types: For each column, the number of values that do not comply with the documented precision (defects) to the total number of rows in the table (opportunities).

  • Patterns: For each column, the number of values that do not comply with the common format (defects) to the total number of rows in the table (opportunities).

  • Domains: For each column, the number of values that do not comply with the documented domain (defects) to the total number of rows in the table (opportunities).

  • Referential: For each relationship, the number of values that do not comply with the documented foreign key (defects) to the total number of rows in the table (opportunities).

  • Referential: For each column, the number of values that are redundant (defects) to the total number of rows in the table (opportunities).

  • Unique Key: For each unique key, the number of values that do not comply with the documented unique key (defects) to the total number of rows in the table (opportunities).

  • Unique Key: For each foreign key, the number of rows that are childless (defects) to the total number of rows in the table (opportunities).

  • Data Rule: For each data rule applied to the data profile, the number of rows that fail the data rule to the number of rows in the table.

  • 1
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值