[Reading Notes] Data Warehouse for Dummies (1)

[29/07/2008 20:04:04] Data Warehouse for Dummies

Formal Definitions:
    Data warehousing is the coordinated, architected and periodic COPYING of data from various sources, both inside and outside the enterprise, into an environment optimized for analytical and informational processing.
    (Tip) The keys to this definition are that the data is copied (duplicated) in a controlled manner and that it is copied periodically, using batch-oriented processing.
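
    As a minimal sketch of the "periodic, batch-oriented copy" idea in the definition above, the following uses Python's built-in sqlite3 module. The table names `sales` and `sales_copy`, the columns, and the helper `make_demo_databases` are hypothetical and chosen only for illustration; a real warehouse load would also transform and integrate data from several sources.

```python
import sqlite3


def batch_copy(source: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Copy all rows of the (hypothetical) source table into the warehouse in one batch."""
    rows = source.execute("SELECT id, product, amount FROM sales").fetchall()
    with warehouse:  # one transaction per batch run: commit on success, roll back on error
        warehouse.execute("DELETE FROM sales_copy")  # full refresh of the copy
        warehouse.executemany(
            "INSERT INTO sales_copy (id, product, amount) VALUES (?, ?, ?)", rows
        )
    return len(rows)


def make_demo_databases():
    """Build two in-memory databases with made-up demo data."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, product TEXT, amount REAL)"
    )
    source.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(1, "beer", 8.5), (2, "diapers", 12.0)],
    )
    source.commit()

    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE sales_copy (id INTEGER PRIMARY KEY, product TEXT, amount REAL)"
    )
    warehouse.commit()
    return source, warehouse
```

    In this sketch a scheduler (for example a nightly job) would call `batch_copy` once per period; the source system is never queried by analysts directly, only its copy is.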
    
Is a Bigger Data Warehouse a Better Data Warehouse?
    A common misconception that many data warehouse aficionados hold is that the only good data warehouse is a big data warehouse -- an enormously big data warehouse. However, the path to determining the size of a data warehouse should look like this:
    1. Determine the mission, or the business objectives, of the data warehouse.
       "Why bother creating this warehouse?"
    2. Determine the functionality the data warehouse should have.
       Figure out what types of questions users will ask and what types of answers they will seek.
    3. Determine what contents (types of data) should be in the data warehouse to support its functionality.
    4. Determine, based on the contents (which are based on the functionality, which in turn is based on the mission), how BIG your data warehouse should be.
   
Realizing That a Data Warehouse Usually Has a Historical Perspective
    It is often better to send a small number of updates, perhaps only one at a time, much more frequently from the data source to the data warehouse.
    This gives a much more up-to-date picture of the subject areas of the data warehouse.
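
    One way to picture "small, frequent updates" is a high-water mark: the warehouse remembers the last change it has seen and applies only newer ones. The sketch below is an assumption-laden illustration, not a method from the book; the `Warehouse` class, the `(sequence_number, row_id, value)` change-log layout, and the function `send_updates` are all hypothetical, and the log is assumed to be ordered by sequence number.

```python
from dataclasses import dataclass, field


@dataclass
class Warehouse:
    rows: dict = field(default_factory=dict)  # row_id -> latest value
    last_seen: int = 0                        # high-water mark: last applied sequence number


def send_updates(source_log, warehouse):
    """Apply only the changes newer than the warehouse's high-water mark.

    source_log is a list of (sequence_number, row_id, value) tuples,
    assumed to be sorted by sequence_number.
    """
    applied = 0
    for seq, row_id, value in source_log:
        if seq > warehouse.last_seen:
            warehouse.rows[row_id] = value   # insert or overwrite the row
            warehouse.last_seen = seq        # advance the high-water mark
            applied += 1
    return applied
```

    Calling `send_updates` after every few source changes keeps each transfer small, and calling it again with no new changes applies nothing.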
   
It's Data Warehouse, Not Data Dump
    In a commonly related story about knowledge gained from a successful data warehouse implementation, a grocery-store chain discovered an unusually high correlation between sales of disposable baby diapers and beer during a two- or three-hour period early every Friday evening, and found out that a significant number of people on their way home from work were buying both items. The store then began stocking display shelves with beer and disposable diapers next to one another, and sales increased significantly.
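
    The kind of analysis behind this story can be sketched as a co-occurrence count over shopping baskets. The example below computes "lift" for each pair of items, that is, how much more often two items are bought together than would be expected if they were bought independently. The function name `pair_lift` and the basket layout are hypothetical, and the story itself does not say which technique the chain used.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(item_a, item_b): lift} for every pair bought together at least once.

    lift = P(a and b) / (P(a) * P(b)); values above 1 mean the pair
    appears together more often than independence would predict.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))                  # de-duplicate, fix pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))   # every unordered pair in the basket

    lift = {}
    for (a, b), together in pair_counts.items():
        expected = (item_counts[a] / n) * (item_counts[b] / n)
        lift[(a, b)] = (together / n) / expected
    return lift
```

    With made-up baskets such as `[["beer", "diapers"], ["beer", "diapers", "chips"], ["milk"], ["bread", "milk"]]`, the pair `("beer", "diapers")` gets a lift above 1, which is the signal an analyst would then investigate.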
    You should be selective about what goes in your data warehouse and not just assume that, because you should be able to ask any possible question, you have to get every possible type of data from all the sources.
    Look at it this way: for some types of data, you can analyze, analyze, and analyze some more and still find little of value that could positively affect your business. Although you can put this data in your warehouse, you probably won't get much for your trouble. Other types of data, though, have significant value "locked away" and therefore do belong in your warehouse.

Source: "ITPUB Blog", link: http://blog.itpub.net/14210591/viewspace-412225/. If you repost this, please credit the source; otherwise legal liability will be pursued.
