[29/07/2008 20:04:04] Data Warehouse for Dummies
Formal Definitions:
Data warehousing is the coordinated, architected and periodic COPYING of data from various sources, both inside and outside the enterprise, into an environment optimized for analytical and informational processing.
(Tip) The keys to this definition are that the data is copied (duplicated) in a controlled manner and that it is copied periodically, through batch-oriented processing.
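The idea of a controlled, batch-oriented copy can be sketched roughly as follows. This is only an illustration, not something from the book: the `orders` and `sales_copy` tables, their columns, and the `batch_copy` helper are all hypothetical names, and two in-memory SQLite databases stand in for a real source system and a real warehouse.

```python
import sqlite3


def batch_copy(source, warehouse):
    """Copy every row of the (hypothetical) orders table into the warehouse in one batch."""
    rows = source.execute("SELECT id, product, amount FROM orders").fetchall()
    with warehouse:  # one transaction: the whole batch is committed, or none of it
        warehouse.execute("DELETE FROM sales_copy")  # replace the previous copy
        warehouse.executemany(
            "INSERT INTO sales_copy (id, product, amount) VALUES (?, ?, ?)", rows
        )
    return len(rows)


# Hypothetical source system with two made-up rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, product TEXT, amount REAL)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(1, "beer", 9.5), (2, "diapers", 12.0)]
)
source.commit()

# Hypothetical warehouse that receives the periodic copy.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_copy (id INTEGER PRIMARY KEY, product TEXT, amount REAL)"
)

copied = batch_copy(source, warehouse)
```

In practice a scheduler would call something like `batch_copy` at a fixed interval (nightly, for example); the warehouse then reflects the source as of the last run.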
Is a Bigger Data Warehouse a Better Data Warehouse?
A common misconception among data warehouse aficionados is that the only good data warehouse is a big data warehouse -- an enormously big data warehouse. However, the path to determining the size of a data warehouse should look like this:
1. Determine the mission, or the business objectives, of the data warehouse.
"Why bother creating this warehouse?"
2. Determine the functionality the data warehouse should have.
Figure out what types of questions users will ask and what types of answers they will seek.
3. Determine what contents (types of data) should be in the data warehouse to support its functionality.
4. Determine, based on the contents (which are based on the functionality, which in turn is based on the mission), how BIG your data warehouse should be.
Realizing That a Data Warehouse Usually Has a Historical Perspective
Because data is copied in periodic batches, the warehouse reflects its sources only as of the last load. It is often better to send a small number of updates, perhaps only one at a time, much more frequently from the data source to the data warehouse.
That way, we get a much more up-to-date picture of the subject areas of the data warehouse.
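One way to picture small, frequent updates is an incremental load driven by a watermark. The sketch below is an assumption for illustration only: the `incremental_load` function, the `id`-based watermark, and the list-of-dicts stand-ins for the source and the warehouse are hypothetical, not taken from the book.

```python
def incremental_load(source_rows, warehouse, last_loaded_id):
    """Append only the source rows newer than the watermark; return the new watermark."""
    new_rows = [row for row in source_rows if row["id"] > last_loaded_id]
    warehouse.extend(new_rows)
    # If nothing new arrived, the watermark stays where it was.
    return max((row["id"] for row in new_rows), default=last_loaded_id)


# Made-up source rows and an empty warehouse.
source_rows = [{"id": 1, "product": "beer"}, {"id": 2, "product": "diapers"}]
warehouse = []

# First load picks up everything after id 0.
watermark = incremental_load(source_rows, warehouse, last_loaded_id=0)

# A single new row arrives at the source; the next load sends only that row.
source_rows.append({"id": 3, "product": "beer"})
watermark = incremental_load(source_rows, warehouse, watermark)
```

Each call moves only the rows that arrived since the previous call, so it can run far more often than a full copy.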
It's Data Warehouse, Not Data Dump
In a commonly related story about knowledge gained from a successful data warehouse implementation, a grocery-store chain discovered an unusually high correlation between sales of disposable baby diapers and sales of beer during a two- or three-hour period early every Friday evening, and found out that a significant number of people on their way home from work were buying both items. The store then began stocking display shelves with beer and disposable diapers next to one another, and sales increased significantly.
You should be selective about what goes in your data warehouse, and not just assume that you should be able to ask any possible question and therefore have to get every possible type of data from all the sources.
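A finding like the one in this story could be checked with a simple co-occurrence measure such as lift. The sketch below is a hypothetical illustration: the baskets are made-up sample data, not figures from the story, and the `lift` function is just one common way to measure how often two items are bought together.

```python
def lift(baskets, item_a, item_b):
    """Ratio of the observed co-purchase rate to the rate expected if purchases were independent."""
    n = len(baskets)
    has_a = sum(1 for basket in baskets if item_a in basket)
    has_b = sum(1 for basket in baskets if item_b in basket)
    has_both = sum(1 for basket in baskets if item_a in basket and item_b in basket)
    if n == 0 or has_a == 0 or has_b == 0:
        return 0.0  # not enough data to compare the two items
    return (has_both / n) / ((has_a / n) * (has_b / n))


# Made-up Friday-evening baskets, for illustration only.
baskets = [
    {"beer", "diapers"},
    {"beer", "diapers", "chips"},
    {"beer"},
    {"milk"},
    {"diapers", "beer"},
    {"bread", "milk"},
]

friday_lift = lift(baskets, "beer", "diapers")
```

A lift above 1 means the two items appear together more often than independent purchases would predict; a lift near 1 suggests no particular relationship.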
Anyway, you should look at it this way: For some types of data, you can analyze, analyze and analyze some more and still find out little of value that could positively affect your business. Although you can put this data in your warehouse, you probably won't get much for your trouble. Other types of data, though, have significant value "locked away" and therefore do belong in your warehouse.
From the ITPUB blog, link: http://blog.itpub.net/14210591/viewspace-412225/. If you repost this, please credit the source; otherwise legal liability will be pursued.