A data warehouse is a system with its own database. It draws data from diverse sources and is designed to support query and analysis. To facilitate data retrieval for analytical processing, we use a special database design technique called a star schema.
The concept of a star schema is not new; indeed, it has been used in industry for years. For the data in the previous section, we can create a star schema like that shown in Figure 1.1.
The star schema derives its name from its graphical representation: drawn as a diagram, it looks like a star. A fact table appears in the middle of the graphic, surrounded by several dimension tables. The central fact table is usually very large, often measured in gigabytes. It is the table from which we retrieve the data of interest. The dimension tables are small by comparison, typically amounting to only 1 to 5 percent of the size of the fact table. Common dimensions are unit and time, which are not shown in Figure 1.1. Foreign keys tie the fact table to the dimension tables. Keep in mind that dimension tables are not required to be normalized and that they can contain redundant data.
As indicated in Table 1.3, the sales organization changes over time. The dimension to which it belongs, the sales rep dimension, is therefore called a slowly changing dimension.
The following steps explain how a star schema works to calculate the total quantity sold in the Midwest region:
1. From the sales rep dimension, select all sales rep IDs in the Midwest region.
2. From the fact table, select the rows that match the sales rep IDs from Step 1 and sum their quantity sold.
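
The two steps above can be sketched in Python with the built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (sales_rep_dim, sales_fact), the column names, and the sample rows are hypothetical and are not taken from Figure 1.1 or the tables in the previous section.

```python
import sqlite3

# Hypothetical miniature star schema: one dimension table and one fact table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_rep_dim (
        sales_rep_id   INTEGER PRIMARY KEY,
        sales_rep_name TEXT,
        region         TEXT
    );
    CREATE TABLE sales_fact (
        -- foreign key tying the fact table to the dimension table
        sales_rep_id  INTEGER REFERENCES sales_rep_dim(sales_rep_id),
        quantity_sold INTEGER
    );
""")

# Made-up sample data, for illustration only.
conn.executemany(
    "INSERT INTO sales_rep_dim VALUES (?, ?, ?)",
    [(1, "Rep A", "Midwest"), (2, "Rep B", "Midwest"), (3, "Rep C", "West")],
)
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?)",
    [(1, 10), (1, 5), (2, 7), (3, 20)],
)

# Step 1: from the sales rep dimension, select the IDs in the Midwest region.
midwest_ids = [
    row[0]
    for row in conn.execute(
        "SELECT sales_rep_id FROM sales_rep_dim WHERE region = ?", ("Midwest",)
    )
]

# Step 2: from the fact table, sum the quantity sold for those IDs.
placeholders = ",".join("?" for _ in midwest_ids)
total_two_step = conn.execute(
    "SELECT COALESCE(SUM(quantity_sold), 0) FROM sales_fact "
    f"WHERE sales_rep_id IN ({placeholders})",
    midwest_ids,
).fetchone()[0]

# The same result expressed as a single join between fact and dimension.
total_join = conn.execute(
    """
    SELECT COALESCE(SUM(f.quantity_sold), 0)
    FROM sales_fact AS f
    JOIN sales_rep_dim AS d ON d.sales_rep_id = f.sales_rep_id
    WHERE d.region = ?
    """,
    ("Midwest",),
).fetchone()[0]

print(total_two_step, total_join)
```

In practice, a query tool would usually issue the join form and leave it to the database to decide how to combine the small dimension table with the large fact table; the two-step form is shown only because it mirrors the steps as described.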
Source: "ITPUB博客" (ITPUB Blog), link: http://blog.itpub.net/23592834/viewspace-753674/. If you repost this article, please credit the source; otherwise legal action may be pursued.