Microsoft Data Warehouse Toolkit, Reading Notes (1): Preface

Preface

This book covers the process of building a DW/BI system on SQL Server 2005, guided by Kimball's dimensional modeling theory.

Recommended reading before this book:

·         The Data Warehouse Toolkit

·         The Data Warehouse ETL Toolkit

| Title | Subject | Primary audience |
|---|---|---|
| The Data Warehouse Lifecycle Toolkit | Implementation guide | Good overview for all project participants; key tool for project managers, business analysts, and data modelers |
| The Data Warehouse Toolkit, Second Edition | Dimensional data modeling | Data modelers, business analysts, DBAs, ETL developers |
| The Data Warehouse ETL Toolkit | ETL system architecture | ETL architects and developers |

 

The book's website:

       www.MsftDWToolkit.com.

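I.       The Business Dimensional Lifecycle

Many DW projects start out with a mistaken notion: that a data warehouse is just a matter of moving data to a new machine, cleaning it up a bit, and developing some reports, which should take two months at most. This notion is wrong. It is like setting out to cross a river on foot and realizing only once you are in the water that you should have built a bridge.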

II.       Lifecycle Tracks and Task Areas

·         The top track is about technology. These tasks are primarily about planning which pieces of Microsoft technology you’ll use, and how you’ll install and configure them.

·         The middle track is about data. In the data track you’ll design and instantiate the dimensional model, and develop the Extract, Transformation, and Load (ETL) system to populate it. You could think of the data track as “building the data warehouse databases,” although your data warehouse will not succeed unless you surround it with the rest of the Lifecycle tasks.

·         The bottom track is about business intelligence applications. In these tasks you design and develop BI applications for the business users.

III.       Key Terms

1.      The business process dimensional model

is a specific discipline for modeling data that is an alternative to normalized modeling. A dimensional model contains the same information as a normalized model but packages the data in a symmetrical format whose design goals are user understandability, business intelligence query performance, and resilience to change. Normalized models, sometimes called third normal form models, were designed to support the high-volume, single-row inserts and updates that define transaction systems, and generally fail at being understandable, fast, and resilient to change.

2.      The online analytic processing (OLAP) database

is a technology for storing, managing, and querying data specifically designed to support business intelligence uses. SQL Server 2005 Analysis Services is Microsoft’s OLAP database engine. The business process dimensional model can be stored in an OLAP database, but a transactional database cannot, unless it first undergoes transformation to cast it in an explicitly dimensional form.

3.      Business intelligence (BI) applications

are predefined applications that query, analyze, and present information to support a business need. There is a spectrum of BI applications, ranging in complexity from a set of predefined static reports, all the way to an analytic application that directly affects transaction systems and the day-to-day operation of the organization. You can use SQL Server Reporting Services to build a reporting application, and a wide range of Microsoft and third-party technologies to build complex, analytic applications.

4.      A data mining model

is a statistical model, often used to predict future behavior based on data about past behavior. Data mining is a term for a loose (and ever-changing) collection of statistical techniques or algorithms that serve different purposes. The major categories are clustering, decision trees, neural networks, and prediction. Analysis Services Data Mining is an example of a data mining tool.

5.      Ad hoc queries

are formulated by the user on the spur of the moment. The dimensional modeling approach is widely recognized as the best technique to support ad hoc queries because the simple database structure is easy to understand. Microsoft Office, notably Excel pivot tables, is the most popular ad hoc query tool on the market. You can use Reporting Services Report Builder to perform ad hoc querying and simple report definition. Nonetheless, many systems supplement Excel and Report Builder with a third-party ad hoc query tool for their power users.
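To make terms 1 and 5 more concrete, here is a minimal sketch of a dimensional (star schema) model and one ad hoc query against it. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than SQL Server 2005, and the table names, column names, and sample rows (dim_date, dim_product, fact_sales) are hypothetical and do not come from the book.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny star schema: one fact table surrounded by two dimensions.

    All table names, column names, and rows are hypothetical examples.
    """
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year     INTEGER,
            month    INTEGER
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            name        TEXT,
            category    TEXT
        );
        CREATE TABLE fact_sales (
            date_key    INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity    INTEGER,
            amount      REAL
        );
    """)
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20050101, 2005, 1), (20050201, 2005, 2)],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Road Bike", "Bikes"), (2, "Helmet", "Accessories")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [
            (20050101, 1, 2, 2000.0),
            (20050101, 2, 5, 150.0),
            (20050201, 1, 1, 1000.0),
        ],
    )


def sales_by_category(conn):
    """An ad hoc style question: total sales amount per product category."""
    return conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(sales_by_category(conn))
```

Every question of this kind follows the same pattern: join the fact table to whichever dimensions are of interest, then group by their attributes. That regularity is one way to read the "symmetrical format" and "user understandability" goals described under term 1.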

IV.       Roles and Responsibilities

·         The DW/BI manager is responsible for overall leadership and direction of the project. The DW/BI manager must be able to communicate effectively with both senior business and IT management. The manager must also be able to work with the team to formulate the overall architecture of the DW/BI system.

·         The project manager is responsible for day-to-day management of project tasks and activities during system development.

·         The business project lead is a member of the business community and works closely with the project manager.

·         The business systems analyst (or business analyst) is responsible for leading the business requirements definition activities, and often participates in the development of the business process dimensional model. The business systems analyst needs to be able to bridge the gap between business and technology.

·         The data modeler is responsible for performing detailed data analysis including data profiling, and developing the detailed dimensional model.

·         The system architect(s) design the various components of the DW/BI system. These include the ETL system, security system, auditing system, and maintenance systems.

·         The development database administrator (DBA) creates the relational data warehouse database(s) and is responsible for the overall physical design including disk layout, partitioning, and initial indexing plan.

·         The OLAP database designer creates the OLAP databases.

·         The ETL system developer creates Integration Services packages, scripts, and other elements to move data from the source databases into the data warehouse.

·         The DW/BI management tools developer writes any custom tools that are necessary for the ongoing management of the DW/BI system. Examples of such tools include a simple UI for entering metadata, scripts or Integration Services packages to perform system backups and restores, and a simple UI for maintaining dimension hierarchies.

·         The BI application developer is responsible for building the BI applications, including the standard reports and any advanced analytic applications required by the business. This role is also responsible for developing any custom components in the BI portal and integrating data mining models into business operations.

Once the project enters the deployment and maintenance phase, several new roles are added:

·         The data steward is responsible for ensuring the data in the data warehouse is accurate.

·         The security manager specifies new user access roles that the business users need, and adds users to existing roles. The security manager also determines the security procedures in the ETL “back room” of the DW/BI system.

·         The BI portal content manager manages the BI portal. She determines the content that’s on the portal and how it’s laid out, and keeps it fresh.

·         The DW/BI educator creates and delivers the training materials for the DW/BI system.

·         The relational database administrator (DBA) is responsible for managing the performance and operations of the relational data warehouse database.

·         The OLAP DBA is responsible for managing the performance and operations of the OLAP data warehouse database.

·         The compliance manager is responsible for ensuring that the DW/BI policies and operations comply with corporate and regulatory directives such as privacy policies, HIPAA, and Sarbanes-Oxley. The compliance manager works closely with the security manager and Internal Audit.

·         The metadata manager has the final word on what metadata is collected, where it is kept, and how it’s published to the business community. As the book discusses in Chapter 13, metadata tends not to be managed unless there’s a person identified to lead the charge.

·         The data mining analyst is deeply familiar with the business and usually has some background in statistics. The data mining analyst develops data mining models and works with the BI application developers to design operational applications that use the data mining models.

·         User support personnel within the DW/BI team must be available to help business users, especially with ad hoc access. Corporate-wide help desks tend not to have the specialized expertise necessary to do more than assist with minor connectivity issues.


 