TIPS: Data Warehouse

最新推荐文章于 2022-12-22 21:30:35 发布

informatica

最新推荐文章于 2022-12-22 21:30:35 发布

阅读量489

点赞数

分类专栏： TIPS DW 文章标签： transactions schema transformation null date calendar

本文链接：https://blog.csdn.net/informatica/article/details/5678363

版权

TIPS 同时被 2 个专栏收录

5 篇文章 0 订阅

订阅专栏

3 篇文章 0 订阅

订阅专栏

DW Design Procedure
- The basic step of data warehousing starts with data modeling. i.e. creation dimensions and facts.
- Data warehouse starts with collection of data from source systems such as OLTP,CRM,ERPs etc
- Cleansing and transformation process is done with ETL(Extraction Transformation Loading) tool.
- By the end of ETL process target databases(dimensions, facts) are ready with data which accomplishes the business rules.
- Now finally with the use of Reporting tools(OLAP) we can get the information which is used for decision support.
Granularity—Granularity is the level of detail required. In other words, should a data warehouse store every single transaction? Should it summarize transactions as a single record for each day, month, year, and so on? The more granularity the data warehouse contains, the bigger fact tables are because the more records they contain.
Snowflake schema------ a normalized star schema, such that dimension entities are normalized (dimensions are separated into multiple tables). Normalized dimensions have all duplication removed from each dimension, such that the result is a single fact table, connected directly to some of the dimensions.
class->table; object->column (dimension ,measure, detail)
Source DB->ETL (Informatica)->Target DB(ErWin) -> redesign many schema on Target DB for different purpose (Desigher: Schema)->Build many Universe for different requirement from schema(Designer)->Build Query&Reporting (BO&Webintelligent)>Admin
Non-Additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
Difference from OLAP and OLTP

OLTP	OLAP
Current data	Current and historical data
Short database transactions	Long database transactions
Online update/insert/delete	Batch update/insert/delete
Normalization is promoted	De-normalization is promoted
High volume transactions	Low volume transactions
Transaction recovery is necessary	Transaction recovery is not necessary

Materialized View (MV): The view is a tail raid representation of data, but materialized view is stores precaluculated data, view is a logical structure but MV is physical structure, view is can't occupy the data space (use stored query) but MV occupies data space.
Star schema contains the dimension tables mapped around one or more fact tables. It is a denormalised model. No need to use complicated joins. Queries results fast.
Snowflake schema It is the normalised form of Star schema. contains in depth joins ,because the tables r spitted in to many pieces. We can easily do modification directly in the tables. We have to use complicated joins ,since we have more tables . There will be some delay in processing the Query .

The snowflake schema is a schema in which the fact table is indirectly linked to a number of dimension tables. The dimension tables are normalized to remove redundant data and partitioned into a number of dimension tables for ease of maintenance. An example of the snowflake schema is the splitting of the Product dimension into the product_category dimension and product_manufacturer dimension..
Conformed dimensions' are dimensions which are common to the cubes
Generate DATE DIM drop table generic_date_dim
go

CREATE TABLE generic_date_dim (

       generic_date_dim_id int not null primary key,

       date_value           smalldatetime NOT NULL unique,

       the_day              varchar(60) NOT NULL,

       the_month            varchar(60) NOT NULL,

       the_year             varchar(60) NOT NULL,

       day_of_the_month     int NOT NULL,

       day_of_the_year      int NOT NULL,

       week_of_the_year     int NOT NULL,

       month_of_the_year    int NOT NULL,

       calendar_quarter     int NOT NULL,

       fiscal_year          int NOT NULL,

       fiscal_quarter       int NOT NULL

)

go

declare @startDay datetime, @endDay datetime

select @startDay = '1/1/2003',

@endDay =   '4/30/2004'

declare @i int, @currentDay smalldatetime

set @currentDay = @startDay

begin transaction

while @currentDay <= @endDay

begin

insert into generic_date_dim([generic_date_dim_id], [date_value],

[the_day], [the_month], [the_year], [day_of_the_month], [day_of_the_year],

[week_of_the_year], [month_of_the_year], [calendar_quarter], [fiscal_year],

[fiscal_quarter])

select dbo.date$calcIdValue(@currentDay),

@currentDay as dateValue,

datename(dw, @currentDay) as theDay,

datename(month, @currentDay) as theMonth,

datename(year, @currentDay) as theYear,

datepart(day,@currentDay) as dayOfTheMonth,

datepart(dayofyear,@currentDay) as dayOfTheYear,

datepart(week,@currentDay) as weekOfTheYear,

datepart(month,@currentDay) as monthOfTheYear,

cast((datepart(month,@currentDay) / 4) + 1 as varchar(3)) as

calendarQuarter,

datepart(year, @currentDay) as fiscalYear,

cast((datepart(month,@currentDay) / 4) + 1 as varchar(3)) as fiscalQuarter

  if @@error <> 0

begin

raiserror 50000 'Error creating timeByDay dimension'

rollback transaction

goto endofscript

end

  set @currentDay = dateadd(day,1,@currentDay)

end



commit transaction
Data Warehouse Performance Tunings??

informatica

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
TIPS: Data Warehouse

DW Design ProcedureThe basic step of data warehousing starts with data modeling. i.e. creation dimensions and facts.Data warehouse starts with collection of data from source systems such as OLTP,CRM,ERPs etcCleansing and transformation process is done wit
复制链接

扫一扫

专栏目录