"Data Mining: Concepts and Techniques" study notes, Chapter 2 (2/10): Data Warehouse and OLAP Technology for Data Mining

 

Multidimensional data model:

Data warehouses and OLAP tools are based on a multidimensional data model, which views data in the form of a data cube.

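The multidimensional data model comes in three forms:

Star schema: one fact table and several dimension tables.

Snowflake schema: one fact table and several dimension tables, but the dimension tables are normalized, that is, their data is further split into additional tables. This saves storage space but costs query time, because more joins are needed.

Fact constellation schema: several fact tables. Each may have its own dimension tables or share dimension tables with the others, and the dimension tables may or may not be normalized.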
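To make the space/time trade-off between the star and snowflake schemas concrete, here is a minimal sketch in Python. Plain dictionaries stand in for tables; the table names, keys, and rows are all made-up illustration data, not taken from the book.

```python
# Star schema: the location dimension is one denormalized table.
# All rows below are hypothetical illustration data.
location_star = {
    1: {"street": "1 Main St", "city": "Vancouver",
        "province_or_state": "BC", "country": "Canada"},
    2: {"street": "9 Lake Rd", "city": "Chicago",
        "province_or_state": "IL", "country": "USA"},
}

# Snowflake schema: the same dimension normalized into two tables,
# so city-level attributes are stored once per city.
city_table = {
    10: {"city": "Vancouver", "province_or_state": "BC", "country": "Canada"},
    20: {"city": "Chicago", "province_or_state": "IL", "country": "USA"},
}
location_snowflake = {
    1: {"street": "1 Main St", "city_key": 10},
    2: {"street": "9 Lake Rd", "city_key": 20},
}

# Fact table: foreign keys into the dimensions plus a numeric measure.
sales_fact = [
    {"location_key": 1, "dollars_sold": 120.0},
    {"location_key": 2, "dollars_sold": 80.0},
    {"location_key": 1, "dollars_sold": 50.0},
]


def dollars_by_country_star(facts):
    """Total dollars_sold per country with a single dimension lookup."""
    totals = {}
    for row in facts:
        country = location_star[row["location_key"]]["country"]
        totals[country] = totals.get(country, 0.0) + row["dollars_sold"]
    return totals


def dollars_by_country_snowflake(facts):
    """Same query, but it needs one extra lookup (join) per fact row."""
    totals = {}
    for row in facts:
        loc = location_snowflake[row["location_key"]]
        country = city_table[loc["city_key"]]["country"]
        totals[country] = totals.get(country, 0.0) + row["dollars_sold"]
    return totals
```

Both functions return the same totals. The snowflake version avoids repeating city attributes in every location row, at the price of the second lookup.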

 

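Data Mining Query Language (DMQL)

Defining a data cube in DMQL:

define cube <cube_name> [<dimension_list>]: <measure_list>

Defining a dimension in DMQL:

define dimension <dimension_name> as (<attribute_or_subdimension_list>)

Here define cube, define dimension, and as are keywords.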

 

Defining a star schema in DMQL:

define cube sales_star [time, item, branch, location]: dollars_sold = sum(sales_in_dollars), units_sold = count(*)

define dimension time as (time_key, day, day_of_week, month, quarter, year)

define dimension item as (item_key, item_name, brand, type, supplier_type)

define dimension branch as (branch_key, branch_name, branch_type)

define dimension location as (location_key, street, city, province_or_state, country)

 

 

Defining a snowflake schema in DMQL:

define cube sales_snowflake [time, item, branch, location]: dollars_sold = sum(sales_in_dollars), units_sold = count(*)

define dimension time as (time_key, day, day_of_week, month, quarter, year)

define dimension item as (item_key, item_name, brand, type, supplier (supplier_key, supplier_type))

define dimension branch as (branch_key, branch_name, branch_type)

define dimension location as (location_key, street, city (city_key, city, province_or_state, country))

 

Defining a fact constellation schema in DMQL:

define cube sales [time, item, branch, location]: dollars_sold = sum(sales_in_dollars), units_sold = count(*)

define dimension time as (time_key, day, day_of_week, month, quarter, year)

define dimension item as (item_key, item_name, brand, type, supplier_type)

define dimension branch as (branch_key, branch_name, branch_type)

define dimension location as (location_key, street, city, province_or_state, country)

define cube shipping [time, item, shipper, from_location, to_location]: dollars_cost = sum(cost_in_dollars), units_shipped = count(*)

define dimension time as time in cube sales

define dimension item as item in cube sales

define dimension shipper as (shipper_key, shipper_name, location as location in cube sales, shipper_type)

define dimension from_location as location in cube sales

define dimension to_location as location in cube sales

 


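Concept hierarchies

(a) A hierarchy for the attribute location: street < city < province_or_state < country

(b) A lattice for the attribute time: day < {month < quarter; week} < year

If an attribute takes continuous values, discretizing it also yields a concept hierarchy.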
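As a rough illustration of how discretization produces a concept hierarchy, the sketch below bins a continuous attribute into intervals and then groups the intervals into coarser categories. The attribute (price), the bin edges, and the category names are all assumptions chosen for the example.

```python
from bisect import bisect_right

# Hypothetical bin edges for a continuous attribute "price" (in dollars).
FINE_EDGES = [0, 50, 100, 200, 500]


def fine_level(price):
    """Map a price to its interval label, e.g. 75 -> '[50, 100)'."""
    i = bisect_right(FINE_EDGES, price) - 1
    if i < 0:
        raise ValueError("price is below the lowest bin edge")
    if i == len(FINE_EDGES) - 1:
        return f"[{FINE_EDGES[-1]}, inf)"
    return f"[{FINE_EDGES[i]}, {FINE_EDGES[i + 1]})"


# Coarser level: each interval generalizes to one (made-up) category.
COARSE_OF = {
    "[0, 50)": "cheap",
    "[50, 100)": "cheap",
    "[100, 200)": "moderate",
    "[200, 500)": "moderate",
    "[500, inf)": "expensive",
}


def hierarchy_path(price):
    """Return the path from the raw value up to the top level 'all'."""
    fine = fine_level(price)
    return [price, fine, COARSE_OF[fine], "all"]
```

Calling hierarchy_path(75) walks the hierarchy from the raw value through its interval and category up to the top level, which is the same generalization path a roll-up would follow on this attribute.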

 

OLAP operations:

Roll-up, drill-down, slice, dice, and pivot (rotate).
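The five operations can be sketched on a tiny in-memory cube. This is only an illustration in plain Python: the cell layout, the helper names aggregate and select, and all the numbers are assumptions, not part of any OLAP product.

```python
from collections import defaultdict

# A tiny cube as a list of cells:
# (quarter, city, country, item, dollars_sold). Made-up data.
CELLS = [
    ("Q1", "Vancouver", "Canada", "phone", 100),
    ("Q1", "Toronto", "Canada", "phone", 150),
    ("Q1", "Chicago", "USA", "laptop", 200),
    ("Q2", "Vancouver", "Canada", "laptop", 300),
    ("Q2", "Chicago", "USA", "phone", 50),
]
DIMS = ("quarter", "city", "country", "item")


def aggregate(cells, group_by):
    """Sum dollars_sold grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in group_by]
    totals = defaultdict(int)
    for cell in cells:
        totals[tuple(cell[i] for i in idx)] += cell[-1]
    return dict(totals)


def select(cells, **conditions):
    """Keep cells whose value on each named dimension is in the allowed set."""
    return [
        c for c in cells
        if all(c[DIMS.index(d)] in allowed for d, allowed in conditions.items())
    ]


# Drill-down: view the data at the finer city level.
by_city = aggregate(CELLS, ["quarter", "city"])
# Roll-up: climb the location hierarchy from city to country.
by_country = aggregate(CELLS, ["quarter", "country"])
# Slice: fix a single dimension to one value.
q1_slice = select(CELLS, quarter={"Q1"})
# Dice: restrict two or more dimensions at once.
diced = select(CELLS, quarter={"Q1", "Q2"}, item={"phone"})
# Pivot: same aggregate, with the two axes swapped.
pivoted = {(country, quarter): v for (quarter, country), v in by_country.items()}
```

Roll-up and drill-down are the same aggregation at different levels of the location hierarchy, while slice and dice only filter cells and pivot only reorders the axes of the result.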

 

 

 

 
