Python Cubes: Schemas and Models

wh_xia_jun

Schemas and Models

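This section contains example database schemas, their respective models, and explanations. The examples are for the SQL backend. For non-SQL setups, refer to the documentation of the backend you have chosen.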

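See also

Logical Model and Metadata

Description of the logical model.

Backends

Reference of the backends.

Model Reference

Developer's reference of the model classes and functions.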

Basic Schemas

Simple Star Schema

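Synopsis: the fact table has the same name as the cube, and dimension tables have the same names as the dimensions.

The fact table is called sales and has one measure and two dimensions: store and product. Each dimension has two attributes.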

"cubes": [
    {
        "name": "sales",
        "dimensions": ["product", "store"],
        "joins": [
            {"master":"product_id", "detail":"product.id"},
            {"master":"store_id", "detail":"store.id"}
        ]
    }
],
"dimensions": [
    { "name": "product", "attributes": ["code", "name"] },
    { "name": "store", "attributes": ["code", "address"] }
]
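To make the naming convention concrete, here is a minimal sketch that builds the three physical tables in an in-memory SQLite database and runs the star join that the two joins entries describe. It uses only the standard-library sqlite3 module, not the Cubes API. The column types, the sample rows, and the amount column used as the measure are assumptions made for illustration.

```python
import sqlite3

# In-memory database; table and column names follow the model above.
# Column types, sample rows and the "amount" measure column are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, code TEXT, name TEXT);
    CREATE TABLE store (id INTEGER PRIMARY KEY, code TEXT, address TEXT);
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES product(id),
        store_id INTEGER REFERENCES store(id),
        amount REAL
    );
    INSERT INTO product VALUES (1, 'P1', 'Tea'), (2, 'P2', 'Coffee');
    INSERT INTO store VALUES (1, 'S1', '1 Main St'), (2, 'S2', '2 High St');
    INSERT INTO sales VALUES (1, 1, 1, 10.0), (2, 1, 2, 5.0), (3, 2, 1, 7.5);
""")

# {"master": "product_id", "detail": "product.id"} corresponds to
# sales.product_id = product.id, and likewise for the store join.
STAR_JOIN = """
    SELECT product.name, SUM(sales.amount)
    FROM sales
    JOIN product ON sales.product_id = product.id
    JOIN store ON sales.store_id = store.id
    GROUP BY product.name
    ORDER BY product.name
"""
totals = dict(conn.execute(STAR_JOIN).fetchall())
print(totals)
```

A master key without a table part (product_id) is read here as a column of the fact table, which is how the join definitions in the model are written.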

Simple Dimension

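Synopsis: the dimension is represented by only one attribute and has neither details nor a hierarchy.

The schema is similar to the Simple Star Schema. Note that the year dimension is represented by just one numeric attribute.

No attributes need to be specified for the dimension. The dimension is then referenced by its name alone, and the dimension label also serves as the attribute label.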

_images/schema-flat_dimension.png

P.S. Is year comparable to a fact dimension, that is, a degenerate dimension?

"cubes": [
    {
        "name": "sales",
        "dimensions": ["product", "store", "year"],
        "joins": [
            {"master":"product_id", "detail":"product.id"},
            {"master":"store_id", "detail":"store.id"}
        ]
    }
],
"dimensions": [
    { "name": "product", "attributes": ["code", "name"] },
    { "name": "store", "attributes": ["code", "address"] },
    { "name": "year" }
]

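Table Prefix

Synopsis: dimension tables share one common prefix, and fact tables share another common prefix.

In our example, dimension tables carry the prefix dim_, as in dim_product or dim_store, and fact tables carry the prefix fact_, as in fact_sales.

The model does not need to change, only the data store configuration. In Python code we specify the prefix when registering the data store with cubes.Workspace.register_store():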

from cubes import Workspace

workspace = Workspace()
workspace.register_store("default", "sql",
                         url=DATABASE_URL,
                         dimension_prefix="dim_",
                         dimension_suffix="_dim",
                         fact_suffix="_fact",
                         fact_prefix="fact_")

When using the OLAP server, we specify the prefixes in the [store] section of the slicer.ini configuration file:

[store]
...
dimension_prefix="dim_"
fact_prefix="fact_"
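The effect of the prefix and suffix options can be pictured as plain string composition around the logical name. The sketch below is only an illustration of that naming rule: physical_table_name is a hypothetical helper, not a function of the Cubes library.

```python
def physical_table_name(logical_name, prefix="", suffix=""):
    """Compose a physical table name from a logical cube or dimension name.

    Hypothetical helper for illustration; not part of the Cubes API.
    """
    return f"{prefix}{logical_name}{suffix}"


# Names from the example above: dimensions use "dim_", facts use "fact_".
dim_table = physical_table_name("product", prefix="dim_")
fact_table = physical_table_name("sales", prefix="fact_")
print(dim_table, fact_table)
```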

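Non-default Database Schema

Synopsis: all tables are stored in one common schema other than the default database schema.

To specify the database schema in Python (sales_datamart in our example), pass it in the schema argument of cubes.Workspace.register_store():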

workspace = Workspace()
workspace.register_store("default", "sql",
                         url=DATABASE_URL,
                         schema="sales_datamart")

For the OLAP server, the schema is specified in the [store] section of the slicer.ini configuration file:

[store]
...
schema="sales_datamart"

Separate Dimension Schema

Synopsis: dimension tables share one database schema, and fact tables share another database schema.

_images/schema-different_db_schemas.png

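Dimensions can be stored in a database schema different from that of the fact table.

To specify the database schema of the dimensions in Python (dimensions in our example), pass it in the dimension_schema argument of cubes.Workspace.register_store():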

workspace = Workspace()
workspace.register_store("default", "sql",
                         url=DATABASE_URL,
                         schema="facts",
                         dimension_schema="dimensions")

For the OLAP server, the dimension schema is specified in the [store] section of the slicer.ini configuration file:

[store]
...
schema="facts"
dimension_schema="dimensions"
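The two schema options can be read as a rule for qualifying table names. The following sketch shows one plausible reading, under the assumption that dimension tables fall back to schema when no dimension_schema is given. qualified_table is a hypothetical helper written for this illustration, not part of the Cubes API.

```python
def qualified_table(table, kind, schema=None, dimension_schema=None):
    """Return a schema-qualified table name.

    kind is "fact" or "dimension". Assumption: dimensions fall back to
    `schema` when `dimension_schema` is not set. Hypothetical helper.
    """
    if kind == "dimension" and dimension_schema:
        chosen = dimension_schema
    else:
        chosen = schema
    return f"{chosen}.{table}" if chosen else table


# Separate schemas, as in the example above.
fact_name = qualified_table("sales", "fact",
                            schema="facts", dimension_schema="dimensions")
dim_name = qualified_table("product", "dimension",
                           schema="facts", dimension_schema="dimensions")
print(fact_name, dim_name)
```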

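Many-to-Many Relationship

Synopsis: one fact may have multiple dimension members assigned.

There are several ways to handle multiple dimension members per fact. Each has advantages and disadvantages. Here is one of them: using a bridge table.

This is our logical intention: multiple representatives may be involved in one interaction case:

_images/schema-many_to_many-intention.png

We can solve the problem by adding a bridge table and creating an artificial level, representative_group. The group is a unique combination of the representatives who took part in an interaction.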

_images/schema-many_to_many.png

The model looks like this:

"cubes": [
    {
        "dimensions": ["representative", ...],
        "joins": [
            {
                "master":"representative_group_id",
                "detail":"bridge_representative.group_id"
            },
            {
                "master":"bridge_representative.representative_id",
                "detail":"representative.id"
            }
        ]
    }
],
"dimensions": [
    {
        "name": "representative",
        "levels": [
            { "name":"team" },
            { "name":"name", "nonadditive": "any"}
        ]
    }
]

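You may have noticed that the bridge table is hidden: its contents are not visible anywhere in the cube.

Aggregation has one problem when such a dimension is involved: aggregating over any level that is not the most detailed (deepest) one may count dimension members twice or more. This is why the model above adds the property "nonadditive": "any", which marks the level as non-additive.

Some front-ends may not even allow aggregation by levels that are marked as nonadditive.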
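The double counting is easy to reproduce outside of Cubes. The sketch below uses the standard-library sqlite3 module with the bridge and dimension table names from the model. The fact table name interaction, the sample rows, and the use of a plain row count as the measure are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE representative (id INTEGER PRIMARY KEY, team TEXT, name TEXT);
    CREATE TABLE bridge_representative (group_id INTEGER, representative_id INTEGER);
    -- Fact table name is an assumption; the model above does not name the cube.
    CREATE TABLE interaction (id INTEGER PRIMARY KEY, representative_group_id INTEGER);

    INSERT INTO representative VALUES (1, 'A', 'Ann'), (2, 'A', 'Bob'), (3, 'B', 'Cid');
    -- Group 10 = Ann + Bob, group 20 = Ann + Cid.
    INSERT INTO bridge_representative VALUES (10, 1), (10, 2), (20, 1), (20, 3);
    INSERT INTO interaction VALUES (1, 10), (2, 20);
""")

# The true number of facts.
total = conn.execute("SELECT COUNT(*) FROM interaction").fetchone()[0]

# Aggregating at the team level through the bridge table repeats each
# interaction once per involved representative.
by_team = dict(conn.execute("""
    SELECT r.team, COUNT(*)
    FROM interaction AS i
    JOIN bridge_representative AS b ON i.representative_group_id = b.group_id
    JOIN representative AS r ON b.representative_id = r.id
    GROUP BY r.team
""").fetchall())

print(total, by_team)
```

With two interactions in total, the per-team counts add up to four, so team-level figures cannot simply be summed across teams.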

Mappings

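The following patterns use explicit mapping.

Basic Attribute Mapping

Synopsis: a table column has a different name than the dimension attribute or measure.

_images/schema-mapping.png

In our example we have a flat dimension called year, but the physical table column is sales_year. In addition, we have a measure called amount, but the corresponding physical column is named total_amount.

We define the mapping within the cube: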

"cubes": [
    {
        "dimensions": [..., "year"],
        "measures": ["amount"],
        "mappings": {
            "year":"sales_year",
            "amount":"total_amount"
        }
    }
],
"dimensions": [
    ...
    { "name": "year" }
]
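Conceptually, the mappings dictionary is a lookup from logical names to physical columns. The sketch below illustrates that idea with a hypothetical helper, physical_column, which is not part of the Cubes API. The fallback to the logical name when no mapping exists is an assumption, and the quantity attribute is invented for the example.

```python
def physical_column(logical_ref, mappings):
    """Return the physical column for a logical attribute or measure.

    Assumption: without an explicit mapping the logical name is used as is.
    Hypothetical helper for illustration only.
    """
    return mappings.get(logical_ref, logical_ref)


# Mappings taken from the cube definition above.
mappings = {"year": "sales_year", "amount": "total_amount"}

columns = [physical_column(ref, mappings) for ref in ("year", "amount", "quantity")]
print(columns)
```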

Shared Dimension Table

Synopsis: multiple dimensions share the same dimension table.

_images/schema-alias.png

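Clients and suppliers can share one table that holds all organisations and companies (dim_organisation in the example). We have to specify a table alias in the joins part of the cube definition. The table aliases should follow the same naming pattern as the other tables: if we use a dimension prefix, the alias should include the prefix as well.

If the alias follows the dimension naming convention, as in the example below, no mapping is required: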

"cubes": [
    {
        "name": "sales",
        "dimensions": ["supplier", "client"],
        "measures": ["amount"],
        "joins": [
            {
                "master":"supplier_id",
                "detail":"dim_organisation.id",
                "alias":"dim_supplier"
            },
            {
                "master":"client_id",
                "detail":"dim_organisation.id",
                "alias":"dim_client"
            }
        ]
    }
],
"dimensions": [
    {
      "name": "supplier",
      "attributes": ["id", "name", "address"]
    },
    {
      "name": "client",
      "attributes": ["id", "name", "address"]
    }
]
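In SQL terms, the two aliased joins mean that the same physical table is joined twice under different names. Here is a minimal sqlite3 sketch of that pattern. The unprefixed fact table name sales, the column types, and the sample rows are assumptions for illustration, and no Cubes API is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_organisation (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY,
        supplier_id INTEGER,
        client_id INTEGER,
        amount REAL
    );
    INSERT INTO dim_organisation VALUES
        (1, 'Acme', '1 Main St'), (2, 'Globex', '2 High St');
    INSERT INTO sales VALUES (1, 1, 2, 100.0), (2, 2, 1, 40.0);
""")

# One physical table, two aliases that match the "alias" keys in the joins.
rows = conn.execute("""
    SELECT dim_supplier.name, dim_client.name, sales.amount
    FROM sales
    JOIN dim_organisation AS dim_supplier ON sales.supplier_id = dim_supplier.id
    JOIN dim_organisation AS dim_client ON sales.client_id = dim_client.id
    ORDER BY sales.id
""").fetchall()
print(rows)
```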

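Hierarchies

The following patterns show how to specify one or more dimension hierarchies.

Simple Hierarchy

Synopsis: the dimension has more than one level.

The product dimension has two levels: product category and product. The product category level is represented by two attributes, category_code and category. The product level also has two attributes: code and name.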

"cubes": [
    {
        "dimensions": ["product", ...],
        "measures": ["amount"],
        "joins": [
            {"master":"product_id", "detail":"product.id"}
        ]
    }
],
"dimensions": [
    {
        "name": "product",
        "levels": [
            {
                "name":"category",
                "attributes": ["category_code", "category"]
            },
            {
                "name":"product",
                "attributes": ["code", "name"]
            }
        ]
    }
]

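Multiple Hierarchies

Synopsis: the dimension has multiple ways of organizing its levels into hierarchies.

_images/schema-hierarchy2.png

Dimensions such as date (shown below) or geography may have several ways of organizing their attributes into a hierarchy. A date can be composed of year-month-day or year-quarter-month-day.

To define multiple hierarchies, first define all possible levels. Then create a list of hierarchies in which you specify the order of levels for each particular hierarchy.

Here is a code example: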

{
    "name":"date",
    "levels": [
        { "name": "year", "attributes": ["year"] },
        { "name": "quarter", "attributes": ["quarter"] },
        { "name": "month", "attributes": ["month", "month_name"] },
        { "name": "week", "attributes": ["week"] },
        { "name": "weekday", "attributes": ["weekday"] },
        { "name": "day", "attributes": ["day"] }
    ],
    "hierarchies": [
        {"name": "ymd", "levels":["year", "month", "day"]},
        {"name": "ym", "levels":["year", "month"]},
        {"name": "yqmd", "levels":["year", "quarter", "month", "day"]},
        {"name": "ywd", "levels":["year", "week", "weekday"]}
    ],
    "default_hierarchy_name": "ymd"
}

If no hierarchy is specified explicitly, the one named by default_hierarchy_name is used.
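The fallback to the default hierarchy can be sketched in a few lines of plain Python over the dimension dictionary. hierarchy_levels is a hypothetical helper written for this illustration and is not how the Cubes library itself exposes hierarchies.

```python
# Hierarchy part of the date dimension shown above.
date_dimension = {
    "name": "date",
    "hierarchies": [
        {"name": "ymd", "levels": ["year", "month", "day"]},
        {"name": "ym", "levels": ["year", "month"]},
        {"name": "yqmd", "levels": ["year", "quarter", "month", "day"]},
        {"name": "ywd", "levels": ["year", "week", "weekday"]},
    ],
    "default_hierarchy_name": "ymd",
}


def hierarchy_levels(dimension, hierarchy_name=None):
    """Return the ordered level names of the requested hierarchy.

    Falls back to default_hierarchy_name when no name is given.
    Hypothetical helper for illustration only.
    """
    name = hierarchy_name or dimension["default_hierarchy_name"]
    for hierarchy in dimension["hierarchies"]:
        if hierarchy["name"] == name:
            return hierarchy["levels"]
    raise KeyError(f"unknown hierarchy: {name}")


print(hierarchy_levels(date_dimension))
print(hierarchy_levels(date_dimension, "ywd"))
```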

Multiple Tables for Dimension Levels

Synopsis: each dimension level has a separate table.

_images/schema-two_joins.png

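We have to join the additional tables and map the attributes that are not in the "main" dimension table (the table with the same name as the dimension):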

"cubes": [
    {
        "dimensions": ["product", ...],
        "measures": ["amount"],
        "joins": [
            {"master":"product_id", "detail":"product.id"},
            {"master":"product.category_id", "detail":"category.id"}
        ],
        "mappings": {
            "product.category_code": "category.code",
            "product.category": "category.name"
        }
    }
],
"dimensions": [
    {
        "name": "product",
        "levels": [
            {
                "name":"category",
                "attributes": ["category_code", "category"]
            },
            {
                "name":"product",
                "attributes": ["code", "name"]
            }
        ]
    }
]

Note

Joins should be ordered from the master towards the details: always join the tables closer to the fact table before joining the others.
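The ordering rule in the note can be checked mechanically: each join's master table must be the fact table or a table that an earlier join already brought in. The sketch below is one possible check, written as a hypothetical helper (joins_are_ordered) rather than anything provided by Cubes. It assumes that a master key without a table part belongs to the fact table.

```python
def joins_are_ordered(fact_table, joins):
    """Return True if every join starts from an already available table.

    Assumption: a master key without a table part ("product_id") refers
    to the fact table. Hypothetical helper for illustration only.
    """
    joined = {fact_table}
    for join in joins:
        master = join["master"]
        master_table = master.split(".")[0] if "." in master else fact_table
        if master_table not in joined:
            return False
        # The joined table is known by its alias, if one is given.
        joined.add(join.get("alias") or join["detail"].split(".")[0])
    return True


# Joins from the cube above: product first, then category via product.
joins = [
    {"master": "product_id", "detail": "product.id"},
    {"master": "product.category_id", "detail": "category.id"},
]

print(joins_are_ordered("sales", joins))
print(joins_are_ordered("sales", list(reversed(joins))))
```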

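User-oriented Metadata

Model Labels

Synopsis: labels for the parts of the model that are displayed to the user.

_images/schema-labels.png

Labels are used in report tables as column headings or as filter descriptions. Attribute (and column) names should be used only for creating reports; even when they are readable and understandable, they should not be shown to the user in raw form.

Labels can be specified for any model object (cube, dimension, level, attribute) with the label attribute: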

"cubes": [
    {
        "name": "sales",
        "label": "Product Sales",
        "dimensions": ["product", ...]
    }
],
"dimensions": [
    {
        "name": "product",
        "label": "Product",
        "attributes": [
            {"name": "code", "label": "Code"},
            {"name": "name", "label": "Product"},
            {"name": "price", "label": "Unit Price"}
        ]
    }
]

Key and Label Attribute

Synopsis: specify which attributes are used for filtering (keys) and which are displayed in the user interface (labels).

"dimensions": [
    {
        "name": "product",
        "levels": [
            {
                "name": "product",
                "attributes": ["code", "name", "price"],
                "key": "code",
                "label_attribute": "name"
            }
        ]
    }
]

Example use:

result = browser.aggregate(drilldown=["product"])

for row in result.table_rows("product"):
    # row.label carries the value of the level's label attribute
    print("%s: %s" % (row.label, row.record["amount_sum"]))
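The division of roles between the two attributes can be shown without a running workspace: the key selects members, the label attribute is what gets displayed. The sketch below works on plain dictionaries. labels_for_keys is a hypothetical helper and the sample records are invented for illustration.

```python
# Level definition from the model above.
level = {
    "name": "product",
    "attributes": ["code", "name", "price"],
    "key": "code",
    "label_attribute": "name",
}

# Invented sample records for illustration.
records = [
    {"code": "P1", "name": "Tea", "price": 3.5},
    {"code": "P2", "name": "Coffee", "price": 4.0},
    {"code": "P3", "name": "Cocoa", "price": 2.5},
]


def labels_for_keys(level, records, keys):
    """Filter records by the key attribute and return their display labels.

    Hypothetical helper for illustration; not part of the Cubes API.
    """
    key, label = level["key"], level["label_attribute"]
    return [record[label] for record in records if record[key] in keys]


print(labels_for_keys(level, records, {"P1", "P3"}))
```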

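Localization

Localized Data

Synopsis: attributes may have values in multiple languages.

Dimension attributes may have language-specific content. In Cubes this can be achieved by providing one column per language (denormalized localization). The default column name should be the localized attribute's name followed by a locale suffix. For example, if the reported attribute is called name, the columns should be name_en for the English localization and name_hu for the Hungarian localization.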

"dimensions": [
    {
        "name": "product",
        "label": "Product",
        "attributes": [
            {"name": "code", "label": "Code"},
            {
                "name": "name",
                "label": "Product",
                "locales": ["en", "fr", "es"]
            }
        ]
    }
]

Use in Python:

browser = workspace.browser(cube, locale="fr")

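The browser instance will now use only the French localization of the attributes.

In slicer server requests, the language can be specified with the lang= parameter in the URL.

Dimension attributes are referenced in the same way regardless of localization. Reports do not need to change when a new language is added.

Notes:

Each browser instance has only one locale; either switch the locale or create another browser.
When a non-existent locale is requested, the default locale (the first in the localized attribute's list) is used.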
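The column naming rule and the fallback described in the notes can be combined into one small function. This is a sketch of the rule as stated in the text, not the Cubes implementation; localized_column is a hypothetical helper.

```python
def localized_column(attribute, locale=None):
    """Return the physical column name for an attribute in a given locale.

    Rule from the text: <attribute name>_<locale>, falling back to the
    first listed locale when the requested one does not exist.
    Hypothetical helper for illustration only.
    """
    locales = attribute.get("locales")
    if not locales:
        # Attribute is not localized; the plain name is used.
        return attribute["name"]
    chosen = locale if locale in locales else locales[0]
    return f"{attribute['name']}_{chosen}"


# The localized attribute from the model above.
name_attr = {"name": "name", "label": "Product", "locales": ["en", "fr", "es"]}

print(localized_column(name_attr, "fr"))
print(localized_column(name_attr, "hu"))
```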

Localized Model Labels (omitted)
