dbt
在生成table
模型的时候,默认是不带表注释和字段注释的。如果需要添加注释,则需要使用以下方式配置 persist_docs
选项,建表时带上注释。
测试环境
- dbt 1.5
- 数据库:Postgres 14
方式一:在模型中配置 persiste_docs
- 创建测试用的表模型
<dbt-project/models/example/fct_score.sql>
materialized='table'
,指定模型的物化方式为table
,即物理表。此外还有view
,没错,就是你理解的那个视图。incremental
, 增量更新表ephemeral
,临时表,只在dbt
运行过程中使用,不会进入数据库。
persist_docs={'relation': true, 'columns': true}
,建表和字段时使用description
作为注释。如果不带该项配置,则description
只会出现在docs
中。
{{ config(materialized='table', persist_docs={'relation': true, 'columns': true}) }}
with source_data as (
select 1 as id, 1 as student_id, 'chinese' as class, 98 as score union all
select 2 as id, 1 as student_id, 'math' as class, 97 as score union all
select 3 as id, 1 as student_id, 'english' as class, 96 as score union all
select 4 as id, 2 as student_id, 'chinese' as class, 88 as score union all
select 5 as id, 2 as student_id, 'math' as class, 87 as score union all
select 6 as id, 2 as student_id, 'english' as class, 86 as score union all
select 4 as id, 3 as student_id, 'chinese' as class, 83 as score union all
select 5 as id, 3 as student_id, 'math' as class, 82 as score union all
select 6 as id, 3 as student_id, 'english' as class, 81 as score
)
select * from source_data
- 配置
description
在schema.yml
中添加description
models:
- name: fct_score
description: "事实表:学生成绩"
columns:
- name: id
description: "记录ID"
- name: student_id
description: "学生ID"
- name: class
description: "课程"
- name: score
description: "成绩"
- 执行
dbt run -s fct_score
,可以看到已经带上注释了(注意,如果表已存在,需要提前删除,否则不会创建新表)
方式二:在 schema.yml
中配置 persist_docs
dbt
的核心思核想是 Data as Code,即像交付代码一样交付数据,对数据加工代码进行统一的管理。所以最好是在 schema.yml
中配置 persist_docs
,以便开发表模型的时候,可以专注于 SQL。
models:
- name: fct_score
description: "事实表:学生成绩"
persist_docs:
- relation: true
- columns: true
columns:
- name: id
description: "记录ID"
- name: student_id
description: "学生ID"
- name: class
description: "课程"
- name: score
description: "成绩"
方式三:在 dbt-project.yml
中配置 persist_docs
也可以在 dbt-project.yml
中统一开启 persist_docs
,优先级是从下到上,即表模型中的配置优先程度最高, dbt-project.yml
中的配置优先度最低。
但 description
还是需要在 schema.yml
中配置
models:
my_test_dbt:
example:
+materialized: table
+schema: dwd
+persist_docs:
relation: true
columns: true
参考
dbt
官方文档:persist_docsdbt
官方文档:Materializations