【dbt】使用source引用已存在的表

dbt在构建DAG时依据ref()和source()执行模型,并确定数据血缘。ref()用于引用dbt内的模型,而source()用于引用数据库中存在的表。在Postgres数据库中创建dim_student表作为source,dbt中建立fct_score事实表模型,并通过这两者创建fct_student_score表。配置source后,dbt运行生成的表在docs中展示血缘关系。
摘要由CSDN通过智能技术生成

dbt在运行时,会按照ref()source()方法,构造出所有模型的DAG——有向无环图,再按照DAG的顺序执行。此外,dbt也是通过这两个方法,确定数据血缘关系,并在docs中展现出来。
dbt docs 中的数据血源,标记的表即是 source() 引用ref() 引用的是dbt中构造的表模型,但是对于数据库中已经有,在dbt中没有模型的表,虽然可以直接在SQL语句中使用,但这种方法在血源图中就看不到了,需要使用source()引用。

测试环境

  • dbt 1.5
  • 数据库:Postgres 14

构造测试表

  1. Postgres 数据库中直接生成表,做为 source 来源。
create table dim_student as (
	select 1 as id, '张三' as name, 19 as age, 'M' as gender, '计算机' as major union all 
	select 2 as id, '李四' as name, 18 as age, 'M' as gender, '物理'   as major union all
	select 3 as id, '王舞' as name, 20 as age, 'F' as gender, '外语'   as major
)
  1. dbt中建立事实表模型,使用<dbt-project>/models/fct_score.sql
with source_data as (
    select 1 as id, 1 as student_id, 'chinese' as class, 98 as score union all 
    select 2 as id, 1 as student_id, 'math'    as class, 97 as score union all 
    select 3 as id, 1 as student_id, 'english' as class, 96 as score union all 
    select 4 as id, 2 as student_id, 'chinese' as class, 88 as score union all 
    select 5 as id, 2 as student_id, 'math'    as class, 87 as score union all 
    select 6 as id, 2 as student_id, 'english' as class, 86 as score union all 
    select 4 as id, 3 as student_id, 'chinese' as class, 83 as score union all 
    select 5 as id, 3 as student_id, 'math'    as class, 82 as score union all 
    select 6 as id, 3 as student_id, 'english' as class, 81 as score 
)

select * from source_data

配置source

<dbt-project>/models/schema.yml 配置文件中增加 source 配置,如果没有该文件的话,可以自行添加。

  • - name: src,可以在一个 schema 下面放多个来源表,scr 是这个schema的名称,也可以自己取名
  • database,指定来源数据库。
  • schema,指定来源schema
  • tables,下面是来源表
    • - name: dim_student,来源表名
    • descriptions,来源表描述
    • columns:来源表字段,可以不填
version: 2

sources:
  - name: src
    database: postgres
    schema: test_exp
    tables:
      - name: dim_student
        descriptions: "维度表: 学生信息"
        columns:
          - name: id
          - name: name
          - name: age
          - name: gender
          - name: major

测试

使用dim_student (已存在的表) 和 fct_score (在dbt中建立的模型),建立新表 fct_student_score

with source_data as (
    select 
        fct.student_id , 
        dim.name,
        dim.gender,
        dim.age,
        dim.major,
        fct.class,
        fct.score 
    from {{ ref('fct_score') }} fct 
        left join {{ source('test_exp', 'dim_student')}} dim on fct.student_id=dim.id
    order by fct.score desc
)

select * from source_data 

运行 dbt run -s fct_student_score 后生成的表
在这里插入图片描述
docs 中的血源关系
在这里插入图片描述

参考

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值