dbt version 0.4.0
dbt v0.4.0
dbt v0.4.0 provides new ways to materialize models in your database.
0. tl;dr
- new types of materializations:
incremental
andephemeral
- if upgrading, change
materialized: true|false
tomaterialized: table|view|incremental|ephemeral
- optionally specify model configs within the SQL file
1. Feature: {{this}}
template variable #81
The {{this}}
template variable expands to the name of the model being compiled. For example:
-- my_model.sql
select 'the fully qualified name of this model is {{ this }}'
-- compiles to
select 'the fully qualified name of this model is "the_schema"."my_model"'
2. Feature: materialized: incremental
#90
After initially creating a table, incremental models will insert
new records into the table on subsequent runs. This drastically speeds up execution time for large, append-only datasets.
Each execution of dbt run will:
- create the model table if it doesn't exist
- insert new records into the table
New records are identified by a sql_where
model configuration option. In practice, this looks like:
sessions:
materialized: incremental
sql_where: "session_start_time > (select max(session_start_time) from {{this}})"
There are a couple of new things here. Previously, materialized
could either be set to true
or false
. Now, the valid options include view
, table,
incremental
, and ephemeral
(more on this last one below). Also note that incremental models generally require use of the {{this}} template variable to identify new records.
The sql_where
field is supplied as a where
condition on a subquery containing the model definition. This resultset is then inserted into the target model. This looks something like:
insert into schema.model (
select * from (
-- compiled model definition
) where {{sql_where}}
)
3. Feature: materialized: ephemeral
#78
Ephemeral models are injected as CTEs (with
statements) into any model that ref
erences them. Ephemeral models are part of the dependency graph and generally function like any other model, except ephemeral models are not compiled to their own files or directly created in the database. This is useful for intermediary models which are shared by other downstream models, but shouldn't be queried directly from outside of dbt.
To make a model ephemeral:
employees:
materialized: ephemeral
Suppose you wanted to exclude employees
from your users
table, but you don't want to clutter your analytics schema with an employees
table.
-- employees.sql
select * from public.employees where is_deleted = false
-- users.sql
select *
from {{ref('users')}}
where email not in (select email from {{ref('employees')}})
The compiled SQL would look something like:
with __dbt__CTE__employees as (
select * from public.employees where is_deleted = false
)
select *
from users
where email not in (select email from __dbt__CTE__employees)
Ephemeral models play nice with other ephemeral models, incremental models, and regular table/view models. Feel free to mix and match different materialization options to optimize for performance and simplicity.
4. Feature: In-model configs #88
Configurations can now be specified directly inside of models. These in-model configs work exactly the same as configs inside of the dbt_project.yml file.
An in-model-config looks like this:
-- users.sql
-- python function syntax
{{ config(materialized="incremental", sql_where="id > (select max(id) from {{this}})") }}
-- OR json syntax
{{
config({"materialized:" "incremental", "sql_where" : "id > (select max(id) from {{this}})"})
}}
select * from public.users
The config resolution order is:
- dbt_project.yml
model-defaults
- in-model config
- dbt_project.yml
models
config
5. Fix: dbt seed null values #102
Previously, dbt seed
would insert empty CSV cells as "None"
, whereas they should have been NULL
. Not anymore!