Custom schemas
By default, all dbt models are built in the schema specified in your environment (dbt Cloud) or profile's target (dbt Core). This default schema is called your target schema.
For dbt projects with lots of models, it's common to build models across multiple schemas and group similar models together. For example, you might want to:
- Group models based on the business unit using the model, creating schemas such as
core
,marketing
,finance
andsupport
. - Hide intermediate models in a
staging
schema, and only present models that should be queried by an end user in ananalytics
schema.
To do this, specify a custom schema. dbt generates the schema name for a model by appending the custom schema to the target schema. For example, <target_schema>_<custom_schema>
.
Target schema | Custom schema | Resulting schema |
---|---|---|
analytics_prod | None | analytics_prod |
alice_dev | None | alice_dev |
dbt_cloud_pr_123_456 | None | dbt_cloud_pr_123_456 |
analytics_prod | marketing | analytics_prod_marketing |
alice_dev | marketing | alice_dev_marketing |
dbt_cloud_pr_123_456 | marketing | dbt_cloud_pr_123_456_marketing |
How do I use custom schemas?
To specify a custom schema for a model, use the schema
configuration key. As with any configuration, you can do one of the following:
- apply this configuration to a specific model by using a config block within a model
- apply it to a subdirectory of models by specifying it in your
dbt_project.yml
file
{{ config(schema='marketing') }}
select ...
# models in `models/marketing/ will be built in the "*_marketing" schema
models:
my_project:
marketing:
+schema: marketing
Understanding custom schemas
When first using custom schemas, it's a common misunderstanding to assume that a model only uses the new schema
configuration; for example, a model that has the configuration schema: marketing
would be built in the marketing
schema. However, dbt puts it in a schema like <target_schema>_marketing
.
There's a good reason for this deviation. Each dbt user has their own target schema for development (refer to Managing Environments). If dbt ignored the target schema and only used the model's custom schema, every dbt user would create models in the same schema and would overwrite each other's work.
By combining the target schema and the custom schema, dbt ensures that objects it creates in your data warehouse don't collide with one another.
If you prefer to use different logic for generating a schema name, you can change the way dbt generates a schema name (see below).
How does dbt generate a model's schema name?
dbt uses a default macro called generate_schema_name
to determine the name of the schema that a model should be built in.
The following code represents the default macro's logic:
{% macro generate_schema_name(custom_schema_name, node) -%}
{%- set default_schema = target.schema -%}
{%- if custom_schema_name is none -%}
{{ default_schema }}
{%- else -%}
{{ default_schema }}_{{ custom_schema_name | trim }}
{%- endif -%}
{%- endmacro %}
💡 Use Jinja's whitespace control to tidy your macros!
When you're modifying macros in your project, you might notice extra white space in your code in the target/compiled
folder.
You can remove unwanted spaces and lines with Jinja's whitespace control by using a minus sign. For example, use {{- ... -}}
or {%- ... %}
around your macro definitions (such as {%- macro generate_schema_name(...) -%} ... {%- endmacro -%}
).
Changing the way dbt generates a schema name
If your dbt project has a custom macro called generate_schema_name
, dbt will use it instead of the default macro. This allows you to customize the name generation according to your needs.
To customize this macro, copy the example code in the section How does dbt generate a model's schema name into a file named macros/generate_schema_name.sql
and make changes as necessary.
Be careful. dbt will ignore any custom generate_schema_name
macros included in installed packages.
generate_schema_name arguments
Argument | Description | Example |
---|---|---|
custom_schema_name | The configured value of schema in the specified node, or none if a value is not supplied | marketing |
node | The node that is currently being processed by dbt | {"name": "my_model", "resource_type": "model",...} |
Jinja context available in generate_schema_name
If you choose to write custom logic to generate a schema name, it's worth noting that not all variables and methods are available to you when defining this logic. In other words: the generate_schema_name
macro is compiled with a limited Jinja context.
The following context methods are available in the generate_schema_name
macro:
Jinja context | Type | Available |
---|---|---|
target | Variable | ✅ |
env_var | Variable | ✅ |
var | Variable | Limited, see below |
exceptions | Macro | ✅ |
log | Macro | ✅ |
Other macros in your project | Macro | ✅ |
Other macros in your packages | Macro | ✅ |
Which vars are available in generate_schema_name?
Globally-scoped variables and variables defined on the command line with
--vars are accessible in the generate_schema_name
context.
Managing different behaviors across packages
See docs on macro dispatch
: "Managing different global overrides across packages"
A built-in alternative pattern for generating schema names
A common customization is to ignore the target schema in production environments, and ignore the custom schema configurations in other environments (such as development and CI).
Production Environment (target.name == 'prod'
)
Target schema | Custom schema | Resulting schema |
---|---|---|
analytics_prod | None | analytics_prod |
analytics_prod | marketing | marketing |
Development/CI Environment (target.name != 'prod'
)
Target schema | Custom schema | Resulting schema |
---|---|---|
alice_dev | None | alice_dev |
alice_dev | marketing | alice_dev |
dbt_cloud_pr_123_456 | None | dbt_cloud_pr_123_456 |
dbt_cloud_pr_123_456 | marketing | dbt_cloud_pr_123_456 |
Similar to the regular macro, this approach guarantees that schemas from different environments will not collide.
dbt ships with a macro for this use case — called generate_schema_name_for_env
— which is disabled by default. To enable it, add a custom generate_schema_name
macro to your project that contains the following code:
-- put this in macros/get_custom_schema.sql
{% macro generate_schema_name(custom_schema_name, node) -%}
{{ generate_schema_name_for_env(custom_schema_name, node) }}
{%- endmacro %}
When using this macro, you'll need to set the target name in your production job to prod
.
Managing environments
In the generate_schema_name
macro examples shown in the built-in alternative pattern section, the target.name
context variable is used to change the schema name that dbt generates for models. If the generate_schema_name
macro in your project uses the target.name
context variable, you must ensure that your different dbt environments are configured accordingly. While you can use any naming scheme you'd like, we typically recommend:
- dev — Your local development environment; configured in a
profiles.yml
file on your computer. - ci — A continuous integration environment running on pull requests in GitHub, GitLab, and so on.
- prod — The production deployment of your dbt project, like in dbt Cloud, Airflow, or similar.
If your schema names are being generated incorrectly, double-check your target name in the relevant environment.
For more information, consult the managing environments in dbt Core guide.