In analytics engineering, dbt (data build tool) macros provide a powerful way to write dynamic, reusable SQL logic.
In this tutorial, we’ll explore the fundamentals of dbt macros, how to build them using Jinja, best practices for maintainability, and real-world applications, all with code examples.
What Are dbt Macros?
dbt macros are reusable code blocks written using the Jinja templating language within dbt, which allows you to generate SQL dynamically. They help reduce repetition and centralize transformation logic.
These macros work the same way functions do in traditional programming languages.
For example, if you need to calculate a specific metric across multiple models, a macro can encapsulate that calculation, making it easier to maintain and update.
Macros can also be stored in packages, such as dbt-utils.
Why Use dbt Macros?
Macros bring several advantages, mainly:
- Reduce redundancy: You can write a logic block once and reuse it everywhere.
- Consistency: Applying the same transformation logic across models without manual duplication.
- Flexibility: Macros support parameters, making them adaptable to various inputs.
- Cross-project sharing: Well-designed macros can be shared as packages.
These benefits lead to cleaner, more maintainable, and scalable codebases.
dbt Macro Fundamental Concepts
To help you understand how macros work in dbt, let’s look at some key concepts.
1. Reusability
Macros allow you to define a piece of SQL logic once and reuse it across multiple models, tests, or even other macros.
2. Jinja templating engine
Jinja’s integration with dbt makes it possible to insert variables, apply conditions, and loop over lists, all before your SQL reaches the database. dbt compiles this template into raw SQL, making macros database-agnostic.
3. Introduction of programming techniques
Macros can transform SQL into a programmable framework. They allow for logic, loops, and conditions—common things that standard SQL cannot handle as flexibly.
Similar to functions, macros in dbt can accept arguments (parameters) and can be called with different inputs to produce different outputs. They help simplify complex transformations by breaking them down into smaller, reusable units.
Building Macros with Jinja
Next, let’s see how we can build some macros for ourselves.
Jinja is the templating engine that powers dbt macros, letting you treat SQL as dynamic, programmable text.
Template constructs and delimiters
To use Jinja, you’ll need to use specific template constructs and delimiters.
Here are some common constructs and delimiters:
- Expressions {{ }}: Inserts expressions.
- Statements {% %}: Executes statements.
- Comments {# #}: Adds comments that you want dbt to ignore.
In macros, {{ ... }} is used for expressions that should output a value (e.g., column names or calculations), while {% ... %} is used for control flow (loops, if/else statements, macro definitions). Jinja comments use {# ... #} and are removed entirely during compilation.
Here’s a sample of the syntax used for statements used in macros:
{% macro your_macro_name(your_variable) %}
select 'Hello, {{ your_variable }}' as greeting
{% endmacro %}
Here’s an example of how to use statements to create a macro:
{% macro greet(name) %}
select 'Hello, {{ name }}' as greeting
{% endmacro %}
To run the macro from the command line, use this:
dbt run-operation greet --args '{"name": "Austin"}'
Variable management
You can pass variables into macros or define them inside. Filters such as upper or lower can transform text values.
Here’s an implementation of the upper function to change all text values into uppercase.
{% macro to_upper_case(column_name) %}
upper({{ column_name }})
{% endmacro %}
To implement this, you would have to call/reference the newly-created macro in your dbt model.
Here’s a code example:
SELECT
id,
{{ to_upper_case('product_name') }} AS product_name_upper
FROM {{ source('public', 'orders') }}
Control structures
Next, similar to programming languages, dbt macros can offer the following:
- Conditional statements ({% if %}/{% endif %})
- Loops ({% for %}/{% endfor %})
These structures let you dynamically build SQL clauses or generate multiple expressions based on inputs.
This makes it easier for macros to handle complex SQL generation tasks.
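As a sketch of a loop in action, here's a hypothetical macro that generates one conditional sum per status value (assuming the orders table from this tutorial, with its status and revenue columns):

```sql
{% macro revenue_by_status(statuses) %}
    {#-- Generate one conditional sum column per status value --#}
    {%- for s in statuses %}
    sum(case when status = '{{ s }}' then revenue else 0 end) as revenue_{{ s }}
        {{- "," if not loop.last }}
    {%- endfor %}
{% endmacro %}
```

Calling {{ revenue_by_status(['pending', 'shipped', 'delivered']) }} in a model expands into three sum(case when ...) expressions at compile time, one per status.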
Parameter handling and error management
Macros can validate inputs and raise errors at compile time to prevent runtime issues.
Parameters are defined within the macro declaration using () parentheses, similar to function definitions in programming languages.
Here’s an example of how we can include 2 parameters:
{% macro my_macro(param1, param2) %}
SELECT
{{ param1 }},
{{ param2 }}
FROM my_table
{% endmacro %}
As you can see, in the first line, we have included both param1 and param2.
To invoke the macro in a dbt model, you should use double curly braces {{ }}. The parameters are passed as arguments within the parentheses.
SELECT
id,
{{ my_macro('column_a', 'column_b') }}
FROM my_source_table
In this case, we specified both column_a and column_b to be our parameters.
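To illustrate compile-time validation, here's a hedged sketch that rejects missing arguments using dbt's exceptions.raise_compiler_error (the macro name and logic are illustrative, not part of the tutorial project):

```sql
{% macro safe_ratio(numerator, denominator) %}
    {#-- Fail at compile time if either argument is missing --#}
    {%- if numerator is none or denominator is none -%}
        {{ exceptions.raise_compiler_error("safe_ratio requires both a numerator and a denominator") }}
    {%- endif -%}
    case when {{ denominator }} = 0 then null
         else {{ numerator }}::float / {{ denominator }}
    end
{% endmacro %}
```

Because the check runs during compilation, a bad call fails fast with a clear message instead of producing broken SQL at runtime.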
Setup Guide for Working with dbt Macros
Before we start building and using macros, let’s set up a working environment so you can follow along with the examples in this tutorial.
This will ensure you have dbt installed, a sample dataset loaded, and the right folder structure to create macros.
1. Install dbt Core
You’ll need dbt Core installed locally, along with an adapter for your database. Since this tutorial uses Postgres, install both with this command:
pip install dbt-core dbt-postgres
If your project uses a different data warehouse, run the respective command instead:
pip install dbt-core dbt-snowflake
pip install dbt-core dbt-bigquery
2. Create a dbt Project
Before we run any dbt commands, we will need to set up a project environment.
Navigate to the directory where you want to store your dbt project files.
To initialize a dbt project, run:
dbt init macros_tutorial
cd macros_tutorial
This scaffolds a folder structure in this format:
macros_tutorial/
├─ models/
├─ macros/
├─ tests/
├─ seeds/
├─ dbt_project.yml
3. Prepare the Sample Dataset
We’ll use Python to generate a small orders dataset to load into Postgres (or your chosen warehouse).
import pandas as pd
import random
from datetime import datetime, timedelta
from sqlalchemy import create_engine

# Create sample orders dataframe
random.seed(42)
num_rows = 20
start_date = datetime(2025, 1, 1)

data = []
for i in range(1, num_rows + 1):
    order_date = start_date + timedelta(days=random.randint(0, 30))
    ship_date = order_date + timedelta(days=random.randint(1, 10))
    data.append({
        'order_id': i,
        'customer_id': random.randint(100, 105),
        'order_date': order_date.date(),
        'ship_date': ship_date.date(),
        'status': random.choice(['pending', 'shipped', 'delivered']),
        'revenue': round(random.uniform(50, 500), 2),
        'cost': round(random.uniform(20, 400), 2)
    })

df = pd.DataFrame(data)

# Load into Postgres
engine = create_engine('postgresql://user:password@localhost:5432/dbt_demo')
df.to_sql('orders', engine, if_exists='replace', index=False)
Note: Update the connection string for your database.
4. Create a Source in dbt
Add the following to models/src_orders.yml:
version: 2

sources:
  - name: public
    schema: public
    tables:
      - name: orders
5. Test the Connection and Source
dbt debug
dbt ls --resource-type source
The first command verifies your database connection; the second confirms dbt can see the orders source.
You’re now ready to start writing macros that operate on this dataset.
How to Write and Use dbt Macros
Now, let’s try out writing some macros of our own.
Defining macros
Place macro definitions in .sql files under the macros/ directory.
When naming macros, make sure to give each macro a descriptive name that reflects its purpose, and use parameters to make it flexible. Adding a brief docstring comment at the top helps teammates understand its intent.
Here’s an example:
-- macros/cents_to_dollars.sql
{% macro cents_to_dollars(column_name, precision=2) %}
({{ column_name }} / 100.0)::numeric(18, {{ precision }})
{% endmacro %}
Invoking macros in models
You call macros inside models with {{ macro_name(arg1, arg2) }}. They can be used in SELECT clauses, WHERE filters, or even entire CTE blocks, allowing you to inject reusable SQL patterns anywhere.
Here's an example:
-- models/my_model.sql
SELECT
order_id,
{{ cents_to_dollars('amount_in_cents') }} AS amount_in_dollars,
{{ cents_to_dollars('tax_in_cents', precision=4) }} AS tax_in_dollars
FROM {{ ref('raw_orders') }}
Invoking macros from the CLI
For operational tasks outside model builds, use this format/syntax:
dbt run-operation macro_name --args '{"param": "value"}'
This lets you execute utility SQL like schema creation or data cleanup without building a model.
Here's a command example:
dbt run-operation greet --args '{"name": "Austin"}'
Using macros from packages
Macros can also come from packages in dbt, such as dbt-utils.
To do this, declare the package in packages.yml and run dbt deps; its macros then become available throughout your project.
You can then call on the newly added macros with this format: package_name.macro_name.
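For example, assuming you've added dbt-utils to packages.yml and run dbt deps, you can call one of its macros, such as generate_surrogate_key, with the package prefix:

```yaml
# packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
```

```sql
select
    {{ dbt_utils.generate_surrogate_key(['order_id', 'customer_id']) }} as order_key,
    *
from {{ source('public', 'orders') }}
```

This gives you well-tested helpers without writing or maintaining the macro code yourself.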
Macro file organization
For better organization, sort and separate macros by functionality to improve maintainability.
Try to group related macros into separate files (e.g., date_utils.sql, string_utils.sql) for maintainability. This makes it easier to locate and update logic over time.
Examples of dbt Macros in Practice
Here are some examples of macros in more practical use cases:
Date comparison macro
Let’s say you want to create a function that you will use often to compare dates.
In this case, you can use a macro to compute day differences between two date fields while handling SQL dialect differences.
Start by creating a new file here: macros/date_utils.sql.
Next, use the following code:
{% macro days_between(start_col, end_col) %}
    {#-- Handle different warehouses --#}
    {% if target.type in ['postgres', 'redshift'] %}
        (DATE_PART('day', {{ end_col }}::timestamp - {{ start_col }}::timestamp))
    {% elif target.type == 'snowflake' %}
        DATEDIFF('day', {{ start_col }}, {{ end_col }})
    {% elif target.type == 'bigquery' %}
        DATE_DIFF(CAST({{ end_col }} AS DATE), CAST({{ start_col }} AS DATE), DAY)
    {% else %}
        -- Fallback to ANSI-ish expression; may need adjustment per engine
        CAST({{ end_col }} AS DATE) - CAST({{ start_col }} AS DATE)
    {% endif %}
{% endmacro %}
Now, we’ll use this new days_between macro in a model.
with src as (
select * from {{ source('public', 'orders') }}
)
select
order_id,
customer_id,
order_date,
ship_date,
{{ days_between('order_date', 'ship_date') }} as days_to_ship
from src
Customized generate_schema_name macro
By default, dbt uses a default macro called generate_schema_name. This macro is used for specifying the name of the schema that a model should be built in.
You can also override dbt’s default to control the schema used per environment.
You can include this file at the location: macros/generate_schema_name.sql.
{% macro generate_schema_name(custom_schema_name, node) -%}
    {#- Policy:
        - prod: analytics
        - nonprod: dev_analytics, ci_analytics, etc.
    -#}
    {%- if custom_schema_name is none -%}
        {{ target.schema }}
    {%- elif target.name == 'prod' -%}
        {{ custom_schema_name | trim }}
    {%- else -%}
        {{ (target.name ~ '_' ~ custom_schema_name) | lower | replace(' ', '_') }}
    {%- endif -%}
{%- endmacro %}
How it helps: On dev, models land in dev_analytics, keeping them isolated from production. On prod, they land in analytics. This creates predictable, collision-free deployments.
Executing SQL outside a model
You can also run SQL commands outside models for operations like table cleanup or metadata updates without creating models.
You can write a macro that executes raw SQL and then run it via dbt run-operation.
This is useful for DDL (dropping/creating tables) or metadata updates that don’t fit into a model.
Here’s the macro:
{% macro cleanup_temp_tables() %}
{% set sql %}
drop table if exists {{ target.schema }}.temp_table;
delete from {{ target.schema }}.audit_log where event_type = 'test';
{% endset %}
{{ run_query(sql) }}
{% endmacro %}
Next, run this operation in the terminal:
dbt run-operation cleanup_temp_tables
This will allow you to run macros without including them within models.
Parameterized macro example
We can also create a macro that’s highly parameterized, like in a function.
In this example, we’ll filter records by status and date range, applying each condition only when a value is provided. This uses if and do statements to build the logic.
{% macro where_orders(status=None, min_date=None, max_date=None) %}
    {%- set clauses = [] -%}
    {%- if status is not none -%}
        {%- do clauses.append("status = '" ~ status ~ "'") -%}
    {%- endif -%}
    {%- if min_date is not none -%}
        {%- do clauses.append("order_date >= '" ~ min_date ~ "'") -%}
    {%- endif -%}
    {%- if max_date is not none -%}
        {%- do clauses.append("order_date < '" ~ max_date ~ "'") -%}
    {%- endif -%}
    {%- if clauses | length > 0 -%}
        where {{ clauses | join(' and ') }}
    {%- endif -%}
{% endmacro %}
Now we apply this macro in a model:
with src as (
select * from {{ source('public', 'orders') }}
)
select *
from src
{{ where_orders(status='shipped', min_date='2025-01-05') }}
The macro is built so that when an argument is omitted, that criterion is simply ignored. This keeps optional logic tidy.
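As a sanity check, the call above should compile to something roughly like the following (viewable under target/compiled/; the exact quoting and relation name depend on your database and schema):

```sql
with src as (
    select * from "dbt_demo"."public"."orders"
)
select *
from src
where status = 'shipped' and order_date >= '2025-01-05'
```

Since max_date was omitted, no upper-bound condition appears in the compiled where clause.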
Best Practices for dbt Macros
Designing great macros is about more than getting the SQL to compile. The best macros are discoverable, well-documented, predictable, testable, and easy to deprecate or extend.
Here are some best practices to follow:
1. Naming conventions and parameter design
Similar to SQL and programming languages, naming conventions need to be followed to ensure readability.
Here are some naming tips:
- Be explicit and action-oriented. Prefer verbs or clear intents (e.g., generate_select_list, where_orders) over vague names like helper.
- Keep names stable and lowercase. Use snake_case and stable prefixes by domain, e.g., date_*, string_*, test_*, admin_*.
2. Documentation strategies
You should develop habits to help document what you create. Documentation should be done in places where devs look.
For example, combine inline docstrings + a macros.yml (or schema.yml) with descriptions and examples.
Documentation content can include:
- Purpose and expected output (1–2 sentences)
- Parameter table with defaults and allowed values
- Short example of invocation in a model and with run-operation
- Cross-database behavior notes
Example snippet of a config file:
version: 2

macros:
  - name: days_between
    description: |
      Return day difference between two date expressions, normalized across warehouses.
    arguments:
      - name: start_col
        type: string
        description: Column/expression for the start date
      - name: end_col
        type: string
        description: Column/expression for the end date
    docs:
      show: true
3. Code organization
Your code will also need to be well-organized, especially when you have to manage a large number of macros in your directory.
Try to structure by domain for ease of use. Here are some examples of that:
- macros/date_utils.sql
- macros/string_utils.sql
- macros/testing/ (custom tests)
- macros/admin/ (DDL, housekeeping)
- macros/compat/ (warehouse shims)
4. Keep macros small and single-purpose
A macro should do one thing well. If you find a macro growing too large or handling unrelated logic, break it into smaller macros. This improves reusability and makes testing more straightforward.
5. Test macros with representative data
Use dedicated test models or a staging environment to validate macro behavior.
For example:
- Create a small seed dataset that covers edge cases.
- Write a test model that calls the macro with different parameters.
- Use dbt run and inspect the compiled SQL for correctness.
Pro tip: Pair macros with dbt test custom tests to verify the outputs meet expected constraints.
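As a sketch of that pairing, a custom generic test is itself defined as a macro-like block. Here's a hypothetical is_positive test you could apply to the revenue column of the tutorial dataset:

```sql
-- macros/testing/is_positive.sql
{% test is_positive(model, column_name) %}
    -- Fails if any rows have a non-positive value in the column
    select *
    from {{ model }}
    where {{ column_name }} <= 0
{% endtest %}
```

You can then attach it to a column in your model's YAML (under tests: - is_positive) and run it with dbt test.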
Common Use Cases and Applications
dbt macros are versatile, and their applications extend into many day‑to‑day analytics engineering workflows.
Here are some expanded, detailed examples:
- Standardizing data transformations: Macros can ensure the same naming conventions, trimming, null-handling, and type-casting logic is applied to multiple models. For instance, you can create a standardize_customer_name macro to clean and format names identically across all marts.
- Supporting cross-database queries: Warehouses have subtle SQL dialect differences. Macros can abstract those differences, like using DATEDIFF on Snowflake and DATE_DIFF on BigQuery. This makes your models portable.
- Automating quality checks: Write reusable macros that generate data tests, such as checking for duplicate IDs, null critical columns, or out-of-range values. These can be invoked in multiple test models.
- Simplifying repetitive SQL patterns: Common CTEs, calculated fields, or filtering clauses can be parameterized in macros. For example, a macro to build dynamic WHERE clauses based on optional filters prevents copy-pasting complex logic.
Advanced Techniques and Performance Optimization
dbt macros can also be used in more advanced applications. Let’s look at two common ways below.
Recursive and meta-programming patterns
Macros can call other macros, loop through metadata, or dynamically generate SQL fragments based on schema inspection as well.
This allows you to:
- Automatically create staging models for every source table.
- Generate pivot/unpivot queries without manually typing each column.
- Build environment-aware queries by looping through config variables.
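As one hedged sketch of this pattern, a macro can inspect a relation's columns at compile time with adapter.get_columns_in_relation and emit a select list that skips excluded columns (similar in spirit to dbt_utils.star; the macro name here is illustrative):

```sql
{% macro select_all_except(relation, exclude=[]) %}
    {#-- Look up the relation's columns and emit all but the excluded ones --#}
    {%- set cols = adapter.get_columns_in_relation(relation) -%}
    {%- for col in cols if col.name not in exclude %}
    {{ col.name }}{{ "," if not loop.last }}
    {%- endfor %}
{% endmacro %}
```

In a model you might call {{ select_all_except(source('public', 'orders'), exclude=['cost']) }} to generate the column list without typing it by hand.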
Performance optimization strategies
You can also perform further optimization to help you boost performance:
- Minimize query complexity by selectively including only the necessary columns or joins.
- Adjust materializations or SQL strategies based on warehouse type.
Integration with the dbt Ecosystem
Macros are created to be a part of the dbt ecosystem, making them even more powerful when paired with packages, cloud features, and external tools.
Package management
Many community dbt packages such as dbt-utils, dbt-date, and dbt-expectations, are essentially collections of highly reusable macros.
These packages bring prebuilt transformations, tests, and helpers without reinventing the wheel.
Compatibility with dbt Cloud and developer tools
Macros work seamlessly in dbt Cloud environments, meaning your dynamic SQL logic behaves the same whether run locally or in a scheduled cloud job.
They also integrate well with version control, CI/CD pipelines, and code review processes, ensuring macro updates are tested and deployed consistently.
Real-World Implementation Strategies
Moving from traditional SQL scripts to a macro-driven dbt workflow requires both technical changes and cultural adoption.
Migration and change management
When migrating, start by identifying repeated SQL patterns across models. Next, refactor these into macros incrementally, validating that compiled SQL matches the old logic.
Use feature flags (var()) to toggle new macro logic on/off during rollout, minimizing disruption. Most importantly, you’ll want your stakeholders to be informed of changes to avoid breaking downstream queries.
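As an illustration of that toggle pattern (the variable name is hypothetical), a model can branch on a var() flag with a safe default of false:

```sql
select
    order_id,
    {% if var('use_new_revenue_logic', false) %}
    revenue - cost as net_revenue
    {% else %}
    revenue as net_revenue
    {% endif %}
from {{ source('public', 'orders') }}
```

During rollout, you can enable the new logic for a single run with dbt run --vars '{"use_new_revenue_logic": true}' before making it the default.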
Governance and deployment strategies
For governance, start by establishing clear naming conventions, parameter standards, and documentation requirements for all macros.
Next, review macro changes in pull requests just like model updates, ensuring they meet performance and compatibility standards.
In production, monitor job logs for macro-generated queries to verify they run as expected. For macro-heavy projects, set up observability dashboards to track compile times and query performance by macro type.
Conclusion
dbt macros are a key part of productivity and maintainability in modern analytics engineering teams. Pairing dbt models with well-designed macros leads to cleaner, more consistent transformations across your project.
Want to learn more about dbt? Our Introduction to dbt course and dbt Tutorial might be helpful for you.
dbt Macros FAQs
How can I optimize the performance of my dbt macros?
The key is to make sure the SQL your macro produces is efficient. Avoid using loops to build SQL line by line. Try to let the database handle things with set-based queries instead. Also, keep macros focused and avoid adding too many layers of logic that generate overly complex SQL. If you find yourself repeating the same calculation, consider caching it inside the macro or moving it to a model.
What are some common pitfalls when using dbt macros?
A common mistake is overcomplicating macros so much that they’re hard to understand or debug. Another is not considering differences between databases. Sometimes you might forget to escape variables, which can cause errors or generate SQL that is unreadable for teammates.
How do I debug issues with dbt macros?
Start by looking at the compiled SQL in the target/compiled folder. This file shows you exactly what the macro produced. You can also add {{ log('message', info=True) }} statements inside your macro to see what’s happening at runtime. If something breaks, simplify the inputs and test with smaller examples using dbt run-operation until you find the problem.
Can dbt macros be used with other tools or frameworks?
Yes, but only through dbt itself. For example, tools like Airflow or Dagster can run dbt commands that call your macros. Within dbt, macros can also be shared through packages like dbt-utils. However, you can’t directly run a dbt macro in another tool unless you wrap it in a dbt command.
What are the best practices for documenting dbt macros?
Think of macros like small functions that other people will use. Write clear descriptions of what they do, list out all the arguments (with defaults if possible), and show short examples of how to use them. If the macro behaves differently on different databases, note that too. Good documentation makes it easier for teammates to trust and reuse your work.

I'm Austin, a blogger and tech writer with years of experience both as a data scientist and a data analyst in healthcare. Starting my tech journey with a background in biology, I now help others make the same transition through my tech blog. My passion for technology has led me to my writing contributions to dozens of SaaS companies, inspiring others and sharing my experiences.
