Your dbt DAG Has a Date Problem (And It Looks Clean)

Our analyst was building a shipping performance dashboard. She wanted to answer one question: how many orders placed in Q1 actually shipped in Q2?

She connected fct_orders to our date dimension in the BI tool. Both the order date FK and the ship date FK pointed to the same dim_date table (the only date dimension we had). She set a Q1 filter. Hit refresh.

Every metric dropped to near zero.

What happened: the BI tool saw one dim_date connected to two foreign keys. When she filtered Q1, it applied to both simultaneously. Only rows where the order month and the ship month were both Q1 survived. Cross-date analysis was impossible. You couldn’t ask “ordered in Q1, shipped in Q2” because the tool had no way to distinguish the two roles of the same dimension.

That conversation introduced me to role-playing dimensions.

The setup that caused it is common — and the fix is simpler than it looks.

An Order Has a Life — And Four Dates

Take e-commerce as an example. An order isn’t a single moment in time. It’s a sequence of events:

A customer clicks “buy”: order date
The warehouse picks, packs, and dispatches: ship date
The courier delivers it: delivery date
Occasionally it comes back: return date

Four dates. One order. Each one matters to a different team:

Finance tracks order dates for revenue recognition
Logistics measures fulfilment performance by ship date
Customer experience lives by delivery dates
The returns team tracks all of them

The data model seems obvious: one fact table, fct_orders, four date foreign keys. The challenge is that all four need to join to dim_date, and that is where instinct leads you somewhere that works on the surface but creates problems.

What Everyone Tries First

When you first hit this, the fast answer is to write multiple CTEs in your fact table, all pointing at dim_date, and alias the columns you need in the SELECT:

-- fct_orders.sql
WITH order_dates    AS (SELECT * FROM {{ ref('dim_date') }}),
     ship_dates     AS (SELECT * FROM {{ ref('dim_date') }}),
     delivery_dates AS (SELECT * FROM {{ ref('dim_date') }}),
     return_dates   AS (SELECT * FROM {{ ref('dim_date') }})

SELECT
    o.order_id,
    od.date_key_sk    AS order_date_key_sk,
    sd.date_key_sk    AS ship_date_key_sk,
    dd.date_key_sk    AS delivery_date_key_sk,
    rd.date_key_sk    AS return_date_key_sk,
    od.month_number   AS order_month,
    sd.month_number   AS ship_month
FROM orders o
LEFT JOIN order_dates    od ON o.order_date    = od.date_day
LEFT JOIN ship_dates     sd ON o.ship_date     = sd.date_day
LEFT JOIN delivery_dates dd ON o.delivery_date = dd.date_day
LEFT JOIN return_dates   rd ON o.return_date   = rd.date_day

It works. The tests pass. Your dbt DAG shows one dim_date node. You ship it and move on.

graph LR
    slv_dim_date(slv_dim_date) --> dim_date(dim_date)
    dim_date -->|4 date FKs, same node| fct_orders(fct_orders)

The cracks appear later:

A second fact table (say, fct_returns) needs order_date and return_date too. You copy the CTEs.
The analyst wants to slice by order_week_number. She drags dim_date into the BI tool. One node, four FKs. It has no idea which date role she means:

Column from dim_date Which date does this belong to?
month_number Order date? Ship date? Delivery?
quarter_number Order date? Ship date? Delivery?
fiscal_year Order date? Ship date? Delivery?
Someone checks the DAG to understand what date context lives in the fact table. There’s one dim_date node. It tells them nothing about the four roles it plays.
You add a new fiscal calendar attribute to dim_date. Now you grep across every fact table to find which CTEs are SELECT * and which are selective, because the aliasing is buried in each file individually.

Column from `dim_date`	Which date does this belong to?
`month_number`	Order date? Ship date? Delivery?
`quarter_number`	Order date? Ship date? Delivery?
`fiscal_year`	Order date? Ship date? Delivery?

None of this is fatal. But it compounds quietly, and by the time it’s a real problem you’ve already done the damage across five models. The DAG looks clean. The problem only surfaces when a BI analyst connects fct_orders to dim_date and tries to filter by date role.

Two Wrong Answers

When the CTE approach breaks down, two “solutions” tend to come up in conversation. Both are worse than the problem.

Duplicate the physical table per role. Create dim_order_date, dim_ship_date, dim_delivery_date, and dim_return_date as four separate physical tables, each a full copy of dim_date. The BI problem is solved. But now you have four tables to refresh, four tables that can drift out of sync if a pipeline run fails, and four places to update when you add a fiscal year attribute. You’ve traded one clean dimension for four fragile ones.

Create a combination table. The idea: instead of joining dim_date four times, build one special table that has all the date combinations pre-baked in. One join, done.

It would look like this:

order_date	ship_date	order_month	ship_month
2024-01-15	2024-01-17	January	January
2024-01-15	2024-01-19	January	January
2024-03-02	2024-04-05	March	April
…	…	…	…

Two problems with this:

It grows with your data, not with the calendar. A standard dim_date has one row per calendar day — roughly 365 rows per year regardless of how many orders you process. A combination table has one row per unique date pair in your transaction history. A million orders could mean millions of rows.
It only works for one fact table. This table was built around the specific date pairs that exist in fct_orders. Your fct_returns table has completely different date combinations. You’d need a separate combination table for every fact table that needs multiple dates — and none of them can be shared.

The whole point of a date dimension is that it’s a clean, calendar-based reference that every fact table in your warehouse can join to. A combination table tied to transaction data throws that away. Kimball identifies this as a documented anti-pattern in The Data Warehouse Toolkit (3rd ed.).

Both paths lead somewhere worse. The right fix is different.

There’s a Name For This

Kimball called it role-playing dimensions.

The concept is straightforward: a single physical dimension table can serve multiple logical roles in the same fact table. An order date and a ship date are both dates with the same grain and same attributes, but they represent entirely different business events. The solution is to build and administer one physical table, then create the illusion of independent dimensions through views, with each view’s columns uniquely prefixed per role.

Kimball puts it plainly in The Data Warehouse Toolkit (3rd ed.):

“Be careful to uniquely label the columns in each of the views or aliases. For example, the order month attribute should be uniquely labeled to distinguish it from the requested ship month. If you don’t establish unique column names, you wouldn’t be able to tell the columns apart when both are dragged into a report.”

He also warns against relying solely on a BI tool’s semantic layer for this aliasing. If your organisation uses more than one BI tool (or anyone runs direct SQL queries), the semantic layer doesn’t help them. Views are more durable.

In practice the pattern is three layers deep: one physical table in gold (dim_date), one lightweight view per role (dim_order_date, dim_ship_date, and so on), and a fact table whose CTEs reference those views by name. The views are three-to-five lines each. The fact table gets clean, unambiguous FKs. The DAG shows every role as a distinct, named node.

The dbt Implementation

Here is the full pattern. One physical table in gold:

-- models/gold/shared/dim_date.sql
{{ config(materialized='table') }}

SELECT * FROM {{ ref('slv_dim_date') }}

Each role gets its own prefixed view:

-- models/gold/ecommerce/dim_order_date.sql
{{ config(materialized='view') }}

SELECT
    date_key_sk    AS order_date_key_sk,
    date_day       AS order_date_day,
    month_number   AS order_month_number,
    year_number    AS order_year_number,
    quarter_number AS order_quarter_number
    -- all attributes prefixed with 'order_'
FROM {{ ref('dim_date') }}

-- models/gold/ecommerce/dim_ship_date.sql
{{ config(materialized='view') }}

SELECT
    date_key_sk    AS ship_date_key_sk,
    date_day       AS ship_date_day,
    month_number   AS ship_month_number,
    year_number    AS ship_year_number,
    quarter_number AS ship_quarter_number
    -- all attributes prefixed with 'ship_'
FROM {{ ref('dim_date') }}

The fact table becomes clean:

-- models/gold/ecommerce/fct_orders.sql
WITH order_dates    AS (SELECT * FROM {{ ref('dim_order_date') }}),
     ship_dates     AS (SELECT * FROM {{ ref('dim_ship_date') }}),
     delivery_dates AS (SELECT * FROM {{ ref('dim_delivery_date') }}),
     return_dates   AS (SELECT * FROM {{ ref('dim_return_date') }})

SELECT
    o.order_id,
    od.order_date_key_sk,
    sd.ship_date_key_sk,
    dd.delivery_date_key_sk,
    rd.return_date_key_sk,
    od.order_month_number,
    sd.ship_month_number
FROM orders o
LEFT JOIN order_dates    od ON o.order_date    = od.order_date_day
LEFT JOIN ship_dates     sd ON o.ship_date     = sd.ship_date_day
LEFT JOIN delivery_dates dd ON o.delivery_date = dd.delivery_date_day
LEFT JOIN return_dates   rd ON o.return_date   = rd.return_date_day

What changed:

The column naming contract lives in the view, not buried across multiple fact files
The dbt DAG now shows distinct nodes for each role. A BI analyst browsing the catalog finds dim_order_date, dim_ship_date, dim_delivery_date, and dim_return_date as named, documented dimensions
Each date FK in the fact table’s YAML can now carry a relationships data test pointing to its named role view. With the alias CTE approach, that contract is invisible to dbt’s test framework
A second fact table that needs order_date just does ref('dim_order_date') with no copy-pasting CTEs
A new attribute added to dim_date requires a one-line alias addition to each role view — one place per role, not a change scattered across every fact file

Here is what the DAG looks like now:

graph LR
    slv_dim_date(slv_dim_date) --> dim_date(dim_date)
    dim_date --> dim_order_date(dim_order_date)
    dim_date --> dim_ship_date(dim_ship_date)
    dim_date --> dim_delivery_date(dim_delivery_date)
    dim_date --> dim_return_date(dim_return_date)
    dim_order_date --> fct_orders(fct_orders)
    dim_ship_date --> fct_orders
    dim_delivery_date --> fct_orders
    dim_return_date --> fct_orders

Views are resolved at query execution time on every major warehouse (Databricks, Snowflake, BigQuery, Redshift). Both the alias CTE approach and the role-playing view approach read the same underlying physical table. No extra scan, zero performance difference. The role-playing pattern is purely an organisation and naming decision. Your warehouse doesn’t care, but your team will.

”But Now I Have Four Date Dimensions”

Yes. That’s fine.

When you first see a domain folder with four dim_*_date.sql files, it feels like things have gotten complicated. It hasn’t. Each of those is a three-to-five line view. Views have zero storage cost at rest on any modern warehouse. You are not duplicating any data.

The other thing that helps with the “explosion” feeling: where you put these files communicates scope without extra documentation.

For a single-domain e-commerce project, all four date dims belong together in the domain folder. They’re all ecommerce-specific:

gold/
└── ecommerce/
    ├── dim_order_date.sql
    ├── dim_ship_date.sql
    ├── dim_delivery_date.sql
    └── dim_return_date.sql

shared/ enters the picture when a dim crosses domain boundaries. If your finance team’s revenue fact table also needs dim_order_date for revenue recognition (order dates matter to finance, not just logistics), that’s the signal to move it:

gold/
├── shared/
│   └── dim_order_date.sql    ← now referenced by ecommerce AND finance
└── ecommerce/
    ├── dim_ship_date.sql
    ├── dim_delivery_date.sql
    └── dim_return_date.sql

The folder is a live indicator of how broadly a dim is used across your warehouse. You never have to write that down separately. The location says it.

When To Use Each Approach

The alias CTE approach is not wrong. It is the right tool in a specific context. Here is how I now think about which to reach for:

Stick with alias CTEs when:

You are in a sandbox, an ad-hoc analysis, or a one-off exploration query
The model is not consumed by a BI tool
You are prototyping and the model is not production-bound

Create a role-playing dimension when:

The date FK appears in a production gold fact table
A BI tool needs to browse it as a named, discoverable dimension
More than one fact table will reference the same date role (now or eventually)
You want the column prefixing contract in one place, not scattered across fact files

Where to put it:

gold/shared/ if two or more fact tables across different domains reference it
gold/<domain>/ if it is scoped to one domain’s fact tables

One exception: BI-layer aliasing. Some semantic layers (Looker LookML, Sigma, Lightdash) can alias dimension columns per role without separate warehouse views. If all data access flows through a single BI tool with a mature semantic layer, that layer can manage role disambiguation. The risk: any direct SQL access — a data scientist’s notebook, an ad-hoc Databricks session, a dbt show query — bypasses the semantic layer entirely. Views are the more durable default; they work regardless of what the query comes from.

Not All Date FKs Are Equal

One nuance worth calling out: not every date column in a fact table is the same kind of date.

Business event dates describe something that happened in the domain: an order was placed, an item shipped, a delivery arrived, a subscription was cancelled. These are the dates analysts reach for when they build reports. They always deserve a role-playing dim with prefixed columns.

System timestamps, typically created_at, record when the row arrived in the source system. This is an audit date, not a domain event. Analysts rarely ask “show me orders by the date the record was inserted.” In most transactional systems, created_at is used primarily as a join predicate for resolving slowly changing dimensions, not as a primary analytical lens.

For system timestamps, the rule depends on one thing: is this FK selected in the fact table’s output?

Selected as a column → create the role-playing dim for consistency; a BI user can join to it and will benefit from prefixed column names
Used only as a join predicate, never selected → an alias on dim_date is acceptable; no BI user will reach it anyway

The practical test: look at your fact table’s SELECT list. Every date FK you are exposing to a BI tool deserves a named, prefixed view. The ones that exist only inside a JOIN condition and never surface as columns are fair game for the alias approach.

The upgrade from alias CTEs to role-playing views is not a rewrite. It is extracting the column aliasing you already wrote into a dedicated model and pointing your fact table at that instead.

Takeaways

Every date FK exposed to a BI tool deserves a named, prefixed view. The ones living only inside a JOIN condition and never surfacing as columns are fair game for the alias approach.
Put role-playing dims in gold/<domain>/ until two domains need them, then move to gold/shared/. The folder communicates scope without extra documentation.
Add attributes to dim_date, not to role-playing views. A new column in dim_date requires a one-line alias addition to each role view — but that is one place per role, not a grep across every fact file. The naming contract lives once, in the physical dimension.
The alias CTE approach is a reasonable starting point. Role-playing views are the version that holds up at scale. The gap between them is smaller than it looks — you are extracting an alias you already wrote into a dedicated model.

Have you hit this in your own project? I’d be curious whether you landed on role-playing views from the start or refactored into them, and what finally forced the conversation.