Designing Multi-Tenant dbt Models Without Forking Everything
A thin-overlay pattern for governed transformation systems.
Details are generalized and sanitized to preserve confidentiality while keeping the engineering lesson accurate.
Multi-tenant analytical systems often start with a shared transformation layer.
At first, this works well. A single set of staging models, business transformations, and downstream marts can serve multiple teams or tenants. The logic is centralized, reviewable, and easier to maintain.
The problem appears when tenants start to diverge.
One tenant needs a custom field. Another needs a slightly different join condition. A partner team wants to extend a downstream model without changing the shared logic for everyone else. These differences may be small individually, but over time they create architectural pressure.
Three Tempting Options
Each of the common paths solves part of the problem while creating a longer-term cost.
- Fork the full dbt project. This gives maximum flexibility, but duplicated SQL leads to inconsistent fixes and model drift.
- Put every condition into the shared model. This keeps one codebase, but shared logic becomes hard to read and test.
- Copy selected models into tenant projects. This is easy in the short term, but ownership, lineage, and dependency boundaries become unclear.
A thin-overlay pattern offers a middle ground.
The goal is not to eliminate tenant-specific customization. The goal is to make customization explicit, reviewable, and maintainable.
The Core Idea
A thin-overlay dbt architecture separates shared transformation logic from tenant-specific extensions.
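The shared base layer owns canonical logic: common staging, reusable business transformations, and governed shared models. Tenant overlay projects depend on that base layer and override or extend only the models that truly differ. In dbt, one way to express this dependency is to install the base project as a package in each overlay project, so overlay models can reference base models instead of copying them.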
A simplified version looks like this:
Shared Base Layer
- shared staging models
- shared business logic
- governed common marts
Tenant Overlay Layer
- selected overrides
- tenant-specific extensions
- downstream custom models
Platform Guardrails
- naming conventions
- schema routing
- tests and validation
- dependency documentation
- review standards
This design keeps common logic centralized while still giving downstream teams a safe place to customize what is genuinely tenant-specific.
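The layering above can be sketched as a small resolution step: start from the shared base models and let an overlay replace or add only the models it names. This is a minimal illustration in Python, not dbt's own resolution logic. The `Model` type, the `resolve_models` function, and all model names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    layer: str  # "base", or the name of the tenant that owns the model
    sql: str


def resolve_models(base: list[Model], overlay: list[Model]) -> dict[str, Model]:
    """Return the effective model set for one tenant.

    An overlay model whose name matches a base model is an override.
    An overlay model with a new name is an extension.
    Every other base model is used unchanged.
    """
    resolved = {m.name: m for m in base}
    for m in overlay:
        resolved[m.name] = m
    return resolved


# Hypothetical shared base layer.
base = [
    Model("stg_orders", "base", "select * from raw.orders"),
    Model("fct_orders", "base", "select * from {{ ref('stg_orders') }}"),
]

# Hypothetical thin overlay for one tenant.
tenant_a = [
    # Override: same name as a base model, different logic.
    Model("fct_orders", "tenant_a",
          "select *, custom_field from {{ ref('stg_orders') }}"),
    # Extension: a downstream model only this tenant has.
    Model("rpt_orders_tenant_a", "tenant_a",
          "select * from {{ ref('fct_orders') }}"),
]

effective = resolve_models(base, tenant_a)
for name, model in sorted(effective.items()):
    print(f"{name}: owned by {model.layer}")
```

The point of the sketch is the shape of the overlay: two entries, not a copy of the base list. Anything the tenant does not mention keeps flowing from the shared layer.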
Why Full Forks Do Not Scale
A full fork feels simple at first. A tenant needs custom logic, so the team copies the project and changes what it needs.
But full forks create hidden maintenance cost. When shared logic changes, every fork needs to receive the fix. When a bug is discovered in a common model, platform teams must reason about which copies have the same bug, which copies have drifted, and which teams now own the variation.
The more tenants the platform serves, the worse this gets: each new fork is one more copy to patch, compare, and audit.
Forking solves today's customization problem by creating tomorrow's governance problem.
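The question of which copies have drifted can at least be made visible. The sketch below compares each forked model with the shared version by hashing whitespace-normalized SQL. It assumes the SQL text of each project is already loaded into a dictionary keyed by model name. The function names and sample models are hypothetical, and the normalization is deliberately crude.

```python
import hashlib


def fingerprint(sql: str) -> str:
    """Hash SQL after collapsing whitespace, so pure reformatting is not reported as drift."""
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def drifted_models(base: dict[str, str], fork: dict[str, str]) -> list[str]:
    """Names of models present in both projects whose SQL no longer matches."""
    return sorted(
        name
        for name in base.keys() & fork.keys()
        if fingerprint(base[name]) != fingerprint(fork[name])
    )


# Hypothetical model SQL, keyed by model name.
base_sql = {
    "stg_orders": "select * from raw.orders",
    "fct_orders": "select order_id, amount from stg_orders",
}
fork_sql = {
    # Reformatted only: not drift.
    "stg_orders": "select *\n  from raw.orders",
    # Logic changed: drift.
    "fct_orders": "select order_id, amount * 1.1 as amount from stg_orders",
}

print(drifted_models(base_sql, fork_sql))
```

A report like this shows where copies differ, but not whether a difference is intentional. That missing intent is the governance problem a fork leaves behind.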
Why Putting Everything Into the Base Layer Also Fails
The opposite extreme is to keep every variation inside the shared base model.
This avoids duplication, but it creates another failure mode: the shared model becomes a maze of tenant-specific conditions. Over time, the base layer stops being a clean representation of common business logic. It becomes a place where every exception lives.
A shared base layer should represent what is truly shared.
Tenant-specific logic should be visible as tenant-specific logic.
The Thin-Overlay Pattern
A thin overlay preserves the base layer while allowing targeted extension.
A tenant overlay project should be small. It should not copy the entire transformation graph. Instead, it should do a few specific things:
- Reference shared base models.
- Override selected models only when necessary.
- Add tenant-specific downstream models.
- Use clear schema or naming conventions.
- Document why an override exists.
- Keep tests close to customized logic.
The overlay is not a second platform. It is a controlled extension point.
That distinction matters. If the overlay becomes a full replacement, the architecture has quietly turned back into a fork.
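One way to notice an overlay turning into a fork is to measure how much of the base it overrides. The sketch below assumes a hypothetical threshold, `max_override_ratio`, and hypothetical model names. The right limit is a platform decision, not something dbt prescribes.

```python
def override_ratio(base_models: set[str], overlay_models: set[str]) -> float:
    """Fraction of base models that the overlay replaces by name."""
    if not base_models:
        return 0.0
    return len(base_models & overlay_models) / len(base_models)


def check_overlay_is_thin(
    base_models: set[str],
    overlay_models: set[str],
    max_override_ratio: float = 0.25,  # assumed limit, chosen for illustration
) -> tuple[bool, list[str]]:
    """Return (ok, overridden); ok is False when too much of the base is overridden."""
    overridden = sorted(base_models & overlay_models)
    ok = override_ratio(base_models, overlay_models) <= max_override_ratio
    return ok, overridden


base_models = {"stg_orders", "stg_customers", "int_orders", "fct_orders", "dim_customers"}

# One override plus one tenant-only extension.
thin_overlay = {"fct_orders", "rpt_orders_tenant_a"}

# Most of the base replaced: effectively a fork.
fat_overlay = {"stg_orders", "int_orders", "fct_orders", "dim_customers"}

print(check_overlay_is_thin(base_models, thin_overlay))
print(check_overlay_is_thin(base_models, fat_overlay))
```

Extensions do not count against the ratio, since adding downstream models is the intended use of an overlay. Only replacements of shared models do.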
Metadata-Aware Modeling
Multi-tenant dbt design is not only about SQL reuse. It is also about metadata.
A useful transformation system should make these questions easy to answer:
- Which models are shared?
- Which models are tenant-specific?
- Which tenant models depend on shared base logic?
- Which overrides are intentional?
- Which downstream outputs are safe for analytics users?
- Who owns the customized layer?
Without metadata clarity, even a clean dbt DAG can become operationally confusing.
This is where naming conventions, schema routing, documentation, and dependency visibility become part of the platform design. They are not just style preferences. They are how teams understand and operate the system over time.
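Several of the questions above can be answered mechanically if each model carries a small amount of metadata. The sketch below assumes a hypothetical record per model with `layer`, `owner`, `depends_on`, `overrides`, and `override_reason` fields. In a real project this information might live in model properties or be derived from dbt's artifacts. The field names here are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelMeta:
    name: str
    layer: str                       # "shared" or "tenant"
    owner: Optional[str] = None
    depends_on: list[str] = field(default_factory=list)
    overrides: Optional[str] = None  # shared model this one replaces, if any
    override_reason: Optional[str] = None


def audit(models: list[ModelMeta]) -> list[str]:
    """Return human-readable findings for metadata gaps."""
    shared = {m.name for m in models if m.layer == "shared"}
    findings = []
    for m in models:
        if m.owner is None:
            findings.append(f"{m.name}: missing owner")
        if m.overrides is not None:
            if m.overrides not in shared:
                findings.append(f"{m.name}: overrides unknown shared model {m.overrides}")
            if not m.override_reason:
                findings.append(f"{m.name}: override has no documented reason")
    return findings


def tenant_models_on_shared(models: list[ModelMeta]) -> list[str]:
    """Tenant models that depend directly on at least one shared model."""
    shared = {m.name for m in models if m.layer == "shared"}
    return [
        m.name
        for m in models
        if m.layer == "tenant" and shared & set(m.depends_on)
    ]


# Hypothetical project metadata.
models = [
    ModelMeta("stg_orders", "shared", owner="platform"),
    ModelMeta("fct_orders", "shared", owner="platform", depends_on=["stg_orders"]),
    ModelMeta("fct_orders_tenant_a", "tenant", owner="tenant_a",
              depends_on=["stg_orders"], overrides="fct_orders"),
    ModelMeta("rpt_orders_tenant_a", "tenant",
              depends_on=["fct_orders_tenant_a"]),
]

for finding in audit(models):
    print(finding)
print(tenant_models_on_shared(models))
```

A check like this could run in continuous integration, so an undocumented override or an unowned model is caught at review time rather than discovered during an incident.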
Governance Without Blocking Customization
A governed transformation system should not mean every change goes through the platform team.
That does not scale.
The better goal is controlled self-service: downstream teams should be able to extend models safely, while the platform team preserves the shared base layer and review standards.
In practice, the platform should provide:
- A clear base/overlay structure.
- Examples of acceptable overrides.
- A guide for adding tenant-specific models.
- Testing expectations.
- Review rules for changes that affect shared logic.
- Documentation that explains ownership boundaries.
This is where architecture and enablement meet.
A pattern is only useful if other teams can understand and adopt it.
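The review rule for shared logic can be made concrete as a path-based check: changes under the shared base need platform review, while changes confined to one tenant's overlay can be approved by that tenant's owners. The directory layout (`base/` and `overlays/<tenant>/`) and the reviewer group names are assumptions for illustration.

```python
from pathlib import PurePosixPath


def required_reviewers(changed_paths: list[str]) -> set[str]:
    """Map changed file paths to the reviewer groups that must approve them."""
    reviewers = set()
    for raw in changed_paths:
        parts = PurePosixPath(raw).parts
        if parts and parts[0] == "base":
            # Shared logic always goes through the platform team.
            reviewers.add("platform-team")
        elif len(parts) >= 3 and parts[0] == "overlays":
            # overlays/<tenant>/...: the tenant reviews its own overlay.
            reviewers.add(f"{parts[1]}-owners")
        else:
            # Anything outside the known layout is treated conservatively.
            reviewers.add("platform-team")
    return reviewers


print(required_reviewers(["overlays/tenant_a/models/rpt_orders_tenant_a.sql"]))
print(required_reviewers([
    "base/models/fct_orders.sql",
    "overlays/tenant_a/models/rpt_orders_tenant_a.sql",
]))
```

Under this rule a tenant-only change never waits on the platform team, which is the self-service half of the bargain. A change that touches the base always does, which is the governance half.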
A Practical Design Checklist
When designing a multi-tenant dbt system, these questions are useful:
Shared Logic
- What logic is truly common across tenants?
- Which models should never be copied into tenant projects?
- How will shared fixes propagate?
Tenant Customization
- What kinds of differences are allowed in overlays?
- Should tenants override existing models or only add downstream extensions?
- How are tenant-specific models named and routed?
Governance and Operations
- How are overrides documented?
- Which changes require platform review?
- Can downstream users tell whether a dataset is shared or tenant-specific?
- Can the platform team patch shared logic without chasing uncontrolled forks?
These questions help keep the architecture honest.
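The naming and routing question is one of the easier ones to encode. The sketch below shows one possible rule in Python: shared models go to a single shared schema, and tenant models go to a per-tenant schema and carry the tenant as a name suffix. The schema names and the suffix convention are assumptions. In dbt itself, schema routing is commonly customized by overriding the `generate_schema_name` macro, and this Python only illustrates the rule such a macro might apply.

```python
from typing import Optional

SHARED_SCHEMA = "analytics_shared"  # hypothetical schema name


def route_schema(layer: str, tenant: Optional[str] = None) -> str:
    """Map a model's layer to the schema it should be built in."""
    if layer == "shared":
        return SHARED_SCHEMA
    if layer == "tenant":
        if not tenant:
            raise ValueError("tenant models must declare a tenant")
        return f"analytics_{tenant}"
    raise ValueError(f"unknown layer: {layer!r}")


def is_tenant_name_valid(model_name: str, tenant: str) -> bool:
    """Tenant models carry the tenant as a suffix so they are recognizable downstream."""
    return model_name.endswith(f"_{tenant}")


print(route_schema("shared"))
print(route_schema("tenant", "tenant_a"))
print(is_tenant_name_valid("rpt_orders_tenant_a", "tenant_a"))
print(is_tenant_name_valid("rpt_orders", "tenant_a"))
```

With a rule like this, a downstream user can tell from the schema and the model name alone whether a dataset is shared or tenant-specific.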
The Lesson
The broader lesson is that data platform architecture should allow reuse and customization to coexist.
A well-designed transformation system is not only technically correct. It should also be teachable, reviewable, and safe for other teams to extend.
For multi-tenant dbt projects, the goal is not to avoid customization. The goal is to avoid uncontrolled customization.
A thin-overlay pattern gives platform teams a practical way to preserve shared business logic, support tenant-specific needs, and reduce long-term model drift.
In that sense, multi-tenant dbt design is as much an enablement problem as it is a modeling problem.