
Running dbt Natively in Snowflake (Without dbt Cloud): What They Don't Tell You

A practitioner account of running a production dbt project in Snowflake's native runner — including the four constraints the vendor docs skip.

Arturo Cárdenas
Founder & Chief Data Analytics & AI Officer
March 20, 2026 · Updated March 20, 2026 · 4 min read

Key Takeaway

You can run dbt inside Snowflake without dbt Cloud. What the docs skip: correlated subqueries fail silently at runtime, macros inside PARTITION BY produce wrong results without errors, seed type inference corrupts IDs, and there is no local development. This post documents each constraint, the fix, and the one macro every Snowflake-native dbt project needs before its first model touches production.

Search "dbt without dbt Cloud Snowflake" and you'll find vendor docs, pricing comparisons, and a few Reddit threads that trail off mid-answer. What you won't find is a practitioner who ran a production dbt project this way and documented what actually broke.

This is that post.


The short answer

You can run dbt natively inside Snowflake without dbt Cloud and without a local dbt Core install. Snowflake's built-in dbt runner handles compilation and execution through its native git integration. For teams already on Snowflake with a preference to keep the stack simple, it's a reasonable choice.

Here's what the vendor docs skip: correlated subqueries fail silently at runtime, macros inside window functions produce wrong results without errors, seed type inference will corrupt your IDs, and you cannot run or debug anything locally. The workflow is slower. The safety net is thinner. And you need a production-safety macro from day one — not week two.


Why we chose it

We were brought in to modernize a cloud security company's analytics platform — migrating from legacy BI to a modern dbt + Snowflake + Sigma stack over five months. The data team had an existing Snowflake environment, Okta SSO, and no appetite for managing another SaaS vendor relationship. dbt Cloud was on the table. So was standalone dbt Core wired to a CI runner. We evaluated both and chose Snowflake's native dbt integration.

The reasoning: the authentication story was already complex (more on that in the Okta + Snowflake auth post). Adding dbt Cloud's OAuth layer on top of Okta SSO on top of Snowflake network policies was friction the team didn't need. The native runner kept secrets in one place, used the existing git integration, and let Snowflake's own scheduler handle deployments. On paper: elegant. In practice: a constraint catalog we had to discover and document ourselves.


The gotcha catalog

1. Correlated subqueries fail at runtime — with no compile error

This was the first production failure. Claude Code, operating on standard dbt patterns, generated temporal lookups as correlated subqueries. They compiled cleanly. They failed when Snowflake executed them.

-- Fails at runtime in Snowflake native dbt (no compile error)
SELECT *,
  (SELECT rate
   FROM pricing_rates
   WHERE effective_date <= t.billing_date
   ORDER BY effective_date DESC
   LIMIT 1) as rate
FROM billing t

The replacement is the JOIN + QUALIFY pattern, which is both Snowflake-idiomatic and faster:

-- Works in Snowflake native dbt — and performs better
SELECT t.*, r.rate
FROM billing t
LEFT JOIN pricing_rates r
  ON r.effective_date <= t.billing_date
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY t.billing_id
  ORDER BY r.effective_date DESC
) = 1

Once we documented this in PATTERNS.md and CLAUDE.md, the AI stopped generating the broken pattern entirely. Before we documented it: every temporal lookup was a runtime failure waiting for the next push.

2. Macros in PARTITION BY produce wrong results — sometimes silently

This one is worse than the correlated subquery problem, because the failure mode depends on the macro. Some macros inside PARTITION BY fail at compilation with a dbt error. Others compile cleanly, execute, and return wrong results with no error raised. Both are bad — but the silent case is harder to find.

We hit the silent case. When you call a dbt macro inside a window function's PARTITION BY or ORDER BY clause, the native runner can compile it to broken SQL that Snowflake executes without raising an error. The query runs. The results are wrong. We found this when quarterly ARR figures stopped matching the known reference — the root cause was a date-conversion macro being called inside PARTITION BY and producing incorrect partitioning.

-- DO NOT do this in Snowflake native dbt (wrong results, no error)
SELECT
  account_id,
  SUM(arr) OVER (
    PARTITION BY {{ fiscal_quarter('billing_date') }}
  ) as quarterly_arr
FROM arr_base

-- Inline the logic instead
SELECT
  account_id,
  SUM(arr) OVER (
    PARTITION BY
      YEAR(billing_date),
      CEIL(MONTH(billing_date) / 3.0)
  ) as quarterly_arr
FROM arr_base

The rule we added: no macro calls inside PARTITION BY or ORDER BY clauses. Inline the logic every time.
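When the logic is too complex to inline at every call site, one alternative is to materialize the macro's output as a plain column in an upstream CTE and partition by that column. This is a sketch of that workaround, not the author's documented pattern: it assumes the macro compiles correctly outside window clauses, which is consistent with this post reporting the bug only inside PARTITION BY and ORDER BY.

```sql
-- Sketch: compute the macro result as a regular column first,
-- then partition by the plain column (no macro in the window clause)
WITH base AS (
  SELECT
    account_id,
    arr,
    {{ fiscal_quarter('billing_date') }} AS fiscal_qtr
  FROM arr_base
)
SELECT
  account_id,
  SUM(arr) OVER (PARTITION BY fiscal_qtr) AS quarterly_arr
FROM base
```

The window clause now references only a concrete column, so whatever the runner does with macro expansion cannot affect the partitioning.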

3. Seed type inference will corrupt your IDs

Snowflake's native dbt infers column types from seed CSV values, and it gets them wrong in predictable ways. Account ID 00123 becomes integer 123. Decimal precision is silently truncated. Date strings parse with incorrect formats.

The fix is explicit column type declarations in seeds.yml — but you won't know you need this until you're debugging why account lookups fail for anything with a leading zero:

# seeds.yml — declare every non-obvious type explicitly
version: 2

seeds:
  - name: pricing_rates
    config:
      column_types:
        account_id: varchar(20)
        region: varchar(50)
        rate_adjustment: number(5,4)
        effective_date: date
        product_tier: varchar(100)

We added this after discovering that 12 regional accounts with leading-zero IDs were silently failing lookups. The data was there. The joins were producing nulls. Explicit types in seeds.yml fixed it entirely.
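A guardrail worth adding alongside the explicit types is dbt's built-in relationships test, which turns this class of silent null-join into a build failure. The parent model name below (stg_accounts) is a hypothetical stand-in for whatever model holds your canonical account IDs:

```yaml
# Hypothetical schema.yml entry: fail the build when seed keys
# stop matching the parent model (e.g. after type-inference corruption)
version: 2

seeds:
  - name: pricing_rates
    columns:
      - name: account_id
        tests:
          - relationships:
              to: ref('stg_accounts')
              field: account_id
```

With this in place, a corrupted join key surfaces as a failing test on the next run instead of as wrong numbers in the BI layer.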

4. No local development — and what that actually means

With Snowflake native dbt, there is no dbt run on your laptop. The cycle is: write code → commit → push → wait for Snowflake to pick it up → test in the UI → report results → start the next session. You cannot run dbt debug, dbt compile --select, or dbt test locally against your actual warehouse.

This matters more than it sounds. Debugging a model requires a full push cycle. A quick "does this compile?" check requires committing. The total iteration time per change is measured in minutes, not seconds.

The upside — and it's real — is that you can't commit throwaway code. Every test is a commit. The repo history is clean. "Works on my machine" is structurally impossible. On a multi-developer team where three people share a dev environment, that discipline has value.

What it means in practice: test logic in chunks, not line by line. Write complete model files before pushing. Batch your debugging cycles. And set +persist_docs in dbt_project.yml — since you don't have a dbt Cloud docs site, column and model descriptions written to Snowflake's native metadata are the next best thing:

# dbt_project.yml
models:
  your_project:
    +persist_docs:
      relation: true    # Model descriptions appear in Snowflake UI
      columns: true     # Column descriptions appear in Snowflake UI

Sigma and other BI tools that connect to Snowflake inherit these descriptions automatically. It's not a full docs site. It's enough.
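To spot-check that the descriptions actually persisted, you can query Snowflake's own metadata. The schema name below is a placeholder for your deployment target:

```sql
-- Sketch: list columns whose dbt descriptions landed in Snowflake metadata
SELECT table_name, column_name, comment
FROM information_schema.columns
WHERE table_schema = 'ANALYTICS'   -- placeholder schema name
  AND comment IS NOT NULL
ORDER BY table_name, column_name;
```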

[Figure: Snowflake-native dbt constraints map — two columns showing what works (JOIN + QUALIFY, inlined window logic, explicit seed types, validate_dev_target) vs. what breaks (correlated subqueries, macros in PARTITION BY, seed type inference, no local dev)]

[Figure: Development cycle comparison — standard dbt local loop (seconds) vs. Snowflake-native commit-push-test cycle (minutes per iteration)]


The production safety macro

The single most important thing you can add to a Snowflake-native dbt project is a macro that prevents a developer from accidentally running against production.

Without local development, the only environment is the Snowflake environment. And Snowflake's native runner will execute against whichever target is configured. If someone tests a model change and their target is prod, it runs in production. No sandbox. No dry run. The model runs.

We solved this with a validate_dev_target macro called at the top of every session:

{% macro validate_dev_target() %}
  {#- List of valid development targets -#}
  {% set valid_dev_targets = ['dev_alice', 'dev_bob', 'dev_shared'] %}

  {#- Catch unknown dev targets -#}
  {% if target.name.startswith('dev_') and target.name not in valid_dev_targets %}
    {{ exceptions.raise_compiler_error(
      "Unknown target: '" ~ target.name ~ "'\n" ~
      "Valid targets: " ~ valid_dev_targets | join(', ') ~ "\n" ~
      "Did you forget to add it to profiles.yml?"
    ) }}
  {% endif %}

  {#- Block prod runs from outside CI/CD -#}
  {% if target.name == 'prod' %}
    {% set is_ci = env_var('CI', 'false') %}
    {% if is_ci != 'true' %}
      {{ exceptions.raise_compiler_error(
        "PROD target can only run in CI/CD!\n" ~
        "   Current target: " ~ target.name ~ "\n" ~
        "   Use a dev_* target for local development."
      ) }}
    {% endif %}
  {% endif %}

  {#- Log confirmation -#}
  {% do log("Target validation passed!", info=True) %}
  {% do log("Target: " ~ target.name, info=True) %}
  {% do log("Database: " ~ target.database, info=True) %}
  {% do log("Schema: " ~ target.schema, info=True) %}

{% endmacro %}

Two things this macro does: first, it catches unrecognized dev targets — the error message tells you exactly what's wrong and how to fix it. Second, it blocks any prod run unless the CI environment variable is set to 'true', which only happens in Snowflake's automated deployment pipeline.
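The post doesn't show the wiring, but one standard way to get the macro "called at the top of every session" is dbt's on-run-start hook. This is a sketch under the assumption that the hook fires in Snowflake's native runner the same way it does in dbt Core:

```yaml
# dbt_project.yml — run the guard before every dbt invocation
# (on-run-start is standard dbt; calling the macro here is an assumption,
# your project may invoke it per-model instead)
on-run-start:
  - "{{ validate_dev_target() }}"
```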

This runs at compilation time. If someone's target is misconfigured, the session fails before it touches data. Across the five-month engagement, we had zero accidental production writes. That's not luck — it's a macro.


What Snowflake-specific patterns actually help

Once you accept the constraints, the Snowflake-specific SQL features become useful. Three that earned their place:

QUALIFY for window function filtering — cleaner than wrapping everything in a subquery, and it's what makes the JOIN + QUALIFY pattern work:

-- One row per account, most recent billing period
SELECT account_id, billing_date, arr
FROM usage_data
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY account_id
  ORDER BY billing_date DESC
) = 1

TRY_CAST for defensive type conversions — returns NULL on failure instead of raising an error, which matters when you're migrating data from a legacy system with inconsistent types:

-- Safe numeric conversion — NULL on failure, not an error
TRY_CAST(legacy_account_id AS NUMBER) AS account_id_num

EQUAL_NULL for set comparisons — handles NULL = NULL correctly, which standard SQL does not. Essential when comparing rows across systems where nullable columns are part of the grain:

-- Handles NULL = NULL correctly in change detection
WHERE NOT EQUAL_NULL(source.discount_rate, target.discount_rate)
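For context, that fragment typically sits inside a row-comparison query. A minimal sketch, with hypothetical snapshot table names:

```sql
-- Sketch: rows whose nullable columns differ between systems;
-- EQUAL_NULL treats NULL = NULL as a match, so shared NULLs
-- do not show up as false-positive changes
SELECT s.account_id
FROM source_snapshot s
JOIN target_snapshot t
  ON t.account_id = s.account_id
WHERE NOT EQUAL_NULL(s.discount_rate, t.discount_rate)
```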

None of these are in the standard dbt tutorial. All three removed real bugs from the codebase.


What we'd do differently

Start the gotcha catalog on day one. We discovered each constraint during a push cycle, which meant we learned them the expensive way — write code, commit, push, fail in Snowflake UI, understand what broke, fix, repeat. By week two we had four hard rules in CLAUDE.md and PATTERNS.md. Those rules should have been there at project start.

The second thing: get seeds.yml column types right before you write any model that joins to a seed. The type inference corruption problem compounds. If your account IDs are integers when they should be strings, every model that joins on account ID is producing silently incorrect results. You won't see an error. You'll see wrong numbers in your BI tool, and you'll spend hours looking in the wrong place.

For the full picture on authentication constraints that shaped this workflow — Okta SSO, dynamic IP whitelisting, the permission dance with the infra team — see the companion post on enterprise security and developer velocity.

For how we used this environment to run a 9-day AI-assisted sprint with 265,638 validated rows, see AI-Assisted Analytics Engineering: How Claude Code Changed Our dbt Workflow.


[Figure: Snowflake-native dbt constraint map — four quadrants plotting each gotcha by severity and detectability: compile-time errors (gray), runtime failures (red), silent data corruption (amber), workflow constraints (slate)]


Frequently asked questions

Is Snowflake-native dbt the same as dbt Core?

No. Snowflake's built-in dbt runner uses dbt Core under the hood, but it executes inside Snowflake's managed environment. You don't install dbt locally, there's no profiles.yml on your machine, and you can't run dbt run from a terminal. The compilation happens in Snowflake's pipeline, triggered by git commits or scheduled jobs. The constraints documented here — correlated subquery failures, seed type inference, no local execution — are specific to this managed runtime, not to dbt Core in general.

Does dbt Cloud have the same constraints?

No. Most of these constraints are specific to Snowflake's native runner, not dbt Cloud. dbt Cloud supports local development, has its own docs site, and runs dbt outside of Snowflake's execution environment. If you have the option, dbt Cloud is the more flexible path. The reason to choose the native runner is tighter Snowflake integration, one fewer vendor to manage, and authentication that stays inside your existing Snowflake trust boundary.

How do you handle the RBAC complexity that comes with multi-developer Snowflake environments?

Each developer gets an isolated database — DBT_DEV__<NAME> — so development work never touches shared schemas. The validate_dev_target macro prevents accidental production runs at the compilation layer. Production deployments go through a service role with a separate grant structure. The full RBAC pattern — role hierarchies, warehouse grants, schema access for service roles — is its own topic covered in The Hidden Tax of Snowflake RBAC.

What happens to dbt docs if there's no dbt Cloud docs site?

+persist_docs writes model and column descriptions directly into Snowflake's native metadata. The Snowflake UI shows them in the schema browser. BI tools like Sigma inherit column descriptions automatically. It's not as good as dbt Cloud's docs site — you lose the lineage graph UI and the searchable docs portal — but for teams already deep in Snowflake, it covers most of the daily-use cases. The bigger loss is the inability to run dbt docs generate and dbt docs serve locally for interactive lineage exploration.

Should every Snowflake dbt project use the validate_dev_target macro?

Yes, if you have more than one developer and a production target. The risk is low-stakes on a solo project with a single environment. On any project with shared dev targets, a production deployment pipeline, and real data in production — which describes almost every commercial dbt project — a misconfigured target will eventually run against production. The macro takes 37 lines to write and costs nothing to call. The scenario it prevents costs much more than that to recover from.


The documentation for Snowflake-native dbt tells you how to set it up. It doesn't tell you what breaks, why it breaks silently, or what you need before the first model touches production. That's what this catalog is for.

The workflow is slower than standard dbt development. The constraints are real. So are the upsides: clean commit history, authentication that stays in one trust boundary, and a deployment pipeline that lives entirely inside your existing Snowflake investment. Whether that tradeoff makes sense depends on your team. If you make it, go in with the constraint catalog already written.

Once dbt is running natively in Snowflake, Cortex Analyst becomes the next question — we tested it on production financial data.


Considering running dbt natively in Snowflake? We can help you evaluate the tradeoffs before you hit the constraints. Book a quick consultation.

Topics

snowflake native dbt · dbt without dbt cloud · snowflake dbt core · dbt correlated subquery snowflake · snowflake dbt seed type inference · validate_dev_target macro · dbt PARTITION BY macro bug · snowflake dbt no local development