
The Most Important File in Our dbt Repo Was a Markdown File

What goes in CLAUDE.md when you're using Claude Code on a financial dbt codebase — and why hooks are a different kind of guardrail.

Arturo Cárdenas
Founder & Chief Data Analytics & AI Officer
March 19, 2026 · 13 min read

Key Takeaway

There is a real problem with using AI coding assistants on financial systems: they don't know what they're not supposed to touch. We solved it with CLAUDE.md — three categories of content that made Claude Code trustworthy on a billing system with real revenue data.

There is a real problem with using AI coding assistants on financial systems: they don't know what they're not supposed to touch.

We spent nine days using Claude Code to rebuild FY27 pricing models on a Snowflake-native dbt codebase. Sixty-nine models. Thirty-one macros. Real revenue data. The AI could write every line of code we asked for. That's exactly why we had to be careful about what we asked.

The fix lived in a file called CLAUDE.md.


The problem starts before the first session

The codebase Claude Code walked into had a five-year history, and most of that history was invisible.

Pricing logic had been hardcoded in Jinja macros since 2021 — CASE/WHEN statements with literal rate values baked directly into the code. Every pricing change was a deployment. When rates changed without a deployment — which happened — there was no record of it in git. During validation, we found a 118% undocumented price increase that existed only at the database level with no code change.

This is what Claude Code walks into without a CLAUDE.md: a codebase where the most important business logic is either undocumented or gone. The AI will generate syntactically valid dbt that reflects how pricing systems are supposed to work, not how this one actually works.


A cloud security company needed updated pricing models before fiscal year start. Nine days. The data team came in skeptical — not of dbt, but of AI-assisted development where a wrong number means a wrong invoice. That skepticism was appropriate.

We brought Claude Code in from day one as a genuine pair programmer with full repo context. One thing we hadn't accounted for: the codebase ran on Snowflake's native dbt executor, not dbt Cloud. Correlated subqueries that work in standard dbt fail at runtime in Snowflake native. Without that written somewhere Claude could read at session start, it generated patterns that looked correct and broke when tested.

First lesson: the AI doesn't know your environment. You have to write it down.


By day three, CLAUDE.md had become more important than any model file.

Claude Code reads it at the start of every session. It's not documentation — documentation is for humans. CLAUDE.md is instructions for the AI: what rules apply here, what it must never do, what the domain context means. When you don't write these down, the AI falls back on general programming knowledge. On a standard web app, that's fine. On a billing system with 11 regional multipliers, a fiscal calendar that doesn't match ISO quarters, and a pricing tier structure that maps to contractual commitments rather than usage bands, general programming knowledge is not enough.

There's also a context degradation problem. Above roughly 60% of available context, Claude starts losing coherence — forgetting decisions it made earlier in the session. Fresh sessions with structured file handoff solved this. That system is described in our post on the 9-day sprint.

Here's what actually goes into CLAUDE.md.


Three categories of content belong there.

[Figure: CLAUDE.md three-layer architecture — Architecture Rules (layer 1, slate), Hard Boundaries (layer 2, amber), Domain Context (layer 3, teal), with the Hooks band at the bottom as OS-level enforcement]

Architecture rules. Where code lives and how the AI should approach the project.

All staging models live in models/staging/ — prefix: stg_
Intermediate joins belong in models/intermediate/ — prefix: int_
Final marts go in models/marts/ — prefix: fct_ or dim_
Never create a mart that sources directly from raw.
Never run `dbt run` without --select. Full-refresh on prod is catastrophic.

The --select rule was a hard rule, not a soft preference — a full refresh in production Snowflake can take hours and cost accordingly. Claude follows explicit rules reliably but doesn't automatically know your warehouse billing scales with compute time.

Hard limits. What the AI must never do. We called this layer "Destructive Operation Blockers + Financial Logic Ownership."

Never run `dbt run` without --select.
Never use FLOAT or DOUBLE for financial columns. Always NUMBER(38,9).
Never edit fiscal_quarter macros without detailed comments + human approval.
Never use correlated subqueries — they fail at runtime in Snowflake native.
For any model touching revenue, billing, or pricing:
  pause and request human confirmation before writing the final mart.
The legacy BI repo is read-only. May grep/glob, never edit.

The NUMBER-not-FLOAT rule deserves unpacking. Floating-point errors in revenue models are not a hypothetical risk — they're a bad quarterly close and an audit you didn't plan for. Claude Code, operating on general programming knowledge, will reach for FLOAT. It's fine for most things. On a billing system, it isn't, and the error is silent until Finance notices a rounding problem in their reporting.
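The failure mode is easy to reproduce outside the warehouse. A minimal Python sketch of the same drift — binary floats cannot represent most decimal amounts exactly, while fixed-point decimals (the behavior NUMBER(38,9) gives you) can:

```python
from decimal import Decimal

# Ten $0.10 charges, summed as binary floats vs. fixed-point decimals
float_total = sum([0.1] * 10)
decimal_total = sum([Decimal("0.10")] * 10, Decimal("0"))

print(float_total == 1.0)                # False: accumulated binary rounding error
print(decimal_total == Decimal("1.00"))  # True: exact
```

Python's Decimal stands in for Snowflake's NUMBER here; the point is the arithmetic model, not the type itself.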

The pause-and-confirm rule for financial models was the most important thing in the file. Claude would generate the model, propose the joins, work out the grain — then stop before writing the final output layer and ask for human confirmation. Small friction. Significant audit trail. Exactly the kind of thing that's obvious in hindsight and absent by default.

We also documented the migration workflow: discover (read-only), propose model plan, implement under human review, validate with evidence. That last item was earned — early on, Claude would report reconciliations as passing before we'd verified them against the full date range. Once we required a reconciliation query with results attached, the behavior changed immediately.
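The "validate with evidence" step amounts to comparing totals between the legacy system and the rebuilt mart, grain by grain. A sketch of that check in Python, with hypothetical quarters and totals (the real comparison ran as SQL against both systems):

```python
from decimal import Decimal

# Hypothetical totals by fiscal quarter: legacy system vs. rebuilt mart
legacy  = {"FY26-Q1": Decimal("1200000.00"), "FY26-Q2": Decimal("1310000.00")}
rebuilt = {"FY26-Q1": Decimal("1200000.00"), "FY26-Q2": Decimal("1309998.75")}

def reconcile(a, b, tol=Decimal("0.01")):
    """Return the quarters whose totals differ by more than the tolerance."""
    zero = Decimal("0")
    return sorted(q for q in a.keys() | b.keys()
                  if abs(a.get(q, zero) - b.get(q, zero)) > tol)

print(reconcile(legacy, rebuilt))  # ['FY26-Q2']
```

Attaching output like this — not a claim of "reconciliation passed" — is what we required before accepting a model.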

Domain context. What the data actually represents.

pricing_tier maps to contractual commitments, not usage bands.
fiscal_quarter uses a 4-4-5 retail calendar, and fiscal quarters lag calendar quarters by one month by design. Do not assume ISO quarters.
client_id in billing tables is not the same as account_id in CRM.
Regional multipliers apply to one product line only.
The other uses a global flat rate. Do not apply regional multipliers to it.

Each of these caught real problems. The fiscal quarter lag is intentional — without it documented, Claude builds a model that looks correct but produces the "wrong" number from Finance's perspective. The client_id/account_id distinction prevents silent duplicate rows on billing-CRM JOINs. The regional multiplier rule stopped Claude from applying regional pricing to the flat-rate product line (which uses a global flat rate) — an error consistent enough to look intentional.
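A 4-4-5 calendar splits each 13-week quarter into months of 4, 4, and 5 weeks. A minimal sketch of the week-to-month mapping — illustrative only; the client's actual fiscal macros also encoded year-start and leap-week rules not reproduced here:

```python
def fiscal_month(week: int) -> int:
    """Map a fiscal week (1-52) to a fiscal month under a 4-4-5 calendar."""
    quarter, w = divmod(week - 1, 13)                        # 13-week quarters
    month_in_quarter = 0 if w < 4 else (1 if w < 8 else 2)   # 4 + 4 + 5 weeks
    return quarter * 3 + month_in_quarter + 1

print(fiscal_month(4))   # 1: last week of the first 4-week month
print(fiscal_month(13))  # 3: last week of the 5-week month closing Q1
print(fiscal_month(14))  # 4: first week of Q2
```

None of this is guessable from a date column. Without it written down, an AI will reach for DATE_TRUNC('quarter', ...) and be confidently wrong.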

A new data engineer spends two weeks learning this. The AI reconstructs it from model names and column comments, and sometimes reconstructs it wrong.


The Snowflake constraint catalog

Standard CLAUDE.md templates cover architecture, limits, and domain context. When you're running dbt natively in Snowflake — not dbt Cloud, not standalone dbt Core — you need a fourth category: environment-specific constraints your AI will hit without knowing they exist.

These are the four we documented after the first sprint week.

Correlated subqueries fail at runtime. Claude generates them naturally when building lookup logic. They compile, they look correct, and they fail when Snowflake executes them. The fix is JOIN + QUALIFY:

-- What Claude generates by default (fails at runtime in Snowflake native)
SELECT *, (SELECT rate FROM rates WHERE rates.date <= t.date ORDER BY date DESC LIMIT 1) as rate
FROM transactions t

-- What actually works
SELECT t.*, r.rate
FROM transactions t
LEFT JOIN rates r ON r.date <= t.date
QUALIFY ROW_NUMBER() OVER (PARTITION BY t.id ORDER BY r.date DESC) = 1

Once this was in CLAUDE.md, Claude stopped generating the broken pattern entirely.

Macros in PARTITION BY fail silently. When Claude calls a macro inside a window function's PARTITION BY clause, dbt compiles it without error. The query runs. The results are wrong. We discovered this when Q1–Q3 FY26 data disappeared from quarterly ARR — the fiscal quarter calculation was a macro call inside PARTITION BY, and it was producing incorrect partitioning. The fix is to inline the logic rather than calling the macro. This is now documented as a hard rule: no macro calls inside window function PARTITION BY or ORDER BY clauses.

Seed type inference is wrong. Account ID "00123" becomes integer 123. Every seed file needs explicit column types in seeds.yml — without them, leading zeros vanish and decimal precision is lost silently.
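The fix is explicit types in the seed's properties file. A sketch, assuming a hypothetical seed named rate_card — the column names and types here are illustrative:

```yaml
version: 2

seeds:
  - name: rate_card               # hypothetical seed name
    config:
      column_types:
        account_id: varchar(16)   # preserves leading zeros like "00123"
        unit_rate: number(38, 9)  # keeps decimal precision intact
```

With column_types declared, dbt skips type inference for those columns entirely.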

No local development. The workflow is write → commit → push → test in Snowflake UI. Claude will suggest running dbt run locally. On Snowflake native, that doesn't work. Documenting this prevented suggestions that would eat session time.


CLAUDE.md sets the rules. Hooks enforce them.

[Figure: Instructions degrade vs. hooks don't — two horizontal tracks showing CLAUDE.md instructions fading past 60% context while the hooks line stays solid all the way across]

Pre-commit hooks run before any commit lands in the repo. Pre-tool-use hooks (Claude Code-specific, configured in .claude/settings.json) fire before Claude executes a tool call. We built two types.

Pre-commit hooks that catch destructive SQL before it can land in the repo:

# Block DROP and TRUNCATE in model files
if grep -rEi "^[[:space:]]*(DROP|TRUNCATE)[[:space:]]" models/; then
  echo "ERROR: Destructive SQL found in models/"
  exit 1
fi

Pre-tool-use hooks that intercept Claude before it edits protected paths:

# Prevent edits to macros/fiscal/ without explicit confirmation
import json, sys
event = json.load(sys.stdin)  # Claude Code passes the pending tool call as JSON on stdin
if "macros/fiscal/" in event.get("tool_input", {}).get("file_path", ""):
    print("Protected path: macros/fiscal/ requires manual review", file=sys.stderr)
    sys.exit(2)  # exit code 2 blocks the tool call and surfaces stderr to Claude

The fiscal macro protection was not negotiable — those macros encoded multi-year business logic that predated the current team. Claude could read and reference them but not edit them without a human unlocking the path.
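For completeness, pre-tool-use hooks are wired up in .claude/settings.json. A sketch of the registration, assuming the guard script above lives at a hypothetical path .claude/hooks/protect_fiscal.py:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "python .claude/hooks/protect_fiscal.py" }
        ]
      }
    ]
  }
}
```

The matcher scopes the hook to file-modifying tools, so reads and greps against the protected path pass through untouched.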

The key distinction: CLAUDE.md instructions can degrade past 60% of context. Hooks run at the OS level and don't forget. We never had to use the manual unlock.


Where does dbt MCP fit?

dbt released an MCP server (April 2025) that gives AI assistants tool access to a live dbt project — query the manifest, run dbt commands, execute SQL against your warehouse. It's useful. But it operates at a different layer of the problem.

Here's how the layers map:

[Figure: dbt MCP layer model — four horizontal bands showing dbt MCP (tool access), Agent Skills (workflow), CLAUDE.md (domain context), and Hooks (OS-level enforcement); each layer solves a different problem]

  • dbt MCP: tool access — Claude can query lineage, run dbt test, execute SQL against your warehouse at runtime
  • Agent Skills (dbt's markdown skill files): workflow encoding — how to use dbt commands in sequence
  • CLAUDE.md: domain context + rules — fiscal calendars, non-standard metric definitions, engagement-specific constraints, anything that isn't in the manifest — persistent across sessions
  • Hooks: enforcement — rules that must hold even when context is full

dbt Labs has explicitly stated that providing "user and domain specific knowledge" is a future roadmap item for MCP, not a current capability. The MCP can tell Claude what models exist and how they connect. It cannot tell Claude that client_id and account_id are different keys that produce silent duplicates on JOIN, or that fiscal quarters lag by one month by design, or that one product line uses a flat global rate while another uses regional multipliers.

The MCP and CLAUDE.md aren't alternatives. The MCP handles tool access; CLAUDE.md handles meaning. You need both when the project has business rules that live outside the manifest.


From engagement-specific to team-reusable

At the end of the engagement, we distilled the setup into three versions: the engagement-specific CLAUDE.md with all domain knowledge, a generalized Clarivant playbook reusable across engagements, and a simplified starter kit the data team could adopt independently — clone, follow five steps, start a session.

The value of building this carefully: it survives the engagement. The full handoff approach is documented in AI Guardrails That Outlast the Consultant.


Frequently asked questions

Does Claude Code actually understand dbt, or does it just write SQL?

It understands dbt patterns well — refs, sources, materializations, macros, the DAG model. What it doesn't know is your project's specific conventions, your data model's domain semantics, or environment-specific limitations like Snowflake native's correlated subquery behavior. CLAUDE.md closes that gap. Without it, you get syntactically valid dbt that's architecturally wrong — which is worse than a syntax error because it doesn't fail, it just drifts until someone notices.

How do you keep CLAUDE.md from going stale as the project evolves?

Treat it like dbt_project.yml — it changes when the project changes. We added one rule: any decision made in a session that should carry forward gets added to PATTERNS.md or CLAUDE.md before the session closes. Two minutes. The gap between what CLAUDE.md says and what the project actually does is how inconsistencies accumulate.

Isn't this just prompt engineering?

The CLAUDE.md content is, yes. The hook system is not. Hooks run at the OS level and fire before Claude can commit code or modify a protected file. Instructions get degraded by context limits. Hooks don't. You need both: instructions shape behavior you want, hooks prevent behavior you cannot afford to allow by accident. Later in a long session — past 60% of available context — Claude may lose coherence on rules it was following earlier. A hook that was true in session minute one is still true in session minute ninety.

What about dbt's MCP server — doesn't that make CLAUDE.md unnecessary?

They operate at different layers. The MCP gives Claude tool access to your live dbt project — lineage, manifest, the ability to run tests and execute SQL. CLAUDE.md carries domain context: the business rules, fiscal conventions, and project constraints that aren't in the manifest. dbt has acknowledged that domain context is a future roadmap item for MCP. Until then, CLAUDE.md is the bridge. You want both: the MCP for tool access, CLAUDE.md for meaning.

Should every dbt project have this setup?

No. If you're exploring a dataset, prototyping, or working alone on a non-production project, the overhead isn't justified. This setup pays off when three things are true at once: the models touch money, the codebase has domain logic that isn't obvious from the schema, and multiple sessions will build on each other over days. All three were true here. That's the setup this was built for.


The Sr. Director of Analytics went from skeptical to champion over the course of the engagement. Not because we told him AI was great. Because the constraints we put in place meant the AI was actually trustworthy on financial data.

We didn't build CLAUDE.md to limit what Claude could do. We built it so Claude could be trusted to do more.

Setting up AI-assisted workflows that teams can maintain is part of our AI Strategy practice — the guardrails are as important as the tooling.


If your team is using AI on financial data without guardrails, or avoiding it because there aren't any, we can help you set up the right boundaries. Start with a guardrails review.

Topics

claude code, CLAUDE.md, dbt, snowflake, hooks, guardrails, ai safety, financial data, analytics engineering, context degradation

Arturo Cárdenas

Founder & Chief Data Analytics & AI Officer

Arturo is a senior analytics and AI consultant helping mid-market companies cut through data chaos to unlock clarity, speed, and measurable ROI.

Ready to turn data into decisions?

Let's discuss how Clarivant can help you achieve measurable ROI in months.