Our Chief Architect issued three commands. Twenty-five minutes later, an agent had built a complete dbt lineage from a 67-page technical spec.

Bronze, silver, and gold models, all tested, all documented, with a fully detailed pull request pushed to GitHub.

No hand-written code. No cleanup pass. On a live webinar with dbt Labs, in front of hundreds of data practitioners. [Watch the full webinar on demand →]

That webinar was not a marketing exercise. It was a real internal project: building a capacity planning and forecasting lineage so our Head of Practice could answer a straightforward but critical question before Monday's strategic planning session. Can we handle the incoming pipeline? Who do we need to hire? When?

The agent answered it. The code passed review. The PR got merged.

Here is what we have learned building this way for over a year, and why we are convinced that the gap between firms doing this and firms talking about it is about to become permanent.

Why "Paste and Clean Up" AI Workflows Fail for Data Engineering

Most data teams treat AI as a productivity hack. Paste some code into a chatbot, get a rough draft, clean it up manually. That is not agentic development. That is autocomplete with extra steps.

The failure mode is well documented. You prompt an agent without sufficient context and you get generic code that looks right but breaks against real data. No test coverage. No documentation. No alignment to the business question the code was supposed to answer. You spend more time debugging the output than you would have spent writing it yourself.

Why Structured Context Beats Prompting for AI-Generated dbt Code

The breakthrough was not better prompts. It was massive, structured context.

We decomposed every task our analytics engineers perform daily into discrete, automatable operations. Then we built agents and skills for each one. The result is a layered system our agents draw from every time they build:

  • Code context. Example code, existing repositories, reference architectures the agents can pattern-match against.
  • Data context. Raw source data, existing models, seed data. Our agents query the warehouse directly via the dbt MCP server to understand what they are building against before writing a single line of SQL.
  • Document context. Business use cases, business requirements documents (BRDs), 67-page technical specs, best practices guides, PR templates, and design guides. Every use case starts with three documents: what the business needs, how they define their metrics, and a precise technical implementation plan.
  • Tool context. The dbt MCP server for running dbt show, dbt build, and data queries. GitHub CLI for branch management and pull requests. Document fetchers for pulling specs into the agent's context window.
  • Instruction context. 65+ custom Claude Code skills encoding our standards for medallion architecture, test coverage, linting, documentation, naming conventions, and code review. Over 60 rules the agent loads before writing its first model, every single time.
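To make the instruction layer concrete, here is a minimal sketch of what one such skill might look like. Claude Code skills are markdown files with YAML frontmatter; the file name, rules, and thresholds below are illustrative, not our actual skill library.

```markdown
---
name: bronze-model-standards
description: Standards for building bronze-layer dbt models. Load before writing any bronze model.
---

# Bronze Model Standards

- One bronze model per raw source table, named `bronze_<source>__<table>`.
- Select columns explicitly; never commit a `select *`.
- Cast every column to an explicit type and rename to snake_case.
- Every model ships with a `schema.yml` entry: a description, column docs,
  and at minimum `not_null` and `unique` tests on the primary key.
- Run `dbt build --select <model>` and resolve failures before moving on.
```

The agent loads files like this before its first line of SQL, which is how the same standards show up in every PR regardless of which spec it is building against.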

This is the difference between an agent that produces throwaway code and an agent that produces production-grade infrastructure. The agent does not guess how to write a bronze model. It loads our dbt best practices, reads the technical spec, queries the raw data to understand column types and relationships, creates a to-do list, and builds exactly what the spec calls for. When tests fail, it queries the data to understand why, updates the test expectations, and reruns. When it finishes, it pushes a PR with detailed descriptions of every model, every test, and the business questions the lineage answers.
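As an illustration of the kind of output this produces, a bronze model built against a spec like ours might look something like the following. The source and column names here are invented for the example; the pattern (explicit columns, explicit casts, one model per raw table) is the point.

```sql
-- models/bronze/bronze_harvest__time_entries.sql
-- Hypothetical bronze model: explicit column selection and casting
-- from a raw time-tracking source, per the technical spec.
select
    cast(id as bigint)            as time_entry_id,
    cast(person_id as bigint)     as person_id,
    cast(project_id as bigint)    as project_id,
    cast(hours as numeric(10, 2)) as hours_logged,
    cast(spent_date as date)      as entry_date
from {{ source('harvest', 'time_entries') }}
```

A matching `schema.yml` entry would document each column and declare `unique` and `not_null` tests on `time_entry_id`, which is what gives the downstream silver and gold layers something trustworthy to build on.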

Our Chief Architect has not written code by hand in over a year. He architects, designs technical specs, reviews agent output, and orchestrates. The agent handles the typing. He handles the thinking.

The Live Demo: Building a Capacity Planning Lineage With Claude Code

dbt Labs invited us to present alongside their product team at their recent "Operationalize Analytics Agents" webinar. They showed their new AI surface area: MCP servers, structured context layers, and the analyst and developer agents coming to the dbt platform. We showed what happens when you take that infrastructure seriously and build a production workflow on top of it.

The demo was a real project. Our Head of Practice, Tom Clinton, needed to answer a capacity planning question: the team was at 100% utilization, one deal had just been signed, five more were about to close, and strategic planning was Monday. He needed to know who to hire, in what roles, and with how much lead time, by end of day.

Our Chief Architect, Dylan Cruise, opened Claude Code and issued three commands across three clean context windows (one per medallion layer to prevent cross-contamination):

  1. build bronze models pointing to the technical spec
  2. build silver models pointing to the technical spec
  3. build gold models pointing to the technical spec

Each time, the agent loaded its skills and the spec, queried the raw data via the dbt MCP server, created a to-do list, built the models with full documentation and tests, ran dbt build to validate, self-corrected any failures, and checked off each task. At the end, one more command created a branch, resolved merge conflicts, and pushed a PR with a detailed summary.
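Under the hood, that validate-and-ship loop comes down to a handful of standard dbt and GitHub CLI commands. The selectors, model names, and branch name below are placeholders; the commands themselves are stock `dbt` and `gh`.

```shell
# Build and test one medallion layer
dbt build --select path:models/bronze

# On a test failure, inspect the offending model's output directly
dbt show --select bronze_harvest__time_entries --limit 10

# Once green: branch, commit, and open a PR with a detailed summary
git checkout -b feature/capacity-planning-lineage
git add models/ && git commit -m "Add capacity planning lineage"
git push -u origin feature/capacity-planning-lineage
gh pr create --title "Capacity planning lineage" --body-file pr_summary.md
```

The agent runs this loop per layer, inside a clean context window, which is why failures in one layer never leak assumptions into the next.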

Total Runtime

Approximately 25 minutes for the full lineage. Bronze models, silver models, gold analysis models, all tested, all documented, all following our class-leading standards for medallion architecture.

Then we connected the dbt MCP server to Claude Desktop and let Tom query the newly built models directly. The result: a hiring plan showing which roles needed to be filled immediately and which could wait. Same-day answer to a question that would typically require a multi-day analysis cycle.
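Wiring the dbt MCP server into Claude Desktop is a config-file change. Here is a sketch of the relevant `claude_desktop_config.json` entry, assuming the open-source `dbt-mcp` server run via `uvx`; the exact environment variables depend on your setup (dbt Core vs. the dbt platform), so treat these values as placeholders and check the dbt MCP server documentation.

```json
{
  "mcpServers": {
    "dbt": {
      "command": "uvx",
      "args": ["dbt-mcp"],
      "env": {
        "DBT_PROJECT_DIR": "/path/to/capacity-planning-project",
        "DBT_PATH": "/usr/local/bin/dbt"
      }
    }
  }
}
```

Once connected, a non-technical stakeholder can ask questions in plain language and the agent resolves them against the governed models, not ad hoc SQL.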

The Three Specs Required Before an AI Agent Writes dbt Code

The part of our process that surprises people most has nothing to do with AI. It is the upfront documentation.

For every use case we build, whether internal or for a client, we produce three documents before the agent writes anything:

  • Business Use Case. Aligns us with stakeholders on exactly what we are building, the business context, the goals, the decisions the analysis will inform, and what is in and out of scope. By the end of delivery, they get exactly what they asked for because we agreed on it in writing first.
  • Business Requirements Document (BRD). Defines the specific metrics, their definitions, and how the business thinks about them. We build dbt models for exactly how the business measures its own performance, not how we think they should.
  • Technical Specification. Takes the BRD and translates it into a precise implementation plan. Bronze models, silver models, gold models, every column, every join, every test. In our webinar demo, this document was 67 pages long. That level of specificity is what separates an agent that builds production infrastructure from one that builds a rough draft.
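For a sense of the granularity involved, a single gold-model entry in a spec like this might read as follows. The model, columns, and logic are invented for illustration; the level of detail is what matters.

```markdown
### gold_capacity_forecast

Grain: one row per role per month. Joins silver_utilization to
silver_pipeline_deals on role and forecast month.

| Column          | Type    | Logic                                 | Tests    |
|-----------------|---------|---------------------------------------|----------|
| role            | varchar | from silver_utilization.role          | not_null |
| forecast_month  | date    | month of projected demand             | not_null |
| demand_hours    | numeric | sum of weighted pipeline hours        | not_null |
| capacity_hours  | numeric | headcount x billable hours per month  | not_null |
| hiring_gap_fte  | numeric | (demand_hours - capacity_hours) / 160 | not_null |
```

Multiply that by every model in every layer and you get to 67 pages quickly. An agent consumes it all without fatigue.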

We created this document workflow years before we had agents. We used to write these specs for our human engineers. Now we write them for agents. The only difference is that the specs have gotten more detailed, because agents can consume and follow 67 pages of technical context without losing focus, something no human can match.

How Agentic Analytics Engineering Changes Data Team Economics

The dynamics that make data projects slow are not going away. Stakeholder requirements shift constantly. Budgets shrink while expectations grow. Deadlines force teams to choose between speed and quality, and they usually sacrifice quality.

Agents do not solve these problems by typing faster. They solve them by compressing the gap between a precisely defined business question and trusted, production-ready infrastructure. The documentation forces alignment before building starts. The agents execute the build at a speed and consistency level that eliminates the traditional tradeoff between speed and quality. The human stays in the loop as the architect, the reviewer, and the decision-maker.

This is what we mean when we say our talent handles the thinking and our agents handle the typing. It is not a slogan. It is the operating model that drives 90%+ client retention.

Watch the Full dbt Labs and Mammoth Growth Agent Webinar On-Demand

We showed the full workflow live: the documentation pipeline, the agent configuration, the Claude Code commands, the dbt MCP integration, the PR output, and the business analysis layer. If you want to see agentic analytics engineering done at production quality, not a toy demo, this is the session.

[Watch "Operationalize Analytics Agents: dbt AI Updates and Mammoth's AE Agent in Action" on demand →]

If you are building with agents today, or thinking about starting, we are happy to share what we have learned. Reach out to Tom Clinton or Dylan Cruise on LinkedIn, or book a conversation with our team.

[Book a Conversation →]