Vibe Coding, But Production-Ready: A Specs-Driven Feedback Loop for AI-Assisted Development

ai
29 Mar 2026
25 min read

Vibe coding is fun, fast, and honestly one of the best ways to unlock momentum. I use it too.

But when we move from exploration to production, momentum alone is not enough. If we skip the feedback loop, we pay later with rework, version mismatches, security gaps, and architecture drift.

This post is an instructional playbook for a mixed audience: engineers, tech leads, and product people who collaborate in AI-assisted software delivery.

The examples throughout use Angular and Java/Spring because that is my primary tech stack. The cycle itself applies to any framework, language, or platform — the version drift and default-selection problems happen everywhere.

The Core Idea

The goal is not to stop vibe coding.

The goal is to add engineering control around vibe coding so we can keep speed without sacrificing quality.

Think of it this way:

Decision Framework: When to Vibe, When to Spec

Use this simple rule:

This keeps speed where speed belongs, and rigor where rigor matters.

Two Workflows, Two Outcomes

Workflow A: Prompt to Implementation (No Feedback Loop)

Typical flow:

  1. Ask AI to build the project.
  2. Accept defaults.
  3. Start coding features.
  4. Discover issues later during integration or release.

Common failure modes:

Workflow B: Specs-Driven AI Cycle (With Feedback Loop)

Typical flow:

  1. Capture intent and constraints.
  2. Write a high-level technical design.
  3. Expand into low-level decisions.
  4. Ask AI to implement against that spec.
  5. Validate versions, dependencies, architecture, and tests.
  6. Feed findings back into the next iteration.

This workflow still moves quickly, but it reduces expensive surprises.

Common Advice vs Practical Reality

Common advice map

Reality map

Enterprise Reality: Software Development Is More Than Code Generation

One trap in AI-assisted delivery is treating code generation as if it were the entire software development process.

It is not.

Software development includes coding, but also verification, security, compliance, operability, and release readiness.

In enterprise environments, generated source code usually must pass explicit guardrails before it can be merged or deployed:

Coding is one part of the equation, not the whole equation.

The Engineer’s Role Is Changing, Not Shrinking

If AI can generate code, what is the engineer’s job now?

The role shifts from writing every line to orchestrating and validating the output. You define intent, set constraints, review architecture decisions, verify correctness, and own what gets committed. The developer who merges AI-generated code is accountable for it — the same way you are accountable for code written by a junior teammate after you approve the PR.

This is not a lesser role. It is a harder one.

Orchestration requires the same skills it always did: system design, debugging, security awareness, performance intuition, and domain knowledge. AI does not replace those skills. It amplifies them.

A concrete example: you ask AI to scaffold a Spring Boot service with a REST endpoint that accepts user input and queries a database. The generated code compiles, tests pass, and the endpoint works. But when you review it, you notice the query is built with string concatenation instead of parameterized queries. If you understand SQL injection, you catch it in seconds. If you do not, it ships to production with a security hole that no test covered because no test was written for that attack vector.
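To make that review catch concrete, here is a minimal, self-contained sketch of the two query styles (class and method names are illustrative, not from a real service):

```java
public class QueryExample {
    // Vulnerable: user input is concatenated straight into the SQL string,
    // so the input can rewrite the query itself.
    static String unsafeQuery(String username) {
        return "SELECT * FROM users WHERE name = '" + username + "'";
    }

    // Safe pattern: a '?' placeholder. The JDBC driver binds the value
    // separately, so input can never change the query structure.
    static final String SAFE_QUERY = "SELECT * FROM users WHERE name = ?";

    public static void main(String[] args) {
        // A classic injection payload turns the WHERE clause into a tautology
        // that matches every row.
        System.out.println(unsafeQuery("' OR '1'='1"));
        // Prints: SELECT * FROM users WHERE name = '' OR '1'='1'
    }
}
```

No test suite catches this unless someone wrote a test for that attack vector, which is exactly the point: the reviewer's security knowledge is the guardrail.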

That is the difference between generating code and engineering software.

The shift is in where you spend your time. Less time on boilerplate and syntax. More time on design, review, integration, and verification. The feedback loop in this post exists precisely because orchestration without structure is just accepting defaults with extra steps.

Skills still matter. They matter more, not less, because the cost of missing a bad decision goes up when code arrives faster than your ability to evaluate it.

Why Product People Should Care Too

This is not just an engineering process concern.

When teams skip specification and design:

When teams adopt a lightweight spec-first loop:

Estimates, Not Guesstimates

One underrated benefit of this process: it makes estimation genuinely easier.

When you have a written intent spec, a high-level design, a low-level design with acceptance criteria, and a test scenario catalog, you are no longer estimating in a vacuum. You know what needs to be built, how it connects, what the edge cases are, and what “done” looks like.

That changes estimation from a guessing game into a structured conversation.

The specs-driven cycle does not eliminate uncertainty. But it replaces vague gut feel with grounded decomposition, and that is where reliable estimates come from.

Before You Prompt: Set Up Instructions and Skills First

The single highest-leverage thing a team can do before running any prompt is to encode their guardrails into the codebase as instruction and skill files.

These are plain text files, checked into the repository, that many AI tools read automatically at the start of a session. When your tool supports repository instructions, they apply to every prompt without being restated. In practice, this turns tribal knowledge into shared defaults.

What belongs in instruction files:

What skills add on top:

Skills are named, reusable workflows. Instead of rewriting a system role and a set of rules on every prompt, a skill pre-packages that context. A tdd-implementation skill already knows it must write tests first, use your project’s test framework, and follow your naming conventions before the engineer types a single word.

Why this matters before prompting:

Without instruction files, every prompt starts from a blank slate. The AI does not know your version policy, your test strategy, or your architecture boundaries. It makes plausible but uninformed decisions. With instructions in place, prompts can inherit those constraints automatically in tools that support this behavior. The guardrails are not something you have to remember to add manually every time.

Set up your instruction and skill files before you run your first prompt, not after you have already started building.
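As a sketch, an instruction file for the stack in this post might look like the following. The file path and the exact rules are illustrative, not a standard; check which location your tool actually reads:

```markdown
<!-- .github/copilot-instructions.md (illustrative path; tools differ) -->
# Team instructions

- Verify framework versions before scaffolding: Angular CLI and Spring Boot
  must be on currently supported release lines.
- Write tests first. Use the project's existing test framework and naming
  conventions; do not introduce new test libraries.
- Never build SQL with string concatenation; always use parameterized queries.
- Any architecture change requires a design delta note before implementation.
```

Notice that each line encodes a decision the team has already made, so no individual prompt has to re-litigate it.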

The Specs-Driven Cycle You Can Use This Week

Use the following five-step cycle.

Step 1: Define Product Intent (Short, Crisp, Testable)

Write a one-page product intent spec:

Prerequisite: Read and Understand the Existing Codebase

If your project has an existing codebase, read and understand it before proceeding to Step 2.

For greenfield projects: skip this prerequisite and proceed directly to Step 2. Do not spend time asking the AI to generate or imagine a codebase structure that does not exist yet.

Ask the AI to read and understand your existing code before it proposes any technical design or implementation. This is critical because:

Practical approach:

  1. Provide the AI with key codebase context: repository structure, main modules, entry points, key design decisions.
  2. Ask the AI to read and summarize the existing architecture, patterns, and conventions.
  3. Include this summary as context in all subsequent design and implementation prompts.

This shifts the AI from “build something from scratch” to “extend and improve what exists.”

Step 2: High-Level Technical Design

Document:

Step 3: Low-Level Design and Acceptance Checks

Document:

At this step, each scenario should be explicit and testable:
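For example, a single scenario written in Given/When/Then form might look like this (the feature and values are hypothetical):

```gherkin
Scenario: expired session is rejected
  Given a user whose session expired 5 minutes ago
  When they request GET /api/orders
  Then the response status is 401
  And the error body contains code "SESSION_EXPIRED"
```

A scenario at this level of precision can be handed directly to the TDD step: each line maps to a concrete assertion.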

Break the Low-Level Design Into a Detailed Implementation Plan

Once the low-level design is documented, go one step further: ask the AI to produce a detailed, ordered implementation plan before writing a single line of production code.

This plan should list every task — components, endpoints, data models, migrations, tests, configuration — as discrete, sequenced steps small enough to evaluate individually.

Why this matters:

Think of this plan as a checklist the team agrees on before any implementation prompt is sent. Each item is a unit of work that can be handed to AI, reviewed independently, and tested in isolation.

Gate: Design Review Before Implementation

If your company has a design committee, architecture committee, or even an informal tech lead review process, this is the right moment to use it.

Present the high-level and low-level designs before any code is generated. Get feedback from peers and stakeholders while changes are still cheap. A single comment at the design stage can prevent days of rework after implementation.

This is not bureaucracy. It is the same discipline engineers apply to code review, extended one step earlier where the cost of change is lowest. AI makes implementation fast — but fast implementation in the wrong direction is still the wrong direction.

Do not skip the review gate because the AI can generate code quickly. Speed of generation is not a reason to compress the thinking time.

Step 4: AI-Assisted Implementation Against the Spec (TDD First)

In this step, ask AI to follow a strict TDD cycle. Implementation comes after tests, not before.

Use this sequence for each feature slice:

  1. Ask AI to write tests first from acceptance criteria.
  2. Run tests and confirm they fail for the expected reason.
  3. Ask AI to implement the smallest change needed to make tests pass.
  4. Re-run tests.
  5. Ask AI to refactor while keeping tests green.
  6. Move to the next feature slice.
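The six steps above can be illustrated with one red-green slice in Java. The validator and its rule are hypothetical, standing in for a real acceptance criterion; in the red phase, the inner class does not exist yet and the asserts fail:

```java
public class TddSlice {
    // Step 3's minimal implementation. During the red phase this class
    // is absent and the tests in main() cannot pass.
    static class EmailValidator {
        static boolean isValid(String address) {
            // Smallest change that satisfies the tests: require one '@'
            // with characters on both sides. Real rules grow from further
            // failing tests, not from speculation.
            int at = address == null ? -1 : address.indexOf('@');
            return at > 0 && at < address.length() - 1;
        }
    }

    public static void main(String[] args) {
        // Step 1: tests derived from acceptance criteria, written first.
        assert EmailValidator.isValid("user@example.com");
        assert !EmailValidator.isValid("user.example.com"); // no '@'
        assert !EmailValidator.isValid("@example.com");     // empty local part
        System.out.println("green"); // run with: java -ea TddSlice.java
    }
}
```

Each slice stays small enough that a reviewer can verify the test maps to its criterion before the implementation prompt is ever sent.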

The TDD cycle applies at every test layer, not just unit tests:

Why this matters:

Practical rule:

Step 5: Verification and Feedback

Run checks and compare output with the spec:

Then package delivery for review:

Then iterate.

Concrete Examples: Angular and Spring Boot

The two examples below come from my day-to-day stack. If you work with React and Node, Python and FastAPI, or Go and Postgres, the same patterns apply — only the CLI commands and support URLs change.

Angular Example: Version Drift Without Feedback

A frequent real-world issue:

Current Angular release policy and support windows are explicit in official docs. Version alignment should be validated before implementation deepens. Version numbers in this section are examples current at publication time; always verify against the official support matrix.

Practical guardrails:

# always check CLI and framework versions first
ng version

# generate project with latest version (example)
npx @angular/cli@latest new my-app --routing --style=scss

# validate dependency tree and update recommendations
ng update

Key lesson: the issue is not “AI is bad”. The issue is that we asked for implementation before defining constraints.

Spring Boot Example: Default Does Not Mean Right for Your Context

Another common pattern:

Important nuance:

As with Angular, treat version numbers here as point-in-time examples and confirm support status before implementation.

So the right question is not “3 or 4?” in isolation. The right question is:

Key lesson: use official generation channels and support docs as your source of truth, then use AI to accelerate implementation details.

Sample Prompts You Can Reuse

Below are practical prompts for each stage.

Prompt A: Product Intent Spec

**Context:**
You are a product engineering assistant. You are helping a team prepare specification documents for feature development before any design or implementation work begins.

Feature idea:


**Objective:**
Produce a one-page product intent specification that aligns engineering and product teams on scope, success criteria, and constraints for this feature.

**Audience:**
Product managers, engineers, tech leads, and stakeholders making planning and prioritization decisions.

**Style:**
Structured. Numbered sections. Explicit, actionable language. Avoid ambiguity.

**Tone:**
Collaborative and clarifying. If information is missing or ambiguous, ask focused questions instead of making assumptions. Assume stakeholders want precision.

**Response:**
Deliver exactly these six sections in this order:
1) Problem statement (one paragraph)
2) Target users (bullet list)
3) Success metrics (specific, measurable)
4) Out of scope (explicit non-goals)
5) Risks and assumptions (potential blockers or dependencies)
6) Acceptance criteria in Given/When/Then format
   - Include happy path, validation/failure cases, and at least one edge case per criterion

Prompt B: High-Level Technical Design

**Context:**
You are a senior software architect designing solutions. You have reviewed the product intent spec and are now translating business requirements into system design.

Intent spec:


**Objective:**
Produce a high-level technical design that translates the product intent into architecture and system boundaries, without implementation details or code.

**Audience:**
Engineers, tech leads, and architects who will review this design and decide if it aligns with technical strategy and team capabilities.

**Style:**
Text-based diagrams and structured sections. Visual representations in ASCII or text form are preferred (not code). Annotate relationships and data flows clearly.

**Tone:**
Clear and architectural. Explain trade-offs between alternatives. Flag constraints or concerns early.

**Response:**
Deliver exactly these sections in this order:
- Architecture diagram in text form (ASCII or text-based visualization)
- Component responsibilities (what each major component owns)
- Data flow (how data moves between components)
- Security and observability requirements (non-functional needs)
- Key trade-offs and alternatives considered (why this design, not another)
- High-level test scenario map (happy path, failure paths, and edge-case families)

Do not generate implementation code or tests. Do not write code in any language.

Prompt C: Low-Level Design and Version Policy

**Context:**
You are a staff engineer preparing an implementation plan. You have the high-level design and must now specify concrete interfaces, data models, and version constraints so implementation work can be precise and testable.

High-level design:


**Objective:**
Produce a detailed low-level design and implementation plan that specifies what to build, version constraints, and test strategy — enabling unambiguous work assignments.

**Audience:**
Implementation engineers, QA, and architects who need to know exactly what to build and verify, including which versions are acceptable.

**Style:**
Detailed and concrete. Specify interfaces, data models, and error handling explicitly. Include specific version and dependency requirements.

**Tone:**
Precise. No ambiguity about version policy or technical decisions. Flag any gaps or assumptions.

**Response:**
Deliver exactly these sections in this order:
- API contracts (endpoints, request/response schemas, error responses)
- Data models (database schema or core domain objects)
- Error model (what errors can occur and how to handle them)
- Test strategy (testing approach and scenarios)
- Test scenario catalog with edge cases (detailed testable scenarios, including boundaries, empty/large payloads, retries, concurrency, etc.)
- Dependency/version policy (which versions of which dependencies are acceptable)

Version policy requirements must include:
- Angular: must be aligned with actively supported major versions
- Spring Boot: must use a currently supported release line and compatible Java version

Prompt D: Tests First (TDD)

**Context:**
You are a senior engineer working in strict test-driven development (TDD) mode. You have a low-level design and acceptance criteria. Tests must be written first, before any production code.

Low-level design:


Acceptance criteria:


**Objective:**
Write test files that directly correspond to the acceptance criteria and test scenarios. These tests will drive implementation. Do not write production code yet.

**Audience:**
Engineers who will run these tests immediately and implement code to make them pass.

**Style:**
Test code in the project's native test framework. One test per clearly named scenario. Include brief comments explaining what each test validates.

**Tone:**
Explicit. Each test must map to one acceptance criterion. Leave no ambiguity about what passes or fails.

**Response:**
Deliver:
- Test files (write actual test code using the project's test framework)
- One test per acceptance criterion, plus at least one edge case test per criterion
- Brief comments for each test explaining what it validates
- Commands to run the test suite

Imperative: Write tests only. Do not write any production code. Do not implement any features. Your output is test files only.

Prompt E: Minimal Implementation to Pass Tests

**Context:**
You are a senior engineer continuing strict TDD. Tests have been written and are currently failing. Your job is to write the minimal production code needed to make all tests pass — nothing more.

Low-level design:


Existing failing tests:


**Objective:**
Implement only the production code required to make all existing tests pass. Do not add features not covered by tests. Do not refactor unless tests fail.

**Audience:**
Engineers and code reviewers verifying that implementation matches the low-level design and test intentions.

**Style:**
Production code written in the project's native language. Follow existing code style and architecture conventions. Keep implementation focused and minimal.

**Tone:**
Strict. Only code that makes tests pass. No speculative features. If tests pass, you are done with this slice.

**Response:**
Deliver:
- Production code files (write implementation code only, no tests)
- Commands to run the existing tests (to verify they pass)
- Commands to verify framework/runtime versions (to confirm the environment)
- Assumptions checklist (what assumptions did you make? are they in the low-level design?)
- Expected test output summary (show which tests now pass)

Imperative: Do not modify the tests. Do not add features. Do not refactor. If the low-level design seems wrong, propose a design amendment instead of changing architecture.

Prompt F: Verification and Feedback Report

**Context:**
You are reviewing completed implementation against the low-level design spec. It is time to audit whether the work matches intent and identify gaps, risks, or compliance issues before release.

Spec:


Implementation summary:


**Objective:**
Produce a gap report that compares the implementation against the spec and identifies what matches, what is missing, what risks remain, and whether the work is production-ready.

**Audience:**
Engineers, tech leads, QA, and release managers deciding whether this work is ready to merge and ship.

**Style:**
Structured report with matrices, lists, and clear status indicators (met/partial/missing). Prioritize risks by severity.

**Tone:**
Critical and honest. Flag every gap and risk. Provide actionable remediation steps. Give a clear yes/no on production readiness.

**Response:**
Deliver exactly these sections in this order:
1) Compliance matrix (spec requirement → implementation status: met/partial/missing)
2) Version and dependency validation (are versions correct and supported?)
3) Risk list by severity (high/medium/low)
4) Suggested remediation steps (how to fix gaps before release)
5) Decision: ready for production? yes/no and why. If no, list the top 3 blockers.

From Prompts to Reusable Team Standards

Running these prompts manually as one-off messages is a good start. The next step is to bake them into your codebase so every engineer on the team uses the same starting point, every time.

Most AI-assisted development tools support three kinds of reusable artifacts:

The payoff is consistency and speed:

Treat your prompt files and agent definitions the same way you treat code: review them, version them, and refine them as you learn what works.

Taking It Further: MCP Servers Close the Loop with Real Data

Instructions, prompt files, and agents make the cycle consistent. MCP servers make it live.

MCP (Model Context Protocol) is an open standard that lets AI tools connect directly to external systems: file systems, APIs, registries, CI pipelines, test runners, and more. Instead of pasting context into a chat window, you give the AI a direct, authorized connection to the source of truth.

Every step in the specs-driven cycle benefits from this:

Step 1 — Product Intent: An MCP server connected to your issue tracker (GitHub Issues, Jira, Linear) can read the actual ticket, linked dependencies, and prior ADRs, so the AI writes the intent spec against real project context instead of a description you pasted.

Prerequisite — Codebase Reading: Replace manual copy-paste with a filesystem or GitHub MCP server. When configured and authorized, the AI can read your repo structure, key modules, and open PRs directly. The codebase summary in Prompt B is generated from live data, not memory.

Step 2 — High-Level Design: An MCP server can fetch the current official support schedules from Angular, Spring Boot, or Node.js release endpoints at prompt time. The version policy check in Prompt C is grounded in live registry data, not training-data guesses that may be months out of date.

Step 3 — Low-Level Design: Database and API MCP servers let the AI inspect real schemas, existing endpoint contracts, and live OpenAPI specs. When those integrations are enabled, interface designs match what actually exists, not what the AI imagines exists.

Step 4 — TDD Loop: A test-runner MCP server can execute the test suite and return actual pass/fail results inside the conversation. The TDD loop in Step 4 becomes tight and automatic: write tests → run via MCP → see real failure output → implement minimal fix → run again → verify green.

For end-to-end tests, the Playwright MCP server is a direct fit. Once configured, it gives the AI a live browser it can control: navigate to a URL, click elements, fill forms, assert on visible content, and return the results without leaving the conversation. This closes the e2e loop in the same way a test-runner MCP closes the unit test loop — the AI writes the Playwright test, runs it through the MCP server against the locally running app, reads the failure output, fixes the implementation, and reruns until the acceptance scenario passes.

You can seed that loop with Playwright Codegen: run npx playwright codegen <your-app-url>, interact with the UI manually, and Codegen outputs a ready-to-edit test file. Hand that file to the AI via the MCP server for assertion review and edge-case coverage — you get a grounded starting point instead of asking the AI to infer selectors from thin air.

Step 5 — Verification: CI and security MCP servers can trigger a real pipeline run and return the results inline. The verification report in Prompt F is built from actual build, lint, and scan outputs rather than the AI inferring what might be wrong.

The combined effect: you move from a human-in-the-loop feedback cycle to a grounded feedback loop where each step is validated against real, current data. The AI is still doing the reasoning; MCP servers are supplying the facts.

Practical starting points:

You do not need all of these on day one. Start with one — the one that removes your team’s most expensive manual step — and layer from there.
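As one concrete illustration, wiring a filesystem MCP server into a tool that reads an mcp.json-style configuration might look like this. The file name, schema, and repository path are assumptions; consult your tool's MCP documentation for the exact format it expects:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/repo"]
    }
  }
}
```

With this in place, the codebase-reading prerequisite from Step 1 stops being a copy-paste exercise: the AI reads the repository directly.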

A Lightweight Team Working Agreement

If you want this to stick across product and engineering, agree on a short process:

  1. No implementation prompts before intent + design are approved.
  2. Every AI-generated project must include a version validation step.
  3. Any architecture change during coding requires a design delta note.
  4. PRs include a spec compliance checklist.
  5. Release readiness requires explicit support-lifecycle verification.

This is a small governance layer with a huge payoff.

Final Takeaway

You do not need to choose between creativity and discipline.

Vibe coding is a powerful accelerant. Specs-driven feedback loops are the steering wheel and brakes.

Great teams use both.

If your organization wants to adopt AI coding responsibly, start with one change: never go from prompt to production without a spec checkpoint and a verification loop.
