
$ You Can't Vibe Code Without Breaking a Few Eggs

When you're cooking, there will be shells. When you're vibe coding, there will be markdown files scattered everywhere. AI-generated artifacts aren't bugs in the process—they're receipts of creation. Here's why fighting the mess is futile, and why hiring a cleanup crew (of agents) is the answer.

There’s a saying that if you want to make an omelet, you have to break a few eggs. Another version: if you’re working in the kitchen during service, expect shells on the counter, grease on the stove, and mise en place bowls stacked in the sink.

The creative process leaves a mess. Always has. Always will.

This is the unspoken truth of vibe coding.


The Artifacts Nobody Talks About

When you work with AI to build software—whether you call it vibe coding, vibe engineering, or just “prompting Claude until something compiles”—you’re going to end up with artifacts you didn’t ask for.

Markdown files in the root directory. Notes adjacent to the file you’re working on. Scratch pads near the AI’s configuration. Planning documents that made sense in the moment but now sit there, orphaned, taking up space in your mental model of the codebase.

These files aren’t bugs. They’re not mistakes. They’re side effects of a non-deterministic creative process.

Here’s what happens: you ask the AI to build something. The AI, being a good collaborator, starts taking notes. It creates a plan. It documents its reasoning. It leaves breadcrumbs. Sometimes you asked for this. Sometimes you enabled a skill that makes it more “chatty” about its thinking. Sometimes it just… happens.

Claude Code even ships “teaching modes”—Explanatory and Learning configurations—that deliberately produce more narration and breadcrumbs. They’re useful for understanding what the AI is doing and why. But they also amplify the side effects: notes, TODOs, scratch files, half-finished scaffolds. If you treat all artifacts as trash, you throw away signal with the noise.

The location of these artifacts is essentially random. Root directory. Adjacent to where it’s working. Inside .claude/ or wherever it keeps its own context. There’s no rhyme or reason because the AI doesn’t think about file organization the way you do. It’s operating in a fundamentally different mode—quantum, chaotic, non-deterministic. Call it whatever you want. The point is: you cannot control every aspect of how AI works.

And here’s the controversial take: you shouldn’t try to.


The Illusion of Control

I see developers spending hours writing rules. Claude rules. Cursor rules. System prompts. Guidelines. Constraints. All in an effort to make the AI work “more neatly” or “more precisely.”

Let me be clear: context engineering is valuable. Steering the AI in the right direction matters. Spec-driven development helps. These are real techniques with real benefits.

But they’re also extras. They’re not the task you set out to do.

Your job isn’t to sit there writing rules for the AI. Your job is to build the product. The rules should emerge from the work, not precede it. When you sit down to write rules in isolation—disconnected from actual implementation—they’re not grounded in reality. They become theoretical constraints that may or may not help when the AI hits real code, real problems, real edge cases.

The most effective context engineering happens during the work, not before it. You discover what guidance the AI needs by watching it fail. You add constraints when you see patterns of mistakes. You steer incrementally, not prescriptively.

This is the nature of working with a system that doesn’t think like you do.


The Language Barrier

Here’s an analogy that might help: working with AI is like working with a very capable animal.

Stay with me.

Your dog doesn’t speak English. Neither does your cat. They try their best to understand you—through tone, movement, repetition, energy. They pick up patterns. They learn what makes you happy. But there’s a fundamental gap. No matter how smart your dog is, it will never understand a complex sentence the way another human would.

AI is similar. Yes, it “speaks” English. But the English you write is just a blob of text that gets tokenized, converted to numbers, and run through inference to produce some output. If we’re talking about an agent, that inference happens multiple times—into actions, tool executions, file reads, API calls. A million intermediate steps between your intent and the final result.

The more capable the model, the better this process works. Good training data helps. Good system prompts help. Anthropic knows how to steer Claude well. But there’s no guarantee. It’s still a probabilistic, stochastic process. You’re not giving instructions to a subordinate. You’re steering an animal toward a destination.

This is why we have plan mode, ask mode, spec-driven development—they’re all steering mechanisms. They help you confirm that the AI understood what you meant. They create checkpoints for alignment. But they don’t eliminate the fundamental gap.

We lack a precise language for communicating with AI. We use English because it’s the best interface we have, but it’s a lossy medium. Meaning gets compressed, inferred, sometimes lost entirely.


Creation Leaves a Mess

So here’s the situation: you’re working with a powerful but imprecise tool, steering it toward a goal through a lossy communication channel, and along the way it’s generating artifacts—notes, plans, scratch files—that you didn’t explicitly request.

Some developers treat this as a problem to solve. They get frustrated. They manually clean up. They write stricter rules. They try to prevent the mess from happening in the first place.

I think this is the wrong approach.

Look at your apartment. Look at your desk. Look at your life. Some people maintain perfect order, sure. But most of us? We’re focused on a million things. We’re lazy sometimes. We don’t have time to tidy. We leave dishes in the sink because we’re in the middle of something more important.

And that’s fine.

The mess isn’t the enemy. The mess is evidence that creation happened. The dishes in the sink mean you cooked a meal. The scattered notes mean the AI was thinking. The orphaned markdown files mean progress was made.

Fighting the mess is like fighting gravity. You can spend energy on it, but you’ll never win permanently.


The Post-It Model

Think about how humans work on complex problems.

You write post-it notes. You scribble on whiteboards. You leave browser tabs open. You create scratch files. You accumulate artifacts of your thinking process—things that help in the moment but aren’t part of the final deliverable.

Eventually, you clean up. You throw away the post-its. You close the tabs. You delete the scratch files. But you don’t do it continuously. You do it when the work is done, or at natural breakpoints, or when the mess starts interfering with the next task.

AI artifacts work the same way.

The markdown files the AI leaves behind are its post-it notes. They capture the state of the solution at a point in time. They preserve reasoning that might be useful later. They document decisions that were made.

Treating them as garbage immediately is a mistake. You might find something valuable when you actually look. A gap you didn’t realize existed. A decision you forgot was made. Context that helps when you return to the code six months later.

The question isn’t how to prevent the artifacts. The question is: when and how do you clean them up?


The Cleaning Crew Model

Here’s my proposal: stop treating cleanup as a manual burden and start treating it as a workflow.

If you hire a cleaning crew for your house, you stop worrying about the mess. You cook, you create, you live your life. The cleaning crew comes on their schedule and restores order. You don’t think about it. It’s handled.

Why can’t the same model work for codebases?

Let the builder build. Let the AI be messy. Let the artifacts accumulate. Then, separately, send an agent to clean.

This could happen:

  • On a schedule (weekly cleanup pass)
  • Before release (part of your delivery pipeline)
  • Before merge (PR-level hygiene)
  • After the fact on main (post-merge tidying)

The cleanup agent’s job is to:

  1. Harvest the important ideas from scattered artifacts
  2. Categorize them into appropriate locations
  3. Discard what’s no longer relevant
  4. Surface gaps or incomplete work that needs attention

This is a feedback loop, just like compilation errors or test failures. The AI creates, the code compiles (or doesn’t), tests pass (or don’t), and then a cleanup pass happens (or doesn’t). Close the loop.
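If you want that loop to be mechanical rather than aspirational, the scan itself is easy to automate. Here's a minimal report-only sketch in Python; the directory names and the "markdown outside allowed locations" heuristic are assumptions for illustration, not a prescription:

    # artifact_scan.py: report stray AI artifacts. Run it on a schedule,
    # in CI, or as a pre-merge check. All paths below are hypothetical.
    from pathlib import Path

    ALLOWED = {"docs", ".scratch", ".claude"}  # assumed convention

    def stray_artifacts(repo: Path) -> list[Path]:
        """Markdown files whose top-level directory isn't in ALLOWED."""
        repo = repo.resolve()
        strays = []
        for md in repo.rglob("*.md"):
            top = md.relative_to(repo).parts[0]
            if top not in ALLOWED and md.name != "README.md":
                strays.append(md)
        return strays

    if __name__ == "__main__":
        for path in stray_artifacts(Path(".")):
            print(f"stray artifact: {path}")

A report like this doesn't block anything; it just makes the mess visible on whatever cadence your cleaning crew visits.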


The Case Against Early Cleanup

Some will argue: just clean up as you go. Don’t let the mess accumulate.

I think this is wrong for several reasons.

First, it’s distracting. Context switching from creation to organization and back kills momentum. When you’re in flow, the last thing you want is to stop and tidy files. The same applies to AI—if you’re constantly correcting its organizational habits, you’re not getting work done.

Second, you lose information. Early cleanup means discarding artifacts before you fully understand their value. That markdown file the AI created might contain reasoning you’ll want later. That planning document might capture requirements you’ll forget. Premature cleanup is premature optimization—and we know how that goes.

Third, it slows your cycles. If cleanup happens before every merge, you’re adding friction to every PR. If cleanup happens after the fact—on a schedule, in batches—you maintain velocity during active development and consolidate the tidying into dedicated passes.

The goal is to separate the creation phase from the organization phase. Let them happen in different modes, at different times, with different tools.


What’s Actually in Those Artifacts?

When you stop treating AI artifacts as garbage and start treating them as material worth examining, interesting things emerge.

Mental models. The AI’s notes often reveal how it understood the problem. This can expose misalignments early—or confirm alignment when you weren’t sure.

Incomplete work. Sometimes the AI leaves a task half-done. The artifact captures the state. This is valuable when you return later and need to understand where things left off.

Alternative approaches. The AI might have explored options it didn’t ultimately pursue. Those explorations could be useful if requirements change.

Documentation seeds. The explanatory content the AI generates—even if poorly placed—can become the basis for actual documentation.

Technical debt markers. If the AI keeps leaving notes about something, that might indicate a problem worth addressing.

Cleanup isn’t just deletion. It’s curation. It’s finding the signal in the noise and putting it somewhere useful.


The Ralph Wiggum Technique

There’s been talk recently about the “Ralph Wiggum technique” for AI—essentially letting the AI be chaotic and iterating from there. It’s not the only approach, but it captures something true.

AI is iterative by nature. You prompt, you get output, you evaluate, you prompt again. The loop continues until you’re satisfied—or until you give up.

Fighting the chaos at each iteration is exhausting. A better approach is to let the chaos happen in bounded contexts and then clean up at well-defined points.

This is similar to how we treat code quality. We don’t demand perfect code in every commit. We allow messy work-in-progress commits on feature branches. The cleanup happens before merge—squashing commits, writing proper messages, ensuring tests pass.

Why should AI artifacts be different?


Practical Implementation

If you’re sold on the cleaning crew model, here’s how to implement it:

1. Define Acceptable Locations

Decide where artifacts are allowed to accumulate. Maybe a .scratch/ directory. Maybe adjacent to whatever the AI is working on. Maybe in the AI’s own config space. Have a convention, but don’t enforce it rigidly during active development.

2. Create a Cleanup Skill

Build an AI skill (or agent) specifically for cleanup; a code sketch follows this list. Its job:

  • Scan for artifacts outside acceptable permanent locations
  • Analyze content for value
  • Categorize: discard, relocate, or promote to documentation
  • Present findings for human review
  • Execute cleanup with approval
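
A minimal sketch of that loop in Python. The keyword heuristics and category names here are placeholders; a real skill would hand the file contents to a model for judgment instead of grepping for strings:

    # cleanup_skill.py: triage scattered artifacts, then ask before acting.
    # The heuristics are deliberately dumb stand-ins for an LLM's judgment.
    from pathlib import Path

    def triage(path: Path) -> str:
        """Return 'promote', 'relocate', or 'discard' for one artifact."""
        text = path.read_text(errors="ignore").lower()
        if "decision" in text or "architecture" in text:
            return "promote"   # candidate for real documentation
        if "todo" in text or "unfinished" in text:
            return "relocate"  # incomplete work worth keeping visible
        return "discard"       # no obvious signal left

    def run(strays: list[Path]) -> None:
        plan = {p: triage(p) for p in strays}
        for path, action in plan.items():      # present findings
            print(f"{action:8} {path}")
        if input("apply? [y/N] ").lower() == "y":  # human approval
            for path, action in plan.items():
                if action == "discard":
                    path.unlink()

The relocate and promote branches are left open on purpose. What matters is the shape: analyze, propose, confirm, execute.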

3. Schedule the Cleanup

Add cleanup to your workflow at natural breakpoints:

  • End of sprint
  • Before release
  • Weekly on a schedule
  • As part of PR review

4. Preserve Context When Discarding

When removing artifacts, don’t just delete. Extract key insights first. Update actual documentation. Note decisions that were made. The cleanup process should capture value, not just destroy clutter.
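A sketch of what that can look like in practice, with the log location and the line-matching rules as assumptions: skim the artifact for headings and decision language, append the keepers to a running log, and only then delete.

    # Harvest headings and decisions from an artifact before deleting it,
    # so the context survives the cleanup. Log path is an assumption.
    from datetime import date
    from pathlib import Path

    def harvest_then_delete(artifact: Path,
                            log: Path = Path("docs/decisions.md")) -> None:
        keepers = [
            line for line in artifact.read_text(errors="ignore").splitlines()
            if line.startswith("#") or "decided" in line.lower()
        ]
        if keepers:
            log.parent.mkdir(parents=True, exist_ok=True)
            entry = f"\n## From {artifact} ({date.today()})\n" + "\n".join(keepers)
            with log.open("a") as f:
                f.write(entry + "\n")
        artifact.unlink()  # delete only after the value is extracted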

5. Review the Harvest

Periodically review what the cleanup process found. Are there patterns? Is the AI consistently creating certain types of artifacts? Are those artifacts useful or noise? Adjust your steering based on what you learn.
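If each cleanup pass logs what it did, the review step can be a few lines of aggregation. A sketch, assuming a hypothetical cleanup.log with one "action path" pair per line:

    # Tally what the cleaning crew keeps finding, to spot patterns worth
    # fixing at the steering level. Log format is assumed, e.g.:
    #   discard notes/old-plan.md
    from collections import Counter
    from pathlib import Path

    lines = Path("cleanup.log").read_text().splitlines()
    actions = Counter(line.split()[0] for line in lines if line.strip())
    for action, count in actions.most_common():
        print(f"{action}: {count}")

If discard dominates month after month, that's a pattern worth a new rule. If promote keeps showing up, the AI is writing documentation you would otherwise lose.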


Let AI Be AI

The deeper lesson here is about acceptance.

AI is not a human collaborator. It doesn’t think like you. It doesn’t organize like you. It doesn’t have the same priorities. It’s a different kind of entity—powerful, useful, but alien in important ways.

Trying to make AI act human is a losing battle. You’ll spend more energy on correction than creation.

Instead: understand how AI naturally works and build processes that accommodate it.

AI leaves artifacts? Create a cleanup workflow. AI communicates imprecisely? Build in verification checkpoints. AI makes mistakes? Create feedback loops that catch and correct.

Don’t try to solve everything at once. Don’t try to make AI something it isn’t.

It will be messy. That’s okay. That’s the nature of the beast. The mess is the receipt. It proves work happened.

Your job is to harness the output—the good parts—and deal with the mess at the right time, in the right way, with the right tools.

The kitchen is never clean mid-service.

And neither is your repo.


The Path Forward

Vibe coding is still new. We’re all figuring it out. The tools will get better. The techniques will mature. The mess might even get smaller over time.

But I suspect it will never go away entirely. Non-deterministic systems produce non-deterministic outcomes. Chaotic creation produces chaotic artifacts. This is inherent, not incidental.

The developers who thrive in this new world won’t be the ones who fight the mess. They’ll be the ones who build systems to handle it.

Let the builder build. Let the mess accumulate. Let the cleaning crew clean.

You can’t vibe code without breaking a few eggs. So stop pretending you can—and start building the kitchen that handles the shells.
