Vibe coding went from Andrej Karpathy’s tweet to Collins Dictionary’s Word of the Year in under twelve months. In Y Combinator’s Winter 2025 batch, 25% of startups had codebases that were 95% or more AI-generated. GitHub has reported that Copilot was responsible for an average of 46% of code being written across programming languages, and 61% in Java.
So yes, it has become the new normal and everyone’s doing it, but unfortunately most people are doing it badly. Tools like Claude Code and Cursor are excellent, but most vibe coders use them like autocomplete on steroids, or like a genie: just prompt randomly and wait for it to cook. Trust me, the output looks amazing at first glance, until the codebase is a mess that the agent itself can’t navigate, lol.
So in this guide, we cover five things that can make you as good as a developer who went to school for this. Maybe better.
1. Use CLAUDE.md and Rules as Persistent Context
Every Claude Code or Cursor session starts with the agent having seen nothing about your project before. It reads whatever files you point it at, infers what it can, and guesses the rest. For small isolated tasks that’s fine, but for anything heavy it isn’t, because those guesses keep compounding.
Let’s say you are three weeks into building a SaaS billing system. You open a new session and ask the agent to add a usage-based pricing tier. It doesn’t know you already have a BillingService class in /services/billing.py. It doesn’t know you standardized on Stripe’s price_id format for all pricing objects. So it creates a new PricingService, picks its own format, and builds something parallel to your existing architecture. Four sessions later you have two billing systems and neither is complete.
A CLAUDE.md file at the root of your project gets read at the start of every session. Here is what a real one looks like for a SaaS project:
# Project: Acme SaaS
## Stack
- Node.js + Express backend
- PostgreSQL with Prisma ORM
- React + TypeScript frontend
- Stripe for billing (price IDs follow format: price_[plan]_[interval])
## Key services
- /services/billing.py: all Stripe logic lives here, do not create parallel billing code
- /services/auth.py: JWT + refresh token pattern, see existing implementation before touching auth
- /lib/db.ts: single Prisma client instance, import from here
## Conventions
- All API responses: { data, error, meta } shape
- Errors always use AppError class, never plain Error
- Every DB query needs explicit field selection, no select *
## Do not touch
- /legacy/payments/: deprecated, being removed in Q3
- /auth/oauth.py: frozen until SSO ships
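That is the end of the example file. Conventions like the response shape and the error class are easier for an agent to follow when they also exist as code it can import. The following is a minimal sketch of what those two conventions might look like in TypeScript. The `code` and `status` fields and the `ok` and `fail` helpers are assumptions made for illustration; the example file only names the `AppError` class and the `{ data, error, meta }` shape.

```typescript
// Hypothetical error class matching the "always use AppError" convention.
// The `code` and `status` fields are assumptions, not part of the example file.
class AppError extends Error {
  code: string;
  status: number;

  constructor(message: string, code: string, status: number = 500) {
    super(message);
    this.name = "AppError";
    this.code = code;
    this.status = status;
  }
}

// Envelope for every API response: { data, error, meta }.
interface ApiResponse<T> {
  data: T | null;
  error: { code: string; message: string } | null;
  meta: Record<string, unknown>;
}

// Wrap a successful result in the envelope.
function ok<T>(data: T, meta: Record<string, unknown> = {}): ApiResponse<T> {
  return { data, error: null, meta };
}

// Wrap an AppError in the envelope; data is always null on failure.
function fail(err: AppError, meta: Record<string, unknown> = {}): ApiResponse<never> {
  return { data: null, error: { code: err.code, message: err.message }, meta };
}
```

With helpers like these in one place, the instruction file can point at a single module instead of describing the shape in words.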
Cursor now documents Rules and AGENTS.md for persistent instructions. GitHub Copilot supports repository-wide instruction files like .github/copilot-instructions.md, and some Copilot agent surfaces also read AGENTS.md, CLAUDE.md, and GEMINI.md.
When you add a new service or establish a new convention, update the file immediately. It becomes the agent’s memory between sessions.
One more thing: context rot is real. A 2025 Chroma study of 18 models found measurable accuracy drops as conversations grew longer, even on simple tasks. A 40-message session covering three features tends to be slower and less accurate than three separate 15-message sessions. Open a new conversation for each distinct task. Pin only the files relevant to that task.
2. Make the Agent Plan Before It Builds
The default behavior of every agentic tool is to start writing code the moment you describe something. For a self-contained task like “add a field to this form” that’s fine, but for anything with real scope it will create problems you don’t notice until you are deep into the implementation.
Here is a concrete example. You are building a team invitation system: a user enters an email, the system sends an invitation, the recipient clicks a link, creates an account, and gets added to the team. Sounds simple, but that feature touches your users table, your teams table, a new invitations table, your email service, your auth flow, and your JWT generation. If the agent misunderstands how your auth flow works and builds the invitation acceptance logic against a different assumption, you will not find out until the feature is mostly done.
Before any feature with scope, send this first:
Before writing any code: analyze the codebase, then give me a step-by-step plan
for building the team invitation system. List every file you will modify, every
file you will create, every DB migration needed, and any assumptions you are
making about the existing code. Do not write code yet.
A good plan output looks like this:
Files to modify:
- /routes/teams.ts: add POST /teams/:id/invite and POST /teams/accept-invite
- /services/email.ts: add sendTeamInvite() using existing Resend client
- /prisma/schema.prisma: add Invitation model
Files to create:
- /services/invitations.ts: token generation, validation, expiry logic
DB migration:
- invitations table: id, team_id, email, token (unique), expires_at, accepted_at
Assumptions:
- Invite tokens expire after 48 hours
- Inviting an already-registered email still goes through the invite flow
- No invite limit per team currently
Read that a couple of times and check: Is the 48-hour expiry right? Did it miss the rate limiting you need? Is it using the email service correctly? Fix the plan before a single line of code gets written.
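One way to pressure-test an assumption in a plan is to write the rule down in a few lines and see whether it matches what you want. The following is a minimal sketch of the 48-hour expiry rule, not the code the agent would produce. The `Invitation` type mirrors the columns listed in the plan, while `INVITE_TTL_MS`, `expiryFor`, and `isInvitationUsable` are hypothetical names introduced here for illustration.

```typescript
// Hypothetical shape mirroring the planned invitations table.
interface Invitation {
  id: string;
  teamId: string;
  email: string;
  token: string;
  expiresAt: Date;
  acceptedAt: Date | null;
}

// Assumption taken from the plan: invite tokens expire after 48 hours.
const INVITE_TTL_MS = 48 * 60 * 60 * 1000;

// Compute the expiry timestamp for an invite created at `createdAt`.
function expiryFor(createdAt: Date): Date {
  return new Date(createdAt.getTime() + INVITE_TTL_MS);
}

// An invitation is usable only if it has not been accepted yet
// and the current time is strictly before its expiry.
function isInvitationUsable(invite: Invitation, now: Date): boolean {
  return invite.acceptedAt === null && now.getTime() < invite.expiresAt.getTime();
}
```

Writing it out surfaces questions the plan left open, for example whether an invite at exactly the 48-hour mark should still work. This sketch treats it as expired.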
The other side of this is prompt specificity. The more precisely you describe what you want, the less the agent has to infer.
| Vague | Specific |
|---|---|
| “Add payments” | Integrate Stripe Checkout for the Pro plan ($29/month). On success, set user.plan = ‘pro’ and user.stripe_customer_id. On cancellation redirect to /pricing. Use existing BillingService in /services/billing.ts. |
| “Build an API” | REST endpoint POST /api/reports. Accepts { start_date, end_date, metric } in request body. Validates dates with Zod. Queries the events table grouped by day. Returns { data: [{ date, count }], total }. |
| “Fix the slow query” | The GET /api/users endpoint takes 4 seconds. The users table has 800k rows. Add a database index on created_at and rewrite the query to use pagination (limit 50, cursor-based). Don’t change the response shape. |
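A specific prompt is close to a function signature, which is why it leaves so little room for guessing. As an illustration, here is a rough sketch of the core logic that the “Build an API” prompt pins down. It is an in-memory stand-in, not a real endpoint: plain date checks replace Zod, an array replaces the events table, and `EventRow`, `ReportResponse`, and `buildReport` are hypothetical names. The date bounds are compared as instants in UTC, which is also an assumption.

```typescript
// Hypothetical row from the events table.
interface EventRow {
  metric: string;
  occurredAt: string; // ISO 8601 timestamp
}

// Response shape taken from the prompt: { data: [{ date, count }], total }.
interface ReportResponse {
  data: { date: string; count: number }[];
  total: number;
}

function buildReport(
  events: EventRow[],
  startDate: string,
  endDate: string,
  metric: string
): ReportResponse {
  const start = Date.parse(startDate);
  const end = Date.parse(endDate);
  // Plain validation standing in for the Zod schema named in the prompt.
  if (Number.isNaN(start) || Number.isNaN(end) || start > end) {
    throw new RangeError("start_date and end_date must be valid, with start_date <= end_date");
  }

  // Count matching events per UTC calendar day.
  const counts = new Map<string, number>();
  for (const event of events) {
    const t = Date.parse(event.occurredAt);
    if (event.metric !== metric || Number.isNaN(t) || t < start || t > end) continue;
    const day = new Date(t).toISOString().slice(0, 10);
    counts.set(day, (counts.get(day) ?? 0) + 1);
  }

  // Sort days ascending so the output order is stable.
  const data = Array.from(counts.entries())
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([date, count]) => ({ date, count }));
  const total = data.reduce((sum, row) => sum + row.count, 0);
  return { data, total };
}
```

Every branch in this sketch maps to a phrase in the prompt. With the vague version, each of those branches would have been a guess.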
3. Use a Separate Review Agent for Security and Logic
Coding agents are optimized to complete tasks, not to understand why every guardrail exists. Columbia DAPLab has documented recurring failure patterns across major coding agents, including security issues, data management errors, and weak codebase awareness. That makes blind trust dangerous: the same agent that fixes a bug can also remove the check that was preventing a worse one.
The clearest real example of this: in the Replit agent incident of 2025, the autonomous agent deleted a project’s primary production database because it decided the database needed cleanup. It was following its optimization objective. It was also violating an explicit instruction not to modify production data. And unfortunately, no human reviewed what it was about to do.
The agent that wrote your code is not in a good position to catch its own mistakes. Claude Code supports subagents: separate agents that run in completely isolated contexts with no memory of what the main agent built. You define them in .claude/agents/:
---
name: security-reviewer
description: Reviews code for security issues after implementation is complete
tools: Read, Grep, Glob
model: opus
---
You are a senior security engineer doing a pre-ship review.
For every route added or modified, check:
- Is authentication enforced? Can an unauthenticated request reach this?
- Is the user authorized? Can user A access user B's data?
- Is input validated before it hits the database?
- Are there any hardcoded secrets, API keys, or credentials?
Report: file name, line number, specific issue, suggested fix.
Do not summarize. Report every issue you find.
After your main agent finishes building the invitation system:
Use the security-reviewer subagent on all the files we just created or modified.
Here is what a real reviewer output looks like:
/routes/teams.ts line 47
Issue: POST /teams/accept-invite does not verify the token belongs to the
email address of the logged-in user. Any authenticated user who knows a valid
token can accept any invite.
Fix: Add check that invitation.email === req.user.email before accepting.
/services/invitations.ts line 23
Issue: Token generated with Math.random(), which is not cryptographically secure.
Fix: Replace with crypto.randomBytes(32).toString('hex').
Neither of those is likely to be caught by the building agent. Both could have made it to prod.
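For reference, the two suggested fixes are small. The following is a minimal sketch of both, assuming a Node.js runtime. `generateInviteToken` and `canAcceptInvite` are hypothetical helper names, and normalizing case and whitespace before comparing emails is an added assumption that the reviewer output does not mention.

```typescript
import { randomBytes } from "node:crypto";

// Fix for the token issue: 32 bytes from the OS CSPRNG, hex-encoded,
// which yields a 64-character string.
function generateInviteToken(): string {
  return randomBytes(32).toString("hex");
}

// Fix for the authorization issue: only the invited address may accept.
// Assumption: emails are compared case-insensitively after trimming.
function canAcceptInvite(invitationEmail: string, userEmail: string): boolean {
  return invitationEmail.trim().toLowerCase() === userEmail.trim().toLowerCase();
}
```

In a route handler, `canAcceptInvite(invitation.email, req.user.email)` would run before the invitation is marked as accepted.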
Escape.tech’s scan of 5,600 vibe-coded apps found over 400 exposed secrets and 175 instances of PII exposed through endpoints. Most of it is exactly this class of issue: authorization logic that works functionally but has holes.
4. Prompt in Layers, Not in One Big Spec
Role assignment changes what the agent prioritizes. “Build this feature” and “Act as a senior engineer who has been burned by poorly tested payment code before. Build this feature.” produce different outputs. The second will add edge case handling, write more defensive validation, and flag assumptions it isn’t sure about. The model responds to framing.
Build features in layers, not all at once. The standard mistake when building something like a Stripe integration is to ask for the whole thing in one prompt. You get code that compiles but has the billing logic, webhook handling, and database updates tangled together. Instead:
Prompt 1:
Set up the Stripe Checkout session creation only.
Endpoint: POST /api/subscribe
Accepts: { price_id, user_id }
Returns: { checkout_url }
Do not handle webhooks yet. Do not update the database yet. Just the session creation.
Review that. Make sure the Stripe client is initialized correctly, the right price_id is being passed, and the success and cancel URLs point to the right places.
Prompt 2:
Now add the Stripe webhook handler.
Endpoint: POST /api/webhooks/stripe
Handle these events only: checkout.session.completed, customer.subscription.deleted
On checkout.session.completed: set user.plan = 'pro', user.stripe_customer_id = customer id from event
On customer.subscription.deleted: set user.plan = 'free'
Verify the webhook signature using STRIPE_WEBHOOK_SECRET from env.
Review that separately: check the signature verification, and check that the user lookup is correct.
Each layer is reviewable and has a clear scope. If something is wrong, you know exactly where.
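A narrow layer is also easier to test on its own. As a rough illustration, the state change requested in the second prompt fits in one pure function. This is a sketch under simplifying assumptions: `BillingEvent` is a hypothetical, flattened stand-in for a real Stripe event, `applyBillingEvent` is a made-up name, and signature verification is assumed to have happened before this function is called.

```typescript
type Plan = "free" | "pro";

interface User {
  id: string;
  plan: Plan;
  stripeCustomerId: string | null;
}

// Hypothetical, flattened event shape; real Stripe events carry far more fields.
interface BillingEvent {
  type: string;
  customerId: string;
}

// Return the updated user for the two event types in scope.
// Any other event type leaves the user unchanged.
function applyBillingEvent(user: User, event: BillingEvent): User {
  switch (event.type) {
    case "checkout.session.completed":
      return { ...user, plan: "pro", stripeCustomerId: event.customerId };
    case "customer.subscription.deleted":
      return { ...user, plan: "free" };
    default:
      return user;
  }
}
```

Because the function touches neither the network nor the database, a reviewer can check the plan transitions without setting up Stripe at all.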
Use pseudo-code when you know the logic but not the implementation:
Build a rate limiter for the /api/send-invite endpoint.
Logic:
- Key: user_id + current hour (e.g. "user_123_2026041514")
- Limit: 10 invites per hour per user
- On limit exceeded: return 429 with { error: "Rate limit exceeded", retry_after: seconds until next hour }
- Use Redis if available in the project, otherwise in-memory Map is fine
This is more accurate than “add rate limiting to the invite endpoint” because you have specified the key structure, the limit, the error response shape, and the storage preference. There is almost nothing left to guess.
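To show how directly that pseudo-code maps to an implementation, here is a minimal sketch of the in-memory Map variant. It is one possible reading, not a definitive implementation: `InviteRateLimiter`, `hourKey`, and `RateLimitResult` are hypothetical names, the hour bucket is computed in UTC (the pseudo-code does not specify a time zone), and old keys are never cleaned up.

```typescript
const LIMIT_PER_HOUR = 10;
const HOUR_MS = 60 * 60 * 1000;

type RateLimitResult =
  | { allowed: true }
  | { allowed: false; status: 429; body: { error: string; retry_after: number } };

// Key format from the pseudo-code: user_id + current hour,
// e.g. "user_123_2026041514". Assumption: the hour is taken in UTC.
function hourKey(userId: string, now: Date): string {
  const pad = (n: number): string => String(n).padStart(2, "0");
  const stamp =
    String(now.getUTCFullYear()) +
    pad(now.getUTCMonth() + 1) +
    pad(now.getUTCDate()) +
    pad(now.getUTCHours());
  return `${userId}_${stamp}`;
}

class InviteRateLimiter {
  // In-memory storage; a real service would also prune keys from past hours.
  private counts = new Map<string, number>();

  // Record one invite attempt and report whether it is allowed.
  check(userId: string, now: Date = new Date()): RateLimitResult {
    const key = hourKey(userId, now);
    const used = this.counts.get(key) ?? 0;
    if (used >= LIMIT_PER_HOUR) {
      // Seconds remaining until the next UTC hour starts.
      const retryAfter = Math.ceil((HOUR_MS - (now.getTime() % HOUR_MS)) / 1000);
      return {
        allowed: false,
        status: 429,
        body: { error: "Rate limit exceeded", retry_after: retryAfter },
      };
    }
    this.counts.set(key, used + 1);
    return { allowed: true };
  }
}
```

A route handler would call `check(userId)` first and return the 429 body when `allowed` is false. Swapping the Map for Redis changes only the storage lines, since the key format and limit stay the same.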
The majority of developers shipping AI-generated code spend moderate to significant time correcting it. Only around 10% ship it close to as is. These are mostly experienced Claude Code users with tight CLAUDE.md files and structured build sessions.
Read every diff before committing. Run git diff before every commit. When the agent has modified a file you didn’t ask it to touch, either the prompt left room for interpretation or the agent overreached. Both are worth understanding before the code goes anywhere.
Restrict what the agent can access. The permissions.deny block in ~/.claude/settings.json (or a project-level .claude/settings.json) takes rules such as Read(...) and Edit(...) that stop the agent from reading or editing specific paths. A .cursorignore file does the same in Cursor.
{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./.env.production)",
      "Edit(./auth/oauth.py)",
      "Edit(./legacy/**)",
      "Edit(./migrations/**)"
    ]
  }
}
Oh, and migrations deserve special mention. An agent that can write its own migration files can silently alter your database schema. Keep migrations out of reach and write them yourself after reviewing what the agent built.
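Deny rules can be backed up by a check on the diff itself. The following is a small sketch of that idea: given a list of changed paths, it returns the ones that fall under protected locations. `PROTECTED_PREFIXES` and `findProtectedChanges` are hypothetical names, the prefixes are taken from the paths used as examples in this guide, and matching by simple prefix is a deliberate simplification.

```typescript
// Hypothetical list of protected locations, relative to the repo root.
const PROTECTED_PREFIXES = ["auth/oauth.py", ".env", "legacy/", "migrations/"];

// Return every changed path that falls under a protected prefix.
// A leading "./" or "/" is stripped so paths compare consistently.
function findProtectedChanges(
  changedPaths: string[],
  prefixes: string[] = PROTECTED_PREFIXES
): string[] {
  return changedPaths
    .map((path) => path.replace(/^\.?\//, ""))
    .filter((path) => prefixes.some((prefix) => path.startsWith(prefix)));
}
```

The input could come from `git diff --name-only`, one path per line. A non-empty result is a reason to stop and read that part of the diff before committing.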
Test immediately after every feature. Not as a separate task later, right after. “Now write unit tests for the invitation service we just built. Cover: token expiry, duplicate invite to same email, accept with wrong user, accept with expired token.” The agent that just built the feature knows the edge cases. Ask for tests while that context is live.
That is it. Share with whoever needs it. Happy prompting!
