Wiki for Vibe Coded Apps

Vibe Coding Wiki

200+ hrs, 70k lines, non-developer PM. Distilled lessons + community insights.
Tags: #vibe-coding #ai #workflow #security

1. Planning

Write a PRD first. Define goal → roadmap → first milestone. AI cannot one-shot non-trivial work — it needs constrained scope.
Build micro-feature by micro-feature. One independently testable unit per session.
Use OpenSpec + grillme to stress-test your documents before building.

2. Session & Token Management

One session per feature. Long sessions degrade context and burn tokens fast.
Use /compact focus on <X> to trim context mid-session.
Plan with the best model, code with a cheaper one.
- Architecture / reviews → Claude Sonnet, GPT-4o
- Implementation / boilerplate / tests → Codex, Haiku

3. Build Order

UI-First (recommended for solo vibe coding)

Build frontend with mock data → validate AI understood the spec → refine data model → implement backend.

Catches misunderstandings early when they're cheap
Gives fast visual feedback before real logic exists

Backend-First (traditional / team projects)

Build the API → frontend just displays what exists. Cleaner business logic, fewer rewrites.

Contract-First (best of both)

Define OpenAPI spec → build both sides against it independently → use Postman to test endpoints before the frontend exists.
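
A minimal TypeScript sketch of the contract-first idea, assuming a hypothetical to-do API. The names `Todo`, `TodoApi`, and `createMockTodoApi` are illustrative only; in a real project the types would more likely be generated from the OpenAPI spec than written by hand.

```typescript
// Hypothetical contract shared by frontend and backend (names are illustrative).
interface Todo {
  id: string;
  title: string;
  done: boolean;
}

interface TodoApi {
  listTodos(): Promise<Todo[]>;
  createTodo(title: string): Promise<Todo>;
}

// In-memory mock the frontend can build against before the backend exists.
function createMockTodoApi(): TodoApi {
  const todos: Todo[] = [];
  let nextId = 1;
  return {
    async listTodos() {
      return [...todos]; // copy so callers cannot mutate internal state
    },
    async createTodo(title: string) {
      if (title.trim() === "") {
        throw new Error("title must not be empty");
      }
      const todo: Todo = { id: String(nextId++), title, done: false };
      todos.push(todo);
      return todo;
    },
  };
}
```

The real backend client would implement the same `TodoApi` interface, so swapping the mock out later should not require touching UI code.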

Verdict: UI-first works well solo. Backend/contract-first for teams or complex logic.

4. Testing

Rules for CLAUDE.md

| Rule | Why |
| --- | --- |
| Test behaviour, not implementation | Survives refactors |
| Name test after the bug/behaviour it catches | Self-documenting |
| Cover failure paths, not just happy path | Regressions hide in edge cases |
| No loose assertions that pass on broken code | Tests must actually catch bugs |

TDD Flow for Tricky Features

List everything that could go wrong
Ask AI to write a failing test for each
Run it — confirm it fails for the right reason
Ask AI to write the implementation
Confirm test passes

This stops the AI writing code first, then writing tests that rubber-stamp whatever it did.
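
As an illustration of the flow above, here is a sketch in TypeScript with no test framework. The function `parseQuantity` and its 1 to 99 range are hypothetical, and the tiny `runTest` helper only stands in for a real runner. Each test is named after the behaviour it protects, and three of the four cover failure paths.

```typescript
// Hypothetical function under test: parses a quantity typed into a form field.
function parseQuantity(input: string): number {
  const trimmed = input.trim();
  if (!/^\d+$/.test(trimmed)) {
    throw new Error(`invalid quantity: "${input}"`);
  }
  const value = Number(trimmed);
  if (value < 1 || value > 99) {
    throw new Error(`quantity out of range: ${value}`);
  }
  return value;
}

// Minimal stand-in for a test runner: records pass/fail per named test.
const results: { testName: string; passed: boolean }[] = [];
function runTest(testName: string, body: () => void): void {
  try {
    body();
    results.push({ testName, passed: true });
  } catch {
    results.push({ testName, passed: false });
  }
}

// Passes only if the given function throws.
function expectThrows(fn: () => unknown): void {
  try {
    fn();
  } catch {
    return; // threw as expected
  }
  throw new Error("expected function to throw");
}

runTest("accepts a plain integer with surrounding whitespace", () => {
  if (parseQuantity(" 3 ") !== 3) throw new Error("wrong value");
});
runTest("rejects negative quantities", () => {
  expectThrows(() => parseQuantity("-1"));
});
runTest("rejects quantities above the 99 limit", () => {
  expectThrows(() => parseQuantity("100"));
});
runTest("rejects non-numeric input instead of returning NaN", () => {
  expectThrows(() => parseQuantity("abc"));
});
```

In the TDD flow, the four `runTest` calls would exist before `parseQuantity` has a real body, and each should be seen failing once before the implementation is requested.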

Key Practices

Cross-check tests with a second model — the model that wrote the code has the same blind spots
Run Stryker (mutation testing) periodically: it checks that tests actually catch bugs rather than merely confirming existing behaviour
Save one well-crafted test as a reference example for that module — instruct AI to follow it
Automate in CI — runs on every PR
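
To make the mutation-testing idea concrete, here is a hand-rolled sketch in TypeScript. It only illustrates the principle that a tool like Stryker automates; `isAdult`, `isAdultMutant`, and both test functions are hypothetical.

```typescript
// Original implementation.
function isAdult(age: number): boolean {
  return age >= 18;
}

// A "mutant": the same code with one small change (>= became >).
function isAdultMutant(age: number): boolean {
  return age > 18;
}

type AgeCheck = (age: number) => boolean;

// Weak test: only checks values far from the boundary.
function weakTest(impl: AgeCheck): boolean {
  return impl(30) === true && impl(5) === false;
}

// Strong test: also pins down the boundary at exactly 18.
function strongTest(impl: AgeCheck): boolean {
  return weakTest(impl) && impl(18) === true && impl(17) === false;
}

// A mutant is "killed" when a test that passes on the original fails on the mutant.
const weakKillsMutant = weakTest(isAdult) && !weakTest(isAdultMutant);
const strongKillsMutant = strongTest(isAdult) && !strongTest(isAdultMutant);
```

Here `weakKillsMutant` is `false` and `strongKillsMutant` is `true`: the weak test passes on broken code, which is exactly the kind of gap a surviving mutant points at.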

5. Version Control (Git)

Start Git on day one. Starting at 16k lines is too late, and you will regret it.
Commit at every logical checkpoint, not every trivial change (noisy rollbacks)
Feature branches for every new feature or refactor; merge only when tested
Rolling back to a known-good commit is often the fastest fix

6. CLAUDE.md / AGENTS.md

Persistent instruction file the AI reads every session. Without it, the AI repeats the same mistakes across sessions.

Essential rules to include:

- Split components aggressively — one responsibility per component
- No business logic in components — belongs in hooks/services
- Max file size: 300 lines
- Test behaviour, not implementation
- Always cover failure paths
- API keys never reach the browser
- AI-generated content sanitized before rendering

7. Refactoring

Schedule 1–2 refactor days every week or two. Hard to stop shipping features, but unchecked growth tanks quality.
Refactor sessions: remove dead code, split large files, improve naming, reduce duplication.

⚠️ When refactoring with AI:

Explicitly tell it to move code as-is and not touch existing logic. Models will silently delete things. Always diff carefully after a refactor. Never trust "I didn't change any logic."

Watch for:

Files over your line limit (e.g. the 300-line cap from your CLAUDE.md)
Components with multiple responsibilities
Business logic in the wrong layer
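
The line-limit check is easy to automate. The sketch below is one possible shape in TypeScript, assuming file contents have already been read into memory; `findOversizedFiles` and `countLines` are hypothetical helpers, not part of any tool named in this wiki.

```typescript
// Returns the files whose content exceeds the line limit, largest first.
function findOversizedFiles(
  files: Record<string, string>, // path -> file content
  maxLines: number
): { path: string; lines: number }[] {
  return Object.entries(files)
    .map(([path, content]) => ({ path, lines: countLines(content) }))
    .filter((f) => f.lines > maxLines)
    .sort((a, b) => b.lines - a.lines);
}

function countLines(content: string): number {
  if (content === "") return 0;
  // A trailing newline terminates the last line rather than starting a new one.
  const body = content.endsWith("\n") ? content.slice(0, -1) : content;
  return body.split("\n").length;
}
```

Running something like this in CI with the same limit that CLAUDE.md states keeps the rule from quietly drifting.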

8. Code Review & PR Discipline

Run /review + /security-review on every PR — build custom subagents tuned to your codebase's patterns
Review with a different model than the one that wrote the code: the authoring model shares the blind spots of its own output

Review checklist:

Logic correctness
Auth checked on every protected endpoint
No data exposed to client that shouldn't be
Test coverage — what's missing, not just what's there
No hardcoded secrets
Component boundaries respected
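
One way to make "auth checked on every protected endpoint" hard to forget is a deny-by-default route table. The TypeScript sketch below assumes a hypothetical minimal request/response shape; `AppRequest`, `AppResponse`, `requireAuth`, and `buildRoutes` are illustrative names, not a real framework API.

```typescript
interface AppRequest {
  userId?: string; // assumed to be set by upstream session/token verification (not shown)
}

interface AppResponse {
  status: number;
  body: string;
}

type Handler = (req: AppRequest) => AppResponse;

// Wraps a handler so it only runs for authenticated requests.
function requireAuth(handler: Handler): Handler {
  return (req) =>
    req.userId === undefined
      ? { status: 401, body: "authentication required" }
      : handler(req);
}

// Deny by default: every route is wrapped unless explicitly listed as public.
function buildRoutes(
  handlers: Record<string, Handler>,
  publicPaths: string[]
): Map<string, Handler> {
  const routes = new Map<string, Handler>();
  for (const [path, handler] of Object.entries(handlers)) {
    routes.set(path, publicPaths.includes(path) ? handler : requireAuth(handler));
  }
  return routes;
}
```

With this shape, a newly added endpoint is protected unless someone deliberately adds it to `publicPaths`, which is a much easier thing to spot in review than a missing check.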

9. Security

Honest baseline: AI security reviews pass flawed code regularly. For multi-tenant or sensitive-data apps, get a human security review.

Must-Know Concepts

How tokens/sessions are handled and where they expire
Your DB's auth model: Firestore Security Rules, Postgres RLS
What's client-side vs. server-side
Ask AI to explain the traps before you implement, not after

Practical Checklist

Access Control

Single source of truth for allowlists — backend, DB rules, and app all read from it
Per-collection field validation: every write must match an explicit field+type allowlist
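
A sketch of a per-collection field and type allowlist in TypeScript. The `profiles` collection and its fields are made up for illustration, and `validateWrite` is a hypothetical helper; the same rules would also need to be mirrored in the database's own rule layer.

```typescript
type FieldType = "string" | "number" | "boolean";

// Hypothetical allowlist: collection name -> permitted fields and their types.
// Maps are used so inherited keys like "constructor" can never match.
const WRITE_ALLOWLIST = new Map<string, Map<string, FieldType>>([
  [
    "profiles",
    new Map<string, FieldType>([
      ["displayName", "string"],
      ["age", "number"],
      ["newsletter", "boolean"],
    ]),
  ],
]);

// Returns a list of problems; an empty list means the write is allowed.
function validateWrite(collection: string, data: Record<string, unknown>): string[] {
  const schema = WRITE_ALLOWLIST.get(collection);
  if (schema === undefined) {
    return [`collection "${collection}" is not writable`];
  }
  const errors: string[] = [];
  for (const [field, value] of Object.entries(data)) {
    const expected = schema.get(field);
    if (expected === undefined) {
      errors.push(`field "${field}" is not on the allowlist`);
    } else if (typeof value !== expected) {
      errors.push(`field "${field}" must be ${expected}, got ${typeof value}`);
    }
  }
  return errors;
}
```

This sketch checks only presence on the allowlist and primitive type; required fields, length limits, and value ranges would be separate rules.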

API Keys

Keys never reach the browser — browser → your backend → third-party API
Keys in a secret manager (Google Secret Manager, Vercel env, AWS Secrets Manager)
Public endpoints: require secret token + per-IP rate limiting
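
Per-IP rate limiting can be sketched as a fixed-window counter. The TypeScript class below is a hypothetical in-memory version that assumes a single server process and a limit of at least one request per window; a multi-instance deployment would need a shared store instead.

```typescript
// Fixed-window rate limiter keyed by client IP (in-memory; single process only).
// Entries are never evicted here, so a production version would need cleanup.
class RateLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>();
  private readonly maxRequests: number; // assumed to be >= 1
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(ip: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(ip);
    if (entry === undefined || now - entry.windowStart >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.hits.set(ip, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.maxRequests) {
      return false;
    }
    entry.count += 1;
    return true;
  }
}
```

The backend proxy route would call `allow(clientIp)` before forwarding to the third-party API and answer with HTTP 429 when it returns `false`.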

Output & Headers

CSP configured — only your domains + known third parties allowlisted
All AI-generated content sanitized before rendering (XSS prevention)
Security headers: HSTS, X-Frame-Options, CORS strict allowlist
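
For AI-generated text that should appear as plain text, escaping is the simplest form of sanitization. The TypeScript sketch below is a hypothetical `escapeHtml` helper; if the output must render as rich HTML or Markdown, a maintained sanitizer library (DOMPurify is one commonly used option) is the safer choice than hand-rolled code.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escapes text so it renders as literal characters instead of markup.
function escapeHtml(untrusted: string): string {
  return untrusted.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}
```

For example, `escapeHtml('<img src=x onerror="alert(1)">')` returns `&lt;img src=x onerror=&quot;alert(1)&quot;&gt;`, which the browser displays as text rather than executing.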

Secrets in Code

Gitleaks in CI — scans for accidentally committed API keys
Pre-commit hook or CI check catches secrets before they hit the repo
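
To show roughly what a secret scanner does, here is a toy TypeScript version with two illustrative rules. It is not how Gitleaks is implemented and is no substitute for it; `SECRET_RULES` and `scanForSecrets` are hypothetical, and real scanners maintain far larger rule sets.

```typescript
interface SecretRule {
  id: string;
  pattern: RegExp;
}

// Two illustrative rules only: AWS access key IDs and PEM private key headers.
const SECRET_RULES: SecretRule[] = [
  { id: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { id: "private-key-block", pattern: /-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----/ },
];

// Returns one finding per (line, rule) match, with 1-based line numbers.
function scanForSecrets(text: string): { line: number; rule: string }[] {
  const findings: { line: number; rule: string }[] = [];
  text.split("\n").forEach((lineText, index) => {
    for (const rule of SECRET_RULES) {
      if (rule.pattern.test(lineText)) {
        findings.push({ line: index + 1, rule: rule.id });
      }
    }
  });
  return findings;
}
```

A pre-commit hook would run a check like this over the staged diff and block the commit when the findings list is non-empty.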

Linting / Static Analysis

Linter (ESLint, Ruff, clang-tidy) integrated into CI — runs on every build
Compiler catches errors; linter catches code that will become errors

OWASP Top 10 — High-Risk in Vibe Coding

| Risk (OWASP Top 10, 2021 numbering) | AI Vibe Coding Danger |
| --- | --- |
| A01 Broken Access Control | High: AI often skips auth checks on individual endpoints |
| A05 Security Misconfiguration | High: CSP, CORS, headers easy to miss |
| A06 Vulnerable and Outdated Components | High: AI rarely audits dependencies unprompted |
| A09 Security Logging and Monitoring Failures | High: AI rarely adds audit logging unprompted |
| A02 Cryptographic Failures | Medium: usually picks correct libs, but may misconfigure them |

Why AI Can't Fully Cover Security

Attack surface is too diverse (it ranges from broken access control to padding oracle attacks)
Authorization must be specified with zero ambiguity — prompts can't do this reliably
Side-effect bugs (change in one place exposes something elsewhere) are nearly impossible for AI to catch

10. Multi-Model Strategy

| Model | Best For |
| --- | --- |
| Claude | UI/UX, component design, nuanced writing |
| Codex | Complex logic, heavy refactors, token-efficient on large tasks |
| Gemini | Fast large-context codebase evaluation; fresh perspective; analysis only, don't let it build |
| GPT-4o | Cross-checking, planning, reasoning trade-offs |

Write with Model A → Review with Model B — catches what the author misses
Gemini: Run its feedback past Claude/Codex before acting — hit or miss

11. Deployment

Never deploy directly to production.
Preview channel is a mandatory gate: Commit → Preview → Verify → Promote
SHA-locked promotion: deploy script refuses to promote if current commit SHA ≠ verified SHA. Prevents shipping a version you didn't actually test.
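
The SHA lock itself is a small check. The TypeScript sketch below assumes the deploy script obtains `currentSha` from something like `git rev-parse HEAD` and `verifiedSha` from wherever the preview step recorded it; `assertPromotable` is a hypothetical name, and the 40-character check assumes a SHA-1 repository.

```typescript
// Throws unless the commit about to be promoted is exactly the one verified in preview.
function assertPromotable(currentSha: string, verifiedSha: string): void {
  // Assumes SHA-1 object names (40 hex chars); short or abbreviated SHAs are rejected.
  const fullSha = /^[0-9a-f]{40}$/;
  if (!fullSha.test(currentSha) || !fullSha.test(verifiedSha)) {
    throw new Error("expected full 40-character commit SHAs");
  }
  if (currentSha !== verifiedSha) {
    throw new Error(
      `refusing to promote: HEAD is ${currentSha.slice(0, 7)}, ` +
        `but preview verified ${verifiedSha.slice(0, 7)}`
    );
  }
}
```

Requiring full SHAs avoids a false match between two commits that happen to share an abbreviated prefix.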

12. Multi-Agent Workflows

Run parallel agents only on completely unrelated features with zero shared files
Each agent gets its own git worktree (separate directory, separate branch)
Run a status script to flag any file both agents have touched
Never let two agents work on the same thing — conflicts every time

git worktree add ../feature-b-dir feature-b-branch
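
The core of the status script mentioned above could look like this TypeScript sketch. It assumes the changed-file list for each agent has already been collected, for example with `git diff --name-only main...feature-b-branch` where `main` is an assumed base branch name; `findSharedFiles` is a hypothetical helper.

```typescript
// Lists every file touched by two or more agents, sorted by path.
function findSharedFiles(
  changedByAgent: Record<string, string[]> // agent or branch name -> changed file paths
): { file: string; agents: string[] }[] {
  const owners = new Map<string, string[]>();
  for (const [agent, files] of Object.entries(changedByAgent)) {
    // De-duplicate so one agent listing a file twice is not counted as a conflict.
    for (const file of Array.from(new Set(files))) {
      const list = owners.get(file) ?? [];
      list.push(agent);
      owners.set(file, list);
    }
  }
  return Array.from(owners.entries())
    .filter(([, agents]) => agents.length > 1)
    .map(([file, agents]) => ({ file, agents }))
    .sort((a, b) => a.file.localeCompare(b.file));
}
```

Any non-empty result is a signal to pause one of the agents before both keep editing the same file.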

13. TypeScript

Adopt earlier than feels necessary — benefits compound, late migration is painful
Start at the boundaries: API layer → data models → shared utils → inward
AI writes better code when types are defined, because it no longer has to guess at data shapes
Refactors are dramatically safer with type checking
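
"Start at the boundaries" can be as simple as validating untyped JSON once, where it enters the app. The TypeScript sketch below uses a hypothetical `User` shape and `parseUser` function; a schema validation library could do the same job with less hand-written code.

```typescript
interface User {
  id: string;
  email: string;
  isAdmin: boolean;
}

// Validates untyped JSON at the API boundary and returns a typed User, or throws.
function parseUser(raw: unknown): User {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("user payload must be an object");
  }
  const record = raw as Record<string, unknown>;
  const id = record["id"];
  const email = record["email"];
  const isAdmin = record["isAdmin"];
  if (typeof id !== "string") throw new Error("user.id must be a string");
  if (typeof email !== "string") throw new Error("user.email must be a string");
  if (typeof isAdmin !== "boolean") throw new Error("user.isAdmin must be a boolean");
  // Build a fresh object so unexpected extra fields do not leak inward.
  return { id, email, isAdmin };
}
```

Everything downstream of `parseUser` can then rely on the `User` type, and both the compiler and the AI get a concrete shape to work against.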

14. Tooling Reference

| Category | Tool | Notes |
| --- | --- | --- |
| Version control | Git + GitHub/GitLab | Day one |
| Secret scanning | Gitleaks | CI integration |
| Linting | ESLint / Ruff / clang-tidy | Every build |
| Mutation testing | Stryker (JS) | Periodic |
| Error tracking | Sentry | Set up early |
| Analytics | PostHog | Behavioural data |
| API testing | Postman | Backend-independent testing |
| Secret management | GCP Secret Manager / AWS Secrets Manager | Never hardcode |

React Native specific:

Test both Android and iOS from day one
Move from Expo Go to development builds ASAP, because Expo Go behaves differently from real builds

15. Beginner Ramp

Tier 1 — Day One

Use Git, commit often
Write CLAUDE.md before you start
One session per feature
Manual test before moving on

Tier 2 — First Month

Automated unit tests on core logic
Preview/staging environment
Secrets in a secret manager, never in code
Understand your DB auth model

Tier 3 — Once Shipping

TypeScript at the boundaries
Mutation testing (Stryker)
Multi-model PR reviews
Git worktrees for parallel agents

Where to Learn

Ask the AI to explain any term in plain language — fastest teacher available
Fundamentals of Software Architecture (O'Reilly) — theory behind all of this
OWASP Top 10 — free, canonical security reference

Glossary

| Term | Definition |
| --- | --- |
| PRD | Product requirements document: written plan for what you're building, before you build it |
| CLAUDE.md | File that tells your AI the rules to follow every session |
| Context window | How much the AI can "remember" in one conversation; it is finite |
| Token | Small chunk of text (roughly a word fragment) the model reads or writes; words, code, and instructions all cost tokens |
| Git worktree | Multiple branches checked out simultaneously in separate folders |
| CSP | Content Security Policy: browser rule limiting what your site can load/send |
| RLS | Row-Level Security: DB feature enforcing who can read/write which rows |
| XSS | Cross-Site Scripting: injecting malicious code via user input |
| Mutation testing | Deliberately breaking code to verify tests catch the breakage |
| Linter | Finds code patterns likely to cause bugs before they do |
| Gitleaks | Scans git history for accidentally committed secrets |
| API contract | Formal definition of an endpoint's inputs and outputs |
| SHA | Commit hash: unique fingerprint for a specific version of your code |