Wiki for Vibe Coded Apps
Vibe Coding Wiki
200+ hrs, 70k lines, non-developer PM. Distilled lessons + community insights.
Tags: #vibe-coding #ai #workflow #security
1. Planning
- Write a PRD first. Define goal → roadmap → first milestone. AI cannot one-shot non-trivial work — it needs constrained scope.
- Build micro-feature by micro-feature. One independently testable unit per session.
- Use OpenSpec + grillme to stress-test your documents before building.
2. Session & Token Management
- One session per feature. Long sessions degrade context and burn tokens fast.
- Use `/compact focus on <X>` to trim context mid-session.
- Plan with the best model, code with a cheaper one.
  - Architecture / reviews → Claude Sonnet, GPT-4o
  - Implementation / boilerplate / tests → Codex, Haiku
3. Build Order
UI-First (recommended for solo vibe coding)
Build frontend with mock data → validate AI understood the spec → refine data model → implement backend.
- Catches misunderstandings early when they're cheap
- Gives fast visual feedback before real logic exists
Backend-First (traditional / team projects)
Build the API → frontend just displays what exists. Cleaner business logic, fewer rewrites.
Contract-First (best of both)
Define OpenAPI spec → build both sides against it independently → use Postman to test endpoints before the frontend exists.
Verdict: UI-first works well solo. Backend/contract-first for teams or complex logic.
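A minimal TypeScript sketch of how the UI-first and contract-first approaches can share one contract. The `Task` shape, the `TaskApi` interface and `MockTaskApi` are hypothetical names invented for illustration; in a real project the types would be generated from, or checked against, the OpenAPI spec.

```typescript
// Hypothetical contract for one resource (assumption: not from any real spec).
interface Task {
  id: string;
  title: string;
  done: boolean;
}

interface TaskApi {
  listTasks(): Promise<Task[]>;
  createTask(title: string): Promise<Task>;
}

// Mock implementation: lets the UI be built and reviewed before a backend exists.
class MockTaskApi implements TaskApi {
  private tasks: Task[] = [{ id: "1", title: "Write PRD", done: true }];

  async listTasks(): Promise<Task[]> {
    // Return copies so callers cannot mutate the mock's internal state.
    return this.tasks.map((t) => ({ ...t }));
  }

  async createTask(title: string): Promise<Task> {
    if (title.trim() === "") {
      throw new Error("title must not be empty");
    }
    const task: Task = { id: String(this.tasks.length + 1), title, done: false };
    this.tasks.push(task);
    return { ...task };
  }
}

// Pure display helper used by the UI layer.
function formatTask(task: Task): string {
  return task.done ? `[x] ${task.title}` : `[ ] ${task.title}`;
}

// UI code depends only on the TaskApi interface, never on the mock class.
async function renderTaskTitles(api: TaskApi): Promise<string[]> {
  const tasks = await api.listTasks();
  return tasks.map(formatTask);
}
```

When the backend is ready, a real client that implements `TaskApi` can replace the mock without touching `renderTaskTitles`.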
4. Testing
Rules for CLAUDE.md
| Rule | Why |
|---|---|
| Test behaviour, not implementation | Survives refactors |
| Name test after the bug/behaviour it catches | Self-documenting |
| Cover failure paths, not just happy path | Regressions hide in edge cases |
| No loose assertions that pass on broken code | Tests must actually catch bugs |
TDD Flow for Tricky Features
- List everything that could go wrong
- Ask AI to write a failing test for each
- Run it — confirm it fails for the right reason
- Ask AI to write the implementation
- Confirm test passes
This stops the AI writing code first, then writing tests that rubber-stamp whatever it did.
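The flow above might look like this for one small feature. This is a sketch under assumptions: `parseQuantity` is a hypothetical function, and `check` / `expectThrows` are tiny stand-ins for a real test runner such as Vitest or Jest. In practice the tests would be written first and run against an empty stub to confirm they fail for the right reason.

```typescript
// Hypothetical function under test: parses a quantity field from a form.
function parseQuantity(input: string): number {
  const trimmed = input.trim();
  // Only plain non-negative integers are accepted; "1e3", "-2" and "" are rejected.
  if (!/^\d+$/.test(trimmed)) {
    throw new RangeError(`invalid quantity: "${input}"`);
  }
  const value = Number(trimmed);
  if (value < 1 || value > 999) {
    throw new RangeError(`quantity out of range: ${value}`);
  }
  return value;
}

interface CheckResult {
  name: string;
  passed: boolean;
}

// Stand-in for a test runner: a test passes when its body does not throw.
function check(name: string, body: () => void): CheckResult {
  try {
    body();
    return { name, passed: true };
  } catch {
    return { name, passed: false };
  }
}

function expectThrows(fn: () => unknown): void {
  try {
    fn();
  } catch {
    return; // threw as expected
  }
  throw new Error("expected function to throw");
}

// Each test is named after the behaviour (or bug) it guards against.
const results: CheckResult[] = [
  check("accepts a plain integer with surrounding whitespace", () => {
    if (parseQuantity(" 42 ") !== 42) throw new Error("wrong value");
  }),
  check("rejects empty input instead of returning 0", () => {
    expectThrows(() => parseQuantity(""));
  }),
  check("rejects zero quantity", () => {
    expectThrows(() => parseQuantity("0"));
  }),
  check("rejects scientific notation that Number() would accept", () => {
    expectThrows(() => parseQuantity("1e3"));
  }),
];
```

Three of the four tests cover failure paths, which is where a rubber-stamp test written after the code tends to be silent.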
Key Practices
- Cross-check tests with a second model — the model that wrote the code has the same blind spots
- Run Stryker (mutation testing) periodically — confirms tests actually catch bugs, not just confirm existing behaviour
- Save one well-crafted test as a reference example for that module — instruct AI to follow it
- Automate in CI — runs on every PR
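To make the mutation-testing point concrete, here is a hand-made illustration of the idea that a tool like Stryker automates. It does not use Stryker's API; `isAdult`, the mutant and both tests are hypothetical.

```typescript
// Original implementation.
function isAdult(age: number): boolean {
  return age >= 18;
}

// A "mutant": the kind of small change a mutation testing tool makes
// automatically (here, >= became >).
function isAdultMutant(age: number): boolean {
  return age > 18;
}

type Impl = (age: number) => boolean;

// Loose test: passes on the original AND on the mutant, so it cannot catch the bug.
function looseTest(impl: Impl): boolean {
  return impl(30) === true && impl(5) === false;
}

// Boundary test: checks the exact edge, so the mutant fails it.
function boundaryTest(impl: Impl): boolean {
  return impl(18) === true && impl(17) === false;
}

const report = {
  // true: the mutant "survives" the loose test, which signals a weak test.
  looseSurvivesMutant: looseTest(isAdult) && looseTest(isAdultMutant),
  // true: the boundary test passes on the original and "kills" the mutant.
  boundaryKillsMutant: boundaryTest(isAdult) && !boundaryTest(isAdultMutant),
};
```

A surviving mutant is the signal to look for: it means some behaviour can change without any test noticing.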
5. Version Control (Git)
- Start Git on day one. Starting at 16k lines is too late — you will regret it.
- Commit at every logical checkpoint, not every trivial change (noisy rollbacks)
- Feature branches for every new feature or refactor; merge only when tested
- Rolling back to a known-good commit is often the fastest fix
6. CLAUDE.md / AGENTS.md
Persistent instruction file the AI reads every session. Without it, the AI repeats the same mistakes across sessions.
Essential rules to include:
- Split components aggressively — one responsibility per component
- No business logic in components — belongs in hooks/services
- Max file size: 300 lines
- Test behaviour, not implementation
- Always cover failure paths
- API keys never reach the browser
- AI-generated content sanitized before rendering
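As one way to read the last rule, here is a minimal sketch of escaping AI-generated text before it is placed into HTML. It assumes the content should be shown as plain text; `escapeHtml` is a hypothetical helper, and if you need to allow some HTML through, a vetted sanitizer library is the safer choice.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replaces the five characters that can break out of HTML text or attribute
// context. Anything not in the table is left unchanged.
function escapeHtml(untrusted: string): string {
  return untrusted.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}
```

Frameworks such as React already escape interpolated text by default, so this mainly matters wherever raw HTML is assembled by hand.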
7. Refactoring
- Schedule 1–2 refactor days every week or two. Hard to stop shipping features, but unchecked growth tanks quality.
- Refactor sessions: remove dead code, split large files, improve naming, reduce duplication.
⚠️ When refactoring with AI:
Explicitly tell it to move code only and leave existing logic untouched. Models will silently delete things. Always diff carefully after a refactor. Never trust "I didn't change any logic."
Watch for:
- Files over your line limit (e.g. the 300-line cap in CLAUDE.md)
- Components with multiple responsibilities
- Business logic in the wrong layer
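The first item on that list is easy to automate. Below is a sketch of a pure helper that flags files over a line limit; `findOversizedFiles` and `countLines` are hypothetical names, and the function takes file contents as input so that reading the repo from disk stays outside it.

```typescript
interface OversizedFile {
  path: string;
  lines: number;
}

function countLines(content: string): number {
  if (content === "") return 0;
  // A trailing newline terminates the last line rather than starting a new one.
  const normalized = content.endsWith("\n") ? content.slice(0, -1) : content;
  return normalized.split("\n").length;
}

// Returns files whose line count exceeds maxLines, largest first.
function findOversizedFiles(
  files: Record<string, string>,
  maxLines: number,
): OversizedFile[] {
  return Object.entries(files)
    .map(([path, content]) => ({ path, lines: countLines(content) }))
    .filter((f) => f.lines > maxLines)
    .sort((a, b) => b.lines - a.lines);
}
```

The output can be pasted into a refactor session as the list of files to split.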
8. Code Review & PR Discipline
- Run `/review` + `/security-review` on every PR — build custom subagents tuned to your codebase's patterns
- Review with a different model than the one that wrote the code — the author model shares the blind spots of its own output
Review checklist:
9. Security
Honest baseline: AI security reviews pass flawed code regularly. For multi-tenant or sensitive-data apps, get a human security review.
Must-Know Concepts
- How tokens/sessions are handled and where they expire
- Your DB's auth model: Firestore Security Rules, Postgres RLS
- What's client-side vs. server-side
- Ask AI to explain the traps before you implement, not after
Practical Checklist
Access Control
API Keys
Output & Headers
Secrets in Code
Linting / Static Analysis
OWASP Top 10 — High-Risk in Vibe Coding
| Risk | AI Vibe Coding Danger |
|---|---|
| A01 Broken Access Control | High — AI often skips auth checks on individual endpoints |
| A05 Security Misconfiguration | High — CSP, CORS, headers easy to miss |
| A06 Vulnerable and Outdated Components | High — AI rarely audits dependencies unprompted |
| A09 Security Logging and Monitoring Failures | High — AI rarely adds audit logging unprompted |
| A02 Cryptographic Failures | Medium — correct libs, but may misconfigure |
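For the A01 row, a sketch of a per-endpoint authorization check. It assumes a multi-tenant app with a hypothetical `Invoice` resource; all type and function names are invented for illustration. The check is deny-by-default and is called inside the handler itself.

```typescript
interface User {
  id: string;
  tenantId: string;
  role: "member" | "admin";
}

interface Invoice {
  id: string;
  tenantId: string;
  ownerId: string;
}

class ForbiddenError extends Error {}

// Deny by default: access is granted only by an explicit rule below.
function canReadInvoice(user: User, invoice: Invoice): boolean {
  // Tenant isolation comes first; an admin of another tenant gets nothing.
  if (user.tenantId !== invoice.tenantId) return false;
  if (user.role === "admin") return true;
  return invoice.ownerId === user.id;
}

// Every handler that returns an invoice calls the check itself, rather than
// assuming an upstream "is logged in" middleware was enough.
function getInvoiceHandler(user: User, invoice: Invoice): Invoice {
  if (!canReadInvoice(user, invoice)) {
    throw new ForbiddenError("not allowed to read this invoice");
  }
  return invoice;
}
```

A failure-path test for each rule (wrong tenant, wrong owner) is what keeps a later AI edit from quietly removing the check.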
Why AI Can't Fully Cover Security
- Attack surface is too diverse (everything from broken access control to padding-oracle attacks on ciphers)
- Authorization must be specified with zero ambiguity — prompts can't do this reliably
- Side-effect bugs (change in one place exposes something elsewhere) are nearly impossible for AI to catch
10. Multi-Model Strategy
| Model | Best For |
|---|---|
| Claude | UI/UX, component design, nuanced writing |
| Codex | Complex logic, heavy refactors, token-efficient on large tasks |
| Gemini | Fast large-context codebase evaluation; fresh perspective — analysis only, don't let it build |
| GPT-4o | Cross-checking, planning, reasoning trade-offs |
- Write with Model A → Review with Model B — catches what the author misses
- Gemini: Run its feedback past Claude/Codex before acting — hit or miss
11. Deployment
- Never deploy directly to production.
- Preview channel is a mandatory gate: Commit → Preview → Verify → Promote
- SHA-locked promotion: deploy script refuses to promote if current commit SHA ≠ verified SHA. Prevents shipping a version you didn't actually test.
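The SHA lock can be a few lines inside the deploy script. A minimal sketch, assuming the current SHA comes from `git rev-parse HEAD` and the verified SHA was recorded somewhere when the preview was approved (where it is stored is left open); `checkPromotion` is a hypothetical name.

```typescript
// Result of the gate: either the promotion may proceed or it is refused with a reason.
type PromotionCheck =
  | { ok: true; sha: string }
  | { ok: false; reason: string };

function checkPromotion(currentSha: string, verifiedSha: string | null): PromotionCheck {
  if (verifiedSha === null) {
    return { ok: false, reason: "no commit has been verified on preview yet" };
  }
  const current = currentSha.trim().toLowerCase();
  const verified = verifiedSha.trim().toLowerCase();
  // Compare full SHAs; abbreviated hashes are easier to mistype or mix up.
  if (current !== verified) {
    return { ok: false, reason: `HEAD ${current} differs from verified ${verified}` };
  }
  return { ok: true, sha: current };
}
```

The deploy script would exit non-zero whenever `ok` is false, so a commit made after verification cannot be promoted by accident.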
12. Multi-Agent Workflows
- Run parallel agents only on completely unrelated features with zero shared files
- Each agent gets its own git worktree (separate directory, separate branch)
- Run a status script to flag any file both agents have touched
- Never let two agents work on the same thing — conflicts every time
git worktree add ../feature-b-dir feature-b-branch
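The status script mentioned above can be sketched as a pure overlap check. It assumes each agent's changed-file list has already been collected, for example with `git diff --name-only main...feature-b-branch` in each worktree (`main` as the base branch is an assumption); `findOverlappingFiles` is a hypothetical helper.

```typescript
// Returns the files that appear in both agents' change lists, sorted.
function findOverlappingFiles(agentA: string[], agentB: string[]): string[] {
  // Treat "./src/a.ts" and "src/a.ts" as the same path.
  const normalize = (p: string): string => p.trim().replace(/^\.\//, "");
  const seenInA = new Set(agentA.map(normalize));
  const overlap = new Set<string>();
  for (const path of agentB.map(normalize)) {
    if (seenInA.has(path)) overlap.add(path);
  }
  return Array.from(overlap).sort();
}
```

Any non-empty result is a cue to pause one agent before the branches drift further apart.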
13. TypeScript
- Adopt earlier than feels necessary — benefits compound, late migration is painful
- Start at the boundaries: API layer → data models → shared utils → inward
- AI writes better code when types are defined — can't make assumptions about data shapes
- Refactors are dramatically safer with type checking
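A sketch of what "start at the boundaries" can look like: untyped JSON enters as `unknown` and only becomes a typed value after a runtime check. `UserProfile` and both functions are hypothetical; a schema library could replace the hand-written guard.

```typescript
interface UserProfile {
  id: string;
  email: string;
  displayName: string | null;
}

// Type guard: narrows unknown JSON to UserProfile only if every field checks out.
function isUserProfile(value: unknown): value is UserProfile {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["id"] === "string" &&
    typeof v["email"] === "string" &&
    (typeof v["displayName"] === "string" || v["displayName"] === null)
  );
}

// Boundary function: everything past this point can rely on the type.
function parseUserProfile(json: string): UserProfile {
  const parsed: unknown = JSON.parse(json);
  if (!isUserProfile(parsed)) {
    throw new TypeError("response does not match UserProfile");
  }
  return parsed;
}
```

With the shape pinned down at the edge, the AI no longer has to guess what a response contains, and a renamed field shows up as a compile error or a clear runtime failure.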
14. Tooling Reference
| Category | Tool | Notes |
|---|---|---|
| Version control | Git + GitHub/GitLab | Day one |
| Secret scanning | Gitleaks | CI integration |
| Linting | ESLint / Ruff / clang-tidy | Every build |
| Mutation testing | Stryker (JS) | Periodic |
| Error tracking | Sentry | Set up early |
| Analytics | PostHog | Behavioural data |
| API testing | Postman | Backend-independent testing |
| Secret management | GCP Secret Manager / AWS Secrets Manager | Never hardcode |
React Native specific:
- Test both Android and iOS from day one
- Move from Expo Go to development builds ASAP — Expo Go behaves differently from real builds
15. Beginner Ramp
Tier 1 — Day One
- Use Git, commit often
- Write CLAUDE.md before you start
- One session per feature
- Manual test before moving on
Tier 2 — First Month
- Automated unit tests on core logic
- Preview/staging environment
- Secrets in a secret manager, never in code
- Understand your DB auth model
Tier 3 — Once Shipping
- TypeScript at the boundaries
- Mutation testing (Stryker)
- Multi-model PR reviews
- Git worktrees for parallel agents
Where to Learn
- Ask the AI to explain any term in plain language — fastest teacher available
- Fundamentals of Software Architecture (O'Reilly) — theory behind all of this
- OWASP Top 10 — free, canonical security reference
Glossary
| Term | Definition |
|---|---|
| PRD | Written plan for what you're building, before you build it |
| CLAUDE.md | File that tells your AI the rules to follow every session |
| Context window | How much the AI can "remember" in one conversation — finite |
| Token | Chunk of text the model reads or writes, and the unit usage is billed in — words, code, instructions all cost tokens |
| Git worktree | Multiple branches checked out simultaneously in separate folders |
| CSP | Content Security Policy — browser rule limiting what your site can load/send |
| RLS | Row-Level Security — DB feature enforcing who can read/write which rows |
| XSS | Cross-Site Scripting — injecting malicious code via user input |
| Mutation testing | Deliberately breaking code to verify tests catch the breakage |
| Linter | Finds code patterns likely to cause bugs before they do |
| Gitleaks | Scans git history for accidentally committed secrets |
| API contract | Formal definition of an endpoint's inputs and outputs |
| SHA | Unique fingerprint for a specific version of your code |