Wiki for Vibe Coded Apps
Vibe Coding Wiki
200+ hrs, 70k lines, non-developer PM. Distilled lessons + community insights.
Tags: #vibe-coding #ai #workflow #security
1. Planning
- Write a PRD first. Define goal → roadmap → first milestone. AI cannot one-shot non-trivial work — it needs constrained scope.
- Build micro-feature by micro-feature. One independently testable unit per session.
- Use OpenSpec + grillme to stress-test your documents before building.
2. Session & Token Management
- One session per feature. Long sessions degrade context and burn tokens fast.
- Use /compact focus on <X> to trim context mid-session.
- Plan with the best model, code with a cheaper one.
- Architecture / reviews → Claude Sonnet, GPT-4o
- Implementation / boilerplate / tests → Codex, Haiku
3. Build Order
UI-First (recommended for solo vibe coding)
Build frontend with mock data → validate AI understood the spec → refine data model → implement backend.
- Catches misunderstandings early when they're cheap
- Gives fast visual feedback before real logic exists
Backend-First (traditional / team projects)
Build the API → frontend just displays what exists. Cleaner business logic, fewer rewrites.
Contract-First (best of both)
Define OpenAPI spec → build both sides against it independently → use Postman to test endpoints before the frontend exists.
Verdict: UI-first works well solo. Backend/contract-first for teams or complex logic.
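A minimal sketch of the UI-first flow, assuming a hypothetical to-do feature. The UI depends only on a small interface, a mock implementation serves fixture data, and a real backend can later implement the same interface. The names `TodoItem`, `TodoApi`, `MockTodoApi`, and `countOpenTodos` are illustrative and not taken from any real project.

```typescript
// Hypothetical data shape that the UI and the future backend both agree on.
interface TodoItem {
  id: string;
  title: string;
  done: boolean;
}

// The contract: UI code only ever talks to this interface.
interface TodoApi {
  list(): Promise<TodoItem[]>;
  toggle(id: string): Promise<TodoItem>;
}

// Mock implementation backed by in-memory fixture data.
class MockTodoApi implements TodoApi {
  private items: TodoItem[] = [
    { id: "t1", title: "Write PRD", done: true },
    { id: "t2", title: "Build first milestone", done: false },
  ];

  async list(): Promise<TodoItem[]> {
    // Return copies so callers cannot mutate the fixture by accident.
    return this.items.map((item) => ({ ...item }));
  }

  async toggle(id: string): Promise<TodoItem> {
    const item = this.items.find((candidate) => candidate.id === id);
    if (!item) {
      throw new Error(`Unknown todo id: ${id}`);
    }
    item.done = !item.done;
    return { ...item };
  }
}

// UI-side code depends on the interface, never on the mock directly.
async function countOpenTodos(api: TodoApi): Promise<number> {
  const items = await api.list();
  return items.filter((item) => !item.done).length;
}
```

When the backend exists, a second class implementing `TodoApi` over HTTP can replace the mock without touching `countOpenTodos` or any other UI code.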
4. Testing
Rules for CLAUDE.md
| Rule | Why |
|---|---|
| Test behaviour, not implementation | Survives refactors |
| Name test after the bug/behaviour it catches | Self-documenting |
| Cover failure paths, not just happy path | Regressions hide in edge cases |
| No loose assertions that pass on broken code | Tests must actually catch bugs |
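The rules in the table can be sketched as follows. This assumes a hypothetical `parseQuantity` function and uses a tiny hand-written `runTest` helper in place of a real test runner, purely so the example has no dependencies. In a real project the same tests would live in whatever framework the codebase already uses.

```typescript
// Hypothetical function under test: parses a quantity typed into a form.
function parseQuantity(input: string): number {
  const trimmed = input.trim();
  if (!/^\d+$/.test(trimmed)) {
    throw new Error("Quantity must be a whole number");
  }
  const quantity = Number(trimmed);
  if (quantity < 1 || quantity > 99) {
    throw new Error("Quantity must be between 1 and 99");
  }
  return quantity;
}

// Stand-in for a test runner: runs the body and reports pass/fail.
function runTest(name: string, body: () => void): boolean {
  try {
    body();
    return true;
  } catch (error) {
    console.error(`FAIL: ${name}`, error);
    return false;
  }
}

// Passes only if the action throws.
function expectThrows(action: () => unknown): void {
  try {
    action();
  } catch {
    return; // Threw as expected.
  }
  throw new Error("Expected the call to throw, but it returned");
}

// Test names describe the behaviour or bug each test guards against.
const quantityTestResults: boolean[] = [
  runTest("accepts a whole number with surrounding spaces", () => {
    // Exact value, not a loose "is truthy" check.
    if (parseQuantity(" 12 ") !== 12) throw new Error("wrong value");
  }),
  runTest("rejects zero so carts cannot hold empty line items", () => {
    expectThrows(() => parseQuantity("0"));
  }),
  runTest("rejects decimals instead of silently truncating", () => {
    expectThrows(() => parseQuantity("1.5"));
  }),
  runTest("rejects negative numbers", () => {
    expectThrows(() => parseQuantity("-3"));
  }),
];
```

Three of the four tests cover failure paths, and none of them inspects how `parseQuantity` is implemented, so they should survive an internal rewrite.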
TDD Flow for Tricky Features
- List everything that could go wrong
- Ask AI to write a failing test for each
- Run it — confirm it fails for the right reason
- Ask AI to write the implementation
- Confirm test passes
This stops the AI writing code first, then writing tests that rubber-stamp whatever it did.
Key Practices
- Cross-check tests with a second model: the model that wrote the code brings the same blind spots to the tests it writes
- Run Stryker (mutation testing) periodically — confirms tests actually catch bugs, not just confirm existing behaviour
- Save one well-crafted test as a reference example for that module — instruct AI to follow it
- Automate in CI — runs on every PR
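To make the mutation-testing point concrete, here is a hand-rolled illustration of the idea. It does not use Stryker or its API; it only imitates by hand what such a tool automates. The function `isAdult`, its mutant, and both "suites" are hypothetical.

```typescript
type AgeCheck = (age: number) => boolean;

// Original implementation.
const isAdult: AgeCheck = (age) => age >= 18;

// A "mutant": the same code with one operator deliberately changed.
const isAdultMutant: AgeCheck = (age) => age > 18;

// Loose suite: only checks values far away from the boundary.
function looseSuitePasses(check: AgeCheck): boolean {
  return check(30) === true && check(5) === false;
}

// Strict suite: also pins the boundary, where the bug would live.
function strictSuitePasses(check: AgeCheck): boolean {
  return looseSuitePasses(check) && check(18) === true && check(17) === false;
}

// A mutant "survives" when the suite still passes on the broken code.
function mutantSurvives(suite: (check: AgeCheck) => boolean, mutant: AgeCheck): boolean {
  return suite(mutant);
}

console.log("loose suite lets the mutant survive:", mutantSurvives(looseSuitePasses, isAdultMutant));
console.log("strict suite lets the mutant survive:", mutantSurvives(strictSuitePasses, isAdultMutant));
```

A surviving mutant is a sign that the suite confirms existing behaviour without actually constraining it; the strict suite kills the mutant because it asserts on the boundary value.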
5. Version Control (Git)
- Start Git on day one. Starting at 16k lines is too late, and you will regret it.
- Commit at every logical checkpoint, not every trivial change (trivial commits make rollbacks noisy)
- Feature branches for every new feature or refactor; merge only when tested
- Rolling back to a known-good commit is often the fastest fix
6. CLAUDE.md / AGENTS.md
Persistent instruction file the AI reads every session. Without it, the AI repeats the same mistakes across sessions.
Essential rules to include:
- Split components aggressively — one responsibility per component
- No business logic in components — belongs in hooks/services
- Max file size: 300 lines
- Test behaviour, not implementation
- Always cover failure paths
- API keys never reach the browser
- AI-generated content sanitized before rendering
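One way to honour the last rule when model output should be displayed as plain text is to escape it before it reaches the page. The sketch below shows escaping, not HTML sanitization: if markup has to be preserved, a sanitizer such as DOMPurify is the appropriate tool instead. `escapeHtml` and `renderAiMessage` are hypothetical helpers, and the `ai-message` class name is an assumption.

```typescript
// Characters that can change the meaning of HTML, and their entity forms.
const HTML_ESCAPES: { [char: string]: string } = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replaces every HTML-significant character with its entity.
function escapeHtml(untrusted: string): string {
  return untrusted.replace(/[&<>"']/g, (char) => HTML_ESCAPES[char] ?? char);
}

// Treats model output as untrusted text, exactly like user input.
function renderAiMessage(modelOutput: string): string {
  return `<p class="ai-message">${escapeHtml(modelOutput)}</p>`;
}
```

Frameworks such as React already escape interpolated text by default; a helper like this matters mainly where strings are concatenated into HTML by hand.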
7. Refactoring
- Schedule 1–2 refactor days every week or two. Hard to stop shipping features, but unchecked growth tanks quality.
- Refactor sessions: remove dead code, split large files, improve naming, reduce duplication.
⚠️ When refactoring with AI:
Explicitly tell it not to touch existing logic — only port. Models will silently delete things. Always diff carefully after a refactor. Never trust "I didn't change any logic."
Watch for:
- Files over your line limit (e.g. the 300-line cap set in CLAUDE.md)
- Components with multiple responsibilities
- Business logic in the wrong layer
8. Code Review & PR Discipline
- Run /review + /security-review on every PR; build custom subagents tuned to your codebase's patterns
- Review with a different model than the one that wrote the code; the authoring model shares the blind spots in its own output
Review checklist:
9. Security & Anti-Hacking — The Bible
Honest baseline: AI security reviews pass flawed code regularly. AI also flags correct, framework-current code as broken (false positives). For any multi-tenant or sensitive-data app, get a human security review before launch. This section is the canonical reference — work through it top to bottom.
9.0 First Principles (memorize these)
- Never trust the client. Anything from a browser — form fields, query params, headers, cookies, hidden inputs, IDs — is attacker-controlled. The server is the only trust boundary.
- The UI is not a security control. A disabled button, a hidden menu, a missing link: all bypassable with one curl. Every rule must be re-checked on the server.
- Fail closed. On any doubt (missing session, unknown role, error mid-check), deny. Default-deny, then allow explicitly.
- Least privilege. Every user, token, and service gets the minimum access needed. No "admin can do anything" shortcuts.
- Defense in depth. Assume each layer fails. App-level check and rate limit and firewall and monitoring. No single point of trust.
- Verify, don't assume — especially with AI. AI works from stale training data and confidently "fixes" things that aren't broken. Check the current framework docs before changing security-relevant wiring. (See 9.12.)
- Specify authorization with zero ambiguity. "Only admins" is not a spec. "Only an admin of the same tenant as the target record, and never to elevate a user above the caller's own role" is.
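The zero-ambiguity rule quoted in the last principle can be written down as a small fail-closed check. This is a sketch under stated assumptions: a hypothetical four-step role ladder (`viewer`, `member`, `admin`, `owner`), a hypothetical `Actor` built from the server-side session, and a `targetTenantId` that the server read from the stored record rather than from the request body.

```typescript
type TenantRole = "viewer" | "member" | "admin" | "owner";

// Built from the server-side session, never from client input.
interface Actor {
  userId: string;
  tenantId: string;
  role: TenantRole;
}

interface RoleChangeRequest {
  targetUserId: string;
  targetTenantId: string; // Looked up from the stored record on the server.
  newRole: TenantRole;
}

// Higher number means more privilege. Hypothetical ladder.
const ROLE_RANK: Record<TenantRole, number> = { viewer: 0, member: 1, admin: 2, owner: 3 };

// Returns true only when every check passes; any doubt is a denial.
function canChangeRole(actor: Actor | null, request: RoleChangeRequest): boolean {
  try {
    if (!actor) return false; // Missing session: deny.
    const actorRank = ROLE_RANK[actor.role];
    const newRank = ROLE_RANK[request.newRole];
    // Unknown role strings can still arrive at runtime: deny.
    if (actorRank === undefined || newRank === undefined) return false;
    if (actorRank < ROLE_RANK.admin) return false; // Only admins and above.
    if (actor.tenantId !== request.targetTenantId) return false; // Same tenant only.
    return newRank <= actorRank; // Never elevate above the caller's own role.
  } catch {
    return false; // Error mid-check: deny.
  }
}
```

The quoted spec does not say whether an admin may demote someone who currently outranks them; this sketch does not check the target's current role, which a real app would want to decide explicitly.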
9.1 Authentication
9.2 Authorization & Multi-Tenancy — the #1 vibe-coding killer
9.3 Input Handling & Injection
9.4 Secrets & Cryptography
9.5 Rate Limiting, Lockout & fail2ban — anti-brute-force
9.6 Honeypots & Deception — catch scanners before they find anything
9.7 Browser & Transport Hardening (Headers + Cookies)
9.8 Dependencies & Supply Chain
9.9 Data Protection & Privacy
9.10 Logging, Monitoring & Incident Basics
9.11 Testing Security (behaviour, not vibes)
9.12 Working with AI on Security (meta-rules)
- AI passes flawed security code and flags correct code as broken. Treat its security output as a draft, not a verdict.
- VERIFY THE FRAMEWORK against current docs. AI's training data lags releases; it will "fix" wiring that is actually correct for your version (e.g. renamed file conventions, new config APIs) and break it. Read the installed docs before touching framework-level security wiring.
- Review with a different model than the one that wrote the code; the authoring model cannot see its own blind spots.
- Authorization needs zero-ambiguity prompts (see 9.0 #7). Side-effect bugs — a change here that exposes something there — are nearly impossible for AI to catch; that's what human review and tenant-isolation tests are for.
OWASP Top 10 — Highest-Risk in Vibe Coding
| Risk | AI Vibe Coding Danger |
|---|---|
| A01 Broken Access Control | Critical — AI skips per-record/tenant checks; IDOR everywhere |
| A05 Security Misconfiguration | High — CSP, CORS, headers, cookie flags easy to miss |
| A06 Vulnerable Dependencies | High — AI never audits deps unprompted |
| A07 Auth Failures | High — weak reset flows, no rate limiting, account-takeover paths |
| A09 Logging/Monitoring Gaps | High — AI rarely adds security logging or alerting unprompted |
| A02 Cryptographic Failures | Medium — right libs, wrong config; plaintext secrets at rest |
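The A01 row is easiest to see side by side. The sketch below assumes a hypothetical `Invoice` record and an in-memory array standing in for a database table; `getInvoiceUnsafe` shows the IDOR pattern and `getInvoiceForCaller` shows the tenant-scoped version.

```typescript
interface Invoice {
  id: string;
  tenantId: string;
  amountCents: number;
}

// Derived from the server-side session, never from request parameters.
interface Caller {
  userId: string;
  tenantId: string;
}

// Hypothetical in-memory store standing in for a database table.
const invoiceStore: Invoice[] = [
  { id: "inv-1", tenantId: "tenant-a", amountCents: 1200 },
  { id: "inv-2", tenantId: "tenant-b", amountCents: 9900 },
];

// Vulnerable: trusts the ID from the request and nothing else (IDOR).
function getInvoiceUnsafe(invoiceId: string): Invoice | undefined {
  return invoiceStore.find((invoice) => invoice.id === invoiceId);
}

// Safer: the caller's tenant is part of the lookup itself.
function getInvoiceForCaller(caller: Caller, invoiceId: string): Invoice | undefined {
  return invoiceStore.find(
    (invoice) => invoice.id === invoiceId && invoice.tenantId === caller.tenantId,
  );
}
```

Returning "not found" for another tenant's record, rather than "forbidden", avoids confirming that the ID exists. In a SQL-backed app the same idea is a `WHERE id = ? AND tenant_id = ?` clause with both values bound as parameters.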
Pre-Launch Security Checklist (copy-paste)
AUTH
- [ ] Passwords hashed (argon2/scrypt/bcrypt), constant-time compare
- [ ] Unauthenticated forms can never set or change the password on an existing account
- [ ] Reset tokens: random, hashed, single-use, expiring; links from canonical URL
AUTHZ
- [ ] Central authz helpers called before every mutation
- [ ] Every record access scoped to the caller's tenant/ownership (no IDOR)
- [ ] No client input selects whose data/keys are used
- [ ] Server enforces every rule the UI implies; role changes can't escalate
INPUT
- [ ] Parameterized queries only; user HTML sanitized; no eval/unsafe deser
SECRETS
- [ ] No hardcoded secrets; secret manager in prod; keys never in browser
- [ ] Sensitive data encrypted at rest (AES-GCM); key outside the DB
ABUSE
- [ ] Rate limiting per-IP + per-identity on auth/reset/signup
- [ ] Progressive ban/lockout; fail2ban fed by structured logs
- [ ] Honeypot field + decoy routes
HEADERS
- [ ] CSP, HSTS, X-Frame-Options, nosniff, Referrer/Permissions-Policy, COOP
- [ ] Auth cookies HttpOnly+Secure+SameSite; CSRF + CORS handled
SUPPLY CHAIN
- [ ] Dependency audit + gitleaks in CI; single lockfile
DATA
- [ ] No DB/.env committed; PII minimized & encrypted; secrets rotated if leaked
OBSERVABILITY
- [ ] Security events logged (no secrets/PII); error tracking + alerts
- [ ] Tenant-isolation & abuse-path tests; human security review done
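For the ABUSE items, a per-IP plus per-identity limit can be sketched with a fixed-window counter. This is an in-memory illustration only: the class `FixedWindowLimiter`, the helper `allowLoginAttempt`, and the key prefixes are hypothetical, the clock is passed in so the behaviour is easy to test, and a real deployment with more than one server process would need a shared store.

```typescript
interface AttemptWindow {
  windowStart: number; // Timestamp (ms) when the current window opened.
  count: number;       // Attempts seen in the current window.
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, AttemptWindow>();
  private readonly maxAttempts: number;
  private readonly windowMs: number;

  constructor(maxAttempts: number, windowMs: number) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
  }

  // Records one attempt for the key and reports whether it is allowed.
  hit(key: string, nowMs: number): boolean {
    const state = this.windows.get(key);
    if (!state || nowMs - state.windowStart >= this.windowMs) {
      // First attempt, or the previous window expired: start a new one.
      this.windows.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    state.count += 1;
    return state.count <= this.maxAttempts;
  }
}

// A login attempt must pass BOTH limits: per-IP and per-identity.
function allowLoginAttempt(
  limiter: FixedWindowLimiter,
  ip: string,
  email: string,
  nowMs: number,
): boolean {
  // Evaluate both so each counter records the attempt (no short-circuit).
  const ipAllowed = limiter.hit(`ip:${ip}`, nowMs);
  const identityAllowed = limiter.hit(`id:${email.toLowerCase()}`, nowMs);
  return ipAllowed && identityAllowed;
}
```

The per-identity key is what stops an attacker who rotates IP addresses while guessing passwords for one account; the per-IP key is what stops one address from spraying many accounts.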
10. Multi-Model Strategy
| Model | Best For |
|---|---|
| Claude | UI/UX, component design, nuanced writing |
| Codex | Complex logic, heavy refactors, token-efficient on large tasks |
| Gemini | Fast large-context codebase evaluation; fresh perspective — analysis only, don't let it build |
| GPT-4o | Cross-checking, planning, reasoning trade-offs |
- Write with Model A → Review with Model B — catches what the author misses
- Gemini: Run its feedback past Claude/Codex before acting — hit or miss
11. Deployment
- Never deploy directly to production.
- Preview channel is a mandatory gate: Commit → Preview → Verify → Promote
- SHA-locked promotion: the deploy script refuses to promote if the current commit SHA ≠ the verified SHA. This prevents shipping a version you didn't actually test.
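The SHA-locked promotion rule reduces to a small comparison that fails closed. The sketch below assumes the deploy script has already obtained the current commit (for example from `git rev-parse HEAD`) and has stored a record of what was verified on preview. `VerificationRecord`, `PromotionDecision`, and `decidePromotion` are hypothetical names, and the 40-character pattern assumes a repository using SHA-1 object IDs.

```typescript
interface VerificationRecord {
  sha: string;        // Commit that was deployed to preview and verified.
  verifiedBy: string; // Who signed off on the preview.
}

type PromotionDecision =
  | { promote: true; sha: string }
  | { promote: false; reason: string };

// Assumption: full SHA-1 commit IDs (40 hex characters).
const FULL_SHA_PATTERN = /^[0-9a-f]{40}$/;

function decidePromotion(
  currentSha: string,
  verification: VerificationRecord | null,
): PromotionDecision {
  if (!verification) {
    return { promote: false, reason: "No verified preview on record" };
  }
  // Command output often carries a trailing newline, so trim first.
  const current = currentSha.trim().toLowerCase();
  const verified = verification.sha.trim().toLowerCase();
  if (!FULL_SHA_PATTERN.test(current) || !FULL_SHA_PATTERN.test(verified)) {
    // Abbreviated SHAs can be ambiguous, so refuse them outright.
    return { promote: false, reason: "Expected full 40-character commit SHAs" };
  }
  if (current !== verified) {
    return { promote: false, reason: `Commit ${current} was never verified` };
  }
  return { promote: true, sha: current };
}
```

A deploy script would call `decidePromotion` as its first step and exit with a non-zero status whenever `promote` is `false`.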
12. Multi-Agent Workflows
- Run parallel agents only on completely unrelated features with zero shared files
- Each agent gets its own git worktree (separate directory, separate branch)
- Run a status script to flag any file both agents have touched
- Never let two agents work on the same thing — conflicts every time
git worktree add ../feature-b-dir feature-b-branch
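The status script mentioned in the list above could be sketched as a comparison of two changed-file lists. This assumes each list was produced separately, for example by running `git diff --name-only main...<branch>` for each agent's branch; the function name `findSharedFiles` and the idea of normalizing path separators are illustrative choices, not a prescribed implementation.

```typescript
// Returns the files that appear in both change lists, sorted for stable output.
function findSharedFiles(agentAFiles: string[], agentBFiles: string[]): string[] {
  // Trim whitespace and use forward slashes so the same file compares equal.
  const normalize = (path: string): string => path.trim().replace(/\\/g, "/");

  const touchedByA = new Set(
    agentAFiles.map(normalize).filter((path) => path.length > 0),
  );

  const shared = new Set<string>();
  for (const file of agentBFiles.map(normalize)) {
    if (touchedByA.has(file)) {
      shared.add(file);
    }
  }
  return Array.from(shared).sort();
}

// Example input: file lists as they might come out of two worktrees.
const overlapExample = findSharedFiles(
  ["src/auth/login.ts", "src/ui/Button.tsx"],
  ["src/billing/invoice.ts", "src/ui/Button.tsx"],
);
console.log("Files touched by both agents:", overlapExample);
```

Any non-empty result is a signal to stop one agent before the branches diverge further.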
13. TypeScript
- Adopt earlier than feels necessary — benefits compound, late migration is painful
- Start at the boundaries: API layer → data models → shared utils → inward
- AI writes better code when types are defined — can't make assumptions about data shapes
- Refactors are dramatically safer with type checking
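"Start at the boundaries" can look like the sketch below: data arriving from outside is typed as `unknown` and has to pass a parser before the rest of the code may treat it as a known shape. `UserProfile` and `parseUserProfile` are hypothetical, and the hand-written checks stand in for whatever validation approach a project prefers.

```typescript
// The shape the rest of the app is allowed to rely on.
interface UserProfile {
  id: string;
  email: string;
  age: number | null;
}

function isPlainObject(value: unknown): value is { [key: string]: unknown } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Converts untyped boundary data into a UserProfile, or throws.
function parseUserProfile(payload: unknown): UserProfile {
  if (!isPlainObject(payload)) {
    throw new Error("Profile must be an object");
  }
  const { id, email, age } = payload;
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("Profile.id must be a non-empty string");
  }
  if (typeof email !== "string" || email.indexOf("@") === -1) {
    throw new Error("Profile.email must look like an email address");
  }
  // age must be present: either an explicit null or a finite number.
  if (age !== null && (typeof age !== "number" || !Number.isFinite(age))) {
    throw new Error("Profile.age must be a finite number or null");
  }
  return { id, email, age };
}

// Typical use at an API boundary: JSON in, typed value out.
const parsedProfile = parseUserProfile(
  JSON.parse('{"id":"u1","email":"pm@example.com","age":null}'),
);
```

Past this point the compiler knows `parsedProfile.email` is a string, so neither a human nor an AI has to guess at the data shape.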
14. Tooling Reference
| Category | Tool | Notes |
|---|---|---|
| Version control | Git + GitHub/GitLab | Day one |
| Secret scanning | Gitleaks | CI integration + pre-commit hook |
| Dependency audit | npm/pnpm audit, Dependabot, pip-audit | CI, every PR — fail on highs |
| Linting | ESLint / Ruff / clang-tidy | Every build |
| Mutation testing | Stryker (JS) | Periodic |
| Error tracking | Sentry | Set up early |
| Analytics | PostHog | Behavioural data |
| API testing | Postman | Backend-independent testing |
| Secret management | GCP Secret Manager / AWS Secrets Manager | Never hardcode |
| At-rest encryption | AES-256-GCM (libsodium / WebCrypto) | Encrypt API keys/tokens/PII in the DB |
| HTML sanitization | DOMPurify | Before any dangerouslySetInnerHTML |
| Network bans | fail2ban (+ structured app logs) | IP banning at the firewall |
| Headers / CSP | framework header config / helmet | CSP, HSTS, X-Frame-Options, etc. |
React Native specific:
- Test both Android and iOS from day one
- Move from Expo Go to DevBuilds ASAP — Expo Go behaves differently from real builds
15. Beginner Ramp
Tier 1 — Day One
- Use Git, commit often
- Write CLAUDE.md before you start
- One session per feature
- Manual test before moving on
- Never commit .env/databases; no hardcoded secrets
- Auth checked on the server for every protected action (not just the UI)
Tier 2 — First Month
- Automated unit tests on core logic — including tenant-isolation & abuse paths
- Preview/staging environment
- Secrets in a secret manager, never in code
- Understand your DB auth model
- Per-record/tenant authorization (kill IDOR); rate-limit auth flows
- Security headers (CSP/HSTS/…), gitleaks + dependency audit in CI
Tier 3 — Once Shipping
- TypeScript at the boundaries
- Mutation testing (Stryker)
- Multi-model PR reviews
- Git worktrees for parallel agents
- Encrypt sensitive data at rest; progressive bans + fail2ban; honeypots
- Security event logging + alerting; human security review for sensitive apps
Where to Learn
- Ask the AI to explain any term in plain language — fastest teacher available
- Fundamentals of Software Architecture (O'Reilly) — theory behind all of this
- OWASP Top 10 — free, canonical security reference
Glossary
| Term | Definition |
|---|---|
| PRD | Written plan for what you're building, before you build it |
| CLAUDE.md | File that tells your AI the rules to follow every session |
| Context window | How much the AI can "remember" in one conversation — finite |
| Token | Unit of text the model reads and writes (roughly a word fragment); prompts, code, and instructions all consume tokens |
| Git worktree | Multiple branches checked out simultaneously in separate folders |
| CSP | Content Security Policy — browser rule limiting what your site can load/send |
| RLS | Row-Level Security — DB feature enforcing who can read/write which rows |
| XSS | Cross-Site Scripting — injecting malicious code via user input |
| Mutation testing | Deliberately breaking code to verify tests catch the breakage |
| Linter | Finds code patterns likely to cause bugs before they do |
| Gitleaks | Scans git history for accidentally committed secrets |
| API contract | Formal definition of an endpoint's inputs and outputs |
| SHA | Unique fingerprint for a specific version of your code |
| RBAC | Role-Based Access Control — permissions derived from a user's role |
| IDOR | Insecure Direct Object Reference — accessing another user's record by guessing/passing its ID |
| Multi-tenancy | One app serving many isolated customers; data must never cross tenant lines |
| Privilege escalation | Gaining rights beyond what you were granted (e.g. user → admin) |
| Defense in depth | Layering controls so no single failure is fatal |
| Least privilege | Granting the minimum access required, nothing more |
| Fail closed | On error/doubt, deny access rather than allow |
| KDF | Key Derivation Function — slow, salted password hashing (argon2/scrypt/bcrypt) |
| AES-GCM | Authenticated symmetric encryption for data at rest |
| At-rest encryption | Encrypting stored data so a DB leak doesn't expose secrets |
| Rate limiting | Capping how often an action can run, per IP/user, to stop brute force |
| fail2ban | Daemon that bans abusive IPs at the firewall by reading log patterns |
| Honeypot | A decoy field/route that only bots hit, used to detect and ban attackers |
| HSTS | HTTP Strict Transport Security — forces HTTPS |
| CSRF | Cross-Site Request Forgery — tricking a logged-in user's browser into acting |
| SSRF | Server-Side Request Forgery — making the server fetch an attacker's URL |
| Host-header poisoning | Abusing a trusted Host header to forge links (e.g. password resets) |