I had a family events website — about 1,000 pages generated programmatically from a database — and a suspicion that the SEO was held together with duct tape. Screaming Frog confirmed the suspicion. It gave me 200+ findings in a CSV. What it didn't give me was a way to trace those findings to root causes in the codebase, design fixes, ship them, and prove they worked on the live site. So I wrote a PRD, handed it to Claude Code, and ran 13 phases of audit and remediation in about 90 minutes. Here's every phase and every tradeoff.
The Gap Between "Here's What's Wrong" and "Here's Proof It's Fixed"
Ahrefs and Semrush are excellent crawlers with built-in severity scoring, and I use both. But they stop at the report. They'll tell you 14 pages are orphaned. They won't design the cross-linking architecture to fix it. They'll flag truncated titles. They won't trace the root cause to a double brand suffix in your layout template.
For a programmatic site with 1,000+ pages, 24 category hubs, and dynamic sitemaps, I needed the full loop: finding to root cause to fix to verified deployment. Not a report that sits in a backlog for two sprints.
This approach layers on top of your existing stack. It doesn't replace Ahrefs — it closes the loop from finding to verified fix.
How I Structured the Audit
Why a PRD Instead of a Prompt
I tried prompting first. The output was garbage because the agent had no memory of what it had already found. So I wrote a PRD instead — Phase 1 crawlability, Phase 2 Core Web Vitals, all the way through Phase 13 consolidated scoring. Each phase had success criteria and a verification step.
The structure pays off around Phase 8. When the agent maps the internal link graph, it can reference the orphan pages it found in Phase 1 and use that context to design the cross-linking fix in Phase 9. Without the PRD, every phase is a standalone check. With it, the agent builds a compounding picture of the site's health.
I also built in escape hatches: after 3 failed attempts on the same action, try an alternative approach. After 5 iterations without progress, output BLOCKED_NEED_HUMAN. The agent hit both during the audit. They're not theoretical.
Site-Type Detection Changes Everything Downstream
The first phase classifies the site — SaaS marketing site, e-commerce, blog, local aggregation — based on URL patterns and content signals. A local aggregation site needs structured data coverage checks, duplicate content policy for recurring events, cross-category navigation audits. A B2B SaaS site needs none of those and instead needs conversion path analysis and programmatic landing page quality checks.
Most audit frameworks are one-size-fits-all. The site-type classification determines which checklists load for every subsequent phase. That's the one architectural choice that made the rest of the audit actually useful.
13 Phases — What Actually Happened
Phases 1-4: Technical Foundation
Phase 1 — Crawlability. Playwright hit robots.txt, sitemap.xml, and sample pages across every template type. A database query confirmed 2,637 slugged events versus 1,038 in the sitemap. The gap was intentional — the sitemap filters to active events only. Worth checking, but not a bug.
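That sitemap-versus-database comparison is easy to script. Here's a minimal sketch of the sitemap half, assuming a single flat sitemap rather than an index file; the URL is a placeholder.

```ts
// Count URLs in the sitemap to compare against the database's active-event count.
// Placeholder URL; a sitemap index would need one extra loop over child sitemaps.
const xml = await (await fetch("https://www.example.com/sitemap.xml")).text();
const urlCount = (xml.match(/<loc>/g) ?? []).length;
console.log(`sitemap URLs: ${urlCount}`); // compare to the count of active events in the DB
```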
The homepage had no canonical tag. Every other page's canonical pointed to the non-www domain, but the site is served from the www subdomain. Google treats those as two different sites. I traced it to the metadataBase property in the layout template pointing to the wrong domain variant. One-line fix, sitewide impact.
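In a Next.js App Router layout, the shape of that fix is roughly this (the domain is a placeholder); the missing homepage canonical is its own one-liner, an alternates: { canonical: "/" } entry in the homepage's metadata.

```tsx
// app/layout.tsx, sketch only; the real domain is a placeholder here.
import type { Metadata } from "next";

export const metadata: Metadata = {
  // Relative canonicals resolve against metadataBase, so this single URL
  // decides www vs non-www for every page on the site. It previously pointed
  // at the non-www variant; the site serves from www.
  metadataBase: new URL("https://www.example.com"),
};
```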
Phase 2 — Core Web Vitals. DOM sizes: 225-286 nodes, well under the 1,500 threshold. TTFB: 14ms on the homepage thanks to Vercel edge cache.
The fix that mattered: hero images on event pages used loading="lazy" on above-fold content. That tanks LCP. The images also used raw <img> tags instead of Next.js Image, missing WebP auto-conversion and srcset entirely.
Removing loading="lazy" from the hero took 30 seconds. The full next/image migration was a separate effort because the images came from external URLs across dozens of source domains, and switching required configuring remotePatterns for each one.
The priority system forced the decision: lazy-loading fix goes P0 (30 seconds, high impact), full image pipeline migration goes P2 (high effort, moderate impact). Without that framework, both fixes sit in the same backlog with the same ambiguous priority.
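The P0 half is literally deleting loading="lazy" from the hero markup (or passing priority once it's a next/image). The P2 half is the config burden that justified deferring it; a sketch with placeholder hostnames:

```ts
// next.config.ts, sketch; hostnames are placeholders.
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  images: {
    // next/image won't optimize an external URL unless its host is whitelisted.
    // One entry per source domain, across dozens of sources, is why the full
    // migration went P2 instead of P0.
    remotePatterns: [
      { protocol: "https", hostname: "images.example-source.com" },
      { protocol: "https", hostname: "cdn.another-source.org" },
    ],
  },
};

export default nextConfig;
```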
Phase 3 — Security. Five of six headers present. Only HSTS missing. Vercel enforces HTTPS at the edge anyway. P2.
Phase 4 — URL Structure. Zero uppercase slugs. Max slug length 91 characters — slightly over the 80 guideline, but the date suffix adds value for event pages. Trailing slashes redirected correctly. Confirmed the canonical mismatch from Phase 1.
Phases 5-7: On-Page Edits
Phase 5 — Heading Structure. Every page had exactly one H1 and no hierarchy violations. But the homepage H1 was "Your Weekend, Sorted." — zero keywords. Hub pages were strong: "Museums & Attractions for Families in Baltimore." The homepage heading was valid HTML but invisible to search. Rewrote it to include the primary keyword. Simple, but manual scanning would have approved the old version as "good enough."
Phase 6 — Title Tags and Meta Descriptions. This was the biggest on-page win.
Every page title had | BrandName | Brand Name appended — the title template concatenated the brand twice. Event detail page titles hit 82 characters, well past Google's truncation point. One template edit fixed 1,000+ titles simultaneously.
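With Next.js title templates, the failure mode is typically some version of this; the brand and copy below are placeholders, not the real template.

```tsx
// app/layout.tsx, sketch; "BrandName" stands in for the real brand.
import type { Metadata } from "next";

export const metadata: Metadata = {
  title: {
    default: "BrandName",
    // Pages should pass only their own portion ("Storytime at the Library, Oct 12")
    // and let the template append the brand exactly once. The bug: event page
    // titles already ended in "| BrandName", so every programmatic page rendered
    // "... | BrandName | BrandName" and blew past the truncation point.
    template: "%s | BrandName",
  },
};
```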
Second finding: raw database slugs leaking into meta descriptions. Event descriptions contained anne_arundel_county instead of "Anne Arundel County." A formatting function existed in the codebase but wasn't called in the metadata generation path. The agent traced the call chain from page component to metadata function and found the gap. You'd only see this on events from certain counties where the slug contains underscores. Sampling 10 pages manually, you'd probably miss it.
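A sketch of that gap, where formatSlug and getEvent are stand-ins for the real helpers rather than the actual codebase; the fix is calling the formatter inside generateMetadata instead of interpolating the raw slug.

```tsx
import type { Metadata } from "next";

// "anne_arundel_county" -> "Anne Arundel County"
function formatSlug(slug: string): string {
  return slug
    .split(/[_-]+/)
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(" ");
}

// Stand-in for the real data-access layer.
declare function getEvent(slug: string): Promise<{ name: string; countySlug: string }>;

export async function generateMetadata(
  { params }: { params: { slug: string } }
): Promise<Metadata> {
  const event = await getEvent(params.slug);
  return {
    // Before: the raw slug was interpolated directly ("... in anne_arundel_county").
    description: `${event.name} in ${formatSlug(event.countySlug)}: dates, details, and nearby family events.`,
  };
}
```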
Phase 7 — Image Optimization. Confirmed Phase 2: event hero images used raw <img>, external URLs, loading="lazy", no srcset. The gap was specifically in the programmatically generated pages where the template had been built fast and never revisited.
Phases 8-12: Internal Linking
Phase 8 — Link Graph Mapping. 14 of 24 category hub pages were orphaned. Zero incoming links from anywhere except the sitemap.
The homepage linked to zero hub pages. The footer linked to zero hub pages. Event detail pages linked to zero hub pages. Link equity pooled at the top and never flowed down to the pages actually targeting keywords.
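Assuming the crawl output is a map of each page URL to the internal URLs it links out to, the orphan check itself is tiny; the /category/ filter below is a placeholder for however your hub pages are identified.

```ts
// Given a crawled internal link graph, find pages nothing links to.
function findOrphans(linkGraph: Map<string, string[]>): string[] {
  const linkedTo = new Set<string>();
  for (const targets of linkGraph.values()) {
    for (const target of targets) linkedTo.add(target);
  }
  // A page is orphaned if it was crawled but never appears as a link target.
  return [...linkGraph.keys()].filter((page) => !linkedTo.has(page));
}

// Hub pages only, assuming they live under a /category/ path:
// findOrphans(graph).filter((url) => url.includes("/category/"));
```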
This is where the PRD structure earns its keep. The agent connected the orphan pages from Phase 1 to the missing internal links in Phase 8 and used both to design the cross-linking architecture in Phase 9. Those aren't three separate problems — they're one structural gap.
Phases 9-11 — Cross-Linking Architecture. Five interventions:
- Event detail pages: "More [event_type] events in [location_area]" linking to the relevant hub
- 4-level breadcrumbs: Home > Category Type > Specific Category > Event Name
- Hub pages cross-linking to related hubs in other taxonomy types
- Footer category section on every page
- Homepage category grid with visual navigation to all hub types
The audit phase produced the entire remediation spec. Implementation was execution, not design.
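As one concrete slice of that spec, here's roughly what the breadcrumb piece could look like as a component with matching BreadcrumbList markup; the props, URL shapes, and domain are placeholders, not the production code.

```tsx
type Crumb = { name: string; href: string };

export function Breadcrumbs({ crumbs }: { crumbs: Crumb[] }) {
  // Emit BreadcrumbList markup so the trail is a machine-readable signal,
  // not just a navigation aid.
  const jsonLd = {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: crumbs.map((c, i) => ({
      "@type": "ListItem",
      position: i + 1,
      name: c.name,
      item: `https://www.example.com${c.href}`,
    })),
  };

  return (
    <nav aria-label="Breadcrumb">
      <script
        type="application/ld+json"
        dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd) }}
      />
      {crumbs.map((c, i) => (
        <span key={c.href}>
          {i > 0 && " > "}
          <a href={c.href}>{c.name}</a>
        </span>
      ))}
    </nav>
  );
}
```

On an event page, crumbs carries the four levels above: Home, category type, specific category, event name.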
Phase 12 — Checkpoint. Footer category links went P0 (every page benefits). Breadcrumbs went P0 (structural SEO signal). Homepage grid went P2 (single page, lower urgency). Deployed and verified on the live site before moving on.
Phases 13-18: Content and E-E-A-T
Phase 13 — Duplicate Content. 170+ events shared identical descriptions. Chain events with templated source copy — Storytime at one chain had 56 duplicates, a kids workshop at another had 35.
I decided to keep them indexed. Each page adds unique value through classification metadata, Schema.org Event markup, cross-category links, and different dates. Google's helpful content guidelines basically say aggregation is fine if you're adding real value beyond the raw listing. We are. 38% of events had descriptions under 100 characters — thin by word count, but each is wrapped in structured data, related events, breadcrumbs, and cross-category navigation. We'll publish Search Console data once we have 90 days of indexation history.
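To make "wrapped in structured data" concrete, this is the rough shape of the Event markup on a detail page; every value is illustrative rather than pulled from the live site.

```tsx
const eventJsonLd = {
  "@context": "https://schema.org",
  "@type": "Event",
  name: "Storytime at the Library",
  startDate: "2025-10-12T10:00",
  location: {
    "@type": "Place",
    name: "Example Branch Library",
    address: { "@type": "PostalAddress", addressLocality: "Baltimore", addressRegion: "MD" },
  },
  eventAttendanceMode: "https://schema.org/OfflineEventAttendanceMode",
};

export function EventStructuredData() {
  return (
    <script
      type="application/ld+json"
      dangerouslySetInnerHTML={{ __html: JSON.stringify(eventJsonLd) }}
    />
  );
}
```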
Phase 16 — E-E-A-T Assessment.
- Experience: "Built by a local dad" on the About page. Present but thin.
- Expertise: No author attribution anywhere. "AI-curated" framing without explaining the editorial process.
- Authoritativeness: 310+ event sources aggregated but no contact info visible, no social proof.
- Trustworthiness: Privacy policy, HTTPS, transparent about AI use. Strongest dimension.
Fixes: contact email in footer, editorial disclosure page, social links and author attribution, About page rewrite, FAQ page with FAQ schema markup.
Three PRDs, Not One
PRD 1: Structural Fixes
P0 fixes (canonical URL, title suffix, hero loading, meta descriptions) deployed in one commit — 13 files changed. P1 internal linking (footer categories, breadcrumbs, cross-links, homepage grid) deployed in one commit — 9 files. P2 technical improvements plus E-E-A-T fixes deployed in one commit — 7 files plus a new FAQ page.
Every commit verified on the live site via Playwright screenshots. Not "trust me, I checked" — actual before/after visual proof that the canonical tag renders, the breadcrumbs appear, the footer links exist.
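Not the actual verification script, but the check amounts to something like this with Playwright; the URL and selectors are placeholders.

```ts
import { chromium } from "playwright";

const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto("https://www.example.com/", { waitUntil: "networkidle" });

// Did the canonical fix actually render on the live page?
const canonical = await page.locator('link[rel="canonical"]').getAttribute("href");
console.log("canonical:", canonical); // expect the www variant

// Are the new footer category links present?
console.log("footer hub links:", await page.locator('footer a[href*="/category/"]').count());

// Visual proof for the before/after record.
await page.screenshot({ path: "homepage-after.png", fullPage: true });
await browser.close();
```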
PRD 2: Keyword Optimization
After structural fixes, a second PRD rewrote copy across every page type: title tags, meta descriptions, H1s, intro paragraphs.
- Homepage H1: "Your Weekend, Sorted." became "[City] Family Events — Your Weekend, Sorted."
- Event type hubs: "Museums & Attractions" became "Best Museums & Attractions for Kids in [City]"
- Area hubs got search-intent-matched descriptions instead of bare "Events in [Area]"
Each PRD fed the next one. Audit surfaces findings, remediation fixes structure, optimization sharpens copy.
PRD 3 Would Have Been Competitive SERP Analysis
I deferred it. That was a mistake. Knowing what competitors rank for should shape anchor text and keyword choices from Phase 1. Next iteration front-loads it.
Where AI Earned Its Keep — and Where It Didn't
Pattern detection at scale. Finding the double brand suffix across 1,000+ titles. Finding 170 duplicate descriptions across 2,600+ events. Catching database slug formatting buried in the metadata generation path. These are needle-in-a-haystack problems where exhaustive checking beats sampling.
Cross-phase reasoning. An agent holding 13 phases in context sees connections that a human auditing one checklist at a time would miss. The orphan pages weren't just a crawlability issue — they were a symptom of missing internal linking architecture. That insight emerged from the structure of the PRD, not from any single phase.
Verification. Playwright screenshots confirmed every fix on the live production site. No ambiguity about whether the deploy worked.
Root cause tracing. The agent didn't just flag "titles too long." It traced the double suffix to the layout template, the slug formatting to a missing function call in the metadata path, the orphan pages to a footer that linked to zero hub pages.
But the agent flagged 47 things. I maybe cared about 12.
HSTS missing? Technically correct. Vercel enforces HTTPS at the edge. P2. Thirty-eight percent thin content? Technically correct. The aggregation value justified keeping it indexed. E-E-A-T assessment? The agent checked for the presence of contact info and author attribution. It couldn't judge whether a thin About page narrative would actually build trust with a real visitor. Infrastructure dependencies — Search Console verification, analytics environment variables — sit outside the codebase entirely.
The agent finds everything. You decide what matters for your specific site.
The Reusable Pattern
When This Works
- Programmatic SEO sites with 500+ pages generated from a database
- Post-deployment audits after a major structural change
- Post-acquisition site consolidation where you inherit multiple domains
- SaaS sites with 200+ programmatic landing pages (city pages, integration pages, template galleries)
For context on how programmatic SEO scales and where it breaks, Ahrefs has a solid primer.
When This Is Overkill
- A 50-page marketing site needs 30 minutes with a standard SEO audit checklist, not a 13-phase PRD
- If you already have a technical SEO specialist running quarterly audits, this adds methodology without new capability
- If your bottleneck is execution capacity, not audit quality, fix your sprint planning first
The 5 Things That Actually Matter
- Write a PRD, not a prompt. Phases with outcomes, success criteria, verification steps. Include escape hatches for when the agent gets stuck. This is the one structural choice that makes everything else work.
- Classify the site type first. The checklist for a local aggregation site is different from a SaaS marketing site. Load the right one.
- Close the loop. Audit PRD produces findings. Remediation PRD consumes them. Screenshots prove the fix shipped. If you stop at the report, you've done the easy half.
- Deploy in checkpoints. Every 4-6 phases, ship and verify on the live site. Don't batch 13 phases of fixes into one deploy.
- Document decisions, not just findings. "HSTS missing" is a finding. "HSTS missing, rated P2 because Vercel enforces HTTPS at the edge" is a decision. Decisions are what make audits useful six months later.
What I'd Do Differently
Front-load competitive SERP analysis. I deferred it and regretted it. Knowing what competitors rank for should shape anchor text and keyword targeting from Phase 1.
Run a CRO audit in parallel. The SEO audit catches what Googlebot sees. A conversion audit catches what a human decides. Running them together gives the complete picture.
Build monitoring into the PRD. The final phase documented what to track but didn't automate it. Next version includes a SERP monitoring script that runs weekly and alerts on ranking changes.
Budget for three PRDs from the start. The keyword optimization PRD emerged because remediation revealed copy gaps I hadn't anticipated. Plan for audit, structure, and copy from day one.
Thirteen phases sounds like a lot. It was — but it was structured work, not sprawl. The PRD gave the agent a map. The checklists gave each phase pass/fail criteria. The remediation loop closed the gap between "here are your problems" and "here's proof they're fixed." And the tradeoff documentation means the next person who picks this up knows exactly why each call got made. The checklists in this article are the same ones I actually run. That felt worth sharing.
Victor Sowers is the founder of STEEPWORKS, where he builds AI-native GTM systems for operators who'd rather engineer results than guess at them.
