Sitemap and Robots Reconciliation
Generate the sitemap from the live route source and align robots rules so crawlers see exactly the canonical, indexable pages — no stale URLs, no orphans, no public pages accidentally blocked.
Verified HMX-owned case
Outcome signals
These are the real outcome statements attached to this HMX case study.
- Generated: sitemap built from live routes, never stale
- Aligned: robots and canonicals agree
- No orphans: every public page is discoverable
- Protected: admin and utility paths stay out of crawl
Case architecture
Sitemap and Robots Reconciliation Architecture
- 01 Generate the sitemap from the live route source
Generate the sitemap from the live route and content sources, not a static list.
- 02 Include lastmod and only canonical, indexable URLs
- 03 Next.js sitemap.ts (metadata route)
The Next.js sitemap.ts metadata route builds the sitemap from the same typed route and content sources the site renders from.
- 04 Next.js robots.ts
Reconcile robots rules to allow public pages and disallow admin/utility paths.
- 05 Fallback path
When automation confidence is low, route the record to a manual owner with the source, stage, and last action attached.
- 06 Live site
Generated: sitemap built from live routes, never stale. Aligned: robots and canonicals agree. No orphans: every public page is discoverable. Protected: admin and utility paths stay out of crawl.
Problem
The operating gap
A hand-maintained sitemap drifts from reality: it lists deleted URLs, omits new pages, and contradicts robots rules that block pages meant to be indexed or expose ones that should be private. Crawlers waste budget and miss real content.
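The drift described above can be made concrete with a small comparison. This is a minimal sketch, not the case study's actual tooling: `findSitemapDrift`, the `SitemapDrift` shape, and the example.com URLs are all hypothetical names chosen for illustration.

```typescript
// Hypothetical helper: compares a hand-maintained sitemap list with the
// URLs the site actually serves and reports both kinds of drift.
interface SitemapDrift {
  stale: string[];   // listed in the sitemap but no longer served
  missing: string[]; // served publicly but absent from the sitemap
}

function findSitemapDrift(
  sitemapUrls: readonly string[],
  liveUrls: readonly string[],
): SitemapDrift {
  const listed = new Set(sitemapUrls);
  const live = new Set(liveUrls);
  return {
    stale: sitemapUrls.filter((url) => !live.has(url)),
    missing: liveUrls.filter((url) => !listed.has(url)),
  };
}

// Illustrative data only (assumption): one deleted page, one new page.
const drift = findSitemapDrift(
  ["https://example.com/", "https://example.com/old-offer"],
  ["https://example.com/", "https://example.com/services"],
);
// drift.stale holds the deleted URL; drift.missing holds the new page.
```

When the sitemap is generated from the route source, both lists are empty by construction, which is the point of the build described next.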
Build
What gets built
Generate sitemap.ts programmatically from the same typed route and content sources the site renders from, so it can never list a dead or missing URL. Reconcile robots.ts to allow exactly the public, canonical pages and disallow admin and utility routes, keeping crawl directives and canonical tags in agreement.
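As a rough sketch of that build, the functions below derive both the sitemap entries and the robots rules from one typed route list. The `RouteRecord` shape, the sample routes, and the `https://example.com` origin are assumptions for illustration. `SitemapEntry` and `RobotsFile` are local types that mirror the general shape of Next.js `MetadataRoute.Sitemap` and `MetadataRoute.Robots`, so the sketch runs without the framework.

```typescript
// Local stand-ins for the Next.js metadata route return shapes (assumption:
// declared here only so the example is self-contained).
interface SitemapEntry {
  url: string;
  lastModified: Date;
}
interface RobotsFile {
  rules: { userAgent: string; allow: string[]; disallow: string[] };
  sitemap: string;
}

// Hypothetical typed route source shared with the page renderer.
interface RouteRecord {
  path: string;
  updatedAt: Date;
  indexable: boolean;     // false for admin and utility routes
  canonicalPath?: string; // set when the canonical tag points at another path
  retired?: boolean;      // true for routes that are gone
}

const BASE_URL = "https://example.com"; // placeholder origin

const routes: RouteRecord[] = [
  { path: "/", updatedAt: new Date("2024-01-10"), indexable: true },
  { path: "/services", updatedAt: new Date("2024-02-01"), indexable: true },
  { path: "/services-old", updatedAt: new Date("2023-06-01"), indexable: true, canonicalPath: "/services" },
  { path: "/old-offer", updatedAt: new Date("2022-03-15"), indexable: true, retired: true },
  { path: "/admin", updatedAt: new Date("2024-02-01"), indexable: false },
];

// Keep only live, indexable routes whose canonical is themselves.
function buildSitemap(source: readonly RouteRecord[], baseUrl: string): SitemapEntry[] {
  return source
    .filter((r) => r.indexable && !r.retired)
    .filter((r) => r.canonicalPath === undefined || r.canonicalPath === r.path)
    .map((r) => ({ url: `${baseUrl}${r.path}`, lastModified: r.updatedAt }));
}

// Disallow exactly the routes the same source marks as non-indexable.
function buildRobots(source: readonly RouteRecord[], baseUrl: string): RobotsFile {
  return {
    rules: {
      userAgent: "*",
      allow: ["/"],
      disallow: source.filter((r) => !r.indexable).map((r) => r.path),
    },
    sitemap: `${baseUrl}/sitemap.xml`,
  };
}
```

In a Next.js app, `app/sitemap.ts` and `app/robots.ts` would each default-export a function returning these values, typed with `MetadataRoute` from `next`. Because both files read the same route list, a route cannot be listed in the sitemap and disallowed in robots at the same time.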
Build steps
Sitemap and Robots Reconciliation uses the route, data, and conversion layers of a full-stack website. The build connects sitemap generation, the Next.js sitemap.ts and robots.ts metadata routes, and the live site through an explicit control path.
- 01 Generate the sitemap from the live route and content sources, not a static list
- 02 Include lastmod and only canonical, indexable URLs
- 03 Reconcile robots rules to allow public pages and disallow admin/utility paths
- 04 Confirm robots and canonical tags agree on what is indexable
- 05 Exclude retired and gone routes so crawlers stop chasing them
- 06 Run route QA to confirm every sitemap URL resolves
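Steps 03 and 04 amount to a consistency check between three sources of truth: the sitemap, the robots disallow rules, and each page's canonical tag. The sketch below is one hypothetical way to express that check. `PageFacts`, `Conflict`, and `findConflicts` are illustrative names, and the robots matching is simplified to plain path-prefix comparison, which ignores the `*` and `$` patterns that real robots rules may use.

```typescript
// Facts about one URL listed in the sitemap (hypothetical shape).
interface PageFacts {
  path: string;          // path as listed in the sitemap
  canonicalPath: string; // path named by the page's canonical tag
}

type Conflict =
  | { kind: "blocked-by-robots"; path: string; rule: string }
  | { kind: "canonical-mismatch"; path: string; canonicalPath: string };

// Reports every sitemap entry that robots would block, or whose canonical
// tag points somewhere else. An empty result means the three sources agree.
function findConflicts(
  pages: readonly PageFacts[],
  disallowPrefixes: readonly string[],
): Conflict[] {
  const conflicts: Conflict[] = [];
  for (const page of pages) {
    // Simplified matching (assumption): a rule blocks any path it prefixes.
    const rule = disallowPrefixes.find((prefix) => page.path.startsWith(prefix));
    if (rule !== undefined) {
      conflicts.push({ kind: "blocked-by-robots", path: page.path, rule });
    }
    if (page.canonicalPath !== page.path) {
      conflicts.push({
        kind: "canonical-mismatch",
        path: page.path,
        canonicalPath: page.canonicalPath,
      });
    }
  }
  return conflicts;
}

// Illustrative input: one clean page, one blocked page, one non-canonical page.
const conflicts = findConflicts(
  [
    { path: "/services", canonicalPath: "/services" },
    { path: "/admin/reports", canonicalPath: "/admin/reports" },
    { path: "/pricing-old", canonicalPath: "/pricing" },
  ],
  ["/admin", "/api"],
);
```

Running a check like this in CI would surface a disagreement before deploy rather than after crawlers have already seen it.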
Stack
Tools and layers
- Next.js sitemap.ts (metadata route)
- Next.js robots.ts
- Typed route/content sources
- Canonical URL alignment
- Route-QA script
- Vercel
- Experience layer: Generate the sitemap from the live route and content sources, not a static list
- Server layer: Include lastmod and only canonical, indexable URLs
- Database layer: Typed route and content sources feed the Next.js sitemap.ts metadata route, so the public pages and the sitemap draw on the same data.
- Automation layer: Next.js robots.ts handles the crawl rules, while sitemap.ts is generated programmatically from the same typed route and content sources the site renders from, so it can never list a dead or missing URL.
- Measurement layer: Generated: sitemap built from live routes, never stale. Aligned: robots and canonicals agree. No orphans: every public page is discoverable. Protected: admin and utility paths stay out of crawl.
Data flow
- 01 Generate the sitemap from the live route and content sources, not a static list
- 02 Include lastmod and only canonical, indexable URLs
- 03 Reconcile robots rules to allow public pages and disallow admin/utility paths
- 04 Confirm robots and canonical tags agree on what is indexable
- 05 Exclude retired and gone routes so crawlers stop chasing them
- 06 Run route QA to confirm every sitemap URL resolves
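The last step of the flow, route QA, could look something like the sketch below. It is not the project's actual Route-QA script: `runRouteQa`, `StatusLookup`, and the stubbed status table are hypothetical, and the HTTP call is injected as a function so the example runs without network access. Treating anything other than 200 as failing is also an assumption. It flags redirects as well as errors, on the reasoning that a sitemap should list final URLs.

```typescript
// A function that resolves a URL to its HTTP status code. In a real script
// this would wrap an HTTP request; here it is injected so it can be stubbed.
type StatusLookup = (url: string) => Promise<number>;

interface RouteQaResult {
  ok: string[];
  failing: { url: string; status: number }[];
}

// Checks every sitemap URL in order and separates clean 200s from the rest.
async function runRouteQa(
  urls: readonly string[],
  lookup: StatusLookup,
): Promise<RouteQaResult> {
  const result: RouteQaResult = { ok: [], failing: [] };
  for (const url of urls) {
    const status = await lookup(url);
    if (status === 200) {
      result.ok.push(url);
    } else {
      result.failing.push({ url, status });
    }
  }
  return result;
}

// Stubbed statuses standing in for live responses (illustrative data only).
const stubStatuses: Record<string, number> = {
  "https://example.com/": 200,
  "https://example.com/services": 200,
  "https://example.com/old-offer": 410,
};

// Unknown URLs are treated as 404 by this stub.
const stubLookup: StatusLookup = async (url) => stubStatuses[url] ?? 404;

const qaRun = runRouteQa(Object.keys(stubStatuses), stubLookup);
```

A build could fail whenever `failing` is non-empty, which keeps retired or broken routes from shipping in the sitemap.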
Controls
- A hand-maintained sitemap drifts from reality: it lists deleted URLs, omits new pages, and contradicts robots rules that block pages meant to be indexed or expose ones that should be private.
- Generate sitemap.ts programmatically from the same typed route and content sources the site renders from, so it can never list a dead or missing URL.
- When automation confidence is low, route the record to a manual owner with the source, stage, and last action attached.
Research basis
A route assembles through form, data, metadata, and deploy checks.
The same website operating path
Full-stack websites for service businesses and operators: route architecture, service pages, lead capture, metadata, proof boundaries, blog/database paths, analytics, and deployment checks.
- Route map: Service architecture. Clear service routes.
- Lead capture: Form and context flow. Lead capture that saves context.
- Public metadata: SEO and schema layer. SEO and schema on public pages.
- Launch QA: Analytics and deployment checks. Analytics events tied to CTAs.