Crawl Control

Sitemap and Robots Reconciliation

Generate the sitemap from the live route source and align robots rules so crawlers see exactly the canonical, indexable pages — no stale URLs, no orphans, no public pages accidentally blocked.

HMX Zone

Build time: 2 to 4 days
Stack: Next.js sitemap.ts (metadata route), Next.js robots.ts

Verified HMX-owned case

Outcome signals

These are the real outcome statements attached to this HMX case study.

Generated: sitemap built from live routes, never stale
Aligned: robots and canonicals agree
No orphans: every public page is discoverable
Protected: admin and utility paths stay out of crawl

Case architecture

Sitemap and Robots Reconciliation Architecture

6 nodes

01 Generate the sitemap from the live route source and align robots rules so crawlers see exactly the canonical, indexable pages: no stale URLs, no o...
02 Include lastmod and only canonical, indexable URLs
03 Next.js sitemap.ts (metadata route) supports the route, form, or data boundary for Sitemap and Robots Reconciliation so public UX and backend state stay connected.
04 Reconcile robots rules to allow public pages and disallow admin/utility paths
05 Fallback Path: When automation confidence is low, route the record to a manual owner with the source, stage, and last action attached.
06 Live Site: Generated: sitemap built from live routes, never stale; Aligned: robots and canonicals agree; No orphans: every public page is discoverable; Protected...

Problem

The operating gap

A hand-maintained sitemap drifts from reality: it lists deleted URLs, omits new pages, and contradicts robots rules that block pages meant to be indexed or expose ones that should be private. Crawlers waste budget and miss real content.
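The drift described above can be made visible by comparing the hand-maintained list with the routes the site actually serves. The following is a minimal sketch, not the case study's own tooling: the function name `diffSitemap` and the example paths are hypothetical.

```typescript
// Compare a hand-maintained sitemap list with the live route list.
// "stale" = listed in the sitemap but no longer served.
// "missing" = served by the site but absent from the sitemap.
type SitemapDrift = { stale: string[]; missing: string[] };

function diffSitemap(listed: string[], live: string[]): SitemapDrift {
  const liveSet = new Set(live);
  const listedSet = new Set(listed);
  return {
    stale: listed.filter((path) => !liveSet.has(path)),
    missing: live.filter((path) => !listedSet.has(path)),
  };
}

// Hypothetical example data.
const handMaintained = ["/", "/services", "/old-offer"];
const liveRoutes = ["/", "/services", "/case-studies"];

const drift = diffSitemap(handMaintained, liveRoutes);
console.log(drift.stale); // paths crawlers would chase for nothing
console.log(drift.missing); // pages crawlers would never be told about
```

Any non-empty result on either side is the kind of mismatch a generated sitemap is meant to rule out.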

Build

What gets built

Generate sitemap.ts programmatically from the same typed route and content sources the site renders from, so it can never list a dead or missing URL. Reconcile robots.ts to allow exactly the public, canonical pages and disallow admin and utility routes, keeping crawl directives and canonical tags in agreement.
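As a rough illustration of generating the sitemap from a typed route source, the sketch below filters a route list down to canonical, indexable, non-retired pages and maps each to a URL with a last-modified date. The `RouteRecord` shape, its field names, and the `example.com` origin are assumptions made for this example. `SitemapEntry` is a local stand-in for one entry of the `MetadataRoute.Sitemap` type that Next.js provides.

```typescript
// Hypothetical route record. The field names are assumptions for this sketch;
// a real project would map its own typed route and content sources into it.
type RouteRecord = {
  path: string; // site-relative path, e.g. "/services"
  updatedAt: string; // ISO 8601 timestamp of the last content change
  indexable: boolean; // false for noindex pages
  canonicalPath?: string; // set only when the page canonicalizes elsewhere
  retired?: boolean; // true for routes that have been removed
};

// Local stand-in for one entry of Next.js's MetadataRoute.Sitemap,
// declared here so the sketch has no dependencies.
type SitemapEntry = { url: string; lastModified: Date };

function buildSitemap(routes: RouteRecord[], baseUrl: string): SitemapEntry[] {
  const origin = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return routes
    .filter((r) => r.indexable && !r.retired)
    // List a page only if it is its own canonical.
    .filter((r) => r.canonicalPath === undefined || r.canonicalPath === r.path)
    .map((r) => ({
      url: origin + r.path,
      lastModified: new Date(r.updatedAt),
    }));
}

// Example data (hypothetical). In a Next.js app, app/sitemap.ts would
// default-export a function that returns buildSitemap(...) for the real routes.
const exampleRoutes: RouteRecord[] = [
  { path: "/", updatedAt: "2024-05-01T00:00:00Z", indexable: true },
  { path: "/services", updatedAt: "2024-05-10T00:00:00Z", indexable: true },
  { path: "/admin", updatedAt: "2024-05-11T00:00:00Z", indexable: false },
  { path: "/old-offer", updatedAt: "2023-01-01T00:00:00Z", indexable: true, retired: true },
  { path: "/services-promo", updatedAt: "2024-05-10T00:00:00Z", indexable: true, canonicalPath: "/services" },
];

const exampleSitemap = buildSitemap(exampleRoutes, "https://example.com/");
console.log(exampleSitemap.map((entry) => entry.url));
```

Because the entries are derived from the same records the site renders from, removing or retiring a route removes it from the sitemap in the same change.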

Build steps

Sitemap and Robots Reconciliation uses a web app route, data, and conversion layer for Full-Stack Websites. The architecture connects sitemap generation, the Next.js metadata routes, and the live site with an explicit control path.

01 Generate the sitemap from the live route and content sources, not a static list
02 Include lastmod and only canonical, indexable URLs
03 Reconcile robots rules to allow public pages and disallow admin/utility paths
04 Confirm robots and canonical tags agree on what is indexable
05 Exclude retired and gone routes so crawlers stop chasing them
06 Run route QA to confirm every sitemap URL resolves
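Steps 03 and 04 can be sketched as a small reconciliation check: define the robots rules as data, then flag any sitemap path those rules would block. This is an illustrative sketch only. The disallowed prefixes are hypothetical, `RobotsRule` is a local stand-in for the rule object a Next.js robots.ts returns, and the matcher is a plain prefix test that ignores the `*` and `$` wildcards real crawlers support.

```typescript
// Local stand-in for a single robots rule (userAgent / allow / disallow).
type RobotsRule = { userAgent: string; allow: string[]; disallow: string[] };

// Hypothetical rules: public pages allowed, admin and utility paths blocked.
const robotsRule: RobotsRule = {
  userAgent: "*",
  allow: ["/"],
  disallow: ["/admin/", "/api/", "/preview/"],
};

// Simplified check: a path is blocked if it starts with a disallowed prefix.
// Real robots.txt matching also handles wildcards and longest-match precedence.
function isBlocked(path: string, rule: RobotsRule): boolean {
  return rule.disallow.some((prefix) => path.startsWith(prefix));
}

// A sitemap path that robots rules block is a contradiction to fix:
// either the page should not be listed or the rule is too broad.
function findConflicts(sitemapPaths: string[], rule: RobotsRule): string[] {
  return sitemapPaths.filter((path) => isBlocked(path, rule));
}

const conflicts = findConflicts(["/", "/services", "/admin/users"], robotsRule);
console.log(conflicts);
```

Running a check like this in CI is one way to keep crawl directives and the sitemap from silently disagreeing.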

Stack

Tools and layers

Next.js sitemap.ts (metadata route)
Next.js robots.ts
Typed route/content sources
Canonical URL alignment
Route-QA script
Vercel

Experience layer: Generate the sitemap from the live route and content sources, not a static list
Server layer: Include lastmod and only canonical, indexable URLs
Database layer: Next.js sitemap.ts (metadata route) supports the route, form, or data boundary for Sitemap and Robots Reconciliation so public UX and backend state stay connected.
Automation layer: Next.js robots.ts handles the routine crawl directives, while sitemap.ts is generated programmatically from the same typed route and content sources the site renders from, so it can never list a dead or missing URL.
Measurement layer: Generated: sitemap built from live routes, never stale; Aligned: robots and canonicals agree; No orphans: every public page is discoverable; Protected...
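The route-QA script in the tool list could take roughly the following shape. This sketch assumes HTTP status codes have already been collected for each URL (for example by a separate request pass) and only shows the audit step. The names `auditSitemapUrls` and `UrlFailure` are hypothetical and do not come from the case study.

```typescript
type UrlFailure = { url: string; reason: string };

// Report every sitemap URL that did not resolve with HTTP 200,
// including URLs for which no status was collected at all.
function auditSitemapUrls(urls: string[], statusByUrl: Map<string, number>): UrlFailure[] {
  const failures: UrlFailure[] = [];
  for (const url of urls) {
    const status = statusByUrl.get(url);
    if (status === undefined) {
      failures.push({ url, reason: "not checked" });
    } else if (status !== 200) {
      failures.push({ url, reason: `status ${status}` });
    }
  }
  return failures;
}

// Hypothetical example data.
const sitemapUrls = [
  "https://example.com/",
  "https://example.com/services",
  "https://example.com/old-offer",
];
const collectedStatuses = new Map<string, number>([
  ["https://example.com/", 200],
  ["https://example.com/old-offer", 404],
]);

const qaFailures = auditSitemapUrls(sitemapUrls, collectedStatuses);
console.log(qaFailures);
```

Treating redirects as failures is a deliberate choice in this sketch: a sitemap should list final canonical URLs, so a 301 entry is worth flagging.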

Controls

A hand-maintained sitemap drifts from reality: it lists deleted URLs, omits new pages, and contradicts robots rules that block pages meant to be in...
Generate sitemap.ts programmatically from the same typed route and content sources the site renders from, so it can never list a dead or missing URL.
When automation confidence is low, route the record to a manual owner with the source, stage, and last action attached.
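The fallback control can be pictured as a simple triage step. The sketch below is hypothetical throughout: the source does not define how confidence is scored, so the numeric `confidence` field, the `0.8` threshold, and the `owner` value are assumptions chosen only to show the handoff carrying source, stage, and last action.

```typescript
// Hypothetical record for a URL moving through the sitemap pipeline.
type CrawlRecord = {
  url: string;
  source: string; // where the URL came from, e.g. "cms" or "route-manifest"
  stage: string; // pipeline stage that produced the record
  lastAction: string; // last automated action taken
  confidence: number; // assumed score in [0, 1]
};

type Decision =
  | { kind: "auto"; url: string }
  | { kind: "manual"; owner: string; url: string; source: string; stage: string; lastAction: string };

// Below the threshold, hand the record to a person with its context attached.
function triage(record: CrawlRecord, owner: string, threshold: number = 0.8): Decision {
  if (record.confidence >= threshold) {
    return { kind: "auto", url: record.url };
  }
  return {
    kind: "manual",
    owner,
    url: record.url,
    source: record.source,
    stage: record.stage,
    lastAction: record.lastAction,
  };
}

const uncertain: CrawlRecord = {
  url: "https://example.com/legacy-page",
  source: "cms",
  stage: "canonical-check",
  lastAction: "excluded-from-sitemap",
  confidence: 0.4,
};

console.log(triage(uncertain, "site-owner"));
```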

Research basis

A route assembles through form, data, metadata, and deploy checks.

The same website operating path

Full-stack websites for service businesses and operators: route architecture, service pages, lead capture, metadata, proof boundaries, blog/database paths, analytics, and deployment checks.
