{
  "slug": "wordpress-migration",
  "title": "WordPress migration (F03)",
  "description": "Probe a live WordPress site, review its theme/builder/content, and import posts, pages, media and taxonomies into a fresh @webhouse/cms site.",
  "category": "guides",
  "order": 22,
  "locale": "en",
  "translationGroup": "4b2c7eb3-13c9-4ba0-8049-ad7c64724db8",
  "helpCardId": null,
  "content": "## What it does\n\nPoint at a public WordPress site, click through a 4-step wizard, and end up with a brand-new @webhouse/cms site containing the WP content as JSON documents plus downloaded media. Phase 1 handles the content + media side cleanly; theme/design extraction is Phase 2.\n\nThe migration lives at **/admin/sites/new → WordPress tab**.\n\n## What gets imported\n\n- Posts, pages, and custom post types\n- Media files (images, PDFs, downloads) — downloaded and stored under `/uploads/`\n- Featured images, wired to the imported post\n- Categories and tags\n- Excerpts, publish status, publish date\n- Authors (as text — not as a relation collection in Phase 1)\n\n**Not imported (Phase 2+):**\n\n- Comments\n- Menus / navigation\n- ACF / custom fields\n- Block structure for Gutenberg-only blocks that don't transform to clean HTML (their output is kept as HTML, but not structured)\n- Page-builder shortcodes (Divi, WPBakery) — they leak as raw `[et_pb_...]` text until Phase 2's HTML-scraping fallback lands\n\n## The wizard (4 steps)\n\n### 1. Probe\n\nEnter the WordPress site URL. The wizard calls the WP REST API (`/wp-json/wp/v2/...`) to detect:\n\n- Theme name and version\n- Page builder in use (Elementor, Divi, WPBakery, Gutenberg, Classic)\n- Content inventory: post counts per post type, media counts, taxonomy counts\n\nThe probe takes ~3–5 seconds and works on any self-hosted WordPress with the REST API enabled (default since WP 4.7 — about 90% of sites). Sites hosted on wordpress.com are not supported directly; the wizard needs the site's REST API to be accessible.\n\n### 2. Review\n\nReview the detected metadata. No content preview yet — that's a Phase 2 addition. Decide whether to continue based on the inventory numbers.\n\n### 3. Name\n\nGive the new site an ID and display name, and pick which organization to add it under. The wizard auto-generates `cms.config.ts` from the discovered post types: a WP custom post type called `exhibitions` becomes a CMS collection with the same name.\n\n### 4. 
Migrate\n\nSpinner screen. The wizard:\n\n- Paginates the WP REST API (100 items per page, no delay between pages)\n- Downloads each media file, slugifies the filename (e.g. `photo-a1b2.jpg`), writes to `public/uploads/`\n- Rewrites `<img>` URLs in post content from `wp-content/uploads/...` to the new `/uploads/...` paths\n- Creates one JSON document per post/page in `content/<collection>/<slug>.json`\n- Writes the generated `cms.config.ts` with `urlPrefix` matching the original WP paths (so any redirects you set up can keep working 1:1)\n- Registers the site in the CMS registry under the chosen org\n\nDuration: ~30 seconds for a small blog, up to 5 minutes for a site with hundreds of media files. No progress bar in Phase 1 — just the final \"Open site in CMS\" button.\n\n## Authentication\n\nPublic WP sites need nothing. For private sites (e.g. `wp-admin`-protected), the wizard supports WordPress application passwords: `username:app-password` passed via HTTP Basic Auth.\n\n## What to check after migration\n\n- **Broken shortcodes** — if the source used Divi/WPBakery/Elementor, you'll see raw shortcode text in imported content. Either manually clean up or wait for Phase 2 HTML scraping.\n- **Author links** — authors come in as text. If you want relational authors, add a `team` collection and rewrite the `author` field as a relation.\n- **Custom fields** — ACF fields are dropped. Check the WP admin source for fields you need and re-add them as @webhouse/cms fields in `cms.config.ts` (re-exporting `webhouse-schema.json` if the site has non-TS consumers).\n- **URL prefix** — verify `urlPrefix` matches the original structure so your old URLs still resolve.\n- **Images with text in them** — the downloaded images are byte-identical copies, no alt text inferred. 
Run the media AI analysis to generate alt text in bulk.\n\n## Phase 2+ roadmap\n\n- Design token extraction via Dembrandt (colors, fonts, spacing scale)\n- Tailwind config auto-generation from extracted tokens\n- HTML scraping fallback for page-builder sites (Divi, WPBakery)\n- WXR XML import (WP export file) as an offline alternative\n- Custom field mapping UI\n- Content preview before commit\n\nPhase 1 is the safe baseline — it won't do anything unexpected, and everything it does import is lossless against the WP REST API response.",
  "excerpt": "Point at a public WordPress site, click through a 4-step wizard, and end up with a brand-new @webhouse/cms site containing the WP content as JSON documents plus downloaded media. Phase 1 handles the content + media side cleanly; theme/design extraction is Phase 2.",
  "seo": {
    "metaTitle": "WordPress migration (F03) — webhouse.app Docs",
    "metaDescription": "Migrate posts, pages, media and taxonomies from a live WordPress site into a fresh @webhouse/cms site via a 4-step wizard.",
    "keywords": [
      "webhouse",
      "cms",
      "wordpress",
      "migration",
      "import",
      "rest-api",
      "f03"
    ]
  },
  "createdAt": "2026-04-15T21:00:00.000Z",
  "updatedAt": "2026-04-15T21:00:00.000Z"
}