
Probe a live WordPress site, review its theme/builder/content, and import posts, pages, media and taxonomies into a fresh @webhouse/cms site.

What it does

Point the wizard at a public WordPress site, click through four steps, and you end up with a brand-new @webhouse/cms site containing the WP content as JSON documents plus the downloaded media. Phase 1 covers content and media; theme and design extraction is planned for Phase 2.

The migration lives at /admin/sites/new → WordPress tab.

What gets imported

  • Posts, pages, and custom post types
  • Media files (images, PDFs, downloads) — downloaded and stored under /uploads/
  • Featured images, wired to the imported post
  • Categories and tags
  • Excerpts, publish status, publish date
  • Authors (as text — not as a relation collection in Phase 1)

Not imported (Phase 2+):

  • Comments
  • Menus / navigation
  • ACF / custom fields
  • Structure of Gutenberg-only blocks that don't transform to clean HTML (the markup is kept as HTML, but not as structured content)
  • Page-builder shortcodes (Divi, WPBakery) — they leak as raw [et_pb_...] text until Phase 2's HTML-scraping fallback lands

The wizard (4 steps)

1. Probe

Enter the WordPress site URL. The wizard calls the WP REST API (/wp-json/wp/v2/...) to detect:

  • Theme name and version
  • Page builder in use (Elementor, Divi, WPBakery, Gutenberg, Classic)
  • Content inventory: post counts per post type, media counts, taxonomy counts

Takes ~3–5 seconds. Works on any self-hosted WordPress site with the REST API enabled (the default since WP 4.7, which covers about 90% of sites). WordPress.com-hosted sites are not supported directly; the wizard needs the site's REST API to be reachable.
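The inventory part of the probe can be approximated with a few requests. The sketch below is not the wizard's actual code: it assumes the standard WP REST API behavior of reporting a collection's size in the X-WP-Total response header, and the helper names (countUrl, readTotal, probeCount) are made up for illustration.

```typescript
// Hypothetical helpers; the wizard's real implementation is not shown on this page.

/** Build a collection URL that asks for a single item, since only the headers matter. */
function countUrl(siteUrl: string, restBase: string): string {
  const base = siteUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/wp-json/wp/v2/${restBase}?per_page=1`;
}

/** Read the X-WP-Total header that WordPress sets on collection responses. */
function readTotal(headers: { get(name: string): string | null }): number {
  const raw = headers.get("X-WP-Total");
  const total = raw === null ? NaN : Number.parseInt(raw, 10);
  return Number.isNaN(total) ? 0 : total; // treat a missing header as "nothing found"
}

/** Fetch the item count for one REST base such as "posts", "pages" or "media". */
async function probeCount(siteUrl: string, restBase: string): Promise<number> {
  const res = await fetch(countUrl(siteUrl, restBase));
  if (!res.ok) throw new Error(`Probe failed for ${restBase}: HTTP ${res.status}`);
  return readTotal(res.headers);
}
```

Calling probeCount once per REST base keeps the probe cheap, because each response carries one item at most.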

2. Review

Review the detected metadata. No content preview yet — that's a Phase 2 addition. Decide whether to continue based on the inventory numbers.

3. Name

Give the new site an ID and a display name, and pick the organization to add it under. The wizard auto-generates cms.config.ts from the discovered post types: a WP custom post type called exhibitions becomes a CMS collection with the same name.
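As a rough illustration of that mapping, the sketch below turns the response of /wp-json/wp/v2/types into a list of collection descriptors. The CollectionSketch shape, the toCollections helper, and the list of skipped built-in types are assumptions made for this example; the real cms.config.ts format may look different.

```typescript
// Subset of the fields returned by /wp-json/wp/v2/types for each post type.
interface WpPostType {
  slug: string;      // e.g. "exhibitions"
  name: string;      // human-readable label, e.g. "Exhibitions"
  rest_base: string; // REST route segment, e.g. "exhibitions"
}

// Hypothetical shape; the real cms.config.ts collection type may differ.
interface CollectionSketch {
  name: string;
  label: string;
  restBase: string;
}

// Built-in types that hold media or editor internals rather than content (assumed list).
const NON_CONTENT_TYPES = new Set([
  "attachment",
  "nav_menu_item",
  "wp_block",
  "wp_template",
  "wp_template_part",
  "wp_navigation",
]);

/** Map each content post type to a collection that reuses the post type's slug as its name. */
function toCollections(types: Record<string, WpPostType>): CollectionSketch[] {
  return Object.values(types)
    .filter((t) => !NON_CONTENT_TYPES.has(t.slug))
    .map((t) => ({ name: t.slug, label: t.name, restBase: t.rest_base }));
}
```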

4. Migrate

Spinner screen. The wizard:

  • Paginates the WP REST API (100 items per page, no delay between pages)
  • Downloads each media file, slugifies the filename (e.g. photo-a1b2.jpg), writes to public/uploads/
  • Rewrites <img> URLs in post content from wp-content/uploads/... to the new /uploads/... paths
  • Creates one JSON document per post/page in content/<collection>/<slug>.json
  • Writes the generated cms.config.ts with urlPrefix matching the original WP paths (so any redirects you set up can keep working 1:1)
  • Registers the site in the CMS registry under the chosen org
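Two of the steps above, filename slugification and <img> URL rewriting, can be sketched as pure functions. This is an illustration under assumptions, not the wizard's code: slugifyFilename and rewriteImageUrls are hypothetical names, the sketch does not add the short suffix visible in the photo-a1b2.jpg example, and it only touches src attributes (not srcset).

```typescript
/** Lowercase the base name, collapse non-alphanumerics to "-", keep the extension. */
function slugifyFilename(filename: string): string {
  const dot = filename.lastIndexOf(".");
  const base = dot > 0 ? filename.slice(0, dot) : filename;
  const ext = dot > 0 ? filename.slice(dot).toLowerCase() : "";
  const slug = base
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // strip combining accents left by NFKD
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return (slug || "file") + ext;
}

/** Rewrite the src of each <img> tag using a map from old URL to new path. */
function rewriteImageUrls(html: string, urlMap: Map<string, string>): string {
  return html.replace(
    /(<img\b[^>]*?\ssrc=["'])([^"']+)(["'])/gi,
    (match: string, prefix: string, src: string, quote: string) => {
      const replacement = urlMap.get(src);
      // Leave images alone when their URL was not part of the download set.
      return replacement === undefined ? match : prefix + replacement + quote;
    },
  );
}
```

Building the map while downloading (old wp-content/uploads/... URL to new /uploads/... path) means the rewrite only changes URLs whose files actually exist locally.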

Duration: ~30 seconds for a small blog, up to 5 minutes for a site with hundreds of media files. No progress bar in Phase 1 — just the final "Open site in CMS" button.
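Part of that duration is simply the number of REST requests, which follows from the 100-items-per-page pagination. A minimal sketch, assuming the standard X-WP-TotalPages response header and using hypothetical helper names (pageUrl, pageCount, fetchAll):

```typescript
const PER_PAGE = 100; // page size used by the wizard; also the WP REST API maximum

/** URL for one page of a collection such as "posts" or "media". */
function pageUrl(siteUrl: string, restBase: string, page: number): string {
  const base = siteUrl.replace(/\/+$/, "");
  return `${base}/wp-json/wp/v2/${restBase}?per_page=${PER_PAGE}&page=${page}`;
}

/** Number of requests needed to read a collection of the given size. */
function pageCount(totalItems: number): number {
  return Math.ceil(totalItems / PER_PAGE);
}

/** Fetch every page back to back, with no delay between requests. */
async function fetchAll(siteUrl: string, restBase: string): Promise<unknown[]> {
  const items: unknown[] = [];
  let totalPages = 1;
  for (let page = 1; page <= totalPages; page++) {
    const res = await fetch(pageUrl(siteUrl, restBase, page));
    if (!res.ok) throw new Error(`HTTP ${res.status} on page ${page}`);
    // WordPress reports the number of pages in the X-WP-TotalPages header.
    totalPages = Number.parseInt(res.headers.get("X-WP-TotalPages") ?? "1", 10) || 1;
    items.push(...((await res.json()) as unknown[]));
  }
  return items;
}
```

For example, a site with 250 posts needs pageCount(250) = 3 requests for posts; media downloads are one additional request per file.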

Authentication

Public WP sites need no credentials. For private sites (e.g. wp-admin-protected), the wizard supports WordPress application passwords, passed as username:app-password via HTTP Basic Auth.
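For reference, HTTP Basic Auth means sending the base64-encoded username:app-password pair in the Authorization header. The sketch below shows the header construction; basicAuthHeader and fetchPrivate are hypothetical names, not part of the wizard's API.

```typescript
/** Build the Authorization header value for a WordPress application password. */
function basicAuthHeader(username: string, appPassword: string): string {
  // btoa only accepts Latin-1 input; this sketch assumes ASCII credentials.
  return "Basic " + btoa(`${username}:${appPassword}`);
}

/** Fetch a REST URL on a private site using the header above. */
async function fetchPrivate(url: string, username: string, appPassword: string): Promise<Response> {
  return fetch(url, {
    headers: { Authorization: basicAuthHeader(username, appPassword) },
  });
}
```

Because Basic Auth is only an encoding and not encryption, the site URL should use https when credentials are involved.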

What to check after migration

  • Broken shortcodes: if the source used Divi, WPBakery, or Elementor, you'll see raw shortcode text in the imported content. Either clean it up manually or wait for Phase 2 HTML scraping.
  • Author links: authors come in as text. If you want relational authors, add a team collection and rewrite the author field as a relation.
  • Custom fields: ACF fields are dropped. Check the source site's WP admin for fields you need and re-add them as @webhouse/cms fields in cms.config.ts (re-exporting webhouse-schema.json if the site has non-TS consumers).
  • URL prefix: verify that urlPrefix matches the original structure so your old URLs still resolve.
  • Images with text in them: the downloaded images are byte-identical copies, and no alt text is inferred. Run the media AI analysis to generate alt text in bulk.
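The first check in the list above can be partly automated by scanning imported content for leftover shortcodes. The sketch below is an assumption-laden example: findShortcodes is a hypothetical helper, and it only looks for the et_pb_ (Divi) and vc_ (WPBakery) prefixes, so other builders would need their own patterns.

```typescript
/** Return the distinct page-builder shortcode names found in a content string. */
function findShortcodes(content: string): string[] {
  // Prefixes assumed for this sketch: et_pb_ (Divi) and vc_ (WPBakery).
  // Matches both opening tags like [et_pb_section ...] and closing tags like [/et_pb_section].
  const pattern = /\[\/?((?:et_pb_|vc_)[a-z0-9_]+)[^\]]*\]/gi;
  const names = new Set<string>();
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(content)) !== null) {
    const name = m[1];
    if (name !== undefined) names.add(name.toLowerCase());
  }
  return Array.from(names).sort();
}
```

Running this over the body field of each file in content/<collection>/ gives a quick list of documents that still need cleanup.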

Phase 2+ roadmap

  • Design token extraction via Dembrandt (colors, fonts, spacing scale)
  • Tailwind config auto-generation from extracted tokens
  • HTML scraping fallback for page-builder sites (Divi, WPBakery)
  • WXR XML import (WP export file) as an offline alternative
  • Custom field mapping UI
  • Content preview before commit

Phase 1 is the safe baseline — it won't do anything unexpected, and everything it does import is lossless against the WP REST API response.
