{
  "slug": "agent-image-generation",
  "title": "Image Generation",
  "description": "Agents can generate images via Google Gemini 3 Pro Image. Output goes through the same media pipeline as user uploads — variants, EXIF, AI alt-text — and every image is tagged as AI-generated.",
  "category": "concepts",
  "order": 4,
  "locale": "en",
  "translationGroup": "5cd1bc3e-2f54-4f7d-a48f-11f8ec3bea22",
  "helpCardId": null,
  "content": "## What is the `generate_image` tool?\n\nWhen you enable **Image generation** on an agent, the agent gains a new tool called `generate_image`. The tool calls Google's Gemini 3 Pro Image model (commonly known as **Nano Banana 2**) and produces a real image from a text prompt. The image is saved to the site's media library, optimised, analyzed, and stamped with provenance metadata — all in one tool call.\n\nThe agent decides on its own when to call the tool. A typical run looks like:\n\n1. The agent reads its prompt (\"Write a short post about mountain sunrises\").\n2. The agent calls `generate_image` with a descriptive prompt of its own (\"A serene sunrise over snow-capped mountains, soft golden hour light, photorealistic\").\n3. Gemini returns the image bytes.\n4. The image is saved, processed, analyzed for alt-text, and the tool returns a Markdown image tag pointing at the saved file.\n5. The agent embeds that tag at the top of the article body.\n6. The final post lands in the curation queue with the image already in place.\n\n## Pipeline parity with uploads\n\nThe key design decision: a generated image goes through the **exact same processing pipeline** as a user-uploaded image. Specifically:\n\n| Step | Same as upload? |\n|------|-----------------|\n| Save bytes to `public/uploads/` via the media adapter | ✓ |\n| Generate WebP variants (Sharp, default 400 / 800 / 1200 / 1600 widths) | ✓ |\n| Extract EXIF metadata | ✓ (will be empty on synthetic images) |\n| Run F44 AI vision analysis to produce caption + alt-text + tags | ✓ |\n| Append to `media-meta.json` | ✓ |\n\nThe only thing that differs is **provenance**. 
Generated images get four extra fields on their `MediaMeta` entry:\n\n```json\n{\n  \"generatedByAi\": true,\n  \"generatedByModel\": \"gemini-3-pro-image-preview\",\n  \"generatedAt\": \"2026-04-08T...\",\n  \"generatedPrompt\": \"A serene sunrise over snow-capped mountains...\"\n}\n```\n\n## Marking and filtering in the media library\n\nEvery AI-generated image gets a distinct **purple AI badge** on the media card (separate from the gold sparkles badge that marks AI-analyzed-but-uploaded images). The badge tooltip shows the original prompt for quick context.\n\nThe media list sidebar gets a new **AI generated** filter under the AI Analysis section. Click it to see only the images your agents have produced.\n\nBoth grid view and list view render the badge.\n\n## Cost\n\nNano Banana 2 (Gemini 3 Pro Image Preview) costs **$0.039 per image**. The cost is charged to the Cockpit budget via `addCost()` immediately after a successful generation. If you have per-agent cost guards enabled, the image cost counts toward those caps too.\n\n## Failure mode: no hallucinated placeholders\n\nThe tool description has a strict rule baked in: **if generation fails, the agent must omit the image entirely.** No placeholder URLs, no \"image coming soon\" text, no stock images.\n\nWhen `generate_image` fails, the handler returns a string that begins with `Image generation failed:`. The agent's tool description tells it that any such string means \"do NOT include any image in your final output. Continue writing the article without one.\" The error message itself reminds the agent of this rule inline.\n\nThe curation Preview modal also defends against this — if a Markdown image URL doesn't start with `http(s)://`, `/`, or `data:`, it renders a small \"⚠ Invalid image URL\" warning chip instead of a broken `<img>`.\n\n## Configuration\n\n1. Make sure the Gemini API key is set on the org or site (org-level `aiGeminiApiKey` is inherited via F87).\n2. 
Open the agent you want to enable it for, scroll to **Tools**, tick **Image generation (Gemini Nano Banana)**, save.\n3. Run the agent with a prompt that benefits from an image.\n\nKeys resolved in order: `ai-config.json` → `GOOGLE_GENERATIVE_AI_API_KEY` env → `GEMINI_API_KEY` env. If none are set, `buildToolRegistry` returns `null` for the tool and it is silently skipped — the agent can still run, just without the image option.\n\n## Webhook and curation embed\n\nWhen the agent finishes, the `agent.completed` webhook embed renders the generated image inline (Discord `embed.image`) provided the image URL is publicly reachable. Locally hosted images can't be reached by Discord, so the embed falls back to a clickable link in the description until the document is approved and deployed.\n\n## See also\n\n- [Per-agent cost guards](/docs/agent-cost-guards) — cap image spend at the agent level.\n- [Curation queue Preview](/docs/curation-queue) — see your generated images rendered before approval.\n- [Media library](/docs/media) — filter and badge for AI-generated images.",
  "excerpt": "What is the generateimage tool?\n\nWhen you enable Image generation on an agent, the agent gains a new tool called generateimage. The tool calls Google's Gemini 3 Pro Image model (commonly known as Nano Banana 2) and produces a real image from a text prompt. The image is saved to the site's media libr",
  "seo": {
    "metaTitle": "Image Generation — webhouse.app Docs",
    "metaDescription": "Agents generate images via Gemini 3 Pro Image. Same media pipeline as uploads, with full AI-generated provenance and a media library filter.",
    "keywords": [
      "webhouse",
      "cms",
      "ai",
      "image generation",
      "gemini",
      "nano banana"
    ]
  },
  "createdAt": "2026-04-08T00:00:00.000Z",
  "updatedAt": "2026-04-08T00:00:00.000Z"
}