# WideCast.ai

> Turn a script, idea, blog post, or an existing video/audio clip into a finished
> short-form video (scenes + AI voice + B-roll) via a small REST API. Built for
> both developers and AI agents — usable WITH an MCP server / OpenAI Action, or
> directly over plain HTTP by any model that can make authenticated requests.

WideCast.ai exposes a handful of REST endpoints under `/v1/`. Conventions are
Stripe-style: id format `gubo<alphanumeric>` (the public id IS the internal
`topic_id`, 1:1), `"object"` markers, snake_case JSON, ISO-8601 UTC timestamps,
and a structured error envelope. **`/openapi.json` (OpenAPI 3.1) is the single
source of truth** — fetch it to generate tool/function definitions at runtime;
the limits, enums, and error codes inlined below are a convenience copy.

## API base & auth

- **Base URL (live pilot):** `https://gubo.ai/app/dashboard2` — e.g. `POST https://gubo.ai/app/dashboard2/v1/create_video`
- **Base URL (production, planned):** `https://api.widecast.ai`
- **Auth:** send `Authorization: Bearer wc_live_...` (create a key in the dashboard → Setup Center → API Keys). `/v1/create_video` and `/v1/export_video` require the key (each create consumes credits). **`GET /v1/status/{id}` is key-free** — the unguessable `gubo*` id is itself the access token, so polling never needs the key.
- Send a unique `Idempotency-Key` header on POSTs.
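
The header rules above can be sketched in Python as follows. `build_headers` is a hypothetical helper name, and using a random UUID as the `Idempotency-Key` is an assumption: the text only asks for a unique value. The key in the usage line is a placeholder, not a real key.

```python
import uuid
from typing import Dict, Optional


def build_headers(api_key: str, idempotency_key: Optional[str] = None) -> Dict[str, str]:
    """Headers for an authenticated JSON POST (create_video / export_video)."""
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: any unique string is accepted; a random UUID is one easy choice.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # Placeholder key for illustration only.
    print(build_headers("wc_live_example"))
```

`GET /v1/status/{id}` needs none of these headers, so a poller can run without ever holding the key.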

## Calling it without an MCP server (plain-HTTP recipe)

A model with any HTTP/`fetch`/`curl` capability can drive WideCast in two steps, plus an optional third for exporting a `scene` video:

1. **Create** — `POST /v1/create_video` with a JSON body (see fields below). Returns a `gubo*` `id` and `status: "processing"` (HTTP 202).
2. **Poll** — `GET /v1/status/{id}` until `status` is `completed` or `failed`. Cadence: start 5s, ×1.5 backoff, cap 60s. Do NOT hammer it in a tight loop.
3. **(scene only, optional)** — once a `scene` video is `completed` (review_url ready), `POST /v1/export_video {"id": "..."}` to render the final MP4, then poll again until `result.video_url` appears.

If you cannot read this file's spec in full, fetch `/openapi.json` and generate the call from there.
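
The polling cadence from step 2 (start 5s, ×1.5 backoff, cap 60s) can be sketched like this. It is a minimal illustration, not the SDK's implementation: `poll_delays`, `fetch_status`, and `wait_for_video` are hypothetical names, `max_polls` is an invented safety limit, and the code assumes `status` is a top-level field of the JSON response. The base URL is the live-pilot one listed above.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Iterator

BASE_URL = "https://gubo.ai/app/dashboard2"  # live-pilot base URL from this document


def poll_delays(start: float = 5.0, factor: float = 1.5, cap: float = 60.0) -> Iterator[float]:
    """Yield 5, 7.5, 11.25, ... seconds, never exceeding the cap."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def fetch_status(video_id: str) -> Dict[str, Any]:
    """Key-free status poll: the id itself is the access token."""
    with urllib.request.urlopen(f"{BASE_URL}/v1/status/{video_id}") as resp:
        return json.load(resp)


def wait_for_video(
    video_id: str,
    fetch: Callable[[str], Dict[str, Any]] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 100,  # assumption: a client-side limit, not an API rule
) -> Dict[str, Any]:
    """Poll until status is completed or failed, backing off between polls."""
    for _, delay in zip(range(max_polls), poll_delays()):
        body = fetch(video_id)
        # Branch on status only; pending is treated like processing.
        if body.get("status") in ("completed", "failed"):
            return body
        sleep(delay)
    raise TimeoutError(f"{video_id} did not finish after {max_polls} polls")
```

Because `fetch` and `sleep` are injectable, the loop can be exercised with canned responses and no real waiting before pointing it at the live endpoint.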

## Endpoints

- POST `/v1/create_video` — start a video. Body fields below. → `gubo*` id + `status:"processing"`.
- GET `/v1/status/{id}` — universal poll (key-free). → `status` + `stage` + `progress` + `result.review_url` / `result.video_url`.
- POST `/v1/export_video` — `{"id":"gubo..."}`. Render the final MP4 for a `scene` video after review. Idempotent (calling twice is a no-op).
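
As a sketch of the export call listed above, the following builds (but does not send) the request using only the Python standard library. `build_export_request` is a hypothetical helper, the base URL is the live-pilot one from this document, and the UUID `Idempotency-Key` is an assumption about what counts as a unique value.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://gubo.ai/app/dashboard2"  # live-pilot base URL from this document


def build_export_request(video_id: str, api_key: str) -> urllib.request.Request:
    """Prepare POST /v1/export_video for a scene video that has completed review."""
    payload = json.dumps({"id": video_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/export_video",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Idempotency-Key": str(uuid.uuid4()),  # assumption: a UUID is unique enough
            "Content-Type": "application/json",
        },
    )


# To send it: urllib.request.urlopen(build_export_request(video_id, api_key)),
# then poll GET /v1/status/{id} until result.video_url appears.
```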

## Request fields — `source` × `output_type`

`source` (enum, default `text`) — what input you provide, and the required companion field:

- `text` → `script_text` — a finished narration **80–500 words**, used VERBATIM (no AI rewrite). Over-max → rejected (`script_too_long`).
- `idea` → `idea_text` — a brief **5–1000 words**; the server writes the narration. Over-max → auto-truncated (`details.input_truncated_from`).
- `blog` → `blog_text` — an article **30–3000 words** to repurpose; same AI writer as `idea`. Over-max → auto-truncated.
- `video_url` → `video_url` — a YouTube/TikTok/Facebook link, **≤ 2 min**.
- `audio_url` → `audio_url` — a YouTube/TikTok/Facebook link, **≤ 2 min**.
- `video_file` / `audio_file` → multipart upload (**≤ 2 min, ≤ 100 MB**). Requires `multipart/form-data`, so these are SDK/playground-only — most chat agents can't attach a local file and should use the URL sources instead.

`idea` and `blog` are **generative** sources (server writes the script). `text` is verbatim. Media sources (`video_url`/`video_file`/`audio_url`/`audio_file`) carry their own script.
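
A client can check the companion field and word limits locally before spending a request. The sketch below is an assumption-laden convenience, not the server's validator: `build_create_body` is a hypothetical helper, words are counted by whitespace splitting (the server may count differently), and the multipart file sources are deliberately left out.

```python
from typing import Dict

# source -> (companion field, min words, max words), as inlined in this document.
TEXT_SOURCES = {
    "text": ("script_text", 80, 500),
    "idea": ("idea_text", 5, 1000),
    "blog": ("blog_text", 30, 3000),
}
URL_SOURCES = ("video_url", "audio_url")


def build_create_body(source: str, value: str, output_type: str = "scene") -> Dict[str, str]:
    """Return a JSON body for POST /v1/create_video, or raise ValueError."""
    if source == "text" and output_type == "text":
        raise ValueError("output_type 'text' is not valid when source is 'text'")
    if source in URL_SOURCES:
        if not value:
            raise ValueError(f"{source} is required")
        # For URL sources the companion field has the same name as the source.
        return {"source": source, source: value, "output_type": output_type}
    if source not in TEXT_SOURCES:
        raise ValueError(f"unsupported source for a JSON body: {source!r}")
    field, min_words, max_words = TEXT_SOURCES[source]
    words = len(value.split())  # assumption: whitespace-separated word count
    if words < min_words:
        raise ValueError(f"{field} has {words} words; minimum is {min_words}")
    # Only verbatim scripts are rejected when too long; idea/blog are auto-truncated.
    if source == "text" and words > max_words:
        raise ValueError(f"{field} has {words} words; maximum is {max_words}")
    return {"source": source, field: value, "output_type": output_type}
```

For example, `build_create_body("idea", "a short video about urban beekeeping")` yields a body with `source`, `idea_text`, and the default `output_type` of `scene`.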

**Inline media in `script_text` (source=text):** you can embed a direct image/video URL inside the script, right next to the sentence it should illustrate. WideCast strips the URL from the spoken narration and uses that asset as the visual for the matching scene (instead of auto-sourced B-roll). Supported links are direct file links ending in `.png/.jpg/.jpeg/.gif/.webp/.bmp/.avif/.svg` (image) or `.mp4/.webm/.mov/.m4v/.avi` (video), optionally with a `?query`. Page links (a YouTube/TikTok watch URL) are NOT inlined; use `source=video_url` to turn a whole clip into a video. Keep a few descriptive words next to each URL so it anchors to the right scene. Multiple URLs are fine, and scenes without a URL still get automatic B-roll. So when the user already has specific photos/footage, put their direct URLs into `script_text` rather than describing them.
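
To decide whether a URL qualifies as inline media before placing it in `script_text`, a client might check its extension as sketched here. `inline_media_kind` is a hypothetical helper, and matching extensions case-insensitively is an assumption: the text lists lowercase extensions only.

```python
from typing import Optional
from urllib.parse import urlparse

IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp", ".avif", ".svg")
VIDEO_EXTS = (".mp4", ".webm", ".mov", ".m4v", ".avi")


def inline_media_kind(url: str) -> Optional[str]:
    """Return "image" or "video" for a direct file link, else None."""
    # urlparse separates the ?query from the path, so the extension check ignores it.
    path = urlparse(url).path.lower()  # assumption: case-insensitive extension match
    if path.endswith(IMAGE_EXTS):
        return "image"
    if path.endswith(VIDEO_EXTS):
        return "video"
    return None  # page links (e.g. a watch URL) belong in source=video_url instead
```

A script line using such a link might read `Our new storefront opened in May https://cdn.example.com/storefront.jpg and drew a crowd.` (the URL is illustrative), keeping descriptive words on both sides of the link.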

`output_type` (enum, default `scene`) — how far down the pipeline to run:

- `text` — stop at the editable script (only valid for non-`text` sources; for media it = Remake / transcript extract). `result.review_url` opens the Script Editor.
- `scene` — stop at "scenes ready for review"; returns `result.review_url`. **This is the right default for almost everything.**
- `video` — run end-to-end to a final MP4; `result.video_url` set on completion (slow).

Optional fields: `language` (`English` | `Vietnamese`, generative only), `video_length` (`short` ≈90s | `normal` ≈3 min, generative only), `research_enabled` (bool, generative only), `callback_url` (HTTPS webhook), `metadata` (object echoed back on status), `wait_for_render` (bool — block up to 60s before returning).

## Choosing `output_type` (guidance for agents)

Pick by `source` + intent — this avoids slow renders and gives the user a review step:

- `source=idea` / `blog` (or any content you generated) → use **`text`** (returns the AI-written script for the user to review/edit; that script IS what the narrator says).
- `source=video_url` / `audio_url` → use **`scene`**.
- `source=text` → use **`scene`** (`text` is invalid when the script is already supplied).
- Use **`video` ONLY when the user explicitly asks for the FINAL finished MP4.** "Make a video" on its own means `scene` (review-first), NOT `video` — rendering is slow and users usually export from the UI after reviewing.
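
The guidance above reduces to a small decision function. This is a sketch: `choose_output_type` and its `wants_final_mp4` flag are hypothetical names, and falling back to `scene` for the file sources is an assumption based on `scene` being the documented default.

```python
def choose_output_type(source: str, wants_final_mp4: bool = False) -> str:
    """Pick output_type from the source and whether a final MP4 was explicitly requested."""
    if wants_final_mp4:
        # Only when the user explicitly asks for the finished MP4.
        return "video"
    if source in ("idea", "blog"):
        # Generated content: return the script so the user can review or edit it.
        return "text"
    # text, video_url, audio_url (and, by assumption, the file sources).
    return "scene"
```

A bare "make a video" request therefore maps to `choose_output_type("text")`, which returns `scene`, not `video`.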

## Status, polling & results

- `status` enum (LOCKED): `pending | processing | completed | failed`. (`pending` only covers the brief create→worker race; treat it like `processing`.) Branch on `status`, never on the free-form `stage`.
- `result.review_url` (e.g. `https://gubo.ai/#scene_editor?topic_id=gubo...`) — where the user reviews/edits scenes or the script. Present on `text`/`scene`/`video` completion. **Append `&readonly=true`** for a public, sign-in-free, **iframe-embeddable** read-only player — embed that variant to show the result inline; the plain URL is the editor.
- `result.video_url` — the final MP4. Only set when `output_type:"video"` finished, or after `export_video`.
- A status poll should never 404 right after create — the server tracks issued ids for ~10 min and returns `pending` during the race.
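
One way to act on a status response is sketched below. It branches on `status` only, as advised above. `next_step` and `readonly_url` are hypothetical helpers, the action names are invented for illustration, and the code assumes `status` and `result` are top-level fields of the response.

```python
from typing import Any, Dict


def readonly_url(review_url: str) -> str:
    """Embeddable, sign-in-free variant of a review URL."""
    if "readonly=true" in review_url:
        return review_url
    return review_url + "&readonly=true"


def next_step(body: Dict[str, Any]) -> Dict[str, Any]:
    """Map a status response to what the client should do next."""
    status = body.get("status")
    result = body.get("result") or {}
    if status in ("pending", "processing"):
        return {"action": "wait"}
    if status == "failed":
        return {"action": "report_failure"}
    if status == "completed":
        if result.get("video_url"):
            return {"action": "download", "url": result["video_url"]}
        if result.get("review_url"):
            return {
                "action": "review",
                "url": result["review_url"],  # editor link for the user
                "embed_url": readonly_url(result["review_url"]),  # iframe-friendly player
            }
    raise ValueError(f"unexpected status response: {body!r}")
```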

## Errors

Envelope: `{ "error": { "type", "code", "message", "param?", "doc_url", "request_id" } }`. Surface `error.message` + `request_id` to users. Codes (v0.1.0, see OpenAPI for the live list):

- Auth (401): `missing_api_key`, `invalid_api_key`.
- Validation (400): `invalid_source`, `invalid_output_type`, `script_too_short`, `script_too_long`, `missing_idea_text`, `idea_too_short`, `missing_blog_text`, `blog_too_short`, `missing_video_url`, `missing_audio_url`, `missing_media_file`, `unsupported_media_url`, `media_too_long`, `file_too_large`, `invalid_language`, `invalid_video_length`, `invalid_research_enabled`.
- Render/runtime: `account_expired`, `credit_exhausted`, `render_failed`, `unknown_error`, `scenes_not_ready`, `export_failed`.
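
A minimal sketch of surfacing the envelope to a user follows. `describe_error` and `is_auth_error` are hypothetical helpers, the fallback strings are invented, and the sample values in the usage note are illustrative rather than real server output.

```python
from typing import Any, Dict

AUTH_CODES = {"missing_api_key", "invalid_api_key"}  # the 401 codes listed above


def describe_error(body: Dict[str, Any]) -> str:
    """Build a user-facing line from error.message and error.request_id."""
    err = body.get("error") or {}
    message = err.get("message", "Unknown error")
    request_id = err.get("request_id", "unavailable")
    return f"{message} (request_id: {request_id})"


def is_auth_error(body: Dict[str, Any]) -> bool:
    """True when the error code indicates a missing or invalid API key."""
    err = body.get("error") or {}
    return err.get("code") in AUTH_CODES
```

Given an envelope with `code` set to `script_too_long`, a `message`, and a `request_id`, `describe_error` returns the message followed by the request id in parentheses, which is the pair this section asks clients to show.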

## Full spec & try-it

- [OpenAPI 3.1 — JSON](https://widecast.ai/openapi.json) and [YAML](https://widecast.ai/openapi.yaml) — **canonical**; pilot copy at `https://gubo.ai/app/dashboard2/openapi.json`.
- [Interactive playground](https://widecast.ai/playground.html) — submits + auto-polls + shows the review URL in one click.
- [Docs](https://widecast.ai/docs.html) and [Create-video reference](https://widecast.ai/endpoints/create-video.html).
- [Use with AI (Claude / ChatGPT / Gemini setup)](https://widecast.ai/use-with-ai.html) — wiring this API into an AI host.

## SDKs & agent integrations

- **Python SDK** — `widecast` (thin, single-file; `video.wait()` handles polling). Repo: https://github.com/widecast
- **JS/TS SDK** — `@widecast/sdk` (Node 18+, Deno, Bun, browser).
- **MCP server** — `widecast/mcp-server/` in the repo. Self-contained (calls this REST API via `fetch`; only dep `@modelcontextprotocol/sdk`). Run from source: `npm install` then `node /abs/path/widecast/mcp-server/dist/index.js`. Exposes `widecast_create_video`, `widecast_wait_for_video`, `widecast_get_status`, `widecast_export_video`. (NOT published to npm yet — do not `npx @widecast/mcp-server`.)
- **Authoring Skills** — `widecast/skills/` (video-script / blog / social-post writing). Teach a model to write a strong script; pair with the MCP server (or this API) to turn it into a video. Skills = write well; the API/MCP = operate WideCast.

## Conventions (locked — changing = breaking)

- snake_case JSON fields; ID format `gubo<alphanumeric>` (clients must NOT strip the `gubo` prefix; public id == internal `topic_id`).
- API key prefix `wc_live_*`. Every resource has an `"object"` marker.
- Headers on every response: `X-Request-Id`, `X-RateLimit-*`, `X-WideCast-Version`. Timestamps ISO-8601 UTC.