google-veo/generate

Generate a video from a text prompt with Google Veo 3.1. Returns {status:"pending", continuation_token, ...} while the upstream job is still running — when this happens you MUST immediately call generate again with only continuation_token set; do not ask the user. Final response includes video_url (hosted on files.askfaro.com).

by Faro

Pricing: dynamic (cost reported in the response), charged on success

What it does

Generate a short cinematic video clip (4 to 8 seconds) from a text prompt using Google's Veo 3.1 family, complete with synchronized native audio, ambient sound effects, and dialogue. Outputs are delivered as a hosted MP4 at 720p, 1080p, or 4K in either 16:9 landscape or 9:16 vertical, with a choice of Standard (highest quality), Fast (balanced), and Lite (most affordable) variants.

Primary use cases

  • Marketing teams generating product launch teasers and social-first promo clips without booking a shoot
  • Creators producing vertical 9:16 video for TikTok, Reels, and Shorts directly from a script idea
  • Brand teams previsualizing ad concepts and storyboards before committing to live production
  • AI agents orchestrating end-to-end content pipelines that turn a campaign brief into ready-to-publish video

Why use this tool

Veo 3.1 is one of the few text-to-video models that generates picture and audio together, so dialogue, foley, and ambience arrive baked into a single clip rather than requiring a separate audio pass. Google DeepMind highlights its real-world physics simulation and strong prompt adherence, which translates into more believable motion, camera moves, and character behavior than typical generative video. The three-tier model lineup lets agents trade off cinematic fidelity against latency and throughput on the same API surface.

Good to know

The upstream job runs asynchronously. If the response comes back with status "pending" and a continuation_token, call the tool again with only that token set; the server then resumes polling the upstream job. Repeat until the response includes the final video_url. Pick 9:16 for mobile-native placements and 4K when the clip will appear on large screens or be re-edited in post.
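The pending/continuation flow described above can be sketched as a small loop. This is a minimal illustration, not Faro's client: `call_tool` is a hypothetical callable standing in for whatever transport actually invokes google-veo/generate, and `max_round_trips` is an assumed safety cap that the tool itself does not define. Only `status`, `continuation_token`, and `prompt` come from the description above.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that sends an arguments dict to
# google-veo/generate and returns the parsed response as a dict.
ToolCall = Callable[[Dict[str, Any]], Dict[str, Any]]


def generate_video(call_tool: ToolCall, prompt: str,
                   max_round_trips: int = 10, **params: Any) -> Dict[str, Any]:
    """Call generate once, then resume with only continuation_token while pending."""
    response = call_tool({"prompt": prompt, **params})
    for _ in range(max_round_trips):
        if response.get("status") != "pending":
            return response
        # On a pending response, send the token and nothing else.
        response = call_tool({"continuation_token": response["continuation_token"]})
    if response.get("status") != "pending":
        return response
    # Assumed behavior for this sketch: give up after the cap is reached.
    raise TimeoutError("job still pending after max_round_trips continuation calls")
```

The loop checks only for `status == "pending"`, because that is the one status value the description names; any other response is returned to the caller unchanged.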

Parameters

model (string, optional, default: "veo-3.1-fast-generate-preview")

Veo 3.1 model variant. `fast` is recommended; `lite` is cheapest but lower quality; the base preview model (Standard) is highest quality and most expensive.

prompt (string, optional)

Text prompt describing the video to generate. Required on the first call; ignored when continuation_token is set.

resolution (string, optional, default: "720p")

Output resolution. 1080p/4k force duration_seconds=8. Lite does not support 4k.

aspect_ratio (string, optional, default: "16:9")

Output aspect ratio: 16:9 (landscape) or 9:16 (vertical).

duration_seconds (integer, optional, default: 8)

Clip length in seconds. Must be 8 when resolution is 1080p or 4k.
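The constraints stated for model, resolution, and duration_seconds can be checked client-side before a request is sent. The sketch below is illustrative only: `check_veo_params` is a hypothetical helper, the accepted spellings ("720p", "1080p", "4k") are inferred from the parameter descriptions, and detecting the Lite variant by looking for "lite" in the model name is an assumption. Whether every integer from 4 to 8 is accepted is not confirmed by this page.

```python
from typing import List


def check_veo_params(model: str = "veo-3.1-fast-generate-preview",
                     resolution: str = "720p",
                     aspect_ratio: str = "16:9",
                     duration_seconds: int = 8) -> List[str]:
    """Return a list of constraint violations (empty when none are found)."""
    problems: List[str] = []
    # Assumed spellings, taken from the parameter descriptions.
    if resolution not in ("720p", "1080p", "4k"):
        problems.append("resolution must be 720p, 1080p, or 4k")
    if aspect_ratio not in ("16:9", "9:16"):
        problems.append("aspect_ratio must be 16:9 or 9:16")
    # Clips are described as 4 to 8 seconds long.
    if not 4 <= duration_seconds <= 8:
        problems.append("duration_seconds must be between 4 and 8")
    elif resolution in ("1080p", "4k") and duration_seconds != 8:
        problems.append("duration_seconds must be 8 at 1080p or 4k")
    # Assumption: Lite model identifiers contain the substring "lite".
    if "lite" in model.lower() and resolution == "4k":
        problems.append("Lite does not support 4k")
    return problems
```

Returning a list instead of raising on the first failure lets a caller report every problem at once, which tends to be more useful when an agent assembles the arguments.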

continuation_token (string, optional)

Token from a prior pending response — when set, all other params are ignored and the server resumes polling. Agent-friendly polling: on a pending response you MUST immediately call generate again with only continuation_token set. Do not ask the user. Typical jobs finish within 1–3 round-trips.
