1 point by lu794377 | 1 comment

Hi HN, we’ve been working on Veo 3.1, an AI video model that turns text or image prompts into cinematic video clips with built-in audio and scene control. It’s the next step after Veo 3, focused on longer-form storytelling and more flexible directing.

Veo 3.1 builds on transformer-based video diffusion with improved temporal consistency and sound integration.

Key features:

Extended 30-Second Clips: Generate longer cinematic scenes with natural pacing and narrative flow.

1080p & Vertical Format: Outputs in full HD and native 9:16 for social or professional use.

Stronger Scene Consistency: Improved lighting, framing, and character stability across shots.

Multi-Shot Orchestration: Create multi-shot sequences with transitions and camera motion control.

Built-In Audio & Lip-Sync: Automatically syncs dialogue and ambient sound for cinematic realism.

Why we built this: We wanted to move beyond short clips toward full cinematic storytelling, where a single prompt can define a sequence of shots, pacing, and sound without external tools.

We’d love your feedback on:

What kinds of projects longer 30-second clips could unlock

How you’d like to control multi-shot sequencing or camera motion

What integrations (APIs, plug-ins, etc.) would make Veo 3.1 easier to build with
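
To make the multi-shot and API questions concrete, here is a rough sketch of what a shot-list spec could look like on the client side. Everything in it is hypothetical: the `Shot` and `ShotList` names, the `camera` and `transition` fields, and the JSON shape are illustrations for discussion, not Veo 3.1's actual interface. The only values taken from the post are the 30-second cap and the 9:16 format.

```python
import json
from dataclasses import dataclass, field, asdict

# From the post: clips of up to 30 seconds. Treating this as a hard cap on
# the whole sequence is an assumption made for this sketch.
MAX_CLIP_SECONDS = 30.0


@dataclass
class Shot:
    """One shot in a sequence (hypothetical fields, not a real Veo API)."""
    prompt: str
    seconds: float
    camera: str = "static"    # e.g. "dolly_in", "pan_left" (made-up values)
    transition: str = "cut"   # how this shot hands off to the next one


@dataclass
class ShotList:
    """An ordered multi-shot sequence plus output settings (hypothetical)."""
    shots: list = field(default_factory=list)
    aspect_ratio: str = "9:16"  # vertical format mentioned in the post

    def total_seconds(self) -> float:
        return sum(shot.seconds for shot in self.shots)

    def validate(self) -> None:
        """Raise ValueError if the sequence is empty, has a non-positive
        shot duration, or exceeds the 30-second cap."""
        if not self.shots:
            raise ValueError("a sequence needs at least one shot")
        for i, shot in enumerate(self.shots):
            if shot.seconds <= 0:
                raise ValueError(f"shot {i} has a non-positive duration")
        total = self.total_seconds()
        if total > MAX_CLIP_SECONDS:
            raise ValueError(
                f"sequence runs {total}s, over the {MAX_CLIP_SECONDS}s cap"
            )

    def to_json(self) -> str:
        """Validate, then serialize to a JSON payload (shape is illustrative)."""
        self.validate()
        return json.dumps(asdict(self), indent=2)
```

A caller would build something like `ShotList(shots=[Shot("wide shot of a harbor at dawn", 10.0), Shot("close-up of a fisherman", 12.5, camera="dolly_in", transition="dissolve")])` and send the output of `to_json()` to whatever endpoint exists. We’d be curious whether a declarative list like this, a timeline-style editor, or inline prompt directives would feel most natural.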