Text-to-Audio models transform written scripts into spoken dialogue or narration. In Automat Studio, these models power the dialogue generation workflow, which generates dialogue audio for your shots directly from script text, without recording sessions.

When to Use

  • Dialogue Generation — Create character dialogue from script text
  • Narration — Generate voiceover or narration tracks
  • Rapid Prototyping — Test dialogue pacing and delivery before final recording
  • Multiple Takes — Generate variations of the same line for director review

Recommended Models

These models consistently deliver the best results for voice generation:

ElevenLabs v3

  • Credits: 2
  • Rating: ⭐⭐⭐⭐⭐ (5/5)
  • Provider: fal.ai
  • Avg. Duration: ~28 seconds
  • Best for: Natural-sounding dialogue with excellent voice quality and emotion

Supported Models

No additional supported models are currently available for this workflow.

Tips for Best Results

  1. Break Long Scripts — Process scenes separately to manage file sizes
  2. Specify Emotion and Tone — Include delivery cues in your prompts (pacing, emotion, intensity)
  3. Generate Multiple Takes — Create several versions to offer director options
  4. Match Character Voices — Use consistent voice profiles for each character
  5. Preview Before Using — Listen to generated audio before integrating into shots
  6. Clean Audio Preferred — Script text that contains only the spoken dialogue works better than text mixed with sound-effect descriptions
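
Several of the tips above (breaking scripts by scene, adding delivery cues, generating multiple takes, and keeping non-dialogue text out) can be sketched as a small script-preparation step. This is only an illustration in plain Python and does not call Automat Studio or any provider API. The `NAME (cue): dialogue` line format, the `TakeRequest` structure, the function names, and the bracketed cue prefix are all assumptions made for this example; check which cue syntax your chosen model actually accepts.

```python
import re
from dataclasses import dataclass

# Screenplay scene headings (sluglines) conventionally start with INT. or EXT.
SCENE_HEADING = re.compile(r"^(INT|EXT)\.", re.IGNORECASE)

# Hypothetical dialogue format for this sketch: "NAME (optional cue): dialogue"
DIALOGUE = re.compile(
    r"^(?P<name>[A-Z][A-Z ]*?)\s*(?:\((?P<cue>[^)]*)\))?:\s*(?P<text>.+)$"
)


@dataclass
class TakeRequest:
    scene: str      # scene heading the line belongs to
    character: str  # speaker name, useful for picking a consistent voice profile
    text: str       # line to synthesize, with the delivery cue prepended if present
    take: int       # 1-based take number


def split_scenes(script: str) -> dict[str, list[str]]:
    """Group non-empty script lines under the scene heading that precedes them."""
    scenes: dict[str, list[str]] = {}
    current = "UNTITLED"
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        if SCENE_HEADING.match(line):
            current = line
            scenes.setdefault(current, [])
        else:
            scenes.setdefault(current, []).append(line)
    return scenes


def build_takes(script: str, takes: int = 3) -> list[TakeRequest]:
    """Return one request per dialogue line per take, scene by scene."""
    requests: list[TakeRequest] = []
    for scene, lines in split_scenes(script).items():
        for line in lines:
            match = DIALOGUE.match(line)
            if match is None:
                continue  # action or sound-effect description: not sent for voicing
            cue = match.group("cue")
            text = match.group("text")
            # Assumption: the cue is passed as a bracketed prefix on the text.
            prompt = f"[{cue}] {text}" if cue else text
            for take in range(1, takes + 1):
                requests.append(TakeRequest(scene, match.group("name"), prompt, take))
    return requests


SAMPLE = """
INT. WAREHOUSE - NIGHT
MARA (whispering, tense): We shouldn't be here.
A door slams somewhere below.
JON: Relax.
EXT. ROOFTOP - DAWN
MARA (relieved): We made it.
"""

if __name__ == "__main__":
    for request in build_takes(SAMPLE, takes=2):
        print(request)
```

Each `TakeRequest` corresponds to one generation, so the resulting list can be reviewed scene by scene before any credits are spent, and the `character` field gives a single place to map each speaker to the same voice profile every time.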