Configure TTS Engines
Text-to-speech (TTS) engines convert typed text into audio for system announcements, dynamic IVR responses, and notifications. UnifiedBX ships with TTS support: you configure an engine (Google, Polly, Flite, etc.), and modules that support TTS can then call it.
Before You Start
- You've chosen an engine: Flite (free, low quality, on-host), Google Cloud TTS (paid, high quality, online), AWS Polly (paid, high quality, online), Sangoma TTS (paid, hosted), or others.
- For paid engines: API credentials.
- For Flite: package installed on the host (yum install flite, or the equivalent for your distribution).
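To confirm the Flite prerequisite before adding the engine, a quick check along the following lines may help. This is a minimal sketch, not part of UnifiedBX: the helper names `flite_available` and `flite_command` are hypothetical, and it assumes the `flite` binary is on the PATH and accepts its usual `-t`, `-o`, and `-voice` options.

```python
import shutil


def flite_available():
    """Return True if a `flite` executable is found on the PATH."""
    return shutil.which("flite") is not None


def flite_command(text, out_path, voice=None):
    """Build the argument list for a Flite run that writes `text` to a WAV file.

    `voice` is optional (for example "slt"); when omitted, Flite uses its default.
    """
    cmd = ["flite"]
    if voice:
        cmd += ["-voice", voice]
    cmd += ["-t", text, "-o", out_path]
    return cmd
```

If `flite_available()` returns True, passing the result of `flite_command("Flite is installed.", "/tmp/flite-test.wav")` to `subprocess.run(..., check=True)` should produce a short test file you can listen to.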
Steps
- Go to System Attributes → Text To Speech Engines (sometimes labeled TTS or TTS Engines).
- Click + Add Engine.
- Pick the engine type:
  - Flite: free, no credentials needed. Voice options are limited (kal_diphone, slt, etc.).
  - Google Cloud Text-to-Speech: provide a service account JSON.
  - AWS Polly: provide an AWS access key, secret, and region.
  - Sangoma TTS: uses your Sangoma account.
- Configure engine-specific options:
  - For Google: paste the service-account JSON and pick a voice (e.g. en-US-Wavenet-D).
  - For Polly: pick a voice (Joanna, Matthew, etc.) and region.
- Click Submit.
- Click Apply Config.
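A malformed or wrong-type key is a common reason a Google engine fails after submission, so it can be worth sanity-checking the service-account JSON before pasting it in. The sketch below is illustrative only: the function name and `REQUIRED_FIELDS` tuple are hypothetical, the field list reflects what a standard Google Cloud service-account key file contains, and the check does not contact Google or prove the key is valid.

```python
import json

# Fields found in a standard Google Cloud service-account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def check_service_account_json(raw):
    """Return a list of problems; an empty list means the key looks plausible."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    # A user credential or API key file pasted by mistake has a different type.
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']!r}")
    return problems
```

Run it on the contents of the downloaded key file; any reported problem suggests the wrong file was downloaded or the text was truncated when copied.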
Use the TTS engine
Modules that support TTS will offer the engine as a source. Examples:
- Announcements — instead of a recorded file, type text.
- Voicemail email — TTS rendering of voicemail content (separate feature).
- Outroutemsg — outbound route messages.
For ad-hoc TTS in a System Recording, some versions of UnifiedBX let you type text on the recording-add page; otherwise, generate the audio externally and upload it as a WAV file.
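When generating audio externally, it may be worth inspecting the file before uploading it. The sketch below uses Python's standard `wave` module to report a file's basic properties. The "telephony-friendly" check (mono, 16-bit, 8 kHz) is an assumption based on what many telephony systems commonly expect, not a documented UnifiedBX requirement, and both function names are hypothetical.

```python
import wave


def describe_wav(path):
    """Read the header of a PCM WAV file and return its basic properties."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "seconds": wav.getnframes() / rate,
        }


def is_telephony_friendly(info, sample_rate_hz=8000):
    """Assumption: mono, 16-bit audio at the given rate is the safe target."""
    return (
        info["channels"] == 1
        and info["bits_per_sample"] == 16
        and info["sample_rate_hz"] == sample_rate_hz
    )
```

If the check fails, converting the file with an audio tool before uploading is usually simpler than troubleshooting playback afterwards.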
Verify
Add a test Announcement using the TTS engine, point an inbound route at it, and call. The synthesized voice should play.
Common Issues
- No audio / fallback to default voice. The engine credentials are wrong or the account is rate-limited. Check the engine logs.
- Flite voice sounds robotic. That's Flite — switch to Google or Polly for natural voices.
- Long text cuts off. Some engines have per-call character limits. Split into smaller chunks.
- Slow first-play. TTS engines synthesize on-demand and cache. First play hits the API; subsequent plays are cached.
- API quota exceeded. Online engines bill per character. Watch usage; cache aggressively.
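For audio generated outside UnifiedBX, the chunking and caching advice above can be sketched as follows. This is an illustration under stated assumptions, not how UnifiedBX caches internally: `split_text` and `cache_key` are hypothetical helpers, and the right `max_chars` value depends on the engine, so check its documentation for the real limit.

```python
import hashlib
import re


def split_text(text, max_chars):
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def cache_key(engine, voice, text):
    """Derive a stable file name so identical requests reuse the same audio."""
    digest = hashlib.sha256(f"{engine}\n{voice}\n{text}".encode("utf-8")).hexdigest()
    return f"tts-{digest[:16]}.wav"
```

Synthesizing each chunk only when no file named `cache_key(engine, voice, chunk)` already exists keeps repeated prompts from being billed twice.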