Overview
The Multi-Provider Realtime Voice AI module provides live conversational AI by supporting both the OpenAI Realtime API and Google Gemini Live, enabling natural voice interactions for dispatch operations, field reporting, and public-facing communication channels. The system manages provider selection, voice resolution, greeting injection timing, and session lifecycle across providers while presenting a single, unified interface to consuming applications.
By abstracting multiple voice AI providers behind a single orchestration layer, organizations gain provider redundancy, access to each provider's strongest voices, and the ability to route each conversation to the provider that best fits its language, latency requirements, and cost constraints.
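The routing idea described above can be sketched as a filter-then-rank step. This is a minimal illustration, not the module's actual routing logic: `ProviderProfile`, `select_provider`, and every field name are hypothetical, and any latency or cost figures passed in are placeholders rather than measured values.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class ProviderProfile:
    """Hypothetical description of one voice provider (not a real API type)."""
    name: str
    languages: FrozenSet[str]   # language codes the provider is configured for
    typical_latency_ms: int     # placeholder figure supplied by the operator
    cost_per_minute: float      # placeholder figure supplied by the operator


def select_provider(
    profiles: Iterable[ProviderProfile],
    language: str,
    max_latency_ms: int,
    max_cost_per_minute: float,
) -> Optional[ProviderProfile]:
    """Return the lowest-latency provider that satisfies all constraints.

    Ties on latency are broken by lower cost. Returns None when no
    provider supports the language within the latency and cost limits.
    """
    candidates = [
        p for p in profiles
        if language in p.languages
        and p.typical_latency_ms <= max_latency_ms
        and p.cost_per_minute <= max_cost_per_minute
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (p.typical_latency_ms, p.cost_per_minute))
```

Ranking by latency first is only one possible policy; a deployment that prioritizes cost could swap the order of the sort key.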
Key Features
- OpenAI Realtime API Integration -- Full support for OpenAI's generally available (GA) Realtime API with streaming audio input and output, function calling during voice sessions, and configurable voice personas
- Google Gemini Live Integration -- Native integration with Google Gemini Live for multimodal voice interactions, with optimized greeting injection timing and voice resolution to ensure natural conversation flow
- Provider-Agnostic Interface -- A unified API abstracts provider-specific protocols, enabling applications to initiate voice sessions without coupling to a specific provider's SDK or session management model
- Automatic Provider Failover -- Real-time health monitoring of voice providers with automatic session migration when a provider experiences degraded quality or availability
- Voice Persona Management -- Configure and manage voice personas with provider-specific voice selection, speaking rate, pitch adjustment, and personality prompting for consistent brand representation
- Secure API Key Management -- Provider API keys are stored and rotated through the platform's secrets management system, never exposed to client applications, with per-tenant key isolation
- Session Analytics -- Track voice session duration, provider utilization, latency metrics, turn-taking patterns, and user satisfaction signals for continuous optimization
- Tenant Isolation -- Complete separation of voice sessions, API credentials, usage quotas, and conversation history between tenants with no cross-tenant data leakage
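To make the provider-agnostic interface and failover features more concrete, here is a minimal sketch of how an orchestration layer might hide providers behind one abstract type and skip unhealthy ones. All names (`VoiceProvider`, `StubProvider`, `VoiceOrchestrator`, `start_session`) are hypothetical and do not reflect the module's real API. The sketch covers failover at session start only; the mid-session migration described above would need additional state transfer that is not shown.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class VoiceProvider(ABC):
    """Hypothetical provider-agnostic contract; real adapters would wrap an SDK."""

    name: str

    @abstractmethod
    def is_healthy(self) -> bool:
        """Report whether the provider currently accepts new sessions."""

    @abstractmethod
    def open_session(self, tenant_id: str, persona: str) -> str:
        """Open a voice session and return an opaque session identifier."""


class StubProvider(VoiceProvider):
    """In-memory stand-in used only to make this sketch runnable."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy

    def is_healthy(self) -> bool:
        return self.healthy

    def open_session(self, tenant_id: str, persona: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        return f"{self.name}:{tenant_id}:{persona}"


class VoiceOrchestrator:
    """Tries providers in priority order and falls back on failure."""

    def __init__(self, providers: Iterable[VoiceProvider]) -> None:
        self._providers: List[VoiceProvider] = list(providers)

    def start_session(self, tenant_id: str, persona: str) -> str:
        for provider in self._providers:
            if not provider.is_healthy():
                continue  # skip providers flagged as degraded
            try:
                return provider.open_session(tenant_id, persona)
            except ConnectionError:
                continue  # provider failed after the health check; try the next
        raise RuntimeError("no healthy voice provider available")
```

A caller would construct the orchestrator once with providers in priority order, for example `VoiceOrchestrator([StubProvider("primary"), StubProvider("secondary")])`, and call `start_session` without knowing which provider ends up serving the session.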
Use Cases
- AI-Assisted Dispatch -- Dispatchers interact with AI through natural voice to query case details, update incident status, and receive situation briefings without leaving their communication workflow
- Field Reporting -- Officers and field agents dictate reports through voice AI that structures the narrative into standardized report formats with entity extraction and automatic case linking
- Public Communication -- Automated voice interfaces handle routine public inquiries, triage incoming calls, and escalate complex requests to human operators with full conversation context
- Multilingual Operations -- Voice AI provides real-time interpretation and translation during cross-language interactions, with provider selection optimized for the language pair involved
Integration
This module connects to the AI/LLM orchestration layer for prompt management and safety guardrails, the authentication service for session authorization, and the PSAP dispatch system for operational voice workflows. Voice session transcripts feed into the case management and audit logging systems for record keeping.
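As an illustration of how a session transcript might be handed to audit logging, the sketch below serializes a transcript into a single JSON record. The record shape, the `voice_session_transcript` event name, and all type and field names are assumptions made for this example; the actual schema used by the case management and audit logging systems is not documented here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TranscriptTurn:
    """One utterance in a voice session (hypothetical shape)."""
    speaker: str
    text: str
    started_at: str  # ISO 8601 timestamp supplied by the caller


@dataclass(frozen=True)
class SessionTranscript:
    """Transcript of one voice session (hypothetical shape)."""
    session_id: str
    tenant_id: str
    provider: str
    case_id: Optional[str]  # None when the session is not linked to a case
    turns: Tuple[TranscriptTurn, ...]


def to_audit_record(transcript: SessionTranscript) -> str:
    """Serialize a transcript into one JSON audit record.

    Keys are sorted so identical transcripts always produce identical
    output, which simplifies comparison and deduplication.
    """
    payload = {
        "event": "voice_session_transcript",  # assumed event name
        **asdict(transcript),                 # nested turns become dicts
        "turn_count": len(transcript.turns),
    }
    return json.dumps(payload, sort_keys=True)
```

Carrying `tenant_id` on every record is one way to keep downstream storage aligned with the tenant isolation requirement listed under the key features.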
Availability
- Enterprise Plan: Full multi-provider voice AI included
- Professional Plan: Single-provider voice AI included; multi-provider support and advanced persona management available as an add-on
Last Reviewed: 2026-03-02