useAiVoice Voice Interaction

Browser Compatibility

This feature uses the browser's native Web Speech API for speech-to-text. In Google Chrome it only works in a secure context (HTTPS or localhost).

useAiVoice is a fully integrated voice-processing hook. It encapsulates the browser's native speech-to-text (STT) capability and adds a built-in Web Audio analysis engine that outputs real-time, high-fidelity waveform data, pairing directly with the AiVoiceTrigger component for dynamic visual feedback.

Key Features

  • Visual Sync: A built-in frequency analyzer returns a real-time amplitudes array that drives waveform animations directly.
  • Physical Recording: A built-in MediaRecorder produces a .webm audio Blob when recording stops.
  • Smart VAD: Voice Activity Detection automatically stops recording once the user finishes speaking.
  • Real-time STT: Supports interimResults, providing partial transcripts while the user is still speaking.
  • Plug-and-Play: Automatically manages microphone permission requests, the AudioContext lifecycle, and cleanup.

Basic Usage

[Live demo: Basic Interaction Demo]
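A minimal usage sketch of the hook driving an AiVoiceTrigger. The package name in the import and the AiVoiceTrigger prop names are assumptions; adjust them to your setup:

```vue
<script setup lang="ts">
// Package name is assumed; adjust the import to your installation.
import { useAiVoice, AiVoiceTrigger } from 'vue-ai-ui'

const {
  isRecording, transcript, interimTranscript,
  amplitudes, start, stop, sttSupported,
} = useAiVoice({
  language: 'en-US',    // defaults to 'zh-CN'
  interimResults: true, // stream partial transcripts while speaking
})
</script>

<template>
  <p v-if="!sttSupported">⚠️ Web Speech API not supported (Chrome recommended).</p>
  <template v-else>
    <!-- amplitudes drives the waveform animation; prop names are assumptions -->
    <AiVoiceTrigger
      :recording="isRecording"
      :amplitudes="amplitudes"
      @click="isRecording ? stop() : start()"
    />
    <p>{{ transcript }} <em>{{ interimTranscript }}</em></p>
  </template>
</template>
```

Confirmed text accumulates in transcript, while interimTranscript holds the in-flight partial result.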

Integration with AiSender

Integrate the voice trigger into AiSender for an input experience similar to mainstream AI assistants.

[Live demo: Integrated Input Field]
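One way to wire the two together is to watch the transcript and copy it into the sender's model. The package name, the #suffix slot, and AiSender's v-model contract are assumptions here; consult the AiSender docs for the actual API:

```vue
<script setup lang="ts">
import { ref, watch } from 'vue'
// Package name is assumed; adjust the import to your installation.
import { useAiVoice, AiSender, AiVoiceTrigger } from 'vue-ai-ui'

const input = ref('')
const { isRecording, transcript, amplitudes, start, stop } = useAiVoice()

// Mirror the confirmed transcript into the input field as it arrives.
watch(transcript, (text) => { input.value = text })
</script>

<template>
  <!-- The #suffix slot name is hypothetical; check the AiSender docs. -->
  <AiSender v-model="input">
    <template #suffix>
      <AiVoiceTrigger
        :recording="isRecording"
        :amplitudes="amplitudes"
        @click="isRecording ? stop() : start()"
      />
    </template>
  </AiSender>
</template>
```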

Advanced Case: Spheromorphism AI Chat

[Live demo: Full Voice Chat Loop]
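The chat loop can be sketched as: let VAD stop the recording, watch for the final transcript, send it to your chat backend, and render the reply. The sendToAssistant function and its /api/chat endpoint are placeholders for your own API call, and the package name is assumed:

```vue
<script setup lang="ts">
import { ref, watch } from 'vue'
// Package name is assumed; adjust the import to your installation.
import { useAiVoice } from 'vue-ai-ui'

const messages = ref<{ role: 'user' | 'assistant'; text: string }[]>([
  { role: 'assistant', text: 'Hello! I am your voice assistant. What would you like to talk about?' },
])

const { transcript, start } = useAiVoice({ vad: true, vadThreshold: 2000 })

// Placeholder for your own chat backend call.
async function sendToAssistant(text: string): Promise<string> {
  const res = await fetch('/api/chat', { method: 'POST', body: JSON.stringify({ text }) })
  return (await res.json()).reply
}

// With VAD enabled, recording stops on its own after ~2s of silence and
// `transcript` settles on the final text; send it and append the reply.
watch(transcript, async (text) => {
  if (!text) return
  messages.value.push({ role: 'user', text })
  messages.value.push({ role: 'assistant', text: await sendToAssistant(text) })
})
</script>
```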

API

UseAiVoiceOptions

| Property | Description | Type | Default |
| --- | --- | --- | --- |
| `language` | Recognition language | `string` | `'zh-CN'` |
| `interimResults` | Request partial results | `boolean` | `true` |
| `continuous` | Continuous recognition | `boolean` | `false` |
| `vad` | Enable smart silence detection | `boolean` | `true` |
| `vadThreshold` | Silence duration before auto-stop (ms) | `number` | `2000` |
| `waveCount` | Number of amplitude bars | `number` | `20` |
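To make `vad` and `vadThreshold` concrete, here is a minimal sketch of a silence-based auto-stop decision. This is illustrative only: the helper and the silence amplitude cutoff are hypothetical, not the hook's internal implementation.

```typescript
interface VadState { lastVoiceAt: number }

// Amplitude below this counts as silence (assumed cutoff, 0..1 scale).
const SILENCE_LEVEL = 0.05

// Returns true when recording should stop: VAD is enabled and the user
// has been silent for longer than `vadThreshold` milliseconds.
function shouldAutoStop(
  state: VadState,
  amplitude: number,   // current normalized amplitude, 0..1
  now: number,         // current time in ms
  vad = true,
  vadThreshold = 2000, // default matches the options table above
): boolean {
  if (amplitude >= SILENCE_LEVEL) {
    state.lastVoiceAt = now // voice detected: reset the silence timer
  }
  return vad && now - state.lastVoiceAt > vadThreshold
}
```

With the default `vadThreshold` of 2000, roughly two seconds of sustained silence triggers the stop.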

Return Value

| Export | Description | Type |
| --- | --- | --- |
| `isRecording` | Reactive recording state | `Ref<boolean>` |
| `transcript` | Confirmed final text | `Ref<string>` |
| `interimTranscript` | Real-time partial text | `Ref<string>` |
| `audioBlob` | Generated audio file | `Ref<Blob>` |
| `amplitudes` | Real-time waveform data for AiVoiceTrigger | `Ref<number[]>` |
| `start` | Start recognition | `() => void` |
| `stop` | Stop and get results | `() => void` |
| `sttSupported` | Browser support check | `boolean` |

Released under the MIT License.