useAiVoice Voice Interaction
Browser Compatibility
This feature uses the browser's native Web Speech API for speech-to-text. In Google Chrome, it only works in a secure context (HTTPS or localhost).
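The compatibility requirement above can be checked before rendering any voice UI. The following is a minimal sketch, not the hook's own detection logic: `canUseSpeechToText` and `SpeechGlobals` are hypothetical names, and the globals are passed in as an argument so the check can also run outside a browser.

```typescript
// Minimal shape of the globals we inspect (all optional, so a plain object works in tests).
interface SpeechGlobals {
  isSecureContext?: boolean;
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

// Hypothetical helper (not part of useAiVoice): reports whether speech-to-text is likely usable.
function canUseSpeechToText(env: SpeechGlobals): boolean {
  // Chrome exposes the recognition constructor under the webkit prefix.
  const hasRecognition =
    env.SpeechRecognition !== undefined || env.webkitSpeechRecognition !== undefined;
  // Per the note above, a secure context (HTTPS or localhost) is also required.
  return hasRecognition && env.isSecureContext === true;
}
```

In a browser, the `window` object would be the natural argument. The `sttSupported` value documented in the API section presumably covers the same ground, so a helper like this is only useful where the hook is not yet mounted.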
useAiVoice is a voice processing hook. It wraps the browser's native speech-to-text (STT) capabilities and includes a built-in Web Audio analysis engine that outputs real-time waveform data, which the AiVoiceTrigger component can use for dynamic visual feedback.
Key Features
- Visual Sync: Built-in frequency analyzer returns a real-time `amplitudes` array to drive waveform animations directly.
- Physical Recording: Built-in `MediaRecorder` generates a `.webm` audio blob after recording.
- Smart VAD: Supports Voice Activity Detection to automatically stop recording when the user finishes speaking.
- Real-time STT: Supports `interimResults` to provide partial transcripts while the user is still speaking.
- Plug-and-Play: Automatically manages microphone permissions, AudioContext lifecycle, and proper cleanup.
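To illustrate the Visual Sync idea, here is one way frequency data could be reduced to a fixed number of bars. This is a sketch under assumptions, not the hook's actual implementation: `toAmplitudes` is a hypothetical helper, the input is assumed to be byte frequency data such as what `AnalyserNode.getByteFrequencyData` fills in, and the 0 to 1 normalization is an illustrative choice.

```typescript
// Hypothetical helper: averages frequency bins into `waveCount` bars, each normalized to 0..1.
function toAmplitudes(bins: Uint8Array, waveCount: number): number[] {
  if (waveCount <= 0 || bins.length === 0) return [];
  const bars: number[] = [];
  const step = bins.length / waveCount; // bins per bar (may be fractional)
  for (let i = 0; i < waveCount; i++) {
    const start = Math.floor(i * step);
    // Use at least one bin per bar, even when waveCount exceeds bins.length.
    const end = Math.max(start + 1, Math.floor((i + 1) * step));
    let sum = 0;
    for (let j = start; j < end; j++) sum += bins[j];
    // Byte data ranges from 0 to 255, so dividing by 255 yields a 0..1 value.
    bars.push(sum / (end - start) / 255);
  }
  return bars;
}
```

Averaging adjacent bins keeps the bar count independent of the analyser's FFT size, which is why a `waveCount`-style option can stay a simple number.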
Basic Usage
Basic Interaction Demo
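A basic interaction usually comes down to a microphone button that starts or stops recording. The sketch below shows that toggle against the fields documented in the API section (`isRecording`, `start`, `stop`, `sttSupported`). The `toggleVoice` function and the `VoiceControls` interface are hypothetical, and `Ref` is a local stand-in for Vue's type so the snippet runs on its own.

```typescript
// Local stand-in for Vue's Ref so the sketch runs without importing vue.
type Ref<T> = { value: T };

// Subset of the documented useAiVoice return value used by this sketch.
interface VoiceControls {
  isRecording: Ref<boolean>;
  start: () => void;
  stop: () => void;
  sttSupported: boolean;
}

// Hypothetical click handler for a microphone button.
function toggleVoice(voice: VoiceControls): "unsupported" | "started" | "stopped" {
  // Bail out early when the browser lacks speech recognition.
  if (!voice.sttSupported) return "unsupported";
  if (voice.isRecording.value) {
    voice.stop();
    return "stopped";
  }
  voice.start();
  return "started";
}
```

In a component, the object returned by `useAiVoice()` would be passed as `voice`. The returned string is only there to make the outcome easy to display or log.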
Integration with AiSender
Integrate the voice trigger into AiSender for an input experience similar to mainstream AI assistants.
Integrated Input Field
Advanced Case: Spheromorphism AI Chat
Full Voice Chat Loop
API
UseAiVoiceOptions
| Property | Description | Type | Default |
|---|---|---|---|
| language | Recognition language | string | 'zh-CN' |
| interimResults | Request partial results | boolean | true |
| continuous | Continuous recognition | boolean | false |
| vad | Enable Smart Silence Detection | boolean | true |
| vadThreshold | Silence duration (ms) after which recording stops automatically | number | 2000 |
| waveCount | Number of amplitude bars | number | 20 |
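The `vad` and `vadThreshold` options describe silence-based auto-stop. A minimal sketch of that idea follows, assuming loudness samples on a 0 to 1 scale. `SilenceDetector` and its `minLevel` floor are hypothetical and do not reflect how the hook detects speech internally.

```typescript
// Hypothetical silence tracker: reports when audio has stayed quiet for `vadThreshold` ms.
class SilenceDetector {
  private lastVoiceAt: number | null = null;
  private readonly vadThreshold: number;
  private readonly minLevel: number;

  // vadThreshold mirrors the documented default; minLevel is an assumed loudness floor.
  constructor(vadThreshold = 2000, minLevel = 0.05) {
    this.vadThreshold = vadThreshold;
    this.minLevel = minLevel;
  }

  // Feed one loudness sample with its timestamp; returns true when recording should stop.
  update(level: number, nowMs: number): boolean {
    if (level >= this.minLevel) {
      this.lastVoiceAt = nowMs; // speech heard, restart the silence window
      return false;
    }
    // Never stop before the user has said anything.
    if (this.lastVoiceAt === null) return false;
    return nowMs - this.lastVoiceAt >= this.vadThreshold;
  }
}
```

Passing the timestamp in, rather than reading a clock inside the class, keeps the logic deterministic and easy to test.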
Return Value
| Export | Description | Type |
|---|---|---|
| isRecording | Reactive recording state | Ref<boolean> |
| transcript | Confirmed final text | Ref<string> |
| interimTranscript | Real-time partial text | Ref<string> |
| audioBlob | Generated audio file | Ref<Blob> |
| amplitudes | Real-time waveform data for AiVoiceTrigger | Ref<number[]> |
| start | Start recognition | () => void |
| stop | Stop and get results | () => void |
| sttSupported | Browser support check | boolean |
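Since `transcript` holds confirmed text and `interimTranscript` holds text that may still change, a UI typically shows both together. The helper below is a small sketch of that composition. `displayText` is a hypothetical name, it takes plain strings (the `.value` of each ref), and the separator is an assumption that may need to be empty for languages written without spaces.

```typescript
// Hypothetical formatter: confirmed text first, then the still-changing partial text.
function displayText(transcript: string, interimTranscript: string, separator = " "): string {
  // Drop empty parts so no stray separator appears at either end.
  return [transcript.trim(), interimTranscript.trim()]
    .filter((part) => part.length > 0)
    .join(separator);
}
```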