useAiVoice Voice Interaction
Browser Compatibility
This feature uses the browser's native Web Speech API for speech-to-text. In Google Chrome, it requires running in a secure context (HTTPS or localhost) to work properly.
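To fail gracefully where speech recognition is unavailable, an app can check both conditions before offering a voice button. The sketch below is an illustration, not the hook's internal logic: `SpeechEnv`, `SpeechSupport`, and `checkSpeechSupport` are hypothetical names, and the environment object is passed in so the check can also run outside a browser. It relies only on the standard `isSecureContext` flag and the `SpeechRecognition` / `webkitSpeechRecognition` globals.

```typescript
// Minimal shape of the globals we probe (hypothetical helper type).
interface SpeechEnv {
  isSecureContext?: boolean;
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

type SpeechSupport =
  | { supported: true }
  | { supported: false; reason: "insecure-context" | "no-speech-api" };

function checkSpeechSupport(env: SpeechEnv): SpeechSupport {
  // Chrome exposes the constructor under the webkit prefix.
  const hasApi = Boolean(env.SpeechRecognition ?? env.webkitSpeechRecognition);
  if (!hasApi) {
    return { supported: false, reason: "no-speech-api" };
  }
  // Only an explicit `false` is treated as insecure; a missing flag is not.
  if (env.isSecureContext === false) {
    return { supported: false, reason: "insecure-context" };
  }
  return { supported: true };
}
```

In a browser this might be called as `checkSpeechSupport(window as unknown as SpeechEnv)`. When the hook is already in use, its own `sttSupported` export is the simpler check.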
useAiVoice is a voice processing hook. It wraps the browser's native speech-to-text (STT) capabilities and includes a built-in Web Audio analysis engine that outputs real-time waveform data, which the AiVoiceTrigger component can consume for dynamic visual feedback.
Key Features
- Visual Sync: Built-in frequency analyzer returns a real-time `amplitudes` array to drive waveform animations directly.
- Physical Recording: Built-in `MediaRecorder` generates a `.webm` audio blob after recording.
- Smart VAD: Supports Voice Activity Detection to automatically stop recording when the user finishes speaking.
- Real-time STT: Supports `interimResults` to provide partial transcripts while the user is still speaking.
- Plug-and-Play: Automatically manages microphone permissions, AudioContext lifecycle, and proper cleanup.
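The Smart VAD behavior can be pictured as a silence timer. The following is a minimal sketch of one way such a detector could work, not the hook's actual implementation: `SilenceDetector` and `VadConfig` are hypothetical names, and it assumes volume samples are normalized to 0-1 and that the timer starts at the first sample. The two config fields mirror the documented `vadThreshold` and `volumeThreshold` options.

```typescript
interface VadConfig {
  vadThreshold: number;    // ms of continuous silence before stopping (documented default: 2000)
  volumeThreshold: number; // 0-1 level below which a sample counts as silent (documented default: 0.05)
}

class SilenceDetector {
  private readonly config: VadConfig;
  private lastLoudAt: number | null = null;

  constructor(config: VadConfig) {
    this.config = config;
  }

  // Feed one volume sample (0-1) with its timestamp in ms.
  // Returns true once silence has lasted at least `vadThreshold` ms.
  update(volume: number, nowMs: number): boolean {
    if (this.lastLoudAt === null || volume >= this.config.volumeThreshold) {
      // First sample, or the user is audible: restart the silence timer.
      this.lastLoudAt = nowMs;
      return false;
    }
    return nowMs - this.lastLoudAt >= this.config.vadThreshold;
  }
}
```

A caller would feed it one sample per animation frame and invoke `stop` the first time `update` returns true. Raising `volumeThreshold` makes the detector less sensitive to background noise; raising `vadThreshold` tolerates longer pauses between words.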
Basic Usage
Basic Interaction Demo
Integration with AiSender
Integrate the voice trigger into AiSender for an input experience similar to mainstream AI assistants.
Integrated Input Field
Advanced Case: Spheromorphism AI Chat
Full Voice Chat Loop
API
UseAiVoiceOptions
| Property | Description | Type | Default |
|---|---|---|---|
| language | Recognition language | string | 'zh-CN' |
| interimResults | Request partial results | boolean | true |
| continuous | Continuous recognition | boolean | false |
| vad | Enable Smart Silence Detection | boolean | true |
| vadThreshold | Time threshold for silence (ms) | number | 2000 |
| volumeThreshold | Volume sensitivity threshold (0-1) | number | 0.05 |
| waveCount | Number of amplitude bars | number | 20 |
| useSTT | Enable browser speech recognition | boolean | true |
| onStart | Callback when recording starts | () => void | - |
| onStop | Callback with transcript and audio blob | (transcript: string, blob: Blob \| null) => void | - |
| onResult | Callback for finalized transcript | (transcript: string) => void | - |
| onPartialResult | Callback for interim transcript | (transcript: string) => void | - |
| onError | Error callback | (error: unknown) => void | - |
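The defaults in the table can be mirrored in a small typed helper, for example to display the effective settings in a debug panel or to share one configuration across several call sites. This is a sketch under assumptions: `VoiceSettings`, `DEFAULT_SETTINGS`, and `resolveSettings` are hypothetical names that the library is not known to export, and the callback options are left out for brevity.

```typescript
// Non-callback options from the table above (hypothetical type name).
interface VoiceSettings {
  language: string;
  interimResults: boolean;
  continuous: boolean;
  vad: boolean;
  vadThreshold: number;    // ms
  volumeThreshold: number; // 0-1
  waveCount: number;
  useSTT: boolean;
}

// Values copied from the "Default" column.
const DEFAULT_SETTINGS: VoiceSettings = {
  language: "zh-CN",
  interimResults: true,
  continuous: false,
  vad: true,
  vadThreshold: 2000,
  volumeThreshold: 0.05,
  waveCount: 20,
  useSTT: true,
};

// Fill in any option the caller left out (or passed as undefined).
function resolveSettings(overrides: Partial<VoiceSettings> = {}): VoiceSettings {
  return {
    language: overrides.language ?? DEFAULT_SETTINGS.language,
    interimResults: overrides.interimResults ?? DEFAULT_SETTINGS.interimResults,
    continuous: overrides.continuous ?? DEFAULT_SETTINGS.continuous,
    vad: overrides.vad ?? DEFAULT_SETTINGS.vad,
    vadThreshold: overrides.vadThreshold ?? DEFAULT_SETTINGS.vadThreshold,
    volumeThreshold: overrides.volumeThreshold ?? DEFAULT_SETTINGS.volumeThreshold,
    waveCount: overrides.waveCount ?? DEFAULT_SETTINGS.waveCount,
    useSTT: overrides.useSTT ?? DEFAULT_SETTINGS.useSTT,
  };
}
```

For instance, `resolveSettings({ language: "en-US", vadThreshold: 1500 })` keeps every other option at its documented default.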
Return Value
| Export | Description | Type |
|---|---|---|
| isRecording | Reactive recording state | Ref<boolean> |
| transcript | Confirmed final text | Ref<string> |
| interimTranscript | Real-time partial text | Ref<string> |
| amplitudes | Real-time waveform data for AiVoiceTrigger | Ref<number[]> |
| volume | Real-time volume (0-100) | Ref<number> |
| audioBlob | Generated audio file | Ref<Blob \| null> |
| start | Start recording and recognition | () => Promise<void> |
| stop | Stop and get results | () => void |
| cancel | Cancel recording and discard current result | () => void |
| sttSupported | Browser support check | boolean |
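To give a feel for how `amplitudes` and `volume` could be derived from analyser output, here is a minimal sketch. It is not the library's implementation: `toAmplitudes` and `toVolume` are hypothetical helpers, the input is assumed to be byte frequency data (values 0-255, as produced by the Web Audio `AnalyserNode.getByteFrequencyData` method), and the 0-1 range for each bar is an assumption, since the table does not state one. Only the 0-100 range for `volume` comes from the table.

```typescript
// Average the frequency bins into `waveCount` bars, each normalized to 0-1.
function toAmplitudes(frequencyData: Uint8Array, waveCount: number): number[] {
  const binsPerBar = Math.max(1, Math.floor(frequencyData.length / waveCount));
  const amplitudes: number[] = [];
  for (let bar = 0; bar < waveCount; bar++) {
    const start = bar * binsPerBar;
    const end = Math.min(start + binsPerBar, frequencyData.length);
    let sum = 0;
    for (let i = start; i < end; i++) {
      sum += frequencyData[i];
    }
    // Bars that fall past the end of the data stay at 0.
    amplitudes.push(end > start ? sum / (end - start) / 255 : 0);
  }
  return amplitudes;
}

// Mean level of all bins, scaled to the documented 0-100 range.
function toVolume(frequencyData: Uint8Array): number {
  if (frequencyData.length === 0) {
    return 0;
  }
  let sum = 0;
  for (let i = 0; i < frequencyData.length; i++) {
    sum += frequencyData[i];
  }
  return Math.round((sum / frequencyData.length / 255) * 100);
}
```

With the default `waveCount` of 20, a 1024-bin spectrum would be averaged in groups of 51 bins per bar, and any leftover high-frequency bins are ignored in this sketch.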