useAiVoice Voice Interaction

Browser Compatibility

This feature uses the browser's native Web Speech API for speech-to-text. In Google Chrome, it requires a secure context (HTTPS or localhost) to work properly.

useAiVoice is an integrated voice-processing hook. It encapsulates the browser's native speech-to-text (STT) capabilities and also includes a built-in Web Audio analysis engine that outputs real-time waveform data, pairing naturally with the AiVoiceTrigger component for dynamic visual feedback.

Key Features

  • Visual Sync: A built-in frequency analyzer returns a real-time amplitudes array that can drive waveform animations directly.
  • Physical Recording: A built-in MediaRecorder produces a .webm audio blob when recording stops.
  • Smart VAD: Supports Voice Activity Detection to automatically stop recording when the user finishes speaking.
  • Real-time STT: Supports interimResults to provide partial transcripts while the user is still speaking.
  • Plug-and-Play: Automatically manages microphone permissions, AudioContext lifecycle, and proper cleanup.
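To make the Visual Sync bullet concrete, here is a hedged sketch of how frequency data from a Web Audio AnalyserNode could be bucketed into a `waveCount`-length amplitudes array. The `toAmplitudes` helper is an illustration, not the hook's actual implementation; the real bucketing may differ.

```typescript
// Sketch (assumption): turn AnalyserNode byte frequency data into
// `waveCount` normalized bars suitable for a waveform visualizer.
function toAmplitudes(freqData: Uint8Array, waveCount: number): number[] {
  const bucketSize = Math.floor(freqData.length / waveCount) || 1
  const bars: number[] = []
  for (let i = 0; i < waveCount; i++) {
    let sum = 0
    for (let j = 0; j < bucketSize; j++) {
      sum += freqData[i * bucketSize + j] ?? 0
    }
    // getByteFrequencyData yields values in 0-255; normalize to 0-1.
    bars.push(sum / bucketSize / 255)
  }
  return bars
}
```

In a browser you would fill `freqData` each animation frame via `analyser.getByteFrequencyData(freqData)` and hand the result to the visualizer.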

Basic Usage

Interactive demo on the original page: Basic Interaction Demo (requires a browser with Web Speech API support; Chrome recommended).
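Since the live demo cannot run here, the snippet below sketches the basic start/stop wiring. The option and return names follow the API tables below, but `useAiVoice` is replaced by a local stub (an assumption, not the library's implementation) so the wiring is runnable outside a browser; in an app you would import the real hook from the library instead.

```typescript
// Local stub mirroring the documented contract of useAiVoice.
type Ref<T> = { value: T }

interface VoiceOptions {
  language?: string
  onResult?: (transcript: string) => void
}

function useAiVoice(options: VoiceOptions = {}) {
  const isRecording: Ref<boolean> = { value: false }
  const transcript: Ref<string> = { value: '' }
  return {
    isRecording,
    transcript,
    async start() { isRecording.value = true },  // real hook also requests the mic
    stop() {                                     // real hook finalizes STT here
      isRecording.value = false
      transcript.value = 'hello world'           // stand-in recognition result
      options.onResult?.(transcript.value)
    },
  }
}

// Typical wiring: toggle recording, collect final results in onResult.
const results: string[] = []
const { isRecording, transcript, start, stop } = useAiVoice({
  language: 'en-US',
  onResult: (t) => results.push(t),
})
void start()
stop()
```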

Integration with AiSender

Integrate the voice trigger into AiSender for an input experience similar to mainstream AI assistants.

Interactive demo on the original page: Integrated Input Field.
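One way to picture the glue code: forward the hook's final result into the sender's send action. This is a hedged sketch; `sendMessage` is a hypothetical stand-in for AiSender's send behavior, not its real API.

```typescript
// Sketch: forward the final transcript to a send action once recording stops.
const sent: string[] = []
function sendMessage(text: string): void {
  sent.push(text) // hypothetical stand-in for AiSender's send behavior
}

// Matches the onStop signature in the options table below:
// (transcript: string, blob: Blob | null) => void
function onStop(transcript: string, _blob: Blob | null): void {
  const text = transcript.trim()
  if (text) sendMessage(text) // skip empty recognitions
}

onStop('  turn on the lights  ', null)
```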

Advanced Case: Spheromorphism AI Chat

Interactive demo on the original page: Full Voice Chat Loop. The assistant opens with "Hello! I am your voice assistant. What would you like to talk about?"
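The loop behind such a demo can be sketched as: recognize speech, append the user message, generate a reply, then show or speak it. In this hedged sketch `getReply` is a placeholder for your model call, not a library API.

```typescript
type ChatMessage = { role: 'user' | 'assistant'; content: string }

const history: ChatMessage[] = [
  { role: 'assistant', content: 'Hello! I am your voice assistant. What would you like to talk about?' },
]

// Placeholder reply generator; swap in a real model call.
function getReply(userText: string): string {
  return `You said: ${userText}`
}

// Feed this from the hook's onResult callback; in a browser the reply could
// then be read aloud via speechSynthesis.speak(new SpeechSynthesisUtterance(reply)).
function handleTranscript(text: string): void {
  history.push({ role: 'user', content: text })
  history.push({ role: 'assistant', content: getReply(text) })
}

handleTranscript('hi')
```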

API

UseAiVoiceOptions

| Property | Description | Type | Default |
| --- | --- | --- | --- |
| language | Recognition language | string | 'zh-CN' |
| interimResults | Request partial results | boolean | true |
| continuous | Continuous recognition | boolean | false |
| vad | Enable smart silence detection | boolean | true |
| vadThreshold | Silence duration before auto-stop (ms) | number | 2000 |
| volumeThreshold | Volume sensitivity threshold (0-1) | number | 0.05 |
| waveCount | Number of amplitude bars | number | 20 |
| useSTT | Enable browser speech recognition | boolean | true |
| onStart | Callback when recording starts | () => void | - |
| onStop | Callback with final transcript and audio blob | (transcript: string, blob: Blob \| null) => void | - |
| onResult | Callback for a finalized transcript | (transcript: string) => void | - |
| onPartialResult | Callback for interim transcripts | (transcript: string) => void | - |
| onError | Error callback | (error: unknown) => void | - |
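The interplay of vad, vadThreshold, and volumeThreshold can be sketched as a simple decision: auto-stop once the volume has stayed below volumeThreshold for vadThreshold milliseconds. This is an illustrative model (the `shouldAutoStop` helper is an assumption), not the hook's actual code.

```typescript
// Sketch of the VAD decision implied by the options above: track the last
// moment voice was detected, and auto-stop after vadThreshold ms of silence.
interface VadState { lastVoiceAt: number }

function shouldAutoStop(
  state: VadState,
  volume: number,          // 0-1, from the analyser
  now: number,             // current timestamp in ms
  volumeThreshold = 0.05,  // defaults mirror the options table
  vadThreshold = 2000,
): boolean {
  if (volume >= volumeThreshold) {
    state.lastVoiceAt = now // speech detected: reset the silence timer
    return false
  }
  return now - state.lastVoiceAt >= vadThreshold
}
```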

Return Value

| Export | Description | Type |
| --- | --- | --- |
| isRecording | Reactive recording state | Ref<boolean> |
| transcript | Confirmed final text | Ref<string> |
| interimTranscript | Real-time partial text | Ref<string> |
| amplitudes | Real-time waveform data for AiVoiceTrigger | Ref<number[]> |
| volume | Real-time volume (0-100) | Ref<number> |
| audioBlob | Generated audio file | Ref<Blob \| null> |
| start | Start recording and recognition | () => Promise<void> |
| stop | Stop and collect results | () => void |
| cancel | Cancel recording and discard the current result | () => void |
| sttSupported | Browser support check | boolean |

Released under the MIT License.