Speech Synthesis
The Speech Synthesis component provides a comprehensive interface to the Web Speech API's text-to-speech functionality, allowing your Progressive Web App to convert text into spoken words. This enables accessible, hands-free experiences and enhanced user interactions.
This component is particularly useful for:
Accessibility features for users with visual impairments or reading difficulties
Educational applications (language learning, pronunciation guides)
Navigation and turn-by-turn directions
Content reading (news articles, books, messages)
Voice-enabled interfaces and assistive technologies
Multi-language support with native voice selection
Browser Support
The Speech Synthesis API is widely supported across modern browsers:
Chrome/Edge: Full support with extensive voice libraries
Safari: Full support with high-quality voices
Firefox: Full support with system voices
Mobile browsers: Excellent support on both iOS and Android
Voice availability varies by platform and language. Desktop browsers typically offer more voices than mobile browsers. The controller automatically handles voice detection and selection.
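To make the idea of locale-based voice selection concrete, here is a minimal sketch. It is not the controller's actual algorithm: the function name `pickVoiceForLocale` and the preference order (exact locale first, then same base language, local voices preferred) are assumptions for illustration. It only relies on the `lang` and `localService` fields that `SpeechSynthesisVoice` objects expose.

```javascript
// Hypothetical helper (not part of the component): choose a voice for a locale.
// Assumed preference order: exact locale match, then same base language;
// within each group, on-device (localService) voices win.
function pickVoiceForLocale(voices, locale) {
  // Some platforms report locales with an underscore ("en_US"), so normalize.
  const normalize = (lang) => lang.toLowerCase().replace("_", "-");
  const wanted = normalize(locale);
  const base = wanted.split("-")[0];

  const exact = voices.filter((v) => normalize(v.lang) === wanted);
  const sameLanguage = voices.filter(
    (v) => normalize(v.lang).split("-")[0] === base
  );

  // Returns undefined for an empty list, which lets the || chain continue.
  const preferLocal = (list) => list.find((v) => v.localService) || list[0];
  return preferLocal(exact) || preferLocal(sameLanguage) || null;
}
```

In a browser, the `voices` argument would typically come from `speechSynthesis.getVoices()` once voices have loaded.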
Usage
Basic Text-to-Speech
Speaking HTML Elements
Voice Selection with Custom Parameters
Multi-Language Content
Playback Controls
Custom Internationalization
Per-Item Custom Parameters
Immediate vs Queue Mode
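The difference between the two modes can be sketched on top of the native API, where `speak()` appends to the browser's utterance queue and `cancel()` empties it. The helper name `speakWithMode` is hypothetical, and the `synth` parameter is injected so the sketch can run outside a browser. How the controller implements `enqueueValue` internally is not shown in this document.

```javascript
// Hypothetical helper: emulate queue mode vs immediate mode.
// `synth` is any object with speak() and cancel(), e.g. window.speechSynthesis.
function speakWithMode(synth, utterance, enqueue = true) {
  if (!enqueue) {
    // Immediate mode: drop whatever is playing or waiting first.
    synth.cancel();
  }
  // Queue mode (and the tail of immediate mode): hand the utterance over.
  synth.speak(utterance);
}
```

In a page this might be called as `speakWithMode(window.speechSynthesis, new SpeechSynthesisUtterance("Hello"), false)`.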
Common Use Cases
1. Accessible News Reader
2. Language Learning App
3. Form Validation Feedback
4. Navigation Assistant
5. Interactive Tutorial System
API Reference
Values
The controller accepts the following configuration values:
localeValue (String, default: "en-US")
Default language/locale for speech synthesis
Examples: "en-US", "fr-FR", "es-ES", "de-DE"
rateValue (Number, default: 1)
Speech speed/rate
Range: 0.1 to 10 (typical: 0.5 to 2)
1 = normal speed, < 1 = slower, > 1 = faster
pitchValue (Number, default: 1)
Speech pitch
Range: 0 to 2
1 = normal pitch, < 1 = lower, > 1 = higher
volumeValue (Number, default: 1)
Speech volume
Range: 0 to 1
0 = silent, 1 = maximum volume
voiceValue (String, default: undefined)
Specific voice name to use
If not set, the controller automatically selects the best available voice for the locale
Example: "Google US English", "Microsoft David Desktop"
enqueueValue (Boolean, default: true)
Queue mode behavior
true: Utterances are queued and played sequentially
false: New utterance cancels previous (immediate mode)
i18nValue (Object, default: {})
Internationalization strings for status messages
Keys: loading, ready, unsupported, playing, paused, canceled, finished
Example:
Targets
item
Marks HTML elements as speakable items
Can be clicked individually or queued together
Supports custom data attributes for per-item configuration
voiceSelect
A <select> element that will be automatically populated with available voices
Automatically formatted as: "Voice Name (locale) • local" or "Voice Name (locale)"
status
Element where status messages are displayed
Shows: "Loading voices…", "Ready", "Playing", "Paused", etc.
Content controlled by i18nValue or defaults
Example:
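The label format described for voiceSelect can be sketched as a small function. This is an illustration, not the controller's source: the name `formatVoiceLabel` is hypothetical, and it assumes the "• local" suffix corresponds to the `localService` flag of a `SpeechSynthesisVoice`.

```javascript
// Hypothetical helper: build an option label in the documented format,
// "Voice Name (locale) • local" for on-device voices, "Voice Name (locale)" otherwise.
function formatVoiceLabel(voice) {
  const label = `${voice.name} (${voice.lang})`;
  return voice.localService ? `${label} • local` : label;
}
```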
Actions
speak({ params })
Speaks the provided text
Parameters:
text (String, required): Text to speak
locale (String, optional): Override default locale
voice (String, optional): Override default voice
rate (Number, optional): Override default rate
pitch (Number, optional): Override default pitch
volume (Number, optional): Override default volume
speakItem(event)
Speaks the clicked item target element
Uses data-speech-* attributes or the element's textContent
Supports per-item parameters via data attributes
enqueueItems({ params })
Queues all item targets for sequential playback
Parameters can override defaults for all items
Automatically starts playback
pause()
Pauses current speech
Can be resumed with resume()
resume()
Resumes paused speech
No effect if not paused
cancel()
Stops current speech and clears queue
Cannot be resumed
changeVoiceFromSelect()
Updates voice from the voiceSelect target value
Automatically bound to voiceSelect change event
setRate({ params })
Updates the speech rate
Parameters:
rate (Number)
setPitch({ params })
Updates the speech pitch
Parameters:
pitch (Number)
setVolume({ params })
Updates the speech volume
Parameters:
volume (Number)
Example:
Item Data Attributes
When using item targets, you can customize speech parameters per element:
data-speech-text
Text to speak (overrides element's textContent)
data-speech-locale
Language/locale for this item
data-speech-voice
Voice name for this item
data-speech-rate
Speech rate for this item (Number)
data-speech-pitch
Speech pitch for this item (Number)
data-speech-volume
Speech volume for this item (Number)
Example:
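As a rough sketch of how the attributes above could be resolved into speech options, the following reads them through the standard `dataset` mapping (`data-speech-text` becomes `dataset.speechText`, and so on). The function name `readItemOptions` is hypothetical, and trimming `textContent` and ignoring non-numeric values are assumptions. The controller's real parsing may differ.

```javascript
// Hypothetical helper: turn an item's data-speech-* attributes into options.
// Accepts any object shaped like an element: { dataset, textContent }.
function readItemOptions(el) {
  const data = el.dataset || {};

  // Data attributes are strings; convert numeric ones, ignore invalid input.
  const toNumber = (value) => {
    if (value === undefined || value === "") return undefined;
    const n = Number(value);
    return Number.isFinite(n) ? n : undefined;
  };

  return {
    // data-speech-text overrides the element's text content.
    text: data.speechText !== undefined
      ? data.speechText
      : (el.textContent || "").trim(),
    locale: data.speechLocale,
    voice: data.speechVoice,
    rate: toNumber(data.speechRate),
    pitch: toNumber(data.speechPitch),
    volume: toNumber(data.speechVolume),
  };
}
```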
Events
All events are prefixed with pwa:speech-synthesis: and dispatched on the controller element.
pwa:speech-synthesis:ready
Dispatched when the controller is initialized and ready
No event detail
pwa:speech-synthesis:unsupported
Dispatched if Speech Synthesis API is not supported
No event detail
pwa:speech-synthesis:voicesloaded
Dispatched when voices are loaded and available
Event detail:
voices (Array): List of available voices with name, lang, local properties
pwa:speech-synthesis:start
Dispatched when speech starts
Event detail:
text (String): The text being spoken
lang (String): The language used
sourceEl (Element|undefined): The source element if speaking an item
pwa:speech-synthesis:end
Dispatched when speech finishes
Event detail:
text (String): The text that was spoken
lang (String): The language used
sourceEl (Element|undefined): The source element if speaking an item
pwa:speech-synthesis:pause
Dispatched when speech is paused
Event detail:
text (String): The text being paused
sourceEl (Element|undefined): The source element
pwa:speech-synthesis:resume
Dispatched when speech is resumed
Event detail:
text (String): The text being resumed
sourceEl (Element|undefined): The source element
pwa:speech-synthesis:error
Dispatched when a speech error occurs
Event detail:
error (String): Error message or type
sourceEl (Element|undefined): The source element
pwa:speech-synthesis:boundary
Dispatched at word or sentence boundaries during speech
Event detail:
name (String): Boundary type ("word" or "sentence")
charIndex (Number): Character index in the text
charLength (Number): Length of the current word/sentence
elapsedTime (Number): Elapsed time in milliseconds
sourceEl (Element|undefined): The source element
pwa:speech-synthesis:queued
Dispatched when an utterance is added to the queue
Event detail:
size (Number): Current queue size
pwa:speech-synthesis:dequeue
Dispatched when an utterance is removed from the queue for playback
Event detail:
remaining (Number): Number of items remaining in queue
pwa:speech-synthesis:cancel
Dispatched when speech is canceled
No event detail
pwa:speech-synthesis:voicechange
Dispatched when the voice is changed
Event detail:
name (String): New voice name
pwa:speech-synthesis:ratechange
Dispatched when the rate is changed
Event detail:
rate (Number): New rate value
pwa:speech-synthesis:pitchchange
Dispatched when the pitch is changed
Event detail:
pitch (Number): New pitch value
pwa:speech-synthesis:volumechange
Dispatched when the volume is changed
Event detail:
volume (Number): New volume value
Example:
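One way to consume these events is to fold them into a small state object that the UI can render. The sketch below uses only the event names and detail fields listed above. The function name `trackSpeechState` and the shape of the state object are assumptions for illustration.

```javascript
// Event prefix as documented above.
const SPEECH_EVENT_PREFIX = "pwa:speech-synthesis:";

// Hypothetical helper: mirror controller events into a plain state object.
// `target` is any EventTarget, typically the element carrying the controller.
function trackSpeechState(target) {
  const state = { status: "idle", queueSize: 0, lastError: null };
  const on = (name, handler) =>
    target.addEventListener(SPEECH_EVENT_PREFIX + name, handler);

  on("start", () => { state.status = "playing"; });
  on("pause", () => { state.status = "paused"; });
  on("resume", () => { state.status = "playing"; });
  on("end", () => { state.status = "finished"; });
  on("cancel", () => {
    // cancel() stops speech and clears the queue.
    state.status = "canceled";
    state.queueSize = 0;
  });
  on("error", (event) => {
    state.status = "error";
    state.lastError = event.detail ? event.detail.error : null;
  });
  on("queued", (event) => { state.queueSize = event.detail.size; });
  on("dequeue", (event) => { state.queueSize = event.detail.remaining; });

  return state;
}
```

Because the events are dispatched on the controller element, passing that element as `target` should be enough; no assumption is made here about whether the events bubble.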
Properties
The controller exposes the following read-only properties:
voices
Returns array of available SpeechSynthesisVoice objects
Updated when voices are loaded
locales
Returns array of unique locale codes from available voices
Example: ["en-US", "fr-FR", "es-ES"]
isSpeaking
Returns true if currently speaking, false otherwise
Boolean property
Example:
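To illustrate how locales relates to voices, the list of unique locale codes can be derived from the voice objects roughly as follows. The function name `uniqueLocales` is hypothetical, and the alphabetical sort is an assumption. The controller may return the codes in a different order.

```javascript
// Hypothetical helper: unique locale codes from SpeechSynthesisVoice-like objects.
function uniqueLocales(voices) {
  // A Set removes duplicates; sort() orders the codes alphabetically.
  return [...new Set(voices.map((voice) => voice.lang))].sort();
}
```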
Best Practices
1. Always Check Browser Support
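A support check with the native API can be as small as the sketch below. Both function names are hypothetical, and the `scope` parameter exists only so the check can be exercised outside a browser. In a page the default `globalThis` is the window. With the component itself, listening for the pwa:speech-synthesis:unsupported event serves the same purpose.

```javascript
// Hypothetical helper: feature-detect the Web Speech API synthesis interfaces.
function isSpeechSynthesisSupported(scope = globalThis) {
  return (
    typeof scope === "object" &&
    scope !== null &&
    "speechSynthesis" in scope &&
    "SpeechSynthesisUtterance" in scope
  );
}

// Hypothetical helper: speak if possible, otherwise call a fallback
// (for example, one that shows the text in a visible notice).
function speakOrFallback(text, onUnsupported, scope = globalThis) {
  if (!isSpeechSynthesisSupported(scope)) {
    onUnsupported(text);
    return false;
  }
  const utterance = new scope.SpeechSynthesisUtterance(text);
  scope.speechSynthesis.speak(utterance);
  return true;
}
```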
2. Provide Visual Feedback
Always show the current state of speech synthesis:
3. Handle Long Content Appropriately
For long articles, break content into manageable chunks:
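A minimal chunking sketch, assuming sentence boundaries are a reasonable place to split: the function name `chunkText` and the default limit of 200 characters are assumptions, not values taken from the component. A single sentence longer than the limit is kept whole rather than cut mid-sentence.

```javascript
// Hypothetical helper: group sentences into chunks of at most maxLength characters.
function chunkText(text, maxLength = 200) {
  // Match runs of text ending in ., ! or ?, plus any unterminated tail.
  const sentences = (text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) || [])
    .map((sentence) => sentence.trim())
    .filter(Boolean);

  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (current === "" || candidate.length <= maxLength) {
      // Still fits (or this is the first sentence of a chunk): keep growing.
      current = candidate;
    } else {
      // Adding the sentence would exceed the limit: close the chunk.
      chunks.push(current);
      current = sentence;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk could then be rendered as its own item target or passed to speak() in queue mode.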
4. Use Appropriate Speech Rates
Match speech rate to content type:
0.5-0.8: Learning content, complex information, pronunciation guides
0.9-1.0: Standard reading, news articles, general content
1.1-1.5: Quick scanning, familiar content, user preference
1.6-2.0: Rapid reading for advanced users
5. Respect User Preferences
6. Optimize for Multi-Language Apps
7. Use Boundary Events for Highlighting
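For highlighting, the charIndex and charLength fields of the boundary event detail are enough to locate the current word in the source text. The sketch below uses a hypothetical function name, `wordAtBoundary`, and includes a fallback for the case where charLength is missing, since not every browser reports it. That fallback (reading up to the next whitespace) is an assumption.

```javascript
// Hypothetical helper: locate the word referenced by a boundary event detail.
function wordAtBoundary(text, detail) {
  const start = detail.charIndex;
  let length = detail.charLength;

  if (!Number.isInteger(length) || length <= 0) {
    // Fallback when charLength is unavailable: read up to the next whitespace.
    const match = text.slice(start).match(/^\S+/);
    length = match ? match[0].length : 0;
  }

  return {
    start,
    end: start + length,
    word: text.slice(start, start + length),
  };
}
```

The returned `start` and `end` offsets can be used to wrap the matching range of the source element in a highlight while the pwa:speech-synthesis:boundary events arrive.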
8. Provide Keyboard Controls
Make speech controls accessible via keyboard:
Troubleshooting
No Voices Available
Problem: Voice list is empty or voices don't load
Solutions:
Wait for the voicesloaded event before using voices
Some browsers load voices asynchronously - the controller handles this automatically
On some systems, text-to-speech voices may need to be installed separately
Speech Not Working on iOS
Problem: Speech synthesis doesn't work or requires user interaction on iOS Safari
Solutions:
iOS requires user interaction before speech can play - ensure speech is triggered by user action (click, tap)
Don't auto-play speech on page load
Test on actual iOS devices, not just simulators
Voice Selection Not Working
Problem: Selected voice is not being used
Solutions:
Ensure voice name matches exactly (case-sensitive)
Voice must be available in the voices list
Some voices are locale-specific - check that the voice's lang property matches the utterance locale
Speech Cuts Off or Stops
Problem: Speech stops unexpectedly or gets interrupted
Solutions:
Check for multiple speech synthesis controllers on the same page
Ensure enqueue mode is set correctly
Verify no JavaScript errors in console
Some browsers limit queue size - break long content into smaller chunks
Rate/Pitch/Volume Not Applied
Problem: Speech parameters don't seem to work
Solutions:
Some voices ignore certain parameters (especially pitch)
Valid ranges: rate (0.1-10), pitch (0-2), volume (0-1)
Not all voices support all parameters - try different voices
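When values come from user input (sliders, stored preferences), clamping them to the valid ranges before passing them on rules out one class of problems. The sketch below uses the ranges listed above. The function name `clampSpeechParams` and the choice to fall back to 1 (the documented default for all three values) on non-numeric input are assumptions.

```javascript
// Valid ranges as documented: rate 0.1-10, pitch 0-2, volume 0-1.
const SPEECH_PARAM_RANGES = { rate: [0.1, 10], pitch: [0, 2], volume: [0, 1] };

// Hypothetical helper: return a copy of params with rate/pitch/volume clamped.
function clampSpeechParams(params) {
  const result = { ...params };
  for (const [key, [min, max]] of Object.entries(SPEECH_PARAM_RANGES)) {
    if (result[key] === undefined) continue; // leave unset values unset
    const n = Number(result[key]);
    // Non-numeric input falls back to 1, the default for all three values.
    result[key] = Number.isFinite(n) ? Math.min(max, Math.max(min, n)) : 1;
  }
  return result;
}
```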
Language/Accent Mismatch
Problem: Wrong accent or language is used
Solutions:
Set the correct locale value (e.g., "en-US" vs "en-GB")
Explicitly select a voice with the desired accent using voiceValue
Use data-speech-locale per item for multi-language content
Memory Issues with Long Content
Problem: Browser becomes slow or unresponsive with very long content
Solutions:
Break content into smaller chunks (use item targets)
Limit queue size - don't enqueue hundreds of items at once
Cancel and clear queue when user navigates away
Accessibility Considerations
1. ARIA Attributes
2. Screen Reader Compatibility
The component works alongside screen readers. Provide options to disable speech synthesis if user prefers their own assistive technology:
3. Visual Indicators
Always provide visual feedback that complements audio: