edge-tts: Microsoft Edge Online TTS Integration for Python
edge-tts is an open-source Python library for using Microsoft Edge's online text-to-speech (TTS) service. The service is backed by Microsoft's neural voices and converts text into natural-sounding speech; edge-tts lets you use it without installing the Microsoft Edge browser, running Windows, or obtaining an API key. It is particularly useful for applications that need multilingual voice synthesis, such as accessibility tools, virtual assistants, audiobooks, or automated content narration.
Key Features
- Seamless Integration: Use it directly in Python scripts or via simple command-line interfaces (edge-tts for generation and edge-playback for playback with subtitles).
- No Dependencies on Edge or Windows: Runs on any platform (Linux, macOS, Windows) as long as you have Python and internet access.
- Multilingual Support: Access a vast library of voices across numerous languages and accents, including neural voices with personalities like friendly or positive tones.
- Customization Options: Adjust speech rate, volume, pitch, and select specific voices to tailor the output.
- Output Flexibility: Generate audio files (e.g., MP3) and synchronized subtitles (SRT) for easy integration into videos or apps.
- Command-Line Convenience: Quick TTS generation and playback without writing code.
Installation and Setup
To get started, install via pip:
pip install edge-tts
For command-line only use, prefer pipx:
pipx install edge-tts
Note that edge-playback requires the mpv player (except on Windows) for immediate audio playback with subtitles.
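A script that shells out to edge-playback may want to check this prerequisite first. The sketch below is one way to do so; playback_available is a hypothetical helper name, not part of edge-tts, and it only encodes the rule stated above (mpv is needed everywhere except Windows).

```python
import shutil
import sys


def playback_available(platform: str = sys.platform, which=shutil.which) -> bool:
    """Return True if the stated prerequisite for edge-playback is met.

    Hypothetical helper: Windows needs no external player, every other
    platform needs an `mpv` executable on PATH.
    """
    if platform.startswith("win"):
        return True
    return which("mpv") is not None
```

The platform and which parameters exist only so the check can be exercised without touching the real system; calling playback_available() with no arguments inspects the current machine.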
Basic Usage Examples
Command-Line TTS Generation
Generate a simple audio file:
edge-tts --text "Hello, world!" --write-media hello.mp3 --write-subtitles hello.srt
Play back with subtitles:
edge-playback --text "Hello, world!"
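If the command-line tool is already installed, the generation command above can also be driven from a Python script. This is a minimal sketch using only the flags shown in this section; build_edge_tts_command and run_edge_tts are illustrative names, not part of edge-tts.

```python
import subprocess
from typing import List, Optional


def build_edge_tts_command(
    text: str, media_path: str, subtitles_path: Optional[str] = None
) -> List[str]:
    """Build the argument list for the edge-tts command shown above."""
    command = ["edge-tts", "--text", text, "--write-media", media_path]
    if subtitles_path is not None:
        command += ["--write-subtitles", subtitles_path]
    return command


def run_edge_tts(
    text: str, media_path: str, subtitles_path: Optional[str] = None
) -> None:
    """Run edge-tts; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(
        build_edge_tts_command(text, media_path, subtitles_path), check=True
    )
```

Passing the arguments as a list avoids shell quoting problems when the text contains quotes or non-ASCII characters.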
Voice Selection
List available voices:
edge-tts --list-voices
This outputs a table of voices like af-ZA-AdriNeural (Female, General, Friendly). Use a specific voice:
edge-tts --voice ar-EG-SalmaNeural --text "مرحبا كيف حالك؟" --write-media hello_ar.mp3 --write-subtitles hello_ar.srt
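The same lookup can be done from Python. The sketch below assumes that edge_tts.list_voices() is a coroutine returning a list of dictionaries with "ShortName", "Locale", and "Gender" keys; filter_voices and voices_for are illustrative helper names, not library functions.

```python
from typing import Dict, List, Optional


def filter_voices(
    voices: List[Dict], locale_prefix: str, gender: Optional[str] = None
) -> List[str]:
    """Return sorted ShortName values for voices whose Locale starts with the prefix."""
    matches = [
        voice["ShortName"]
        for voice in voices
        if voice["Locale"].startswith(locale_prefix)
        and (gender is None or voice["Gender"] == gender)
    ]
    return sorted(matches)


async def voices_for(locale_prefix: str, gender: Optional[str] = None) -> List[str]:
    # Lazy import: filter_voices itself needs neither the package nor a network.
    import edge_tts

    # Assumption: list_voices() returns dicts with the keys used above.
    return filter_voices(await edge_tts.list_voices(), locale_prefix, gender)
```

With a network connection, asyncio.run(voices_for("ar-EG", "Female")) would return the matching voice names, which can then be passed to --voice or to the Python API.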
Audio Customization
Modify speech attributes:
- Slow down rate: --rate=-50%
- Lower volume: --volume=-50%
- Adjust pitch: --pitch=-50Hz
Example:
edge-tts --rate=-50% --text "Hello, world!" --write-media slow_hello.mp3 --write-subtitles slow_hello.srt
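The same adjustments are available from Python. The sketch below assumes that edge_tts.Communicate accepts rate, volume, and pitch keyword arguments in the same explicitly signed string format as the command-line flags; prosody_kwargs and synthesize are illustrative names, not part of the library.

```python
from typing import Dict


def prosody_kwargs(rate_pct: int = 0, volume_pct: int = 0, pitch_hz: int = 0) -> Dict[str, str]:
    """Format integer offsets as explicitly signed strings, e.g. -50 -> "-50%"."""
    return {
        "rate": f"{rate_pct:+d}%",
        "volume": f"{volume_pct:+d}%",
        "pitch": f"{pitch_hz:+d}Hz",
    }


async def synthesize(
    text: str, path: str, voice: str = "en-US-AriaNeural", **offsets: int
) -> None:
    # Lazy import so prosody_kwargs can be used without the package installed.
    import edge_tts

    # Assumption: Communicate takes rate/volume/pitch as keyword arguments.
    communicate = edge_tts.Communicate(text, voice, **prosody_kwargs(**offsets))
    await communicate.save(path)
```

For example, asyncio.run(synthesize("Hello, world!", "slow_hello.mp3", rate_pct=-50)) would correspond to the command above. The helper always emits a sign, so an unchanged setting is written as "+0%" rather than "0%".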
Python Module Usage
For programmatic access, import and use the library. The project provides examples in /examples/ and utility functions in /src/edge_tts/util.py. Key steps include:
- Listing voices: edge_tts.list_voices()
- Creating a communicator: edge_tts.Communicate(text, voice)
- Saving audio: Stream to a file.
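The streaming step can be sketched as follows. This assumes that Communicate.stream() is an async iterator of dictionaries in which audio chunks have "type" equal to "audio" and carry their bytes under "data"; write_audio and stream_to_file are illustrative names, not library functions.

```python
from typing import AsyncIterator, BinaryIO, Dict


async def write_audio(chunks: AsyncIterator[Dict], out: BinaryIO) -> int:
    """Write the "audio" chunks of a stream to `out`; return the bytes written."""
    written = 0
    async for chunk in chunks:
        # Non-audio chunks (e.g. word-boundary metadata) are skipped.
        if chunk["type"] == "audio":
            written += out.write(chunk["data"])
    return written


async def stream_to_file(text: str, voice: str, path: str) -> int:
    # Lazy import so write_audio can be exercised offline.
    import edge_tts

    communicate = edge_tts.Communicate(text, voice)
    with open(path, "wb") as out:
        # Assumption: stream() yields dicts with "type" and "data" keys.
        return await write_audio(communicate.stream(), out)
```

Streaming is mainly useful when the audio should be forwarded as it arrives, for example to a player or an HTTP response; when only a file is needed, the save() method is simpler.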
Example code snippet:
import asyncio

import edge_tts


async def tts_example():
    text = "Hello from edge-tts!"
    voice = "en-US-AriaNeural"
    # Create a communicator for the given text and voice.
    communicate = edge_tts.Communicate(text, voice)
    # Request the audio from the service and write it to output.mp3.
    await communicate.save("output.mp3")


asyncio.run(tts_example())
Limitations and Notes
- SSML Support: Custom SSML is limited to what Microsoft Edge generates (single <voice> and <prosody> tags). Advanced SSML is not supported due to service restrictions.
- Internet Required: Relies on Microsoft's online service, so an active connection is needed.
- Voice Availability: Voices are subject to Microsoft's updates; use --list-voices for the latest.
Community and Integrations
With over 9,500 GitHub stars, edge-tts is widely adopted. It's integrated into projects like:
- hass-edge-tts for Home Assistant.
- Podcastfy for podcast generation.
- tts-samples for voice testing.
Because it exposes Microsoft's neural voices without requiring API keys, the library suits AI-driven applications, educational tools, and content creators who need cost-effective speech synthesis.
