Text to Speech MCP Server

Text to Speech MCP servers enable AI models to convert text into natural-sounding speech, providing capabilities for real-time audio generation, voice synthesis, and multilingual support.

April 25, 2025

Overview

The RealtimeTTS MCP Server enables AI models to convert text into speech in real time. It is built on RealtimeTTS, a Python library designed for low-latency text-to-speech applications. The library supports a range of TTS engines, which makes the server a flexible way to add voice output to AI agents.

Created by: KoljaB

Key Features

Low-Latency Conversion

Almost instantaneous text-to-speech conversion, ideal for real-time interactions

High-Quality Audio

Generates clear and natural-sounding speech

Multiple TTS Engines

Supports OpenAI TTS, ElevenLabs, Azure, Coqui TTS, and more

Multilingual Support

Provides speech synthesis in multiple languages

Available Tools

Quick Reference

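| Tool        | Purpose                  | Category      |
|-------------|--------------------------|---------------|
| synthesize  | Convert text to speech   | Core          |
| stream      | Stream synthesized audio | Core          |
| set_engine  | Select the TTS engine    | Configuration |
| get_engines | List available engines   | Discovery     |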

Detailed Usage

synthesize

Convert a string of text into speech and play it.

use_mcp_tool({
  server_name: "text_to_speech",
  tool_name: "synthesize",
  arguments: {
    text: "Hello, world! This is a test."
  }
});
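
The use_mcp_tool notation is client-side shorthand. Under the Model Context Protocol, a tool invocation travels as a JSON-RPC 2.0 request with the method tools/call. The sketch below shows roughly what such a request looks like for synthesize; the helper name buildToolCallRequest and the id counter are illustrative assumptions, not part of this server.

```javascript
// Hypothetical helper: builds the JSON-RPC 2.0 message an MCP client sends
// for a tool call. Only "tools/call", "name", and "arguments" come from the
// MCP specification; the function name and id scheme are illustrative.
let nextRequestId = 1;

function buildToolCallRequest(toolName, args) {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCallRequest("synthesize", {
  text: "Hello, world! This is a test.",
});
console.log(JSON.stringify(request, null, 2));
```

The server_name value does not appear in the request itself; the client uses it to pick which server connection receives the message.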

stream

Stream synthesized audio in real-time as it's generated.

use_mcp_tool({
  server_name: "text_to_speech",
  tool_name: "stream",
  arguments: {
    text: "This is a streaming test to demonstrate real-time audio synthesis."
  }
});
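
To illustrate why streaming reduces perceived latency, here is a small simulation in which playback of the first chunk starts before later chunks have been produced. It contacts no server and produces no audio; synthesizeChunks and play are hypothetical stand-ins, and the real server may chunk audio differently.

```javascript
// Simulation only: no audio is produced and no MCP server is contacted.
const events = [];

// Lazily "synthesizes" one chunk per sentence. A chunk is only produced
// when the consumer asks for the next one.
function* synthesizeChunks(sentences) {
  for (const sentence of sentences) {
    events.push(`synthesized: ${sentence}`);
    yield { sentence, audio: `<audio for "${sentence}">` };
  }
}

// Stand-in for audio playback.
function play(chunk) {
  events.push(`played: ${chunk.sentence}`);
}

const sentences = ["This is a streaming test.", "It has two sentences."];
for (const chunk of synthesizeChunks(sentences)) {
  play(chunk);
}
console.log(events.join("\n"));
```

The event log alternates between synthesis and playback, so the first sentence is played before the second one is synthesized.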

set_engine

Select the TTS engine to use for speech synthesis.

use_mcp_tool({
  server_name: "text_to_speech",
  tool_name: "set_engine",
  arguments: {
    engine: "elevenlabs"
  }
});

get_engines

Get a list of available TTS engines.

use_mcp_tool({
  server_name: "text_to_speech",
  tool_name: "get_engines",
  arguments: {}
});
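
A client will often call get_engines first and only then call set_engine with a name the server actually reports. The sketch below picks an engine from a preference list. The response shape of get_engines (a plain array of names) and the engine identifiers other than "elevenlabs" are assumptions made for illustration.

```javascript
// Hypothetical helper: picks the first preferred engine that the server
// reports as available, falling back to the first available engine.
function chooseEngine(available, preferred) {
  const normalized = available.map((name) => name.toLowerCase());
  for (const candidate of preferred) {
    if (normalized.includes(candidate.toLowerCase())) {
      return candidate.toLowerCase();
    }
  }
  return normalized.length > 0 ? normalized[0] : null;
}

// Assumed result of get_engines; the real response format may differ.
const availableEngines = ["openai", "azure", "coqui"];
const engine = chooseEngine(availableEngines, ["elevenlabs", "azure"]);

// Arguments for a follow-up set_engine call, or null if nothing is available.
const setEngineArgs = engine === null ? null : { engine };
console.log(setEngineArgs);
```

Falling back to the first reported engine keeps the assistant audible when a preferred provider is not configured.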

Installation

The entry below registers the server under the name text_to_speech. Note that its command, pip install realtimetts[all], installs the RealtimeTTS library with all engine extras and then exits; it does not start a server process, so the client needs the server's actual launch command in its place.

{
  "mcpServers": {
    "text_to_speech": {
      "command": "pip",
      "args": [
        "install",
        "realtimetts[all]"
      ]
    }
  }
}

Common Use Cases

1. Voice-Enabled AI Assistants

Provide voice output for AI assistants and chatbots.

// Let the assistant speak its response
use_mcp_tool({
  server_name: "text_to_speech",
  tool_name: "synthesize",
  arguments: {
    text: "I'm sorry, I didn't understand that. Could you please rephrase?"
  }
});
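
Assistant responses often contain Markdown, which a TTS engine may read aloud symbol by symbol. One possible pre-processing step is sketched below; toSpeakableText is a hypothetical helper, and its rules are deliberately simple rather than a complete Markdown parser.

```javascript
// Hypothetical helper: removes common Markdown markup so the TTS engine
// does not read symbols such as "*" or "#" aloud. Not exhaustive.
function toSpeakableText(markdown) {
  return markdown
    .replace(/`{3}[\s\S]*?`{3}/g, " ")              // drop fenced code blocks
    .replace(/`([^`]*)`/g, "$1")                    // unwrap inline code
    .replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1")      // keep link or image text only
    .replace(/^\s{0,3}#{1,6}\s+/gm, "")             // strip heading markers
    .replace(/(\*\*|\*)(\S(?:.*?\S)?)\1/g, "$2")    // unwrap bold and italics
    .replace(/\s+/g, " ")                           // collapse whitespace
    .trim();
}

const reply =
  "## Reply\nI'm **sorry**, I didn't understand `that`. See [the docs](https://example.com).";
const spoken = toSpeakableText(reply);
// The result would be passed as the "text" argument of synthesize.
console.log(spoken);
```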

2. Accessibility

Make applications more accessible by providing audio versions of text content.

// Read the content of an article aloud
use_mcp_tool({
  server_name: "text_to_speech",
  tool_name: "synthesize",
  arguments: {
    text: articleContent
  }
});
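
Long articles may exceed what an engine handles comfortably in a single request. A minimal sketch, assuming a character limit per call (the limit of 40 here is chosen only to make the example short), splits the text at sentence boundaries and prepares one synthesize call per chunk. chunkForSpeech is a hypothetical helper.

```javascript
// Hypothetical helper: splits long text into chunks of at most maxChars,
// breaking only at sentence boundaries where possible.
function chunkForSpeech(text, maxChars = 200) {
  // Naive sentence split: ".", "!" or "?" followed by whitespace.
  const sentences = text.trim().split(/(?<=[.!?])\s+/).filter(Boolean);
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (candidate.length <= maxChars || current === "") {
      current = candidate; // a single over-long sentence is kept whole
    } else {
      chunks.push(current);
      current = sentence;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

const articleContent = "First sentence. Second sentence! Third one? Fourth.";
const calls = chunkForSpeech(articleContent, 40).map((text) => ({
  tool_name: "synthesize",
  arguments: { text },
}));
console.log(calls.map((call) => call.arguments.text));
```

Sending the chunks in order keeps each request small while preserving natural pauses between sentences.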

3. Real-Time Notifications

Create audible notifications for events in your applications.

// Announce a new message
use_mcp_tool({
  server_name: "text_to_speech",
  tool_name: "synthesize",
  arguments: {
    text: "You have a new message from Jane."
  }
});
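
When several events arrive close together, speaking each one separately can become noisy. One option, sketched here with a hypothetical buildAnnouncement helper, is to coalesce pending message events into a single sentence before calling synthesize.

```javascript
// Hypothetical helper: turns a batch of pending message events into one
// spoken sentence instead of one announcement per message.
function buildAnnouncement(senders) {
  if (senders.length === 0) return null;
  const unique = [...new Set(senders)];
  const count = senders.length;
  const noun = count === 1 ? "a new message" : `${count} new messages`;
  if (unique.length === 1) return `You have ${noun} from ${unique[0]}.`;
  if (unique.length === 2) {
    return `You have ${noun} from ${unique[0]} and ${unique[1]}.`;
  }
  return `You have ${noun} from ${unique[0]} and ${unique.length - 1} others.`;
}

const announcement = buildAnnouncement(["Jane", "Jane", "Omar"]);
// The result would be passed as the "text" argument of synthesize.
console.log(announcement);
```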
