OpenAI + ElevenLabs TTS MCP Server

A Model Context Protocol (MCP) server I created to turn text into speech: OpenAI rewrites the input text as a script, and ElevenLabs converts that script to audio.

🌟 Features

Text Transformation: Uses OpenAI's GPT-4o to transform plain text into engaging, light-hearted scripts
Text-to-Speech Conversion: Leverages ElevenLabs' high-quality voice synthesis API
File Input Support: Process text directly or from .txt files
Persistent Storage: Automatically saves generated audio files to the outputs directory
MCP Integration: Seamlessly integrates with Claude and other MCP-compatible AI assistants

📋 Prerequisites

Node.js (v16 or higher)
OpenAI API key
ElevenLabs API key

🚀 Installation

Clone the repository:

```
git clone https://github.com/Gitmaxd/openai-elevenlabs-tts-mcp-server.git
cd openai-elevenlabs-tts-mcp-server
```

Install dependencies:

```
npm install
```

Build the server:

```
npm run build
```

Set up your API keys as environment variables:

```
export OPENAI_API_KEY="your-openai-api-key"
export ELEVENLABS_API_KEY="your-elevenlabs-api-key"
```
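
Both keys need to be set before the server can reach either API. Here is a minimal sketch of a startup check, assuming the server reads the two variables named above from the environment. The helper name `missingApiKeys` is hypothetical and not taken from the project source.

```typescript
// Hypothetical helper: not part of the published server code.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "ELEVENLABS_API_KEY"] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingApiKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((name) => (env[name] ?? "").trim() === "");
}

// Example: only one key is set, so the other one is reported on stderr.
const missing = missingApiKeys({ OPENAI_API_KEY: "your-openai-api-key" });
if (missing.length > 0) {
  console.error(`Missing environment variables: ${missing.join(", ")}`);
}
```

In a real server the argument would be `process.env`; passing the environment in as a parameter just keeps the check easy to test.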

🔧 Configuration

Output Directory

By default, audio files are saved to the outputs directory in the current working directory. You can customize the output location by setting the TTS_OUTPUT_DIR environment variable:

```
export TTS_OUTPUT_DIR="/path/to/custom/output/directory"
```
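
The fallback described above can be sketched as follows. This is an illustration of the documented behavior, not the server's actual code: the function name `resolveOutputDir` is made up, and resolving a relative `TTS_OUTPUT_DIR` against the working directory is an assumption.

```typescript
import * as path from "node:path";

// Hypothetical helper; the real server may resolve the directory differently.
function resolveOutputDir(
  env: Record<string, string | undefined>,
  cwd: string,
): string {
  const custom = (env["TTS_OUTPUT_DIR"] ?? "").trim();
  // Fall back to "<cwd>/outputs" when the variable is unset or blank.
  return custom !== "" ? path.resolve(cwd, custom) : path.join(cwd, "outputs");
}

// Prints the directory that would be used in the current shell session.
console.log(resolveOutputDir(process.env, process.cwd()));
```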

Claude Desktop Integration

To use with Claude Desktop, add the server configuration to your Claude Desktop config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json

```
{
  "mcpServers": {
    "openai-elevenlabs-tts-mcp-server": {
      "command": "/absolute/path/to/openai-elevenlabs-tts-mcp-server/build/index.js"
    }
  }
}
```
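
Claude Desktop launches the server itself, so variables exported in a terminal session may not reach it. The sketch below builds an alternative entry that runs the build output through `node` and passes the keys through the config's `env` field. The path and key values are placeholders, and invoking `node` explicitly is an assumption; the entry shown above may work unchanged if the built file is executable.

```typescript
// Shape of one entry under "mcpServers" in claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

// Hypothetical helper that assembles the config object for this server.
function buildConfig(serverPath: string, openaiKey: string, elevenLabsKey: string) {
  const entry: McpServerEntry = {
    command: "node",
    args: [serverPath],
    env: { OPENAI_API_KEY: openaiKey, ELEVENLABS_API_KEY: elevenLabsKey },
  };
  return { mcpServers: { "openai-elevenlabs-tts-mcp-server": entry } };
}

// Placeholder values: replace them with your own path and keys.
const config = buildConfig(
  "/absolute/path/to/openai-elevenlabs-tts-mcp-server/build/index.js",
  "your-openai-api-key",
  "your-elevenlabs-api-key",
);
console.log(JSON.stringify(config, null, 2));
```

The printed JSON can be pasted into the config file in place of the shorter entry.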

🛠️ Available Tools

generate_audio

Converts text to speech using OpenAI for transformation and ElevenLabs for TTS.

Parameters:

text (string, required): Text content or path to a .txt file to convert to speech

Example Usage in Claude:

I'd like to convert this text to speech: "Hello world, this is a test of the text-to-speech system."
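
Behind a prompt like this, an MCP client sends a JSON-RPC `tools/call` request that names the tool and its arguments. Here is a sketch of that message for `generate_audio`, following the request shape in the MCP specification. The `id` value is arbitrary and the helper name is made up for illustration.

```typescript
// JSON-RPC 2.0 request an MCP client sends to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that builds the request for the generate_audio tool.
function generateAudioRequest(id: number, text: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "generate_audio", arguments: { text } },
  };
}

const request = generateAudioRequest(
  1,
  "Hello world, this is a test of the text-to-speech system.",
);
console.log(JSON.stringify(request, null, 2));
```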

🧪 Development

For development with auto-rebuild:

```
npm run watch
```

Debugging

Since MCP servers communicate over stdio, debugging can be challenging. I recommend using the MCP Inspector:

```
npm run inspector
```

The Inspector will provide a URL to access debugging tools in your browser.
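
One practical consequence of the stdio transport is that stdout carries protocol messages, so diagnostic output is usually written to stderr instead. A minimal sketch follows; `formatLogLine` and `debugLog` are hypothetical helpers, not part of this project.

```typescript
type LogLevel = "info" | "error";

// Formats a log line; kept separate from I/O so it is easy to test.
function formatLogLine(level: LogLevel, message: string, at: Date): string {
  return `[${at.toISOString()}] ${level.toUpperCase()} ${message}`;
}

// Writes to stderr so stdout stays reserved for MCP messages.
function debugLog(level: LogLevel, message: string): void {
  console.error(formatLogLine(level, message, new Date()));
}

debugLog("info", "generate_audio called");
```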

📁 Project Structure

```
.
├── src/
│   └── index.ts         # Main server implementation
├── build/               # Compiled JavaScript files
├── outputs/             # Generated audio files
├── package.json         # Project dependencies and scripts
└── README.md            # This documentation
```

🔄 How It Works

The server receives a text input (direct text or file path)
OpenAI transforms the text into an engaging script
ElevenLabs converts the transformed text to speech
The audio file is saved to the outputs directory
The server returns the path to the generated audio file
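
The steps above can be sketched as a small pipeline. This is an outline under stated assumptions, not the code in `src/index.ts`: all four dependencies are hypothetical stand-ins for file access and the OpenAI and ElevenLabs calls, and treating any input that ends in `.txt` as a file path is a guess at how the two input forms are told apart.

```typescript
// Hypothetical stand-ins for file access and the two external APIs.
interface PipelineDeps {
  readTextFile: (filePath: string) => Promise<string>;
  transform: (text: string) => Promise<string>; // OpenAI step
  synthesize: (script: string) => Promise<Uint8Array>; // ElevenLabs step
  saveAudio: (audio: Uint8Array) => Promise<string>; // resolves to the saved path
}

async function generateAudio(input: string, deps: PipelineDeps): Promise<string> {
  // 1. Assumption: input ending in ".txt" is a file path, anything else is literal text.
  const trimmed = input.trim();
  const text = trimmed.toLowerCase().endsWith(".txt")
    ? await deps.readTextFile(trimmed)
    : input;
  // 2. Rewrite the text as a script.
  const script = await deps.transform(text);
  // 3. Convert the script to audio bytes.
  const audio = await deps.synthesize(script);
  // 4 and 5. Save the audio and return the path to the file.
  return deps.saveAudio(audio);
}

// Stub dependencies so the flow can be run without API keys or disk access.
const stubDeps: PipelineDeps = {
  readTextFile: async (p) => `contents of ${p}`,
  transform: async (t) => `Script: ${t}`,
  synthesize: async (s) => new TextEncoder().encode(s),
  saveAudio: async (a) => `outputs/audio-${a.length}.mp3`,
};

generateAudio("Hello world", stubDeps).then((savedPath) => console.log(savedPath));
```

Injecting the dependencies keeps the ordering of the steps visible and lets the flow be exercised with stubs; a real implementation would replace each stub with the corresponding API call or file operation.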

📚 About MCP (Model Context Protocol)

The Model Context Protocol (MCP) is an open standard that enables AI assistants like Claude to interact with external tools and services. MCP servers provide a standardized way for AI models to access capabilities beyond their training data, such as accessing real-time information or controlling external systems.

Learn more about MCP at modelcontextprotocol.github.io.

👨‍💻 Created by

This MCP Server was created by me, GitMaxd.

📄 License

This project is open source and available under the MIT License.

🔗 Links

Homepage: https://feathered.io
GitHub Repository: https://github.com/Gitmaxd/openai-elevenlabs-tts-mcp-server
