openai-elevenlabs-tts-mcp-server v0.1.2

OpenAI + ElevenLabs TTS MCP Server

A Model Context Protocol (MCP) server I created that converts text to speech, using OpenAI to transform the text into a script and ElevenLabs to synthesize the audio.

🌟 Features

  • Text Transformation: Uses OpenAI's GPT-4o to transform plain text into engaging, light-hearted scripts
  • Text-to-Speech Conversion: Leverages ElevenLabs' high-quality voice synthesis API
  • File Input Support: Process text directly or from .txt files
  • Persistent Storage: Automatically saves generated audio files to the outputs directory
  • MCP Integration: Seamlessly integrates with Claude and other MCP-compatible AI assistants

📋 Prerequisites

  • Node.js (v16 or higher)
  • OpenAI API key
  • ElevenLabs API key

🚀 Installation

  1. Clone the repository:

    git clone https://github.com/Gitmaxd/openai-elevenlabs-tts-mcp-server.git
    cd openai-elevenlabs-tts-mcp-server
  2. Install dependencies:

    npm install
  3. Build the server:

    npm run build
  4. Set up your API keys as environment variables:

    export OPENAI_API_KEY="your-openai-api-key"
    export ELEVENLABS_API_KEY="your-elevenlabs-api-key"
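
Because both keys are required, a startup check can fail early with a readable message instead of a later API error. The following is a minimal sketch, not the server's actual code; `missingApiKeys` and `assertApiKeys` are hypothetical helper names.

```typescript
// Hypothetical startup check; the helper names and error text are illustrative.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "ELEVENLABS_API_KEY"] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingApiKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((name) => (env[name] ?? "").trim() === "");
}

// Throws with a readable message if any required key is absent.
function assertApiKeys(env: Env): void {
  const missing = missingApiKeys(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Node.js entry point this would typically be called once as `assertApiKeys(process.env)` before any request is handled.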

🔧 Configuration

Output Directory

By default, audio files are saved to the outputs directory in the current working directory. You can customize the output location by setting the TTS_OUTPUT_DIR environment variable:

export TTS_OUTPUT_DIR="/path/to/custom/output/directory"
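
The lookup described above can be sketched as follows. This is an illustration under assumptions rather than the server's actual code: `resolveOutputDir` is a hypothetical name, and resolving a relative TTS_OUTPUT_DIR against the working directory is my assumption.

```typescript
import * as path from "node:path";

// Sketch of the documented behavior: TTS_OUTPUT_DIR wins when set,
// otherwise "outputs" under the current working directory.
// Assumption: a relative TTS_OUTPUT_DIR is resolved against cwd.
function resolveOutputDir(
  env: Record<string, string | undefined>,
  cwd: string,
): string {
  const custom = env.TTS_OUTPUT_DIR?.trim();
  return custom ? path.resolve(cwd, custom) : path.join(cwd, "outputs");
}
```

A call such as `resolveOutputDir(process.env, process.cwd())` would give the directory to create and write into.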

Claude Desktop Integration

To use with Claude Desktop, add the server configuration to your Claude Desktop config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "openai-elevenlabs-tts-mcp-server": {
      "command": "/absolute/path/to/openai-elevenlabs-tts-mcp-server/build/index.js"
    }
  }
}
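
A desktop app may not see variables exported in a shell session, depending on how it is launched. Claude Desktop server entries also accept `args` and `env` fields, so a variant of the entry above can pass the keys explicitly and launch the script through `node`. The sketch below builds such an entry and prints it as JSON; `buildServerEntry` is a hypothetical helper and the key values are placeholders.

```typescript
// Shape of one entry under "mcpServers" (only the fields used here).
interface McpServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

// Hypothetical helper: run the built script with node and pass env explicitly.
function buildServerEntry(
  entryPoint: string,
  env: Record<string, string>,
): McpServerEntry {
  return { command: "node", args: [entryPoint], env };
}

const config = {
  mcpServers: {
    "openai-elevenlabs-tts-mcp-server": buildServerEntry(
      "/absolute/path/to/openai-elevenlabs-tts-mcp-server/build/index.js",
      {
        OPENAI_API_KEY: "your-openai-api-key", // placeholder
        ELEVENLABS_API_KEY: "your-elevenlabs-api-key", // placeholder
      },
    ),
  },
};

// Paste the printed JSON into claude_desktop_config.json.
console.log(JSON.stringify(config, null, 2));
```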

🛠️ Available Tools

generate_audio

Converts text to speech using OpenAI for transformation and ElevenLabs for TTS.

Parameters:

  • text (string, required): Text content or path to a .txt file to convert to speech
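
Since the single `text` parameter carries either literal text or a file path, the server has to decide which one it received. One plausible rule is sketched below. `resolveInputText` is a hypothetical helper, and the rule itself (ends in .txt and exists as a regular file) is an assumption, not confirmed behavior.

```typescript
import * as fs from "node:fs";

// Hypothetical helper: treat the argument as a path only when it ends in
// ".txt" and points at an existing regular file; otherwise use it verbatim.
function resolveInputText(text: string): string {
  const candidate = text.trim();
  const looksLikeTxt = candidate.toLowerCase().endsWith(".txt");
  if (looksLikeTxt && fs.existsSync(candidate) && fs.statSync(candidate).isFile()) {
    return fs.readFileSync(candidate, "utf8");
  }
  return text;
}
```

With this rule, a sentence that merely mentions a `.txt` name is still spoken as text unless a file with that exact path exists.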

Example Usage in Claude:

I'd like to convert this text to speech: "Hello world, this is a test of the text-to-speech system."

🧪 Development

For development with auto-rebuild:

npm run watch

Debugging

Since MCP servers communicate over stdio, debugging can be challenging. I recommend using the MCP Inspector:

npm run inspector

The Inspector will provide a URL to access debugging tools in your browser.

📁 Project Structure

.
├── src/
│   └── index.ts         # Main server implementation
├── build/               # Compiled JavaScript files
├── outputs/             # Generated audio files
├── package.json         # Project dependencies and scripts
└── README.md            # This documentation

🔄 How It Works

  1. The server receives a text input (direct text or file path)
  2. OpenAI transforms the text into an engaging script
  3. ElevenLabs converts the transformed text to speech
  4. The audio file is saved to the outputs directory
  5. The server returns the path to the generated audio file
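
Steps 2 through 5 can be sketched as a small pipeline. To avoid guessing at the OpenAI and ElevenLabs client calls, the two API steps are injected as functions. `PipelineDeps`, `generateAudio`, the timestamped file name, and the `.mp3` extension are all assumptions made for illustration.

```typescript
import * as fs from "node:fs/promises";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical seams standing in for the OpenAI and ElevenLabs calls.
interface PipelineDeps {
  transform: (text: string) => Promise<string>; // step 2: text -> script
  synthesize: (script: string) => Promise<Uint8Array>; // step 3: script -> audio bytes
}

async function generateAudio(
  text: string,
  outputDir: string,
  deps: PipelineDeps,
): Promise<string> {
  const script = await deps.transform(text);
  const audio = await deps.synthesize(script);
  // Step 4: make sure the output directory exists, then save the audio.
  await fs.mkdir(outputDir, { recursive: true });
  const filePath = path.join(outputDir, `audio-${Date.now()}.mp3`); // assumed naming
  await fs.writeFile(filePath, audio);
  // Step 5: hand the saved path back to the caller.
  return filePath;
}

// Usage with stubs, so the sketch runs without API keys.
const demo = generateAudio("Hello world", path.join(os.tmpdir(), "tts-demo"), {
  transform: async (t) => `Script: ${t}`,
  synthesize: async (s) => new TextEncoder().encode(s),
});
demo.then((savedPath) => console.log("saved to", savedPath));
```

Injecting the two API steps keeps the file handling testable on its own; real implementations would replace the stubs with calls to the respective SDKs or HTTP endpoints.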

📚 About MCP (Model Context Protocol)

The Model Context Protocol (MCP) is an open standard that enables AI assistants like Claude to interact with external tools and services. MCP servers provide a standardized way for AI models to access capabilities beyond their training data, such as accessing real-time information or controlling external systems.

Learn more about MCP at modelcontextprotocol.github.io.

👨‍💻 Created by

This MCP Server was created by me, GitMaxd.

📄 License

This project is open source and available under the MIT License.

🔗 Links