1.1.4 • Published 5 months ago

gemini-multimodal-live-voice-only v1.1.4

License: MIT

gemini-multimodal-live-voice-only

A React-based multimodal live streaming library that provides:

✅ A Live API context

✅ Audio processing (real-time input/output, volume meters)

✅ Pre-built UI components (ControlTray for audio controls)

✅ Tool call handling for function-based interactions

This package bundles JavaScript/TypeScript logic for easy integration into any React project.


✨ Features

  • 🔹 Live API Context & Hook

    Provides LiveAPIProvider (context) and useLiveAPIContext (hook) to manage connections, audio streaming, and configuration.

  • 🔹 Built-in Audio Processing

    Handles real-time audio input/output with volume meters.

  • 🔹 Pre-Built UI Components

    Includes ControlTray, a ready-to-use component for connect/disconnect, mute/unmute, and volume monitoring.

  • 🔹 Tool Call Handling

    Built-in tool call handler for functions like "create_todo" and more.

  • 🔹 Easy Styling

    Apply the bundled styles with a single import:

    import 'gemini-multimodal-live-voice-only/dist/gemini-multimodal-live-voice-only.css';

📌 Installation

Install the package using npm:

npm install gemini-multimodal-live-voice-only

🚀 Usage

Basic Setup

Wrap your app with LiveAPIProvider and use the built-in UI components.

import React from 'react';
import { LiveAPIProvider, ControlTray } from 'gemini-multimodal-live-voice-only';
import 'gemini-multimodal-live-voice-only/dist/gemini-multimodal-live-voice-only.css';

const App = () => (
  <LiveAPIProvider
    apiKey={"your-api-key"}
    dynamicConfig={{
      voiceName: "Kore",
      systemInstruction: {
        parts: [{ text: "You are AI of omiii. Follow the provided tools and instructions." }]
      },
      tools: [
        { googleSearch: {} },
        { functionDeclarations: [] }
      ]
    }}
  >
    <ControlTray />
  </LiveAPIProvider>
);

export default App;

Accessing the Live API Context

Use the useLiveAPIContext hook to manage connections and audio controls.

import React from 'react';
import { useLiveAPIContext } from 'gemini-multimodal-live-voice-only';

const StatusDisplay = () => {
  const { connected, connect, disconnect, volume, muted, mute, unmute } = useLiveAPIContext();

  return (
    <div>
      <h2>Status: {connected ? 'Connected' : 'Disconnected'}</h2>
      <button onClick={connected ? disconnect : connect}>
        {connected ? 'Disconnect' : 'Connect'}
      </button>
      <button onClick={muted ? unmute : mute}>
        {muted ? 'Unmute' : 'Mute'}
      </button>
      <p>Volume: {volume}</p>
    </div>
  );
};

export default StatusDisplay;

📖 API Reference

LiveAPIProvider

Initializes and provides the live API context.

Props:

| Prop Name | Type | Required | Description |
|---|---|---|---|
| apiKey | string | ✅ Yes | API key for authentication. |
| dynamicConfig | object | ✅ Yes | Contains configuration settings. |
| url | string | ❌ No | API URL (defaults to Gemini Live API). |

🔹 dynamicConfig options:

  • voiceName (string): Sets the voice. Available voices: "Puck", "Charon", "Kore", "Fenrir", "Aoede".
  • systemInstruction (object): Defines system behavior as an object with a parts array of { text: string } objects.
  • tools (array): Tools available to the model, such as { googleSearch: {} } and { functionDeclarations: [...] } (Google Gemini function calling format).
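
As a rough illustration of how these options fit together, the sketch below builds a dynamicConfig value with a typed voice name and one function declaration. The names VoiceName, FunctionDeclaration, Tool, DynamicConfig, and buildConfig are hypothetical and defined locally; the package may export its own types. The parameters schema uses the uppercase type names from the Gemini function calling format, and it is an assumption that the package forwards the declaration unchanged.

```typescript
// Hypothetical local types; the package may export its own equivalents.
type VoiceName = "Puck" | "Charon" | "Kore" | "Fenrir" | "Aoede";

interface FunctionDeclaration {
  name: string;
  description: string;
  parameters: {
    type: "OBJECT";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

interface Tool {
  googleSearch?: Record<string, never>;
  functionDeclarations?: FunctionDeclaration[];
}

interface DynamicConfig {
  voiceName: VoiceName;
  systemInstruction: { parts: { text: string }[] };
  tools: Tool[];
}

// Declaration for a "create_item" function; the "title" argument is illustrative.
const createItem: FunctionDeclaration = {
  name: "create_item",
  description: "Create a new item with the given title.",
  parameters: {
    type: "OBJECT",
    properties: { title: { type: "STRING", description: "Title of the item" } },
    required: ["title"],
  },
};

// Assemble the object passed to the dynamicConfig prop.
function buildConfig(voiceName: VoiceName, instruction: string): DynamicConfig {
  return {
    voiceName,
    systemInstruction: { parts: [{ text: instruction }] },
    tools: [{ googleSearch: {} }, { functionDeclarations: [createItem] }],
  };
}

const dynamicConfig = buildConfig("Kore", "Follow the provided tools and instructions.");
```

The resulting object can be passed as dynamicConfig={dynamicConfig} to LiveAPIProvider. Restricting voiceName to a union type lets the compiler reject misspelled voice names before the connection is attempted.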

useLiveAPIContext

Hook for managing the live API connection.

Returns:

| Property | Type | Description |
|---|---|---|
| client | object | API client instance. |
| config | object | Current API configuration. |
| setConfig | function | Updates the API configuration. |
| connected | boolean | true if connected, false otherwise. |
| connect | function | Establishes a connection. |
| disconnect | function | Closes the connection. |
| volume | number | Current audio volume level. |
| muted | boolean | true if muted, false otherwise. |
| mute | function | Mutes the microphone. |
| unmute | function | Unmutes the microphone. |
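
The volume value can drive a custom meter. The helper below is a minimal sketch that converts it to a clamped percentage; volumeToPercent and its maxVolume parameter are hypothetical, and the assumption that volume normally falls between 0 and 1 is not confirmed by the package documentation, so adjust maxVolume to the range you observe.

```typescript
// Assumption: volume is a non-negative number, normally between 0 and maxVolume.
function volumeToPercent(volume: number, maxVolume = 1): number {
  // Treat NaN, Infinity, and an invalid maximum as silence.
  if (!Number.isFinite(volume) || maxVolume <= 0) return 0;
  // Clamp the ratio to [0, 1] so the meter never overflows its container.
  const ratio = Math.min(Math.max(volume / maxVolume, 0), 1);
  return Math.round(ratio * 100);
}
```

In a component, the result could set the width of a bar element, for example a style of width: volumeToPercent(volume) followed by a percent sign.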

ControlTray

Pre-built UI component for managing audio controls.

Features:

✔ Connect/Disconnect button

✔ Mute/Unmute button

✔ Volume level visualization

<ControlTray />

🛠 Tool Call Handler Example

Process tool calls dynamically based on function names:

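import { useEffect } from 'react';
import { useLiveAPIContext } from 'gemini-multimodal-live-voice-only';

// Renders nothing; mount it anywhere inside LiveAPIProvider.
const ToolCallHandler = () => {
  const { client } = useLiveAPIContext();

  useEffect(() => {
    const onToolCall = async (toolCall) => {
      // Resolve all function calls in parallel; each one yields a response.
      const responses = await Promise.all(
        toolCall.functionCalls.map(async (fc) => {
          switch (fc.name) {
            case "create_item":
              try {
                const response = await fetch("http://localhost:5000/items", {
                  method: "POST",
                  headers: { "Content-Type": "application/json" },
                  body: JSON.stringify(fc.args),
                });
                // Treat non-2xx statuses as failures before parsing the body.
                if (!response.ok) throw new Error(`HTTP ${response.status}`);
                const data = await response.json();
                return { id: fc.id, response: { output: data } };
              } catch (error) {
                // Report the failure to the model instead of dropping the call.
                return { id: fc.id, response: { output: { error: error.message } } };
              }
            default:
              return { id: fc.id, response: { output: { error: "Unknown function" } } };
          }
        })
      );
      // Send the responses back after a short delay.
      setTimeout(() => client.sendToolResponse({ functionResponses: responses }), 200);
    };

    client.on("toolcall", onToolCall);
    // Unsubscribe when the component unmounts or the client changes.
    return () => client.off("toolcall", onToolCall);
  }, [client]);

  return null;
};

export default ToolCallHandler;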
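
As more functions are added, the switch statement in the handler above can grow unwieldy. One alternative, sketched here, is a lookup table that maps function names to handlers. The FunctionCall and FunctionResponse shapes are inferred from the example above, and buildToolResponses and ToolHandler are hypothetical names rather than part of the package.

```typescript
// Shapes inferred from the handler example; treat them as assumptions.
interface FunctionCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface FunctionResponse {
  id: string;
  response: { output: unknown };
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Resolve every function call through a name-to-handler lookup table.
async function buildToolResponses(
  functionCalls: FunctionCall[],
  handlers: Record<string, ToolHandler>,
): Promise<FunctionResponse[]> {
  return Promise.all(
    functionCalls.map(async (fc) => {
      // Own-property check so inherited keys such as "toString" are not matched.
      const handler = Object.prototype.hasOwnProperty.call(handlers, fc.name)
        ? handlers[fc.name]
        : undefined;
      if (!handler) {
        return { id: fc.id, response: { output: { error: "Unknown function" } } };
      }
      try {
        return { id: fc.id, response: { output: await handler(fc.args) } };
      } catch (error) {
        // Convert any thrown value into an error payload for the model.
        const message = error instanceof Error ? error.message : String(error);
        return { id: fc.id, response: { output: { error: message } } };
      }
    }),
  );
}
```

Inside the toolcall listener, the result of buildToolResponses(toolCall.functionCalls, handlers) would be passed to client.sendToolResponse as functionResponses. Because the function does not depend on React or the client, it can be unit tested with stub handlers.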

🎨 Style Guide

Importing Styles

To apply default styles, import the CSS file:

import 'gemini-multimodal-live-voice-only/dist/gemini-multimodal-live-voice-only.css';

Custom Styling

Override styles using CSS classes:

.control-tray {
  background-color: #282c34;
  color: white;
  padding: 10px;
  border-radius: 8px;
}

/* Scoped to the tray so other buttons in the app are unaffected. */
.control-tray button {
  background-color: #61dafb;
  border: none;
  padding: 8px 16px;
  margin: 5px;
  border-radius: 4px;
  cursor: pointer;
}

.control-tray button:hover {
  background-color: #4fa3d1;
}

🛠 Build & Development

Development Setup

  1. Clone the repository:
    git clone https://github.com/omanandswami2005/gemini-multimodal-live-voice-only-NPM.git
    cd gemini-multimodal-live-voice-only-NPM
    npm install
  2. Start development mode:
    npm run dev

Building the Package

Run the following command to build the package:

npm run build

🤝 Contributing

1️⃣ Fork the repository

2️⃣ Make changes

3️⃣ Submit a pull request

For major changes, open an issue before starting development.


📩 Support

For issues and questions:

🔹 GitHub: Open an issue

🔹 Email: omanandswami2005@gmail.com


Happy coding! 🚀
