1.1.4 • Published 11 months ago

License: MIT
Repository: GitHub (https://github.com/omanandswami2005/gemini-multimodal-live-voice-only-NPM)

gemini-multimodal-live-voice-only

A React-based multimodal live streaming library that provides:

✅ A Live API context

✅ Audio processing (real-time input/output, volume meters)

✅ Pre-built UI components (ControlTray for audio controls)

✅ Tool call handling for function-based interactions

This package bundles JavaScript/TypeScript logic for easy integration into any React project.


✨ Features

  • 🔹 Live API Context & Hook

    Provides LiveAPIProvider (context) and useLiveAPIContext (hook) to manage connections, audio streaming, and configuration.

  • 🔹 Built-in Audio Processing

    Handles real-time audio input/output with volume meters.

  • 🔹 Pre-Built UI Components

    Includes ControlTray, a ready-to-use component for connect/disconnect, mute/unmute, and volume monitoring.

  • 🔹 Tool Call Handling

    Built-in tool call handler for functions like "create_todo" and more.

  • 🔹 Easy Styling

    Load the bundled styles with a single import:

    import 'gemini-multimodal-live-voice-only/dist/gemini-multimodal-live-voice-only.css';

📌 Installation

Install the package using npm:

npm install gemini-multimodal-live-voice-only

🚀 Usage

Basic Setup

Wrap your app with LiveAPIProvider and use the built-in UI components.

import React from 'react';
import { LiveAPIProvider, ControlTray } from 'gemini-multimodal-live-voice-only';
import 'gemini-multimodal-live-voice-only/dist/gemini-multimodal-live-voice-only.css';

const App = () => (
  <LiveAPIProvider
    apiKey={"your-api-key"}
    dynamicConfig={{
      voiceName: "Kore",
      systemInstruction: {
        parts: [{ text: "You are AI of omiii. Follow the provided tools and instructions." }]
      },
      tools: [
        { googleSearch: {} },
        { functionDeclarations: [] }
      ]
    }}
  >
    <ControlTray />
  </LiveAPIProvider>
);

export default App;

Accessing the Live API Context

Use the useLiveAPIContext hook to manage connections and audio controls.

import React from 'react';
import { useLiveAPIContext } from 'gemini-multimodal-live-voice-only';

const StatusDisplay = () => {
  const { connected, connect, disconnect, volume, muted, mute, unmute } = useLiveAPIContext();

  return (
    <div>
      <h2>Status: {connected ? 'Connected' : 'Disconnected'}</h2>
      <button onClick={connected ? disconnect : connect}>
        {connected ? 'Disconnect' : 'Connect'}
      </button>
      <button onClick={muted ? unmute : mute}>
        {muted ? 'Unmute' : 'Mute'}
      </button>
      <p>Volume: {volume}</p>
    </div>
  );
};

export default StatusDisplay;
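
The component above prints the raw volume value. A custom meter usually needs that value as a bounded percentage. The following is a minimal sketch of one way to do the mapping. It assumes volume is a number normalized to the range 0 to 1, which this README does not state; volumeToPercent and its maxVolume parameter are hypothetical names, not part of the package.

```javascript
// Hypothetical helper (not part of the package): converts a volume reading
// into an integer percentage that is safe to use as a meter width.
// ASSUMPTION: volume is normalized to 0..maxVolume (default 1).
function volumeToPercent(volume, maxVolume = 1) {
  // Treat non-numeric or non-finite readings as silence.
  if (typeof volume !== "number" || !Number.isFinite(volume)) {
    return 0;
  }
  // Clamp the ratio to [0, 1] so the meter never overflows or goes negative.
  const ratio = Math.min(Math.max(volume / maxVolume, 0), 1);
  return Math.round(ratio * 100);
}

// Example: build an inline style for a meter bar.
const meterStyle = { width: `${volumeToPercent(0.25)}%` };
```

Inside StatusDisplay, a style object like meterStyle could be applied to a div in place of the plain Volume paragraph. If the real range of volume differs, pass a matching maxVolume.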

📖 API Reference

LiveAPIProvider

Initializes and provides the live API context.

Props:

| Prop Name | Type | Required | Description |
|---|---|---|---|
| apiKey | string | ✅ Yes | API key for authentication. |
| dynamicConfig | object | ✅ Yes | Contains configuration settings. |
| url | string | ❌ No | API URL (defaults to Gemini Live API). |

🔹 dynamicConfig options:

  • voiceName (string): Sets the voice. Available voices: "Puck", "Charon", "Kore", "Fenrir", "Aoede".
  • systemInstruction (object): Defines system behavior. An object with a parts array of { text: string } objects.
  • tools (array): Tool definitions, such as { googleSearch: {} } and { functionDeclarations: [...] } (Google Gemini function calling format).
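
The options above can be assembled with a small helper so that an unsupported voice name fails early. This is a minimal sketch, not part of the package: buildDynamicConfig, its enableSearch flag, and the create_item parameter schema are all hypothetical. The schema follows the general shape of Gemini function declarations (name, description, parameters), but check the exact field values against the Gemini documentation.

```javascript
// Voices listed in the dynamicConfig options above.
const AVAILABLE_VOICES = ["Puck", "Charon", "Kore", "Fenrir", "Aoede"];

// Hypothetical declaration for a "create_item" tool.
// ASSUMPTION: the "title" parameter is invented for illustration.
const createItemDeclaration = {
  name: "create_item",
  description: "Create a new item with the given title.",
  parameters: {
    type: "OBJECT",
    properties: {
      title: { type: "STRING", description: "Title of the item" },
    },
    required: ["title"],
  },
};

// Hypothetical helper (not part of the package): builds a dynamicConfig
// object in the shape shown in the Basic Setup example.
function buildDynamicConfig({
  voiceName = "Kore",
  instruction,
  functionDeclarations = [],
  enableSearch = false,
}) {
  if (!AVAILABLE_VOICES.includes(voiceName)) {
    throw new Error(`Unknown voice "${voiceName}"`);
  }
  const tools = [];
  if (enableSearch) {
    tools.push({ googleSearch: {} });
  }
  tools.push({ functionDeclarations });
  return {
    voiceName,
    systemInstruction: { parts: [{ text: instruction }] },
    tools,
  };
}

const dynamicConfig = buildDynamicConfig({
  voiceName: "Puck",
  instruction: "You are a helpful voice assistant.",
  functionDeclarations: [createItemDeclaration],
});
```

The resulting object can be passed as the dynamicConfig prop of LiveAPIProvider.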

useLiveAPIContext

Hook for managing the live API connection.

Returns:

| Property | Type | Description |
|---|---|---|
| client | object | API client instance. |
| config | object | Current API configuration. |
| setConfig | function | Updates the API configuration. |
| connected | boolean | true if connected, false otherwise. |
| connect | function | Establishes a connection. |
| disconnect | function | Closes the connection. |
| volume | number | Current audio volume level. |
| muted | boolean | true if muted, false otherwise. |
| mute | function | Mutes the microphone. |
| unmute | function | Unmutes the microphone. |

ControlTray

Pre-built UI component for managing audio controls.

Features:

✔ Connect/Disconnect button

✔ Mute/Unmute button

✔ Volume level visualization

<ControlTray />

🛠 Tool Call Handler Example

Process tool calls dynamically based on function names:

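import { useEffect } from 'react';
import { useLiveAPIContext } from 'gemini-multimodal-live-voice-only';

// Render this component inside LiveAPIProvider so the hook can reach the client.
const ToolCallHandler = () => {
  const { client } = useLiveAPIContext();

  useEffect(() => {
    const onToolCall = async (toolCall) => {
      // Resolve every function call in the batch in parallel.
      const responses = await Promise.all(
        toolCall.functionCalls.map(async (fc) => {
          switch (fc.name) {
            case "create_item":
              try {
                const response = await fetch("http://localhost:5000/items", {
                  method: "POST",
                  headers: { "Content-Type": "application/json" },
                  body: JSON.stringify(fc.args),
                });
                const data = await response.json();
                return { id: fc.id, response: { output: data } };
              } catch (error) {
                // Report the failure to the model instead of throwing.
                return { id: fc.id, response: { output: { error: error.message } } };
              }
            default:
              return { id: fc.id, response: { output: { error: "Unknown function" } } };
          }
        })
      );
      // Send all responses back in one message after a short delay.
      setTimeout(() => client.sendToolResponse({ functionResponses: responses }), 200);
    };

    // Subscribe on mount and unsubscribe on cleanup.
    client.on("toolcall", onToolCall);
    return () => client.off("toolcall", onToolCall);
  }, [client]);

  return null; // This component renders no UI.
};

export default ToolCallHandler;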
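
As more functions are added, the switch statement above tends to grow. One possible alternative is a lookup table of handlers keyed by function name. The sketch below is an illustration under assumptions, not part of the package: createToolCallDispatcher is a hypothetical helper, and the in-memory create_item handler stands in for the fetch call. It produces responses in the same { id, response: { output } } shape used above.

```javascript
// Hypothetical helper (not part of the package): builds a function that
// turns a tool call into an array of function responses.
// handlers: plain object mapping a function name to an async handler(args).
function createToolCallDispatcher(handlers) {
  // A Map avoids accidental matches on inherited keys such as "toString".
  const table = new Map(Object.entries(handlers));

  return async function dispatch(toolCall) {
    return Promise.all(
      toolCall.functionCalls.map(async (fc) => {
        const handler = table.get(fc.name);
        if (!handler) {
          return { id: fc.id, response: { output: { error: "Unknown function" } } };
        }
        try {
          const output = await handler(fc.args);
          return { id: fc.id, response: { output } };
        } catch (error) {
          // Report handler failures as data so one bad call does not reject the batch.
          return { id: fc.id, response: { output: { error: error.message } } };
        }
      })
    );
  };
}

// Example handler table. ASSUMPTION: items are kept in memory for illustration.
const items = [];
const dispatch = createToolCallDispatcher({
  create_item: async (args) => {
    items.push(args);
    return { created: true, count: items.length };
  },
});
```

Inside the toolcall listener, the responses would then come from await dispatch(toolCall) and be passed to client.sendToolResponse({ functionResponses: responses }) as before.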

🎨 Style Guide

Importing Styles

To apply default styles, import the CSS file:

import 'gemini-multimodal-live-voice-only/dist/gemini-multimodal-live-voice-only.css';

Custom Styling

Override styles using CSS classes:

.control-tray {
  background-color: #282c34;
  color: white;
  padding: 10px;
  border-radius: 8px;
}

button {
  background-color: #61dafb;
  border: none;
  padding: 8px 16px;
  margin: 5px;
  border-radius: 4px;
  cursor: pointer;
}

button:hover {
  background-color: #4fa3d1;
}

🛠 Build & Development

Development Setup

  1. Clone the repository:
    git clone https://github.com/omanandswami2005/gemini-multimodal-live-voice-only-NPM.git
    cd gemini-multimodal-live-voice-only-NPM
    npm install
  2. Start development mode:
    npm run dev

Building the Package

Run the following command to build the package:

npm run build

🤝 Contributing

1️⃣ Fork the repository

2️⃣ Make changes

3️⃣ Submit a pull request

For major changes, open an issue before starting development.


📩 Support

For issues and questions:

🔹 GitHub: Open an issue

🔹 Email: omanandswami2005@gmail.com


Happy coding! 🚀
