1.0.1 • Published 6 months ago

gemini-livejs v1.0.1

Gemini Live WebSocket Client

A TypeScript client for real-time communication with Google's Gemini API via WebSocket, supporting text, audio, function calls, and executable code responses.

Features

| Feature | Description |
|---|---|
| Text to Text | Standard text conversation |
| Text to Audio | Convert text input to audio response |
| Audio to Text | Convert audio input to text response |
| Audio to Audio | Process audio input and respond with audio |
| Function Calls | Handle function calling capabilities |
| Executable Code | Receive executable code responses |
| Real-time Streaming | Stream responses in real-time |
| Custom Configuration | Configurable model parameters |
| Safety Settings | Customizable content safety filters |
| Connection Events | Handlers for connection lifecycle |

Limitations

| Feature | Description |
|---|---|
| Code Execution | tools.codeExecution not implemented yet |
| Multi-part Response | Only supports text and inlineData parts |
| Multiple Function Calls | Limited to single function call output |
| Multiple Code Blocks | Limited to single executable code output |

Installation

npm install gemini-livejs

Basic Usage

import { GeminiLive } from "gemini-livejs";

// Initialize client
const client = new GeminiLive("YOUR_API_KEY", {
  generationConfig: {
    temperature: 0.7,
    maxOutputTokens: 2000,
  },
});

// Basic text conversation
async function chat() {
  const response = await client.send({ prompt: "Hello, how are you?" });
  console.log(response.text);
}

// Real-time streaming (audio and text responses)
client.realtime((response) => {
  if (response.type === "audio") {
    // Handle audio response
    console.log(response.audio.data);
  } else if (response.type === "text") {
    // Handle text response
    console.log(response.text);
  }
});

Note: send and realtime won't work until the client is connected and the handshake has completed.
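The realtime callback above branches on response.type. The following is a minimal sketch of collecting those streamed parts. It assumes text responses carry a string in text and audio responses carry a string chunk in audio.data; the README does not state the audio encoding, so the chunks are stored untouched. LiveResponse and ResponseCollector are hypothetical names used for illustration and are not part of gemini-livejs.

```typescript
// Assumed response shapes, inferred from the fields the Basic Usage example reads.
type LiveResponse =
  | { type: "text"; text: string }
  | { type: "audio"; audio: { data: string } };

// Hypothetical helper that accumulates streamed parts in arrival order.
class ResponseCollector {
  private textParts: string[] = [];
  private audioChunks: string[] = [];

  // Route one streamed response to the matching buffer.
  handle(response: LiveResponse): void {
    if (response.type === "text") {
      this.textParts.push(response.text);
    } else {
      this.audioChunks.push(response.audio.data);
    }
  }

  // Full text received so far.
  text(): string {
    return this.textParts.join("");
  }

  // Audio chunks received so far, untouched.
  audio(): readonly string[] {
    return this.audioChunks;
  }
}

const collector = new ResponseCollector();
collector.handle({ type: "text", text: "Hello, " });
collector.handle({ type: "audio", audio: { data: "<audio chunk>" } });
collector.handle({ type: "text", text: "world" });
console.log(collector.text()); // "Hello, world"
```

With a connected client, the collector could be wired in as client.realtime((response) => collector.handle(response)), provided the library's response type matches the shapes assumed here.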

Advanced Features

Function Calling

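// ParameterType is assumed to be exported by gemini-livejs alongside GeminiLive.
import { GeminiLive, ParameterType } from "gemini-livejs";

const client = new GeminiLive("YOUR_API_KEY", {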
  tools: {
    functionDeclarations: [
      {
        name: "get_weather",
        description: "Get current weather",
        parameters: {
          type: ParameterType.OBJECT,
          properties: {
            location: {
              type: ParameterType.STRING,
              description: "City name",
            },
          },
        },
      },
    ],
  },
});

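// Handle function calls
client
  .send({ prompt: "What's the weather in Tokyo?" })
  .then((response) => {
    if (response.type === "function" && response.functionCall) {
      console.log(response.functionCall);
    }
  })
  // Surface request failures instead of leaving the rejection unhandled
  .catch((error) => console.error("Request failed:", error));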
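Once a function call arrives, the application still has to run the matching local code. The following is a minimal sketch of that dispatch step. It assumes response.functionCall has a name and an optional args object, mirroring the functionCall part in the Gemini API; the README does not confirm this shape for gemini-livejs. FunctionCall, Handler, handlers, and dispatchFunctionCall are hypothetical names, and the weather result is placeholder data.

```typescript
// Assumed shape of response.functionCall (not confirmed by the README).
type FunctionCall = { name: string; args?: Record<string, unknown> };
type Handler = (args: Record<string, unknown>) => unknown;

// Local implementations keyed by the declared function name.
const handlers = new Map<string, Handler>([
  [
    "get_weather",
    (args) => {
      const raw = args["location"];
      const location = typeof raw === "string" ? raw : "unknown";
      // Placeholder result; a real handler would query a weather service.
      return { location, forecast: "placeholder" };
    },
  ],
]);

// Look up the handler for a call and run it with the supplied arguments.
function dispatchFunctionCall(call: FunctionCall): unknown {
  const handler = handlers.get(call.name);
  if (!handler) {
    throw new Error(`No handler registered for "${call.name}"`);
  }
  return handler(call.args ?? {});
}

console.log(dispatchFunctionCall({ name: "get_weather", args: { location: "Tokyo" } }));
```

Inside the then handler above, this could be called as dispatchFunctionCall(response.functionCall). How a result is returned to the model is not covered in this README, so the sketch stops at producing the value.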

Event Handling

client
  .on_open(() => console.log("Connected!"))
  .on_handshake(() => console.log("Handshake complete"))
  .on_close((reason) => console.log("Closed:", reason));
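Because send and realtime need a completed handshake, it can help to defer work until on_handshake fires. The following is a minimal sketch of that idea. Only on_handshake comes from this README; HandshakeSource, ReadyGate, and the stand-in client are hypothetical. The sketch also assumes that registering this handler does not displace one set elsewhere, which the README does not specify.

```typescript
// Minimal structural type for the one client method this sketch relies on.
interface HandshakeSource {
  on_handshake(handler: () => void): unknown;
}

// Hypothetical helper: queues tasks until the handshake fires, then runs them in order.
class ReadyGate {
  private ready = false;
  private pending: Array<() => void> = [];

  constructor(source: HandshakeSource) {
    source.on_handshake(() => {
      this.ready = true;
      const queued = this.pending;
      this.pending = [];
      for (const task of queued) task();
    });
  }

  // Run the task now if the handshake is done, otherwise defer it.
  run(task: () => void): void {
    if (this.ready) {
      task();
    } else {
      this.pending.push(task);
    }
  }
}

// Stand-in for a real client so the sketch runs without a network connection.
let completeHandshake: () => void = () => {};
const fakeClient: HandshakeSource = {
  on_handshake(handler) {
    completeHandshake = handler;
    return fakeClient;
  },
};

const gate = new ReadyGate(fakeClient);
gate.run(() => console.log("sent after handshake"));
completeHandshake(); // the queued task runs here
```

With a real client, the gate would wrap calls such as gate.run(() => client.send({ prompt: "Hello" })), so prompts issued early are held until the handshake completes.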

Configuration Options

type GeminiConfig = {
  generationConfig?: {
    maxOutputTokens?: number;
    temperature?: number;
    topP?: number;
    topK?: number;
    responseType?: "TEXT" | "AUDIO";
    voiceName?: "Aoede" | "Charon" | "Fenrir" | "Kore" | "Puck";
  };
  systemInstruction?: string;
  tools?: ToolsConfig;
  safetySettings?: SafetySettings[];
};
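The following is a minimal sketch of building configurations from the fields listed above, for example deriving an audio-output config from a text one. The types are local copies of the documented fields so the snippet compiles on its own; SketchConfig and withAudio are hypothetical names, and tools and safetySettings are left out because their shapes are not shown here.

```typescript
// Local copies of the documented fields (not imported from gemini-livejs).
type VoiceName = "Aoede" | "Charon" | "Fenrir" | "Kore" | "Puck";

type GenerationConfig = {
  maxOutputTokens?: number;
  temperature?: number;
  topP?: number;
  topK?: number;
  responseType?: "TEXT" | "AUDIO";
  voiceName?: VoiceName;
};

type SketchConfig = {
  generationConfig?: GenerationConfig;
  systemInstruction?: string;
};

// A text-output configuration using the values from Basic Usage.
const textConfig: SketchConfig = {
  generationConfig: { temperature: 0.7, maxOutputTokens: 2000, responseType: "TEXT" },
  systemInstruction: "Answer briefly.",
};

// Hypothetical helper: copies a config and switches it to audio output with the given voice.
function withAudio(config: SketchConfig, voiceName: VoiceName): SketchConfig {
  return {
    ...config,
    generationConfig: { ...config.generationConfig, responseType: "AUDIO", voiceName },
  };
}

const audioConfig = withAudio(textConfig, "Kore");
console.log(audioConfig.generationConfig);
```

The result would be passed as the second argument to new GeminiLive("YOUR_API_KEY", audioConfig), assuming the library's GeminiConfig accepts the same fields. Typing voiceName as a union lets the compiler reject names outside the five listed voices.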

Examples

For more examples, see the examples directory.

License

MIT

Contributing

Contributions welcome! Please feel free to submit a Pull Request.
