llama.cpp-ts 🦙

LlamaCPP-ts is a Node.js binding for the llama.cpp framework. It provides an easy-to-use interface for running language models in Node.js applications, with support for asynchronous streaming responses.

Supported Systems:

  • macOS
  • Windows (not tested yet)
  • Linux (not tested yet)

Models

You can find some models here.

The examples below use Meta-Llama-3.1-8B-Instruct-Q3_K_S.gguf.

Installation

Ensure that you have CMake installed on your system:

  • On macOS: brew install cmake
  • On Windows: choco install cmake
  • On Linux (Debian/Ubuntu): sudo apt-get install cmake

Then, install the package:

npm install llama.cpp-ts
# or
yarn add llama.cpp-ts
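
If the native addon failed to compile, simply requiring the package will throw. A quick sanity check after installing (this one-liner is a suggestion, not part of the package's documented workflow):

node -e "require('llama.cpp-ts')"

If the command exits without an error, the binding built and loads correctly.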

Usage

Basic Usage

const { Llama } = require('llama.cpp-ts');

async function main() {
    const llama = new Llama();
    const modelPath = "./path/to/your/model.gguf";
    const modelParams = { nGpuLayers: 32 };   // number of layers to offload to the GPU
    const contextParams = { nContext: 2048 }; // context window size in tokens

    if (!llama.initialize(modelPath, modelParams, contextParams)) {
        console.error("Failed to initialize the model");
        return;
    }

    llama.setSystemPrompt("You are a helpful assistant. Always provide clear, concise, and accurate answers.");

    const question = "What is the capital of France?";
    const tokenStream = llama.prompt(question);

    console.log("Question:", question);
    console.log("Answer: ");

    // Stream the answer token by token; read() resolves to null when the stream ends.
    while (true) {
        const token = await tokenStream.read();
        if (token === null) break;
        process.stdout.write(token);
    }
}

main().catch(console.error);

API Reference

Llama Class

The Llama class provides methods to interact with language models loaded through llama.cpp.

Public Methods

  • constructor(): Creates a new Llama instance.
  • initialize(modelPath: string, modelParams?: object, contextParams?: object): boolean: Initializes the model with the specified path and parameters.
  • setSystemPrompt(systemPrompt: string): void: Sets the system prompt for the conversation.
  • prompt(userMessage: string): TokenStream: Streams the response to the given prompt, returning a TokenStream object.
  • resetConversation(): void: Resets the conversation history (see the sketch below).
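
Neither example in this README calls resetConversation(), so here is a minimal sketch of clearing the history between two unrelated exchanges. It assumes llama has already been initialized as in Basic Usage and that the code runs inside an async function; the readAll helper is hypothetical, not part of the API:

// Hypothetical helper (not part of the API): drain a TokenStream into one string.
async function readAll(stream) {
    let text = "";
    while (true) {
        const token = await stream.read();
        if (token === null) break;
        text += token;
    }
    return text;
}

const answer = await readAll(llama.prompt("What is the capital of France?"));
const followUp = await readAll(llama.prompt("What's its population?")); // resolved via history

llama.resetConversation(); // drop both questions and answers from the history
// After the reset, "its" in a new prompt no longer refers to the earlier answer.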

TokenStream Class

The TokenStream class represents a stream of tokens generated by the language model.

Public Methods

  • read(): Promise<string | null>: Reads the next token from the stream. Returns null when the stream is finished.
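
read() is the only documented method, but it adapts naturally to JavaScript's async iteration protocol. A small wrapper sketch (the tokens generator below is hypothetical, not part of the library):

// Hypothetical adapter: expose a TokenStream as an async iterable.
async function* tokens(stream) {
    while (true) {
        const token = await stream.read();
        if (token === null) return; // stream finished
        yield token;
    }
}

// Usage, inside an async function:
// for await (const token of tokens(llama.prompt("Hello!"))) {
//     process.stdout.write(token);
// }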

Example

Here's a more comprehensive example demonstrating a multi-turn conversation. Note that the follow-up questions only make sense because the binding keeps the conversation history between prompt() calls:

const { Llama } = require('llama.cpp-ts');

async function main() {
    const llama = new Llama();
    const modelPath = __dirname + "/models/Meta-Llama-3.1-8B-Instruct-Q3_K_S.gguf";
    const modelParams = { nGpuLayers: 32 };   // number of layers to offload to the GPU
    const contextParams = { nContext: 2048 }; // context window size in tokens

    if (!llama.initialize(modelPath, modelParams, contextParams)) {
        console.error("Failed to initialize the model");
        return;
    }

    llama.setSystemPrompt("You are a helpful assistant. Always provide clear, concise, and accurate answers.");

    // The second and third questions rely on the conversation history.
    const questions = [
        "What is the capital of France?",
        "What's the population of that city?",
        "What country is the city in?"
    ];

    for (const question of questions) {
        const tokenStream = llama.prompt(question);

        console.log("Question:", question);
        console.log("Answer: ");

        // Stream each answer token by token; read() resolves to null at the end.
        while (true) {
            const token = await tokenStream.read();
            if (token === null) break;
            process.stdout.write(token);
        }

        console.log("\n");
    }
}

main().catch(console.error);