llama-ggml.js
serve 4/5-bit quantized GGML LLMs based on Meta's LLaMA model over WebSocket with llama.cpp
use `npm i --save llama.native.js` to run llama.cpp models on your local machine. features a socket.io server and client that can run inference against the host of the model.
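
Below is a minimal client sketch of what talking to such a socket.io inference host could look like. The host/port and the event names (`prompt`, `token`, `end`) are illustrative assumptions, not the package's documented protocol; check the package docs for the actual API.

```js
// Hypothetical socket.io client for a llama.cpp inference host.
// Event names and payload shapes below are assumptions for illustration.
const { io } = require("socket.io-client");

const socket = io("http://localhost:3000"); // assumed default host/port

socket.on("connect", () => {
  // Send a prompt to the model host once connected (assumed event name).
  socket.emit("prompt", { text: "What is GGML quantization?" });
});

// Stream tokens to stdout as the host generates them (assumed event name).
socket.on("token", (token) => process.stdout.write(token));

// Disconnect once the host signals the completion is finished (assumed).
socket.on("end", () => {
  console.log();
  socket.disconnect();
});
```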