Modular AI tools.
Pick what you need.

Every feature runs locally on your hardware. Choose your model engine per tool. Swap providers without changing a line of code.

// available now

Ship-ready features.

These tools are production-ready and included in every install.

💬

Chat with Files

Upload PDFs, Word documents, spreadsheets, code files, and more. local-ai indexes your files using local embeddings, stores them in a vector database, and lets you ask natural-language questions against your data.

Supported Formats .pdf, .docx, .xlsx, .csv, .txt, .md, .py, .js, .ts, .json
Model Engines Ollama, LM Studio, vLLM, llama.cpp, any OpenAI-compatible API
Vector Store ChromaDB (default), Qdrant, Milvus, Weaviate
Embedding Models nomic-embed-text, all-minilm, bge-large, or any GGUF model
RAG Pipeline Multi-file Context Conversation History
Chat Session
# 3 files indexed (report.pdf, data.xlsx, notes.md)

you: What were the Q3 revenue numbers?

ai: Based on report.pdf (page 12), Q3 revenue was $4.2M, up 18% from Q2...
🔊

Text to Audio

Convert any text into natural-sounding speech using locally-running TTS models. Multiple voices, multiple languages, zero cloud dependency. Ideal for accessibility, content creation, and audio previews.

TTS Engines Coqui TTS, Bark, Piper, XTTS-v2, or any pluggable engine
Output Formats .wav, .mp3, .ogg
Languages English, Spanish, French, German, Chinese, Japanese, and 20+ more
Voice Cloning Supported via XTTS-v2 with 6-second reference audio
Real-time Streaming Batch Processing REST API
Generated Audio 00:12
Model: piper-en-amy · 22kHz
// roadmap

Coming soon.

These features are in active development. Star the repo to get notified when they ship.

In Progress
🖼️

Image Generation

Text-to-image using local Stable Diffusion, SDXL, or ComfyUI workflows. Generate, inpaint, and upscale — all on your GPU.

Stable Diffusion ComfyUI LoRA Support
In Progress
📝

Document Summarizer

Drop lengthy reports, research papers, or legal docs. Get structured summaries with key takeaways, organized by section.

Multi-doc Export PDF Custom Templates
Planned
🔍

Semantic Search

Index all your files and search by meaning. Surface connections across documents, find related content, and navigate your knowledge base naturally.

Vector Index Hybrid Search Filters
Planned
🔌

Plugin System

Community plugin marketplace. Build custom tools with the SDK and plug them into any local model. Share plugins via the registry.

SDK Registry Hooks API
// bring your own model

Works with every major engine.

local-ai doesn't lock you into one provider. Swap engines per feature without changing your workflow.

Ollama

Easiest setup. Pull models with one command. Great for getting started.

Full Support

LM Studio

GUI-based model manager with OpenAI-compatible server built in.

Full Support

vLLM

High-throughput serving with PagedAttention. Best for multi-user setups.

Full Support

llama.cpp

Lightweight C++ inference. Runs on CPU, Apple Silicon, and CUDA.

Full Support

OpenAI-Compatible

Any server exposing the /v1/chat/completions endpoint works out of the box.

Full Support

Custom Engine

Write a thin adapter using our Engine SDK. Connect any inference backend.

Coming Soon

Ready to run AI locally?

Get up and running in under 2 minutes.

Install local-ai ⭐ Star on GitHub