Ollama

Local & Open Source

Ollama is one of the simplest ways to run large language models locally. A single command (`ollama run llama3.3`) downloads a model and opens a chat session. It also exposes a REST API that many tools can use as their backend, including OpenCode, Cherry Studio, Continue.dev, and LangChain. For developers and researchers who want local AI as infrastructure rather than a standalone app, Ollama's lightweight server model and broad integration ecosystem make it a common default choice.

Key Features

One-command model download and execution: `ollama run <model>`

REST API for programmatic access from any language or tool

Model customization via Modelfile: adjust parameters, system prompts, and more

Broad ecosystem integration: OpenCode, Cherry Studio, Continue, LangChain, LlamaIndex

Cross-platform: macOS, Linux, and Windows, with GPU acceleration
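
As an illustration of the Modelfile feature listed above, here is a minimal sketch that renders a Modelfile from Python. It assumes the commonly used `FROM`, `PARAMETER`, and `SYSTEM` instructions; the helper name `render_modelfile`, the parameter values, and the system prompt are hypothetical and chosen only for the example.

```python
def render_modelfile(base_model, system_prompt, parameters):
    """Return Modelfile text for a customized model (illustrative helper)."""
    # FROM names the base model the custom model is built on.
    lines = [f"FROM {base_model}"]
    # Each PARAMETER line sets one runtime option, e.g. temperature.
    for name, value in parameters.items():
        lines.append(f"PARAMETER {name} {value}")
    # SYSTEM sets the system prompt; triple quotes allow multi-line text.
    lines.append(f'SYSTEM """{system_prompt}"""')
    return "\n".join(lines) + "\n"


modelfile_text = render_modelfile(
    "llama3.3",
    "You are a concise assistant.",
    {"temperature": 0.2, "num_ctx": 4096},  # example values, not recommendations
)
print(modelfile_text)
```

Saving the output to a file named `Modelfile` and running `ollama create concise-llama -f Modelfile` (the model name is an assumption) should register the customized model, which can then be started with `ollama run concise-llama`.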

Usage Guide

Setting Up Ollama

1. Download and install from ollama.com, or use `brew install ollama` on macOS.
2. Pull a model: `ollama pull llama3.3` or `ollama pull deepseek-r1:8b`.
3. Start chatting: `ollama run llama3.3` and type your prompts.
4. Access via API: send requests to `http://localhost:11434/api/generate`.
5. Connect tools: point OpenCode, Cherry Studio, or Continue to `http://localhost:11434` as the model provider.
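
The API step above can be sketched in Python using only the standard library. This is a minimal example, assuming the server is running at the default address, that `llama3.3` has already been pulled, and that a non-streaming reply carries the generated text in a `response` field. The function names `build_generate_request` and `generate` are hypothetical helpers, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address from the steps above


def build_generate_request(prompt, model="llama3.3", base_url=OLLAMA_URL):
    """Build (but do not send) a POST request for /api/generate."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.3", base_url=OLLAMA_URL, timeout=120):
    """Send the prompt to a running Ollama server and return the generated text."""
    request = build_generate_request(prompt, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.loads(reply.read().decode("utf-8"))
    return body["response"]
```

With the server running, calling `generate("Why is the sky blue?")` should return the model's answer as a string. Building the request separately from sending it keeps the payload easy to inspect without a live server.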