# Ollama

Get up and running with large language models.
### macOS

[Download](https://ollama.com/download)

### Windows

[Download](https://ollama.com/download)

### Linux

```shell
curl -fsSL https://ollama.com/install.sh | sh
```
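After the install script completes, you can verify that the CLI is on your path:

```shell
ollama --version
```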
### Docker

The official [Ollama Docker image](https://hub.docker.com/r/ollama/ollama) `ollama/ollama` is available on Docker Hub.
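A minimal CPU-only invocation, following the image's documented defaults (the volume and container names are conventional; 11434 is Ollama's default port):

```shell
# Start the Ollama server, persisting downloaded models in a named volume
# and exposing the API on the default port 11434
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Run a model inside the running container
docker exec -it ollama ollama run llama3.2
```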
### Libraries

- [ollama-python](https://github.com/ollama/ollama-python)
- [ollama-js](https://github.com/ollama/ollama-js)

### Community

- [Discord](https://discord.gg/ollama)
- [Reddit](https://www.reddit.com/r/ollama)
## Quickstart

To run and chat with Llama 3.2:

```shell
ollama run llama3.2
```
## Model library

Ollama supports a list of models available on [ollama.com/library](https://ollama.com/library).

Here are some example models that can be downloaded:
| Model              | Parameters | Size  | Download                         |
| ------------------ | ---------- | ----- | -------------------------------- |
| DeepSeek-R1        | 7B         | 4.7GB | `ollama run deepseek-r1`         |
| DeepSeek-R1        | 671B       | 404GB | `ollama run deepseek-r1:671b`    |
| Llama 3.3          | 70B        | 43GB  | `ollama run llama3.3`            |
| Llama 3.2          | 3B         | 2.0GB | `ollama run llama3.2`            |
| Llama 3.2          | 1B         | 1.3GB | `ollama run llama3.2:1b`         |
| Llama 3.2 Vision   | 11B        | 7.9GB | `ollama run llama3.2-vision`     |
| Llama 3.2 Vision   | 90B        | 55GB  | `ollama run llama3.2-vision:90b` |
| Llama 3.1          | 8B         | 4.7GB | `ollama run llama3.1`            |
| Llama 3.1          | 405B       | 231GB | `ollama run llama3.1:405b`       |
| Phi 4              | 14B        | 9.1GB | `ollama run phi4`                |
| Phi 3 Mini         | 3.8B       | 2.3GB | `ollama run phi3`                |
| Gemma 2            | 2B         | 1.6GB | `ollama run gemma2:2b`           |
| Gemma 2            | 9B         | 5.5GB | `ollama run gemma2`              |
| Gemma 2            | 27B        | 16GB  | `ollama run gemma2:27b`          |
| Mistral            | 7B         | 4.1GB | `ollama run mistral`             |
| Moondream 2        | 1.4B       | 829MB | `ollama run moondream`           |
| Neural Chat        | 7B         | 4.1GB | `ollama run neural-chat`         |
| Starling           | 7B         | 4.1GB | `ollama run starling-lm`         |
| Code Llama         | 7B         | 3.8GB | `ollama run codellama`           |
| Llama 2 Uncensored | 7B         | 3.8GB | `ollama run llama2-uncensored`   |
| LLaVA              | 7B         | 4.5GB | `ollama run llava`               |
| Solar              | 10.7B      | 6.1GB | `ollama run solar`               |
> [!NOTE]
> You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
## Customize a model

### Import from GGUF

Ollama supports importing GGUF models in the Modelfile:

1. Create a file named `Modelfile`, with a `FROM` instruction with the local filepath to the model you want to import.

   ```
   FROM ./vicuna-33b.Q4_0.gguf
   ```

2. Create the model in Ollama:

   ```shell
   ollama create example -f Modelfile
   ```

3. Run the model:

   ```shell
   ollama run example
   ```
### Import from Safetensors

See the guide on importing models for more information.
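The flow mirrors the GGUF example above: point the `Modelfile`'s `FROM` at a local directory of Safetensors weights (the path below is illustrative) and create the model:

```
FROM /path/to/safetensors/directory
```

```shell
ollama create example -f Modelfile
```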
## Customize a prompt

Models from the Ollama library can be customized with a prompt. For example, to customize the `llama3.2` model:

```shell
ollama pull llama3.2
```

Create a `Modelfile`:

```
FROM llama3.2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the model:

```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```

For more information on working with a Modelfile, see the Modelfile documentation.
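Beyond `temperature` and `SYSTEM`, a `Modelfile` accepts other `PARAMETER` settings. A short sketch using documented parameter names with illustrative values:

```
FROM llama3.2

# expand the context window to 4096 tokens
PARAMETER num_ctx 4096

# sample more conservatively than the Mario example above
PARAMETER temperature 0.7
PARAMETER top_p 0.9
```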
## CLI Reference

### Create a model

`ollama create` is used to create a model from a Modelfile.

```shell
ollama create mymodel -f ./Modelfile
```
### Pull a model

```shell
ollama pull llama3.2
```

This command can also be used to update a local model. Only the diff will be pulled.
### Remove a model

```shell
ollama rm llama3.2
```
### Copy a model

```shell
ollama cp llama3.2 my-model
```
### Multiline input

For multiline input, you can wrap text with `"""`:

```
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
```
### Multimodal models

```shell
ollama run llava "What's in this image? /Users/jmorgan/Desktop/smile.png"
```

Output: The image features a yellow smiley face, which is likely the central focus of the picture.
### Pass the prompt as an argument

```shell
ollama run llama3.2 "Summarize this file: $(cat README.md)"
```

Output: Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
### Show model information

```shell
ollama show llama3.2
```
### List models on your computer

```shell
ollama list
```
### List which models are currently loaded

```shell
ollama ps
```
### Stop a model which is currently running

```shell
ollama stop llama3.2
```
### Start Ollama

`ollama serve` is used when you want to start Ollama without running the desktop application.
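Once the server is running, start a model from a separate shell, or query the REST API directly (Ollama listens on `localhost:11434` by default):

```shell
# Generate a completion via the REST API
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?"
}'
```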
## Building

See the developer guide.