====== Ollama: Local Large Language Model Execution ======
Ollama is a tool that allows users to run large language models (LLMs) locally on their machines without relying on cloud services. This ensures greater privacy, data control, and offline usage capabilities.
===== Key Features =====
* **Local Model Execution:** Install and run AI models such as Llama 3.3, DeepSeek-R1, Phi-4, Mistral, and Gemma 2 directly on your device.
* **Cross-Platform Compatibility:** Available for macOS, Linux, and Windows, making it accessible across multiple environments.
* **Command Line Interface (CLI):** Operates through the terminal or command prompt, offering efficient interaction with installed models.
* **Privacy and Data Control:** Since the tool runs locally, your data is not sent to external servers, ensuring enhanced security and privacy.
===== Installation and Basic Usage =====
1. **Download and Install:**
- Visit the [Ollama official website](https://ollama.com/) and download the appropriate version for your operating system.
- Follow the installer instructions to complete the setup.
2. **Using the Terminal:**
- After installation, open your system's terminal or command prompt.
- Run models using simple commands. For example, to run the Mistral model, use:
  ollama run mistral
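On the first run, `ollama run` downloads the model weights automatically before opening an interactive chat; type `/bye` to exit. A session might look like this (the model's reply is illustrative only):
  ollama run mistral
  >>> Why is the sky blue?
  Shorter blue wavelengths of sunlight scatter off air molecules far more than
  longer red wavelengths, so the sky appears blue.
  >>> /bye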
===== Supported Models =====
Ollama supports several popular large language models, including but not limited to:
* **Llama** (all versions)
* **DeepSeek-R1**
* **Phi-4**
* **Mistral**
* **Gemma 2**
===== Advantages of Ollama =====
* **Offline Functionality:** No internet connection is needed once models are installed.
* **Data Security:** Data remains on the local device, eliminating the risk of data breaches from cloud services.
* **High Performance:** Running models locally can offer faster responses depending on system specifications.
===== Model library =====
Ollama supports a library of models, available at [ollama.com/library](https://ollama.com/library).
Here are some example models that can be downloaded:
^ Model ^ Parameters ^ Size ^ Download Command ^
| DeepSeek-R1 | 7B | 4.7GB | `ollama run deepseek-r1` |
| DeepSeek-R1 | 671B | 404GB | `ollama run deepseek-r1:671b` |
| Llama 3.3 | 70B | 43GB | `ollama run llama3.3` |
| Llama 3.2 | 3B | 2.0GB | `ollama run llama3.2` |
| Llama 3.2 | 1B | 1.3GB | `ollama run llama3.2:1b` |
| Llama 3.2 Vision | 11B | 7.9GB | `ollama run llama3.2-vision` |
| Llama 3.2 Vision | 90B | 55GB | `ollama run llama3.2-vision:90b` |
| Llama 3.1 | 8B | 4.7GB | `ollama run llama3.1` |
| Llama 3.1 | 405B | 231GB | `ollama run llama3.1:405b` |
| Phi 4 | 14B | 9.1GB | `ollama run phi4` |
| Phi 3 Mini | 3.8B | 2.3GB | `ollama run phi3` |
| Gemma 2 | 2B | 1.6GB | `ollama run gemma2:2b` |
| Gemma 2 | 9B | 5.5GB | `ollama run gemma2` |
| Gemma 2 | 27B | 16GB | `ollama run gemma2:27b` |
| Mistral | 7B | 4.1GB | `ollama run mistral` |
| Moondream 2 | 1.4B | 829MB | `ollama run moondream` |
| Neural Chat | 7B | 4.1GB | `ollama run neural-chat` |
| Starling | 7B | 4.1GB | `ollama run starling-lm` |
| Code Llama | 7B | 3.8GB | `ollama run codellama` |
| Llama 2 Uncensored | 7B | 3.8GB | `ollama run llama2-uncensored` |
| LLaVA | 7B | 4.5GB | `ollama run llava` |
| Solar | 10.7B | 6.1GB | `ollama run solar` |
==== Note ====
You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
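To download a model without immediately opening a chat session, `ollama pull` accepts the same tags as the run commands above; `ollama list` then confirms what is installed and how much disk space each model uses. For example, fetching the smallest Llama 3.2 variant:
  ollama pull llama3.2:1b
  ollama list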
===== CLI Reference =====
==== Create a model ====
`ollama create` is used to create a model from a Modelfile.
  ollama create mymodel -f ./Modelfile
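A Modelfile specifies a base model plus optional overrides. Below is a minimal sketch; it assumes `llama3.2` has already been pulled, and the parameter value and system prompt are illustrative:
  # Modelfile: customize llama3.2
  FROM llama3.2
  # Sampling temperature: higher values give more varied output
  PARAMETER temperature 0.7
  # System prompt prepended to every conversation
  SYSTEM "You are a concise technical assistant."
Once created with `ollama create mymodel -f ./Modelfile`, the custom model runs like any other via `ollama run mymodel`.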
==== Pull a model ====
  ollama pull llama3.2
This command can also be used to update a local model; only the diff will be pulled.
==== Remove a model ====
  ollama rm llama3.2
==== Copy a model ====
  ollama cp llama3.2 my-model
==== Multiline input ====
For multiline input, you can wrap text with `"""`:
  >>> """Hello,
  ... world!
  ... """
  I'm a basic program that prints the famous "Hello, world!" message to the console.
==== Multimodal models ====
  ollama run llava "What's in this image? /Users/jmorgan/Desktop/smile.png"
Output: The image features a yellow smiley face, which is likely the central focus of the picture.
==== Pass the prompt as an argument ====
  ollama run llama3.2 "Summarize this file: $(cat README.md)"
==== Show model information ====
  ollama show llama3.2
==== List models on your computer ====
  ollama list
==== List which models are currently loaded ====
  ollama ps
==== Stop a model which is currently running ====
  ollama stop llama3.2
==== Start Ollama ====
`ollama serve` is used when you want to start Ollama without running the desktop application.
  ollama serve
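With the server running (it listens on http://localhost:11434 by default), other programs can call its REST API. Here is a minimal sketch against the `/api/generate` endpoint; the model name and prompt are placeholders, and the named model must already be pulled:
  curl http://localhost:11434/api/generate -d '{
    "model": "llama3.2",
    "prompt": "Why is the sky blue?"
  }'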
===== How to Run Ollama and Connect to the Service API Through Internal Network or Internet =====
==== Setting Environment Variables on Linux ====
If Ollama runs as a systemd service, environment variables should be set with `systemctl`:
1. **Edit the Ollama service file:** Open the service configuration in an editor with:
  sudo systemctl edit ollama.service
2. **Add the environment variable:** In the editor, add the following lines under the `[Service]` section:
  [Service]
  Environment="OLLAMA_HOST=0.0.0.0"
**Note #1:** Sometimes 0.0.0.0 does not work due to your environment setup; instead, try the machine's local IP address (e.g. 10.0.0.x) or hostname (e.g. xxx.local).
**Note #2:** Put these lines above the `### Lines below this comment will be discarded` marker, so the override file looks something like this:
  ### Editing /etc/systemd/system/ollama.service.d/override.conf
  ### Anything between here and the comment below will become the new contents of the file
  [Service]
  Environment="OLLAMA_HOST=0.0.0.0"
  ### Lines below this comment will be discarded
  ### /etc/systemd/system/ollama.service
  # [Unit]
  # Description=Ollama Service
  # After=network-online.target
  #
  # [Service]
  # ExecStart=/usr/local/bin/ollama serve
  # User=ollama
  # Group=ollama
  # Restart=always
  # RestartSec=3
  # Environment="PATH=/home/kimi/.nvm/versions/node/v20.5.0/bin:/home/kimi/.local/share/pnpm:/usr/local/sbin:/usr/local/bin:/usr/s>
  #
  # [Install]
  # WantedBy=default.target
3. **Restart the service:** After editing the file, reload the systemd daemon and restart the Ollama service:
  sudo systemctl daemon-reload
  sudo systemctl restart ollama
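To confirm the API is reachable over the network, query it from another machine. This sketch assumes the server's address is 10.0.0.x (substitute your own) and uses Ollama's `/api/tags` endpoint, which lists the models installed on the server:
  # From another machine on the network (default port 11434)
  curl http://10.0.0.x:11434/api/tags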
===== Learn More =====
For more detailed information and tutorials, visit [Ollama's official website](https://ollama.com/) or check out this [video overview](https://www.youtube.com/watch?v=wxyDEqR4KxM).