Ollama Server

Prerequisites:

  • Distrobox

  • Podman (or Docker) in rootless configuration

  • 10 GB or more of free storage in /home (depending on how many models you want to install)

Create the distrobox:

distrobox create \
  --name ollama-box \
  --image ubuntu:24.04

I use Ubuntu because it tends to avoid compatibility issues with my Radeon GPU, but you could potentially use the default distrobox image for your distro. If you pick a different distro for the box, substitute that distro's package manager commands in the steps below.
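As a sketch of that substitution (the image tag and package manager invocation here are assumptions, not something I've tested), a Fedora-based box would look like:

```shell
# Hypothetical Fedora-based equivalent of the Ubuntu setup below.
distrobox create --name ollama-box --image fedora:40
distrobox enter ollama-box
sudo dnf install -y pciutils lshw curl   # dnf in place of apt
```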

Enter the distrobox:

distrobox enter ollama-box

Install the utility packages:

sudo apt update && sudo apt upgrade -y
sudo apt install -y pciutils lshw curl
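The pciutils and lshw packages let you sanity-check that the GPU is visible inside the box before installing Ollama (the grep pattern here is just an illustration):

```shell
# List display adapters seen from inside the distrobox; your Radeon card
# should appear here if GPU passthrough is working.
lspci | grep -iE 'vga|3d|display'
```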

Install Ollama:

curl -fsSL https://ollama.com/install.sh | sh

Set up environment variables:

echo 'export OLLAMA_VULKAN=1' >> ~/.bashrc
echo 'export OLLAMA_CONTEXT_LENGTH=16384' >> ~/.bashrc
echo 'export OLLAMA_FLASH_ATTENTION=1' >> ~/.bashrc
echo 'export OLLAMA_KV_CACHE_TYPE=q8_0' >> ~/.bashrc
echo 'export OLLAMA_KEEP_ALIVE=10m' >> ~/.bashrc

The above are just the environment variables I've experimented with. You may need a lower or higher OLLAMA_CONTEXT_LENGTH, depending on your hardware. I needed the OLLAMA_VULKAN variable for Ollama to recognize my Radeon GPU. OLLAMA_KEEP_ALIVE tells Ollama how long to keep a model in memory. If you are swapping models a lot or only do short chat sessions, you can lower this value or keep the default of 5 minutes.
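As a quick sanity check (the scratch file path is purely illustrative), you can confirm that variables exported this way actually take effect when sourced:

```shell
# Write two of the variables to a scratch file, source it, and echo one back.
rc=/tmp/ollama-env-check.sh
cat > "$rc" <<'EOF'
export OLLAMA_VULKAN=1
export OLLAMA_CONTEXT_LENGTH=16384
EOF
. "$rc"
echo "context=${OLLAMA_CONTEXT_LENGTH}"
```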

Start serving Ollama:

ollama serve
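With the server running, a quick check from another terminal confirms it is listening (11434 is Ollama's default port):

```shell
# Should print "Ollama is running" if the server is up.
curl -s http://localhost:11434/
```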

With the server running, you can install models. Browse the list at https://ollama.com/search to decide which models you want, then pull them from a second terminal inside the distrobox (ollama serve occupies the first one):

ollama pull [model]
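For example (the model name here is just an illustration; pick whatever suits your hardware):

```shell
# Pull a model, confirm it landed, then start an interactive chat with it.
ollama pull llama3.2
ollama list
ollama run llama3.2
```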

The next steps happen on your host OS, not in the distrobox. To set up a systemd user service, create the file ~/.config/systemd/user/ollama.service with this content:

[Unit]
Description=Ollama AI Server (Distrobox)
After=default.target

[Service]
Environment="OLLAMA_VULKAN=1"
Environment="OLLAMA_CONTEXT_LENGTH=16384"
Environment="OLLAMA_FLASH_ATTENTION=1"
Environment="OLLAMA_KV_CACHE_TYPE=q8_0"
Environment="OLLAMA_KEEP_ALIVE=10m"
Environment="OLLAMA_HOST=0.0.0.0"
ExecStart=distrobox enter ollama-box -- ollama serve
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target

Then run this command to reload systemd:

systemctl --user daemon-reload

To start and stop the Ollama server:

systemctl --user start ollama
systemctl --user stop ollama

To auto-start the Ollama server:

systemctl --user enable ollama
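One caveat with user services: by default they only run while you are logged in. If you want the server available from boot without a login session, enabling lingering for your user is the usual fix (a standard systemd feature, not specific to Ollama):

```shell
# Allow this user's services to start at boot without an active login session.
loginctl enable-linger "$USER"
```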

Keep in mind that if you use the service, you'll need to update the Environment= lines in the unit file (the service does not read ~/.bashrc). If you aren't using systemd, update ~/.bashrc inside the distrobox instead.
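After editing the Environment= lines in the unit file, the changes only take effect once systemd rereads the unit and the service restarts:

```shell
systemctl --user daemon-reload
systemctl --user restart ollama
```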

From here you can set up tools that use Ollama, such as Open WebUI or coding integrations in your IDE.
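As one hedged example: Open WebUI can be pointed at this server via its OLLAMA_BASE_URL setting. A rootless Podman invocation might look like the following (the image tag, port mapping, and host alias are assumptions; host.containers.internal resolves to the host in recent Podman versions):

```shell
# Run Open WebUI in a container, pointed at the Ollama server on the host.
podman run -d --name open-webui -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://host.containers.internal:11434 \
  ghcr.io/open-webui/open-webui:main
```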

David D.

DavidDyess.com

Copyright © 1999 - 2025