How to Deploy Ollama + Open WebUI on an Ubuntu 24.04 VPS with Docker Compose

Introduction

Large Language Models (LLMs) have revolutionized how we interact with artificial intelligence, but accessing them often requires expensive cloud API calls or complex setups. Ollama solves this problem by providing a simple way to run LLMs locally, while Open WebUI offers an intuitive web interface similar to ChatGPT.

In this comprehensive tutorial, you’ll learn how to deploy both Ollama and Open WebUI on an Ubuntu 24.04 VPS using Docker Compose. This setup allows you to run powerful language models privately on your own infrastructure, giving you complete control over your data and conversations.

What you’ll accomplish:

  • Set up a complete local LLM environment
  • Deploy Ollama for model management and inference
  • Configure Open WebUI for an intuitive chat interface
  • Implement security best practices
  • Optimize performance for your VPS resources

Prerequisites

Before beginning this tutorial, ensure you have:

System Requirements:

  • Ubuntu 24.04 LTS VPS with minimum 8GB RAM (16GB+ recommended for larger models)
  • At least 4 CPU cores for optimal performance
  • 50GB+ storage space (models can be 4-70GB each)
  • Root or sudo access to the server

Software Requirements:

  • Docker Engine 24.0+ and Docker Compose v2
  • Basic familiarity with command line operations
  • SSH access to your VPS

Note: High-performance VPS instances with modern processors like AMD EPYC Milan provide significantly better inference speeds for language models.
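You can verify these requirements before installing anything with a quick pre-flight check (a sketch; the thresholds mirror the minimums listed above):

```shell
# Compare the VPS's resources against the recommended minimums
cpus=$(nproc)
mem_gb=$(free -g | awk '/^Mem:/ {print $2}')
disk_gb=$(df -BG --output=avail / | tail -1 | tr -dc '0-9')
echo "CPU cores: $cpus | RAM: ${mem_gb}GB | free disk: ${disk_gb}GB"
[ "$cpus" -ge 4 ]     || echo "Warning: fewer than 4 CPU cores"
[ "$mem_gb" -ge 7 ]   || echo "Warning: less than 8GB RAM"   # free -g rounds down
[ "$disk_gb" -ge 50 ] || echo "Warning: less than 50GB free disk"
```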

Step-by-Step Tutorial

Step 1: Update Your Ubuntu System

First, connect to your VPS via SSH and update the system packages:

sudo apt update && sudo apt upgrade -y
sudo apt install curl wget git -y

Step 2: Install Docker and Docker Compose

Install Docker using the official installation script:

# Download and install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Add your user to the docker group
sudo usermod -aG docker $USER

# Start and enable Docker service
sudo systemctl start docker
sudo systemctl enable docker

Log out and back in for group changes to take effect, then verify the installation:

docker --version
docker compose version

Step 3: Create Project Directory and Configuration

Create a dedicated directory for your Ollama deployment:

mkdir ~/ollama-webui
cd ~/ollama-webui

Step 4: Create Docker Compose Configuration

Create a docker-compose.yml file with the following configuration:

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      # Publish on localhost only; Open WebUI reaches Ollama over the internal Docker network
      - "127.0.0.1:11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    environment:
      - OLLAMA_HOST=0.0.0.0:11434
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          memory: 2G

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=your-secret-key-here
    volumes:
      - open_webui_data:/app/backend/data
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama_data:
  open_webui_data:

Security Note: Replace your-secret-key-here with a strong, randomly generated secret key:

openssl rand -base64 32
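Alternatively, you can keep the key out of the compose file entirely. Docker Compose automatically loads a .env file from the project directory, so you can write WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY} in docker-compose.yml and store the actual value separately (a sketch):

```shell
# Generate the key once and store it in .env next to docker-compose.yml;
# reference it in the compose file as WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}
echo "WEBUI_SECRET_KEY=$(openssl rand -base64 32)" > .env
chmod 600 .env   # readable by your user only
```

This keeps the secret out of version control if you later commit the compose file.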

Step 5: Deploy the Services

Launch both services using Docker Compose:

docker compose up -d

Verify both containers are running:

docker compose ps

Monitor the logs to ensure everything starts correctly:

docker compose logs -f
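Once the containers report as running, you can smoke-test both services from the VPS itself (ports as published in docker-compose.yml above):

```shell
# Ollama's version endpoint and Open WebUI's front page should both respond
curl -sf http://localhost:11434/api/version || echo "Ollama not reachable yet"
curl -sf -o /dev/null http://localhost:3000 && echo "Open WebUI is up" || echo "Open WebUI not reachable yet"
```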

Step 6: Download and Configure Language Models

Access the Ollama container to download your first model:

# Download a lightweight model for testing
docker exec -it ollama ollama pull llama3.2:3b

# For more powerful models (requires more RAM):
# docker exec -it ollama ollama pull llama3.1:8b
# docker exec -it ollama ollama pull codellama:7b

List available models:

docker exec -it ollama ollama list
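Beyond the web interface, Ollama exposes a REST API on port 11434 that is handy for scripting. Run this from the VPS; "stream": false returns one JSON object instead of a token stream:

```shell
# Ask the model a question over Ollama's /api/generate endpoint
payload='{"model": "llama3.2:3b", "prompt": "Why is the sky blue?", "stream": false}'
curl -s http://localhost:11434/api/generate -d "$payload" || echo "Is the ollama container running?"
```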

Step 7: Configure Firewall (Security Best Practice)

Set up UFW firewall to secure your deployment:

# Enable UFW
sudo ufw enable

# Allow SSH (adjust port if needed)
sudo ufw allow 22/tcp

# Allow Open WebUI access
sudo ufw allow 3000/tcp

# Note: ports published by Docker bypass UFW, because Docker manages its
# own iptables rules, so "ufw deny" alone will not block a published
# container port. To keep Ollama private, publish it on localhost only
# in docker-compose.yml: - "127.0.0.1:11434:11434"
sudo ufw deny 11434/tcp

# Check firewall status
sudo ufw status

Step 8: Access Open WebUI

Open your web browser and navigate to http://your-vps-ip:3000. You’ll be prompted to create an admin account on first access.

Once logged in, you can:

  • Start conversations with downloaded models
  • Download additional models through the web interface
  • Customize model parameters and system prompts
  • Create and manage different chat sessions

Best Practices

Performance Optimization

Resource Management:

  • Monitor RAM usage with docker stats – smaller models like llama3.2:3b use ~4GB, while larger models can require 16GB+
  • Use SSD storage for faster model loading and inference
  • Consider CPU-optimized VPS instances for better performance

Model Selection Strategy:

  • Start with smaller models (3B-7B parameters) to test your setup
  • Gradually scale to larger models based on your VPS resources
  • Use specialized models like CodeLlama for programming tasks

Security Considerations

Essential Security Measures:

  • Never expose Ollama port (11434) to the internet – it lacks authentication
  • Use strong passwords for Open WebUI admin accounts
  • Regularly update container images: docker compose pull && docker compose up -d
  • Consider setting up SSL/TLS with a reverse proxy like Nginx for production use
  • Implement regular backups of your conversation data
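For the reverse-proxy suggestion above, a minimal Nginx server block might look like this (a sketch: chat.example.com and the Let's Encrypt certificate paths are placeholders, and the WebSocket headers matter because Open WebUI streams responses over WebSockets):

```nginx
server {
    listen 443 ssl;
    server_name chat.example.com;                      # placeholder domain

    ssl_certificate     /etc/letsencrypt/live/chat.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/chat.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        # Required for streaming responses over WebSockets
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```

With a proxy in place you can close port 3000 in UFW and allow only 22 and 443.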

Maintenance and Monitoring

Monitor Docker's disk usage and prune unused data to prevent disk-space issues:

# View current log sizes
docker system df

# Clean up unused containers and images
docker system prune -f
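Docker's default json-file logging also grows without bound; you can cap it globally in /etc/docker/daemon.json (a sketch; restart Docker afterwards with sudo systemctl restart docker):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```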

Create a backup script for your data. Note that models and chat history live in the named Docker volumes, not just the project directory:

#!/bin/bash
# Stop the stack, archive the project dir and named volumes, restart.
# Compose prefixes volume names with the project directory ("ollama-webui_").
cd ~/ollama-webui || exit 1
docker compose down
sudo tar -czf ~/ollama-backup-$(date +%Y%m%d).tar.gz ~/ollama-webui /var/lib/docker/volumes/ollama-webui_{ollama_data,open_webui_data}
docker compose up -d
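To run the backup automatically, save the script as something like ~/ollama-webui/backup.sh, make it executable with chmod +x, and schedule it from root's crontab (sudo crontab -e, since the script needs root to archive the volume directories). An example entry for a nightly run at 03:00 (adjust the path to your user):

```
0 3 * * * /home/youruser/ollama-webui/backup.sh >> /var/tmp/ollama-backup.log 2>&1
```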

Conclusion

You’ve successfully deployed a complete local LLM environment with Ollama and Open WebUI on Ubuntu 24.04. This setup provides you with:

  • Complete privacy and control over your AI conversations
  • No per-token costs or API rate limits
  • The ability to customize and fine-tune models for your specific needs
  • A scalable foundation for building AI-powered applications

Your deployment supports multiple user accounts and a range of language models, with throughput and maximum model size determined by your VPS specifications. As your needs grow, you can scale by upgrading your server resources or deploying additional instances.

For optimal performance and reliability, consider high-performance VPS solutions with modern processors and NVMe storage. Modern cloud infrastructure ensures your LLM deployment runs smoothly and responds quickly to user interactions.

Ready to explore more advanced deployments? Consider investigating GPU-accelerated instances for even faster inference speeds, or setting up load-balanced multiple instances for enterprise workloads.

Share your love