bizzle 6e13e69b41 docs: use real Gitea clone URL in README
Co-authored-by: Claude <claude-code@anthropic.com>
2026-06-21 20:57:18 -04:00

Open WebUI (CUDA) + Ollama

A Dockerized Open WebUI setup using the CUDA/GPU image, wired up to talk to an Ollama backend for local LLMs.

This runs Open WebUI in a container with NVIDIA GPU acceleration and serves the web interface on port 3012.


Prerequisites (Windows)

You'll need the following installed on your Windows machine:

  1. Git: https://git-scm.com/download/win
  2. Docker Desktop: https://www.docker.com/products/docker-desktop/
    • During/after install, make sure the WSL 2 backend is enabled (Docker Desktop → Settings → General → Use the WSL 2 based engine).
  3. NVIDIA GPU + drivers — this uses the CUDA image, so you need an NVIDIA GPU.
    • Install the latest NVIDIA Game Ready / Studio driver: https://www.nvidia.com/download/index.aspx
    • GPU passthrough to Docker works automatically through WSL 2 with a recent driver — you do not need to install the CUDA toolkit separately.
    • In Docker Desktop → Settings → Resources → WSL integration, make sure your WSL distro is enabled.
  4. Ollama: https://ollama.com/download/windows
    • This is what actually runs the models. Open WebUI is just the front end.

💡 No NVIDIA GPU? See Running without a GPU at the bottom.


1. Clone the repository

Open PowerShell (or Git Bash / Windows Terminal) and run:

git clone https://git.bizzle.lol/bizzle/OpenWebUI-Cuda.git
cd OpenWebUI-Cuda

2. Set up Ollama and pull your models

Open WebUI does not download models itself — Ollama does. So first make sure Ollama is running and has at least one model.

  1. Once installed, Ollama runs automatically in the background (check the system tray). You can confirm the install with:

    ollama --version
    
  2. Pull a model (this downloads it). For example:

    ollama pull llama3.2
    ollama pull qwen2.5
    
  3. List what you have available:

    ollama list
    

    Anything listed here will show up in Open WebUI's model dropdown once connected.


3. (Optional) Point Open WebUI at Ollama on another PC

This works out of the box if Ollama is running on the same Windows PC. The compose file is already set to:

- OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api

host.docker.internal lets the container reach Ollama running on your host machine — no changes needed for the common case.

Edit docker-compose.yml only if you want to use Ollama running on a different machine on your network. In that case, swap in that machine's IP, e.g.:

- OLLAMA_API_BASE_URL=http://192.168.1.50:11434/api

By default Ollama only listens on localhost. If Ollama is on a different machine, you must also set the OLLAMA_HOST=0.0.0.0 environment variable on that machine so it accepts connections from the network. (Not needed if it's on the same PC.)
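
For orientation, the settings this README refers to (the CUDA image, port 3012, the Ollama URL, the open-webui volume, and the GPU block) might fit together in docker-compose.yml roughly as shown here. This is a sketch reconstructed from the README, not a copy of the repository's file: the service name, container name, restart policy, and exact layout of the deploy block are assumptions, and /app/backend/data is the data directory used in Open WebUI's own Docker instructions. Treat the file in the repository as authoritative.

```yaml
# Sketch only: reconstructed from the settings described in this README.
services:
  open-webui:                           # assumed service name
    image: ghcr.io/open-webui/open-webui:cuda
    container_name: open-webui          # assumed
    restart: unless-stopped             # assumed
    ports:
      - "3012:8080"                     # host port 3012 -> container port 8080
    environment:
      # Ollama on the same PC: host.docker.internal points at the Windows host.
      - OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api
      # Ollama on another PC: use that machine's LAN address instead, e.g.
      # - OLLAMA_API_BASE_URL=http://192.168.1.50:11434/api
    volumes:
      - open-webui:/app/backend/data    # chats, settings, accounts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia            # assumed layout of the GPU block
              count: all
              capabilities: [gpu]

volumes:
  open-webui:
```

Only one OLLAMA_API_BASE_URL line should be active at a time; after changing it, re-run docker compose up -d so the container is recreated with the new value.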


4. Start it up

From inside the OpenWebUI-Cuda folder:

docker compose up -d

The first run will download the Open WebUI CUDA image (a few GB), so give it a few minutes.

Then open your browser to:

👉 http://localhost:3012

On first visit, create an admin account (the first account registered becomes the admin). Your models from ollama list should appear in the model selector at the top.


Everyday commands

| Action | Command |
|---|---|
| Start (in background) | docker compose up -d |
| Stop | docker compose down |
| Restart | docker compose restart |
| View logs | docker compose logs -f |
| Update to latest Open WebUI | docker compose pull, then docker compose up -d |

Your chats, settings, and accounts are stored in a Docker volume (open-webui) and survive restarts and updates.


Troubleshooting

No models show up in the dropdown

  • Run ollama list — do you actually have models pulled?
  • Double-check the OLLAMA_API_BASE_URL in docker-compose.yml (step 3).
  • Make sure Ollama is running (system tray icon, or ollama list works).
  • After editing the compose file, re-run docker compose up -d to apply changes.

docker compose up fails with a GPU / nvidia error

  • Update your NVIDIA driver (see prerequisites).
  • Make sure Docker Desktop is using the WSL 2 backend.
  • Test GPU access with: docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
  • If you don't have an NVIDIA GPU, see below.

Port 3012 already in use

  • Change the left side of the ports line in docker-compose.yml, e.g. "3013:8080", then use that port in your browser.
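
As a rough illustration of that change, the ports entry might end up looking like the fragment below. The service name open-webui is an assumption; keep whatever name docker-compose.yml already uses.

```yaml
services:
  open-webui:          # assumed service name
    ports:
      - "3013:8080"    # left: host port (changed), right: container port (leave as 8080)
```

With this mapping the interface would be served at http://localhost:3013 after the next docker compose up -d.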

Running without a GPU

The CUDA image requires an NVIDIA GPU. If you don't have one, switch to the standard image: in docker-compose.yml, change

image: ghcr.io/open-webui/open-webui:cuda

to

image: ghcr.io/open-webui/open-webui:main

and delete the entire deploy: block (the resources / nvidia GPU section). Ollama will still use the GPU itself if one is present — this only affects Open WebUI's own acceleration.
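
Put together, the CPU-only variant of the service might look roughly like this. It is a sketch under the same caveat as the rest of this README's config snippets: the service name is an assumption, and only the image line and the absence of the deploy: block are the point.

```yaml
services:
  open-webui:                                   # assumed service name
    image: ghcr.io/open-webui/open-webui:main   # was :cuda
    ports:
      - "3012:8080"
    environment:
      - OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api
    # volumes: unchanged, keep the existing entries
    # deploy: block removed entirely, so no GPU is requested
```

Apply the change with docker compose up -d; the existing open-webui volume is reused, so chats and accounts carry over.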