docs: add Windows setup README and default Ollama URL to localhost

Add README with Windows clone/run instructions and Ollama model setup. Default OLLAMA_API_BASE_URL to host.docker.internal so it works out of the box on the same PC, with comments on pointing it at a remote PC.

Co-authored-by: Claude <claude-code@anthropic.com>
2026-06-21 20:53:16 -04:00 · commit 9518fc1052
3 changed files with 190 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,8 @@
 # Local environment / secrets
 .env
 .env.*
 !.env.example
 # OS / editor cruft
 .DS_Store
 Thumbs.db
--- a/README.md
+++ b/README.md
@@ -0,0 +1,152 @@
 # Open WebUI (CUDA) + Ollama
 A Dockerized [Open WebUI](https://github.com/open-webui/open-webui) setup using the **CUDA/GPU** image, wired up to talk to an [Ollama](https://ollama.com) backend for local LLMs.
 This runs Open WebUI in a container with NVIDIA GPU acceleration and serves the web interface on **port `3012`**.
 ---
 ## Prerequisites (Windows)
 You'll need the following installed on your Windows machine:
 1. **Git** — https://git-scm.com/download/win
 2. **Docker Desktop** — https://www.docker.com/products/docker-desktop/
   - During/after install, make sure the **WSL 2 backend** is enabled (Docker Desktop → Settings → General → *Use the WSL 2 based engine*).
 3. **NVIDIA GPU + drivers** — this uses the CUDA image, so you need an NVIDIA GPU.
   - Install the latest **NVIDIA Game Ready / Studio driver**: https://www.nvidia.com/download/index.aspx
   - GPU passthrough to Docker works automatically through WSL 2 with a recent driver — you do **not** need to install the CUDA toolkit separately.
   - In Docker Desktop → Settings → Resources → make sure your WSL distro is enabled.
 4. **Ollama** — https://ollama.com/download/windows
   - This is what actually runs the models. Open WebUI is just the front end.
 > 💡 No NVIDIA GPU? See [Running without a GPU](#running-without-a-gpu) at the bottom.
 ---
 ## 1. Clone the repository
 Open **PowerShell** (or Git Bash / Windows Terminal) and run:
 ```powershell
 git clone https://your-gitea-server/your-username/OpenWebUI-Cuda.git
 cd OpenWebUI-Cuda
 ```
 > Replace the URL above with the actual clone URL shown on the Gitea repo page (the green **Clone** button). If the repo is private, Gitea will prompt for your username and password / access token.
 ---
 ## 2. Set up Ollama and pull your models
 Open WebUI does **not** download models itself — Ollama does. So first make sure Ollama is running and has at least one model.
 1. Once installed, Ollama runs automatically in the background (check the system tray). You can confirm the install with:
   ```powershell
   ollama --version
   ```
 2. Pull a model (this downloads it). For example:
   ```powershell
   ollama pull llama3.2
   ollama pull qwen2.5
   ```
 3. List what you have available:
   ```powershell
   ollama list
   ```
   Anything listed here will show up in Open WebUI's model dropdown once connected.
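The same list is available over Ollama's HTTP API. As a minimal sketch in Python (not part of this repo, and Python is not otherwise required for this setup), the following queries the `/api/tags` endpoint on the default port `11434`. The function names are hypothetical helpers, and the sketch assumes the response carries a `models` array whose entries have a `name` field.

```python
import json
import urllib.error
import urllib.request


def parse_model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_ollama_models(base_url: str = "http://localhost:11434",
                       timeout: float = 3.0) -> list[str]:
    """Return the names of pulled models, or [] if Ollama cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return []
```

Calling `list_ollama_models()` on the PC where Ollama runs should return the same names that `ollama list` prints. An empty list means either that no models are pulled or that Ollama is not reachable.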
 ---
 ## 3. (Optional) Point Open WebUI at Ollama on another PC
 **This works out of the box** if Ollama is running on the *same* Windows PC. The compose file is already set to:
 ```yaml
 - OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api
 ```
 `host.docker.internal` lets the container reach Ollama running on your host machine — no changes needed for the common case.
 **Only** if you want to use Ollama running on a **different machine on your network**, edit `docker-compose.yml` and swap in that machine's IP, e.g.:
 ```yaml
 - OLLAMA_API_BASE_URL=http://192.168.1.50:11434/api
 ```
 > By default Ollama only listens on `localhost`. If Ollama is on a **different** machine, you must also set the `OLLAMA_HOST=0.0.0.0` environment variable on that machine so it accepts connections from the network. (Not needed if it's on the same PC.)
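Before editing the compose file for a remote machine, it can help to confirm that the remote Ollama actually accepts LAN connections. The Python sketch below is an illustration only: `ollama_api_base_url` and `is_ollama_reachable` are hypothetical helpers, `192.168.1.50` is the example IP from above, and the check assumes Ollama answers `GET /` with the text `Ollama is running`.

```python
import ipaddress
import urllib.error
import urllib.request


def ollama_api_base_url(host: str = "host.docker.internal", port: int = 11434) -> str:
    """Build the OLLAMA_API_BASE_URL value in the form used by this README."""
    try:
        # Wrap IPv6 literals in brackets so the port separator stays unambiguous.
        if ipaddress.ip_address(host).version == 6:
            host = f"[{host}]"
    except ValueError:
        pass  # Not an IP literal (for example a hostname); use it as given.
    return f"http://{host}:{port}/api"


def is_ollama_reachable(host: str, port: int = 11434, timeout: float = 3.0) -> bool:
    """Return True if GET http://host:port/ answers with Ollama's status text."""
    root_url = ollama_api_base_url(host, port).removesuffix("/api") + "/"
    try:
        with urllib.request.urlopen(root_url, timeout=timeout) as resp:
            return b"Ollama is running" in resp.read()
    except (urllib.error.URLError, OSError):
        return False
```

Run from the PC that hosts Docker, `is_ollama_reachable("192.168.1.50")` returning `False` suggests that `OLLAMA_HOST=0.0.0.0` is not set on the remote machine or that a firewall is blocking port `11434`. If it returns `True`, the string from `ollama_api_base_url("192.168.1.50")` is the value to put in `docker-compose.yml`.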
 ---
 ## 4. Start it up
 From inside the `OpenWebUI-Cuda` folder:
 ```powershell
 docker compose up -d
 ```
 The first run will download the Open WebUI CUDA image (a few GB), so give it a few minutes.
 Then open your browser to:
 ### 👉 http://localhost:3012
 On first visit, create an admin account (the first account registered becomes the admin). Your models from `ollama list` should appear in the model selector at the top.
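Because the first start can take a while, the page may not load immediately. As a rough sketch (a hypothetical helper, not something this repo ships), the Python function below polls a URL until the server answers. The attempt count and delay are arbitrary defaults chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, attempts: int = 30, delay: float = 2.0,
                  timeout: float = 3.0) -> bool:
    """Poll `url` until it returns any HTTP response, or give up after `attempts`."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return True
        except urllib.error.HTTPError:
            # The server answered, even though the status was an error code.
            return True
        except (urllib.error.URLError, OSError):
            # Nothing is listening yet; wait before the next attempt.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

For this setup the call would be `wait_for_http("http://localhost:3012")`. A `False` result after the full wait is a hint to look at `docker compose logs -f`.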
 ---
 ## Everyday commands
 | Action | Command |
 |---|---|
 | Start (in background) | `docker compose up -d` |
 | Stop | `docker compose down` |
 | Restart | `docker compose restart` |
 | View logs | `docker compose logs -f` |
 | Update to latest Open WebUI | `docker compose pull` then `docker compose up -d` |
 Your chats, settings, and accounts are stored in a Docker volume (`open-webui`) and survive restarts and updates.
 ---
 ## Troubleshooting
 **No models show up in the dropdown**
 - Run `ollama list` — do you actually have models pulled?
 - Double-check the `OLLAMA_API_BASE_URL` in `docker-compose.yml` ([step 3](#3-optional-point-open-webui-at-ollama-on-another-pc)).
 - Make sure Ollama is running (system tray icon, or `ollama list` works).
 - After editing the compose file, re-run `docker compose up -d` to apply changes.
 **`docker compose up` fails with a GPU / nvidia error**
 - Update your NVIDIA driver (see prerequisites).
 - Make sure Docker Desktop is using the WSL 2 backend.
 - Test GPU access with: `docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi`
 - If you don't have an NVIDIA GPU, see below.
 **Port 3012 already in use**
 - Change the left side of the ports line in `docker-compose.yml`, e.g. `"3013:8080"`, then use that port in your browser.
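To see whether a host port is already taken before changing the mapping, a small check like the following can help. It is a sketch with hypothetical helper names, and it only detects services that accept TCP connections on `127.0.0.1`; the candidate range `3012` to `3030` is an arbitrary choice.

```python
import socket
from typing import Optional


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def first_free_port(start: int = 3012, end: int = 3030) -> Optional[int]:
    """Return the first port in [start, end] with no listener, or None."""
    for port in range(start, end + 1):
        if not port_in_use(port):
            return port
    return None
```

If `port_in_use(3012)` is `True` while the container is stopped, another program owns the port, and the number returned by `first_free_port()` can go on the left side of the `ports` mapping.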
 ---
 ## Running without a GPU
 The CUDA image requires an NVIDIA GPU. If you don't have one, switch to the standard image: in `docker-compose.yml`, change
 ```yaml
 image: ghcr.io/open-webui/open-webui:cuda
 ```
 to
 ```yaml
 image: ghcr.io/open-webui/open-webui:main
 ```
 and delete the entire `deploy:` block (the `resources` / `nvidia` GPU section). Ollama will still use the GPU itself if one is present — this only affects Open WebUI's own acceleration.
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -0,0 +1,30 @@
 services:
    open-webui2:
        image: ghcr.io/open-webui/open-webui:cuda
        container_name: open-webui2
        ports:
            - "3012:8080"
        volumes:
            - open-webui:/app/backend/data
        environment:
            # Defaults to Ollama running on this same PC (the Docker host).
            # To use Ollama on ANOTHER machine on your network, replace
            # host.docker.internal with that machine's IP, e.g.
            #   http://192.168.1.50:11434/api
            # (On that remote machine, set OLLAMA_HOST=0.0.0.0 so it accepts LAN connections.)
            - OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api
            - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY:-default-random-key}
            - OPENAI_API_KEY=${OPENAI_API_KEY:-}
        deploy:
            resources:
                reservations:
                    devices:
                        - driver: nvidia
                          count: all
                          capabilities: [gpu]
        extra_hosts:
            - "host.docker.internal:host-gateway"
        restart: always
 volumes:
    open-webui:
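The compose file above falls back to the literal `default-random-key` when `WEBUI_SECRET_KEY` is unset, and the `.gitignore` in this commit already excludes `.env`. One way to supply a real key is a `.env` file next to `docker-compose.yml`, which Docker Compose reads for variable substitution. The Python sketch below generates one; the helper names and the 32-byte key length are assumptions made for illustration, not something the commit prescribes.

```python
import secrets
from pathlib import Path


def env_line(name: str = "WEBUI_SECRET_KEY", nbytes: int = 32) -> str:
    """Return a NAME=value line with a random hex key of 2 * nbytes characters."""
    return f"{name}={secrets.token_hex(nbytes)}"


def write_env_file(path: Path = Path(".env")) -> bool:
    """Create the .env file with a fresh key; leave an existing file untouched."""
    if path.exists():
        return False
    path.write_text(env_line() + "\n", encoding="utf-8")
    return True
```

Running `write_env_file()` once from the repository folder, then `docker compose up -d`, would make the container start with the generated key instead of the fallback.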