commit 9518fc105250e6a7521939de5c23f401adb4e086
Author: bizzle
Date:   Sun Jun 21 20:53:16 2026 -0400

    docs: add Windows setup README and default Ollama URL to localhost

    Add README with Windows clone/run instructions and Ollama model setup.
    Default OLLAMA_API_BASE_URL to host.docker.internal so it works out of
    the box on the same PC, with comments on pointing it at a remote PC.

    Co-authored-by: Claude

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..aef18e4
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,8 @@
+# Local environment / secrets
+.env
+.env.*
+!.env.example
+
+# OS / editor cruft
+.DS_Store
+Thumbs.db
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..2181341
--- /dev/null
+++ b/README.md
@@ -0,0 +1,152 @@
+# Open WebUI (CUDA) + Ollama
+
+A Dockerized [Open WebUI](https://github.com/open-webui/open-webui) setup using the **CUDA/GPU** image, wired up to talk to an [Ollama](https://ollama.com) backend for local LLMs.
+
+This runs Open WebUI in a container with NVIDIA GPU acceleration and serves the web interface on **port `3012`**.
+
+---
+
+## Prerequisites (Windows)
+
+You'll need the following installed on your Windows machine:
+
+1. **Git** — https://git-scm.com/download/win
+2. **Docker Desktop** — https://www.docker.com/products/docker-desktop/
+   - During/after install, make sure the **WSL 2 backend** is enabled (Docker Desktop → Settings → General → *Use the WSL 2 based engine*).
+3. **NVIDIA GPU + drivers** — this uses the CUDA image, so you need an NVIDIA GPU.
+   - Install the latest **NVIDIA Game Ready / Studio driver**: https://www.nvidia.com/download/index.aspx
+   - GPU passthrough to Docker works automatically through WSL 2 with a recent driver — you do **not** need to install the CUDA toolkit separately.
+   - In Docker Desktop → Settings → Resources → make sure your WSL distro is enabled.
+4. **Ollama** — https://ollama.com/download/windows
+   - This is what actually runs the models. Open WebUI is just the front end.
+
+> 💡 No NVIDIA GPU? See [Running without a GPU](#running-without-a-gpu) at the bottom.
+
+---
+
+## 1. Clone the repository
+
+Open **PowerShell** (or Git Bash / Windows Terminal) and run:
+
+```powershell
+git clone https://your-gitea-server/your-username/OpenWebUI-Cuda.git
+cd OpenWebUI-Cuda
+```
+
+> Replace the URL above with the actual clone URL shown on the Gitea repo page (the green **Clone** button). If the repo is private, Gitea will prompt for your username and password / access token.
+
+---
+
+## 2. Set up Ollama and pull your models
+
+Open WebUI does **not** download models itself — Ollama does. So first make sure Ollama is running and has at least one model.
+
+1. After installing Ollama, it runs automatically in the background (check the system tray). You can confirm with:
+
+   ```powershell
+   ollama --version
+   ```
+
+2. Pull a model (this downloads it). For example:
+
+   ```powershell
+   ollama pull llama3.2
+   ollama pull qwen2.5
+   ```
+
+3. List what you have available:
+
+   ```powershell
+   ollama list
+   ```
+
+   Anything listed here will show up in Open WebUI's model dropdown once connected.
+
+---
+
+## 3. (Optional) Point Open WebUI at Ollama on another PC
+
+**By default this works out of the box** if Ollama is running on the *same* Windows PC. The compose file is already set to:
+
+```yaml
+- OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api
+```
+
+`host.docker.internal` lets the container reach Ollama running on your host machine — no changes needed for the common case.
+
+**Only** if you want to use Ollama running on a **different machine on your network**, edit `docker-compose.yml` and swap in that machine's IP, e.g.:
+
+```yaml
+- OLLAMA_API_BASE_URL=http://192.168.1.50:11434/api
+```
+
+> By default Ollama only listens on `localhost`. If Ollama is on a **different** machine, you must also set the `OLLAMA_HOST=0.0.0.0` environment variable on that machine so it accepts connections from the network. (Not needed if it's on the same PC.)
+
+---
+
+## 4. Start it up
+
+From inside the `OpenWebUI-Cuda` folder:
+
+```powershell
+docker compose up -d
+```
+
+The first run will download the Open WebUI CUDA image (a few GB), so give it a few minutes.
+
+Then open your browser to:
+
+### 👉 http://localhost:3012
+
+On first visit, create an admin account (the first account registered becomes the admin). Your models from `ollama list` should appear in the model selector at the top.
+
+---
+
+## Everyday commands
+
+| Action | Command |
+|---|---|
+| Start (in background) | `docker compose up -d` |
+| Stop | `docker compose down` |
+| Restart | `docker compose restart` |
+| View logs | `docker compose logs -f` |
+| Update to latest Open WebUI | `docker compose pull` then `docker compose up -d` |
+
+Your chats, settings, and accounts are stored in a Docker volume (`open-webui`) and survive restarts and updates.
+
+---
+
+## Troubleshooting
+
+**No models show up in the dropdown**
+- Run `ollama list` — do you actually have models pulled?
+- Double-check the `OLLAMA_API_BASE_URL` in `docker-compose.yml` ([step 3](#3-optional-point-open-webui-at-ollama-on-another-pc)).
+- Make sure Ollama is running (system tray icon, or `ollama list` works).
+- After editing the compose file, re-run `docker compose up -d` to apply changes.
+
+**`docker compose up` fails with a GPU / nvidia error**
+- Update your NVIDIA driver (see prerequisites).
+- Make sure Docker Desktop is using the WSL 2 backend.
+- Test GPU access with: `docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi`
+- If you don't have an NVIDIA GPU, see below.
+
+**Port 3012 already in use**
+- Change the left side of the ports line in `docker-compose.yml`, e.g. `"3013:8080"`, then use that port in your browser.
+
+---
+
+## Running without a GPU
+
+The CUDA image requires an NVIDIA GPU. If your friend doesn't have one, switch to the standard image: in `docker-compose.yml`, change
+
+```yaml
+image: ghcr.io/open-webui/open-webui:cuda
+```
+
+to
+
+```yaml
+image: ghcr.io/open-webui/open-webui:main
+```
+
+and delete the entire `deploy:` block (the `resources` / `nvidia` GPU section). Ollama will still use the GPU itself if one is present — this only affects Open WebUI's own acceleration.
diff --git a/docker-compose.yml b/docker-compose.yml
new file mode 100755
index 0000000..c456d86
--- /dev/null
+++ b/docker-compose.yml
@@ -0,0 +1,30 @@
+services:
+  open-webui2:
+    image: ghcr.io/open-webui/open-webui:cuda
+    container_name: open-webui2
+    ports:
+      - "3012:8080"
+    volumes:
+      - open-webui:/app/backend/data
+    environment:
+      # Defaults to Ollama running on this same PC (the Docker host).
+      # To use Ollama on ANOTHER machine on your network, replace
+      # host.docker.internal with that machine's IP, e.g.
+      #   http://192.168.1.50:11434/api
+      # (On that remote machine, set OLLAMA_HOST=0.0.0.0 so it accepts LAN connections.)
+      - OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api
+      - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY:-default-random-key}
+      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all
+              capabilities: [gpu]
+    extra_hosts:
+      - "host.docker.internal:host-gateway"
+    restart: always
+
+volumes:
+  open-webui:
\ No newline at end of file
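
Review note: the `OLLAMA_API_BASE_URL` value in the compose file can be checked from the Windows host before starting the container. The following is only a minimal sketch and is not part of this commit. It assumes Python 3.9+ is installed on the host (the README does not list it as a prerequisite) and relies on Ollama's `/api/tags` endpoint, which lists the locally pulled models. The function names `tags_url`, `host_side_url`, `list_models`, and `check` are hypothetical helpers introduced for illustration. Swapping `host.docker.internal` for `localhost` is an assumption that the check runs on the same PC as Ollama.

```python
import json
import urllib.error
import urllib.request


def tags_url(api_base_url: str) -> str:
    """Build the model-listing URL from a base URL that ends in /api."""
    return api_base_url.rstrip("/") + "/tags"


def host_side_url(api_base_url: str) -> str:
    """Replace the container-side hostname with localhost for a host-side check.

    Assumption: the script runs on the same PC as Ollama, outside Docker.
    Remote addresses such as http://192.168.1.50:11434/api pass through unchanged.
    """
    return api_base_url.replace("host.docker.internal", "localhost")


def list_models(api_base_url: str, timeout: float = 5.0) -> list[str]:
    """Return the model names Ollama reports at /api/tags."""
    with urllib.request.urlopen(tags_url(api_base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    return [m.get("name", "?") for m in payload.get("models", [])]


def check(api_base_url: str) -> str:
    """Return a one-line status message instead of raising on connection errors."""
    try:
        names = list_models(host_side_url(api_base_url))
    except (urllib.error.URLError, OSError, ValueError) as exc:
        return f"Ollama not reachable at {api_base_url}: {exc}"
    if not names:
        return "Ollama is up, but no models are pulled yet."
    return "Models: " + ", ".join(names)
```

Calling `print(check("http://host.docker.internal:11434/api"))` from a Python prompt on the host should report either the pulled models or a connection error, which maps onto the first three bullets of the "No models show up in the dropdown" troubleshooting entry. For a remote Ollama machine, pass that machine's URL instead.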