# Open WebUI (CUDA) + Ollama

A Dockerized [Open WebUI](https://github.com/open-webui/open-webui) setup using the **CUDA/GPU** image, wired up to talk to an [Ollama](https://ollama.com) backend for local LLMs.

This runs Open WebUI in a container with NVIDIA GPU acceleration and serves the web interface on **port `3012`**.

---

## Prerequisites (Windows)

You'll need the following installed on your Windows machine:

1. **Git**: https://git-scm.com/download/win
2. **Docker Desktop**: https://www.docker.com/products/docker-desktop/
   - During/after install, make sure the **WSL 2 backend** is enabled (Docker Desktop → Settings → General → *Use the WSL 2 based engine*).
3. **NVIDIA GPU + drivers**: this uses the CUDA image, so you need an NVIDIA GPU.
   - Install the latest **NVIDIA Game Ready / Studio driver**: https://www.nvidia.com/download/index.aspx
   - GPU passthrough to Docker works automatically through WSL 2 with a recent driver; you do **not** need to install the CUDA toolkit separately.
   - In Docker Desktop → Settings → Resources → WSL integration, make sure your WSL distro is enabled.
4. **Ollama**: https://ollama.com/download/windows
   - This is what actually runs the models. Open WebUI is just the front end.

> 💡 No NVIDIA GPU? See [Running without a GPU](#running-without-a-gpu) at the bottom.

---

## 1. Clone the repository

Open **PowerShell** (or Git Bash / Windows Terminal) and run:

```powershell
git clone https://your-gitea-server/your-username/OpenWebUI-Cuda.git
cd OpenWebUI-Cuda
```

> Replace the URL above with the actual clone URL shown on the Gitea repo page (the green **Clone** button). If the repo is private, Gitea will prompt for your username and password / access token.

---

## 2. Set up Ollama and pull your models

Open WebUI does **not** download models itself; Ollama does. First, make sure Ollama is running and has at least one model.

1. Once installed, Ollama runs automatically in the background (check the system tray). You can confirm the install with:

   ```powershell
   ollama --version
   ```

2. Pull a model (this downloads it). For example:

   ```powershell
   ollama pull llama3.2
   ollama pull qwen2.5
   ```

3. List what you have available:

   ```powershell
   ollama list
   ```

Anything listed here will show up in Open WebUI's model dropdown once connected.

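
To double-check that the Ollama server itself is answering, and not only the CLI, the same list can be read over HTTP. The following is a minimal sketch, assuming Ollama is listening on its default address `http://localhost:11434`; `Get-OllamaModelName` is a hypothetical helper name, not something Ollama ships.

```powershell
# Hypothetical helper: returns the names of the models the Ollama server reports.
function Get-OllamaModelName {
    param(
        # Assumed default: a local Ollama install on its standard port.
        [string]$BaseUrl = 'http://localhost:11434'
    )

    # GET /api/tags returns a JSON object with a "models" array.
    $response = Invoke-RestMethod -Uri "$BaseUrl/api/tags"

    # Emit one name per model, e.g. "llama3.2:latest".
    $response.models | ForEach-Object { $_.name }
}
```

Running `Get-OllamaModelName` should print the same names as `ollama list`. A connection error here suggests that Ollama is not running.
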

---

## 3. (Optional) Point Open WebUI at Ollama on another PC

**By default this works out of the box** if Ollama is running on the *same* Windows PC. The compose file is already set to:

```yaml
- OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api
```

`host.docker.internal` lets the container reach Ollama running on your host machine, so no changes are needed for the common case.

**Only** if you want to use Ollama running on a **different machine on your network**, edit `docker-compose.yml` and swap in that machine's IP, e.g.:

```yaml
- OLLAMA_API_BASE_URL=http://192.168.1.50:11434/api
```

> By default Ollama only listens on `localhost`. If Ollama is on a **different** machine, you must also set the `OLLAMA_HOST=0.0.0.0` environment variable on that machine so it accepts connections from the network. (Not needed if it's on the same PC.)

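
For the remote-machine case, the two steps can be sketched in PowerShell as below. This is an illustration under assumptions, not the only way to do it: both function names are hypothetical, `192.168.1.50` is the example address from above, and the check relies on Ollama's `/api/version` endpoint. Windows Firewall on the remote machine may also need to allow inbound connections on port `11434`.

```powershell
# Step 1, on the REMOTE machine that hosts Ollama: persist OLLAMA_HOST for the
# current Windows user. Quit and restart Ollama afterwards so it sees the value.
function Enable-OllamaNetworkAccess {
    [Environment]::SetEnvironmentVariable('OLLAMA_HOST', '0.0.0.0', 'User')
}

# Step 2, on the Open WebUI machine: ask the remote Ollama for its version.
# Returns $true if the server answered within the timeout, $false otherwise.
function Test-OllamaReachable {
    param(
        [Parameter(Mandatory)][string]$Address,  # e.g. the example IP 192.168.1.50
        [int]$Port = 11434                       # Ollama's default port
    )

    try {
        $null = Invoke-RestMethod -Uri "http://${Address}:$Port/api/version" -TimeoutSec 5
        return $true
    } catch {
        # Refused, timed out, or blocked by a firewall.
        return $false
    }
}
```

For example, `Test-OllamaReachable -Address 192.168.1.50` returning `False` suggests fixing the network path before editing `docker-compose.yml`.
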

---

## 4. Start it up

From inside the `OpenWebUI-Cuda` folder:

```powershell
docker compose up -d
```

The first run will download the Open WebUI CUDA image (a few GB), so give it a few minutes.

Then open your browser to:

### 👉 http://localhost:3012

On first visit, create an admin account (the first account registered becomes the admin). Your models from `ollama list` should appear in the model selector at the top.

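
If the page does not load right away, the container may still be starting; `docker compose ps` shows its state. As a rough sketch, the helper below polls the web UI until it reports healthy. It assumes the `/health` endpoint that the Open WebUI image uses for its own container health check, and `Wait-OpenWebUI` is a hypothetical name.

```powershell
# Hypothetical helper: polls Open WebUI until it reports healthy or time runs out.
function Wait-OpenWebUI {
    param(
        # Assumed health endpoint on the port published by this setup.
        [string]$Url = 'http://localhost:3012/health',
        [int]$TimeoutSeconds = 300
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ((Get-Date) -lt $deadline) {
        try {
            $health = Invoke-RestMethod -Uri $Url -TimeoutSec 5
            if ($health.status -eq $true) { return $true }
        } catch {
            # Not listening yet; fall through and retry.
        }
        Start-Sleep -Seconds 5
    }
    return $false
}
```

`Wait-OpenWebUI` returns `True` once the UI answers, or `False` if the timeout passes first, in which case `docker compose logs -f` is the next place to look.
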

---

## Everyday commands

| Action | Command |
|---|---|
| Start (in background) | `docker compose up -d` |
| Stop | `docker compose down` |
| Restart | `docker compose restart` |
| View logs | `docker compose logs -f` |
| Update to latest Open WebUI | `docker compose pull` then `docker compose up -d` |

Your chats, settings, and accounts are stored in a Docker volume (`open-webui`) and survive restarts and updates.

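
Because everything lives in that volume, a copy of it is a full backup. The sketch below archives the volume with a throwaway Alpine container. `Backup-OpenWebUIVolume` is a hypothetical helper, and the volume name is an assumption: Compose may prefix it with the project name, so check `docker volume ls` first. Stopping the stack with `docker compose down` beforehand is the safer choice.

```powershell
# Hypothetical helper: archives a Docker volume into a .tar.gz in $Destination.
function Backup-OpenWebUIVolume {
    param(
        # Assumed volume name; confirm the real one with `docker volume ls`.
        [string]$VolumeName = 'open-webui',
        # Folder that receives the archive; defaults to the current directory.
        [string]$Destination = (Get-Location).Path,
        [string]$ArchiveName = 'open-webui-backup.tar.gz'
    )

    # Mount the volume read-only at /data and the destination at /backup,
    # then tar the volume contents from inside the container.
    docker run --rm `
        -v "${VolumeName}:/data:ro" `
        -v "${Destination}:/backup" `
        alpine tar czf "/backup/$ArchiveName" -C /data .
}
```

Calling `Backup-OpenWebUIVolume` from the `OpenWebUI-Cuda` folder should leave `open-webui-backup.tar.gz` next to `docker-compose.yml`.
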

---

## Troubleshooting

**No models show up in the dropdown**
- Run `ollama list`. Do you actually have models pulled?
- Double-check the `OLLAMA_API_BASE_URL` in `docker-compose.yml` ([step 3](#3-optional-point-open-webui-at-ollama-on-another-pc)).
- Make sure Ollama is running (system tray icon, or `ollama list` works).
- After editing the compose file, re-run `docker compose up -d` to apply changes.

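
If the checks above pass but the dropdown is still empty, it can help to query Ollama from inside the container, since that is the network path Open WebUI actually uses. This is a sketch under assumptions: the Compose service is assumed to be named `open-webui`, `curl` is assumed to be available in the image, and `Test-OllamaFromContainer` is a hypothetical helper.

```powershell
# Hypothetical helper: queries Ollama from inside the Open WebUI container.
function Test-OllamaFromContainer {
    param(
        # Assumed Compose service name; use the one in your docker-compose.yml.
        [string]$Service = 'open-webui',
        # Should be your OLLAMA_API_BASE_URL value followed by /tags.
        [string]$TagsUrl = 'http://host.docker.internal:11434/api/tags'
    )

    # --fail makes curl exit non-zero on HTTP errors instead of printing them.
    docker compose exec $Service curl --silent --fail $TagsUrl
}
```

A JSON list of models means the container can reach Ollama; an error points at the URL or at the network between the two.
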

**`docker compose up` fails with a GPU / nvidia error**
- Update your NVIDIA driver (see prerequisites).
- Make sure Docker Desktop is using the WSL 2 backend.
- Test GPU access with: `docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi`
- If you don't have an NVIDIA GPU, see [Running without a GPU](#running-without-a-gpu).

**Port 3012 already in use**
- Change the left side of the ports line in `docker-compose.yml`, e.g. `"3013:8080"`, then use that port in your browser.

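
Before changing the port, it may be worth seeing what is holding it. The sketch below uses the Windows-only `Get-NetTCPConnection` cmdlet; `Get-PortOwner` is a hypothetical helper name.

```powershell
# Hypothetical helper: lists the process(es) listening on a local TCP port.
function Get-PortOwner {
    param([int]$Port = 3012)

    # Find listening sockets on the port, then resolve each owning process ID.
    Get-NetTCPConnection -LocalPort $Port -State Listen -ErrorAction SilentlyContinue |
        ForEach-Object { Get-Process -Id $_.OwningProcess } |
        Select-Object -Property Id, ProcessName -Unique
}
```

No output from `Get-PortOwner` means nothing is listening on `3012`. If the owner turns out to be a Docker process, another container is probably publishing the same port.
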

---

## Running without a GPU

The CUDA image requires an NVIDIA GPU. If the machine doesn't have one, switch to the standard image: in `docker-compose.yml`, change

```yaml
image: ghcr.io/open-webui/open-webui:cuda
```

to

```yaml
image: ghcr.io/open-webui/open-webui:main
```

and delete the entire `deploy:` block (the `resources` / `nvidia` GPU section). Ollama will still use the GPU itself if one is present; this change only affects Open WebUI's own acceleration.