# Open WebUI (CUDA) + Ollama

A Dockerized [Open WebUI](https://github.com/open-webui/open-webui) setup using the **CUDA/GPU** image, wired up to talk to an [Ollama](https://ollama.com) backend for local LLMs.

This runs Open WebUI in a container with NVIDIA GPU acceleration and serves the web interface on **port `3012`**.

---

## Prerequisites (Windows)

You'll need the following installed on your Windows machine:

1. **Git** — https://git-scm.com/download/win
2. **Docker Desktop** — https://www.docker.com/products/docker-desktop/
   - During/after install, make sure the **WSL 2 backend** is enabled (Docker Desktop → Settings → General → *Use the WSL 2 based engine*).
3. **NVIDIA GPU + drivers** — this uses the CUDA image, so you need an NVIDIA GPU.
   - Install the latest **NVIDIA Game Ready / Studio driver**: https://www.nvidia.com/download/index.aspx
   - GPU passthrough to Docker works automatically through WSL 2 with a recent driver — you do **not** need to install the CUDA toolkit separately.
   - In Docker Desktop → Settings → Resources → WSL Integration, make sure your WSL distro is enabled.
4. **Ollama** — https://ollama.com/download/windows
   - This is what actually runs the models. Open WebUI is just the front end.

> 💡 No NVIDIA GPU? See [Running without a GPU](#running-without-a-gpu) at the bottom.

---

## 1. Clone the repository

Open **PowerShell** (or Git Bash / Windows Terminal) and run:

```powershell
git clone https://your-gitea-server/your-username/OpenWebUI-Cuda.git
cd OpenWebUI-Cuda
```

> Replace the URL above with the actual clone URL shown on the Gitea repo page (the green **Clone** button). If the repo is private, Gitea will prompt for your username and password / access token.

---

## 2. Set up Ollama and pull your models

Open WebUI does **not** download models itself — Ollama does. So first make sure Ollama is running and has at least one model.

1. Ollama starts automatically in the background after installation (look for its icon in the system tray). To confirm the command-line tool is installed, run:

   ```powershell
   ollama --version
   ```

2. Pull a model (this downloads it). For example:

   ```powershell
   ollama pull llama3.2
   ollama pull qwen2.5
   ```

3. List what you have available:

   ```powershell
   ollama list
   ```

   Anything listed here will show up in Open WebUI's model dropdown once connected.
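
To check from a script what Open WebUI will see, you can query Ollama's local HTTP API directly. The sketch below assumes Python 3.9+ is installed (nothing else in this setup requires it) and that Ollama is listening on its default address, `http://localhost:11434`. The function names are illustrative and not part of this repo.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON returned by Ollama's /api/tags endpoint."""
    return [m["name"] for m in tags_payload.get("models", [])]


def fetch_model_names(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama instance which models it has pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.load(resp))
```

With Ollama running, `fetch_model_names()` should return the same names that `ollama list` prints. If it raises a connection error, Ollama is probably not running.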

---

## 3. (Optional) Point Open WebUI at Ollama on another PC

**This works out of the box** if Ollama is running on the *same* Windows PC. The compose file is already set to:

```yaml
- OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api
```

`host.docker.internal` lets the container reach Ollama running on your host machine — no changes needed for the common case.

Edit `docker-compose.yml` **only** if you want to use Ollama running on a **different machine on your network**. In that case, swap in that machine's IP, e.g.:

```yaml
- OLLAMA_API_BASE_URL=http://192.168.1.50:11434/api
```

> By default Ollama only listens on `localhost`. If Ollama is on a **different** machine, you must also set the `OLLAMA_HOST=0.0.0.0` environment variable on that machine so it accepts connections from the network. (Not needed if it's on the same PC.)
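
Before editing the compose file for a remote machine, it can help to confirm that the other PC's Ollama port is reachable from yours. This is a minimal sketch, assuming Python 3 is available. `192.168.1.50` is just the example address used above, and the helper names are made up for illustration.

```python
import socket


def ollama_env_line(host: str, port: int = 11434) -> str:
    """Build the environment entry used in docker-compose.yml for a given Ollama host."""
    return f"- OLLAMA_API_BASE_URL=http://{host}:{port}/api"


def port_is_reachable(host: str, port: int = 11434, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

If `port_is_reachable("192.168.1.50")` returns `False`, likely causes are `OLLAMA_HOST` not being set on the remote machine or a firewall blocking port `11434`. Once it returns `True`, `ollama_env_line("192.168.1.50")` gives the line to paste into the compose file.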

---

## 4. Start it up

From inside the `OpenWebUI-Cuda` folder:

```powershell
docker compose up -d
```

The first run will download the Open WebUI CUDA image (a few GB), so give it a few minutes.

Then open your browser to:

### 👉 http://localhost:3012

On first visit, create an admin account (the first account registered becomes the admin). Your models from `ollama list` should appear in the model selector at the top.
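
The container can take a little while to start answering after `docker compose up -d` returns. Here is a rough sketch of a readiness check, assuming Python 3 is installed and that the root page answers with HTTP 200 once the app is up. The function names, attempt count, and delay are arbitrary choices.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # urllib's URLError and HTTPError are OSError subclasses, as are
        # refused connections and timeouts.
        return False


def wait_until_ready(
    url: str = "http://localhost:3012",
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], bool] = http_ok,
) -> bool:
    """Poll url up to `attempts` times, sleeping `delay` seconds between failed tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

The `probe` parameter exists only so the polling loop can be exercised without a running server. In normal use, `wait_until_ready()` with no arguments is enough.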

---

## Everyday commands

| Action | Command |
|---|---|
| Start (in background) | `docker compose up -d` |
| Stop | `docker compose down` |
| Restart | `docker compose restart` |
| View logs | `docker compose logs -f` |
| Update to latest Open WebUI | `docker compose pull` then `docker compose up -d` |

Your chats, settings, and accounts are stored in a Docker volume (`open-webui`) and survive restarts and updates.

---

## Troubleshooting

**No models show up in the dropdown**
- Run `ollama list` — do you actually have models pulled?
- Double-check the `OLLAMA_API_BASE_URL` in `docker-compose.yml` ([step 3](#3-optional-point-open-webui-at-ollama-on-another-pc)).
- Make sure Ollama is running (system tray icon, or `ollama list` works).
- After editing the compose file, re-run `docker compose up -d` to apply changes.

**`docker compose up` fails with a GPU / nvidia error**
- Update your NVIDIA driver (see prerequisites).
- Make sure Docker Desktop is using the WSL 2 backend.
- Test GPU access with: `docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi`
- If you don't have an NVIDIA GPU, see [Running without a GPU](#running-without-a-gpu).

**Port 3012 already in use**
- Change the left side of the ports line in `docker-compose.yml`, e.g. `"3013:8080"`, then use that port in your browser.
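
To find out in advance whether a host port is taken, one option is to try binding it. This is a heuristic sketch, assuming Python 3 is installed. The helper names are hypothetical, and a successful bind only shows that the port was free at the moment of the check.

```python
import socket
from typing import Optional


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start: int = 3012, limit: int = 20, host: str = "0.0.0.0") -> Optional[int]:
    """Return the first free port in [start, start + limit), or None if all are taken."""
    for port in range(start, start + limit):
        if port_is_free(port, host):
            return port
    return None
```

If `port_is_free(3012)` is `False`, the value returned by `first_free_port()` is a candidate for the left side of the ports line.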

---

## Running without a GPU

The CUDA image requires an NVIDIA GPU. If you don't have one, switch to the standard image: in `docker-compose.yml`, change

```yaml
image: ghcr.io/open-webui/open-webui:cuda
```

to

```yaml
image: ghcr.io/open-webui/open-webui:main
```

and delete the entire `deploy:` block (the `resources` / `nvidia` GPU section). Ollama will still use the GPU itself if one is present — this only affects Open WebUI's own acceleration.