# BTC ML Trading Strategy Optimizer

An automated optimization loop that trains ML models on BTC/USDT data, backtests trading strategies, and uses an LLM to iteratively improve the configuration.

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        Optimization Loop                        │
│                                                                 │
│  ┌──────────┐    ┌───────────────┐    ┌──────────────────────┐  │
│  │   VPS    │───>│  Windows PC   │───>│     Mac Mini         │  │
│  │ (Orch.)  │<───│  (GPU/ML)     │    │     (LLM)            │  │
│  │          │<───────────────────────>│                      │  │
│  │ - Fetch  │    │ - XGBoost     │    │ - Ollama             │  │
│  │   data   │    │ - LightGBM    │    │ - qwen3.5:27b        │  │
│  │ - Coord  │    │ - CatBoost    │    │ - Analyze results    │  │
│  │ - Store  │    │ - RTX 4070 Ti │    │ - Suggest changes    │  │
│  └──────────┘    └───────────────┘    └──────────────────────┘  │
│       ▲                                        │                │
│       └────────────────────────────────────────┘                │
│                    Modified config                              │
└─────────────────────────────────────────────────────────────────┘
```

### Machines (Tailscale)

| Machine    | Role         | Address        | Key Resources       |
|------------|--------------|----------------|---------------------|
| VPS        | Orchestrator | localhost      | Coordination, data  |
| Windows PC | ML Engine    | 100.76.218.38  | RTX 4070 Ti GPU     |
| Mac Mini   | LLM          | 100.100.242.21 | Ollama, qwen3.5:27b |

## Directory Structure

```
btc-ml-optimizer/
├── orchestrator.py              # Main loop — coordinates everything
├── ml_engine/
│   └── train_and_backtest.py    # Self-contained ML script (runs on Windows)
├── llm_client/
│   └── analyzer.py              # LLM strategy analyzer (calls Mac Mini)
├── scripts/
│   ├── fetch_data.py            # BTC/USDT data fetcher (ccxt)
│   └── setup_windows.sh         # Install deps on Windows PC
├── config/
│   └── initial_config.json      # Starting configuration
├── data/                        # OHLCV CSV files
├── results/                     # Iteration results + logs
├── requirements_vps.txt         # VPS Python dependencies
└── requirements_windows.txt     # Windows PC Python dependencies
```

## Setup

### 1. VPS (this machine)

```bash
pip install -r requirements_vps.txt
```

### 2. Windows PC

```bash
# From VPS — installs all ML deps on Windows via SSH
bash scripts/setup_windows.sh
```

Or manually on Windows:
```bash
pip install -r requirements_windows.txt
```

### 3. Mac Mini

Ensure Ollama is running with the qwen3.5:27b model:
```bash
ollama pull qwen3.5:27b
ollama serve  # should already be running
```
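
To confirm from the VPS that the Mac Mini is reachable, a request to Ollama's HTTP API can be sketched as below. This assumes Ollama listens on its default port 11434 and uses the `/api/generate` endpoint; `build_request` and `ask` are hypothetical helper names, not the code in `llm_client/analyzer.py`.

```python
import json
import urllib.request

# Assumption: Ollama is exposed on its default port (11434) on the Mac Mini.
OLLAMA_URL = "http://100.100.242.21:11434/api/generate"
MODEL = "qwen3.5:27b"


def build_request(prompt: str, url: str = OLLAMA_URL, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request (hypothetical helper)."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def ask(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example (needs network access to the Mac Mini):
# print(ask("Reply with the single word: ready"))
```

A long timeout is used here because a 27B model may take a while to load on the first request.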

## Usage

### Fetch Data

```bash
python3 scripts/fetch_data.py
```

Downloads two years of BTC/USDT OHLCV data from Binance at 1h and 4h intervals and stores it as CSV files in `data/`.
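
Exchanges return candles in limited batches, so two years of history has to be fetched page by page. The sketch below shows one way to do that; `fetch_all` and `fetch_page` are hypothetical names, the page size of 1000 is an assumption, and `scripts/fetch_data.py` may be implemented differently.

```python
from typing import Callable, List

Candle = List[float]  # [timestamp_ms, open, high, low, close, volume]


def fetch_all(fetch_page: Callable[[int, int], List[Candle]],
              since_ms: int, until_ms: int, limit: int = 1000) -> List[Candle]:
    """Page forward from since_ms until until_ms is reached or the source runs dry."""
    candles: List[Candle] = []
    cursor = since_ms
    while cursor < until_ms:
        page = fetch_page(cursor, limit)
        if not page:
            break
        candles.extend(c for c in page if c[0] < until_ms)
        next_cursor = int(page[-1][0]) + 1  # resume just after the last candle seen
        if next_cursor <= cursor:           # guard against a source that never advances
            break
        cursor = next_cursor
    return candles


# Example wiring with ccxt (requires `pip install ccxt` and network access):
# import ccxt
# exchange = ccxt.binance({"enableRateLimit": True})
# page = lambda since, limit: exchange.fetch_ohlcv("BTC/USDT", "4h", since=since, limit=limit)
# rows = fetch_all(page, since_ms=start_ms, until_ms=end_ms)
```

Passing the page fetcher in as a callable keeps the paging logic testable without touching the exchange.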

### Run the Optimizer

```bash
python3 orchestrator.py
```

The optimizer will:
1. Ensure data is fetched
2. Upload ML engine + data to Windows PC
3. Train model and backtest on GPU
4. Send results to LLM for analysis
5. Apply LLM-suggested config changes
6. Repeat until a convergence criterion is met or 50 iterations have run
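
The steps above can be sketched as a small loop. This is an illustrative skeleton, not the contents of `orchestrator.py`: `run_ml`, `suggest`, and `converged` are hypothetical callables standing in for the Windows PC run, the LLM call, and the stopping rule, and the `"sharpe"` result key is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, object]
Results = Dict[str, float]


def optimize(config: Config,
             run_ml: Callable[[Config], Results],
             suggest: Callable[[Config, Results], Config],
             converged: Callable[[List[float]], bool],
             max_iterations: int = 50) -> Tuple[Config, float]:
    """Train and backtest, ask for a new config, repeat; return the best config seen."""
    best_config, best_sharpe = config, float("-inf")
    history: List[float] = []
    for _ in range(max_iterations):
        results = run_ml(config)            # steps 2-3: remote training and backtest
        sharpe = results["sharpe"]          # assumed result key
        history.append(sharpe)
        if sharpe > best_sharpe:
            best_config, best_sharpe = config, sharpe
        if converged(history):              # step 6: stop early
            break
        config = suggest(config, results)   # steps 4-5: LLM proposes the next config
    return best_config, best_sharpe
```

Tracking the best configuration separately matters because an LLM suggestion can make results worse; the last iteration is not necessarily the best one.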

### Run ML Engine Standalone (on Windows)

```bash
python train_and_backtest.py --config config.json --data btc_4h.csv --output results.json
```

## Configuration Reference

### `model_type`
- `xgboost` — XGBoost with GPU (default, generally best)
- `lightgbm` — LightGBM with GPU (faster training)
- `catboost` — CatBoost with GPU (handles interactions well)
- `ensemble` — Soft voting of all three
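
Soft voting combines the models by averaging their predicted probabilities rather than their hard class votes. A minimal sketch, assuming equal weights for the three models (the actual weighting in `train_and_backtest.py` is not documented here) and a hypothetical helper name `soft_vote`:

```python
from typing import List, Sequence


def soft_vote(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Average per-sample positive-class probabilities across models (equal weights)."""
    if not probabilities:
        raise ValueError("need at least one model")
    n_samples = len(probabilities[0])
    if any(len(p) != n_samples for p in probabilities):
        raise ValueError("all models must score the same samples")
    return [sum(column) / len(probabilities) for column in zip(*probabilities)]


# One list of probabilities per model, one entry per candle.
xgb, lgbm, cat = [0.6, 0.2], [0.8, 0.4], [0.7, 0.3]
ensemble = soft_vote([xgb, lgbm, cat])
```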

### `features`
- `technical_indicators` — List of indicators to compute
- `lookback_periods` — Windows for return/volatility features
- `use_volume_features` — Include volume-derived features
- `use_volatility_features` — Include volatility features
- `use_candle_patterns` — Include candlestick pattern features
- `use_lag_features` — Include lagged feature values
- `lag_periods` — Specific lag periods to use
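
To illustrate what `lookback_periods` and `lag_periods` could control, the sketch below computes simple returns over each lookback window and shifts a feature column by a lag. The function names and the `return_<p>` column naming are assumptions made for this example; the real feature set is defined in `train_and_backtest.py`.

```python
from typing import Dict, List, Optional, Sequence


def lookback_returns(closes: Sequence[float],
                     periods: Sequence[int]) -> Dict[str, List[Optional[float]]]:
    """Simple return over each lookback window; None where history is too short."""
    features: Dict[str, List[Optional[float]]] = {}
    for p in periods:
        column: List[Optional[float]] = [None] * len(closes)
        for i in range(p, len(closes)):
            column[i] = closes[i] / closes[i - p] - 1.0
        features[f"return_{p}"] = column
    return features


def lagged(values: Sequence[Optional[float]], lag: int) -> List[Optional[float]]:
    """Shift a column so that row i holds the value from row i - lag."""
    pad = min(lag, len(values))
    return [None] * pad + list(values[:len(values) - pad])
```

Both helpers only look backwards, which avoids leaking future prices into the features.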

### `target`
- `direction` — `"long"` or `"both"`
- `horizon_candles` — Forward-looking prediction window
- `threshold_pct` — Minimum % move to label as positive
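
One plausible reading of these settings for `direction: "long"` is sketched below: a candle is labeled positive when the close `horizon_candles` ahead is at least `threshold_pct` percent higher. Comparing close to close and treating `threshold_pct` as a percentage (1.5 means 1.5%) are assumptions; `label_long` is a hypothetical name.

```python
from typing import List, Optional, Sequence


def label_long(closes: Sequence[float], horizon_candles: int,
               threshold_pct: float) -> List[Optional[int]]:
    """1 if the close rises by at least threshold_pct over the horizon, else 0."""
    labels: List[Optional[int]] = []
    for i in range(len(closes)):
        j = i + horizon_candles
        if j >= len(closes):
            labels.append(None)  # the future is unknown for the last rows
            continue
        move_pct = (closes[j] / closes[i] - 1.0) * 100.0
        labels.append(1 if move_pct >= threshold_pct else 0)
    return labels
```

Rows labeled `None` have no complete forward window and should be dropped before training.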

### `hyperparameters`
Standard gradient boosting parameters: `learning_rate`, `max_depth`, `n_estimators`, `subsample`, `colsample_bytree`, `min_child_weight`, `gamma`, `reg_alpha`, `reg_lambda`.

### `strategy`
- `entry_threshold` — Min probability to enter trade (0.5-0.8)
- `stop_loss_pct` — Stop loss percentage
- `take_profit_pct` — Take profit percentage
- `trailing_stop_pct` — Trailing stop distance
- `position_sizing` — `"confidence_scaled"` or `"fixed"`
- `min_confidence_to_trade` — Absolute minimum confidence
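
How these settings might interact can be sketched as follows. The rules here are assumptions for illustration: a trade requires the probability to clear both `entry_threshold` and `min_confidence_to_trade`, `"confidence_scaled"` sizes linearly from 0 at 50% confidence to full size at 100%, and the stop and target values are percentages of the entry price. The helper names are hypothetical.

```python
from typing import Tuple


def position_fraction(probability: float, entry_threshold: float,
                      min_confidence_to_trade: float,
                      position_sizing: str = "confidence_scaled") -> float:
    """Fraction of capital to commit to a long trade; 0.0 means no trade."""
    if probability < max(entry_threshold, min_confidence_to_trade):
        return 0.0
    if position_sizing == "fixed":
        return 1.0
    if position_sizing == "confidence_scaled":
        # Assumed scaling: 0 at 50% confidence, 1 at 100% confidence.
        return min(max((probability - 0.5) / 0.5, 0.0), 1.0)
    raise ValueError(f"unknown position_sizing: {position_sizing!r}")


def exit_prices(entry_price: float, stop_loss_pct: float,
                take_profit_pct: float) -> Tuple[float, float]:
    """Stop-loss and take-profit price levels for a long position."""
    stop = entry_price * (1.0 - stop_loss_pct / 100.0)
    target = entry_price * (1.0 + take_profit_pct / 100.0)
    return stop, target
```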

### `training`
- `walk_forward_windows` — Number of walk-forward splits (3-10)
- `train_pct` / `validation_pct` / `test_pct` — Data split ratios
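
The sketch below shows one way walk-forward windows and split ratios could combine: the data is cut into consecutive windows, and each window is split chronologically into train, validation, and test ranges. This is an assumed scheme with ratios given as fractions (0.6 means 60%); the actual splitting in `train_and_backtest.py` may differ, for example by using an expanding training window.

```python
from typing import List, Tuple

Split = Tuple[range, range, range]  # (train, validation, test) row indices


def walk_forward_splits(n_rows: int, windows: int,
                        train_pct: float, validation_pct: float) -> List[Split]:
    """Cut rows into consecutive windows and split each one in time order."""
    if windows < 1 or n_rows < windows:
        raise ValueError("need at least one row per window")
    splits: List[Split] = []
    window_size = n_rows // windows
    for w in range(windows):
        start = w * window_size
        end = n_rows if w == windows - 1 else start + window_size  # last window takes the remainder
        size = end - start
        train_end = start + round(size * train_pct)
        val_end = train_end + round(size * validation_pct)
        splits.append((range(start, train_end), range(train_end, val_end), range(val_end, end)))
    return splits
```

The test range is whatever remains after train and validation, so `test_pct` is implied by the other two ratios in this sketch.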

## Convergence Criteria

The optimizer stops when any of the following holds:
- The Sharpe ratio exceeds 3.0
- The Sharpe ratio improves by less than 1% for 5 consecutive iterations
- The maximum of 50 iterations is reached
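
These rules can be expressed as a single check over the history of Sharpe ratios. The sketch below is one interpretation, not the orchestrator's code: "improvement" is measured relative to the best Sharpe seen so far, and `should_stop` is a hypothetical name.

```python
from typing import Sequence


def should_stop(sharpes: Sequence[float], target: float = 3.0,
                min_improvement: float = 0.01, patience: int = 5,
                max_iterations: int = 50) -> bool:
    """Return True when any stopping rule fires for the Sharpe history so far."""
    if not sharpes:
        return False
    if sharpes[-1] > target:            # rule 1: target Sharpe exceeded
        return True
    if len(sharpes) >= max_iterations:  # rule 3: iteration budget used up
        return True
    if len(sharpes) <= patience:        # not enough history to judge a plateau
        return False
    # Rule 2: none of the last `patience` iterations beat the running best by 1%.
    best = max(sharpes[:-patience])
    for s in sharpes[-patience:]:
        if s > best + abs(best) * min_improvement:
            return False
        best = max(best, s)
    return True
```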

## Output

- `config/best_config.json` — Best configuration found
- `results/iterations.jsonl` — Full log of every iteration
- `results/results_iter_N.json` — Detailed results per iteration
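
Because `results/iterations.jsonl` holds one JSON object per line, it can be scanned with a few lines of Python. The sketch below picks the record with the highest value of a metric; the field names `sharpe` and `iteration` are assumptions about the log schema, and `best_iteration` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Optional


def best_iteration(path: str, metric: str = "sharpe") -> Optional[dict]:
    """Return the JSONL record with the highest metric, or None if none has it."""
    best: Optional[dict] = None
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            if metric in record and (best is None or record[metric] > best[metric]):
                best = record
    return best


# Example:
# record = best_iteration("results/iterations.jsonl")
# print(record)
```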