# BTC ML Trading Strategy Optimizer

An automated optimization loop that trains ML models on BTC/USDT data, backtests trading strategies, and uses an LLM to iteratively improve the configuration.

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        Optimization Loop                        │
│                                                                 │
│  ┌──────────┐    ┌───────────────┐    ┌──────────────────────┐  │
│  │ VPS      │───>│ Windows PC    │───>│ Mac Mini             │  │
│  │ (Orch.)  │<───│ (GPU/ML)      │    │ (LLM)                │  │
│  │          │<───────────────────────>│                      │  │
│  │ - Fetch  │    │ - XGBoost     │    │ - Ollama             │  │
│  │   data   │    │ - LightGBM    │    │ - qwen3.5:27b        │  │
│  │ - Coord  │    │ - CatBoost    │    │ - Analyze results    │  │
│  │ - Store  │    │ - RTX 4070 Ti │    │ - Suggest changes    │  │
│  └──────────┘    └───────────────┘    └──────────────────────┘  │
│       ▲                                        │                │
│       └────────────────────────────────────────┘                │
│                    Modified config                              │
└─────────────────────────────────────────────────────────────────┘
```

### Machines (Tailscale)

| Machine    | Role         | Address        | Key Resources       |
|------------|--------------|----------------|---------------------|
| VPS        | Orchestrator | localhost      | Coordination, data  |
| Windows PC | ML Engine    | 100.76.218.38  | RTX 4070 Ti GPU     |
| Mac Mini   | LLM          | 100.100.242.21 | Ollama, qwen3.5:27b |

## Directory Structure

```
btc-ml-optimizer/
├── orchestrator.py                  # Main loop — coordinates everything
├── ml_engine/
│   └── train_and_backtest.py        # Self-contained ML script (runs on Windows)
├── llm_client/
│   └── analyzer.py                  # LLM strategy analyzer (calls Mac Mini)
├── scripts/
│   ├── fetch_data.py                # BTC/USDT data fetcher (ccxt)
│   └── setup_windows.sh             # Install deps on Windows PC
├── config/
│   └── initial_config.json          # Starting configuration
├── data/                            # OHLCV CSV files
├── results/                         # Iteration results + logs
├── requirements_vps.txt             # VPS Python dependencies
└── requirements_windows.txt         # Windows PC Python dependencies
```

## Setup

### 1. VPS (this machine)

```bash
pip install -r requirements_vps.txt
```

### 2. Windows PC

```bash
# From VPS — installs all ML deps on Windows via SSH
bash scripts/setup_windows.sh
```

Or manually on Windows:

```bash
pip install -r requirements_windows.txt
```

### 3. Mac Mini

Ensure Ollama is running with the qwen3.5:27b model:

```bash
ollama pull qwen3.5:27b
ollama serve  # should already be running
```

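Before starting a run, it can help to confirm from the VPS that the Mac Mini actually serves the model. The following is a minimal sketch, assuming Ollama listens on its default port 11434 at the Tailscale address listed for the Mac Mini and that `GET /api/tags` returns a JSON object with a `models` list. `model_available` and `check_ollama` are hypothetical helpers for illustration, not functions from this repository.

```python
import json
import urllib.request

OLLAMA_URL = "http://100.100.242.21:11434"  # assumption: default Ollama port
MODEL = "qwen3.5:27b"


def model_available(tags_response: dict, model: str) -> bool:
    """Return True if `model` is listed in an Ollama /api/tags response."""
    names = {m.get("name") for m in tags_response.get("models", [])}
    return model in names


def check_ollama(base_url: str = OLLAMA_URL, model: str = MODEL,
                 timeout: float = 5.0) -> bool:
    """Query the Ollama server and report whether the model has been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_available(json.load(resp), model)
```

Calling `check_ollama()` raises a `urllib.error.URLError` if the server is unreachable, which is usually the more informative failure before a long optimization run.
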
## Usage

### Fetch Data

```bash
python3 scripts/fetch_data.py
```

Downloads 2 years of BTC/USDT 1h and 4h OHLCV data from Binance.

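Exchanges cap the number of candles returned per request, so two years of history has to be fetched in pages. The sketch below shows one way to do that; it is not the code in `scripts/fetch_data.py`. `fetch_history` and its `fetch_page` callback are hypothetical names, and the candle layout `[timestamp_ms, open, high, low, close, volume]` is assumed to match what ccxt returns.

```python
from typing import Callable, List

Candle = List[float]  # [timestamp_ms, open, high, low, close, volume]


def fetch_history(fetch_page: Callable[[int, int], List[Candle]],
                  start_ms: int, end_ms: int, limit: int = 1000) -> List[Candle]:
    """Page through candles from start_ms (inclusive) to end_ms (exclusive)."""
    candles: List[Candle] = []
    since = start_ms
    while since < end_ms:
        page = fetch_page(since, limit)
        if not page:
            break  # the source has no more data
        for candle in page:
            # Keep only candles inside the requested range, skipping repeats.
            if since <= candle[0] < end_ms:
                candles.append(candle)
        last_ts = int(page[-1][0])
        if last_ts < since:
            break  # guard against a page that does not advance
        since = last_ts + 1  # resume right after the last candle seen
    return candles
```

With ccxt, the callback could be `lambda since, limit: exchange.fetch_ohlcv("BTC/USDT", "1h", since=since, limit=limit)`, where `exchange = ccxt.binance()`.
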
### Run the Optimizer

```bash
python3 orchestrator.py
```

The optimizer will:

1. Ensure data is fetched
2. Upload ML engine + data to Windows PC
3. Train model and backtest on GPU
4. Send results to LLM for analysis
5. Apply LLM-suggested config changes
6. Repeat until convergence (or 50 iterations)

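The steps above amount to a train, analyze, update loop. The sketch below shows that control flow only, with the remote work injected as callables; it is not the actual `orchestrator.py`. The function names and the `"sharpe"` key in the results dictionary are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Config = Dict[str, object]
Results = Dict[str, float]


def optimize(config: Config,
             train_and_backtest: Callable[[Config], Results],
             analyze: Callable[[Config, Results], Config],
             should_stop: Callable[[List[dict]], bool],
             max_iterations: int = 50) -> Tuple[Optional[dict], List[dict]]:
    """Run the loop and return (best record, full history)."""
    history: List[dict] = []
    best: Optional[dict] = None
    for iteration in range(1, max_iterations + 1):
        results = train_and_backtest(config)  # steps 2-3: remote GPU run
        record = {"iteration": iteration, "config": config,
                  "sharpe": results["sharpe"]}  # assumed metric key
        history.append(record)
        if best is None or record["sharpe"] > best["sharpe"]:
            best = record
        if should_stop(history):  # step 6: convergence check
            break
        config = analyze(config, results)  # steps 4-5: LLM proposes a new config
    return best, history
```

Passing the three stages in as callables keeps the loop testable without a GPU host or an LLM server.
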
### Run ML Engine Standalone (on Windows)

```bash
python train_and_backtest.py --config config.json --data btc_4h.csv --output results.json
```

## Configuration Reference

### `model_type`

- `xgboost` — XGBoost with GPU (default, generally best)
- `lightgbm` — LightGBM with GPU (faster training)
- `catboost` — CatBoost with GPU (handles interactions well)
- `ensemble` — Soft voting of all three

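Soft voting averages the predicted class probabilities of the individual models instead of counting their hard votes. A minimal sketch follows, assuming equal weights by default; `soft_vote` is a hypothetical helper, and the weighting the ML engine actually uses is not documented here.

```python
from typing import Dict, List, Optional


def soft_vote(probas: Dict[str, List[float]],
              weights: Optional[Dict[str, float]] = None) -> List[float]:
    """Average per-sample positive-class probabilities across models."""
    if not probas:
        raise ValueError("need at least one model's probabilities")
    names = list(probas)
    n = len(probas[names[0]])
    if any(len(probas[name]) != n for name in names):
        raise ValueError("all models must score the same number of samples")
    # Models without an explicit weight count as 1.0 (assumption).
    w = {name: (weights or {}).get(name, 1.0) for name in names}
    total = sum(w.values())
    return [sum(w[name] * probas[name][i] for name in names) / total
            for i in range(n)]
```

For example, probabilities of 0.6, 0.9 and 0.3 from the three models average to 0.6, which would clear an `entry_threshold` of 0.55 even though one model disagrees.
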
### `features`

- `technical_indicators` — List of indicators to compute
- `lookback_periods` — Windows for return/volatility features
- `use_volume_features` — Include volume-derived features
- `use_volatility_features` — Include volatility features
- `use_candle_patterns` — Include candlestick pattern features
- `use_lag_features` — Include lagged feature values
- `lag_periods` — Specific lag periods to use

### `target`

- `direction` — `"long"` or `"both"`
- `horizon_candles` — Forward-looking prediction window
- `threshold_pct` — Minimum % move to label as positive

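To make the three target settings concrete, here is a sketch of one plausible labeling rule: a candle is positive when the close `horizon_candles` ahead is at least `threshold_pct` percent higher. Treating `"both"` as also labeling equally large drops with `-1` is an assumption, and `make_labels` is a hypothetical helper rather than the engine's own function.

```python
from typing import List


def make_labels(closes: List[float], horizon_candles: int,
                threshold_pct: float, direction: str = "long") -> List[int]:
    """Label each candle by its forward return over `horizon_candles`."""
    labels: List[int] = []
    # The last `horizon_candles` candles have no future close, so they get no label.
    for i in range(len(closes) - horizon_candles):
        ret_pct = (closes[i + horizon_candles] / closes[i] - 1.0) * 100.0
        if ret_pct >= threshold_pct:
            labels.append(1)   # large enough upward move
        elif direction == "both" and ret_pct <= -threshold_pct:
            labels.append(-1)  # large enough downward move (assumed encoding)
        else:
            labels.append(0)   # move too small to trade
    return labels
```

Because the label looks `horizon_candles` into the future, features for candle `i` must only use data available at or before `i` to avoid leakage.
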
### `hyperparameters`

Standard gradient boosting params: `learning_rate`, `max_depth`, `n_estimators`, `subsample`, `colsample_bytree`, `min_child_weight`, `gamma`, `reg_alpha`, `reg_lambda`

### `strategy`

- `entry_threshold` — Min probability to enter trade (0.5-0.8)
- `stop_loss_pct` — Stop loss percentage
- `take_profit_pct` — Take profit percentage
- `trailing_stop_pct` — Trailing stop distance
- `position_sizing` — `"confidence_scaled"` or `"fixed"`
- `min_confidence_to_trade` — Absolute minimum confidence

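The difference between the two `position_sizing` modes can be illustrated with a short sketch. It assumes a linear ramp from a minimum fraction at `entry_threshold` up to the full fraction at probability 1.0; the real scaling curve is not specified in this document, and `position_fraction`, `min_fraction` and `max_fraction` are hypothetical names.

```python
def position_fraction(prob: float,
                      entry_threshold: float = 0.6,
                      min_confidence_to_trade: float = 0.55,
                      position_sizing: str = "confidence_scaled",
                      min_fraction: float = 0.25,
                      max_fraction: float = 1.0) -> float:
    """Return the fraction of capital to allocate for a predicted probability."""
    # No trade below either gate.
    if prob < max(entry_threshold, min_confidence_to_trade):
        return 0.0
    if position_sizing == "fixed":
        return max_fraction
    if position_sizing != "confidence_scaled":
        raise ValueError(f"unknown position_sizing: {position_sizing!r}")
    span = 1.0 - entry_threshold
    scale = (prob - entry_threshold) / span if span > 0 else 1.0
    scale = min(max(scale, 0.0), 1.0)  # clamp to [0, 1]
    # Linear ramp between the minimum and maximum fraction (assumed shape).
    return min_fraction + (max_fraction - min_fraction) * scale
```

Under these assumptions a probability of 0.8 with the default threshold of 0.6 sits halfway up the ramp, giving roughly 62.5% of the maximum position.
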
### `training`

- `walk_forward_windows` — Number of walk-forward splits (3-10)
- `train_pct` / `validation_pct` / `test_pct` — Data split ratios

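Walk-forward validation keeps every split in time order, so each model is tested only on data that comes after its training data. The sketch below shows one possible scheme, in which a fixed-length window slides forward by one test block per split; how the engine actually lays out its windows is an assumption here, and `walk_forward_splits` is a hypothetical helper.

```python
from typing import List, Tuple

Split = Tuple[range, range, range]  # (train, validation, test) index ranges


def walk_forward_splits(n_samples: int, windows: int,
                        train_pct: float = 0.6,
                        validation_pct: float = 0.2,
                        test_pct: float = 0.2) -> List[Split]:
    """Build chronological train/validation/test index ranges per window."""
    if windows < 1:
        raise ValueError("windows must be at least 1")
    if abs(train_pct + validation_pct + test_pct - 1.0) > 1e-9:
        raise ValueError("split ratios must sum to 1")
    # Choose a window length so the last window ends within the data.
    window_len = int(n_samples / (1 + (windows - 1) * test_pct))
    test_len = int(window_len * test_pct)
    val_len = int(window_len * validation_pct)
    train_len = window_len - val_len - test_len
    splits: List[Split] = []
    for k in range(windows):
        start = k * test_len  # slide forward by one test block
        train_end = start + train_len
        val_end = train_end + val_len
        splits.append((range(start, train_end),
                       range(train_end, val_end),
                       range(val_end, val_end + test_len)))
    return splits
```

With this layout the test blocks of consecutive windows sit back to back, so the combined out-of-sample period is contiguous and no test candle is scored twice.
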
## Convergence Criteria

The optimizer stops when any of the following holds:

- Sharpe ratio exceeds 3.0
- Sharpe ratio improves by less than 1% over 5 consecutive iterations
- Maximum of 50 iterations is reached

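These stopping rules can be expressed as a small function over the Sharpe ratios recorded so far. The sketch reads the plateau rule as "the best Sharpe of the last 5 iterations beats the best earlier Sharpe by less than 1%", which is one interpretation and may differ from the orchestrator's exact check. `stop_reason` is a hypothetical name.

```python
from typing import List, Optional


def stop_reason(sharpes: List[float],
                target: float = 3.0,
                min_improvement: float = 0.01,
                patience: int = 5,
                max_iterations: int = 50) -> Optional[str]:
    """Return why the loop should stop, or None to keep going."""
    if not sharpes:
        return None
    if sharpes[-1] > target:
        return "target_reached"
    if len(sharpes) >= max_iterations:
        return "max_iterations"
    if len(sharpes) > patience:
        prev_best = max(sharpes[:-patience])    # best before the recent window
        recent_best = max(sharpes[-patience:])  # best within the recent window
        if recent_best - prev_best < min_improvement * abs(prev_best):
            return "plateau"
    return None
```

Returning the reason rather than a bare boolean makes it easy to record in the iteration log why a run ended.
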
## Output

- `config/best_config.json` — Best configuration found
- `results/iterations.jsonl` — Full log of every iteration
- `results/results_iter_N.json` — Detailed results per iteration

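Since `results/iterations.jsonl` holds one JSON object per line, it can be scanned without loading the whole file. The sketch below picks the record with the highest metric; the `"sharpe"` field name is an assumption about the log schema, and `best_iteration` is a hypothetical helper.

```python
import json
from typing import Iterable, Optional


def best_iteration(lines: Iterable[str], metric: str = "sharpe") -> Optional[dict]:
    """Return the JSONL record with the highest `metric`, or None if absent."""
    best: Optional[dict] = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if metric not in record:
            continue  # skip records without the metric
        if best is None or record[metric] > best[metric]:
            best = record
    return best
```

Typical usage would be `with open("results/iterations.jsonl") as f: print(best_iteration(f))`.
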