Compare commits

10 Commits

Author SHA1 Message Date
BizzleBot
13bac5f654 v4: Bitcoin Accumulation Zone Monitor — on-chain metrics + backtest engine
COMPLETE PIVOT from ML trading optimizer to on-chain metrics monitor.

Architecture:
- Playwright scrapes LookIntoBitcoin Plotly Dash charts for real on-chain data
- 10 proven metrics: Puell Multiple, MVRV Z-Score, Fear & Greed, Reserve Risk,
  RHODL Ratio, NUPL, LTH Realized Price, 200W SMA, Hash Ribbons, Drawdown
- Each metric scores 0-10, composite 0-100
- No ML, no black box — every signal transparent and traceable
- Historical backtest validates scoring against actual BTC forward returns
- Recency-weighted analysis accounts for diminishing cycle returns

Full documentation in ARCHITECTURE.md
2026-03-20 23:07:53 +00:00
BizzleBot
5b3b3811ec feat: add historical backtest engine and dashboard page
- scrapers/history_collector.py: scrapes full time series from 8 LookIntoBitcoin
  charts + Fear & Greed API, stores to data/history.json (~5700 days back to 2010)
- backtesting/engine.py: scores each historical day using same thresholds as live
  scoring, computes 30d/90d/180d/1yr forward returns, bracket stats, signal events
- dashboard/server.py: adds /backtest page with dual-axis score vs price chart,
  bracket performance table, signal event list, current context box; adds backtest
  nav link and historical context box on main dashboard; 4 new API endpoints

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 22:50:57 +00:00
BizzleBot
e3c5aa9f32 chore: add .gitignore for pycache and data dirs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 22:31:35 +00:00
BizzleBot
62e32fc655 feat: replace ML optimizer with on-chain accumulation zone monitor
Complete rewrite — replaces the ML-based signal optimizer with a transparent
on-chain metric monitoring dashboard. Scrapes 10 metrics from LookIntoBitcoin
(Playwright) and free APIs, scores each 0-10, composite 0-100.

Metrics: Fear & Greed, Puell Multiple, MVRV Z-Score, Drawdown from ATH,
Price vs 200W SMA, Reserve Risk, RHODL Ratio, NUPL, LTH Realized Price,
Hash Ribbons. Auto-refreshes every 15 minutes. Settings page preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 22:31:29 +00:00
BizzleBot
aba30f7718 fix: LLM analysis + new run button + settings page support
- Fixed LLM failing silently (401 auth error on every iteration)
- Reset provider to Ollama (working) from broken OpenRouter config
- Added /api/clear endpoint + 'New Run' button to reset history
- LLM failures now logged visibly with error details
- LLM suggestions persisted to iteration data (survive restarts)
- Settings page support via llm_settings.json (multi-provider)
2026-03-20 21:51:05 +00:00
BizzleBot
c17b3b5167 v3: accumulation signal optimizer - lower initial thresholds, disable PCA, simpler model start
2026-03-19 23:55:51 +00:00
BizzleBot
560863fa0d pivot: rewrite as BTC accumulation signal optimizer
Replace day-trading bot with long-term accumulation signal model.
Predicts optimal BUY times using forward return analysis at 7d/30d/90d
horizons, scoring each candle 0-100. Primary metric is now
cost_basis_improvement_pct (model buy price vs DCA).

- train_and_backtest.py: regression models (XGBoost/LSTM hybrid),
  accumulation-focused features (price position, momentum, volatility,
  volume, cycle), forward return targets, signal quality backtesting
- orchestrator.py: cost improvement scoring, signal count validation
- analyzer.py: accumulation-focused LLM system prompt
- dashboard: cost improvement display, signal metrics table
- config: new accumulation-focused parameters

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 23:51:43 +00:00
BizzleBot
a21e635d9f feat: add LSTM, hybrid ensemble, PCA, scaler, ATR stops, rolling window
Major upgrade to the ML engine:
- LSTM model type: 2-layer PyTorch LSTM with early stopping, GPU support
- Hybrid mode: LSTM (60%) + XGBoost (40%) with agreement gating
- StandardScaler normalization (critical for LSTM)
- PCA dimensionality reduction (configurable variance retention)
- ATR-based dynamic stop-loss/take-profit adapting to volatility
- Rolling window retraining for more realistic time series validation
- Updated LLM system prompt with docs for all new parameters
- All backward compatible (xgboost/lightgbm/catboost still work)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 23:02:11 +00:00
BizzleBot
e24b6605d7 fix: disable qwen3.5 thinking mode for analyzer (was consuming all tokens), increase timeout
2026-03-19 22:32:40 +00:00
BizzleBot
d81d1dedac fix: replace unicode chars that break Windows cp1252 encoding
2026-03-19 22:25:40 +00:00
23 changed files with 4487 additions and 915 deletions

9
.gitignore vendored Normal file
@@ -0,0 +1,9 @@
__pycache__/
*.pyc
data/cache.json
data/history.json
config/llm_settings.json
results/
*.log
.env
node_modules/

262
ARCHITECTURE.md Normal file
@@ -0,0 +1,262 @@
# Bitcoin Accumulation Zone Monitor — Architecture & Logic
## Overview
This is **NOT** a trading bot or ML predictor. It monitors proven Bitcoin on-chain metrics that have historically signaled optimal accumulation (buying) zones for long-term holders. Each metric scores 0-10 points, producing a composite score of 0-100.
**Philosophy:** Every signal is transparent and traceable. No black box. The metrics used have correctly identified every major Bitcoin cycle bottom since 2010.
## How It Works
### Data Pipeline
```
LookIntoBitcoin.com ───┐
(Playwright scraper)   │
                       ├──> data/cache.json (current values, refreshed every 15min)
alternative.me API ────┤    data/history.json (full history back to 2010, refreshed weekly)
CoinGecko API ─────────┘
           Scoring Engine (scoring/engine.py)
                  Composite Score 0-100
                       ┌────┴────┐
                       ▼         ▼
                   Dashboard   Backtest Engine
                     (live)    (historical validation)
```
### Data Sources
All data is scraped or fetched from free sources — **no API keys required**.
| Source | Method | Data |
|--------|--------|------|
| LookIntoBitcoin / BitcoinMagazinePro | Playwright browser scraping of Plotly Dash charts | Puell Multiple, MVRV Z-Score, Reserve Risk, RHODL Ratio, NUPL, 200W SMA, LTH Realized Price, LTH Supply, Hash Ribbons, Pi Cycle |
| alternative.me | Free REST API | Fear & Greed Index (daily, back to Feb 2018) |
| CoinGecko | Free REST API | BTC price, market cap, 24h change |
#### Scraping Method (LookIntoBitcoin)
The site uses Plotly Dash charts. We intercept the `_dash-update-component` XHR response, which contains the full chart data as JSON:
```python
page.on("response", handler) # Intercept XHR
page.goto("https://www.lookintobitcoin.com/charts/puell-multiple/")
# Response contains: response['chart']['figure']['data'] → list of trace objects
# Each trace: {name: str, x: [dates], y: [values]}
```
This gives us the **complete historical time series** (5000+ data points per metric going back to 2010) without needing any API key.
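
Once a response has been intercepted, its JSON still has to be turned into per-metric series. The following is a minimal sketch of that step, assuming the decoded payload has the shape described above (`response` → `chart` → `figure` → `data`). The function name `extract_traces` and the sample payload are illustrative and not taken from the scraper, and the `chart` component id may differ between pages.

```python
def extract_traces(payload):
    """Return {trace_name: {date_str: value}} from a decoded Dash update payload.

    Assumes the layout payload["response"]["chart"]["figure"]["data"];
    the "chart" component id is an assumption and may differ per page.
    """
    try:
        traces = payload["response"]["chart"]["figure"]["data"]
    except (KeyError, TypeError):
        return {}

    series = {}
    for trace in traces:
        name = trace.get("name")
        xs, ys = trace.get("x"), trace.get("y")
        if not name or not isinstance(xs, list) or not isinstance(ys, list):
            continue  # skip annotation-only or malformed traces
        # Keep only the date part of timestamps such as "2024-01-01T00:00:00".
        series[name] = {str(x)[:10]: y for x, y in zip(xs, ys) if y is not None}
    return series


# Hypothetical payload, shaped like the structure described above.
sample = {
    "response": {"chart": {"figure": {"data": [
        {"name": "Puell Multiple",
         "x": ["2024-01-01T00:00:00", "2024-01-02T00:00:00"],
         "y": [1.12, None]},
    ]}}}
}
print(extract_traces(sample))
```

Inside a Playwright response handler, the payload would typically come from calling `response.json()` on responses whose URL contains `_dash-update-component`.
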
## Scoring System
### Individual Metrics (0-10 each)
#### 1. Fear & Greed Index (source: alternative.me)
Measures market sentiment from social media, surveys, and momentum.
| F&G Value | Classification | Score |
|-----------|---------------|-------|
| 0-10 | Extreme Fear | 10 |
| 11-25 | Fear | 7 |
| 26-45 | Neutral-low | 4 |
| 46-55 | Neutral | 2 |
| 56-75 | Greed | 1 |
| 76-100 | Extreme Greed | 0 |
**Logic:** "Be fearful when others are greedy, be greedy when others are fearful." — Buffett. Extreme Fear has historically coincided with cycle bottoms.
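
The table above can be expressed as a small lookup. This is a sketch only: the band tuples are copied from the table, both ends are treated as inclusive because the index is an integer, and the names `FEAR_GREED_BANDS` and `score_fear_greed` are hypothetical rather than taken from `scoring/engine.py`.

```python
# (low, high, score) bands copied from the table above; both ends inclusive.
FEAR_GREED_BANDS = [
    (0, 10, 10),
    (11, 25, 7),
    (26, 45, 4),
    (46, 55, 2),
    (56, 75, 1),
    (76, 100, 0),
]


def score_fear_greed(value):
    """Map a Fear & Greed reading (0-100) to its 0-10 score."""
    if value is None:
        return None
    value = int(round(value))
    for low, high, score in FEAR_GREED_BANDS:
        if low <= value <= high:
            return score
    return None  # outside 0-100: treat as missing rather than guess
```

Returning `None` for out-of-range input keeps a bad scrape from silently contributing a score.
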
#### 2. Puell Multiple (source: LookIntoBitcoin)
Measures daily miner revenue relative to its 365-day moving average. When miners earn very little (low Puell), they are capitulating, which has historically been a bottom signal.
| Puell Value | Meaning | Score |
|-------------|---------|-------|
| < 0.3 | Deep miner capitulation | 10 |
| 0.3-0.5 | Miner stress | 8 |
| 0.5-0.8 | Below average revenue | 5 |
| 0.8-1.2 | Normal | 3 |
| 1.2-2.0 | Above average | 1 |
| > 2.0 | Miner euphoria | 0 |
**Historical accuracy:** Puell < 0.5 identified the Dec 2018, Mar 2020, and Jun 2022 bottoms.
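
For reference, the Puell Multiple itself is a simple ratio. The sketch below assumes a list of daily issuance values in USD, oldest first, and includes the current day in the trailing average; the project scrapes the finished value instead of computing it, so `puell_multiple` is a hypothetical helper.

```python
def puell_multiple(daily_issuance_usd, window=365):
    """Latest day's issuance value divided by its trailing `window`-day mean."""
    if len(daily_issuance_usd) < window:
        return None  # not enough history for the moving average
    recent = daily_issuance_usd[-window:]
    average = sum(recent) / window
    if average <= 0:
        return None
    return daily_issuance_usd[-1] / average
```

A reading near 0.5 therefore means miners are currently earning about half of their one-year average revenue.
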
#### 3. MVRV Z-Score (source: LookIntoBitcoin)
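Compares market value (market cap) to realized value (the aggregate cost basis of all coins), scaled by the standard deviation of market cap. A negative Z-Score means the market is valued below what holders paid in aggregate, which indicates extreme undervaluation.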
| Z-Score | Meaning | Score |
|---------|---------|-------|
| < 0 | Below realized value | 10 |
| 0-0.5 | Undervalued | 8 |
| 0.5-1.5 | Fair value | 5 |
| 1.5-3.0 | Overvalued | 2 |
| 3.0-5.0 | Very overvalued | 1 |
| > 5.0 | Extreme overvaluation | 0 |
**Historical accuracy:** Every time MVRV Z-Score went below 0, buying led to >200% returns within 2 years (100% hit rate across all cycles).
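
As a rough illustration of the calculation behind the metric, the sketch below computes the Z-Score for the latest day from two aligned series. It assumes the commonly cited definition (market cap minus realized cap, divided by the standard deviation of the market cap history) and uses the population standard deviation; the published chart may differ in detail, and `mvrv_zscore` is a hypothetical helper.

```python
import statistics


def mvrv_zscore(market_caps, realized_caps):
    """(market cap - realized cap) / stdev(market cap history) for the latest day."""
    if len(market_caps) != len(realized_caps) or len(market_caps) < 2:
        return None
    spread = statistics.pstdev(market_caps)  # population std dev: an assumption
    if spread == 0:
        return None
    return (market_caps[-1] - realized_caps[-1]) / spread
```

The result is negative exactly when the latest market cap is below the latest realized cap, which is the top-scoring row of the table.
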
#### 4. Drawdown from ATH (calculated from price)
How far BTC has fallen from its all-time high. Larger drawdowns = better buying opportunity historically.
| Drawdown | Score |
|----------|-------|
| > 70% | 10 |
| 50-70% | 8 |
| 30-50% | 6 |
| 20-30% | 4 |
| 10-20% | 2 |
| < 10% | 0 |
#### 5. Price vs 200-Week SMA (source: LookIntoBitcoin)
The 200-week moving average has historically acted as major support in bear markets; price has traded below it only during the deepest capitulations.
| Position | Score |
|----------|-------|
| Below 200W SMA | 10 |
| 0-20% above | 6 |
| 20-50% above | 3 |
| 50-100% above | 1 |
| > 100% above | 0 |
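
The "Position" column is the percentage distance of spot price from the moving average. A minimal sketch, assuming a list of weekly closing prices (oldest first) whose last entry stands in for the current price; `pct_above_sma` is a hypothetical helper, since the project reads the 200W SMA from the scraped chart.

```python
def pct_above_sma(weekly_closes, window=200):
    """Percent by which the latest close sits above its `window`-week SMA."""
    if len(weekly_closes) < window:
        return None  # fewer than 200 weeks of history
    sma = sum(weekly_closes[-window:]) / window
    if sma <= 0:
        return None
    return (weekly_closes[-1] - sma) / sma * 100
```

A negative result corresponds to the "Below 200W SMA" row.
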
#### 6. Reserve Risk (source: LookIntoBitcoin)
Measures the confidence of long-term holders relative to the price. Low Reserve Risk = high confidence among HODLers + low price = excellent time to buy.
| Reserve Risk | Score |
|--------------|-------|
| < 0.002 | 10 |
| 0.002-0.005 | 7 |
| 0.005-0.01 | 4 |
| 0.01-0.02 | 2 |
| > 0.02 | 0 |
#### 7. RHODL Ratio (source: LookIntoBitcoin)
Ratio of the 1-week Realized Cap HODL band to the 1-2 year band. Low ratio = long-term holders dominating (accumulation). High ratio = short-term speculation (distribution).
| RHODL | Score |
|-------|-------|
| < 100 | 10 |
| 100-500 | 7 |
| 500-2000 | 4 |
| 2000-10000 | 1 |
| > 10000 | 0 |
#### 8. NUPL — Net Unrealized Profit/Loss (source: LookIntoBitcoin)
Shows what fraction of market cap is unrealized profit. Negative = market is at a loss (capitulation). Above 0.75 = euphoria.
| NUPL | Phase | Score |
|------|-------|-------|
| < 0 | Capitulation | 10 |
| 0-0.25 | Hope/Fear | 7 |
| 0.25-0.5 | Optimism | 4 |
| 0.5-0.75 | Belief/Greed | 1 |
| > 0.75 | Euphoria | 0 |
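
NUPL follows directly from market cap and realized cap. The sketch below computes it and looks up the phase and score from the table above, treating each band's upper bound as exclusive; `nupl`, `nupl_phase`, and `NUPL_PHASES` are illustrative names.

```python
# (exclusive upper bound, phase, score), copied from the table above.
NUPL_PHASES = [
    (0.0, "Capitulation", 10),
    (0.25, "Hope/Fear", 7),
    (0.5, "Optimism", 4),
    (0.75, "Belief/Greed", 1),
]


def nupl(market_cap, realized_cap):
    """Net Unrealized Profit/Loss: (market cap - realized cap) / market cap."""
    if market_cap <= 0:
        return None
    return (market_cap - realized_cap) / market_cap


def nupl_phase(value):
    """Return (phase, score) for a NUPL value."""
    for upper, phase, score in NUPL_PHASES:
        if value < upper:
            return phase, score
    return "Euphoria", 0
```

For example, a market cap of 1000 against a realized cap of 800 gives a NUPL of 0.2, which falls in the Hope/Fear band.
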
#### 9. LTH Realized Price vs Spot (source: LookIntoBitcoin)
Long-Term Holder Realized Price = average cost basis of coins held >155 days. When spot price drops below this, even diamond hands are underwater — extreme value.
| Position | Score |
|----------|-------|
| Price below LTH RP | 10 |
| 0-20% above | 6 |
| 20-50% above | 3 |
| > 50% above | 1 |
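
Conceptually, the LTH Realized Price is a supply-weighted average of the prices at which long-held coins last moved. The sketch below illustrates that idea on a hypothetical list of `(btc_amount, usd_price_when_last_moved, last_moved_date)` tuples; the project scrapes the finished metric and does not process individual coins.

```python
from datetime import date

LTH_MIN_AGE_DAYS = 155  # holding period from the definition above


def lth_realized_price(utxos, today):
    """Average cost basis of coins last moved more than 155 days before `today`."""
    value = supply = 0.0
    for amount, price, moved in utxos:
        if (today - moved).days > LTH_MIN_AGE_DAYS:
            value += amount * price
            supply += amount
    return value / supply if supply else None


# Hypothetical coins: two long-held, one moved a month ago.
coins = [
    (2.0, 10000.0, date(2023, 1, 1)),
    (1.0, 40000.0, date(2023, 6, 1)),
    (5.0, 60000.0, date(2024, 5, 1)),
]
print(lth_realized_price(coins, date(2024, 6, 1)))
```

Only the first two coins qualify as long-term here, so the recently moved 5 BTC does not affect the result.
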
#### 10. Hash Ribbons / Miner Capitulation (source: LookIntoBitcoin)
When miners capitulate (hash rate declining), it signals maximum pain. The recovery signal (hash rate resuming growth) has been a reliable buy signal.
| Signal | Score |
|--------|-------|
| Active buy signal | 10 |
| Recent recovery | 6 |
| Normal | 3 |
| Miner euphoria | 0 |
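
Hash Ribbons are usually described as a crossover of the 30-day and 60-day moving averages of hash rate. The sketch below classifies the latest day on that basis. It is a simplified assumption rather than the project's implementation: the state names and `hash_ribbon_state` are illustrative, and the mapping from these states to the signal rows above is not defined by the source.

```python
def sma(values, window, end):
    """Simple moving average of the `window` values ending at index `end`."""
    return sum(values[end - window + 1:end + 1]) / window


def hash_ribbon_state(hash_rates, short=30, long=60):
    """Classify the latest day as 'capitulation', 'recovery' or 'normal'."""
    if len(hash_rates) < long + 1:
        return None  # need one extra day to detect a crossover
    today = len(hash_rates) - 1
    short_now, long_now = sma(hash_rates, short, today), sma(hash_rates, long, today)
    short_prev = sma(hash_rates, short, today - 1)
    long_prev = sma(hash_rates, long, today - 1)
    if short_now < long_now:
        return "capitulation"  # short MA below long MA: hash rate declining
    if short_prev < long_prev:
        return "recovery"      # short MA crossed back above long MA today
    return "normal"
```

A production version would likely keep the recovery state active for some days after the crossover instead of a single day.
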
### Composite Score
```
Total Score = (Sum of available metric scores / Number of metrics scored) × 10   (0-100; equals the plain sum when all 10 metrics are scored)
```
| Score Range | Assessment | Action |
|-------------|------------|--------|
| 85-100 | Extreme Accumulation Zone | Strong buy — historically rare, ~4x per decade |
| 70-84 | Strong Accumulation | Buy — excellent long-term entry |
| 55-69 | Moderate Opportunity | Consider buying — decent entry |
| 40-54 | Neutral | Hold — not compelling either way |
| 25-39 | Caution | Reduce or wait — market heating up |
| 0-24 | Extreme Caution | Do NOT buy — historically the worst times |
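
A minimal sketch of the composite calculation and the assessment lookup. It assumes that metrics without a score are skipped and the average is rescaled to 0-100, which is consistent with the `composite_score` values recorded in `data/score_history.jsonl` (for example, 8 scored metrics summing to 26 give 32.5). The function names are illustrative.

```python
# (minimum score, assessment), copied from the table above.
ASSESSMENTS = [
    (85, "Extreme Accumulation Zone"),
    (70, "Strong Accumulation"),
    (55, "Moderate Opportunity"),
    (40, "Neutral"),
    (25, "Caution"),
    (0, "Extreme Caution"),
]


def composite_score(metric_scores):
    """Average the available 0-10 scores and rescale to 0-100."""
    available = [s for s in metric_scores.values() if s is not None]
    if not available:
        return None
    return round(sum(available) / len(available) * 10, 1)


def assess(score):
    """Return the assessment label for a composite score."""
    for floor, label in ASSESSMENTS:
        if score >= floor:
            return label
    return ASSESSMENTS[-1][1]


# Scores from the first record in data/score_history.jsonl.
first_run = {
    "fear_greed": 7, "puell_multiple": 5, "mvrv_zscore": 5, "drawdown": 6,
    "price_vs_200w_sma": None, "reserve_risk": 0, "rhodl_ratio": 0,
    "nupl": 0, "lth_realized_price": None, "hash_ribbons": 3,
}
print(composite_score(first_run), assess(composite_score(first_run)))
```
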
## Backtest Engine
### Purpose
Reconstruct the composite score historically and compare against actual BTC forward returns to validate the scoring system's accuracy.
### Methodology
1. **Historical Reconstruction:** Using scraped historical data (2010-present), calculate what each metric's score would have been on every day
2. **Forward Returns:** For each historical day, calculate what BTC actually returned over the next 30, 90, 180, and 365 days
3. **Score Bracket Analysis:** Group days by score bracket and calculate average forward returns, win rates, max drawdowns
4. **Recency Weighting:** More recent cycles weighted higher because BTC's cycle-over-cycle returns diminish as it matures:
   - 2022-present: 4x weight
   - 2020-2021: 3x weight
   - 2018-2019: 2x weight
   - Before 2018: 1x weight
5. **Cycle-Separated Results:** Returns shown per cycle (Cycle 3: 2016-2019, Cycle 4: 2020-2023, Cycle 5: 2024+)
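
Steps 2 and 4 can be sketched together as a recency-weighted mean forward return. This is an illustration under stated assumptions: the weights are copied from the list above, `prices` is a hypothetical mapping of `datetime.date` to closing price, and the function names do not come from `backtesting/engine.py`.

```python
from datetime import date, timedelta

# (first year of period, weight), copied from step 4; earlier years get weight 1.
RECENCY_WEIGHTS = [(2022, 4), (2020, 3), (2018, 2)]


def recency_weight(day):
    """Weight applied to an observation made on `day`."""
    for first_year, weight in RECENCY_WEIGHTS:
        if day.year >= first_year:
            return weight
    return 1


def weighted_forward_return(prices, horizon_days):
    """Recency-weighted mean forward return (%) over `horizon_days`."""
    total = weight_sum = 0.0
    for day, p0 in prices.items():
        future = prices.get(day + timedelta(days=horizon_days))
        if future is None or p0 <= 0:
            continue  # no price exactly `horizon_days` later
        w = recency_weight(day)
        total += w * (future - p0) / p0 * 100
        weight_sum += w
    return total / weight_sum if weight_sum else None
```

With one 2017 observation returning 10% and one 2023 observation returning 40%, the weighted mean is (1 × 10 + 4 × 40) / 5 = 34%, compared with an unweighted mean of 25%.
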
### Diminishing Returns Adjustment
Bitcoin's gains decrease every cycle. A score of 90 in 2018 led to different outcomes than a score of 90 in 2022:
- The backtest separates results by cycle
- Current expectations are based on the 2 most recent comparable cycles
- Adaptive thresholds recalculate based on rolling 2-year windows
## Architecture
```
/opt/apps/btc-ml-optimizer/
├── dashboard/
│   └── server.py              # FastAPI + inline HTML/JS dashboard
├── scrapers/
│   ├── __init__.py
│   ├── lookintobitcoin.py     # Playwright scraper for on-chain charts
│   ├── history_collector.py   # Full historical data collection
│   ├── fear_greed.py          # alternative.me Fear & Greed API
│   └── price.py               # CoinGecko BTC price API
├── scoring/
│   ├── __init__.py
│   └── engine.py              # Scoring logic and thresholds
├── backtesting/
│   ├── __init__.py
│   └── engine.py              # Historical backtest calculations
├── data/
│   ├── cache.json             # Current metric values (refreshed every 15min)
│   └── history.json           # Full historical data (refreshed weekly)
├── config/
│   ├── thresholds.json        # Configurable scoring thresholds
│   └── llm_settings.json      # Optional LLM provider config for AI commentary
├── llm_client/
│   └── analyzer.py            # Optional LLM integration for signal analysis
└── README.md
```
## Infrastructure
- **Server:** Main VPS (Hostinger), Tailscale IP 100.94.106.120
- **Port:** 3088
- **Process Manager:** pm2 (`btc-ml-optimizer`)
- **Dashboard URL:** http://100.94.106.120:3088
- **Backtest URL:** http://100.94.106.120:3088/backtest
- **Git Repo:** https://git.bizzle.lol/bizzle/btc-accumulation-monitor
## Dependencies
- Python 3.13
- FastAPI + uvicorn
- Playwright (Chromium, headless)
- requests
- No ML libraries required
- No paid API keys required

0
backtesting/__init__.py Normal file

412
backtesting/engine.py Normal file
@@ -0,0 +1,412 @@
"""Historical backtest engine for Bitcoin Accumulation Zone scoring."""
import json
import logging
import os
import sys
from collections import defaultdict
from datetime import datetime, timedelta
log = logging.getLogger(__name__)
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, BASE_DIR)
HISTORY_PATH = os.path.join(BASE_DIR, "data", "history.json")
CACHE_PATH = os.path.join(BASE_DIR, "data", "cache.json")

# Score brackets matching the dashboard assessment levels.
# NOTE: these boundaries differ from the score table in ARCHITECTURE.md
# (0-24, 25-39, 40-54, 55-69, 70-84, 85-100).
BRACKETS = [
    (0, 20, "Extreme Caution"),
    (21, 40, "Caution"),
    (41, 55, "Neutral"),
    (56, 70, "Moderate Opportunity"),
    (71, 85, "Strong Accumulation"),
    (86, 100, "Extreme Accumulation"),
]

# Scoring thresholds, replicated from scoring/engine.py for standalone use.
# Each range is [low, high, score] and matches when low <= value < high.
METRIC_SCORERS = {
    "fear_greed": {
        # Integer index with exclusive upper bounds, so these reproduce the
        # documented inclusive bands 0-10, 11-25, 26-45, 46-55, 56-75, 76-100.
        "ranges": [[None, 11, 10], [11, 26, 7], [26, 46, 4], [46, 56, 2], [56, 76, 1], [76, None, 0]],
    },
    "puell_multiple": {
        "ranges": [[None, 0.3, 10], [0.3, 0.5, 8], [0.5, 0.8, 5], [0.8, 1.2, 3], [1.2, 2.0, 1], [2.0, None, 0]],
    },
    "mvrv_zscore": {
        "ranges": [[None, 0, 10], [0, 0.5, 8], [0.5, 1.5, 5], [1.5, 3, 2], [3, 5, 1], [5, None, 0]],
    },
    "reserve_risk": {
        "ranges": [[None, 0.002, 10], [0.002, 0.005, 7], [0.005, 0.01, 4], [0.01, 0.02, 2], [0.02, None, 0]],
    },
    "rhodl_ratio": {
        "ranges": [[None, 100, 10], [100, 500, 7], [500, 2000, 4], [2000, 10000, 1], [10000, None, 0]],
    },
    "nupl": {
        "ranges": [[None, 0, 10], [0, 0.25, 7], [0.25, 0.5, 4], [0.5, 0.75, 1], [0.75, None, 0]],
    },
}

# Ratio-based metrics: score based on price vs reference value
RATIO_SCORERS = {
    "price_vs_200w_sma": {
        # pct_above ranges
        "ranges": [[None, 0, 10], [0, 20, 6], [20, 50, 3], [50, 100, 1], [100, None, 0]],
        "price_key": "btc_price",
        "ref_key": "200w_sma",
    },
    "lth_realized_price": {
        "ranges": [[None, 0, 10], [0, 20, 6], [20, 50, 3], [50, None, 1]],
        "price_key": "btc_price",
        "ref_key": "lth_realized_price",
    },
}

# Drawdown scoring
DRAWDOWN_RANGES = [[70, None, 10], [50, 70, 8], [30, 50, 6], [20, 30, 4], [10, 20, 2], [None, 10, 0]]


def _score_range(value, ranges):
    """Score a value using range-based thresholds."""
    if value is None:
        return None
    for low, high, score in ranges:
        low_ok = low is None or value >= low
        high_ok = high is None or value < high
        if low_ok and high_ok:
            return score
    return 0


def _build_daily_index(history):
    """Build a dict mapping metric_key -> {date_str: value} for fast lookup."""
    index = {}
    for key, data in history.items():
        if key.startswith("_") or not isinstance(data, dict) or "dates" not in data:
            continue
        lookup = {}
        for d, v in zip(data["dates"], data["values"]):
            lookup[d] = v
        index[key] = lookup
    return index


def _get_all_dates(index):
    """Get sorted union of all dates across all metrics."""
    all_dates = set()
    for lookup in index.values():
        all_dates.update(lookup.keys())
    return sorted(all_dates)


def _last_known_value(lookup, date, max_lookback=30):
    """Get value for date, or most recent prior value within lookback window."""
    if date in lookup:
        return lookup[date]
    d = datetime.strptime(date, "%Y-%m-%d")
    for i in range(1, max_lookback + 1):
        prev = (d - timedelta(days=i)).strftime("%Y-%m-%d")
        if prev in lookup:
            return lookup[prev]
    return None


def _compute_ath_series(price_lookup, dates):
    """Compute running ATH and drawdown for each date."""
    ath = 0
    drawdowns = {}
    for d in dates:
        p = price_lookup.get(d)
        if p is None:
            continue
        if p > ath:
            ath = p
        if ath > 0:
            drawdowns[d] = ((ath - p) / ath) * 100
    return drawdowns


def score_day(date, index, drawdowns):
    """Score a single day using all available metrics. Returns (composite_score, individual_scores, n_metrics)."""
    scores = []
    details = {}

    # Simple range-based metrics
    for metric_key, cfg in METRIC_SCORERS.items():
        val = _last_known_value(index.get(metric_key, {}), date)
        if val is not None:
            s = _score_range(val, cfg["ranges"])
            if s is not None:
                scores.append(s)
                details[metric_key] = {"value": val, "score": s}

    # Ratio-based metrics (price vs reference)
    for metric_key, cfg in RATIO_SCORERS.items():
        price_val = _last_known_value(index.get(cfg["price_key"], {}), date)
        # Try alternate price keys
        if price_val is None:
            for pk in ["btc_price_coingecko", "btc_price_sma", "btc_price_lth"]:
                price_val = _last_known_value(index.get(pk, {}), date)
                if price_val is not None:
                    break
        ref_val = _last_known_value(index.get(cfg["ref_key"], {}), date)
        if price_val is not None and ref_val is not None and ref_val > 0:
            pct_above = ((price_val - ref_val) / ref_val) * 100
            s = _score_range(pct_above, cfg["ranges"])
            if s is not None:
                scores.append(s)
                details[metric_key] = {"value": pct_above, "score": s}

    # Drawdown
    dd = drawdowns.get(date)
    if dd is not None:
        s = _score_range(dd, DRAWDOWN_RANGES)
        if s is not None:
            scores.append(s)
            details["drawdown"] = {"value": dd, "score": s}

    if not scores:
        return None, details, 0
    # Mean of the available 0-10 scores, rescaled to 0-100
    composite = sum(scores) / len(scores) * 10
    return round(composite, 1), details, len(scores)


def compute_forward_returns(price_lookup, dates_sorted):
    """Precompute forward returns for all dates."""
    periods = [30, 90, 180, 365]
    returns = {}
    for d in dates_sorted:
        p0 = price_lookup.get(d)
        if p0 is None or p0 <= 0:
            continue
        r = {}
        dt = datetime.strptime(d, "%Y-%m-%d")
        for days in periods:
            future = (dt + timedelta(days=days)).strftime("%Y-%m-%d")
            pf = price_lookup.get(future)
            if pf is not None:
                r[f"{days}d"] = round(((pf - p0) / p0) * 100, 2)
        if r:
            returns[d] = r
    return returns


def compute_max_drawdown_forward(price_lookup, date, window=90):
    """Compute max drawdown within N days after a given date."""
    dt = datetime.strptime(date, "%Y-%m-%d")
    p0 = price_lookup.get(date)
    if p0 is None or p0 <= 0:
        return None
    peak = p0
    max_dd = 0
    for i in range(1, window + 1):
        future = (dt + timedelta(days=i)).strftime("%Y-%m-%d")
        pf = price_lookup.get(future)
        if pf is None:
            continue
        if pf > peak:
            peak = pf
        dd = ((peak - pf) / peak) * 100
        if dd > max_dd:
            max_dd = dd
    return round(max_dd, 2) if max_dd > 0 else 0


def run_backtest():
    """Run the full backtest and return comprehensive results."""
    log.info("Loading historical data...")
    if not os.path.exists(HISTORY_PATH):
        return {"error": "No historical data found. Run history collector first."}
    with open(HISTORY_PATH) as f:
        history = json.load(f)
    index = _build_daily_index(history)

    # Build price lookup (prefer coingecko for completeness)
    price_lookup = {}
    for pk in ["btc_price_coingecko", "btc_price", "btc_price_sma", "btc_price_lth"]:
        if pk in index:
            for d, v in index[pk].items():
                if d not in price_lookup:
                    price_lookup[d] = v

    all_dates = _get_all_dates(index)
    if not all_dates:
        return {"error": "No date data available."}
    log.info("Date range: %s to %s (%d days)", all_dates[0], all_dates[-1], len(all_dates))

    # Compute drawdowns
    drawdowns = _compute_ath_series(price_lookup, all_dates)

    # Precompute forward returns
    log.info("Computing forward returns...")
    fwd_returns = compute_forward_returns(price_lookup, all_dates)

    # Score each day
    log.info("Scoring %d days...", len(all_dates))
    daily_scores = []
    for d in all_dates:
        composite, details, n_metrics = score_day(d, index, drawdowns)
        if composite is not None and n_metrics >= 3:  # Require at least 3 metrics
            price = price_lookup.get(d)
            entry = {
                "date": d,
                "score": composite,
                "n_metrics": n_metrics,
                "price": price,
                "forward_returns": fwd_returns.get(d, {}),
            }
            daily_scores.append(entry)
    if not daily_scores:
        return {"error": "No scored days (insufficient metric overlap)."}
    log.info("Scored %d days with 3+ metrics", len(daily_scores))

    # --- Bracket statistics ---
    bracket_stats = []
    for low, high, label in BRACKETS:
        # Composite scores carry one decimal (e.g. 20.5), so widen each integer
        # bracket by 0.5 on both sides to leave no gaps between brackets.
        days_in = [d for d in daily_scores if low - 0.5 <= d["score"] < high + 0.5]
        if not days_in:
            bracket_stats.append({
                "range": f"{low}-{high}", "label": label, "days": 0,
            })
            continue
        stats = {"range": f"{low}-{high}", "label": label, "days": len(days_in)}
        for period in ["30d", "90d", "180d", "365d"]:
            returns = [d["forward_returns"][period] for d in days_in if period in d["forward_returns"]]
            if returns:
                returns_sorted = sorted(returns)
                stats[f"avg_{period}"] = round(sum(returns) / len(returns), 2)
                stats[f"median_{period}"] = round(returns_sorted[len(returns_sorted) // 2], 2)
                stats[f"win_rate_{period}"] = round(len([r for r in returns if r > 0]) / len(returns) * 100, 1)
                stats[f"max_gain_{period}"] = round(max(returns), 2)
                stats[f"max_loss_{period}"] = round(min(returns), 2)
                stats[f"n_{period}"] = len(returns)
        # Average max drawdown within 90 days
        dd_list = []
        for d in days_in:
            dd = compute_max_drawdown_forward(price_lookup, d["date"], 90)
            if dd is not None:
                dd_list.append(dd)
        if dd_list:
            stats["avg_max_drawdown_90d"] = round(sum(dd_list) / len(dd_list), 2)
        bracket_stats.append(stats)

    # --- Peak signal events ---
    signal_events = []
    thresholds = [90, 80, 70]
    for thresh in thresholds:
        prev_score = 0
        for d in daily_scores:
            # An event fires on the day the score crosses up through the threshold
            if d["score"] >= thresh and prev_score < thresh:
                event = {
                    "date": d["date"],
                    "score": d["score"],
                    "threshold": thresh,
                    "price": d["price"],
                    "forward_returns": d["forward_returns"],
                }
                # Add future prices
                if d["price"]:
                    dt = datetime.strptime(d["date"], "%Y-%m-%d")
                    for days_ahead in [30, 90, 365]:
                        future = (dt + timedelta(days=days_ahead)).strftime("%Y-%m-%d")
                        fp = price_lookup.get(future)
                        if fp:
                            event[f"price_{days_ahead}d"] = round(fp, 2)
                signal_events.append(event)
            prev_score = d["score"]
    signal_events.sort(key=lambda e: e["date"])

    # --- Current signal context ---
    all_scores_list = [d["score"] for d in daily_scores]
    all_scores_list.sort()

    # Get current score from cache
    current_score = None
    current_price = None
    if os.path.exists(CACHE_PATH):
        try:
            with open(CACHE_PATH) as f:
                cache = json.load(f)
            scored = cache.get("_scored", {})
            current_score = scored.get("composite_score")
            current_price = cache.get("price", {}).get("price")
        except Exception as exc:
            # Fall back to the latest daily score, but do not fail silently
            log.warning("Could not read cache %s: %s", CACHE_PATH, exc)

    # If no cache, use latest daily score
    if current_score is None and daily_scores:
        current_score = daily_scores[-1]["score"]
        current_price = daily_scores[-1].get("price")

    current_context = None
    if current_score is not None:
        # Percentile
        below = len([s for s in all_scores_list if s <= current_score])
        percentile = round(below / len(all_scores_list) * 100, 1)

        # Find comparable historical periods
        comparable = []
        margin = 5
        for d in daily_scores:
            if abs(d["score"] - current_score) <= margin and d["forward_returns"]:
                comparable.append(d)
        avg_1yr = None
        if comparable:
            yr_returns = [d["forward_returns"]["365d"] for d in comparable if "365d" in d["forward_returns"]]
            if yr_returns:
                avg_1yr = round(sum(yr_returns) / len(yr_returns), 2)

        # Best comparable examples (most recent 5)
        examples = []
        for d in comparable[-5:]:
            examples.append({
                "date": d["date"],
                "score": d["score"],
                "price": d["price"],
                "forward_returns": d["forward_returns"],
            })
        current_context = {
            "current_score": current_score,
            "current_price": current_price,
            "percentile": percentile,
            "comparable_days": len(comparable),
            "avg_1yr_return": avg_1yr,
            "examples": examples,
        }

    # --- Build time series for charting ---
    # Downsample to weekly for chart efficiency
    chart_data = []
    for i, d in enumerate(daily_scores):
        # Include every 7th day + last day
        if i % 7 == 0 or i == len(daily_scores) - 1:
            chart_data.append({
                "date": d["date"],
                "score": d["score"],
                "price": d["price"],
            })

    result = {
        "date_range": {"start": daily_scores[0]["date"], "end": daily_scores[-1]["date"]},
        "total_days_scored": len(daily_scores),
        "bracket_stats": bracket_stats,
        "signal_events": signal_events,
        "current_context": current_context,
        "chart_data": chart_data,
        "computed_at": datetime.utcnow().isoformat() + "Z",
    }
    log.info("Backtest complete: %d days, %d signal events", len(daily_scores), len(signal_events))
    return result

68
config/best_config.json Normal file
@@ -0,0 +1,68 @@
{
"model_type": "xgboost",
"features": {
"use_price_position": true,
"use_momentum": true,
"use_volatility": true,
"use_volume": true,
"use_cycle": true,
"use_pca": false,
"pca_variance": 0.95,
"use_scaler": true
},
"target": {
"type": "regression",
"forward_periods_1h": [
168,
720,
2160
],
"forward_periods_4h": [
42,
180,
540
],
"weights": [
0.2,
0.3,
0.5
],
"score_range": [
0,
100
]
},
"hyperparameters": {
"learning_rate": 0.01,
"max_depth": 4,
"n_estimators": 300,
"subsample": 0.8,
"colsample_bytree": 0.8,
"min_child_weight": 20,
"gamma": 0.3,
"reg_alpha": 0.5,
"reg_lambda": 3.0,
"lstm_hidden_size": 128,
"lstm_num_layers": 2,
"lstm_dropout": 0.3,
"lstm_epochs": 100,
"lstm_batch_size": 64,
"lstm_sequence_length": 30,
"lstm_patience": 10
},
"strategy": {
"strong_buy_threshold": 65,
"good_buy_threshold": 55,
"poor_threshold": 35
},
"training": {
"rolling_window": true,
"rolling_train_size": 2500,
"rolling_test_size": 300,
"walk_forward_windows": 5,
"train_pct": 0.7,
"validation_pct": 0.15,
"test_pct": 0.15
},
"timeframe": "4h"
}

@@ -0,0 +1,68 @@
{
"model_type": "xgboost",
"features": {
"use_price_position": true,
"use_momentum": true,
"use_volatility": true,
"use_volume": true,
"use_cycle": true,
"use_pca": false,
"pca_variance": 0.85,
"use_scaler": true
},
"target": {
"type": "regression",
"forward_periods_1h": [
168,
720,
2160
],
"forward_periods_4h": [
42,
180,
540
],
"weights": [
0.2,
0.3,
0.5
],
"score_range": [
0,
100
]
},
"hyperparameters": {
"learning_rate": 0.005,
"max_depth": 5,
"n_estimators": 800,
"subsample": 0.7,
"colsample_bytree": 0.7,
"min_child_weight": 15,
"gamma": 0.5,
"reg_alpha": 0.3,
"reg_lambda": 1.0,
"lstm_hidden_size": 64,
"lstm_num_layers": 2,
"lstm_dropout": 0.4,
"lstm_epochs": 80,
"lstm_batch_size": 64,
"lstm_sequence_length": 30,
"lstm_patience": 15
},
"strategy": {
"strong_buy_threshold": 55,
"good_buy_threshold": 35,
"poor_threshold": 20
},
"training": {
"rolling_window": true,
"rolling_train_size": 3500,
"rolling_test_size": 300,
"walk_forward_windows": 5,
"train_pct": 0.7,
"validation_pct": 0.15,
"test_pct": 0.15
},
"timeframe": "4h"
}

@@ -1,59 +1,68 @@
 {
 "model_type": "xgboost",
 "features": {
-"technical_indicators": [
-"RSI_14", "RSI_7", "RSI_21",
-"MACD_line", "MACD_signal", "MACD_hist",
-"BB_upper", "BB_lower", "BB_width",
-"ATR_14",
-"SMA_5", "SMA_10", "SMA_20", "SMA_50", "SMA_200",
-"EMA_5", "EMA_10", "EMA_20", "EMA_50",
-"OBV",
-"stoch_k", "stoch_d",
-"williams_r",
-"CCI_20",
-"ROC_10",
-"keltner_upper", "keltner_lower"
-],
-"lookback_periods": [3, 5, 10, 20],
-"use_volume_features": true,
-"use_volatility_features": true,
-"use_candle_patterns": true,
-"use_lag_features": true,
-"lag_periods": [1, 2, 3, 5]
+"use_price_position": true,
+"use_momentum": true,
+"use_volatility": true,
+"use_volume": true,
+"use_cycle": true,
+"use_pca": false,
+"pca_variance": 0.95,
+"use_scaler": true
 },
 "target": {
-"type": "classification",
-"direction": "long",
-"horizon_candles": 6,
-"threshold_pct": 1.0
+"type": "regression",
+"forward_periods_1h": [
+168,
+720,
+2160
+],
+"forward_periods_4h": [
+42,
+180,
+540
+],
+"weights": [
+0.2,
+0.3,
+0.5
+],
+"score_range": [
+0,
+100
+]
 },
 "hyperparameters": {
-"learning_rate": 0.05,
-"max_depth": 6,
-"n_estimators": 500,
+"learning_rate": 0.01,
+"max_depth": 4,
+"n_estimators": 300,
 "subsample": 0.8,
 "colsample_bytree": 0.8,
-"min_child_weight": 5,
-"gamma": 0.1,
-"reg_alpha": 0.1,
-"reg_lambda": 1.0
+"min_child_weight": 20,
+"gamma": 0.3,
+"reg_alpha": 0.5,
+"reg_lambda": 3.0,
+"lstm_hidden_size": 128,
+"lstm_num_layers": 2,
+"lstm_dropout": 0.3,
+"lstm_epochs": 100,
+"lstm_batch_size": 64,
+"lstm_sequence_length": 30,
+"lstm_patience": 10
 },
 "strategy": {
-"entry_threshold": 0.60,
-"exit_type": "trailing_stop",
-"stop_loss_pct": 2.0,
-"take_profit_pct": 4.0,
-"trailing_stop_pct": 1.5,
-"position_sizing": "confidence_scaled",
-"max_position_pct": 100,
-"min_confidence_to_trade": 0.55
+"strong_buy_threshold": 65,
+"good_buy_threshold": 55,
+"poor_threshold": 35
 },
 "training": {
+"rolling_window": true,
+"rolling_train_size": 2500,
+"rolling_test_size": 300,
 "walk_forward_windows": 5,
 "train_pct": 0.7,
 "validation_pct": 0.15,
 "test_pct": 0.15
 },
 "timeframe": "4h"
-}
+}

21
config/llm_settings.json Normal file
@@ -0,0 +1,21 @@
{
"provider": "ollama",
"model": "qwen3.5:27b",
"providers": {
"ollama": {
"base_url": "http://100.100.242.21:11434"
},
"lmstudio": {
"base_url": "http://100.100.242.21:1234"
},
"openai": {
"api_key": ""
},
"anthropic": {
"api_key": ""
},
"openrouter": {
"api_key": ""
}
}
}

35
config/thresholds.json Normal file
@@ -0,0 +1,35 @@
{
"fear_greed": {
"ranges": [[0, 10, 10], [11, 25, 7], [26, 45, 4], [46, 55, 2], [56, 75, 1], [76, 100, 0]]
},
"puell_multiple": {
"ranges": [[null, 0.3, 10], [0.3, 0.5, 8], [0.5, 0.8, 5], [0.8, 1.2, 3], [1.2, 2.0, 1], [2.0, null, 0]]
},
"mvrv_zscore": {
"ranges": [[null, 0, 10], [0, 0.5, 8], [0.5, 1.5, 5], [1.5, 3, 2], [3, 5, 1], [5, null, 0]]
},
"drawdown": {
"ranges": [[70, null, 10], [50, 70, 8], [30, 50, 6], [20, 30, 4], [10, 20, 2], [null, 10, 0]]
},
"price_vs_200w_sma": {
"ranges": [[null, 0, 10], [0, 20, 6], [20, 50, 3], [50, 100, 1], [100, null, 0]]
},
"reserve_risk": {
"ranges": [[null, 0.002, 10], [0.002, 0.005, 7], [0.005, 0.01, 4], [0.01, 0.02, 2], [0.02, null, 0]]
},
"rhodl_ratio": {
"ranges": [[null, 100, 10], [100, 500, 7], [500, 2000, 4], [2000, 10000, 1], [10000, null, 0]]
},
"nupl": {
"ranges": [[null, 0, 10], [0, 0.25, 7], [0.25, 0.5, 4], [0.5, 0.75, 1], [0.75, null, 0]]
},
"lth_realized_price": {
"ranges": [[null, 0, 10], [0, 20, 6], [20, 50, 3], [50, null, 1]]
},
"hash_ribbons": {
"buy_signal": 10,
"recent_recovery": 6,
"normal": 3,
"euphoria": 0
}
}

File diff suppressed because it is too large

4
data/score_history.jsonl Normal file
@@ -0,0 +1,4 @@
{"timestamp": "2026-03-20T22:26:50.475811+00:00", "composite_score": 32.5, "scored_count": 8, "metrics": {"fear_greed": {"score": 7, "value": 11}, "puell_multiple": {"score": 5, "value": 0.6602699608966011}, "mvrv_zscore": {"score": 5, "value": 0.5211180167687892}, "drawdown": {"score": 6, "value": 43.891180203045685}, "price_vs_200w_sma": {"score": null, "value": 0.0}, "reserve_risk": {"score": 0, "value": 69871.0}, "rhodl_ratio": {"score": 0, "value": 69871.0}, "nupl": {"score": 0, "value": 69871.0}, "lth_realized_price": {"score": null, "value": null}, "hash_ribbons": {"score": 3, "value": null}}}
{"timestamp": "2026-03-20T22:30:13.547149+00:00", "composite_score": 51.0, "scored_count": 10, "metrics": {"fear_greed": {"score": 7, "value": 11}, "puell_multiple": {"score": 5, "value": 0.6602699608966011}, "mvrv_zscore": {"score": 5, "value": 0.5211180167687892}, "drawdown": {"score": 6, "value": 43.910215736040605}, "price_vs_200w_sma": {"score": 3, "value": 58895.78086828114}, "reserve_risk": {"score": 10, "value": 0.0012985709697654493}, "rhodl_ratio": {"score": 4, "value": 1230.6243545314708}, "nupl": {"score": 7, "value": 0.22243290955405431}, "lth_realized_price": {"score": 1, "value": 43346.58756410873}, "hash_ribbons": {"score": 3, "value": null}}}
{"timestamp": "2026-03-20T22:46:34.952569+00:00", "composite_score": 51.0, "scored_count": 10, "metrics": {"fear_greed": {"score": 7, "value": 11}, "puell_multiple": {"score": 5, "value": 0.6602699608966011}, "mvrv_zscore": {"score": 5, "value": 0.5211180167687892}, "drawdown": {"score": 6, "value": 43.931630710659896}, "price_vs_200w_sma": {"score": 3, "value": 58895.78086828114}, "reserve_risk": {"score": 10, "value": 0.0012985709697654493}, "rhodl_ratio": {"score": 4, "value": 1230.6243545314708}, "nupl": {"score": 7, "value": 0.22243290955405431}, "lth_realized_price": {"score": 1, "value": 43346.58756410873}, "hash_ribbons": {"score": 3, "value": null}}}
{"timestamp": "2026-03-20T22:51:27.724327+00:00", "composite_score": 54.0, "scored_count": 10, "metrics": {"fear_greed": {"score": 7, "value": 11}, "puell_multiple": {"score": 5, "value": 0.6602699608966011}, "mvrv_zscore": {"score": 5, "value": 0.5211180167687892}, "drawdown": {"score": 6, "value": 43.94907994923858}, "price_vs_200w_sma": {"score": 6, "value": 58895.78086828114}, "reserve_risk": {"score": 10, "value": 0.0012985709697654493}, "rhodl_ratio": {"score": 4, "value": 1230.6243545314708}, "nupl": {"score": 7, "value": 0.22243290955405431}, "lth_realized_price": {"score": 1, "value": 43346.58756410873}, "hash_ribbons": {"score": 3, "value": null}}}

Binary file not shown.


@ -1,109 +1,274 @@
#!/usr/bin/env python3
"""
LLM Strategy Analyzer -- Calls Ollama on Mac Mini to analyze results
LLM Accumulation Signal Analyzer -- Calls LLM to analyze results
and suggest config modifications for the next iteration.
Supports multiple providers: Ollama, LM Studio, OpenAI, Anthropic, OpenRouter.
"""
import json
import os
import re
import requests
OLLAMA_URL = "http://100.100.242.21:11434"
MODEL = "qwen3.5:27b"
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
LLM_SETTINGS_PATH = os.path.join(BASE_DIR, "config", "llm_settings.json")
SYSTEM_PROMPT = """You are a quantitative trading strategy optimizer. You analyze ML model backtesting results for a BTC/USDT trading strategy and suggest precise modifications to improve performance.
# Fallback defaults
DEFAULT_OLLAMA_URL = "http://100.100.242.21:11434"
DEFAULT_MODEL = "qwen3.5:27b"
## Your Task
Given the current configuration and results, suggest 1-3 specific, justified changes to the configuration for the next iteration. Be methodical and scientific -- change one thing at a time when possible.
def load_llm_settings():
"""Load LLM settings from config file, with fallback to defaults."""
if os.path.exists(LLM_SETTINGS_PATH):
with open(LLM_SETTINGS_PATH) as f:
return json.load(f)
return {
"provider": "ollama",
"model": DEFAULT_MODEL,
"providers": {
"ollama": {"base_url": DEFAULT_OLLAMA_URL},
},
}
SYSTEM_PROMPT = """You are a quantitative analyst optimizing a BTC ACCUMULATION SIGNAL model. The goal is NOT day-trading -- it is finding statistically optimal times to BUY BTC for long-term holding.
## Core Question
"Given current market conditions, is NOW a good time to BUY BTC for long-term holding?"
## What the Model Does
For each candle, the model predicts an Accumulation Score (0-100):
- 90-100: STRONG BUY -- historically rare, excellent entry point
- 70-89: GOOD BUY -- better than average entry
- 50-69: NEUTRAL -- average time to buy
- 30-49: WAIT -- price likely to come down
- 0-29: POOR -- historically bad time to buy (near local tops)
The model is trained on ACTUAL forward returns at 7d, 30d, and 90d horizons, weighted 20/30/50. Times when buying led to the best long-term returns get the highest scores.
## Primary Metric: cost_basis_improvement_pct
This measures how much better the model's average buy price is vs uniform DCA.
- 10%+ = good
- 15%+ = excellent
- 20%+ = exceptional
Also require strong_buy_signal_count >= 30 for statistical validity.
## Config Parameters You Can Modify
**model_type**: "xgboost", "lightgbm", "catboost", or "ensemble"
- xgboost: Generally best for structured data, fast GPU training
- lightgbm: Faster training, good with large feature sets
- catboost: Handles feature interactions well, less tuning needed
- ensemble: Combines all three, reduces variance but slower
**model_type**: "xgboost", "lightgbm", "catboost", "lstm", or "hybrid"
- hybrid: Average of LSTM + XGBoost regression predictions. Recommended default.
- xgboost: Fast GPU training, good for structured features.
- lstm: Captures temporal patterns in price sequences.
**hyperparameters**:
- learning_rate (0.001-0.3): Lower = more robust but slower. If overfitting, decrease.
- max_depth (3-10): Controls model complexity. Deeper = more overfitting risk.
- n_estimators (100-2000): More trees = better fit but diminishing returns.
- subsample (0.5-1.0): Row sampling. Lower = more regularization.
- colsample_bytree (0.5-1.0): Feature sampling per tree. Lower = more diversity.
- min_child_weight (1-20): Higher = more conservative splits.
- gamma (0-5): Minimum loss reduction for split. Higher = more pruning.
- reg_alpha (0-10): L1 regularization. Encourages sparsity.
- reg_lambda (0-10): L2 regularization. Prevents large weights.
**hyperparameters** (gradient boosting):
- learning_rate (0.001-0.1): Lower = more robust. Start conservative.
- max_depth (3-8): Controls complexity. Deeper risks overfitting.
- n_estimators (200-1500): More trees = better fit but diminishing returns.
- subsample (0.5-1.0): Row sampling for regularization.
- colsample_bytree (0.5-1.0): Feature sampling per tree.
- min_child_weight (5-30): Higher = more conservative (important for noisy targets).
- gamma (0-5): Minimum loss reduction for split.
- reg_alpha (0-10): L1 regularization.
- reg_lambda (1-10): L2 regularization. Higher values prevent overfitting.
**hyperparameters** (LSTM):
- lstm_hidden_size (32-256): Hidden units.
- lstm_num_layers (1-4): Stacked layers. 2 is usually optimal.
- lstm_dropout (0.1-0.5): Regularization.
- lstm_epochs (50-200): Max training epochs (early stopping usually triggers).
- lstm_batch_size (32-128): Smaller = noisier but better generalization.
- lstm_sequence_length (15-60): Past candles the LSTM sees. Longer = more context.
- lstm_patience (5-20): Early stopping patience.
**target**:
- direction: "long" or "both"
- horizon_candles (1-20): How far ahead to predict. Longer = smoother but lagging.
- threshold_pct (0.3-3.0): Minimum move % to label as positive. Higher = fewer but clearer signals.
- forward_periods_4h: List of 3 forward periods in 4h candles [short, medium, long].
Defaults: [42, 180, 540] = roughly [7d, 30d, 90d]
- weights: Weights for each period. Default [0.2, 0.3, 0.5] (emphasize long-term).
- score_range: [0, 100] -- do not change.
**strategy**:
- entry_threshold (0.5-0.8): Min prediction probability to enter trade. Higher = fewer trades, higher quality.
- stop_loss_pct (0.5-5.0): Max loss before exit. Tighter = more stopped out.
- take_profit_pct (1.0-10.0): Target profit. Should be > stop_loss for positive expectancy.
- trailing_stop_pct (0.5-3.0): Trailing stop distance. Tighter = locks profit faster but exits early.
- min_confidence_to_trade (0.5-0.9): Absolute minimum confidence to consider.
- exit_type: "trailing_stop" or "fixed" (just SL/TP)
- strong_buy_threshold (70-95): Score above which = STRONG BUY signal. Higher = fewer but better signals.
- good_buy_threshold (50-80): Score above which = GOOD BUY. Used for cost basis comparison.
- poor_threshold (10-40): Score below which = POOR time to buy.
**features**:
- use_volume_features (true/false): Volume features can be noisy in crypto.
- use_candle_patterns (true/false): Candle patterns may or may not help.
- use_lag_features (true/false): Lagged features capture momentum.
- lag_periods: List of lag periods [1,2,3,5,10]
- lookback_periods: List of lookback windows [3,5,10,20]
- use_price_position (true/false): Distance from ATH, 52w high/low, percentile.
- use_momentum (true/false): RSI, MACD, Stochastic, Williams %R, ROC.
- use_volatility (true/false): Bollinger Bands, ATR, consecutive red candles, drawdown.
- use_volume (true/false): Volume ratio, OBV, red/green volume ratio.
- use_cycle (true/false): MA cross regime, candles since major drawdown.
- use_pca (true/false): PCA dimensionality reduction.
- pca_variance (0.80-0.99): Variance to retain.
- use_scaler (true/false): StandardScaler. Critical for LSTM.
**training**:
- walk_forward_windows (3-10): More windows = more robust but less data per window.
- rolling_window (true/false): Rolling vs static walk-forward.
- rolling_train_size (1500-5000): Training window candles.
- rolling_test_size (100-500): Test window candles.
## Key Metrics to Optimize (in priority order)
1. **Sharpe Ratio** (target: > 2.0): Risk-adjusted return. Most important metric.
2. **Profit Factor** (target: > 1.5): Gross profit / gross loss.
3. **Max Drawdown** (target: > -15%): Worst peak-to-trough decline.
4. **Win Rate** (target: > 55%): Percentage of winning trades.
5. **Trade Count**: Need enough trades for statistical significance (>50).
## Key Metrics to Analyze
1. **cost_basis_improvement_pct**: PRIMARY metric. How much better is model buy price vs DCA.
2. **strong_buy_signal_count**: Must be >= 30 for validity. Too few = raise threshold. Too many = lower it.
3. **signal_frequency_pct**: Should be 5-15%. If outside, adjust thresholds.
4. **avg_score_at_actual_bottoms**: Should be high (>70). Model should recognize bottoms.
5. **avg_score_at_actual_tops**: Should be low (<30). Model should avoid tops.
6. **model_r2_score**: Regression fit quality. > 0.2 is decent for financial data.
7. **per_window_cost_improvement**: Consistency across windows. Low variance = robust.
## Decision Guidelines
- If Sharpe < 1.0: The strategy is not working well. Consider larger changes.
- If Sharpe 1.0-1.5: Decent. Fine-tune hyperparameters and thresholds.
- If Sharpe 1.5-2.0: Good. Make small, targeted improvements.
- If Sharpe > 2.0: Very good. Be careful not to overfit.
- If win_rate < 0.50 but profit_factor > 1.5: Strategy relies on big wins -- ok, tighten SL.
- If win_rate > 0.60 but profit_factor < 1.2: Many small wins but losses are too big -- widen TP or tighten SL.
- If trade_count < 30: Not enough trades. Lower entry_threshold or min_confidence.
- If max_drawdown < -20%: Too risky. Increase regularization, tighten stop loss.
- If per_window_sharpe has high variance: Model is not stable. More regularization or simpler model.
- Check feature_importances: If top features make financial sense, good. If random features dominate, possible overfitting.
- If cost_improvement < 5%: Strategy is barely working. Try: switch model type, enable all features, increase training window, lower good_buy_threshold.
- If cost_improvement 5-10%: Decent. Fine-tune thresholds and hyperparameters.
- If cost_improvement 10-15%: Good. Make targeted improvements -- focus on signal consistency.
- If cost_improvement > 15%: Very good. Be careful not to overfit. Check per_window variance.
- If signal_count < 30: Not statistically valid. Lower strong_buy_threshold, increase training data.
- If signal_frequency > 20%: Too many signals = not selective enough. Raise threshold.
- If signal_frequency < 3%: Too few signals. Lower threshold.
- If score_at_bottoms < 60: Model is missing bottoms. More features, different model type.
- If score_at_tops > 40: Model is not avoiding tops. More regularization.
- If per_window has high variance: Model is unstable. Increase regularization, try hybrid.
- Check feature_importances: price position features should dominate (distance from ATH, percentile).
## Response Format
You MUST respond with ONLY a JSON object (no markdown, no explanation outside the JSON):
```
{
"reasoning": "Explanation of what you observed and why you're making these changes",
"reasoning": "Explanation of observations and why you are making these changes",
"changes": ["Change 1 description", "Change 2 description"],
"config": { <complete modified config JSON> }
}
```
The "config" field must contain the COMPLETE config (not just changes) so it can be used directly."""
The "config" field must contain the COMPLETE config so it can be used directly."""
def analyze_and_suggest(current_config: dict, results: dict,
iteration_history: list = None) -> tuple[dict, str]:
def _call_ollama(settings, messages):
"""Call Ollama API."""
provider_cfg = settings.get("providers", {}).get("ollama", {})
base_url = provider_cfg.get("base_url", DEFAULT_OLLAMA_URL)
model = settings.get("model", DEFAULT_MODEL)
payload = {
"model": model,
"messages": messages,
"stream": False,
"think": False,
"options": {"temperature": 0.7, "num_predict": 4096},
}
print(f" Calling LLM ({model} via Ollama at {base_url})...")
resp = requests.post(f"{base_url}/api/chat", json=payload, timeout=600)
resp.raise_for_status()
return resp.json()["message"]["content"]
def _call_openai_compatible(settings, messages, provider_name):
"""Call OpenAI-compatible API (LM Studio, OpenAI, OpenRouter)."""
provider_cfg = settings.get("providers", {}).get(provider_name, {})
model = settings.get("model", "")
if provider_name == "lmstudio":
base_url = provider_cfg.get("base_url", "http://100.100.242.21:1234")
url = f"{base_url}/v1/chat/completions"
headers = {"Content-Type": "application/json"}
elif provider_name == "openai":
url = "https://api.openai.com/v1/chat/completions"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {provider_cfg.get('api_key', '')}",
}
elif provider_name == "openrouter":
url = "https://openrouter.ai/api/v1/chat/completions"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {provider_cfg.get('api_key', '')}",
}
else:
raise ValueError(f"Unknown OpenAI-compatible provider: {provider_name}")
payload = {
"model": model,
"messages": messages,
"temperature": 0.7,
"max_tokens": 4096,
}
print(f" Calling LLM ({model} via {provider_name})...")
resp = requests.post(url, json=payload, headers=headers, timeout=600)
resp.raise_for_status()
return resp.json()["choices"][0]["message"]["content"]
def _call_anthropic(settings, messages):
"""Call Anthropic Messages API."""
provider_cfg = settings.get("providers", {}).get("anthropic", {})
model = settings.get("model", "claude-sonnet-4-20250514")
api_key = provider_cfg.get("api_key", "")
# Anthropic uses system as a top-level param, not in messages
system_msg = ""
api_messages = []
for m in messages:
if m["role"] == "system":
system_msg = m["content"]
else:
api_messages.append(m)
payload = {
"model": model,
"max_tokens": 4096,
"messages": api_messages,
}
if system_msg:
payload["system"] = system_msg
headers = {
"Content-Type": "application/json",
"x-api-key": api_key,
"anthropic-version": "2023-06-01",
}
print(f" Calling LLM ({model} via Anthropic)...")
resp = requests.post(
"https://api.anthropic.com/v1/messages",
json=payload,
headers=headers,
timeout=600,
)
resp.raise_for_status()
data = resp.json()
# Extract text from content blocks
return "".join(
block["text"] for block in data.get("content", []) if block.get("type") == "text"
)
def call_llm(messages):
"""Route LLM call to the configured provider."""
settings = load_llm_settings()
provider = settings.get("provider", "ollama")
if provider == "ollama":
return _call_ollama(settings, messages)
elif provider in ("lmstudio", "openai", "openrouter"):
return _call_openai_compatible(settings, messages, provider)
elif provider == "anthropic":
return _call_anthropic(settings, messages)
else:
raise ValueError(f"Unknown LLM provider: {provider}")
def analyze_and_suggest(current_config, results, iteration_history=None):
"""
Send current results to LLM and get suggested config modifications.
Returns (new_config, reasoning).
"""
# Build the user prompt with context
history_text = ""
if iteration_history:
history_text = "\n## Previous Iterations (most recent last)\n"
for h in iteration_history[-5:]:
history_text += (
f"- Iteration {h['iteration']}: Sharpe={h['sharpe']}, "
f"Return={h['return']}%, WinRate={h['win_rate']}, "
f"Trades={h['trades']}, Model={h['model_type']}\n"
f"- Iteration {h.get('iteration', '?')}: "
f"CostImprovement={h.get('cost_improvement', 0):.1f}%, "
f"Signals={h.get('signal_count', 0)}, "
f"R2={h.get('r2_score', 0):.4f}, "
f"Model={h.get('model_type', '?')}\n"
)
user_prompt = f"""## Current Configuration
@ -112,40 +277,31 @@ def analyze_and_suggest(current_config: dict, results: dict,
```
## Current Results
- Sharpe Ratio: {results.get('sharpe_ratio', 0)}
- Total Return: {results.get('total_return_pct', 0)}%
- Max Drawdown: {results.get('max_drawdown_pct', 0)}%
- Win Rate: {results.get('win_rate', 0)}
- Trade Count: {results.get('trade_count', 0)}
- Profit Factor: {results.get('profit_factor', 0)}
- Avg Trade Duration: {results.get('avg_trade_duration_candles', 0)} candles
- Per-Window Sharpe: {results.get('per_window_sharpe', [])}
- Cost Basis Improvement: {results.get('cost_basis_improvement_pct', 0):.1f}%
- Avg Cost (Model): ${results.get('avg_cost_basis_model', 0):,.2f}
- Avg Cost (DCA): ${results.get('avg_cost_basis_dca', 0):,.2f}
- Strong Buy Signals: {results.get('strong_buy_signal_count', 0)}
- Good Buy Signals: {results.get('good_buy_signal_count', 0)}
- Signal Frequency: {results.get('signal_frequency_pct', 0):.1f}%
- Quality of Strong Buys: {results.get('pct_quality_strong_buy', 0):.1%}
- Model R2: {results.get('model_r2_score', 0):.4f}
- Score at Actual Bottoms: {results.get('avg_score_at_actual_bottoms', 0):.1f}
- Score at Actual Tops: {results.get('avg_score_at_actual_tops', 0):.1f}
- Per-Window Improvement: {results.get('per_window_cost_improvement', [])}
- Score Distribution: {results.get('score_distribution', {})}
## Top Feature Importances
{json.dumps(dict(list(results.get('feature_importances', {}).items())[:15]), indent=2)}
{history_text}
Analyze these results and suggest 1-3 specific modifications to the config. Return ONLY valid JSON."""
# Call Ollama
payload = {
"model": MODEL,
"messages": [
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": user_prompt},
],
"stream": False,
"options": {
"temperature": 0.7,
"num_predict": 4096,
},
}
messages = [
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": user_prompt},
]
print(f" Calling LLM ({MODEL} on Mac Mini)...")
resp = requests.post(f"{OLLAMA_URL}/api/chat", json=payload, timeout=300)
resp.raise_for_status()
content = resp.json()["message"]["content"]
content = call_llm(messages)
# Parse JSON from response (handle markdown code blocks)
# Strip thinking tags if present
content = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()
@ -153,8 +309,6 @@ Analyze these results and suggest 1-3 specific modifications to the config. Retu
if json_match:
parsed = json.loads(json_match.group(1))
else:
# Try parsing the whole response as JSON
# Find the outermost JSON object
brace_start = content.find("{")
if brace_start >= 0:
depth = 0
@ -164,7 +318,7 @@ Analyze these results and suggest 1-3 specific modifications to the config. Retu
elif content[i] == "}":
depth -= 1
if depth == 0:
parsed = json.loads(content[brace_start:i + 1])
parsed = json.loads(content[brace_start : i + 1])
break
else:
raise ValueError("Could not find complete JSON in LLM response")
@ -175,8 +329,14 @@ Analyze these results and suggest 1-3 specific modifications to the config. Retu
changes = parsed.get("changes", [])
new_config = parsed.get("config", current_config)
# Validate that config has required fields
required_keys = ["model_type", "features", "target", "hyperparameters", "strategy", "training"]
required_keys = [
"model_type",
"features",
"target",
"hyperparameters",
"strategy",
"training",
]
for key in required_keys:
if key not in new_config:
new_config[key] = current_config[key]
@ -186,22 +346,36 @@ Analyze these results and suggest 1-3 specific modifications to the config. Retu
if __name__ == "__main__":
# Test with dummy data
import sys
config_path = sys.argv[1] if len(sys.argv) > 1 else "config/initial_config.json"
with open(config_path) as f:
config = json.load(f)
dummy_results = {
"sharpe_ratio": 1.2,
"total_return_pct": 15.3,
"max_drawdown_pct": -12.5,
"win_rate": 0.55,
"trade_count": 120,
"profit_factor": 1.4,
"avg_trade_duration_candles": 7.2,
"feature_importances": {"RSI_14": 0.15, "MACD_hist": 0.12, "BB_width": 0.10},
"per_window_sharpe": [1.0, 1.3, 1.5, 0.9, 1.1],
"cost_basis_improvement_pct": 8.5,
"avg_cost_basis_model": 65000,
"avg_cost_basis_dca": 71000,
"strong_buy_signal_count": 45,
"good_buy_signal_count": 120,
"signal_frequency_pct": 7.2,
"pct_quality_strong_buy": 0.72,
"model_r2_score": 0.22,
"avg_score_at_actual_bottoms": 68.5,
"avg_score_at_actual_tops": 35.2,
"per_window_cost_improvement": [7.1, 9.3, 8.8, 10.2, 7.0],
"score_distribution": {
"0-20": 80,
"20-40": 150,
"40-60": 200,
"60-80": 130,
"80-100": 40,
},
"feature_importances": {
"dist_from_ath_pct": 0.18,
"RSI_14": 0.12,
"price_percentile_365": 0.10,
},
}
new_config, reasoning = analyze_and_suggest(config, dummy_results)
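The fallback parser in this file finds the first `{` and counts braces until the depth returns to zero. That scan can miscount when a string value itself contains `{` or `}`. One alternative, sketched here, swaps the manual scan for the standard library's `json.JSONDecoder.raw_decode`. The helper name `extract_first_json_object` is hypothetical and this is not the code in the commit.

```python
import json
import re


def extract_first_json_object(text):
    """Return the first JSON object embedded in text, ignoring surrounding prose."""
    # Drop <think>...</think> sections, mirroring the analyzer's own cleanup step.
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start >= 0:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # tolerates trailing text after it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)  # try the next candidate brace
    raise ValueError("Could not find complete JSON in LLM response")


reply = 'Sure! {"reasoning": "use {braces} carefully", "changes": [], "config": {}} Done.'
print(extract_first_json_object(reply)["reasoning"])  # use {braces} carefully
```

Because the decoder understands string literals, braces inside `"reasoning"` no longer affect where the object is taken to end.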

File diff suppressed because it is too large.


@ -1,6 +1,6 @@
#!/usr/bin/env python3
"""
BTC ML Trading Strategy Optimizer -- Orchestrator
BTC Accumulation Signal Optimizer -- Orchestrator
Coordinates the optimization loop across VPS, Windows PC (GPU), and Mac Mini (LLM).
"""
@ -28,7 +28,8 @@ MAC_MINI_HOST = "bizzle@bizzles-mac-mini-1"
MAX_ITERATIONS = 50
CONVERGENCE_WINDOW = 5
CONVERGENCE_THRESHOLD = 0.01 # 1% improvement
TARGET_SHARPE = 3.0
TARGET_COST_IMPROVEMENT = 20.0 # 20% cost basis improvement = exceptional
MIN_SIGNAL_COUNT = 30 # Minimum strong buy signals for valid results
ML_TIMEOUT = 600 # 10 minutes
# Colors
@ -98,7 +99,6 @@ def run_ml_training():
)
if result.returncode != 0:
raise RuntimeError(f"ML training failed:\n{result.stderr}\n{result.stdout}")
# Print training output
for line in result.stdout.strip().split("\n"):
log(f" {C.DIM}{line}", C.DIM)
return True
@ -127,45 +127,53 @@ def check_convergence(history):
if len(history) < CONVERGENCE_WINDOW + 1:
return False, "Not enough iterations"
recent = history[-CONVERGENCE_WINDOW:]
sharpes = [h["sharpe"] for h in recent]
# Only consider valid results (enough signals)
valid = [h for h in history if h.get("signal_count", 0) >= MIN_SIGNAL_COUNT]
# Check if best sharpe exceeds target
best_sharpe = max(h["sharpe"] for h in history)
if best_sharpe >= TARGET_SHARPE:
return True, f"Target Sharpe reached: {best_sharpe:.3f}"
if not valid:
return False, "No valid results yet"
recent = history[-CONVERGENCE_WINDOW:]
scores = [h.get("cost_improvement", 0) for h in recent]
# Check if best score exceeds target
best_score = max(h.get("cost_improvement", 0) for h in valid)
if best_score >= TARGET_COST_IMPROVEMENT:
return True, f"Target cost improvement reached: {best_score:.1f}%"
# Check if improvement has stalled
best_recent = max(sharpes)
worst_recent = min(sharpes)
best_recent = max(scores)
worst_recent = min(scores)
if best_recent > 0 and (best_recent - worst_recent) / best_recent < CONVERGENCE_THRESHOLD:
return True, f"Converged: Sharpe variance < {CONVERGENCE_THRESHOLD*100}% over {CONVERGENCE_WINDOW} iterations"
return True, f"Converged: variance < {CONVERGENCE_THRESHOLD*100}% over {CONVERGENCE_WINDOW} iterations"
return False, ""
def print_header():
print(f"""
{C.BOLD}{C.CYAN}
BTC ML Trading Strategy Optimizer
VPS -> Windows GPU -> Mac Mini LLM -> Loop
{C.RESET}
{C.BOLD}{C.CYAN}========================================================
BTC Accumulation Signal Optimizer
VPS -> Windows GPU -> Mac Mini LLM -> Loop
========================================================{C.RESET}
""")
def print_results(results, iteration):
sharpe = results.get("sharpe_ratio", 0)
sharpe_color = C.GREEN if sharpe > 1.5 else C.YELLOW if sharpe > 1.0 else C.RED
cost_imp = results.get("cost_basis_improvement_pct", 0)
color = C.GREEN if cost_imp > 15 else C.YELLOW if cost_imp > 10 else C.RED
print(f"""
{C.BOLD} Iteration {iteration} Results {C.RESET}
Sharpe Ratio: {sharpe_color}{C.BOLD}{sharpe:.3f}{C.RESET}
Total Return: {results.get('total_return_pct', 0):.1f}%
Max Drawdown: {results.get('max_drawdown_pct', 0):.1f}%
Win Rate: {results.get('win_rate', 0):.1%}
Trade Count: {results.get('trade_count', 0)}
Profit Factor: {results.get('profit_factor', 0):.3f}
Avg Duration: {results.get('avg_trade_duration_candles', 0):.1f} candles
Window Sharpes: {results.get('per_window_sharpe', [])}
{C.BOLD}--- Iteration {iteration} Results ---{C.RESET}
Cost Improvement: {color}{C.BOLD}{cost_imp:.1f}%{C.RESET}
Avg Cost (Model): ${results.get('avg_cost_basis_model', 0):,.2f}
Avg Cost (DCA): ${results.get('avg_cost_basis_dca', 0):,.2f}
Strong Signals: {results.get('strong_buy_signal_count', 0)}
Signal Frequency: {results.get('signal_frequency_pct', 0):.1f}%
Quality Score: {results.get('pct_quality_strong_buy', 0):.1%}
Model R2: {results.get('model_r2_score', 0):.4f}
Score@Bottoms: {results.get('avg_score_at_actual_bottoms', 0):.1f}
Score@Tops: {results.get('avg_score_at_actual_tops', 0):.1f}
Window Improvements: {results.get('per_window_cost_improvement', [])}
""")
@ -173,14 +181,11 @@ def main():
print_header()
os.makedirs(RESULTS_DIR, exist_ok=True)
# Step 1: Ensure data
ensure_data()
# Step 2: Load or create initial config
config_path = os.path.join(CONFIG_DIR, "initial_config.json")
best_config_path = os.path.join(CONFIG_DIR, "best_config.json")
# Resume from best config if it exists
if os.path.exists(best_config_path):
log("Resuming from best_config.json", C.GREEN)
with open(best_config_path) as f:
@ -191,29 +196,24 @@ def main():
history = load_iteration_history()
start_iter = len(history) + 1
best_sharpe = max((h["sharpe"] for h in history), default=0)
best_score = max((h.get("cost_improvement", 0) for h in history), default=0)
log(f"Starting at iteration {start_iter}, best Sharpe so far: {best_sharpe:.3f}", C.BOLD)
log(f"Starting at iteration {start_iter}, best cost improvement so far: {best_score:.1f}%", C.BOLD)
# Step 3: Setup Windows remote
setup_windows_remote()
# SCP the ML engine script (once)
log("Uploading ML engine to Windows...", C.CYAN)
scp_to_windows(os.path.join(BASE_DIR, "ml_engine", "train_and_backtest.py"), "train_and_backtest.py")
# SCP data files (once)
for tf in ["1h", "4h"]:
data_file = os.path.join(DATA_DIR, f"btc_{tf}.csv")
if os.path.exists(data_file):
log(f"Uploading btc_{tf}.csv to Windows...", C.CYAN)
scp_to_windows(data_file, f"btc_{tf}.csv")
# Import LLM analyzer
sys.path.insert(0, os.path.join(BASE_DIR, "llm_client"))
from analyzer import analyze_and_suggest
# Main optimization loop
for iteration in range(start_iter, MAX_ITERATIONS + 1):
log(f"\n{'='*50}", C.BOLD)
log(f"ITERATION {iteration}/{MAX_ITERATIONS}", f"{C.BOLD}{C.CYAN}")
@ -222,13 +222,11 @@ def main():
f"Depth: {config.get('hyperparameters', {}).get('max_depth', '?')}", C.DIM)
log(f"{'='*50}", C.BOLD)
# Write current config to temp file and SCP
tmp_config = os.path.join(BASE_DIR, "config", "current_config.json")
with open(tmp_config, "w") as f:
json.dump(config, f, indent=2)
scp_to_windows(tmp_config, "config.json")
# Run ML training on Windows
try:
run_ml_training()
except (RuntimeError, subprocess.TimeoutExpired) as e:
@ -238,7 +236,6 @@ def main():
config = history[-1].get("config", config)
continue
# Fetch results from Windows
results_local = os.path.join(RESULTS_DIR, f"results_iter_{iteration}.json")
scp_from_windows("results.json", results_local)
@ -247,34 +244,35 @@ def main():
print_results(results, iteration)
# Track best
current_sharpe = results.get("sharpe_ratio", 0)
is_best = current_sharpe > best_sharpe
current_score = results.get("cost_basis_improvement_pct", 0)
signal_count = results.get("strong_buy_signal_count", 0)
is_best = current_score > best_score and signal_count >= MIN_SIGNAL_COUNT
if is_best:
best_sharpe = current_sharpe
best_score = current_score
with open(best_config_path, "w") as f:
json.dump(config, f, indent=2)
log(f"NEW BEST! Sharpe: {best_sharpe:.3f}", f"{C.BOLD}{C.GREEN}")
log(f"NEW BEST! Cost Improvement: {best_score:.1f}%", f"{C.BOLD}{C.GREEN}")
# Log iteration
iter_data = {
"iteration": iteration,
"timestamp": datetime.now(timezone.utc).isoformat(),
"sharpe": current_sharpe,
"return": results.get("total_return_pct", 0),
"max_drawdown": results.get("max_drawdown_pct", 0),
"win_rate": results.get("win_rate", 0),
"trades": results.get("trade_count", 0),
"profit_factor": results.get("profit_factor", 0),
"cost_improvement": current_score,
"avg_30d_return": results.get("avg_quality_score_strong_buy", 0),
"avg_90d_return": results.get("pct_quality_strong_buy", 0),
"signal_count": signal_count,
"signal_frequency": results.get("signal_frequency_pct", 0),
"r2_score": results.get("model_r2_score", 0),
"score_at_bottoms": results.get("avg_score_at_actual_bottoms", 0),
"score_at_tops": results.get("avg_score_at_actual_tops", 0),
"model_type": config.get("model_type", "unknown"),
"is_best": is_best,
"config": config,
"results": results,
}
save_iteration(iter_data)
history.append(iter_data)
# Check convergence
converged, reason = check_convergence(history)
if converged:
log(f"\nOptimization converged: {reason}", f"{C.BOLD}{C.GREEN}")
@ -284,17 +282,15 @@ def main():
log(f"\nMax iterations ({MAX_ITERATIONS}) reached.", C.YELLOW)
break
# Ask LLM for next config
log("\nConsulting LLM for strategy modifications...", C.MAGENTA)
try:
summary_history = [
{
"iteration": h["iteration"],
"sharpe": h["sharpe"],
"return": h["return"],
"win_rate": h["win_rate"],
"trades": h["trades"],
"model_type": h["model_type"],
"cost_improvement": h.get("cost_improvement", 0),
"signal_count": h.get("signal_count", 0),
"r2_score": h.get("r2_score", 0),
"model_type": h.get("model_type", "unknown"),
}
for h in history
]
@ -304,37 +300,34 @@ def main():
except Exception as e:
log(f"LLM call failed: {e}", C.RED)
log("Continuing with current config + random perturbation...", C.YELLOW)
# Small random perturbation as fallback
import random
hp = config.get("hyperparameters", {})
hp["learning_rate"] = hp.get("learning_rate", 0.05) * random.uniform(0.8, 1.2)
hp["max_depth"] = max(3, min(10, hp.get("max_depth", 6) + random.choice([-1, 0, 1])))
hp["learning_rate"] = hp.get("learning_rate", 0.01) * random.uniform(0.8, 1.2)
hp["max_depth"] = max(3, min(10, hp.get("max_depth", 5) + random.choice([-1, 0, 1])))
config["hyperparameters"] = hp
# Final summary
print(f"""
{C.BOLD}{C.GREEN}
Optimization Complete!
{C.RESET}
{C.BOLD}{C.GREEN}========================================================
Optimization Complete!
========================================================{C.RESET}
Total Iterations: {len(history)}
Best Sharpe: {C.BOLD}{best_sharpe:.3f}{C.RESET}
Best Config: {best_config_path}
Iteration Log: {ITERATIONS_LOG}
Total Iterations: {len(history)}
Best Cost Improvement: {C.BOLD}{best_score:.1f}%{C.RESET}
Best Config: {best_config_path}
Iteration Log: {ITERATIONS_LOG}
""")
# --- Library API for dashboard integration ---
# Shared state for dashboard
_stop_event = threading.Event()
_status = {
"state": "idle", # idle, running, completed, error
"state": "idle",
"iteration": 0,
"max_iterations": MAX_ITERATIONS,
"best_sharpe": 0.0,
"best_score": 0.0,
"error": None,
"llm_suggestions": [], # list of {iteration, reasoning, changes}
"llm_suggestions": [],
}
_status_lock = threading.Lock()
@ -352,15 +345,9 @@ def update_status(**kwargs):
def run_optimization_loop(callback=None, config_override=None):
"""
Run the optimization loop. Designed to be called from a background thread.
Args:
callback: Called after each iteration with (iteration_number, iter_data_dict).
config_override: Optional dict to use instead of loading from disk.
"""
"""Run the optimization loop from a background thread."""
_stop_event.clear()
update_status(state="running", iteration=0, error=None, best_sharpe=0.0)
update_status(state="running", iteration=0, error=None, best_score=0.0)
try:
os.makedirs(RESULTS_DIR, exist_ok=True)
@@ -380,8 +367,8 @@ def run_optimization_loop(callback=None, config_override=None):
history = load_iteration_history()
start_iter = len(history) + 1
best_sharpe = max((h["sharpe"] for h in history), default=0)
update_status(best_sharpe=best_sharpe)
best_score = max((h.get("cost_improvement", 0) for h in history), default=0)
update_status(best_score=best_score)
setup_windows_remote()
scp_to_windows(os.path.join(BASE_DIR, "ml_engine", "train_and_backtest.py"), "train_and_backtest.py")
@@ -418,23 +405,26 @@ def run_optimization_loop(callback=None, config_override=None):
with open(results_local) as f:
results = json.load(f)
current_sharpe = results.get("sharpe_ratio", 0)
is_best = current_sharpe > best_sharpe
current_score = results.get("cost_basis_improvement_pct", 0)
signal_count = results.get("strong_buy_signal_count", 0)
is_best = current_score > best_score and signal_count >= MIN_SIGNAL_COUNT
if is_best:
best_sharpe = current_sharpe
best_score = current_score
with open(best_config_path, "w") as f:
json.dump(config, f, indent=2)
update_status(best_sharpe=best_sharpe)
update_status(best_score=best_score)
iter_data = {
"iteration": iteration,
"timestamp": datetime.now(timezone.utc).isoformat(),
"sharpe": current_sharpe,
"return": results.get("total_return_pct", 0),
"max_drawdown": results.get("max_drawdown_pct", 0),
"win_rate": results.get("win_rate", 0),
"trades": results.get("trade_count", 0),
"profit_factor": results.get("profit_factor", 0),
"cost_improvement": current_score,
"signal_count": signal_count,
"signal_frequency": results.get("signal_frequency_pct", 0),
"r2_score": results.get("model_r2_score", 0),
"score_at_bottoms": results.get("avg_score_at_actual_bottoms", 0),
"score_at_tops": results.get("avg_score_at_actual_tops", 0),
"quality": results.get("pct_quality_strong_buy", 0),
"model_type": config.get("model_type", "unknown"),
"is_best": is_best,
"config": config,
@@ -459,10 +449,10 @@ def run_optimization_loop(callback=None, config_override=None):
update_status(state="completed")
return
# LLM suggestion
try:
summary_history = [
{k: h[k] for k in ("iteration", "sharpe", "return", "win_rate", "trades", "model_type")}
{k: h[k] for k in ("iteration", "cost_improvement", "signal_count", "r2_score", "model_type")
if k in h}
for h in history
]
new_config, reasoning = analyze_and_suggest(config, results, summary_history)
@@ -471,12 +461,25 @@ def run_optimization_loop(callback=None, config_override=None):
"iteration": iteration,
"reasoning": reasoning,
})
# Also persist LLM suggestion to iteration log
iter_data["llm_reasoning"] = reasoning
iter_data["llm_applied"] = True
config = new_config
except Exception:
import random
except Exception as e:
import random, traceback
err_msg = f"LLM call failed: {type(e).__name__}: {e}"
print(f" WARNING: {err_msg}")
traceback.print_exc()
with _status_lock:
_status["llm_suggestions"].append({
"iteration": iteration,
"reasoning": f"ERROR: {err_msg} — using random perturbation",
})
iter_data["llm_reasoning"] = err_msg
iter_data["llm_applied"] = False
hp = config.get("hyperparameters", {})
hp["learning_rate"] = hp.get("learning_rate", 0.05) * random.uniform(0.8, 1.2)
hp["max_depth"] = max(3, min(10, hp.get("max_depth", 6) + random.choice([-1, 0, 1])))
hp["learning_rate"] = hp.get("learning_rate", 0.01) * random.uniform(0.8, 1.2)
hp["max_depth"] = max(3, min(10, hp.get("max_depth", 5) + random.choice([-1, 0, 1])))
config["hyperparameters"] = hp
update_status(state="completed")
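The fallback path in this hunk nudges two hyperparameters when the LLM call fails. A minimal sketch of that perturbation in isolation follows. The helper name `perturb_hyperparameters` and the injectable `rng` argument are assumptions added for illustration (the diff applies the same arithmetic inline); the defaults of 0.01 and 5 mirror the added lines.

```python
import random


def perturb_hyperparameters(hp, rng=None):
    """Return a copy of hp with learning_rate and max_depth randomly nudged.

    Hypothetical helper for illustration only; not part of the diff.
    """
    rng = rng or random.Random()
    out = dict(hp)
    # Scale the learning rate by a factor drawn uniformly from [0.8, 1.2].
    out["learning_rate"] = out.get("learning_rate", 0.01) * rng.uniform(0.8, 1.2)
    # Step the depth by -1, 0, or +1 and clamp it to the range 3..10.
    step = rng.choice([-1, 0, 1])
    out["max_depth"] = max(3, min(10, out.get("max_depth", 5) + step))
    return out


if __name__ == "__main__":
    seeded = random.Random(0)  # a seeded generator makes a run repeatable
    print(perturb_hyperparameters({"learning_rate": 0.01, "max_depth": 10}, seeded))
```

Returning a copy rather than mutating `hp` in place is a design choice of this sketch; it keeps the previous config intact if the caller wants to log both.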

0
scoring/__init__.py Normal file

418
scoring/engine.py Normal file

@@ -0,0 +1,418 @@
"""Scoring engine for Bitcoin accumulation zone metrics."""
import json
import os
import logging
log = logging.getLogger(__name__)
THRESHOLDS_PATH = os.path.join(
os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
"config",
"thresholds.json",
)
def load_thresholds():
"""Load scoring thresholds from config."""
try:
with open(THRESHOLDS_PATH) as f:
return json.load(f)
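except Exception as e:
# Fall back to the built-in default ranges, but do not hide the failure.
log.warning("Could not load thresholds from %s: %s", THRESHOLDS_PATH, e)
return {}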
def _score_range(value, ranges):
"""Score a value using range-based thresholds.
Each range is [low, high, score] and matches when low <= value < high.
None (null in the JSON config) means unbounded on that side.
"""
if value is None:
return None
for low, high, score in ranges:
low_ok = low is None or value >= low
high_ok = high is None or value < high
if low_ok and high_ok:
return score
return 0
def _score_range_inverted(value, ranges):
"""Score a drawdown-style metric. Same lookup as _score_range; the inversion is encoded in the ranges."""
if value is None:
return None
for low, high, score in ranges:
low_ok = low is None or value >= low
high_ok = high is None or value < high
if low_ok and high_ok:
return score
return 0
def score_fear_greed(value, thresholds=None):
"""Score Fear & Greed index (0-100 input, 0-10 output)."""
if value is None:
return None, "No data"
t = (thresholds or load_thresholds()).get("fear_greed", {})
# Bounds are half-open (low <= value < high), so adjacent ranges must share a boundary.
ranges = t.get("ranges", [[None, 11, 10], [11, 26, 7], [26, 46, 4], [46, 56, 2], [56, 76, 1], [76, None, 0]])
score = _score_range(value, ranges)
if value <= 10:
desc = "Extreme Fear — historically excellent buying"
elif value <= 25:
desc = "Fear — good accumulation territory"
elif value <= 45:
desc = "Low neutral — moderate opportunity"
elif value <= 55:
desc = "Neutral"
elif value <= 75:
desc = "Greed — caution"
else:
desc = "Extreme Greed — poor time to accumulate"
return score, desc
def score_puell_multiple(value, thresholds=None):
if value is None:
return None, "No data"
t = (thresholds or load_thresholds()).get("puell_multiple", {})
ranges = t.get("ranges", [[None, 0.3, 10], [0.3, 0.5, 8], [0.5, 0.8, 5], [0.8, 1.2, 3], [1.2, 2.0, 1], [2.0, None, 0]])
score = _score_range(value, ranges)
if value < 0.3:
desc = "Deep value — miners under extreme stress"
elif value < 0.5:
desc = "Low — miners selling below average"
elif value < 0.8:
desc = "Below average miner revenue"
elif value < 1.2:
desc = "Average miner revenue"
elif value < 2.0:
desc = "Above average — miners profiting well"
else:
desc = "Elevated — potential top signal"
return score, desc
def score_mvrv_zscore(value, thresholds=None):
if value is None:
return None, "No data"
t = (thresholds or load_thresholds()).get("mvrv_zscore", {})
ranges = t.get("ranges", [[None, 0, 10], [0, 0.5, 8], [0.5, 1.5, 5], [1.5, 3, 2], [3, 5, 1], [5, None, 0]])
score = _score_range(value, ranges)
if value < 0:
desc = "Below realized value — historically perfect buy zone"
elif value < 0.5:
desc = "Near realized value — strong accumulation"
elif value < 1.5:
desc = "Fair value range"
elif value < 3:
desc = "Above fair value"
elif value < 5:
desc = "Overvalued territory"
else:
desc = "Extreme overvaluation — cycle top territory"
return score, desc
def score_drawdown(value, thresholds=None):
"""Score drawdown from ATH (value is % drawdown, e.g. 50 = 50% below ATH)."""
if value is None:
return None, "No data"
t = (thresholds or load_thresholds()).get("drawdown", {})
ranges = t.get("ranges", [[70, None, 10], [50, 70, 8], [30, 50, 6], [20, 30, 4], [10, 20, 2], [None, 10, 0]])
score = _score_range(value, ranges)
if value >= 70:
desc = f"{value:.0f}% below ATH — extreme capitulation"
elif value >= 50:
desc = f"{value:.0f}% below ATH — deep bear market"
elif value >= 30:
desc = f"{value:.0f}% below ATH — significant correction"
elif value >= 20:
desc = f"{value:.0f}% below ATH — moderate pullback"
elif value >= 10:
desc = f"{value:.0f}% below ATH — minor dip"
else:
desc = f"{value:.0f}% below ATH — near all-time high"
return score, desc
def score_price_vs_200w_sma(price, sma_200w, thresholds=None):
"""Score price relative to 200-week SMA."""
if price is None or sma_200w is None or sma_200w == 0:
return None, "No data"
pct_above = ((price - sma_200w) / sma_200w) * 100
t = (thresholds or load_thresholds()).get("price_vs_200w_sma", {})
ranges = t.get("ranges", [[None, 0, 10], [0, 20, 6], [20, 50, 3], [50, 100, 1], [100, None, 0]])
score = _score_range(pct_above, ranges)
if pct_above < 0:
desc = f"Below 200W SMA — historically rare buy zone"
elif pct_above < 20:
desc = f"{pct_above:.0f}% above 200W SMA — good value"
elif pct_above < 50:
desc = f"{pct_above:.0f}% above 200W SMA — moderate"
elif pct_above < 100:
desc = f"{pct_above:.0f}% above 200W SMA — extended"
else:
desc = f"{pct_above:.0f}% above 200W SMA — extremely overheated"
return score, desc
def score_reserve_risk(value, thresholds=None):
if value is None:
return None, "No data"
t = (thresholds or load_thresholds()).get("reserve_risk", {})
ranges = t.get("ranges", [[None, 0.002, 10], [0.002, 0.005, 7], [0.005, 0.01, 4], [0.01, 0.02, 2], [0.02, None, 0]])
score = _score_range(value, ranges)
if value < 0.002:
desc = "Very low risk/reward — strong accumulation"
elif value < 0.005:
desc = "Low risk — good entry"
elif value < 0.01:
desc = "Moderate risk/reward"
elif value < 0.02:
desc = "Elevated risk"
else:
desc = "High risk — cycle top territory"
return score, desc
def score_rhodl_ratio(value, thresholds=None):
if value is None:
return None, "No data"
t = (thresholds or load_thresholds()).get("rhodl_ratio", {})
ranges = t.get("ranges", [[None, 100, 10], [100, 500, 7], [500, 2000, 4], [2000, 10000, 1], [10000, None, 0]])
score = _score_range(value, ranges)
if value < 100:
desc = "Extreme low — long-term holders dominate"
elif value < 500:
desc = "Low — mature holder confidence"
elif value < 2000:
desc = "Moderate rotation"
elif value < 10000:
desc = "Elevated — new money entering"
else:
desc = "Extreme — speculative mania"
return score, desc
def score_nupl(value, thresholds=None):
if value is None:
return None, "No data"
t = (thresholds or load_thresholds()).get("nupl", {})
ranges = t.get("ranges", [[None, 0, 10], [0, 0.25, 7], [0.25, 0.5, 4], [0.5, 0.75, 1], [0.75, None, 0]])
score = _score_range(value, ranges)
if value < 0:
desc = "Capitulation — holders underwater"
elif value < 0.25:
desc = "Hope/Fear — early recovery"
elif value < 0.5:
desc = "Optimism — moderate profit taking"
elif value < 0.75:
desc = "Belief/Greed — significant unrealized gains"
else:
desc = "Euphoria — extreme unrealized profit"
return score, desc
def score_lth_realized_price(price, lth_rp, thresholds=None):
"""Score price relative to Long-Term Holder realized price."""
if price is None or lth_rp is None or lth_rp == 0:
return None, "No data"
pct_above = ((price - lth_rp) / lth_rp) * 100
t = (thresholds or load_thresholds()).get("lth_realized_price", {})
ranges = t.get("ranges", [[None, 0, 10], [0, 20, 6], [20, 50, 3], [50, None, 1]])
score = _score_range(pct_above, ranges)
if pct_above < 0:
desc = f"Below LTH cost basis — LTHs underwater (extreme value)"
elif pct_above < 20:
desc = f"{pct_above:.0f}% above LTH cost basis — good value"
elif pct_above < 50:
desc = f"{pct_above:.0f}% above LTH cost basis — moderate"
else:
desc = f"{pct_above:.0f}% above LTH cost basis — extended"
return score, desc
def score_hash_ribbons(data, thresholds=None):
"""Score hash ribbons based on buy signal detection."""
if not data:
return None, "No data"
if data.get("buy_signal"):
return 10, "Active buy signal — miner capitulation recovery"
return 3, "Normal mining activity"
def score_all(metrics):
"""Score all metrics and return individual + composite scores."""
thresholds = load_thresholds()
results = []
# Fear & Greed
fg = metrics.get("fear_greed", {})
fg_score, fg_desc = score_fear_greed(fg.get("value"), thresholds)
results.append({
"name": "Fear & Greed Index",
"key": "fear_greed",
"value": fg.get("value"),
"display_value": f"{fg.get('value', 'N/A')}{fg.get('classification', '')}",
"score": fg_score,
"description": fg_desc,
"recent": fg.get("recent", []),
})
# Puell Multiple
pm = metrics.get("puell_multiple", {})
pm_score, pm_desc = score_puell_multiple(pm.get("value"), thresholds)
results.append({
"name": "Puell Multiple",
"key": "puell_multiple",
"value": pm.get("value"),
"display_value": f"{pm.get('value', 'N/A'):.4f}" if pm.get("value") is not None else "N/A",
"score": pm_score,
"description": pm_desc,
"recent": pm.get("recent", []),
})
# MVRV Z-Score
mz = metrics.get("mvrv_zscore", {})
mz_score, mz_desc = score_mvrv_zscore(mz.get("value"), thresholds)
results.append({
"name": "MVRV Z-Score",
"key": "mvrv_zscore",
"value": mz.get("value"),
"display_value": f"{mz.get('value', 'N/A'):.2f}" if mz.get("value") is not None else "N/A",
"score": mz_score,
"description": mz_desc,
"recent": mz.get("recent", []),
})
# Drawdown from ATH
dd = metrics.get("drawdown", {})
dd_score, dd_desc = score_drawdown(dd.get("value"), thresholds)
results.append({
"name": "Drawdown from ATH",
"key": "drawdown",
"value": dd.get("value"),
"display_value": f"{dd.get('value', 0):.1f}%" if dd.get("value") is not None else "N/A",
"score": dd_score,
"description": dd_desc,
"recent": [],
})
# Price vs 200W SMA
sma = metrics.get("200w_sma", {})
price_data = metrics.get("price", {})
current_price = price_data.get("price") or sma.get("btc_price")
sma_val = sma.get("value")
sma_score, sma_desc = score_price_vs_200w_sma(current_price, sma_val, thresholds)
results.append({
"name": "Price vs 200W SMA",
"key": "price_vs_200w_sma",
"value": sma_val,
"display_value": f"${sma_val:,.0f}" if sma_val else "N/A",
"score": sma_score,
"description": sma_desc,
"recent": sma.get("recent", []),
})
# Reserve Risk
rr = metrics.get("reserve_risk", {})
rr_score, rr_desc = score_reserve_risk(rr.get("value"), thresholds)
results.append({
"name": "Reserve Risk",
"key": "reserve_risk",
"value": rr.get("value"),
"display_value": f"{rr.get('value', 'N/A'):.6f}" if rr.get("value") is not None else "N/A",
"score": rr_score,
"description": rr_desc,
"recent": rr.get("recent", []),
})
# RHODL Ratio
rh = metrics.get("rhodl_ratio", {})
rh_score, rh_desc = score_rhodl_ratio(rh.get("value"), thresholds)
results.append({
"name": "RHODL Ratio",
"key": "rhodl_ratio",
"value": rh.get("value"),
"display_value": f"{rh.get('value', 'N/A'):.0f}" if rh.get("value") is not None else "N/A",
"score": rh_score,
"description": rh_desc,
"recent": rh.get("recent", []),
})
# NUPL
nu = metrics.get("nupl", {})
nu_score, nu_desc = score_nupl(nu.get("value"), thresholds)
results.append({
"name": "Net Unrealized Profit/Loss",
"key": "nupl",
"value": nu.get("value"),
"display_value": f"{nu.get('value', 'N/A'):.4f}" if nu.get("value") is not None else "N/A",
"score": nu_score,
"description": nu_desc,
"recent": nu.get("recent", []),
})
# LTH Realized Price
lth = metrics.get("lth_realized_price", {})
lth_price = lth.get("btc_price") or current_price
lth_rp = lth.get("value")
lth_score, lth_desc = score_lth_realized_price(lth_price, lth_rp, thresholds)
results.append({
"name": "LTH Realized Price",
"key": "lth_realized_price",
"value": lth_rp,
"display_value": f"${lth_rp:,.0f}" if lth_rp else "N/A",
"score": lth_score,
"description": lth_desc,
"recent": lth.get("recent", []),
})
# Hash Ribbons
hr = metrics.get("hash_ribbons", {})
hr_score, hr_desc = score_hash_ribbons(hr, thresholds)
results.append({
"name": "Hash Ribbons",
"key": "hash_ribbons",
"value": None,
"display_value": "Buy Signal" if hr.get("buy_signal") else "Normal",
"score": hr_score,
"description": hr_desc,
"recent": [],
})
# Compute composite
valid_scores = [r["score"] for r in results if r["score"] is not None]
if valid_scores:
# Scale to 0-100 based on available metrics
composite = sum(valid_scores) / len(valid_scores) * 10
else:
composite = 0
# Assessment text
if composite >= 71:
assessment = "STRONG ACCUMULATION ZONE"
elif composite >= 51:
assessment = "MODERATE OPPORTUNITY"
elif composite >= 31:
assessment = "NEUTRAL"
elif composite >= 15:
assessment = "CAUTION — OVERHEATED"
else:
assessment = "EXTREME CAUTION"
return {
"metrics": results,
"composite_score": round(composite, 1),
"assessment": assessment,
"scored_count": len(valid_scores),
"total_count": len(results),
}
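The scoring functions in this file all reduce to a half-open range lookup followed by an average scaled to 0-100. A minimal sketch of those two steps is below; `score_range` and `composite_score` are illustrative names rather than the module's API, and `GAPPED_RANGES` is made-up data that shows what happens when adjacent bounds do not meet.

```python
def score_range(value, ranges):
    """Return the score of the first range with low <= value < high, else 0."""
    if value is None:
        return None
    for low, high, score in ranges:
        if (low is None or value >= low) and (high is None or value < high):
            return score
    return 0


def composite_score(scores):
    """Average the available 0-10 scores and scale the result to 0-100."""
    valid = [s for s in scores if s is not None]
    return round(sum(valid) / len(valid) * 10, 1) if valid else 0.0


# Contiguous ranges: each upper bound is the next lower bound.
PUELL_RANGES = [[None, 0.3, 10], [0.3, 0.5, 8], [0.5, 0.8, 5],
                [0.8, 1.2, 3], [1.2, 2.0, 1], [2.0, None, 0]]
# Illustrative ranges with a gap between 10 and 11 (made up for this sketch).
GAPPED_RANGES = [[0, 10, 10], [11, 25, 7]]

if __name__ == "__main__":
    print(score_range(0.3, PUELL_RANGES))     # a boundary value belongs to the upper range
    print(score_range(10, GAPPED_RANGES))     # a value in the gap falls through to 0
    print(composite_score([10, None, 4, 7]))  # None entries are left out of the average
```

Because missing metrics are dropped before averaging, a composite built from three metrics and one built from ten are on the same 0-100 scale but are not equally well supported, which is presumably why the result also reports `scored_count`.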

0
scrapers/__init__.py Normal file

34
scrapers/fear_greed.py Normal file

@@ -0,0 +1,34 @@
"""Fear & Greed Index from alternative.me API."""
import logging
import requests
log = logging.getLogger(__name__)
FNG_URL = "https://api.alternative.me/fng/?limit=30"
def fetch():
"""Fetch Fear & Greed data. Returns dict with value, classification, and recent history."""
try:
resp = requests.get(FNG_URL, timeout=15)
resp.raise_for_status()
data = resp.json()
entries = data.get("data", [])
if not entries:
return {"value": None, "error": "No data"}
current = entries[0]
value = int(current["value"])
classification = current.get("value_classification", "")
recent = [int(e["value"]) for e in entries[:30]]
return {
"value": value,
"classification": classification,
"recent": recent,
}
except Exception as e:
log.error("Fear & Greed fetch error: %s", e)
return {"value": None, "error": str(e)}
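The parsing half of `fetch()` can be exercised without a network call. The sketch below separates it into a hypothetical `parse_fng` helper and feeds it a made-up payload; the payload shape is inferred only from the fields the scraper reads (`data`, `value`, `value_classification`), so treat it as an assumption rather than the API's documented schema.

```python
def parse_fng(payload, limit=30):
    """Turn an alternative.me-style payload into value, classification, recent."""
    entries = payload.get("data", [])
    if not entries:
        return {"value": None, "error": "No data"}
    current = entries[0]  # the scraper treats the first entry as the newest
    return {
        "value": int(current["value"]),  # values arrive as strings
        "classification": current.get("value_classification", ""),
        "recent": [int(e["value"]) for e in entries[:limit]],
    }


# Made-up sample for illustration; not real index readings.
SAMPLE = {"data": [
    {"value": "23", "value_classification": "Extreme Fear", "timestamp": "1700000000"},
    {"value": "31", "value_classification": "Fear", "timestamp": "1699913600"},
]}

if __name__ == "__main__":
    print(parse_fng(SAMPLE))
```

Keeping the HTTP call and the parsing separate makes the parsing testable with fixtures like `SAMPLE`.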


@@ -0,0 +1,276 @@
"""Collect full historical time series from LookIntoBitcoin charts, CoinGecko, and Fear & Greed."""
import json
import logging
import os
import time
from datetime import datetime
import requests
log = logging.getLogger(__name__)
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
HISTORY_PATH = os.path.join(BASE_DIR, "data", "history.json")
# Charts to scrape with expected trace names
CHART_CONFIGS = {
"puell_multiple": {
"path": "/charts/puell-multiple/",
"traces": {"puell_multiple": "Puell Multiple", "btc_price": "Price"},
},
"mvrv_zscore": {
"path": "/charts/mvrv-zscore/",
"traces": {"mvrv_zscore": "Z-Score"},
},
"reserve_risk": {
"path": "/charts/reserve-risk/",
"traces": {"reserve_risk": "Reserve Risk"},
},
"rhodl_ratio": {
"path": "/charts/rhodl-ratio/",
"traces": {"rhodl_ratio": "RHODL Ratio"},
},
"nupl": {
"path": "/charts/relative-unrealized-profit--loss/",
"traces": {"nupl": "NUPL"},
},
"200w_sma": {
"path": "/charts/200-week-moving-average-heatmap/",
"traces": {"200w_sma": "200 Week Moving Average", "btc_price_sma": "Price"},
},
"lth_realized_price": {
"path": "/charts/long-term-holder-realized-price/",
"traces": {"lth_realized_price": "Long-Term Holder Realized Price", "btc_price_lth": "Price"},
},
"lth_supply": {
"path": "/charts/long-term-holder-supply/",
"traces": {"lth_supply": None}, # None = grab first numeric trace
},
}
def _find_trace(traces, name):
"""Find a trace by name (case-insensitive partial match)."""
if not traces or not name:
return None
name_lower = name.lower()
for t in traces:
trace_name = t.get("name", "").lower()
if trace_name and (name_lower in trace_name or trace_name in name_lower):
return t
words = name_lower.split()
for t in traces:
trace_name = t.get("name", "").lower()
if all(w in trace_name for w in words):
return t
return None
def _extract_series(trace):
"""Extract (dates, values) from a Plotly trace dict."""
if not trace:
return [], []
x = trace.get("x", [])
y = trace.get("y", [])
dates = []
values = []
for d, v in zip(x, y):
if v is None:
continue
try:
val = float(v)
except (ValueError, TypeError):
continue
# Normalize date string to YYYY-MM-DD
date_str = str(d)[:10]
dates.append(date_str)
values.append(val)
return dates, values
def scrape_chart_history(chart_path):
"""Scrape a chart and return all trace data."""
from scrapers.lookintobitcoin import scrape_chart
return scrape_chart(chart_path)
def collect_onchain_history(progress_cb=None):
"""Scrape all on-chain charts and return dict of {metric: {dates, values}}."""
result = {}
total = len(CHART_CONFIGS)
for idx, (chart_key, cfg) in enumerate(CHART_CONFIGS.items()):
label = f"[{idx+1}/{total}] {chart_key}"
log.info("Scraping history: %s", label)
if progress_cb:
progress_cb(chart_key, idx, total)
try:
traces = scrape_chart_history(cfg["path"])
if not traces:
log.warning("No traces for %s", chart_key)
continue
for metric_key, trace_name in cfg["traces"].items():
if trace_name is None:
# Grab first trace with numeric data
for candidate in traces:
y = candidate.get("y", [])
if y and any(v is not None for v in y[-10:]):
dates, values = _extract_series(candidate)
if dates:
result[metric_key] = {"dates": dates, "values": values}
log.info(" %s: %d data points", metric_key, len(dates))
break
else:
t = _find_trace(traces, trace_name)
if not t:
# Fallback: try BTC Price
if "btc_price" in metric_key or "price" in trace_name.lower():
t = _find_trace(traces, "BTC") or _find_trace(traces, "Price")
if not t:
log.warning(" Trace '%s' not found for %s", trace_name, metric_key)
continue
dates, values = _extract_series(t)
if dates:
result[metric_key] = {"dates": dates, "values": values}
log.info(" %s: %d data points (%s to %s)", metric_key, len(dates), dates[0], dates[-1])
else:
log.warning(" %s: no valid data points", metric_key)
except Exception as e:
log.error("Error scraping %s: %s", chart_key, e)
# Be polite between requests
if idx < total - 1:
time.sleep(2)
return result
def collect_price_history():
"""Fetch BTC price history from CoinGecko (max history)."""
log.info("Fetching BTC price history from CoinGecko...")
try:
resp = requests.get(
"https://api.coingecko.com/api/v3/coins/bitcoin/market_chart",
params={"vs_currency": "usd", "days": "max"},
timeout=30,
)
resp.raise_for_status()
data = resp.json()
prices = data.get("prices", [])
dates = []
values = []
seen_dates = set()
for ts_ms, price in prices:
d = datetime.utcfromtimestamp(ts_ms / 1000).strftime("%Y-%m-%d")
if d not in seen_dates:
seen_dates.add(d)
dates.append(d)
values.append(round(price, 2))
log.info("CoinGecko BTC price: %d days (%s to %s)", len(dates), dates[0] if dates else "?", dates[-1] if dates else "?")
return {"dates": dates, "values": values}
except Exception as e:
log.error("CoinGecko price fetch failed: %s", e)
return None
def collect_fear_greed_history():
"""Fetch full Fear & Greed history from alternative.me."""
log.info("Fetching Fear & Greed history...")
try:
resp = requests.get(
"https://api.alternative.me/fng/",
params={"limit": "0"},
timeout=30,
)
resp.raise_for_status()
data = resp.json().get("data", [])
dates = []
values = []
for entry in reversed(data): # API returns newest first
ts = int(entry["timestamp"])
d = datetime.utcfromtimestamp(ts).strftime("%Y-%m-%d")
dates.append(d)
values.append(int(entry["value"]))
log.info("Fear & Greed: %d days (%s to %s)", len(dates), dates[0] if dates else "?", dates[-1] if dates else "?")
return {"dates": dates, "values": values}
except Exception as e:
log.error("Fear & Greed fetch failed: %s", e)
return None
def collect_all_history(progress_cb=None):
"""Collect all historical data and save to history.json."""
log.info("=== Starting full historical data collection ===")
history = {}
# 1. On-chain metrics from LookIntoBitcoin
onchain = collect_onchain_history(progress_cb=progress_cb)
history.update(onchain)
# 2. BTC price from CoinGecko
price = collect_price_history()
if price:
history["btc_price_coingecko"] = price
# 3. Fear & Greed
fng = collect_fear_greed_history()
if fng:
history["fear_greed"] = fng
# Pick one canonical BTC price series: the longest of the scraped and CoinGecko series (no gap filling is done)
btc_keys = [k for k in history if "btc_price" in k]
if btc_keys:
# Use longest series as base
best = max(btc_keys, key=lambda k: len(history[k]["dates"]))
history["btc_price"] = history[best]
log.info("BTC price source: %s (%d days)", best, len(history[best]["dates"]))
# Add metadata
history["_metadata"] = {
"collected_at": datetime.utcnow().isoformat() + "Z",
"metrics": list(k for k in history if not k.startswith("_")),
"metric_counts": {k: len(v["dates"]) for k, v in history.items() if isinstance(v, dict) and "dates" in v},
}
# Save
os.makedirs(os.path.dirname(HISTORY_PATH), exist_ok=True)
with open(HISTORY_PATH, "w") as f:
json.dump(history, f, separators=(",", ":"))
size_mb = os.path.getsize(HISTORY_PATH) / 1024 / 1024
log.info("=== History saved to %s (%.1f MB) ===", HISTORY_PATH, size_mb)
log.info("Metrics collected: %s", ", ".join(k for k in history if not k.startswith("_")))
return history
def load_history():
"""Load history from disk."""
if not os.path.exists(HISTORY_PATH):
return None
with open(HISTORY_PATH) as f:
return json.load(f)
def history_status():
"""Check if history exists and return metadata."""
if not os.path.exists(HISTORY_PATH):
return {"exists": False}
try:
stat = os.stat(HISTORY_PATH)
with open(HISTORY_PATH) as f:
data = json.load(f)
meta = data.get("_metadata", {})
return {
"exists": True,
"collected_at": meta.get("collected_at"),
"metrics": meta.get("metrics", []),
"metric_counts": meta.get("metric_counts", {}),
"size_mb": round(stat.st_size / 1024 / 1024, 2),
}
except Exception as e:
return {"exists": True, "error": str(e)}
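`collect_price_history()` collapses millisecond timestamps to one price per calendar day. A small sketch of that step is below, using a timezone-aware conversion; `datetime.utcfromtimestamp` still works but has been deprecated since Python 3.12. The function name `to_daily_series` and the sample rows are assumptions made for illustration.

```python
from datetime import datetime, timezone


def to_daily_series(prices):
    """Collapse [[timestamp_ms, price], ...] to the first price seen per UTC day."""
    dates, values, seen = [], [], set()
    for ts_ms, price in prices:
        day = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")
        if day not in seen:
            seen.add(day)
            dates.append(day)
            values.append(round(price, 2))
    return {"dates": dates, "values": values}


# Made-up rows: two points on 2021-01-01 (UTC) and one on 2021-01-02.
SAMPLE_PRICES = [
    [1609459200000, 29000.5],
    [1609462800000, 29100.0],
    [1609545600000, 32000.0],
]

if __name__ == "__main__":
    print(to_daily_series(SAMPLE_PRICES))
```

Keeping the first point of each day matches the source; taking the last point (a daily close) would be an equally defensible choice and changes backtest inputs slightly.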

265
scrapers/lookintobitcoin.py Normal file

@@ -0,0 +1,265 @@
"""Playwright scraper for LookIntoBitcoin / BitcoinMagazinePro charts."""
import logging
import traceback
log = logging.getLogger(__name__)
BASE_URL = "https://www.lookintobitcoin.com"
CHARTS = {
"puell_multiple": {
"path": "/charts/puell-multiple/",
"traces": ["Puell Multiple"],
},
"mvrv_zscore": {
"path": "/charts/mvrv-zscore/",
"traces": ["Z-Score"],
},
"reserve_risk": {
"path": "/charts/reserve-risk/",
"traces": ["Reserve Risk"],
},
"rhodl_ratio": {
"path": "/charts/rhodl-ratio/",
"traces": ["RHODL Ratio"],
},
"nupl": {
"path": "/charts/relative-unrealized-profit--loss/",
"traces": ["NUPL"],
},
"200w_sma": {
"path": "/charts/200-week-moving-average-heatmap/",
"traces": ["200 Week Moving Average"],
},
"lth_realized_price": {
"path": "/charts/long-term-holder-realized-price/",
"traces": ["Long-Term Holder Realized Price", "BTC Price"],
},
"hash_ribbons": {
"path": "/charts/hash-ribbons/",
"traces": None,
},
"pi_cycle_bottom": {
"path": "/charts/pi-cycle-top-bottom-indicator/",
"traces": None,
},
"lth_supply": {
"path": "/charts/long-term-holder-supply/",
"traces": None,
},
}
def scrape_chart(chart_path, timeout=25000):
"""Scrape a single chart from LookIntoBitcoin. Returns list of trace dicts or None."""
from playwright.sync_api import sync_playwright
store = {"data": None}
with sync_playwright() as p:
browser = p.chromium.launch(headless=True)
page = browser.new_page()
def handle_response(response):
if "_dash-update-component" in response.url:
try:
store["data"] = response.json()
except Exception:
pass
page.on("response", handle_response)
try:
page.goto(f"{BASE_URL}{chart_path}", timeout=timeout)
page.wait_for_timeout(6000)
except Exception as e:
log.warning("Navigation error for %s: %s", chart_path, e)
finally:
browser.close()
if store["data"]:
try:
return store["data"]["response"]["chart"]["figure"]["data"]
except (KeyError, TypeError):
# Try alternate response structures
try:
resp = store["data"]
if isinstance(resp, dict):
for key in resp:
val = resp[key]
if isinstance(val, dict) and "figure" in val:
return val["figure"]["data"]
if isinstance(val, dict) and "chart" in val:
return val["chart"]["figure"]["data"]
except Exception:
pass
return None
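The payload handling at the end of `scrape_chart` can be tested offline if it is pulled out of the browser session. The sketch below is a slightly more defensive variant, not the scraper's exact logic: `extract_traces` is a hypothetical name, and the only layout taken from the source is `response -> chart -> figure -> data`. The fallback that searches one level down is an assumption about how alternate payloads might look.

```python
def extract_traces(payload):
    """Return the Plotly trace list from a Dash update payload, or None."""
    if not isinstance(payload, dict):
        return None
    # Primary layout assumed by the scraper: response -> chart -> figure -> data.
    try:
        return payload["response"]["chart"]["figure"]["data"]
    except (KeyError, TypeError):
        pass
    # Fallback (assumed): look at each top-level dict and its dict children
    # for any component that carries a figure with data.
    for val in payload.values():
        if not isinstance(val, dict):
            continue
        children = [v for v in val.values() if isinstance(v, dict)]
        for candidate in [val] + children:
            figure = candidate.get("figure")
            if isinstance(figure, dict) and "data" in figure:
                return figure["data"]
    return None


if __name__ == "__main__":
    sample = {"response": {"chart": {"figure": {"data": [{"name": "Z-Score", "y": [1.2]}]}}}}
    print(extract_traces(sample))
```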
def _find_trace(traces, name):
"""Find a trace by name (case-insensitive partial match)."""
if not traces:
return None
name_lower = name.lower()
# First pass: exact or substring match
for t in traces:
trace_name = t.get("name", "").lower()
if trace_name and (name_lower in trace_name or trace_name in name_lower):
return t
# Second pass: check if all words in name appear in trace name
words = name_lower.split()
for t in traces:
trace_name = t.get("name", "").lower()
if all(w in trace_name for w in words):
return t
return None
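One detail of two-way substring matching is worth spelling out: in Python `"" in s` is always true, so a trace with an empty name matches any query under a plain `trace_name in name_lower` test. The sketch below shows the same two-pass lookup with unnamed traces skipped. `find_trace` and the sample traces are illustrative, not the module's own objects.

```python
def find_trace(traces, name):
    """Two-pass, case-insensitive lookup of a Plotly trace by name."""
    if not traces or not name:
        return None
    wanted = name.lower()
    named = [(t, (t.get("name") or "").lower()) for t in traces]
    # Pass 1: substring match in either direction, skipping unnamed traces.
    for trace, trace_name in named:
        if trace_name and (wanted in trace_name or trace_name in wanted):
            return trace
    # Pass 2: every word of the wanted name occurs somewhere in the trace name.
    words = wanted.split()
    for trace, trace_name in named:
        if trace_name and all(w in trace_name for w in words):
            return trace
    return None


# Made-up traces: the first one has no name, as helper or shading traces often do.
TRACES = [
    {"name": "", "y": []},
    {"name": "BTC Price", "y": [42000.0]},
    {"name": "200 Week Moving Average", "y": [30000.0]},
]

if __name__ == "__main__":
    print(find_trace(TRACES, "Price")["name"])
    print(find_trace(TRACES, "200 Week Average")["name"])
```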
def _get_latest_value(trace):
"""Get the most recent non-null y value from a trace."""
if not trace:
return None
y = trace.get("y", [])
for val in reversed(y):
if val is not None:
try:
return float(val)
except (ValueError, TypeError):
continue
return None
def _get_recent_values(trace, n=30):
"""Get the last n non-null values from a trace."""
if not trace:
return []
y = trace.get("y", [])
values = []
for val in reversed(y):
if val is not None:
try:
values.append(float(val))
except (ValueError, TypeError):
continue
if len(values) >= n:
break
values.reverse()
return values
def scrape_all():
"""Scrape all charts and return parsed metric values."""
results = {}
for metric_key, chart_info in CHARTS.items():
log.info("Scraping %s ...", metric_key)
try:
traces = scrape_chart(chart_info["path"])
if not traces:
log.warning("No data for %s", metric_key)
results[metric_key] = {"value": None, "error": "No data returned"}
continue
wanted = chart_info.get("traces")
if metric_key == "puell_multiple":
t = _find_trace(traces, "Puell Multiple")
val = _get_latest_value(t)
results[metric_key] = {
"value": val,
"recent": _get_recent_values(t),
}
elif metric_key == "mvrv_zscore":
t = _find_trace(traces, "Z-Score")
val = _get_latest_value(t)
results[metric_key] = {
"value": val,
"recent": _get_recent_values(t),
}
elif metric_key == "200w_sma":
t = _find_trace(traces, "200 Week Moving Average") or _find_trace(traces, "200 Week MA") or _find_trace(traces, "200W")
val = _get_latest_value(t)
# Also try to find BTC price trace
price_t = _find_trace(traces, "BTC Price") or _find_trace(traces, "Price")
price_val = _get_latest_value(price_t)
results[metric_key] = {
"value": val,
"btc_price": price_val,
"recent": _get_recent_values(t),
}
elif metric_key == "lth_realized_price":
lth_t = _find_trace(traces, "Long-Term Holder Realized Price") or _find_trace(traces, "LTH Realized Price") or _find_trace(traces, "LTH")
price_t = _find_trace(traces, "BTC Price") or _find_trace(traces, "Price")
lth_val = _get_latest_value(lth_t)
price_val = _get_latest_value(price_t)
results[metric_key] = {
"value": lth_val,
"btc_price": price_val,
"recent": _get_recent_values(lth_t),
}
elif metric_key == "hash_ribbons":
# Look for buy/sell signal traces or MA crossover
results[metric_key] = {
"traces": [
{"name": t.get("name", ""), "latest": _get_latest_value(t)}
for t in traces[:6]
],
"value": None,
}
# Try to detect buy signal from trace names/colors
for t in traces:
name = t.get("name", "").lower()
if "buy" in name or "signal" in name:
results[metric_key]["buy_signal"] = True
break
elif metric_key == "lth_supply":
# Get main supply trace
t = traces[0] if traces else None
for candidate in traces:
name = candidate.get("name", "").lower()
if "supply" in name or "lth" in name:
t = candidate
break
recent = _get_recent_values(t, 60)
# Determine trend: compare recent avg to older avg
trend = None
if len(recent) >= 30:
old_avg = sum(recent[:15]) / 15
new_avg = sum(recent[-15:]) / 15
trend = "increasing" if new_avg > old_avg else "decreasing"
results[metric_key] = {
"value": _get_latest_value(t),
"trend": trend,
"recent": _get_recent_values(t),
}
else:
# Generic: grab first non-layout trace with numeric data
t = None
if wanted:
for name in wanted:
t = _find_trace(traces, name)
if t:
break
if not t:
for candidate in traces:
y = candidate.get("y", [])
if y and any(v is not None for v in y[-10:]):
t = candidate
break
val = _get_latest_value(t)
results[metric_key] = {
"value": val,
"recent": _get_recent_values(t),
}
except Exception as e:
log.error("Error scraping %s: %s\n%s", metric_key, e, traceback.format_exc())
results[metric_key] = {"value": None, "error": str(e)}
return results
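The `lth_supply` branch infers a trend by comparing the average of the oldest 15 points with the average of the newest 15. A standalone sketch of that comparison follows; `supply_trend` and its keyword defaults are names chosen here for illustration. As in the source, a tie is reported as "decreasing", which may or may not be the intended behaviour.

```python
def supply_trend(recent, window=15, min_points=30):
    """Compare the mean of the oldest and newest `window` points.

    Returns "increasing", "decreasing", or None when there is too little data.
    """
    if len(recent) < min_points:
        return None
    old_avg = sum(recent[:window]) / window
    new_avg = sum(recent[-window:]) / window
    # Equal averages fall into the "decreasing" branch, mirroring the scraper.
    return "increasing" if new_avg > old_avg else "decreasing"


if __name__ == "__main__":
    print(supply_trend(list(range(60))))
    print(supply_trend([5.0] * 40))
```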

80
scrapers/price.py Normal file

@@ -0,0 +1,80 @@
"""BTC price data from CoinGecko API."""
import logging
import requests
log = logging.getLogger(__name__)
PRICE_URL = "https://api.coingecko.com/api/v3/simple/price?ids=bitcoin&vs_currencies=usd&include_24hr_change=true"
HISTORY_URL = "https://api.coingecko.com/api/v3/coins/bitcoin/market_chart?vs_currency=usd&days=365"
ATH_URL = "https://api.coingecko.com/api/v3/coins/bitcoin?localization=false&tickers=false&market_data=true&community_data=false&developer_data=false"
def fetch_current():
"""Fetch current BTC price and 24h change."""
try:
resp = requests.get(PRICE_URL, timeout=15)
resp.raise_for_status()
data = resp.json()
btc = data.get("bitcoin", {})
return {
"price": btc.get("usd"),
"change_24h": btc.get("usd_24h_change"),
}
except Exception as e:
log.error("Price fetch error: %s", e)
return {"price": None, "error": str(e)}
def fetch_historical():
"""Fetch 365 days of BTC price history. Returns list of [timestamp, price]."""
try:
resp = requests.get(HISTORY_URL, timeout=30)
resp.raise_for_status()
data = resp.json()
prices = data.get("prices", [])
return prices
except Exception as e:
log.error("Historical price fetch error: %s", e)
return []
def fetch_ath():
"""Fetch BTC all-time high from CoinGecko."""
try:
resp = requests.get(ATH_URL, timeout=15)
resp.raise_for_status()
data = resp.json()
market = data.get("market_data", {})
ath = market.get("ath", {}).get("usd")
ath_change = market.get("ath_change_percentage", {}).get("usd")
return {
"ath": ath,
"ath_change_pct": ath_change,
}
except Exception as e:
log.error("ATH fetch error: %s", e)
return {"ath": None, "error": str(e)}
def calculate_200d_sma(prices):
"""Calculate 200-day SMA from historical price data."""
if not prices or len(prices) < 200:
return None
# prices is [[timestamp, price], ...]
recent_200 = [p[1] for p in prices[-200:]]
return sum(recent_200) / len(recent_200)
def calculate_mayer_multiple(current_price, sma_200d):
"""Mayer Multiple = current price / 200-day SMA."""
if not current_price or not sma_200d or sma_200d == 0:
return None
return current_price / sma_200d
def calculate_drawdown(current_price, ath):
"""Drawdown from ATH as percentage."""
if not current_price or not ath or ath == 0:
return None
return (ath - current_price) / ath * 100
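The three helpers at the end of this file are plain arithmetic, so a worked example is easy to check by hand. The sketch below restates them with shorter, illustrative names (`sma`, `mayer_multiple`, `drawdown_pct`) and runs them on synthetic prices; none of the numbers are market data.

```python
def sma(prices, window=200):
    """Simple moving average over the last `window` [timestamp, price] rows."""
    if not prices or len(prices) < window:
        return None
    tail = [p[1] for p in prices[-window:]]
    return sum(tail) / len(tail)


def mayer_multiple(price, sma_value):
    """Current price divided by its 200-day SMA."""
    if not price or not sma_value:
        return None
    return price / sma_value


def drawdown_pct(price, ath):
    """Percentage below the all-time high (50 means half the ATH)."""
    if not price or not ath:
        return None
    return (ath - price) / ath * 100


if __name__ == "__main__":
    flat = [[i, 100.0] for i in range(250)]  # synthetic flat series
    avg = sma(flat)
    print(avg, mayer_multiple(150.0, avg), drawdown_pct(50_000, 100_000))
```

A Mayer Multiple of 1.5 here simply means the price sits 50% above its 200-day average. Note that this uses the 200-day SMA, a different measure from the 200-week SMA scored in `scoring/engine.py`.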