Validation Reports

Automated test results from IONIS model evaluations.

Current Status: V20 Golden Master (2026-02-11)

V20 Production Locked — All Criteria Met

IONIS V20 Golden Master validates the physics constraints in a clean, config-driven codebase.

Metric	Target	V20 Final	Status
Pearson	> +0.48	+0.4879	PASS
Kp sidecar	> +3.0σ	+3.487σ	PASS
SFI sidecar	> +0.4σ	+0.482σ	PASS
RMSE	—	0.862σ	Matched
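
The pass/fail column above can be reproduced with a small gate check. This is a minimal sketch: the names `Criterion`, `V20_CRITERIA`, and `evaluate` are illustrative and not taken from the IONIS codebase. Only the targets and V20 values come from the table, and "greater than target" is read literally from the `>` signs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    target: float    # measured value must be strictly greater than this
    measured: float

    def passed(self) -> bool:
        return self.measured > self.target


# Targets and V20 values copied from the table above.
# RMSE has no stated target, so it is not gated here.
V20_CRITERIA = [
    Criterion("Pearson", 0.48, 0.4879),
    Criterion("Kp sidecar (sigma)", 3.0, 3.487),
    Criterion("SFI sidecar (sigma)", 0.4, 0.482),
]


def evaluate(criteria):
    """Return a {name: 'PASS' | 'FAIL'} mapping for each criterion."""
    return {c.name: "PASS" if c.passed() else "FAIL" for c in criteria}


results = evaluate(V20_CRITERIA)
all_met = all(status == "PASS" for status in results.values())
```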

Training: 100 epochs in 4h 16m on Mac Studio M3 Ultra (MPS backend). Checkpoint: versions/v20/ionis_v20.pth

Mode-Aware Validation

IONIS predicts signal-to-noise ratio (SNR) — a physical quantity. The operational question "Can I work this path right now, on my mode?" is answered by applying mode-specific thresholds to that prediction:

Mode Family	Threshold	Recall	Interpretation
WSPR	-28 dB	~97%	Model sees nearly all beacon paths
FT8/FT4	-20 dB	93.29%	Digital modes decode deep in the noise
CW	-10 dB	93.77%	Most CW-viable paths detected
RTTY	-5 dB	99.37%	Contest anchoring taught RTTY ceiling
SSB	+5 dB	98.40%	Contest anchoring taught voice ceiling
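
A minimal sketch of how a mode threshold turns a predicted SNR into a workable/not-workable answer, and how recall could then be computed. The thresholds come from the table above; the function names, the `>=` comparison, and the definition of recall (fraction of paths known to be open that the model also calls workable) are assumptions for illustration, not the project's confirmed method.

```python
# Thresholds in dB, copied from the table above.
MODE_THRESHOLDS_DB = {
    "WSPR": -28.0,
    "FT8/FT4": -20.0,
    "CW": -10.0,
    "RTTY": -5.0,
    "SSB": 5.0,
}


def is_workable(predicted_snr_db: float, mode: str) -> bool:
    """Assumed rule: a path is workable if predicted SNR meets the mode threshold."""
    return predicted_snr_db >= MODE_THRESHOLDS_DB[mode]


def recall(predicted_snrs_db, mode: str) -> float:
    """Fraction of observed-open paths that the model also marks workable."""
    if not predicted_snrs_db:
        raise ValueError("need at least one observed path")
    hits = sum(is_workable(snr, mode) for snr in predicted_snrs_db)
    return hits / len(predicted_snrs_db)


# Made-up predictions for five paths that were actually worked.
example_snrs_db = [-25.0, -18.5, -12.0, -3.0, 7.5]
recall_by_mode = {mode: recall(example_snrs_db, mode) for mode in MODE_THRESHOLDS_DB}
```

The same prediction serves every mode; only the threshold changes, which is why recall falls as the required SNR rises in this toy data.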

Curriculum learning: WSPR taught the floor (-28 dB), contest logs taught the ceiling (+10 dB). The model now knows the full dynamic range.

VOACAP Comparison Context

VOACAP (ITS/NTIA) was designed for SSB voice circuits using 1960s-era ionosonde coefficients. It has no concept of digital mode decode thresholds.

SSB is the only direct comparison — both models target voice-viable paths
For digital modes (FT8, FT4, WSPR) and CW/RTTY, IONIS provides predictions where no comparable reference model exists
When FT8 operators use VOACAP and find "closed" paths that are wide open at -20 dB, that's not a VOACAP failure — it was never designed for that world

PSK Reporter Acid Test (2026-02-10)

84.14% Recall on Independent Data — Model Generalizes

Validated against 100K spots drawn from 16.5M PSK Reporter observations under real solar conditions (SFI=140, Kp=1.6). The model had never seen this data.

Test	Recall	Notes
IONIS vs VOACAP (training domain)	96.38%	Contest paths
PSK Reporter (independent)	84.14%	Acid test

By Mode:

Mode	Recall	Spots
FT8	83.61%	91,682
WSPR	100%	4,729
FT4	82.30%	2,729
CW	59.33%	804
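
As a consistency check, the per-mode figures above can be combined into a spot-weighted overall recall. This sketch uses only the numbers in the table; the names `BY_MODE` and `weighted_recall` are illustrative. The four listed modes account for 99,944 of the roughly 100K spots, so the result should land close to, though not exactly on, the reported 84.14%.

```python
# (recall in percent, number of spots), copied from the table above.
BY_MODE = {
    "FT8": (83.61, 91_682),
    "WSPR": (100.0, 4_729),
    "FT4": (82.30, 2_729),
    "CW": (59.33, 804),
}


def weighted_recall(by_mode) -> float:
    """Spot-weighted mean of per-mode recall, in percent."""
    total_spots = sum(spots for _, spots in by_mode.values())
    hits = sum(pct / 100.0 * spots for pct, spots in by_mode.values())
    return 100.0 * hits / total_spots


total_spots = sum(spots for _, spots in BY_MODE.values())
overall_pct = weighted_recall(BY_MODE)
```

Because FT8 contributes over 90% of the spots, the overall figure is dominated by FT8 recall. The low CW recall barely moves it.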

By Band:

Band	Recall	Notes
15m-10m	94-96%	F2 mastery
20m-17m	81-89%	Solid
160m-80m	45-69%	NVIS gap

Key insight: recall drops by about 3 pp when the real SFI (140) replaces the default (150), which indicates the model responds to solar conditions. That points to physics, not memorization.

IONIS vs VOACAP (2026-02-11)

IONIS 96.38% vs VOACAP 75.82% — 1M Contest QSOs

Comparison on 1,000,000 real contest QSO paths. IONIS showed a 20.56 percentage point improvement in recall over VOACAP.

Model	Overall Recall	vs VOACAP
IONIS	96.38%	+20.56 pp
VOACAP	75.82%	—

See IONIS vs VOACAP for full results by mode, band, and methodology.

Prediction Quality (2026-02-09)

IONIS Pearson r=+0.3675 vs VOACAP r=+0.0218

Per-band Pearson correlation on 100K high-confidence signatures (spot_count > 50). IONIS showed higher correlation on 9 of 10 bands. VOACAP was anti-correlated on the low bands (160/80/60/40/30m).

Note: V20 Golden Master achieves Pearson +0.4879 — a substantial improvement from the original +0.3675 measurement.
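
The per-band Pearson comparison described above can be sketched as follows. This is an illustration only: the row layout `(band, predicted_snr, observed_snr)`, the function names, and the sample values are assumptions, and the data is made up to show one positively correlated band and one anti-correlated band.

```python
import math
from collections import defaultdict


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pearson_by_band(rows):
    """Group (band, predicted, observed) rows by band and correlate each group."""
    grouped = defaultdict(lambda: ([], []))
    for band, predicted, observed in rows:
        grouped[band][0].append(predicted)
        grouped[band][1].append(observed)
    return {band: pearson(pred, obs) for band, (pred, obs) in grouped.items()}


# Made-up rows: predictions track observations on 20m and oppose them on 80m.
rows = [
    ("20m", -20.0, -18.0), ("20m", -15.0, -14.0), ("20m", -10.0, -11.0),
    ("80m", -20.0, -12.0), ("80m", -15.0, -16.0), ("80m", -10.0, -21.0),
]
per_band = pearson_by_band(rows)
```

A negative value for a band, as in the toy 80m rows, is what "anti-correlated" means here: the predictions move in the opposite direction from the observations.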

See Prediction Quality for full band-by-band results.

Link Budget Battery (2026-02-11)

24 Profiles Tested — Full Discrimination Curve Mapped

Validated V20 model predictions across 24 station profiles from WSPR baseline (0 dB) to EME (+70.8 dB) against 3 ground truth sources.

Discrimination Curve (RBN, 56.7M paths):

Profile	Advantage	Recall	Tier
wspr	+0.0 dB	15.61%	baseline
qrp_portable	+11.0 dB	91.86%	GOLDILOCKS
home_station	+31.0 dB	100.00%	saturated
contest_cw	+53.0 dB	100.00%	saturated

Key insight: The model predicts ionospheric propagation, and station profiles provide the "gearbox" that turns those predictions into operational ones. The ~92% QRP recall, sitting between the 15.61% WSPR baseline and the saturated 100% profiles, shows that the model discriminates between easy and hard paths, which is exactly what operators need.
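
One way to picture the "gearbox" is to add a profile's advantage to a baseline SNR prediction before applying a decode threshold. This is a sketch under stated assumptions: the additive-dB model, the function name `recall_for_profile`, the -10 dB threshold, and the baseline values are all hypothetical. Only the profile names and advantage figures come from the table above.

```python
# Station advantage over the WSPR baseline in dB, copied from the table above.
PROFILE_ADVANTAGE_DB = {
    "wspr": 0.0,
    "qrp_portable": 11.0,
    "home_station": 31.0,
    "contest_cw": 53.0,
}


def recall_for_profile(baseline_snrs_db, profile: str, threshold_db: float) -> float:
    """Fraction of observed paths whose adjusted SNR meets the threshold.

    Assumes the station advantage simply adds to the baseline prediction in dB.
    """
    if not baseline_snrs_db:
        raise ValueError("need at least one path")
    advantage = PROFILE_ADVANTAGE_DB[profile]
    hits = sum((snr + advantage) >= threshold_db for snr in baseline_snrs_db)
    return hits / len(baseline_snrs_db)


# Made-up baseline predictions for five paths that were actually worked.
baseline_snrs_db = [-30.0, -24.0, -19.0, -15.0, -8.0]
curve = {
    profile: recall_for_profile(baseline_snrs_db, profile, threshold_db=-10.0)
    for profile in PROFILE_ADVANTAGE_DB
}
```

In this toy data the curve rises with advantage and then saturates, the same shape as the table. A mid-range profile is the informative one because it still separates strong paths from weak ones.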

See Link Budget Battery for full 24-profile results, per-band analysis, and solar breakdowns.

Curriculum Learning

Training success comes from the teaching sequence:

WSPR (floor): 10.8B observations at -28 dB — "what barely possible looks like"
RBN DXpedition (rare): 91K from 152 DXCC — "unusual paths exist"
Contest (ceiling): 6.34M proven QSOs at +10 dB — "strong signals exist"

The model learned the full dynamic range. WSPR alone only taught "marginal."

Test Suite

35/35 Tests Pass

The oracle test suite validates physics constraints, canonical paths, input validation, robustness, and regression baselines.

Group	ID Range	Tests	Purpose
Canonical Paths	TST-100	10	Known HF paths
Physics Constraints	TST-200	6	Monotonicity, sidecars
Input Validation	TST-300	9	Boundary checks
Robustness	TST-500	9	Determinism, stability
Regression	TST-800	1	Catch silent degradation
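
To illustrate the kind of check the Physics Constraints and Robustness groups describe, here is a minimal sketch of monotonicity and determinism tests. Everything in it is hypothetical: `toy_predict_snr` is a stand-in and not the IONIS model, the check functions are not the actual TST-200 or TST-500 tests, and the expected directions (SNR rising with SFI, falling with Kp) are an assumption about what the sidecar constraints enforce.

```python
def toy_predict_snr(sfi: float, kp: float) -> float:
    """Stand-in predictor (NOT IONIS): SNR rises with SFI and falls with Kp."""
    return -20.0 + 0.05 * (sfi - 100.0) - 1.5 * kp


def is_monotonic(values, increasing: bool) -> bool:
    """True if values never move against the requested direction."""
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(b >= a for a, b in pairs)
    return all(b <= a for a, b in pairs)


def check_sfi_monotonic(predict, kp: float = 2.0) -> bool:
    """Sweep SFI from 70 to 300 at fixed Kp; SNR should not decrease."""
    snrs = [predict(float(sfi), kp) for sfi in range(70, 301, 10)]
    return is_monotonic(snrs, increasing=True)


def check_kp_monotonic(predict, sfi: float = 150.0) -> bool:
    """Sweep Kp from 0 to 9 at fixed SFI; SNR should not increase."""
    snrs = [predict(sfi, float(kp)) for kp in range(0, 10)]
    return is_monotonic(snrs, increasing=False)


def check_deterministic(predict, sfi: float = 150.0, kp: float = 2.0, repeats: int = 5) -> bool:
    """Repeated calls with identical inputs should return identical outputs."""
    first = predict(sfi, kp)
    return all(predict(sfi, kp) == first for _ in range(repeats))
```

Passing the predictor in as an argument keeps the checks independent of any particular model version, so the same functions could in principle wrap a loaded checkpoint.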

Documentation

Link Budget Battery — 24-profile station discrimination test
IONIS vs VOACAP — 1M path comparison
Prediction Quality — 100K path Pearson correlation comparison
Oracle Test Specification — NASA-style test documentation