IonisGate Architecture

Production: V20 Golden Master

The IonisGate architecture is validated with the V20 production checkpoint:

  • Pearson: +0.4879
  • RMSE: 0.862σ (~5.8 dB)
  • SFI benefit: +0.482σ (~3.2 dB), monotonic
  • Kp storm cost: +3.487σ (~23.4 dB), monotonic
  • Test Suite: 35/35 PASS

Design Rationale

An additive sidecar architecture produces global constants — the same SFI benefit and Kp penalty for every path on Earth:

| Finding | Impact |
|---|---|
| Storm cost is flat everywhere | Polar paths should suffer 3–5× more |
| SFI benefit is flat on all bands | 10m should benefit far more than 160m |
| No geographic modulation of solar effects | Equatorial vs. auroral zones identical |

The DNN cannot modulate the sidecars because the interaction is purely additive. The sidecars are "tone-deaf" to geography and frequency.

Solution: Multiplicative Interaction Gates

output = base_snr + sun_scaler × SunSidecar(sfi)
                  + storm_scaler × StormSidecar(kp)

The DNN splits into a shared trunk with three heads:

Input (13 features)
    ├── features 0-10 (geography/time)
    │       │
    │    Trunk: 11 → 512 → 256
    │       │
    │       ├── Base Head: 256 → 128 → 1  ──────────► base_snr
    │       │
    │       ├── Sun Scaler: 256 → 64 → 1 → gate() ──► sun_scaler ∈ [0.5, 2.0]
    │       │                                              │
    │       └── Storm Scaler: 256 → 64 → 1 → gate() ──► storm_scaler ∈ [0.5, 2.0]
    │                                                      │
    ├── feature 11 (sfi) ──► SunSidecar(8) ──────── × sun_scaler ──► sun_term
    │                                                                    │
    └── feature 12 (kp_penalty) ──► StormSidecar(8) ── × storm_scaler ──► storm_term
                                                Output = base_snr + sun_term + storm_term
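
A minimal PyTorch sketch of this topology, assuming ReLU activations and plain nn.Sequential sidecars (illustrative only; the production sidecars are the MonotonicMLPs described under Sidecar Constraints, and gate() is explained in the next section):

```python
import torch
import torch.nn as nn


def gate(x: torch.Tensor) -> torch.Tensor:
    # Bounded volume control: maps any real value into [0.5, 2.0].
    return 0.5 + 1.5 * torch.sigmoid(x)


class GatedSnrModel(nn.Module):
    """Illustrative sketch of the IonisGate topology, not the production module."""

    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(11, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
        )
        self.base_head = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 1))
        self.sun_scaler_head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))
        self.storm_scaler_head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))
        # Placeholder 1→8→1 sidecars; the production ones enforce monotonicity.
        self.sun_sidecar = nn.Sequential(nn.Linear(1, 8), nn.Softplus(), nn.Linear(8, 1))
        self.storm_sidecar = nn.Sequential(nn.Linear(1, 8), nn.Softplus(), nn.Linear(8, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        geo = x[:, :11]     # features 0-10: geography / time
        sfi = x[:, 11:12]   # feature 11: solar flux index
        kp = x[:, 12:13]    # feature 12: kp penalty
        h = self.trunk(geo)
        base_snr = self.base_head(h)
        sun_scaler = gate(self.sun_scaler_head(h))      # in [0.5, 2.0]
        storm_scaler = gate(self.storm_scaler_head(h))  # in [0.5, 2.0]
        sun_term = sun_scaler * self.sun_sidecar(sfi)
        storm_term = storm_scaler * self.storm_sidecar(kp)
        return base_snr + sun_term + storm_term
```

A batch of shape (N, 13) yields an (N, 1) SNR prediction in which the trunk can scale each sidecar's contribution but never silence or invert it.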

Gate Function

gate(x) = 0.5 + 1.5 × sigmoid(x)
| Property | Value | Purpose |
|---|---|---|
| Range | [0.5, 2.0] | Never zero, never too large |
| Identity | x = -ln(2) ≈ -0.693 | gate(-0.693) = 1.0 |
| Differentiable | Everywhere | Standard backprop |
| Bounded | Always | Cannot reverse sidecar direction |

The gate acts as a volume control: it can turn the sidecar effect up (to 2x) or down (to 0.5x), but it can never mute it entirely or reverse it. This preserves the monotonic physics guarantee.
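
These properties can be checked numerically; a standalone sketch, not code from ionisgate.py:

```python
import math
import torch

def gate(x: torch.Tensor) -> torch.Tensor:
    return 0.5 + 1.5 * torch.sigmoid(x)

x = torch.tensor([-20.0, -math.log(2.0), 0.0, 20.0])
print(gate(x))  # tensor([0.5000, 1.0000, 1.2500, 2.0000]): bounded, identity at -ln(2)
```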

What the Gates Enable

| Scenario | sun_scaler | storm_scaler | Physical Meaning |
|---|---|---|---|
| 10m equatorial path | > 1.0 | < 1.0 | Strong SFI benefit, mild storm impact |
| 160m high-latitude path | < 1.0 | > 1.0 | Weak SFI benefit, severe storm impact |
| 20m mid-latitude (reference) | ≈ 1.0 | ≈ 1.0 | Baseline behavior |

Parameter Budget

| Component | Parameters |
|---|---|
| Trunk (11→512→256) | 137,472 |
| Base head (256→128→1) | 33,025 |
| Sun scaler head (256→64→1) | 16,513 |
| Storm scaler head (256→64→1) | 16,513 |
| Sun Sidecar (1→8→1) | 25 |
| Storm Sidecar (1→8→1) | 25 |
| **Total** | **203,573** |
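
Each count is just weights plus biases of the underlying Linear layers; a quick arithmetic check:

```python
def linear_params(n_in: int, n_out: int) -> int:
    # One Linear layer: weight matrix plus bias vector.
    return n_in * n_out + n_out

trunk   = linear_params(11, 512) + linear_params(512, 256)  # 137,472
base    = linear_params(256, 128) + linear_params(128, 1)   # 33,025
scaler  = linear_params(256, 64) + linear_params(64, 1)     # 16,513 per scaler head
sidecar = linear_params(1, 8) + linear_params(8, 1)         # 25 per sidecar
print(trunk + base + 2 * scaler + 2 * sidecar)              # 203573
```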

Anti-Collapse Regularization

Without explicit encouragement, the optimizer may find it easier to keep the scaler heads constant (collapsing to gate ≈ 1.0 everywhere). The scaler variance loss prevents this:

L_var = -λ × (Var(sun_gate) + Var(storm_gate))

The negated variance is added to the total loss, so the optimizer can lower the loss only by producing diverse gate values across the batch. If the gates collapse to a constant, the variance is zero and the term contributes nothing, leaving the total loss at its highest for this component.
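
A sketch of that term in PyTorch; the weight lam and the function name are illustrative, not taken from the training code:

```python
import torch

def scaler_variance_loss(sun_gate: torch.Tensor, storm_gate: torch.Tensor,
                         lam: float = 0.01) -> torch.Tensor:
    # Reward batch-wise spread in the gate values: more variance, lower loss.
    return -lam * (sun_gate.var() + storm_gate.var())

# total_loss = prediction_loss + scaler_variance_loss(sun_gate, storm_gate)
```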

Sidecar Constraints

The sidecars enforce physics constraints the DNN cannot override:

| Property | Value | Enforced by |
|---|---|---|
| Architecture | MonotonicMLP (1→8→1) | Class definition |
| Weight range | [0.5, 2.0] | Post-optimizer clamp |
| Monotonicity | Non-negative weights via abs() | Forward pass |
| fc1.bias | Frozen | requires_grad = False |
| fc2.bias | Learnable | Relief valve for calibration |
| Activation | Softplus | Smooth, always positive slope |

The DNN can only "turn up or down the volume" on the sidecar output — it cannot reverse the direction. Higher SFI always helps; storms always hurt. The gates modulate how much, not whether.
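
A minimal sketch of such a sidecar, assuming the constraints in the table above (the post-optimizer weight clamp belongs to the training loop and is omitted here):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MonotonicMLP(nn.Module):
    """1→8→1 sidecar whose output is monotonic in its scalar input."""

    def __init__(self, hidden: int = 8):
        super().__init__()
        self.fc1 = nn.Linear(1, hidden)
        self.fc2 = nn.Linear(hidden, 1)
        self.fc1.bias.requires_grad = False  # frozen; fc2.bias stays learnable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # abs() keeps every effective weight non-negative, and Softplus is
        # non-decreasing, so the composed map can never reverse direction.
        h = F.softplus(F.linear(x, self.fc1.weight.abs(), self.fc1.bias))
        return F.linear(h, self.fc2.weight.abs(), self.fc2.bias)
```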

Gate Initialization

For identity-equivalent behavior at initialization:

  1. gate(x) = 1.0 when sigmoid(x) = 1/3
  2. sigmoid(x) = 1/3 when x = -ln(2) ≈ -0.693
  3. Set final layer bias of each scaler head to -0.693
  4. Zero-initialize final layer weights so output ≈ -0.693 for all inputs

This ensures the model starts training from a known baseline — the gates are invisible until training data drives them to differentiate by geography and frequency.
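
A sketch of steps 3 and 4, assuming each scaler head is an nn.Sequential whose final element is the 64→1 Linear layer:

```python
import math
import torch.nn as nn

def init_scaler_head(head: nn.Sequential) -> None:
    final = head[-1]                               # assumed: the final 64→1 Linear layer
    nn.init.zeros_(final.weight)                   # head outputs -0.693 for every input
    nn.init.constant_(final.bias, -math.log(2.0))  # so gate(-ln 2) = 1.0 at step 0
```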

Files

| File | Purpose |
|---|---|
| ionis-training/scripts/models/ionisgate.py | PyTorch module + self-verification |
| ionis-docs/docs/architecture/ionisgate.md | This document |