Aggregated Signatures

Status: COMPLETE (2026-02-05)
Executed on: Threadripper 9975WX (64 threads, ClickHouse on dedicated NVMe)


Objective

Transform 10.8B individual WSPR spots into a high-precision signature library by aggregating reports into median-based physical buckets. This strips site-level noise (local QRM, antenna inefficiency, ground fading) and reveals the atmospheric transfer function.

Table: wspr.signatures_v1

Dimensions

| Column | Type | Description |
| --- | --- | --- |
| tx_grid_4 | FixedString(4) | 4-char TX Maidenhead grid (square level) |
| rx_grid_4 | FixedString(4) | 4-char RX Maidenhead grid (square level) |
| band | Int32 | ADIF band ID (102-111) |
| hour | UInt8 | Hour of day UTC (0-23) |
| month | UInt8 | Month (1-12) |

Metrics

| Column | Type | Description |
| --- | --- | --- |
| median_snr | Float32 | quantile(0.5)(snr); site entropy filter |
| spot_count | UInt32 | Number of spots in bucket (minimum 5) |
| snr_std | Float32 | SNR standard deviation (dB) |
| reliability | Float32 | Fraction of spots with SNR > -20 dB |
| avg_sfi | Float32 | Average Solar Flux Index for bucket |
| avg_kp | Float32 | Average Kp index for bucket |
| avg_distance | UInt32 | Average great-circle distance (km) |
| avg_azimuth | UInt16 | Average azimuth (degrees) |

ORDER BY: (band, hour, tx_grid_4, rx_grid_4) for fast training access.

Feature Derivation

All 13 IONIS features are derivable from the signature table:

| Feature | Source |
| --- | --- |
| distance | avg_distance |
| freq_log | Derived from band → frequency mapping |
| hour_sin / hour_cos | Derived from hour |
| az_sin / az_cos | Derived from avg_azimuth |
| lat_diff | Computed from grid centroids |
| midpoint_lat | Computed from grid centroids |
| season_sin / season_cos | Derived from month |
| day_night_est | Derived from hour + grid longitude |
| sfi (sidecar) | avg_sfi / 300 |
| kp_penalty (sidecar) | 1 - avg_kp / 9 |
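The mapping above can be sketched in Python for the features whose formulas are unambiguous from the table. This is a minimal illustration, not the IONIS implementation: the function names `grid_centroid` and `derive_features` are hypothetical, the cyclic encodings assume a plain 2π phase over 24 hours and 12 months, and the sign convention for `lat_diff` (TX minus RX) and the arithmetic midpoint are assumptions. `freq_log` and `day_night_est` are left out because their exact formulas are not given here.

```python
import math


def grid_centroid(grid4: str) -> tuple[float, float]:
    """Return (lat, lon) in degrees at the centre of a 4-char Maidenhead square."""
    g = grid4.upper()
    # Fields are 20 deg x 10 deg (letters); squares are 2 deg x 1 deg (digits).
    lon = (ord(g[0]) - ord("A")) * 20 - 180 + int(g[2]) * 2 + 1.0
    lat = (ord(g[1]) - ord("A")) * 10 - 90 + int(g[3]) * 1 + 0.5
    return lat, lon


def derive_features(row: dict) -> dict:
    """Derive a subset of model features from one signature row (a plain dict)."""
    tx_lat, _ = grid_centroid(row["tx_grid_4"])
    rx_lat, _ = grid_centroid(row["rx_grid_4"])
    hour_angle = 2 * math.pi * row["hour"] / 24            # assumed phase convention
    season_angle = 2 * math.pi * (row["month"] - 1) / 12   # assumed: January at phase 0
    az = math.radians(row["avg_azimuth"])
    return {
        "distance": row["avg_distance"],
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "az_sin": math.sin(az),
        "az_cos": math.cos(az),
        "lat_diff": tx_lat - rx_lat,             # assumed sign convention
        "midpoint_lat": (tx_lat + rx_lat) / 2,   # arithmetic midpoint (assumption)
        "season_sin": math.sin(season_angle),
        "season_cos": math.cos(season_angle),
        "sfi": row["avg_sfi"] / 300,             # sidecar scaling from the table
        "kp_penalty": 1 - row["avg_kp"] / 9,     # sidecar scaling from the table
    }
```

Encoding hour, azimuth, and month as sine/cosine pairs keeps values that wrap around (23 UTC next to 00 UTC, December next to January) close together in feature space.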

Filters Applied

| Filter | Value | Rationale |
| --- | --- | --- |
| Band | 102-111 | HF amateur only |
| Distance | >= 500 km | Ground-wave contamination rejection |
| Spot count | >= 5 | Noise floor rejection |
| Median | quantile 0.5 | Outlier-resistant central tendency |

Aggregation Method

Each band is aggregated with its own INSERT INTO ... SELECT from wspr.bronze, joined with solar.bronze on date and 3-hour Kp bucket. Bands are processed sequentially to stay within memory limits, because the quantile computation stores all values per group.
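The bucketing logic can be illustrated in plain Python on in-memory data. This is a toy sketch of the technique, not the production ClickHouse query: the spot field names (`snr`, `distance`, and so on) and the function names are hypothetical, the solar join is omitted, and population standard deviation is an assumption since the exact ClickHouse function is not stated here.

```python
from collections import defaultdict
from statistics import median, pstdev

MIN_DISTANCE_KM = 500        # ground-wave rejection threshold
MIN_SPOTS = 5                # minimum spots per bucket
HF_BANDS = range(102, 112)   # ADIF band IDs 102-111
RELIABILITY_SNR_DB = -20     # reliability = fraction of spots above this SNR


def aggregate_band(spots, band):
    """Aggregate one band's spots into median-based buckets."""
    groups = defaultdict(list)
    for s in spots:
        if s["band"] != band or s["distance"] < MIN_DISTANCE_KM:
            continue
        key = (s["tx_grid_4"], s["rx_grid_4"], s["band"], s["hour"], s["month"])
        groups[key].append(s["snr"])

    rows = []
    for key, snrs in groups.items():
        if len(snrs) < MIN_SPOTS:
            continue  # too few spots to trust the median
        rows.append({
            "key": key,
            "median_snr": median(snrs),
            "spot_count": len(snrs),
            "snr_std": pstdev(snrs),  # population std dev (assumption)
            "reliability": sum(v > RELIABILITY_SNR_DB for v in snrs) / len(snrs),
        })
    return rows


def build_signatures(spots):
    """Process bands one at a time so only one band's groups are held at once."""
    signatures = []
    for band in HF_BANDS:
        signatures.extend(aggregate_band(spots, band))
    return signatures
```

Because a median needs every value in the group, peak memory scales with the number of spots held at once, which is why splitting the work by band helps.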

Results

| Metric | Value |
| --- | --- |
| Total signatures | 93,785,013 |
| Processing time | 3 min 10 sec (10 bands sequential) |
| Compression ratio | 115:1 (10.8B → 93.8M) |
| Average bucket size | 96 spots |
| Zero SFI buckets | 25,433 (0.03%) |
| NaN values | 0 |

Per-Band Distribution

| Band | ID | Buckets | Total Spots | Avg Median SNR | Avg Reliability |
| --- | --- | --- | --- | --- | --- |
| 160m | 102 | 1.69M | 116M | -18.1 dB | 0.477 |
| 80m | 103 | 5.85M | 532M | -17.9 dB | 0.495 |
| 60m | 104 | 0.91M | 81M | -17.6 dB | 0.499 |
| 40m | 105 | 21.67M | 2.80B | -17.4 dB | 0.537 |
| 30m | 106 | 15.98M | 1.62B | -18.1 dB | 0.497 |
| 20m | 107 | 27.59M | 2.74B | -17.5 dB | 0.533 |
| 17m | 108 | 6.35M | 373M | -18.4 dB | 0.477 |
| 15m | 109 | 6.26M | 360M | -18.3 dB | 0.480 |
| 12m | 110 | 2.12M | 104M | -18.8 dB | 0.453 |
| 10m | 111 | 5.37M | 267M | -17.9 dB | 0.505 |

20m and 40m dominate, as expected for the most active WSPR bands: together they hold 49.3M of the 93.8M buckets (about 53%).

Sanity Check: FN31 → JO21 (20m)

The reference path (Connecticut to Belgium, ~5,900 km) shows a smooth, physically consistent diurnal curve:

  • SNR peaks at 15-18 UTC (late afternoon, path fully sunlit)
  • SNR drops overnight (02-04 UTC)
  • Summer months show extended propagation windows into evening
  • Winter months concentrate propagation into midday hours
  • Reliability tracks SNR — higher during peak propagation

No stair-step artifacts. The median successfully strips individual spot noise and reveals the atmospheric transfer function.
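As a quick cross-check on the path length quoted for this reference path, the great-circle distance between the two grid-square centroids can be computed with the haversine formula. This is a sketch under stated assumptions: the centroid coordinates are hard-coded from the standard Maidenhead layout (2° x 1° squares), a spherical Earth of radius 6371 km is assumed, and the function name is hypothetical. The pipeline's own distance calculation may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Centroids (lat, lon) of the 4-char squares, from the Maidenhead layout.
FN31 = (41.5, -73.0)
JO21 = (51.5, 5.0)

path_km = great_circle_km(*FN31, *JO21)
print(f"FN31 -> JO21 centroid distance: {path_km:.0f} km")
```

The centroid-to-centroid result lands within a few percent of the ~5,900 km figure; individual stations inside each square will be somewhat nearer or farther.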

ClickHouse Performance Settings

SETTINGS
    max_threads = 64,
    max_memory_usage = 80000000000,
    max_bytes_before_external_group_by = 20000000000,
    join_use_nulls = 0

Memory Limit

A single-pass aggregation of all 10 bands exceeded the 74.5 GiB memory limit (max_memory_usage = 80,000,000,000 bytes) because quantile(0.5) stores all values per group. Per-band processing reduces peak memory to ~8 GiB per band.
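The GiB figures follow directly from the byte values in the settings. A small arithmetic sketch (variable names mirror the ClickHouse settings; the ~8 GiB per-band figure is taken from the text, not measured here):

```python
GIB = 2 ** 30  # bytes per GiB

max_memory_usage = 80_000_000_000                    # per-query memory limit in bytes
max_bytes_before_external_group_by = 20_000_000_000  # GROUP BY spill-to-disk threshold in bytes

limit_gib = max_memory_usage / GIB
spill_gib = max_bytes_before_external_group_by / GIB
per_band_peak_gib = 8  # approximate per-band peak reported above

print(f"memory limit:    {limit_gib:.1f} GiB")
print(f"spill threshold: {spill_gib:.1f} GiB")
print(f"per-band peak below spill threshold: {per_band_peak_gib < spill_gib}")
```

At roughly 8 GiB per band, each per-band query stays well below both the spill threshold and the hard limit.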

Signatures V2 — Balloon-Filtered (2026-02-09)

Training from V14 onward uses wspr.signatures_v2_terrestrial (93.3M rows) which excludes balloon and telemetry contamination identified by the V2 detection system.

  • V1 balloon filter (deprecated): flagged 276M spots (2.56%), of which 99.7% were false positives
  • V2 balloon filter (current): date-level velocity detection plus the full Rosetta Stone (3.64M callsigns); yields 1,443 entries covering 950K spots (0.009%), a surgical exclusion

The V2 filter excludes only confirmed high-altitude balloon transmissions while preserving legitimate ground station data. The difference is quantified in the V14-TP vs V14-TP-v2 A/B comparison (+1.3 pp Pearson improvement from the corrected filter).

DDL: 21-balloon_callsigns_v2.sql, 19-signatures_v2_terrestrial.sql

DDL Location

/usr/share/ionis-core/ddl/12-signatures_v1.sql