WSPR Ingestion Apps¶
Four Go binaries for downloading and ingesting WSPR spot data into ClickHouse.
wspr-turbo¶
The primary ingestion engine. Streams tar.gz/csv.gz archives directly into ClickHouse
using zero-copy decompression and double-buffered native blocks — no intermediate disk I/O.
Part of ionis-apps (Go binary).
wspr-turbo v3.0.2 - Zero-Copy Streaming Pipeline
Usage: wspr-turbo [OPTIONS] [archives...]
Streams tar.gz/csv.gz directly to ClickHouse Native Blocks.
No intermediate disk I/O - bypasses the 'File Penalty'.
Architecture:
- Stream decompression (klauspost/gzip, ASM-optimized)
- Vectorized CSV parsing (columnar buffers)
- Double-buffering (fill while sending)
- sync.Pool (zero allocation after warmup)
- ch-go native protocol with LZ4
-block-size int
Rows per native block (default 1000000)
-ch-db string
ClickHouse database (default "wspr")
-ch-host string
ClickHouse address (default "127.0.0.1:9000")
-ch-table string
ClickHouse table (default "bronze")
-report-dir string
Report output directory (default "/mnt/ai-stack/wspr-data/reports-turbo")
-source-dir string
Archive source directory (default "/scratch/ai-stack/wspr-data/archives")
-workers int
Parallel archive workers (default 16)
wspr-shredder¶
Maximum throughput ingester for uncompressed CSV files. Uses 1 MB read buffers and
zero-allocation CSV parsing to saturate PCIe 5.0 lanes.
Part of ionis-apps (Go binary).
wspr-shredder v3.0.2 - Maximum Throughput WSPR Ingester
Usage: wspr-shredder [OPTIONS] [path|files...]
If no paths specified, uses -source-dir default.
Optimizations:
- ch-go native protocol (fastest ClickHouse client)
- 1MB read buffers (bufio.NewReaderSize)
- csv.Reader with ReuseRecord (zero-allocation)
- Per-file workers to saturate PCIe 5.0 lanes
-ch-db string
ClickHouse database (default "wspr")
-ch-host string
ClickHouse address (default "127.0.0.1:9000")
-ch-table string
ClickHouse table (default "bronze")
-report-dir string
Report output directory (default "/mnt/ai-stack/wspr-data/reports-shredder")
-source-dir string
Default CSV source directory (default "/scratch/ai-stack/wspr-data/csv")
-workers int
Number of parallel file workers (default 16)
wspr-parquet-native¶
Native Go Parquet reader for ingesting Parquet-format WSPR data. Avoids ClickHouse
file() function restrictions by reading client-side with parquet-go.
Part of ionis-apps (Go binary).
wspr-parquet-native v3.0.2 - Native Go Parquet Ingester
Usage: wspr-parquet-native [OPTIONS] [path|files...]
If no paths specified, uses -source-dir default.
Features:
- Native Go Parquet reading (parquet-go)
- ch-go native protocol with LZ4
- No ClickHouse file() restrictions
- Parallel file processing
-ch-db string
ClickHouse database (default "wspr")
-ch-host string
ClickHouse address (default "127.0.0.1:9000")
-ch-table string
ClickHouse table (default "bronze")
-report-dir string
Report output directory (default "/mnt/ai-stack/wspr-data/reports-parquet-native")
-source-dir string
Default Parquet source directory (default "/scratch/ai-stack/wspr-data/parquet")
-workers int
Number of parallel file workers (default 8)
wspr-download¶
Parallel archive downloader for WSPR spot data from wsprnet.org. Uses ETag validation
to detect updated files and supports configurable rate limiting.
Part of ionis-apps (Go binary).
wspr-download v3.0.2 — WSPR Archive Downloader
Usage: wspr-download [flags]
Downloads WSPR spot archives from wsprnet.org.
Archives are monthly .csv.gz files (~200MB-1GB each).
Uses ETag validation to detect updated files (e.g. end-of-month finalization).
Good neighbor: configurable workers/delay, resume-friendly.
-delay duration
Delay between HTTP requests per worker (default 1s)
-dest string
Destination directory (default "/mnt/wspr-data")
-end string
End date (YYYY-MM, default: current month)
-force
Re-download all files regardless of ETag
-list
List files without downloading
-start string
Start date (YYYY-MM) (default "2008-03")
-timeout duration
HTTP timeout per download (default 5m0s)
-workers int
Parallel download workers (default 4)
Data source: https://wsprnet.org/archive
Archive range: 2008-03 to present (~200 files)
Examples:
wspr-download # Download all, skip unchanged
wspr-download --start 2024-01 --end 2024-12 # Download 2024 only
wspr-download --list # List files without downloading
wspr-download --workers 2 --delay 3s # Be extra polite
wspr-download --force # Re-download everything