back to projects
public · single-node stabledistributed systems

Helios

Time-series database in Go — built from first principles.

github
StorageWAL + LSMCRC32C-framed records, SSTable v2
CompressionGorillaDelta-of-delta + XOR encoding
ClusteringOptional Raft3-node, SST-backed FSM snapshots

The problem

Time-series databases get treated as black boxes. The only way to really understand durable ingestion, compressed persistence, filterable queries, and consensus replication is to build one — and pin every claim to repeatable benchmarks.

Architecture

Framed-WAL writes with CRC32C
in-memory memtable with cardinality controls
SSTable v2 with Gorilla-compressed points + label postings
query path via exact label filters, matchers, and a practical PromQL subset
age-based retention and manual prune endpoint
streaming EWMA anomaly detector exposed over SSE
optional Hashicorp Raft for 3-node clustering with leader-forwarded writes and SST-backed FSM snapshots
/metrics, /livez, health and readiness probes.

Key decisions

DECISIONCHOICEWHY
Storage engineLSM-tree over B-treeTime-series workloads are write-heavy; LSM converts random writes to sequential and matches the access pattern.
CompressionGorilla encoding (delta-of-delta + XOR)Facebook's Gorilla paper is the canonical fit for monotonic timestamps and slowly-varying floats.
ConsensusHashicorp Raft, optional clusteringSingle-node by default keeps the system honest about its baseline; Raft is opt-in via `-peers`/`-bootstrap` when you need replication.
QueryPromQL subset, not full PrometheusImplement only what is reproducible from the spec; document what is supported instead of pretending to ship the entire surface.
GoWALLSM-treeGorillaPromQLRaft (optional)