The contract that has to hold across languages
An HTTP API contract and a telemetry metric have the same structural problem: one definition has to be honored by code written in different languages, compiled by different toolchains, in different parts of the repo.
For the AIO API, an endpoint's request shape is encoded by the iOS client (Swift), validated and served by the backend (Go), and consumed by the admin web UI (TypeScript). For a metric, the bucket layout is computed on the client (Swift), pre-validated on the server (Go), and queried by dbt models against ClickHouse (SQL). When these drift, you get a 400 that only some clients hit, or a histogram whose buckets silently don't line up.
The repo solves both with the same mechanism: a DSL file is the single source of truth, a Rust binary projects it into per-language artifacts, and Bazel genrules wire each artifact to where it's consumed. Two instances exist today — aio-api-codegen (added in be92d9ec8) and metric-codegen (added in 90fdb92a2) — built independently but with a near-identical layout.
The DSL inputs
The .aio API DSL is declarative and reads close to the wire shape. There are 18 .aio specs under Server/services/aio-server/api/, totaling 140 endpoints and 111 objects:
namespace devices
endpoint deviceRegister POST /api/v1/devices {
    auth: optional
    requires: [bundleID]
    request {
        deviceToken: string
        environment: string
        deviceID: string?
        appVersion: string?
    }
    response: DeviceDTO
}
object DeviceDTO {
    id: string
    bundleID: string
    deviceToken: string
    isActive: bool
    lastSeenAt: string
}
The .metric DSL describes histograms instead, with named bucket presets, enum dimensions, and a sampling policy:
namespace network
schema_version "1.0"
preset http_times = exponential(min: 1ms, max: 60s, buckets: 50)
dimension status_class = enum { "2xx" "4xx" "5xx" network_error canceled }
@owner("Network")
metric http_request_duration {
    version = 1
    kind = histogram
    unit = milliseconds
    bucket = http_times
    tags = [ tag endpoint { cardinality: 100 } status_class http_method ]
    sampling = rate(0.1)
}
Both files are hand-written and live next to the code that owns the domain. Neither generated output is checked in.
One binary, subcommands per target
The Rust crate is split lexer → parser → ir → emit::{...}, and main.rs is a thin clap dispatcher. The metric binary:
#[derive(Subcommand)]
enum Cmd {
    Proto { input: PathBuf, output: PathBuf },
    Swift { input: PathBuf, output: PathBuf },
    Go { input: PathBuf, output: PathBuf, package: String },
    Sql { input: PathBuf, output: PathBuf },
    Check { inputs: Vec<PathBuf> },
}
Every subcommand runs the same front half — load() does lex → parse → ir::check — then hands the type-checked Module to one emitter. The API binary is identical in shape, with Swift / Go / Ts / Check instead. The check subcommand parses and type-checks without emitting; it exists as a CI gate so a malformed spec fails before any target is built.
The front/back split matters: validation lives once, in ir::check, not duplicated per language. The metric checker rejects a sparse metric that declares a bucket, a histogram with no bucket, a tag referencing an undefined dimension, or a duplicate metric name — before Swift, Go, SQL, or Proto ever sees it. A class of "wrong in three languages at once" bugs is moved to a single parse-time failure.
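The four rejections listed above can be sketched as a single pass over the IR. This is a minimal illustration, not the actual ir::check: the Module, Metric, and Kind types below are hypothetical stand-ins, and every tag is treated as a dimension reference, which ignores inline tags such as `tag endpoint { ... }`.

```rust
use std::collections::HashSet;

// Hypothetical IR shapes; the real `ir` module is not shown in the text.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Histogram,
    Sparse,
}

#[derive(Debug)]
struct Metric {
    name: String,
    kind: Kind,
    bucket: Option<String>,
    tags: Vec<String>, // simplification: every tag names a dimension
}

#[derive(Debug)]
struct Module {
    dimensions: Vec<String>,
    metrics: Vec<Metric>,
}

/// Collects every violation instead of stopping at the first one.
fn check(module: &Module) -> Vec<String> {
    let dims: HashSet<&str> = module.dimensions.iter().map(String::as_str).collect();
    let mut seen: HashSet<&str> = HashSet::new();
    let mut errors = Vec::new();

    for m in &module.metrics {
        // `insert` returns false when the name was already present.
        if !seen.insert(m.name.as_str()) {
            errors.push(format!("duplicate metric `{}`", m.name));
        }
        match (m.kind, &m.bucket) {
            (Kind::Sparse, Some(_)) => {
                errors.push(format!("sparse metric `{}` declares a bucket", m.name))
            }
            (Kind::Histogram, None) => {
                errors.push(format!("histogram `{}` has no bucket", m.name))
            }
            _ => {}
        }
        for tag in &m.tags {
            if !dims.contains(tag.as_str()) {
                errors.push(format!(
                    "metric `{}`: tag `{}` references an undefined dimension",
                    m.name, tag
                ));
            }
        }
    }
    errors
}

fn main() {
    let module = Module {
        dimensions: vec!["status_class".to_string()],
        metrics: vec![
            Metric {
                name: "http_request_duration".to_string(),
                kind: Kind::Histogram,
                bucket: None, // rejected: histogram without a bucket
                tags: vec!["status_class".to_string()],
            },
            Metric {
                name: "retry_count".to_string(),
                kind: Kind::Sparse,
                bucket: Some("http_times".to_string()), // rejected: sparse with a bucket
                tags: vec![],
            },
        ],
    };
    for e in check(&module) {
        eprintln!("error: {e}");
    }
}
```

Returning all errors at once, rather than failing on the first, is a design choice of this sketch; it suits a CI gate because one run reports everything wrong with a spec.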
One genrule per output
The Bazel glue is deliberately thin. Each language target is a macro wrapping genrule, invoking the binary with the matching subcommand. From api.bzl:
def aio_api_go_lib(name, spec, package, importpath, visibility = None):
    native.genrule(
        name = name + "_gen",
        srcs = [spec],
        outs = [name + ".go"],
        cmd = "$(location " + _CODEGEN + ") go --input $(SRCS) --output $@ --package " + package,
        tools = [_CODEGEN],
    )
    go_library(name = name, srcs = [":" + name + "_gen"], importpath = importpath)
api.bzl exposes three such macros (aio_api_swift_lib, aio_api_go_lib, aio_api_ts_lib); metric.bzl exposes four (proto/swift/go/sql) plus a metric_spec umbrella that emits all of them under one name, and factors the shared command-building into a private _gen_one helper. A BUILD file then reads as a flat list:
aio_api_swift_lib(name = "devices_swift", spec = "devices.aio")
aio_api_go_lib(name = "devices_go", spec = "devices.aio",
               package = "devicesapi", importpath = ".../devicesapi")
aio_api_ts_lib(name = "devices_ts", spec = "devices.aio")
Because the spec is the only srcs input, Bazel regenerates exactly the affected targets when a .aio or .metric file changes — and nothing else.
Where the languages actually diverge
A shared IR does not mean identical output. The divergence is concentrated in two places.
Type mapping is a single match over (Lang, Type). An optional string becomes String? in Swift, *string in Go, string | null in TypeScript; a map becomes [String: V], map[K]V, Record<string, V>. Naming is the other axis. The Go emitter runs a word-boundary acronym pass so that idToken → IDToken, apiKey → APIKey, and avatarURL → AvatarURL, matching Go convention. The Swift metric emitter has to backtick-escape enum cases that start with a digit: a status_class value such as "1xx" can't be a bare Swift identifier, so it is emitted as case `1xx`. These details are tedious to keep consistent by hand and trivial to centralize once.
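A rough sketch of both axes follows. The Lang and Type enums, the ACRONYMS list, and the function names are assumptions made for illustration; the real emitters likely cover more types and acronyms, and this version fixes map keys to strings.

```rust
// Hypothetical, simplified IR; not the actual types from the codegen crate.
#[derive(Clone, Copy)]
enum Lang { Swift, Go, Ts }

enum Type {
    Str,
    Bool,
    Optional(Box<Type>),
    Map(Box<Type>), // string-keyed map; only the value type varies here
}

/// One match over (Lang, Type) decides every target spelling.
fn map_type(lang: Lang, ty: &Type) -> String {
    match (lang, ty) {
        (Lang::Swift, Type::Str) => "String".to_string(),
        (Lang::Go, Type::Str) | (Lang::Ts, Type::Str) => "string".to_string(),
        (Lang::Swift, Type::Bool) => "Bool".to_string(),
        (Lang::Go, Type::Bool) => "bool".to_string(),
        (Lang::Ts, Type::Bool) => "boolean".to_string(),
        (Lang::Swift, Type::Optional(t)) => format!("{}?", map_type(lang, t)),
        (Lang::Go, Type::Optional(t)) => format!("*{}", map_type(lang, t)),
        (Lang::Ts, Type::Optional(t)) => format!("{} | null", map_type(lang, t)),
        (Lang::Swift, Type::Map(v)) => format!("[String: {}]", map_type(lang, v)),
        (Lang::Go, Type::Map(v)) => format!("map[string]{}", map_type(lang, v)),
        (Lang::Ts, Type::Map(v)) => format!("Record<string, {}>", map_type(lang, v)),
    }
}

// Assumed acronym list; the real pass may recognize more.
const ACRONYMS: [&str; 3] = ["id", "api", "url"];

/// Converts a camelCase field name to an exported Go name,
/// upper-casing whole words that are known acronyms.
fn go_name(field: &str) -> String {
    // Split into words: a new word starts at every uppercase letter.
    let mut words: Vec<String> = Vec::new();
    for c in field.chars() {
        if c.is_ascii_uppercase() || words.is_empty() {
            words.push(String::new());
        }
        words.last_mut().unwrap().push(c);
    }
    let mut out = String::new();
    for w in &words {
        let lower = w.to_ascii_lowercase();
        if ACRONYMS.contains(&lower.as_str()) {
            out.push_str(&lower.to_ascii_uppercase());
        } else {
            let mut cs = w.chars();
            if let Some(first) = cs.next() {
                out.push(first.to_ascii_uppercase());
                out.push_str(cs.as_str());
            }
        }
    }
    out
}

/// Backtick-escapes Swift enum cases that start with a digit.
fn swift_case(value: &str) -> String {
    if value.starts_with(|c: char| c.is_ascii_digit()) {
        format!("case `{value}`")
    } else {
        format!("case {value}")
    }
}

fn main() {
    let opt_str = Type::Optional(Box::new(Type::Str));
    let flags = Type::Map(Box::new(Type::Bool));
    for lang in [Lang::Swift, Lang::Go, Lang::Ts] {
        println!("{} / {}", map_type(lang, &opt_str), map_type(lang, &flags));
    }
    println!("{}", go_name("avatarURL"));
    println!("{}", swift_case("1xx"));
}
```

Because the acronym check compares whole words, a field like idleTimeout is left alone: "idle" is not equal to "id".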
The two binaries differ in emit strategy: the API codegen builds a serializable view model and renders it through tera templates (swift.tera, go.tera, ts.tera); the metric codegen writes strings directly with push_str. Both are fine. Templates separate layout from logic when the output is large and structural; direct string-building has fewer moving parts when the output is a flat registry.
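To make the direct string-building style concrete, here is a small sketch of an emitter that writes a flat Go registry with push_str. The Metric struct, the Entry type name, the map key, and the header comment are all assumptions for illustration; only the LayoutSig field name comes from the generated output described in this article.

```rust
// Hypothetical slice of the type-checked IR that this emitter needs.
struct Metric {
    name: String,
    version: u32,
    layout_sig: u32,
}

/// Builds the Go source text by appending to one String.
fn emit_go(package: &str, metrics: &[Metric]) -> String {
    let mut out = String::new();
    out.push_str("// Code generated by metric-codegen. DO NOT EDIT.\n\n");
    out.push_str(&format!("package {package}\n\n"));
    // `Entry` and the string key are assumed shapes, not the real registry type.
    out.push_str("var Registry = map[string]Entry{\n");
    for m in metrics {
        // {:#010x} prints a 0x prefix plus eight zero-padded hex digits.
        out.push_str(&format!(
            "\t\"{}\": {{Version: {}, LayoutSig: {:#010x}}},\n",
            m.name, m.version, m.layout_sig
        ));
    }
    out.push_str("}\n");
    out
}

fn main() {
    let metrics = vec![Metric {
        name: "http_request_duration".to_string(),
        version: 1,
        layout_sig: 0x811c_9dc5, // placeholder value, not a real fingerprint
    }];
    print!("{}", emit_go("networkmetrics", &metrics));
}
```

With output this regular, the loop body is effectively the whole template, which is why a template engine adds little here.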
The cross-target invariant: one fingerprint, three runtimes
The metric case has a constraint the API case doesn't: a number that must be byte-identical across Swift, Go, and ClickHouse. A histogram's bucket layout is fingerprinted once in the IR as a 32-bit FNV-1a hash over the canonicalized boundaries — layout_sig — after units are folded to a base (time → ns, size → bytes) so 1s and 1000ms hash the same.
That single value is then stamped into every artifact: the Swift struct gets public let layoutSig: UInt32 = 0x..., the Go Registry map entry gets LayoutSig: 0x..., and the ClickHouse metric_registry table carries a layout_sig UInt32 column. The server's Lookup(name, version) rejects an incoming histogram whose version isn't registered; the layout fingerprint lets client and server agree that "the same metric" means "the same buckets" without shipping the boundary array on every request. Computing it in one place is the only way three independently compiled runtimes stay consistent.
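The fingerprint itself is small enough to sketch. FNV-1a with its 32-bit offset basis and prime is standard; everything else here is an assumption for illustration, in particular the canonical byte encoding (each boundary as a little-endian u64 in nanoseconds) and the TimeUnit and layout_sig names as written.

```rust
const FNV_OFFSET: u32 = 0x811c_9dc5; // FNV-1a 32-bit offset basis
const FNV_PRIME: u32 = 0x0100_0193; // FNV 32-bit prime

/// Standard 32-bit FNV-1a: XOR the byte in, then multiply by the prime.
fn fnv1a32(bytes: &[u8]) -> u32 {
    let mut h = FNV_OFFSET;
    for &b in bytes {
        h ^= u32::from(b);
        h = h.wrapping_mul(FNV_PRIME);
    }
    h
}

#[derive(Clone, Copy)]
enum TimeUnit { Ns, Ms, S }

/// Folds a value to the base unit so equivalent spellings agree.
fn to_ns(value: u64, unit: TimeUnit) -> u64 {
    match unit {
        TimeUnit::Ns => value,
        TimeUnit::Ms => value * 1_000_000,
        TimeUnit::S => value * 1_000_000_000,
    }
}

/// Hashes the canonicalized boundaries. The encoding (little-endian u64
/// nanoseconds, concatenated in order) is an assumption of this sketch.
fn layout_sig(boundaries: &[(u64, TimeUnit)]) -> u32 {
    let mut bytes = Vec::with_capacity(boundaries.len() * 8);
    for &(value, unit) in boundaries {
        bytes.extend_from_slice(&to_ns(value, unit).to_le_bytes());
    }
    fnv1a32(&bytes)
}

fn main() {
    let a = layout_sig(&[(1, TimeUnit::Ms), (1, TimeUnit::S)]);
    let b = layout_sig(&[(1_000_000, TimeUnit::Ns), (1000, TimeUnit::Ms)]);
    // Same boundaries in different units produce the same fingerprint.
    println!("{a:#010x} {b:#010x} equal={}", a == b);
}
```

Whatever encoding the real IR uses, the point is that it is fixed in exactly one function; the emitters only ever copy the resulting u32.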
The cost: generated files that aren't on disk
The pattern has a real downside worth naming. The Go output is produced by a genrule, so the .go files do not exist in the source tree. Gazelle — which generates Go BUILD files by walking source — can't find the package, falls back to go mod download against a monorepo-internal vanity import path, gets a 404, and concludes the dependency is stale, deleting it from every consumer's BUILD file.
The fix is explicit and lives in Server/BUILD.bazel: one # gazelle:resolve directive per generated package, mapping importpath directly to the Bazel label, short-circuiting the source walk. There are 18 of them today, one per .aio service, and the comment block says to add a line whenever a new aio_api_go_lib appears. Codegen-as-genrule trades a manual, error-prone consistency problem for a different manual step — registering each generated package with the BUILD generator. Smaller, but not zero.
What transfers
- Put the source of truth in a DSL, generate every consumer. When one definition crosses language boundaries, hand-syncing N copies is the defect source; a spec plus codegen removes the copies.
- One binary, subcommands per target, one shared front half. Validation belongs in the IR, once, so a bad spec fails at parse time instead of N times at compile time. A check subcommand makes that a CI gate.
- One genrule per output, spec as the only input. Bazel then rebuilds exactly the affected targets, and the BUILD file stays a flat declaration of "which languages does this spec target."
- Centralize the values that must agree. A bucket fingerprint computed in one place and stamped into Swift, Go, and SQL is what keeps three runtimes from disagreeing about what "the same metric" means.
- Account for the genrule blind spot. Generated files aren't on disk, so source-tree tools (Gazelle) need explicit resolution. Budget for that step before it deletes your dependencies.