Programs are graphs.
flowg keeps code in the shared form every language already represents — a typed dataflow graph. Materialize it to any language, lower it to many silicons, and meter every operation in picojoules.
Every program is already a graph.
Keep it, and the wins compound.
A program is a network of operations connected by typed, directed data flows; every language is a surface over that structure. Keep the graph as the source of truth and the gains are concrete — materialize to any language deterministically, port and edit without re-debugging, rule out whole classes of errors by construction, and route each operation to the lowest-energy silicon with a joule receipt for every one.
kinds (closed set)
(ownership semantics)
from one graph
per silicon
Real savings — in time, energy, and cost.
"The graph is canonical. Text in any language is just one projection of it — produced by lookup, not guessed token by token." The flowg premise
A typed hypergraph,
not a text file.
Nodes are computation primitives. Edges are resource-semantic
wires — Rust's ownership generalized: Move, Borrow, Stream, Feedback,
and more. Each Apply node
carries an operation with physical cost metadata. This is the IR — the same
one the runtime executes.
// flowg graph: scale_and_sum (the graph is the program) graph scale_and_sum(xs: tensor<f32>, k: f32) -> f32 { n0 Param xs n1 Param k n2 Apply mul (n0, n1) // Custom("mul") n3 Apply reduce_sum (n2) // → backend chosen by joules return n3 } // wires carry ownership: n0 ──Move──▶ n2 ──Move──▶ n3
One graph.
Many languages.
The same graph projects to source in many target languages — deterministically, the same output on every machine, every run. Text becomes a view, not the source of truth.
fn scale_and_sum(xs: &[f32], k: f32) -> f32 { xs.iter().map(|x| x * k).sum() }
def scale_and_sum(xs, k): return sum(x * k for x in xs)
// emitted WGSL compute shader (WebGPU) @compute @workgroup_size(64) fn main(@builtin(global_invocation_id) g: vec3<u32>) { out[g.x] = xs[g.x] * k; }
Routed to the silicon that
does it for the fewest joules.
Each operation carries a cost. flowg's placement pass picks the backend — CPU, Metal, AMX, CUDA, WebGPU, Wasm — that runs it at the lowest verifiable joules, with no default to any one device. Every run can produce an energy receipt.
Picojoule cost model · analytical prior refined by measured calibration.
The shared target
beneath the languages.
flowg is the graph every sibling language lowers into. They differ in surface; the target is identical. A program written in Joule runs a fused kernel on Metal without the language ever naming Metal.
Inspect a graph.
Build the toolchain, make a sample graph, and watch it run across devices with a joule receipt.
$ cargo install flowg $ flowg make-sample scale_and_sum.fg $ flowg inspect scale_and_sum.fg $ flowg run scale_and_sum.fg total: 8.42 μJ · placement: min-joule · determinism: strict