A programming language where the GPU is the primary execution target. AI-assisted. Zero dependencies. Any GPU vendor.
C (1972), C++ (1979), Python (1991) — designed when the CPU was the only processor. They still work, but accessing a GPU requires CUDA (NVIDIA-only, 4 GB SDK), OpenCL (effectively abandoned), or shader languages designed for graphics. Every computer sold today has a GPU, and most of that silicon sits idle.
OctoFlow is built from scratch for this reality. GPU is primary. Single binary, any vendor, zero dependencies.
On building this. OctoFlow is AI-assisted — designed and developed with LLMs generating the bulk of the code. But every architectural decision is human: Rust at the OS boundary, Vulkan for cross-vendor GPU, the Loom Engine’s main/support split, JIT kernel emission via IR builder, the self-hosted compiler direction.
The entire stdlib and everything in this repo is MIT-licensed. The compiler source is private for now, but the developer is willing to open-source all of it once a team is established to develop and sustain it long-term. If that sounds interesting, get in touch.
Not a wrapper around CUDA. Not a shader language. A complete programming language that happens to run on GPUs.
5 memory regions. Autonomous dispatch chains. Single vkQueueSubmit. The GPU runs your program — the CPU just watches.
Data born on the GPU stays in VRAM. No round-trips until you need results on the CPU. Chain operations without memory transfers.
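A minimal sketch of the idea, using only the gpu_* builtins shown in the examples further down this page — the intermediate buffers b and c stay resident in VRAM, and data crosses back to the CPU exactly once, at the final reduction:

```
let a = gpu_fill(1.0, 1000000)  // buffer allocated directly in VRAM
let b = gpu_add(a, a)           // GPU-to-GPU: no transfer
let c = gpu_scale(b, 0.25)      // still resident in VRAM
let total = gpu_sum(c)          // single readback at the end
print("Total: {total}")         // 500000
```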
The entire compiler, runtime, GPU dispatch, and standard library in one binary. No SDK, no package manager, no runtime to install.
NVIDIA, AMD, Intel — any GPU with Vulkan support. No vendor lock-in, no proprietary toolchains, no compatibility headaches.
No file access, no network, no subprocesses unless explicitly granted with flags. Run untrusted code safely.
The compiler is written in OctoFlow itself. Lexer, parser, preflight, evaluator, and SPIR-V codegen are all .flow files.
Language guide designed as LLM context. Feed it to Claude, ChatGPT, or Copilot and it writes OctoFlow code — no training needed.
Weights, DB columns, embeddings — stored compressed in GPU memory. Delta encoding, dictionary lookup, Q4_K quantization. Decompression is just another kernel in the chain.
Watch OctoFlow run. Every recording is a real terminal session.
Five lines to crunch a million elements on the GPU. No boilerplate, no shaders, no compilation step.
```
// 1 million elements — GPU parallel — one command
let a = gpu_fill(1.0, 1000000)
let b = gpu_fill(2.0, 1000000)
let c = gpu_add(a, b)
let d = gpu_scale(c, 0.5)
let total = gpu_sum(d)
print("Total: {total}") // 1500000

// Matrix multiply
let result = gpu_matmul(mat_a, mat_b, rows, cols_a, cols_b)
```
```
// First-class functions and closures
let double = fn(x) x * 2 end
print(double(21)) // 42

let nums = [1, 2, 3, 4, 5, 6, 7, 8]
let evens = filter(nums, fn(x) x % 2 == 0 end)
let squared = map_each(evens, fn(x) x * x end)
let total = reduce(squared, 0, fn(acc, x) acc + x end)
print("Sum of even squares: {total}") // 120

let sorted = sort_by(nums, fn(a, b) b - a end) // descending
```
```
use csv
use descriptive

let data = read_csv("sales.csv")
let revenue = csv_column(data, "revenue")
print("Mean: {mean(revenue)}")
print("Median: {median(revenue)}")
print("Std Dev: {stddev(revenue)}")
print("P95: {quantile(revenue, 0.95)}")
let r = correlation(revenue, costs) // Pearson r
```
```
// Stream pipelines — data flows through operations
stream photo = tap("input.jpg")
stream enhanced = photo |> brightness(20) |> contrast(1.2) |> saturate(1.1)
emit(enhanced, "output.png")

// Reusable pipeline functions
fn warm_filter: brightness(15) |> contrast(1.1) |> saturate(1.3)
stream result = tap("photo.jpg") |> warm_filter
emit(result, "warm.png")
```
```
use regression
use preprocess
use metrics

// Train/test split with scaling
let split = train_test_split(features, labels, 0.8)
let scaled = standard_scale(split.train_x)

// Linear regression
let model = linear_fit(scaled, split.train_y)
let predictions = linear_predict(model, split.test_x)

// Evaluate
let acc = r_squared(split.test_y, predictions)
print("R-squared: {acc}")
```
AI, media, GPU, ML, stats, science, data, and more. All written in OctoFlow itself.
Transformer inference, GGUF model loading, BPE tokenization, streaming generation, GPU-accelerated layers
221 kernels, Loom Engine, SPIR-V codegen, dispatch chains, resident buffers, deferred execution
Audio DSP (oscillator, FFT, ADSR, effects, mixer), image transforms, video timeline, WAV/BMP/GIF/H.264/MP4/AVI/TTF codecs. No ffmpeg.
Linear & ridge regression, KNN, k-means, neural nets, decision trees, bagging ensembles, metrics
Mean, median, stddev, skewness, kurtosis, Pearson/Spearman, t-test, chi-squared, ANOVA, Sharpe, VaR
Calculus, physics (RK4, spring-damper), signal processing (FIR, convolution), interpolation, matrices
Parallel-array ECS, sprite rendering, physics, spatial hash collision, AI behaviors, scene management, debug draw
16 widgets, canvas drawing, 5 chart types, themes, layout engine, event system (Windows)
Lexer, parser, evaluator, preflight, SPIR-V IR builder, codegen — the compiler written in .flow
CSV, JSON, GGUF, columnar DB with SQL-like queries, ETL pipelines, validation
Stack, queue, min/max heap, weighted directed graph with Dijkstra shortest path
SHA-256, base64, CSPRNG, HTTP client, JSON, URL encoding, TCP/UDP sockets
Process execution, filesystem, logging, config templates, env vars, platform detection, timing
Run octoflow chat and describe what you want. The entire language fits in one LLM prompt. Grammar-constrained decoding and 69 structured error codes drive the auto-repair loop.
4.5 MB binary. Unzip. Run. That's it. No installer, no SDK, no account.
Requires a GPU with a Vulkan driver (NVIDIA, AMD, Intel) · macOS via MoltenVK
Or install from the terminal:

```
# Windows (PowerShell)
irm https://octoflow-lang.github.io/octoflow/install.ps1 | iex

# Linux / macOS
curl -fsSL https://octoflow-lang.github.io/octoflow/install.sh | sh
```