From model definition to in-browser inference. Compile, optimize, and deploy neural networks with a full compiler stack.
High-level graph IR for optimization, low-level scheduled IR for code generation. Immutable data structures preserve the full pipeline history.
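As a rough illustration of how immutable IR values can preserve pipeline history, here is a minimal sketch. The names `GraphNode`, `Graph`, `Pass`, `runPipeline`, and `lowerRelu6` are hypothetical and are not taken from the project's actual API; the point is only that each pass returns a new graph and earlier snapshots stay untouched.

```typescript
// Hypothetical IR types; the project's real IR is not shown in this text.
interface GraphNode {
  readonly id: string;
  readonly op: string;
  readonly inputs: readonly string[];
}

interface Graph {
  readonly nodes: readonly GraphNode[];
  readonly outputs: readonly string[];
}

interface Pass {
  readonly name: string;
  readonly run: (g: Graph) => Graph;
}

interface HistoryEntry {
  readonly pass: string;
  readonly graph: Graph;
}

// Runs passes in order. Every pass returns a new graph, so each
// intermediate snapshot remains available in the returned history.
function runPipeline(initial: Graph, passes: readonly Pass[]): HistoryEntry[] {
  const history: HistoryEntry[] = [{ pass: "input", graph: initial }];
  let current = initial;
  for (const pass of passes) {
    current = pass.run(current);
    history.push({ pass: pass.name, graph: current });
  }
  return history;
}

// Example pass (assumed for illustration): rewrite "relu6" ops to "clamp"
// by building new node objects instead of mutating the input graph.
const lowerRelu6: Pass = {
  name: "lower-relu6",
  run: (g) => ({
    ...g,
    nodes: g.nodes.map((n) => (n.op === "relu6" ? { ...n, op: "clamp" } : n)),
  }),
};

const input: Graph = {
  nodes: [
    { id: "x", op: "input", inputs: [] },
    { id: "y", op: "relu6", inputs: ["x"] },
  ],
  outputs: ["y"],
};

const pipelineHistory = runPipeline(input, [lowerRelu6]);
```

Because the original graph is never modified, `pipelineHistory` holds both the input graph and the rewritten one, which makes it straightforward to diff or inspect any stage after the fact.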
Shape inference, constant folding, dead code elimination, operator fusion, quantization, layout optimization, and memory planning.
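Two of the passes listed above, constant folding and dead code elimination, can be sketched on a toy scalar graph. This is a simplified illustration under stated assumptions: the `IrNode` type, the function names, and the requirement that nodes arrive in topological order are all hypothetical, and real tensor graphs would fold tensors rather than single numbers.

```typescript
// Hypothetical node type for a toy scalar graph (not the project's IR).
type IrNode =
  | { kind: "const"; id: string; value: number }
  | { kind: "input"; id: string }
  | { kind: "add" | "mul"; id: string; lhs: string; rhs: string };

// Constant folding: replace an op whose operands are both known constants
// with a single constant node. Assumes nodes are in topological order.
function foldConstants(nodes: readonly IrNode[]): IrNode[] {
  const known = new Map<string, number>();
  const out: IrNode[] = [];
  for (const n of nodes) {
    if (n.kind === "const") {
      known.set(n.id, n.value);
      out.push(n);
    } else if (n.kind === "input") {
      out.push(n);
    } else {
      const a = known.get(n.lhs);
      const b = known.get(n.rhs);
      if (a !== undefined && b !== undefined) {
        const value = n.kind === "add" ? a + b : a * b;
        known.set(n.id, value);
        out.push({ kind: "const", id: n.id, value });
      } else {
        out.push(n);
      }
    }
  }
  return out;
}

// Dead code elimination: keep only nodes reachable from the graph outputs.
function eliminateDeadCode(nodes: readonly IrNode[], outputs: readonly string[]): IrNode[] {
  const byId = new Map<string, IrNode>();
  for (const n of nodes) byId.set(n.id, n);
  const live = new Set<string>();
  const stack: string[] = [...outputs];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (live.has(id)) continue;
    live.add(id);
    const n = byId.get(id);
    if (n !== undefined && (n.kind === "add" || n.kind === "mul")) {
      stack.push(n.lhs, n.rhs);
    }
  }
  return nodes.filter((n) => live.has(n.id));
}

// y = x + (2 * 3), plus one node that nothing consumes.
const exampleGraph: IrNode[] = [
  { kind: "input", id: "x" },
  { kind: "const", id: "two", value: 2 },
  { kind: "const", id: "three", value: 3 },
  { kind: "mul", id: "six", lhs: "two", rhs: "three" },
  { kind: "add", id: "y", lhs: "x", rhs: "six" },
  { kind: "add", id: "unused", lhs: "x", rhs: "x" },
];

const optimizedGraph = eliminateDeadCode(foldConstants(exampleGraph), ["y"]);
```

Folding turns `six` into a constant with value 6, after which `two`, `three`, and `unused` are no longer reachable from `y` and are dropped, leaving the nodes `x`, `six`, and `y`. Running folding before elimination matters here: the folded operands only become dead once the op that used them has been replaced.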
Generate WGSL compute shaders for WebGPU, WASM dispatch schedules, or pure JavaScript for maximum compatibility.
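To give a sense of what generating a WGSL compute shader might involve, the sketch below emits shader source for a single elementwise operation. The function `emitElementwiseWgsl`, its parameters, and the buffer names `lhs`, `rhs`, and `dst` are assumptions made for illustration; the project's actual code generator works from its scheduled IR and is not shown in this text.

```typescript
// Hypothetical op set for this sketch only.
type ElementwiseOp = "add" | "mul";

// Emits WGSL source for dst[i] = lhs[i] <op> rhs[i] over f32 storage buffers.
// One invocation handles one element; the bounds check guards the final,
// possibly partial, workgroup.
function emitElementwiseWgsl(op: ElementwiseOp, workgroupSize: number = 64): string {
  const operator = op === "add" ? "+" : "*";
  return [
    "@group(0) @binding(0) var<storage, read> lhs: array<f32>;",
    "@group(0) @binding(1) var<storage, read> rhs: array<f32>;",
    "@group(0) @binding(2) var<storage, read_write> dst: array<f32>;",
    "",
    `@compute @workgroup_size(${workgroupSize})`,
    "fn main(@builtin(global_invocation_id) gid: vec3<u32>) {",
    "  let i = gid.x;",
    "  if (i < arrayLength(&dst)) {",
    `    dst[i] = lhs[i] ${operator} rhs[i];`,
    "  }",
    "}",
  ].join("\n");
}

const addShaderSource = emitElementwiseWgsl("add");
```

In a browser, a string like `addShaderSource` would be handed to `GPUDevice.createShaderModule({ code })` and dispatched with enough workgroups to cover the buffer length. A production generator would also need to handle other data types, buffer layouts, and fused operators, none of which this sketch attempts.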
Run compiled models directly in the browser with WebGPU acceleration, WASM fallback, and JS reference execution.
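The fallback order described above (WebGPU first, then WASM, then plain JavaScript) could be selected with logic along these lines. This is a sketch, not the runtime's real API: `Backend`, `Capabilities`, `detectCapabilities`, and `selectBackend` are hypothetical names. The only platform facts it relies on are that WebGPU is exposed as `navigator.gpu` and WebAssembly as the global `WebAssembly` object.

```typescript
type Backend = "webgpu" | "wasm" | "js";

interface Capabilities {
  webgpu: boolean;
  wasm: boolean;
}

// Probes the current environment without assuming browser-only globals exist.
// Note: the presence of navigator.gpu does not guarantee a usable adapter;
// navigator.gpu.requestAdapter() can still resolve to null, so a real runtime
// would confirm that before committing to the WebGPU backend.
function detectCapabilities(): Capabilities {
  const g = globalThis as unknown as { navigator?: object; WebAssembly?: object };
  const nav = g.navigator;
  return {
    webgpu: typeof nav === "object" && nav !== null && "gpu" in nav,
    wasm: typeof g.WebAssembly === "object" && g.WebAssembly !== null,
  };
}

// Picks the fastest available backend, falling back to the JS reference path.
function selectBackend(caps: Capabilities): Backend {
  if (caps.webgpu) return "webgpu";
  if (caps.wasm) return "wasm";
  return "js";
}

const chosenBackend: Backend = selectBackend(detectCapabilities());
```

Keeping `selectBackend` a pure function of a `Capabilities` value separates the environment probe from the policy, so the fallback order can be exercised in tests without a GPU or a browser.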