Overview
ONNX Runtime accelerates models from >20 frameworks through graph optimizers and hardware execution providers (EPs) such as CUDA, DNNL, DirectML, ROCm, and CoreML.
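As a rough illustration of how graph optimization and execution providers fit together in the Python API, the sketch below picks EPs in a preferred order and opens a session with full graph optimization enabled. The helper names (`pick_providers`, `make_session`), the `PREFERRED` ordering, and the model path are hypothetical and chosen only for this example; the provider strings are the identifiers ONNX Runtime reports for the EPs listed above, and which ones are actually available depends on the installed build.

```python
# Hypothetical preference order for this sketch: accelerators first, CPU last.
PREFERRED = (
    "CUDAExecutionProvider",
    "ROCMExecutionProvider",
    "DmlExecutionProvider",      # DirectML
    "CoreMLExecutionProvider",
    "DnnlExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available, preferred=PREFERRED):
    """Return the preferred EPs that are available, always ending with a CPU fallback."""
    chosen = [p for p in preferred if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def make_session(model_path):
    """Open an inference session (assumes the onnxruntime package is installed)."""
    import onnxruntime as ort  # imported lazily so pick_providers works without it

    options = ort.SessionOptions()
    # Enable every graph optimization level the runtime offers.
    options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, sess_options=options, providers=providers)
```

Calling `make_session("model.onnx")` (the path is a placeholder) would then try the listed EPs in order; keeping the CPU provider last is one common way to make sure nodes an accelerator cannot handle still have somewhere to run.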
Key Capabilities
- Training & inference APIs (Python/C#/C++)
- ONNX Runtime GenAI (ORT-GenAI) with FlashAttention kernels
- Mobile AOT & WebAssembly targets