ONNX Runtime

Microsoft’s high-performance, cross-platform inference engine for ONNX and GenAI models.

Introduction

Overview

ONNX Runtime accelerates models exported from more than 20 frameworks by applying graph optimizations and dispatching work to hardware execution providers (EPs) such as CUDA, DNNL, DirectML, ROCm, and CoreML.
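As a rough illustration of how graph optimization and execution providers fit together in the Python API, the sketch below picks providers from those the installed build reports and enables full graph optimization. The helper names `choose_providers` and `create_session`, the preference order, and the model path are assumptions made for this example; only the `onnxruntime` calls themselves (`SessionOptions`, `GraphOptimizationLevel`, `get_available_providers`, `InferenceSession`) come from the library.

```python
from typing import List, Sequence

# Execution-provider names as registered by ONNX Runtime. The ordering is an
# illustrative assumption, not a recommendation from the project.
PREFERRED_PROVIDERS = [
    "CUDAExecutionProvider",
    "ROCMExecutionProvider",
    "DmlExecutionProvider",
    "CoreMLExecutionProvider",
    "DnnlExecutionProvider",
    "CPUExecutionProvider",
]


def choose_providers(
    available: Sequence[str],
    preferred: Sequence[str] = PREFERRED_PROVIDERS,
) -> List[str]:
    """Return the preferred providers that are available, in preference order."""
    available_set = set(available)
    chosen = [name for name in preferred if name in available_set]
    # Keep the CPU provider last so operators unsupported elsewhere still run.
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def create_session(model_path: str):
    """Hypothetical helper: build a session with all graph optimizations on."""
    # Imported lazily so choose_providers() can be used without onnxruntime.
    import onnxruntime as ort

    options = ort.SessionOptions()
    options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, sess_options=options, providers=providers)
```

Calling `create_session("model.onnx")` (a placeholder path) would return a session whose `run` method executes the model on the first listed provider that supports each operator, falling back to the CPU for the rest.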

Key Capabilities
  • Training and inference APIs (Python, C#, C++)
  • ORT-GenAI (ONNX Runtime GenAI) with FlashAttention kernels
  • Mobile ahead-of-time (AOT) and WebAssembly targets

Information

Categories