vLLM-Omni is an open-source framework from the vLLM community for efficient inference and serving of omni-modality models. It extends vLLM's fast autoregressive inference to multi-modal data (text, image, video, audio), non-autoregressive architectures, and heterogeneous outputs, integrates with Hugging Face models, and offers pipeline parallelism, KV-cache optimizations, and an OpenAI-compatible API.
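
Because the server exposes an OpenAI-compatible API, a deployment can typically be queried with the standard OpenAI Python client. The sketch below is illustrative only: the base URL/port, the model identifier, and the image URL are assumptions, not values prescribed by vLLM-Omni.

```python
# Minimal sketch: sending a multi-modal chat request to an OpenAI-compatible endpoint.
# The endpoint address, model name, and image URL are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-Omni-7B",  # assumed model identifier, for illustration only
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/sample.jpg"},  # placeholder
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```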