Unified multimodal LLM for enterprise workflows: ingests video, audio, image and text to perform transcription, OCR, Q&A, summarization and long-context reasoning. Provides BF16/FP8/NVFP4 weights and integrations with vLLM, TensorRT-LLM and other runtimes.