Project AIRI — Overview
Project AIRI is an open-source initiative to build "cyber living" virtual characters (AI waifu / digital companions / VTubers) that can chat, act, and play games autonomously. Heavily inspired by Neuro-sama, AIRI focuses on providing a self-hosted, extensible stack that integrates modern LLMs, real-time audio, model orchestration, memory/RAG, and multi-platform frontends.
Key features
- Brain: multi-provider LLM integrations (OpenAI, Anthropic, vLLM, Google Gemini, Ollama, Mistral, Groq, and many others via the xsAI adapter), prompt engineering, and agent orchestration.
- Game agents: out-of-the-box capabilities for playing Minecraft and Factorio (includes demos and separate subprojects for deeper integration).
- Real-time audio: browser and Discord audio input, client-side speech recognition, VAD/talking detection, and TTS support (e.g., ElevenLabs); a minimal talking-detection sketch follows this list.
- Avatar & presentation: supports VRM and Live2D models with animations (auto-blink, auto look-at, idle eye movement) and Web-based rendering using WebGPU / WebAssembly where applicable; a VRM loading sketch follows this list.
- Memory & retrieval: in-browser DB support (DuckDB WASM / PGlite) and RAG-style systems; the memory subsystem is in active development (an in-browser storage sketch follows this list).
- Cross-platform: Stage Web (browser), desktop (Tauri / native with CUDA/Metal support), and mobile (PWA / Capacitor) targets.
- Extensibility: plugin and subproject architecture (xsai, unspeech, duckdb-wasm, tauri plugins, etc.), encouraging community contributions (artists, modellers, engineers).
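
To make the talking-detection idea concrete, here is a minimal energy-gate sketch built on the standard Web Audio API. It is illustrative only: AIRI's actual pipeline uses dedicated VAD/ASR components, and the RMS threshold and the onTalkingChange callback below are invented for the example.

```ts
// Minimal browser talking-detection sketch using the Web Audio API.
// Illustrative only: the 0.02 RMS threshold and the callback are not AIRI APIs.
async function watchMicrophone(onTalkingChange: (talking: boolean) => void) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  const ctx = new AudioContext()
  const analyser = ctx.createAnalyser()
  analyser.fftSize = 2048
  ctx.createMediaStreamSource(stream).connect(analyser)

  const samples = new Float32Array(analyser.fftSize)
  let talking = false

  const tick = () => {
    analyser.getFloatTimeDomainData(samples)
    // Root-mean-square energy of the current audio frame.
    const rms = Math.sqrt(samples.reduce((sum, s) => sum + s * s, 0) / samples.length)
    const nowTalking = rms > 0.02 // illustrative threshold
    if (nowTalking !== talking) {
      talking = nowTalking
      onTalkingChange(talking)
    }
    requestAnimationFrame(tick)
  }
  tick()
}
```

An energy gate like this can drive lip-sync or pause speech recognition while the character itself is speaking; a production setup would typically rely on a trained VAD model instead.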
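For the avatar side, the sketch below shows the generic pattern for loading a VRM model into a three.js scene with @pixiv/three-vrm and driving a blink expression each frame. The model path and blink weight are placeholders, and AIRI's own stage renderer may wire this differently.

```ts
import type { Scene } from 'three'
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js'
import { VRMLoaderPlugin, type VRM } from '@pixiv/three-vrm'

// Generic VRM loading sketch; the model path is a placeholder.
async function loadAvatar(scene: Scene): Promise<VRM> {
  const loader = new GLTFLoader()
  loader.register(parser => new VRMLoaderPlugin(parser))

  const gltf = await loader.loadAsync('/models/avatar.vrm')
  const vrm = gltf.userData.vrm as VRM
  scene.add(vrm.scene)
  return vrm
}

// Called once per frame: set expressions (e.g. a blink weight in 0..1), then advance the model.
function updateAvatar(vrm: VRM, delta: number, blinkWeight: number) {
  vrm.expressionManager?.setValue('blink', blinkWeight)
  vrm.update(delta)
}
```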
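And for memory, an in-browser Postgres such as PGlite can keep conversation history entirely client-side. The table layout below is invented for illustration; AIRI's actual memory/RAG schema is still evolving.

```ts
import { PGlite } from '@electric-sql/pglite'

// In-browser memory-store sketch with PGlite (Postgres compiled to WASM).
// The schema is illustrative, not AIRI's actual memory layout.
const db = new PGlite() // in-memory; use 'idb://airi-memory' to persist in IndexedDB

await db.exec(`
  CREATE TABLE IF NOT EXISTS messages (
    id BIGSERIAL PRIMARY KEY,
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT now()
  );
`)

export async function remember(role: 'user' | 'assistant', content: string) {
  await db.query('INSERT INTO messages (role, content) VALUES ($1, $2)', [role, content])
}

export async function recentMessages(limit = 20) {
  const result = await db.query(
    'SELECT role, content FROM messages ORDER BY created_at DESC LIMIT $1',
    [limit],
  )
  return result.rows
}
```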
Architecture highlights
- Uses modern Web tech for UI and parts of the runtime (WebGPU, Web Audio, Web Workers, WebAssembly) while offering native paths that leverage CUDA/Apple Metal for heavier model inference via underlying runtimes (e.g., candle, vLLM).
- xsAI acts as the adapter layer to multiple LLM providers and runtimes, enabling flexible provider switching and hybrid local/cloud setups (see the sketch after this list).
- Modular subprojects (e.g., unspeech, hfup, inventory, mcp-launcher) let AIRI handle audio pipelines, model bundling/deployment, centralized model catalogs, and MCP server tooling.
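
Because providers are reached through OpenAI-compatible endpoints, switching between them is largely a matter of changing the base URL and model name. The sketch below follows the generateText-style API that xsAI documents; exact package and option names should be checked against the current xsAI docs, and the local Ollama endpoint shown is the usual default rather than anything AIRI configures for you.

```ts
import process from 'node:process'
import { generateText } from '@xsai/generate-text'

// Provider-switching sketch in the style of xsAI's generateText API.
// Verify package and option names against the current xsAI documentation.
const providers = {
  openai: {
    baseURL: 'https://api.openai.com/v1/',
    model: 'gpt-4o',
    apiKey: process.env.OPENAI_API_KEY ?? '',
  },
  ollama: {
    baseURL: 'http://localhost:11434/v1/', // usual local Ollama default
    model: 'llama3.2',
    apiKey: '', // no key needed for a local runtime
  },
}

async function ask(provider: keyof typeof providers, prompt: string) {
  const { baseURL, model, apiKey } = providers[provider]
  const { text } = await generateText({
    apiKey,
    baseURL,
    model,
    messages: [
      { role: 'system', content: 'You are AIRI, a cheerful digital companion.' }, // illustrative prompt
      { role: 'user', content: prompt },
    ],
  })
  return text
}

// The call path is identical whether the model runs in the cloud or on the local machine.
console.log(await ask('ollama', 'Say hello to chat!'))
```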
Developer & deployment notes
- Repo includes development scripts and stages:
  - pnpm i
  - pnpm dev (Stage Web)
  - pnpm dev:tamagotchi (desktop Tamagotchi stage)
  - pnpm dev:capacitor (mobile/Capacitor)
- Documentation and developer guides are provided on the official docs site; contributors are welcome in roles ranging from UI design and art to reinforcement learning (RL) and inference engineering.
Supported providers & integrations
AIRI is designed to interoperate with many LLM and service providers (OpenAI, Anthropic, vLLM, Google Gemini, Ollama, Mistral, Groq, xAI, and others), uses services such as ElevenLabs for TTS, integrates with Discord and Telegram for chat, and exposes connectors for RAG and embedding stores.
Use cases
- Personal/self-hosted VTuber or digital companion that streams, chats, and interacts with viewers.
- Research and prototyping platform for multi-modal agents (speech + vision + actions in games).
- Education/demo platform to show agent orchestration, RAG/memory, and multi-provider LLM usage.
Community & ecosystem
- Active GitHub organization and many subprojects (unspeech, airi-factorio, xsai-transformers, duckdb-wasm, etc.).
- Public documentation site, Discord server, social media presence, and devlogs tracking progress and releases.
Status & roadmap notes
- The project is under active development: many components already work (game playing, audio I/O, avatar control), while some subsystems (memory/alaya, pure in-browser full-model inference) remain work in progress.
Quick links
- Repository: GitHub (this URL)
- Docs / Try it: AIRI official site and docs
(End of introduction.)
