Best learning resources for AI
CodeLayer is an open-source IDE that orchestrates AI coding agents to tackle hard problems in complex codebases. Built on Claude Code, it features battle-tested workflows, keyboard-first interfaces for speed, advanced context engineering for team scaling, and multi-Claude parallel sessions.
OpenCode is an open-source AI coding agent built for the terminal. It features built-in agents (build, plan, general), LSP support, multi-provider LLM compatibility (75+ providers, including Claude, GPT, Gemini, and local models), a native TUI, multi-session support, share links, and a privacy-first design. No account needed; 35k+ stars.
LEANN is the world's smallest vector index, enabling RAG on everything from documents to chat histories with 97% storage savings, no accuracy loss, and 100% privacy on personal devices. It uses graph-based selective recomputation and high-degree preserving pruning for lightweight, scalable semantic search.
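The core idea behind that storage saving, as described, is to keep only the raw text and a proximity graph, recomputing embeddings on demand for the few nodes a search actually visits instead of storing a full embedding matrix. A toy sketch of that pattern (the hash-based `embed` function, the tiny graph, and the greedy walk are illustrative stand-ins, not LEANN's actual API or algorithm):

```python
import hashlib

def embed(text: str, dim: int = 8) -> list[float]:
    # Toy stand-in for a real embedding model: a deterministic
    # pseudo-vector derived from the text's hash.
    h = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in h[:dim]]

def dist(a: list[float], b: list[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Only the documents and a neighbor graph are persisted; there is
# no stored embedding matrix, which is where the savings come from.
docs = {0: "cats purr", 1: "dogs bark", 2: "rust borrow checker", 3: "python asyncio"}
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}

def search(query: str, entry: int = 0) -> int:
    q = embed(query)
    cur, cur_d = entry, dist(embed(docs[entry]), q)
    while True:
        # Selective recomputation: embeddings are materialized only
        # for the handful of nodes the greedy walk touches.
        d, nxt = min((dist(embed(docs[n]), q), n) for n in graph[cur])
        if d >= cur_d:
            return cur
        cur, cur_d = nxt, d
```

The trade is compute at query time for storage at rest, which is why the approach suits personal devices with small corpora.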
Magentic-UI is a research prototype from Microsoft Research for a human-centered AI web agent. It automates complex web and coding tasks while keeping users in control, revealing plans before execution, allowing guidance, and requiring approvals for sensitive actions. Key features include co-planning, action guards, plan learning, and integration with models like GPT-4o and Fara-7B.
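The "action guard" pattern described above, where sensitive steps pause for explicit user approval before executing, can be sketched generically. The `SENSITIVE` set, action names, and `approve` callback here are illustrative assumptions, not Magentic-UI's API:

```python
from typing import Callable

# Hypothetical list of actions that require a human in the loop.
SENSITIVE = {"submit_form", "delete_file", "make_purchase"}

def run_action(name: str, action: Callable[[], str],
               approve: Callable[[str], bool]) -> str:
    # Action guard: sensitive steps are revealed to the user and
    # executed only after explicit approval.
    if name in SENSITIVE and not approve(name):
        return f"{name}: blocked by user"
    return action()

# Simulated user who approves everything except purchases.
user = lambda name: name != "make_purchase"

print(run_action("scroll_page", lambda: "scrolled", user))   # runs freely
print(run_action("make_purchase", lambda: "bought", user))   # blocked
```

The same gate generalizes to co-planning: show the full plan first, then interpose an approval check at each sensitive step.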
WhisperLiveKit is an ultra-low-latency, self-hosted speech-to-text toolkit with speaker identification. Powered by leading simultaneous speech research like Simul-Whisper and WhisperStreaming, it enables intelligent buffering and incremental processing for real-time transcription, translation across 200 languages, and speaker diarization. Ideal for meeting notes, accessibility tools, and content creation.
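The incremental processing such streaming systems rely on can be illustrated with a LocalAgreement-style commit policy (the approach popularized by WhisperStreaming): emit only the prefix on which consecutive partial hypotheses agree, keeping the still-unstable tail buffered. This standalone sketch mocks the recognizer with hard-coded word lists:

```python
def agreed_prefix(prev: list[str], cur: list[str]) -> list[str]:
    # Keep the longest common prefix of two consecutive hypotheses;
    # words past the first disagreement may still be revised.
    out = []
    for a, b in zip(prev, cur):
        if a != b:
            break
        out.append(a)
    return out

# Successive partial hypotheses from a streaming recognizer (mocked).
hyps = [
    ["the"],
    ["the", "quick", "brow"],
    ["the", "quick", "brown", "fox"],
]

committed: list[str] = []
prev: list[str] = []
for cur in hyps:
    stable = agreed_prefix(prev, cur)
    # Emit only the words that became stable since the last step.
    committed.extend(stable[len(committed):])
    prev = cur

print(" ".join(committed))  # prints "the quick"
```

Note how "brow" is never emitted: it disagrees with the later hypothesis "brown", so the policy holds it back, trading a little latency for stability.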
VibeVoice is Microsoft's open-source frontier voice AI framework for generating expressive, long-form, multi-speaker conversational audio (e.g., podcasts) from text. It supports up to 90 minutes of speech with up to 4 distinct speakers. Key innovations include continuous speech tokenizers operating at a 7.5 Hz frame rate and next-token diffusion with LLMs for context and high-fidelity acoustics. The recently released VibeVoice-Realtime-0.5B adds real-time streaming TTS with ~300 ms latency.
Hello-Agents is a systematic open-source tutorial from the Datawhale community, dedicated to building AI Native Agents. It covers agent fundamentals, history, large language model basics, classic paradigms like ReAct, low-code platforms such as Coze, mainstream frameworks like LangGraph, and custom framework development. Advanced topics include memory and retrieval, context engineering, agent communication protocols, Agentic-RL training, and performance evaluation, culminating in case studies like intelligent travel assistants and cyber towns, transforming learners from LLM users to agent builders.
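Among the classic paradigms the tutorial covers, ReAct interleaves reasoning ("thought") with tool calls ("action") and their results ("observation") in a loop. A minimal sketch with a scripted stand-in for the LLM policy (the tool names and policy logic are illustrative, not from the tutorial):

```python
# Tools the agent can call; a real agent would have search, code, etc.
tools = {
    "calc": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def policy(transcript: str):
    # Scripted stand-in for an LLM: given the transcript so far,
    # return ("act", tool, input) or ("finish", answer).
    if "Observation" not in transcript:
        return ("act", "calc", "6 * 7")
    return ("finish", transcript.rsplit("Observation: ", 1)[1].strip())

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        decision = policy(transcript)
        if decision[0] == "finish":
            return decision[1]
        _, tool, arg = decision
        obs = tools[tool](arg)  # action -> observation
        transcript += f"Action: {tool}[{arg}]\nObservation: {obs}\n"
    return "gave up"

print(react("What is 6 times 7?"))  # prints "42"
```

The loop structure is the paradigm's whole point: each observation is fed back into the transcript so the next "thought" can condition on it.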
This paper demonstrates the zero-shot learning and reasoning abilities of the generative video model Veo 3, paralleling the evolution of Large Language Models (LLMs) in natural language processing. Veo 3 excels in diverse visual tasks without explicit training, such as object segmentation, edge detection, image editing, understanding physical properties, recognizing affordances, and simulating tool use, enabling early visual reasoning like maze solving and symmetry detection.
Tinker Cookbook is an open-source library from Thinking Machines Lab for customizing language models via the Tinker API. It offers realistic fine-tuning examples for supervised learning, reinforcement learning, chat, math reasoning, preference learning, tool use, prompt distillation, and multi-agent setups, along with utilities for rendering, hyperparameters, and evaluation.
A Next.js web application that integrates AI capabilities with draw.io diagrams, letting you create, modify, and enhance diagrams through natural-language commands and AI-assisted visualization. Features include image replication, diagram history, interactive chat, AWS architecture support, and animated connectors.
LightX2V is a lightweight, high-performance video-generation inference framework. It unifies multiple state-of-the-art video generation techniques in one platform, supporting tasks including text-to-video (T2V) and image-to-video (I2V); the name X2V denotes transforming an input modality X (such as text or images) into video output (V).
Deepagents is an open-source agent harness built on LangChain and LangGraph. It equips agents with planning tools, a filesystem backend, sub-agent spawning, and middleware for handling complex, long-horizon tasks reliably and cost-effectively.
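The harness pattern described — planning, a scratch filesystem, and spawning sub-agents with fresh context for long-horizon work — can be sketched generically in pure Python. This is an illustration of the pattern only; it is not deepagents' actual API (class names, `spawn`, and the plan steps are all invented for the example):

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    # Stand-in for the filesystem backend: a shared notes store the
    # parent uses to collect sub-agent results.
    notes: dict = field(default_factory=dict)

    def spawn(self, name: str) -> "Agent":
        # Sub-agents start with a fresh context, so a long task
        # doesn't overflow the parent's window.
        return Agent(name=name)

    def do(self, step: str) -> str:
        # Stand-in for an LLM call executing one step.
        return f"done: {step}"

    def run(self, task: str) -> str:
        # Stand-in for a planning tool: decompose, then delegate.
        plan = [f"research {task}", f"summarize {task}"]
        for step in plan:
            worker = self.spawn(f"{self.name}/{step}")
            self.notes[step] = worker.do(step)
        return "; ".join(self.notes.values())

print(Agent("root").run("topic"))  # prints "done: research topic; done: summarize topic"
```

The cost-effectiveness claim follows from this shape: the parent holds only plan steps and results, not every sub-agent's full transcript.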