Search
Collection
Category
Tag
Daily AI

Tag

Explore by tags

AIAny

Curated AI Resources for Everyone

All

30u30

ASR

ChatGPT

GNN

IDE

RAG

agent-skills

ai

ai-agent

ai-api

ai-api-management

ai-client

ai-coding

ai-demos

ai-deploy

ai-development

ai-framework

ai-image

ai-image-demos

ai-inference

ai-leaderboard

ai-library

ai-rank

ai-serving

ai-tools

ai-train

ai-video

ai-workflow

AIGC

algorithms

alibaba

amazon

android

anthropic

audio

aws

biology

blog

book

bytedance

chatbot

chatgpt

chemistry

claude

claude-code

cli

code

codex

copilot

course

cuda

cursor

deepmind

deepseek

depth

devops

diffusers

docker

drug-discovery

electron

embeddings

engineering

evaluation

facebook

finance

flow-matching

foundation

foundation-model

gemini

gemini-cli

gemma

genomics

gitHub

github

go

google

gradient-booting

grok

groq

huggingface

image

ios

java

javascript

json

kimi

llama.cpp

LLM

llm

lora

mLOps

math

mcp

mcp-client

mcp-server

meta-ai

meta-pytorch

metal

microsoft

mlops

mobile

multilingual

multimodal

mysql

NLP

nlp

nodejs

numpy

nvidia

ocr

ollama

openai

opencode

pandas

paper

physics

pi

plugin

polars

postgres

privacy

prompt-engineering

pwa

python

pytorch

qwen

react

reasoning

RL

robotics

rust

science

security

segmentation

shodan

skillkit

sora

speech

sqlite

ssh

stt

swe

tensorrt

terminal

transformers

translation

tts

tutorial

typescript

vibe-coding

video

vision

vllm

voice

windsurf

xAI

xai

AI Leaderboard·2023

VLMEvalKit

open-compass (OpenCompass community)·OpenCompass, Shanghai AI Laboratory

Runs one-command evaluation of vision-language models across 80+ multimodal benchmarks, handling data download, inference, and metric scoring in a single pass. Supports 220+ LMMs; adding a new model means writing one generate_inner() function.

#vision #ai-leaderboard #huggingface #github #ai-tools +1

AI Deploy·2023

LitServe | Deploy any AI model Lightning fast

Lightning AI

Builds custom AI inference servers in pure Python on top of FastAPI, keeping full control over request logic while batching, GPU autoscaling, streaming, and OpenAI-spec endpoints come built in. Claims a 2x+ throughput edge over plain FastAPI.

#mlops #ai-inference #pytorch #docker #vllm +3

AI Infra·2023

Flash Linear Attention

fla-org, Songlin Yang +1·MIT CSAIL, Soochow University

Triton kernels and PyTorch layers for linear-attention, state-space, and sparse-attention token mixers (GLA, RWKV, Mamba2, GSA) as drop-in replacements for multihead attention. Runs on NVIDIA, AMD, and Intel GPUs with Hugging Face support.

#pytorch #github #ai-library #llm #huggingface +1

AI Infra·2023

LLM Transparency Tool

facebookresearch, Igor Tufanov +3·Meta AI

Traces how Transformer LLMs route information from input to output, attributing each block's effect to individual attention heads and feed-forward neurons. Click any edge to see what a head promotes or suppresses in vocabulary space.

#github #llm #NLP #xai #ai-tools +3

AI Infra·2024

SGLang

LMSYS, SGLang contributors·LMSYS

Serves large language and multimodal models with low latency and high throughput using RadixAttention, continuous batching, structured outputs, parallelism, quantization, and broad accelerator support.

#ai-serving #ai-inference #llm #pytorch #huggingface +1

AI Audio·2024

GPT-SoVITS-WebUI

RVC-Boss

Clones a voice from a 5-second sample for zero-shot TTS, or fine-tunes on ~1 minute of audio for few-shot synthesis. Covers Chinese, English, Japanese, Korean, and Cantonese, with a WebUI bundling vocal separation, ASR, and dataset labeling.

#github #audio #pytorch #ASR #huggingface +3

AI Audio·2024

ebook2audiobook

Drew Thomasson

Converts e-books (epub, pdf, mobi, docx, and more) into chapter-aware audiobooks, with optional zero-shot voice cloning. Bundles eight TTS engines, including XTTSv2 and Bark, and covers 1,158 languages via Meta's MMS. Runs on CPU or GPU.

#audio #gitHub #ai-tools #python #huggingface +1

AI Agent·2024

MobileAgent

X-PLUG (Tongyi Lab, Alibaba Group)·Tongyi Lab, Alibaba Group

A family of GUI agents that operate phones, desktops, and browsers by perceiving the screen visually rather than reading app code. Ships open GUI-Owl vision-language models (7B/32B) plus a multi-agent framework for planning, reflection, and tool use.

#llm #vision #agent-skills #github #huggingface +2

Embodied AI·2024

LeRobot

Hugging Face

A PyTorch-native, hardware-agnostic stack for robot learning: data collection, training, and deployment across 11+ robots, from SO100 to Unitree G1. Includes imitation, RL, and vision-language-action policies (ACT, Diffusion, Pi0, SmolVLA).

#robotics #huggingface #pytorch #gitHub #ai-library +4

AI Model·2024

MiniCPM-V

OpenBMB, ModelBest +1

Pocket-sized multimodal LLM for efficient image and video understanding on mobile and edge devices, featuring mixed 4x/16x visual-token compression (MiniCPM-V 4.6), compact 1.3B variants, and ready-made guides for iOS/Android/HarmonyOS deployment.

#multimodal #vision #video #LLM #huggingface +5

Chatbot·2024

WeClone

xming521 (GitHub)

Creates personalized digital avatars (AI twins) by fine-tuning LLMs on a user's chat history and binding them to chatbots. Provides an end-to-end pipeline: chat export, preprocessing with privacy filters, SFT/LoRA training, and deployment to Telegram, Discord, or Slack. Works best with larger models and substantial chat data.

#chatbot #LLM #ai-train #ai-deploy #privacy +4

AI Train·2024

MiniCPM-o

OpenBMB·OpenBMB, ModelBest

Runs GPT-4o-class vision, speech, and full-duplex audio-video conversation on a 9B model small enough to deploy on phones and tablets. The 4.5 release scores 77.6 on OpenCompass and adds real-time bilingual voice with voice cloning.

#foundation-model #llm #pytorch #vision #audio +5