AIAny

Tag

Explore by tags

All

30u30

ASR

ChatGPT

GNN

IDE

RAG

agent-skills

ai

ai-agent

ai-api

ai-api-management

ai-client

ai-coding

ai-demos

ai-deploy

ai-development

ai-framework

ai-image

ai-image-demos

ai-inference

ai-leaderboard

ai-library

ai-rank

ai-serving

ai-tools

ai-train

ai-video

ai-workflow

AIGC

algorithms

alibaba

amazon

android

anthropic

audio

aws

biology

blog

book

bytedance

chatbot

chatgpt

chemistry

claude

claude-code

cli

code

codex

copilot

course

cursor

deepmind

deepseek

depth

devops

diffusers

docker

drug-discovery

electron

embeddings

engineering

evaluation

facebook

finance

foundation

foundation-model

gemini

gemini-cli

gemma

genomics

gitHub

github

go

google

gradient-boosting

grok

groq

huggingface

image

ios

java

javascript

json

LLM

llm

mLOps

math

mcp

mcp-client

mcp-server

meta-ai

meta-pytorch

microsoft

mlops

mobile

multilingual

multimodal

mysql

NLP

nlp

nodejs

nvidia

ocr

ollama

openai

opencode

pandas

paper

physics

pi

plugin

polars

postgres

privacy

prompt-engineering

pwa

python

pytorch

qwen

RL

robotics

rust

science

security

shodan

skillkit

sora

speech

sqlite

ssh

stt

swe

tensorrt

terminal

transformers

translation

tts

tutorial

typescript

vibe-coding

video

vision

vllm

voice

xAI

xai

unsloth/MiniMax-M3-GGUF

2026

unsloth, MiniMaxAI

Provides experimental GGUF-format quantized weights for MiniMax-M3 to run local multimodal (image‑text‑video) inference via llama.cpp or Unsloth Studio. The model is very large (~428B params) and requires GPU offload or large CPU RAM; llama.cpp currently falls back from sparse to dense attention.

multimodal video transformers huggingface llm +5

DreamX-World 1.0: A General-Purpose Interactive World Model

2026

DreamX Team, Yancheng Bai +21

Controllable long-horizon text/image-to-video generation that supports camera navigation, revisits, and promptable events across photorealistic and stylized domains. Introduces camera-aware positional encoding (E-PRoPE), memory-conditioned scene persistence, causal-forcing distillation, and RL alignment to retain camera control and reduce drift.

video vision multimodal RL paper +2

TurboServe: Serving Streaming Video Generation Efficiently and Economically

2026

Shanghai Jiao Tong University, Shengshu Technology +1

Youhe Jiang, Haoxu Wang +6

Serves interactive, long-lived streaming video-generation sessions by jointly scheduling session placement and GPU autoscaling to meet tight per-chunk latency. Combines migration-aware placement, load-driven autoscaling, coalesced chunk processing, GPU–CPU offloading and NCCL GPU–GPU migration; reports ~37% reductions in worst-case per-chunk latency and GPU operating cost.

video ai-video ai-serving ai-inference mLOps +4

LTX‑2.3 IC‑LoRA — 3D render to Photoreal

2026

fal, Lightricks

Lovis Odin

Converts low‑poly 3D viewport or game/CG renders into photorealistic cinematic video while preserving the input's composition, camera motion and layout; offers Light and Strong LoRA variants to trade fidelity for aggressive photorealism.

ai-video ai-image ai-demos ai-image-demos huggingface +1

Gaming Dataset (gaming-1)

2026

markov-ai

Provides ~494.7 hours of trimmed native PC/console gameplay screen recordings organized by game, with per-session clips plus input and per-frame event annotations. Each workflow includes clip.mp4, events.json, frame_events.json, and metadata — suitable for training vision-action, behavior-cloning, and gameplay understanding models.

video ai-video vision multimodal agent-skills +5
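
The per-workflow file set listed for the Gaming Dataset (gaming-1) entry above can be checked with a short script once a copy is on disk. This is a minimal sketch, not the dataset's official loader: the file names clip.mp4, events.json, and frame_events.json come from the description, while the directory layout, the function names `find_workflows` and `load_annotations`, and the decision to leave the JSON schema uninterpreted are all assumptions. The metadata file is not checked because its name is not given.

```python
import json
from pathlib import Path

# File names are taken from the dataset description; the layout on disk is an assumption.
REQUIRED_FILES = ("clip.mp4", "events.json", "frame_events.json")


def find_workflows(root):
    """Return sorted directories under root that contain every required file."""
    root = Path(root)
    found = []
    for clip in root.rglob("clip.mp4"):
        folder = clip.parent
        if all((folder / name).is_file() for name in REQUIRED_FILES):
            found.append(folder)
    return sorted(found)


def load_annotations(folder):
    """Parse both JSON annotation files without assuming anything about their schema."""
    folder = Path(folder)
    with open(folder / "events.json", encoding="utf-8") as fh:
        events = json.load(fh)
    with open(folder / "frame_events.json", encoding="utf-8") as fh:
        frame_events = json.load(fh)
    return {
        "clip": folder / "clip.mp4",
        "events": events,
        "frame_events": frame_events,
    }
```

Calling `find_workflows("path/to/gaming-1")` (a hypothetical local path) lists the complete workflow folders, and `load_annotations` can then be applied to each one. Folders that lack any of the three files are skipped rather than raising an error, which seems the safer default for a first pass over a large download.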