AIAny

Tag

Explore by tags

Curated AI Resources for Everyone

All

30u30

ASR

ChatGPT

GNN

IDE

RAG

agent-skills

ai

ai-agent

ai-api

ai-api-management

ai-client

ai-coding

ai-demos

ai-deploy

ai-development

ai-framework

ai-image

ai-image-demos

ai-inference

ai-leaderboard

ai-library

ai-rank

ai-serving

ai-tools

ai-train

ai-video

ai-workflow

AIGC

algorithms

alibaba

amazon

android

anthropic

audio

aws

biology

blog

book

bytedance

chatbot

chatgpt

chemistry

claude

claude-code

cli

code

codex

copilot

course

cursor

deepmind

deepseek

depth

devops

diffusers

docker

drug-discovery

electron

embeddings

engineering

evaluation

facebook

finance

foundation

foundation-model

gemini

gemini-cli

gemma

genomics

gitHub

github

go

google

gradient-booting

grok

groq

huggingface

image

ios

java

javascript

json

LLM

llm

mLOps

math

mcp

mcp-client

mcp-server

meta-ai

meta-pytorch

microsoft

mlops

mobile

multilingual

multimodal

mysql

NLP

nlp

nodejs

nvidia

ocr

ollama

openai

opencode

pandas

paper

physics

pi

plugin

polars

postgres

privacy

prompt-engineering

pwa

python

pytorch

qwen

RL

robotics

rust

science

security

shodan

skillkit

sora

speech

sqlite

ssh

stt

swe

tensorrt

terminal

transformers

translation

tts

tutorial

typescript

vibe-coding

video

vision

vllm

voice

xAI

xai

Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings

2026

Songhao Wu, Zhongxin Chen +4

Uses the model's unembedding matrix to identify and remove the subspace of frequent, uninformative tokens that LLMs inject into text embeddings. EmbedFilter is a lightweight linear transform that refines LLM-derived embeddings to improve zero-shot semantic retrieval, enable dimensionality reduction, and speed up indexing. Code is available on GitHub.

embeddings LLM NLP paper github+3
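
As a rough illustration of the "remove a subspace with a linear transform" idea in the entry above, the sketch below projects an embedding onto the orthogonal complement of a few direction vectors. It assumes those directions are unembedding rows of frequent tokens; how EmbedFilter actually picks them is not stated in this listing. The function names and toy vectors are hypothetical.

```python
from math import sqrt


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: build an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = sqrt(dot(w, w))
        if norm > tol:  # skip directions already covered by the basis
            basis.append([wi / norm for wi in w])
    return basis


def remove_subspace(embedding, basis):
    """Subtract the component of `embedding` lying in the span of `basis`."""
    out = list(embedding)
    for b in basis:
        c = dot(out, b)
        out = [oi - c * bi for oi, bi in zip(out, b)]
    return out


# Assumption: toy stand-ins for unembedding rows of two frequent tokens.
frequent_token_rows = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
basis = orthonormal_basis(frequent_token_rows)
filtered = remove_subspace([3.0, 2.0, 5.0, -1.0], basis)
print(filtered)
```

Because the operation is a fixed projection, it can be precomputed once as a matrix and applied to every embedding before indexing.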

TRL-Bench: Standardizing Cross-Paradigm Representation-Level Evaluation of Tabular Encoders

2026

Wei Pang, Xiangru Jian +11

Standardizes representation-level evaluation for tabular encoders by exporting row-, column-, and table-level embeddings and probing them with shared lightweight heads across three suites (TRL-CTbench, TRL-Rbench, TRL-DLTE). Supplies curated benchmark assets and task rewrites (50 OpenML tables, 123 targets, a 47,772-table DLTE lake) to enable fair cross-paradigm comparison.

paper code github embeddings ai-leaderboard+2

TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs

2026

Hyeongwon Jang, Gyouk Chu +4

Generates outcome-specific, dialectical rationales with an LLM and derives continuous, calibrated risk scores for irregularly sampled medical time series, mitigating risk polarization. Reports a +3.3% average AUPRC gain and an 81% reduction in calibration error across three benchmarks; code released.

llm nlp paper code github+2

Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories

2026

University of Oxford, Stanford University

Kevin Qinghong Lin, Batu El +4

Turns raw datasets into verifiable multimodal news features via a multi-agent newsroom pipeline. Key innovations: (1) an Inspector that links each claim to data/code/external references for re-execution and audit; (2) multimodal asset generation (interactive maps, audio, visuals) tailored to the story.

agent-skills multimodal ai-agent paper code+3

FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents

2026

Jia Deng, Yimeng Chen +10

Synthesizes shortcut-resistant search tasks to train deep search agents by controlling four shortcut risks across entity selection, evidence-graph construction, question formulation, and adversarial refinement. Produces training trajectories with longer pre-answer search and fewer shortcut patterns; code will be released on GitHub.

paper github ai-agent agent-skills deepseek+2

MiniMax Sparse Attention

2026

Xunhao Lai, Weiqi Xu +9

Implements blockwise sparse attention (MiniMax Sparse Attention) that scores key-value blocks and selects the Top-k blocks per Grouped Query Attention group, enabling attention over million-token contexts. Paired with an exp-free Top-k GPU kernel and KV-outer sparse execution, it reduces per-token attention compute and yields large prefill and decoding speedups.

paper llm multimodal github huggingface+4
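
The block-scoring and Top-k selection step described above can be sketched for a single query as follows. This is a simplified illustration, not the paper's kernel: it assumes blocks are scored by the dot product between the query and each block's mean key, and it ignores Grouped Query Attention grouping and the exp-free Top-k kernel. All names are hypothetical.

```python
from math import exp


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def select_blocks(query, keys, block_size, top_k):
    """Return start indices of the top_k highest-scoring key blocks."""
    scored = []
    for start in range(0, len(keys), block_size):
        block = keys[start:start + block_size]
        # Assumption: a block is summarized by the mean of its keys.
        mean_key = [sum(k[d] for k in block) / len(block)
                    for d in range(len(query))]
        scored.append((dot(query, mean_key), start))
    scored.sort(key=lambda s: s[0], reverse=True)
    return sorted(start for _, start in scored[:top_k])


def sparse_attention(query, keys, values, block_size, top_k):
    """Softmax attention restricted to keys inside the selected blocks."""
    starts = select_blocks(query, keys, block_size, top_k)
    idx = [i for s in starts
           for i in range(s, min(s + block_size, len(keys)))]
    logits = [dot(query, keys[i]) for i in idx]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [exp(l - peak) for l in logits]
    total = sum(weights)
    return [sum(w * values[i][d] for w, i in zip(weights, idx)) / total
            for d in range(len(values[0]))]


# Toy data: 8 keys in 4 blocks of 2; each value stores its own position.
keys = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0],
        [-1.0, 0.0], [-1.0, 0.0], [0.5, 0.0], [0.5, 0.0]]
values = [[float(i)] for i in range(len(keys))]
query = [1.0, 0.0]
chosen = select_blocks(query, keys, block_size=2, top_k=2)
output = sparse_attention(query, keys, values, block_size=2, top_k=2)
print(chosen, output)
```

Only the keys in the chosen blocks enter the softmax, so per-query cost scales with `top_k * block_size` rather than with the full context length.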

Kairos: A Native World Model Stack for Physical AI

2026

Kairos Team

Kairos Team, Fei Wang +22

Learns, maintains, and runs unified world models for Physical AI using a cross-embodiment pretraining curriculum and a hybrid linear temporal-attention architecture. Emphasizes long-horizon state persistence, theoretical bounds on error accumulation, and deployment-aware low-latency inference for real-world embodied agents.

robotics multimodal vision physics paper+5

TurboServe: Serving Streaming Video Generation Efficiently and Economically

2026

Shanghai Jiao Tong University, Shengshu Technology +1

Youhe Jiang, Haoxu Wang +6

Serves interactive, long-lived streaming video-generation sessions by jointly scheduling session placement and GPU autoscaling to meet tight per-chunk latency. Combines migration-aware placement, load-driven autoscaling, coalesced chunk processing, GPU–CPU offloading and NCCL GPU–GPU migration; reports ~37% reductions in worst-case per-chunk latency and GPU operating cost.

video ai-video ai-serving ai-inference mLOps+4

Domain Arithmetic: One-Shot VLA Adaptation under Environmental Shifts

2026

Taewook Kang, Taeheon Kim +2

Adapts pretrained Vision-Language-Action (VLA) models to new camera poses and robot embodiments from a single demonstration by performing weight-vector arithmetic that injects domain-specific information. Filters noise via subspace alignment of singular components; designed for one-shot adaptation under visual and embodiment shifts.

robotics vision multimodal paper code+2
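
The weight-vector arithmetic mentioned above can be illustrated with a minimal sketch. It assumes a "domain vector" is the difference between weights briefly adapted on the single demonstration and the pretrained weights, and that it is added to the deployed weights with a scale factor. The paper's subspace-alignment noise filter is omitted, and the names, parameter layout, and numbers are hypothetical.

```python
def domain_vector(adapted, base):
    """Per-parameter difference between adapted and base weights."""
    return {name: [a - b for a, b in zip(adapted[name], base[name])]
            for name in base}


def apply_domain_vector(weights, vector, scale=1.0):
    """Add a scaled domain vector to a set of weights."""
    return {name: [w + scale * d for w, d in zip(weights[name], vector[name])]
            for name in weights}


# Toy weights: one parameter tensor flattened to a list.
pretrained = {"proj": [1.0, 2.0]}
adapted_on_demo = {"proj": [1.5, 1.0]}   # assumption: after one-demo adaptation
deployed = {"proj": [2.0, 2.0]}          # assumption: weights to be shifted

shift = domain_vector(adapted_on_demo, pretrained)
shifted = apply_domain_vector(deployed, shift, scale=0.5)
print(shift, shifted)
```

The `scale` parameter controls how strongly the new-domain information is injected; a value of 0 leaves the deployed weights unchanged.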