Tag

Explore by tags

Jackrong/claude-opus-4.6-traceInversion-9000x

Provides 9,000 reconstructed chain-of-thought (CoT) SFT examples produced by trace inversion from Claude Opus 4.6 outputs for fine-tuning reasoning-capable LLMs. Multilingual, packaged as .jsonl.gz and SFT/DPO-ready; verify numeric/code cases before training.

huggingface llm nlp multilingual pandas +4
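
Because the card describes the data as packaged in .jsonl.gz, a minimal sketch of streaming such a file with only the Python standard library may help when spot-checking records before training. The file name in the usage note and the helper name `iter_jsonl_gz` are hypothetical; the dataset's actual file names and record fields are not stated here and should be checked against the dataset card.

```python
import gzip
import json
from typing import Any, Iterator


def iter_jsonl_gz(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a gzip-compressed JSONL file."""
    # "rt" opens the gzip stream in text mode so each line arrives as a str.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                # Skip blank lines rather than failing on them.
                continue
            yield json.loads(line)
```

For example, `next(iter_jsonl_gz("train.jsonl.gz"))` (hypothetical file name) returns the first record without decompressing the whole file into memory, which is a convenient way to inspect the schema by hand.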

Qwen3.6 27B - OBLITERATED

Provides a locally runnable 26.9B Qwen3.6 checkpoint that surgically reduces refusal behavior in weight space while preserving capability; ships bfloat16 safetensors and a GGUF quant ladder for local runtimes and red-team evaluation.

huggingface transformers llm vllm ai-deploy +5

GoLongRL (Kwai-Klear)

RL training dataset for long-context language-model fine-tuning with ~23K samples and nine reward types, provided in Parquet with bilingual ground-truth and reward metadata for direct RL/bench evaluation.

huggingface RL LLM NLP paper +2

MOSS-Transcribe-Diarize

OpenMOSS-Team, MOSI.AI +1

Converts long-form multi-speaker audio/video into a compact, speaker-aware transcript with timestamps and anonymous speaker labels in one pass. Combines ASR and diarization in a single model, supports custom prompts/hotwords, and targets meetings, podcasts, interviews and long recordings.

ASR audio speech stt transformers +5

Cosmos-Framework

End-to-end Python framework for training and serving NVIDIA's Cosmos world models (Cosmos3), integrating distributed training (FSDP/TP/CP/PP), DCP/safetensors checkpoints, dataset adapters, multiple inference backends, online serving, and agent skills.

nvidia ai-train ai-serving pytorch cuda +8

openbmb/UltraData-SFT-2605

Supervised fine-tuning dataset of instruction-style examples in English and Chinese covering generation, QA, reasoning, math, and code, targeted at SFT of 10–100B-parameter LLMs. Associated with arXiv:2602.09003; first published May 21, 2026.

llm huggingface multilingual math code +4

Qwopus3.6-27B-v2-MTP

Jack Rong (Jackrong)

Fine-tuned reasoning model that speeds up structured multi-step outputs using Multi-Token Prediction (MTP) from a Qwen3.6-27B base. Produces more concise, faster generations for coding, DevOps, math, and constrained-format tasks; experimental community release for research and evaluation.

huggingface transformers llm ai-train ai-inference +5

GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration

Provides 100,000 generated low-quality↔high-quality image pairs created with modern multi-frame/multi-modal models to boost generalization of image restoration methods; includes train/test JSONL lists, baseline training code, and pretrained checkpoints under CC BY‑NC‑ND 4.0.

vision image ai-image huggingface paper +3

Fara1.5-27B

Microsoft Research AI Frontiers, Microsoft

Automates end-to-end web workflows from browser screenshots by emitting pixel-grounded actions (click, type, scroll, visit, search). Vision-first multimodal agent fine-tuned from Qwen3.5-27B with critical-point safety checks; intended for sandboxed, human-supervised deployments.

qwen multimodal vision agent-skills microsoft +6

Miso TTS 8B

Generates conversational speech and voice continuation from text and optional audio context, outputting Mimi audio codes. Built on a Sesame-style CSM with an 8B Llama-like backbone plus a smaller autoregressive audio decoder. Suited for local TTS inference and voice-cloning workflows.

pytorch audio voice speech huggingface +1

NVIDIA Cosmos3-Super-Image2Video

Generates temporally coherent MP4 videos from a single input image plus text instructions, with configurable resolution, frame count, and optional AAC audio. Optimized for NVIDIA GPU stacks and integrates with vLLM‑Omni and Hugging Face Diffusers for production inference and research workflows.

nvidia huggingface diffusers ai-video video +5

LongCat-Video-Avatar-1.5

Meituan LongCat Team

Generates audio-driven avatar videos from text, images, or audio inputs with production-grade stability (accurate lip sync, identity consistency) and an 8-step distillation inference mode for faster serving; suitable for broadcasting, virtual hosts, animation, and multi-person scenarios.

ai-video video audio transformers huggingface +4