Tag

Explore by tags

All

30u30

ASR

ChatGPT

GNN

IDE

RAG

agent-skills

ai

ai-agent

ai-api

ai-api-management

ai-client

ai-coding

ai-demos

ai-deploy

ai-development

ai-framework

ai-image

ai-image-demos

ai-inference

ai-leaderboard

ai-library

ai-rank

ai-serving

ai-tools

ai-train

ai-video

ai-workflow

AIGC

algorithms

alibaba

amazon

android

anthropic

audio

aws

benchmark

biology

blog

book

bytedance

chatbot

chatgpt

chemistry

claude

claude-code

cli

code

codex

copilot

course

cuda

cursor

deepmind

deepseek

depth

devops

diffusers

docker

drug-discovery

electron

embeddings

engineering

evaluation

facebook

finance

flow-matching

foundation

foundation-model

gemini

gemini-cli

gemma

genomics

gitHub

github

go

google

gradient-boosting

grok

groq

huggingface

image

ios

java

javascript

json

kimi

llama.cpp

LLM

llm

lora

mLOps

math

mcp

mcp-client

mcp-server

meta-ai

meta-pytorch

metal

microsoft

mlops

mobile

multilingual

multimodal

mysql

NLP

nlp

nodejs

numpy

nvidia

ocr

ollama

openai

opencode

pandas

paper

physics

pi

plugin

polars

postgres

privacy

prompt-engineering

pwa

python

pytorch

qwen

react

reasoning

retrieval

RL

robotics

rust

science

security

segmentation

shodan

skillkit

sora

speech

sqlite

ssh

stt

swe

tensorrt

terminal

transformers

translation

tts

tutorial

typescript

vibe-coding

video

vision

vllm

voice

web-search

windsurf

xAI

xai

AI Dataset 2024

FineWeb-Edu

HuggingFaceFW, Anton Lozhkov +3

Provides ~1.3 trillion tokens of web pages filtered for educational quality using an LLM-trained classifier; includes per-Crawl configs, smaller random samples (10B/100B/350B tokens), and the classifier code and model for reproducible filtering.

huggingface LLM nlp foundation-model ai-train +2
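The classifier-based filtering described in the FineWeb-Edu entry can be sketched roughly as follows. This is a minimal illustration, not the project's released code: `toy_edu_score` is a hypothetical keyword-based stand-in for the real classifier, and the 0 to 5 score range and the threshold of 3.0 are assumptions chosen for the example.

```python
from typing import Callable, Iterable, Iterator


def toy_edu_score(text: str) -> float:
    """Hypothetical stand-in for an educational-quality classifier.

    Returns a score from 0.0 (no educational value) to 5.0 (assumed scale).
    A real pipeline would call a trained model here instead.
    """
    keywords = ("theorem", "definition", "example", "exercise", "explain")
    hits = sum(word in text.lower() for word in keywords)
    return min(5.0, 1.5 * hits)


def filter_by_score(
    docs: Iterable[str],
    scorer: Callable[[str], float],
    threshold: float = 3.0,  # assumed cutoff, for illustration only
) -> Iterator[str]:
    """Yield only the documents whose score meets the threshold."""
    for doc in docs:
        if scorer(doc) >= threshold:
            yield doc


docs = [
    "Buy cheap watches now, limited offer!",
    "Definition and example: a prime has exactly two divisors. "
    "Exercise: explain why 1 is not prime.",
]
kept = list(filter_by_score(docs, toy_edu_score))
print(len(kept), "of", len(docs), "documents kept")
```

Because the scorer is passed in as a plain function, the same loop would work unchanged if the toy scorer were swapped for a real model's prediction call.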

AI Train 2024

modded-nanogpt

Keller Jordan, Jeremy Bernstein +2

NanoGPT Speedrun Community

A community speedrun to train a 124M-parameter GPT as fast as possible on 8 H100s, with every run targeting a fixed FineWeb validation loss of 3.28. Successive records cut the run from llm.c's 45 minutes to under 1.4 minutes, mostly via the new Muon optimizer rather than more hardware.

github pytorch ai-train llm huggingface +2

AI Train 2024

minimind

jingyaogong

Independent

Trains a small LLM from scratch: pretraining, SFT, LoRA, DPO/RLHF, and distillation, with dense and MoE variants sized from ~26M to 100M-plus parameters. Headline figure: the ~64M minimind-3 variant's SFT stage runs 1 epoch in ~2h for ~3 RMB on one NVIDIA 3090.

pytorch llm ai-train github ai-development +1

AI Image 2024

MiniMind-V

Jingyao Gong (jingyaogong)

Trains a 65M-parameter vision-language model from scratch in ~2 hours on one RTX 3090, for about 3 RMB (~$0.40) of GPU rental. Connects a frozen SigLIP2 encoder to a small MiniMind LLM via a two-layer MLP projector and includes full PyTorch code for pretraining and SFT.

vision pytorch github llm ai-train +2

AI Image 2024

olmOCR

Allen Institute for AI (AI2), AllenNLP team

Allen Institute for AI (Ai2)

Turns PDFs and images into clean Markdown with a 7B vision-language model, keeping tables, equations, handwriting, and multi-column reading order while removing headers and footers. Runs on one 12GB+ GPU at about 1/32 the cost of GPT-4o APIs.

ocr vision llm foundation-model huggingface +4

Embodied AI 2024

ProtoMotions3

Chen Tessler, Yifeng Jiang +7

NVIDIA Research (NVlabs)

GPU-accelerated framework for training physically simulated humanoid characters and robots using reinforcement learning and motion imitation. Provides a modular multi-backend simulator stack, large-scale multi-GPU training recipes, built-in motion retargeting, and an ONNX deployment pathway to real robots.

robotics RL nvidia ai-train ai-deploy +3

AI Infra 2024

cuTile Python

NVIDIA CORPORATION

NVIDIA

Lets Python developers write tile-based parallel kernels for NVIDIA GPUs, generating CUDA Tile IR while staying close to Python syntax for custom GPU operations.

nvidia ai-development ai-library ai-train

AI Model 2024

SANA

NVIDIA Research (NVlabs)

High-resolution image and video generation codebase and models that run with far lower compute and memory than typical diffusion systems. Uses linear-attention DiT variants, aggressive latent compression, and inference-scaling to support text-to-image (up to 4K), fast one/few-step generation, and efficient video pipelines.

nvidia ai-image image video pytorch +5

AI Train 2024

DataFlow

OpenDCAI

Peking University

Parses, generates, and filters training data from noisy sources like PDFs and weak QA, then feeds it into LLM pre-training, SFT, RL, or RAG cleaning. Ships 100+ operators and ready-made pipelines for text, reasoning, Text2SQL, and agentic data.

github mlops ai-development ai-library python +3

AI Train 2024

verl: Volcano Engine Reinforcement Learning for LLMs

ByteDance Seed Team, Volcengine +1

ByteDance Seed, The University of Hong Kong +1

Open-source HybridFlow implementation for RL post-training of LLMs. Decouples control flow from compute so PPO, GRPO, GSPO, and DAPO share one dataflow; pairs FSDP/Megatron with vLLM/SGLang rollout and reports 1.5-20x throughput over prior RLHF stacks.

RL LLM vllm pytorch huggingface +3

AI Train 2024

Protenix

ByteDance AI4Science (AML) Team

High-accuracy biomolecular structure prediction suite: open-source models (protenix-v2/v1), a benchmark/evaluation toolkit, and a web server for inference. Targets protein/antibody–antigen and ligand-aware predictions with inference-time sampling and constraint support.

bytedance github foundation-model genomics drug-discovery +2

Machine Learning Foundation Tutorials 2024

Awesome-ML-SYS-Tutorial

zhaochenyang20

A GitHub repository of learning notes and code dedicated to ML + SYS (machine learning systems). It collects tutorials, code walkthroughs, and engineering notes on RLHF, distributed training (FSDP, Megatron), inference and scheduling (SGLang, vLLM), quantization, CUDA/GPU optimization, system design, and practical engineering.

github mlops ai-train pytorch LLM +6