Tag

Explore by tags

All

30u30

ASR

ChatGPT

GNN

IDE

RAG

agent-skills

ai

ai-agent

ai-api

ai-api-management

ai-client

ai-coding

ai-demos

ai-deploy

ai-development

ai-framework

ai-image

ai-image-demos

ai-inference

ai-leaderboard

ai-library

ai-rank

ai-serving

ai-tools

ai-train

ai-video

ai-workflow

AIGC

algorithms

alibaba

amazon

android

anthropic

audio

aws

biology

blog

book

bytedance

chatbot

chatgpt

chemistry

claude

claude-code

cli

code

codex

copilot

course

cursor

deepmind

deepseek

depth

devops

diffusers

docker

drug-discovery

electron

embeddings

engineering

evaluation

facebook

finance

foundation

foundation-model

gemini

gemini-cli

gemma

genomics

gitHub

github

go

google

gradient-boosting

grok

groq

huggingface

image

ios

java

javascript

json

LLM

llm

mLOps

math

mcp

mcp-client

mcp-server

meta-ai

meta-pytorch

microsoft

mlops

mobile

multilingual

multimodal

mysql

NLP

nlp

nodejs

nvidia

ocr

ollama

openai

opencode

pandas

paper

physics

pi

plugin

polars

postgres

privacy

prompt-engineering

pwa

python

pytorch

qwen

RL

robotics

rust

science

security

shodan

skillkit

sora

speech

sqlite

ssh

stt

swe

tensorrt

terminal

transformers

translation

tts

tutorial

typescript

vibe-coding

video

vision

vllm

voice

xAI

xai

Ornith-1.0-35B

2026

DeepReinforce (deepreinforce-ai)

A 35B Mixture-of-Experts LLM specialized for agentic coding and tool-enabled code generation, fine-tuned with self-scaffolding reinforcement learning. Supports very long contexts, OpenAI-compatible tool calls, and multiple serving runtimes under an MIT license.

ai-coding agent-skills llm transformers vllm +4

Agents-A1: Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent

2026

Lei Bai, Zongsheng Cao +48

35B Mixture-of-Experts agent model for long-horizon, multi-domain agent workflows; trained with a knowledge–action infrastructure that produces ~45K-token trajectories and supports native tool calling and function integration for research and deployment.

transformers huggingface vllm llm agent-skills +3

Qwen-AgentWorld-35B-A3B

2026

Yuxin Zuo, Zikai Xiao +8

Language-based world model that simulates agentic environments across seven domains, predicting the next environment state from actions and interaction history. Trained via a CPT→SFT→RL pipeline with an MoE architecture and very long context; intended for environment simulation and agent research.

qwen llm transformers vllm huggingface +4

NVIDIA GLM-5.2 NVFP4

2026

NVIDIA, Z.ai

Provides a pre-quantized NVFP4 checkpoint of GLM-5.2 for long-context reasoning and coding; reduces model footprint so GLM-5.2 can run on multi-GPU Blackwell nodes and is ready for inference with SGLang and vLLM.

nvidia huggingface llm vllm tensorrt +5

nvidia/Qwen3.6-27B-NVFP4

2026

NVIDIA, Alibaba Group (Qwen Team)

NVFP4-quantized variant of Qwen3.6-27B that reduces parameter bits from 16 to 4, cutting disk and GPU memory requirements by ~2.5× while keeping comparable benchmark accuracy; ready for vLLM-based inference on NVIDIA hardware and supports long, multimodal contexts.

nvidia qwen vllm huggingface llm +6

Ornith-1.0-397B

2026

DeepReinforce Team

Provides an open-source Mixture-of-Experts coding LLM (397B) optimized for agentic, tool-enabled coding workflows with a 262,144-token context window, OpenAI-compatible API, serving recipes (vLLM/SGLang), and published coding-benchmark results.

ai-coding agent-skills vllm transformers qwen +6

LFM2.5-230M

2026

Liquid AI

230M-parameter multilingual instruction-tuned text-only LLM for on-device agentic pipelines and data extraction; 32K context, 19T-token pretraining, optimized for fast CPU/edge inference (e.g., 213 tok/s on Galaxy S25 Ultra, 42 tok/s on Raspberry Pi 5); not for heavy reasoning or complex code generation.

transformers huggingface llm vllm agent-skills +5

Ornith-1.0-35B-GGUF

2026

DeepReinforce AI

A self-improving, agentic coding LLM tailored for terminal-style coding agents and tool-calling, provided as 35B MoE GGUF weights with very large context support. Trained with reinforcement learning to jointly generate task scaffolds and solutions; designed for local inference and OpenAI-compatible tool endpoints.

huggingface transformers vllm ollama ai-coding +6

Ornith-1.0-9B-GGUF

2026

deepreinforce-ai (DeepReinforce Team)

Provides a GGUF-quantized local build of Ornith-1.0's 9B dense model for offline inference and terminal-focused coding agents. Supports OpenAI-compatible tool-calling, a 256K context window, and runs via llama.cpp or Ollama on a single high-memory GPU.

transformers llm ai-coding vllm ollama +5

DeepSeek-V4-Pro-DSpark

2026

DeepSeek-AI

Mixture-of-Experts LLM designed for million-token contexts, combining hybrid compressed attention, FP4/FP8 quantization-aware training for MoE experts, and multi-mode 'thinking' (Non-think/Think High/Think Max); includes a speculative-decoding extension for faster inference.

deepseek llm transformers huggingface ai-inference +2

DeepSeek-V4-Flash-DSpark

2026

DeepSeek-AI

A Hugging Face checkpoint that attaches a speculative-decoding module to DeepSeek-V4-Flash, an MoE model with million-token context handling, FP4/FP8 mixed precision, and long-context inference optimizations.

deepseek foundation-model llm transformers huggingface

Huihui-GLM-5.2-abliterated-GGUF

2026

huihui-ai, zai-org +1

An uncensored GGUF build of GLM-5.2 that applies weight “abliteration” to remove refusal filters and produce a locally runnable text-generation model; includes quantization conversions and shard-merge instructions, intended for experimental research rather than production use.

foundation-model llm transformers huggingface ai-inference +1