AIAny

Tag

Explore by tags

Curated AI Resources for Everyone

Copyright © 2026 All Rights Reserved.
  • All

  • 30u30

  • ASR

  • ChatGPT

  • GNN

  • IDE

  • RAG

  • agent-skills

  • ai

  • ai-agent

  • ai-api

  • ai-api-management

  • ai-client

  • ai-coding

  • ai-demos

  • ai-deploy

  • ai-development

  • ai-framework

  • ai-image

  • ai-image-demos

  • ai-inference

  • ai-leaderboard

  • ai-library

  • ai-rank

  • ai-serving

  • ai-tools

  • ai-train

  • ai-video

  • ai-workflow

  • AIGC

  • algorithms

  • alibaba

  • amazon

  • android

  • anthropic

  • audio

  • aws

  • biology

  • blog

  • book

  • bytedance

  • chatbot

  • chatgpt

  • chemistry

  • claude

  • claude-code

  • cli

  • code

  • codex

  • copilot

  • course

  • cursor

  • deepmind

  • deepseek

  • depth

  • devops

  • diffusers

  • docker

  • drug-discovery

  • electron

  • embeddings

  • engineering

  • evaluation

  • facebook

  • finance

  • foundation

  • foundation-model

  • gemini

  • gemini-cli

  • gemma

  • genomics

  • gitHub

  • github

  • go

  • google

  • gradient-booting

  • grok

  • groq

  • huggingface

  • image

  • ios

  • java

  • javascript

  • LLM

  • llm

  • math

  • mcp

  • mcp-client

  • mcp-server

  • meta-ai

  • meta-pytorch

  • microsoft

  • mlops

  • mobile

  • multilingual

  • multimodal

  • NLP

  • nlp

  • nodejs

  • nvidia

  • ocr

  • ollama

  • openai

  • opencode

  • pandas

  • paper

  • physics

  • plugin

  • postgres

  • privacy

  • prompt-engineering

  • python

  • pytorch

  • RL

  • robotics

  • rust

  • science

  • security

  • shodan

  • skillkit

  • sora

  • speech

  • ssh

  • tensorrt

  • terminal

  • transformers

  • translation

  • tts

  • tutorial

  • typescript

  • vibe-coding

  • video

  • vision

  • vllm

  • voice

  • xAI

  • xai

Hugging Face

unsloth/gemma-4-12B-it-qat-GGUF

2026
unsloth, Google DeepMind

GGUF-format QAT (quantization-aware training) build of Gemma 4 12B that lowers memory requirements for local or lightweight inference while preserving near-bfloat16 quality. Ready for any-to-any conversational pipelines and ecosystem deployment.

gemma, huggingface, google, deepmind, transformers, +5
Hugging Face

unsloth/gemma-4-26B-A4B-it-qat-GGUF

2026
unsloth

A GGUF release of Gemma 4 26B A4B (QAT) packaged by Unsloth for local multimodal inference. It is quantization-aware trained to keep near-bfloat16 quality while significantly lowering memory requirements, and is compatible with Transformers and Unsloth tooling.

gemma, huggingface, transformers, llm, vision, +3
Hugging Face

MiMo-V2.5-Pro-FP4-DFlash

2026
Xiaomi MiMo Team

Applies MXFP4 quantization to the MoE experts and adds a BF16 DFlash block-diffusion drafter that proposes whole blocks of tokens for verification, cutting memory bandwidth and backbone forward passes for trillion-parameter text generation. Targets long-context, agent, and code workloads.

huggingface, transformers, llm, ai-inference, ai-serving, +3
Hugging Face

DiffusionGemma 26B A4B

2026
Google DeepMind

Generates text from interleaved text, image, and short-video inputs using discrete diffusion and block‑autoregressive multi‑canvas sampling; built on a sparse MoE (8/128) Gemma 4 backbone and optimized for low‑latency inference and very long contexts (up to 256K tokens).

gemma, foundation-model, multimodal, vision, transformers, +5
Hugging Face

unsloth/diffusiongemma-26B-A4B-it-GGUF

2026
unsloth

A community-distributed GGUF bundle of Google DeepMind’s DiffusionGemma (26B A4B) with multiple quantization variants for local image-text-to-text inference. Targets experimentation and offline deployment via the DiffusionGemma llama.cpp branch and llama-diffusion-cli; choose quantization for GPU memory vs. fidelity trade-offs.

gemma, google, huggingface, llm, multimodal, +2
Hugging Face

Qwopus-3.6-27B-Coder

2026
Jackrong

A quantized 27B coder LLM fine-tuned for repository-level code generation, multi-turn tool calling, and agentic workflows, packaged for local GGUF/llama.cpp deployment with MTP speculative decoding and trace-inversion SFT. Optimized for developer tooling; experimental and not fully safety-validated.

huggingface, llm, transformers, ai-coding, ai-agent, +5

MiniMax Sparse Attention

2026
Xunhao Lai, Weiqi Xu +9

Implements blockwise sparse attention (MiniMax Sparse Attention) that scores key-value blocks and selects the top-k per Grouped Query Attention group, enabling attention over million-token contexts. Paired with an exp-free Top-k GPU kernel and KV-outer sparse execution, it reduces per-token attention compute and yields large prefill and decoding speedups.

paper, llm, multimodal, github, huggingface, +4
Hugging Face

moonshotai/Kimi-K2.7-Code

2026
Moonshot AI (moonshotai)

An agentic multimodal coding model for long-horizon software tasks: MoE architecture (1T params, 32B activated), 256K context, image/video input, native int4 quantization and preserved chain-of-thought (thinking) mode. Tuned for multi-step coding workflows and vLLM/SGLang deployment.

huggingface, transformers, ai-coding, ai-agent, agent-skills, +5

TensorRT-LLM

2023
NVIDIA

NVIDIA’s open-source library that compiles Transformer blocks into highly optimized TensorRT engines for fast LLM inference on NVIDIA GPUs.

ai-development, ai-inference, ai-serving, nvidia

KTransformers

2024
KVCache-AI Team

Pythonic framework to inject experimental KV-cache optimizations into HuggingFace Transformers stacks.

ai-development, ai-inference, ai-serving

NVIDIA Dynamo

2025
NVIDIA

NVIDIA Dynamo is an open-source, high-throughput, low-latency inference framework that scales generative-AI and reasoning models across large, multi-node GPU clusters.

ai-development, ai-inference, ai-serving, nvidia