AIAny - North Mini Code (CohereLabs/North-Mini-Code-1.0)

North Mini Code aims to make agentic coding and complex terminal workflows practical with an open-weights sparse Mixture-of-Experts model. The core insight is that a large sparse model (30B total, 3B active) plus tooling-aware training lets an LLM both generate large codebases and reason across extremely long contexts (256K tokens), which matters for multi-file engineering, long traces, and stepwise tool interactions.

Key Capabilities

Sparse MoE architecture: 30B total parameters with ~3B active per token via 128 experts (8 active) — trades inference peak size for lower per-token compute. This means better capacity-per-flop for complex code reasoning while keeping runtime costs closer to a smaller dense model.
Long-context & long-output support: Designed for up to 256K context and very long outputs (model card references 64K output), enabling multi-file code generation, long conversational histories, and agentic tool chains without frequent context truncation.
Tool-use and agent support: Trained and benchmarked for agentic coding (terminal/agent benchmarks), includes chat templates and function-calling examples, and integrates with vLLM and OpenCode for local tool-enabled deployments.
Research-friendly release: Weights and model card are published under Apache-2.0 with evaluation details and recommended sampling parameters, easing replication and local experimentation.

Who it's for — and tradeoffs

Great fit if you need a locally runnable, research-accessible model for multi-step coding tasks, automated agent workflows, or experiments that require very long context windows. It's especially useful for teams prototyping tool-enabled agents (terminal automation, code synthesis across many files) and for benchmarks comparing sparse vs dense scaling.

Look elsewhere if you need a turnkey managed API with production SLAs, or if your deployment environment cannot accommodate MoE routing complexity or tooling (some runtimes require vLLM/melody or transformer forks). Expect additional engineering for efficient inference (device mapping, vLLM, or specialized runtimes) compared with standard dense models.

Where it fits

Positioned between research-oriented open-weight models and production-focused closed APIs: it targets practitioners who want high-capacity code reasoning and tool-use with reproducible results, without depending on a hosted provider. Compared to dense models of similar active parameter count, it aims to provide higher effective capacity for coding and agentic tasks at comparable inference cost.

North Mini Code (CohereLabs/North-Mini-Code-1.0)

Introduction

Key Capabilities

Who it's for — and tradeoffs

Where it fits

Information

Categories

Tags

More Items

KAT-Coder-V2.5-Dev

Qwen3-TTS-12Hz-1.7B-CustomVoice

GLM-5.2-Vision (NVFP4)