LogoAIAny
Icon for item

Qwopus-3.6-27B-Coder

A quantized 27B coder LLM fine-tuned for repository-level code generation, multi-turn tool calling, and agentic workflows — packaged for local GGUF/llama.cpp deployment with MTP speculative decoding and trace-inversion SFT. Optimized for developer tooling; experimental and not fully safety-validated.

Introduction

Most small-to-midsize local coder models trade off tool orchestration or deep multi-step reasoning to stay fast and deployable. Qwopus-3.6-27B-Coder tries a different balance: it packages a 27B dense reasoning base with trace-inversion reconstructed chains and agent trajectories, then fine-tunes for structured tool use and repository edits while remaining deployable as a quantized GGUF for llama.cpp.

Key Capabilities
  • Agentic coding & tool calling — trained on real multi-turn agent traces with tool definitions and environment feedback, so it better follows structured tool schemas and multi-step interactions compared with plain instruction-tuned coders.
  • Repository-level patching & debugging — curriculum SFT and targeted datasets improve multi-file reasoning and realistic repo-edit behaviors, making it effective for automated bug fixes, patch generation and test-driven repairs.
  • Local quantized deployment & MTP support — distributed as a Q5_K_M GGUF (27B) and an MTP variant for speculative decoding, enabling single-GPU interactive throughput (reported ~100 tokens/sec on an RTX 5090 with MTP).
  • Trace-Inversion training signal — uses reconstructed step-by-step reasoning traces (from Claude-opus style data) to teach continuous CoT-like behavior while still supporting “thinking-off” fast inference modes.
Who it's for & tradeoffs

Great fit if you need a locally runnable coder that can orchestrate tools and edit repositories (developers building local AI assistants, teams prototyping agentic automation, or researchers evaluating agent workflows). Look elsewhere if you require a fully safety-evaluated general-purpose assistant or heavyweight, SOTA cloud APIs for the absolute top benchmark numbers: this is an experimental community release and the model may emit internal <think> tags, show domain-specific capability decay outside coding, and relies on specific prompt/tool schemas to activate agent behaviors. The first reported local SWE-bench Verified run (no-thinking) achieved 335/500 = 67.0%.

Where it fits

Positioned between compact local coders and large hosted coder APIs: it emphasizes tool-use and repo-edit robustness at 27B (quantized) so teams that need on-prem or latency-sensitive coding agents can test agent loops without depending on external endpoints.

Training snapshot

Released on 2026-06-11, the package is built on top of Qwopus3.6-27B-v2 and uses datasets such as claude-opus-4.6/4.7 trace-inversion sets and lambda/hermes-agent-reasoning-traces. Deployment notes: parse or hide <think> tags where needed; match tool-schema prompts to training format for best results.