AIAny - lordx64/agentic-distill-fable-5-sft

This dataset matters because explicit agentic traces that combine long chain-of-thought (CoT) reasoning with concrete tool-call sequences are rare but highly useful for teaching models to act like tool-using agents. The bundle collapses Claude Fable‑5 sessions into 4,659 SFT-ready rows (avg ~2.6k tokens/row), prioritizing visible CoT plus the model's final action so you can train both thinking and acting behaviors.

What Sets It Apart

Composition and scale: 4,659 single-turn rows sourced from Glint-Research Fable‑5 traces, with ~81% ending in a <tool_use> block and ~19% pure-text responses. Average CoT lengths are large (mean/median in the thousands of characters), so the data is concentrated on long-form developer-style reasoning about edits and tool calls.
Training-ready format: collapsed into a single text column using a Qwen chat template (system + user + <think> + visible response), and serialized tool calls as custom <tool_use name="X" id="Y">…</tool_use> XML, ready for SFT trainers that accept a single text field.
Hygiene and provenance: PII and leaked keys were scrubbed, duplicate user turns deduplicated, and problematic assistant rows removed. The dataset inherits AGPL-3.0 from upstream and is explicitly derived from Anthropic's Claude Fable‑5 preview outputs.

Who It's For and Trade-offs

Great fit if you want to SFT an LLM to emulate agentic, tool-using developer workflows (e.g., emit file edits, shell commands, reads/writes) or to study coupling of CoT and tool invocations. It’s practical for experimenting with Qwen-style fine-tuning and agent scaffolds like Qwable.

Look elsewhere if you need multi-turn conversational chat logs without CoT, large-scale diverse instruction data (this is a narrow developer-session distribution), or unambiguous native Qwen <tool_call> tokens—the dataset uses a custom XML envelope for tool calls and many contexts are truncated upstream, which may require context-reconstruction or filtering for your use case.

Practical notes

Licensing & policy: AGPL-3.0; downstream users should confirm compliance with Anthropic usage policies for any use of model-derived content.
Provenance: cleartext CoT comes from Glint-Research traces; some other Fable‑5 sources had thinking blocks redacted and were excluded.
Typical workflow: use as SFT input to teach agentic behaviors (optionally training on responses only), or as a source of long-form CoT paired with concrete tool actions for evaluation and distillation.

lordx64/agentic-distill-fable-5-sft

Introduction

What Sets It Apart

Who It's For and Trade-offs

Practical notes

Information

Categories

Tags

More Items

ArithMark 3.0

XYZ-Aquila SFT

Reasoning Corpus 5M