This dataset matters because explicit agentic traces that combine long chain-of-thought (CoT) reasoning with concrete tool-call sequences are rare but highly useful for teaching models to act like tool-using agents. The bundle collapses Claude Fable‑5 sessions into 4,659 SFT-ready rows (avg ~2.6k tokens/row), prioritizing visible CoT plus the model's final action so you can train both thinking and acting behaviors.
What Sets It Apart
- Composition and scale: 4,659 single-turn rows sourced from Glint-Research Fable‑5 traces, with ~81% ending in a
<tool_use>block and ~19% pure-text responses. Average CoT lengths are large (mean/median in the thousands of characters), so the data is concentrated on long-form developer-style reasoning about edits and tool calls. - Training-ready format: collapsed into a single
textcolumn using a Qwen chat template (system + user +<think>+ visible response), and serialized tool calls as custom<tool_use name="X" id="Y">…</tool_use>XML, ready for SFT trainers that accept a single text field. - Hygiene and provenance: PII and leaked keys were scrubbed, duplicate user turns deduplicated, and problematic assistant rows removed. The dataset inherits AGPL-3.0 from upstream and is explicitly derived from Anthropic's Claude Fable‑5 preview outputs.
Who It's For and Trade-offs
Great fit if you want to SFT an LLM to emulate agentic, tool-using developer workflows (e.g., emit file edits, shell commands, reads/writes) or to study coupling of CoT and tool invocations. It’s practical for experimenting with Qwen-style fine-tuning and agent scaffolds like Qwable.
Look elsewhere if you need multi-turn conversational chat logs without CoT, large-scale diverse instruction data (this is a narrow developer-session distribution), or unambiguous native Qwen <tool_call> tokens—the dataset uses a custom XML envelope for tool calls and many contexts are truncated upstream, which may require context-reconstruction or filtering for your use case.
Practical notes
- Licensing & policy: AGPL-3.0; downstream users should confirm compliance with Anthropic usage policies for any use of model-derived content.
- Provenance: cleartext CoT comes from Glint-Research traces; some other Fable‑5 sources had thinking blocks redacted and were excluded.
- Typical workflow: use as SFT input to teach agentic behaviors (optionally training on responses only), or as a source of long-form CoT paired with concrete tool actions for evaluation and distillation.
