Provides 4,659 agentic single-turn SFT training pairs extracted from Claude Fable‑5, formatted as a single-column parquet file for Qwen-style fine-tuning. Samples include explicit chain-of-thought (<think>) blocks and XML-serialized <tool_use> calls; the data is PII-redacted and licensed under AGPL-3.0.
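As a rough illustration of the markup described above, the following sketch separates reasoning, tool calls, and visible text in one sample. It assumes the blocks are delimited as `<think>...</think>` and `<tool_use>...</tool_use>`; the inner structure of a tool call, the `split_completion` helper, and the sample string are all hypothetical and not taken from the dataset.

```python
import re

# Assumed tag layout: <think>...</think> and <tool_use>...</tool_use>.
# Non-greedy matching with DOTALL lets a block span several lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
TOOL_USE_RE = re.compile(r"<tool_use>(.*?)</tool_use>", re.DOTALL)


def split_completion(text: str) -> dict:
    """Separate reasoning, tool calls, and the remaining visible text."""
    thoughts = [m.strip() for m in THINK_RE.findall(text)]
    tool_calls = [m.strip() for m in TOOL_USE_RE.findall(text)]
    # Remove both kinds of block so only the user-visible text is left.
    visible = TOOL_USE_RE.sub("", THINK_RE.sub("", text)).strip()
    return {"thoughts": thoughts, "tool_calls": tool_calls, "visible": visible}


# Hypothetical sample for illustration only.
sample = (
    "<think>The user wants the file listing.</think>"
    "Let me check.\n"
    "<tool_use><name>list_files</name><path>.</path></tool_use>"
)
parts = split_completion(sample)
print(parts["thoughts"])
print(parts["tool_calls"])
print(parts["visible"])
```

Regular expressions are enough here because the sketch only needs block boundaries; if the tool calls must be inspected field by field, an XML parser applied to each extracted block would be the more robust choice.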
Assesses whether coding agents can generate complete, playable games end-to-end inside the Godot engine. Implements an interaction-grounded evaluation (replayed demonstrations + rubric-guided multimodal judging) across 140 tasks and 15 game families; top agents score ~41%.