Provides JSON traces from a Codex-driven swebenchpro agentic benchmark, including per-call token counts, cache hit rates, timing, and per-trial outcomes. Useful for research into LLM caching, long-context workloads, and agent evaluation. MIT-licensed and compact.
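As a rough illustration of the caching analysis these traces could support, the sketch below aggregates per-call token counts into an overall cache hit rate for one trial. The trace layout and every field name here (`calls`, `input_tokens`, `cached_input_tokens`, `output_tokens`, `latency_s`, `outcome`) are assumptions made for the example, not the dataset's documented schema; adjust them to match the actual files.

```python
import json

# Hypothetical trace for one trial. The structure and field names are
# assumptions for illustration only, not the dataset's real schema.
SAMPLE_TRACE = json.loads("""
{
  "trial_id": "example-0",
  "outcome": "passed",
  "calls": [
    {"input_tokens": 1000, "cached_input_tokens": 0,
     "output_tokens": 200, "latency_s": 2.5},
    {"input_tokens": 3000, "cached_input_tokens": 2400,
     "output_tokens": 100, "latency_s": 1.5}
  ]
}
""")


def summarize_trace(trace):
    """Aggregate per-call counters into per-trial totals."""
    calls = trace["calls"]
    total_input = sum(c["input_tokens"] for c in calls)
    total_cached = sum(c["cached_input_tokens"] for c in calls)
    total_output = sum(c["output_tokens"] for c in calls)
    total_latency = sum(c["latency_s"] for c in calls)
    # Cache hit rate here means: fraction of input tokens served from cache.
    hit_rate = total_cached / total_input if total_input else 0.0
    return {
        "outcome": trace["outcome"],
        "num_calls": len(calls),
        "input_tokens": total_input,
        "cached_input_tokens": total_cached,
        "output_tokens": total_output,
        "cache_hit_rate": hit_rate,
        "total_latency_s": total_latency,
    }


summary = summarize_trace(SAMPLE_TRACE)
print(summary)
```

To run this against real files, replace `SAMPLE_TRACE` with `json.load(open(path))` and rename the keys to whatever the traces actually use.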