AI Agent Papers, 2026

FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents

Synthesizes shortcut-resistant search tasks to train deep search agents by controlling four shortcut risks across entity selection, evidence-graph construction, question formulation, and adversarial refinement. Produces training trajectories with longer pre-answer search and fewer shortcut patterns; code will be released on GitHub.

Introduction

Most synthetic search-task generators increase apparent difficulty by complicating graph structure, but models can still collapse the intended search process via cheaper identifying routes (shortcuts). FORT starts from the observation that structural complexity alone doesn't guarantee realized search difficulty and instead formalizes a shortcut-aware difficulty framework to diagnose and mitigate four concrete shortcut risks: evidence co-coverage, single-clue selectivity, exposed constants, and prior-knowledge binding.
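
Two of the four risks named above can be made concrete with a toy diagnostic. The summary does not define the risks formally, so the following is a minimal sketch under assumed readings: "single-clue selectivity" is taken to mean that one clue alone narrows the candidates to the answer, and "evidence co-coverage" that a single document covers many clues at once. All function names and the data layout are hypothetical and are not taken from the FORT code.

```python
from typing import Dict, List, Set


def clue_selectivity(clue_matches: Dict[str, Set[str]], universe_size: int) -> Dict[str, float]:
    """Fraction of the candidate universe each clue matches (lower = more selective)."""
    return {clue: len(matched) / universe_size for clue, matched in clue_matches.items()}


def single_clue_shortcuts(clue_matches: Dict[str, Set[str]], answer: str) -> List[str]:
    """Clues that identify the answer on their own (assumed reading of the risk)."""
    return sorted(clue for clue, matched in clue_matches.items() if matched == {answer})


def max_co_coverage(doc_clues: Dict[str, Set[str]], n_clues: int) -> float:
    """Largest share of the question's clues covered by any single document."""
    if n_clues == 0:
        return 0.0
    return max((len(clues) for clues in doc_clues.values()), default=0) / n_clues


# Hypothetical toy task: four candidate entities, three clues, answer "A".
clue_matches = {
    "born_1950s": {"A", "B", "C"},
    "won_prize_X": {"A"},          # this clue alone pins down the answer
    "lived_in_Y": {"A", "D"},
}
doc_clues = {
    "doc_1": {"born_1950s", "won_prize_X", "lived_in_Y"},  # one page covers every clue
    "doc_2": {"born_1950s"},
}

selectivity = clue_selectivity(clue_matches, universe_size=4)
shortcuts = single_clue_shortcuts(clue_matches, answer="A")
coverage = max_co_coverage(doc_clues, n_clues=len(clue_matches))
```

In this toy task a generator would want to weaken or rephrase `won_prize_X` and avoid relying on `doc_1`, since either one lets an agent skip the intended multi-step search.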

Key Findings

FORT's shortcut-aware synthesis yields trajectories with higher pre-answer search costs and lower prior-shortcut rates than existing open-source deep-search datasets, so the training signal reflects the intended multi-step search process rather than superficial cues.
By controlling shortcuts at four stages (entity selection, evidence-graph construction, question formulation, adversarial refinement), FORT reduces the exploitable patterns that let agents bypass search, so trained agents must actually acquire evidence across steps to answer.
Supervised fine-tuning on FORT-generated trajectories produced FORT-Searcher, which outperforms comparable-size open-source search agents on challenging deep-search benchmarks with SFT only. This suggests that high-quality, shortcut-resistant synthetic data can substantially improve supervised training for search agents.

How it works (brief)

FORT instruments dataset synthesis with measurable trajectory signatures (solving cost, answer hit time, prior-shortcut rate) to detect realized shortcuts. It then applies targeted interventions during entity selection, evidence-graph creation, and question phrasing, followed by adversarial refinement to iteratively remove remaining shortcut patterns. The authors plan to release dataset generation code and FORT-Searcher code at RUCAIBox/FORT-Searcher on GitHub.
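
The three trajectory signatures named above can be illustrated with a small sketch. The summary does not give their exact definitions, so the ones below are assumptions: solving cost is the number of tool calls before the answer step, answer hit time is the first tool call whose observation contains the gold answer, and a prior shortcut is a trajectory whose query names the gold answer before any observation has revealed it. The `Step` and `Trajectory` types are hypothetical and are not the FORT data format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    kind: str              # "search" or "answer" (hypothetical labels)
    query: str = ""        # text the agent sent to the tool
    observation: str = ""  # text the tool returned


@dataclass
class Trajectory:
    steps: List[Step]
    gold_answer: str


def solving_cost(traj: Trajectory) -> int:
    """Number of tool calls made before the first answer step."""
    cost = 0
    for step in traj.steps:
        if step.kind == "answer":
            break
        cost += 1
    return cost


def answer_hit_time(traj: Trajectory) -> Optional[int]:
    """1-based index of the first tool call whose observation mentions the gold answer."""
    gold = traj.gold_answer.lower()
    calls = 0
    for step in traj.steps:
        if step.kind == "answer":
            break
        calls += 1
        if gold in step.observation.lower():
            return calls
    return None


def uses_prior_shortcut(traj: Trajectory) -> bool:
    """True if a query names the gold answer before any observation revealed it."""
    gold = traj.gold_answer.lower()
    for step in traj.steps:
        if step.kind == "answer":
            return False
        if gold in step.query.lower():
            return True    # answer came from prior knowledge, not retrieval
        if gold in step.observation.lower():
            return False   # answer was discovered through search
    return False


def prior_shortcut_rate(trajs: List[Trajectory]) -> float:
    """Share of trajectories flagged as prior shortcuts."""
    if not trajs:
        return 0.0
    return sum(uses_prior_shortcut(t) for t in trajs) / len(trajs)
```

Under these assumed definitions, a dataset with a low mean solving cost, early hit times, or a high prior-shortcut rate would be a candidate for another round of refinement.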

Who it's for & tradeoffs

Great fit if you are developing or evaluating deep search/QA agents and need training trajectories that enforce multi-step evidence acquisition rather than surface heuristics. Look elsewhere if your task requires naturalistic, human-generated question-answer pairs without synthetic control (FORT intentionally modifies generation to avoid shortcuts) or if you need methods focused on reinforcement learning rather than SFT-driven training.

Information

Website: arxiv.org
Authors: Jia Deng, Yimeng Chen, Xiaoqing Xiang, Ziyang Zeng, Shuo Tang, Wayne Xin Zhao, Feng Chang, Chuan Hao, Yuan Wei, Ran Tao …
Published date: 2026/06/10

More Items

AI Agent Papers, 2026

AREX: Towards a Recursively Self-Improving Agent for Deep Research

Shuqi Lu, Chaofan Li +21, Beijing Academy of Artificial Intelligence (BAAI)

Alternates targeted research and constraint-wise audits to recursively improve long-horizon answers: an inner loop gathers evidence and drafts solutions, an outer loop audits unresolved claims and launches focused follow-ups. Trains 4B dense and 122B-A10B MoE agents with long-horizon RL and agentic mid-training, outperforming comparable-scale baselines on multi-step research benchmarks.

agent-skills ai-agent RL reasoning llm +1

AI Agent Papers, 2026

DataFlow-Harness: A Grounded Code-Agent Platform for Constructing Editable LLM Data Pipelines

Runming He, Zhen Hao Wong +5

Guides an LLM agent to build persistent, editable DAG-based data pipelines via typed, incremental mutations instead of free-form scripts. Combines DataFlow-Skills, a Model Context Protocol exposing live operator registry and pipeline state, and a synchronized Web UI; achieves 93.3% end-to-end pass rate on a 12-task benchmark while cutting cost and latency versus script baselines.

agent-skills mcp ai-agent ai-workflow LLM +4

AI Agent Papers, 2026

EvolvingWorld: An Open-Schema Framework for Co-Evolving Role-Play Agents and World Model in Interactive Literary World

Qing Zong, Yue Guo +3

Models long-horizon interactive literary simulation where characters and world co-evolve; introduces an open-schema framework with a Character Agent and an LLM-based World Model, plus seven trainable tasks and a dataset from 57 books for benchmarking persistent narrative state.

paper LLM NLP ai-agent agent-skills +2
