SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

SWE-agent is a system that lets language model (LM) agents perform software engineering tasks autonomously. Its custom agent-computer interface (ACI) improves the agent's ability to navigate repositories, create and edit code, and execute programs, and the paper reports state-of-the-art results on the SWE-bench and HumanEvalFix benchmarks. [2, 5, 8]

Introduction

Language model (LM) agents are increasingly used for automating complex digital tasks. [5] Just as human developers rely on Integrated Development Environments (IDEs), this paper posits that LM agents also need specialized interfaces to effectively tackle complex challenges like software engineering. [5, 8]

To address this, the paper introduces SWE-agent, a system that provides a tailored agent-computer interface (ACI). [2, 3] This interface is specifically designed to improve an agent's ability to perform key software development actions, including:

Creating and editing code files [2]
Navigating entire code repositories [2]
Executing tests and other programs [2]
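
The three kinds of action listed above can be illustrated with a minimal sketch. It is not SWE-agent's actual interface or code: the class `MiniACI` and its methods `create`, `view`, `edit`, `search`, and `run` are hypothetical names chosen for this illustration, and the windowed file view is an assumption about what an agent-friendly interface might offer.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


class MiniACI:
    """Hypothetical, heavily simplified agent-computer interface (not SWE-agent's code)."""

    def __init__(self, root: str, window: int = 5) -> None:
        self.root = Path(root)
        self.window = window  # number of lines shown per view

    def create(self, rel_path: str, text: str = "") -> str:
        """Create a file inside the repository and report what happened."""
        path = self.root / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)
        return f"created {rel_path} ({len(text.splitlines())} lines)"

    def view(self, rel_path: str, start: int = 1) -> str:
        """Return a numbered window of lines so the agent sees a bounded slice."""
        lines = (self.root / rel_path).read_text().splitlines()
        chunk = lines[start - 1 : start - 1 + self.window]
        return "\n".join(f"{start + i}: {line}" for i, line in enumerate(chunk))

    def edit(self, rel_path: str, start: int, end: int, new_text: str) -> str:
        """Replace lines start..end (1-indexed, inclusive) and show the result."""
        path = self.root / rel_path
        lines = path.read_text().splitlines()
        lines[start - 1 : end] = new_text.splitlines()
        path.write_text("\n".join(lines) + "\n")
        return self.view(rel_path, start)

    def search(self, term: str) -> list[str]:
        """Navigate the repository: list 'file:line' hits for a search term."""
        hits = []
        for path in sorted(self.root.rglob("*.py")):
            for number, line in enumerate(path.read_text().splitlines(), start=1):
                if term in line:
                    hits.append(f"{path.relative_to(self.root).as_posix()}:{number}")
        return hits

    def run(self, rel_path: str) -> str:
        """Execute a Python file in the repository and return its combined output."""
        result = subprocess.run(
            [sys.executable, rel_path],
            cwd=self.root, capture_output=True, text=True, timeout=30,
        )
        return (result.stdout + result.stderr).strip()


# Usage: locate a buggy function, run it, fix one line, and run it again.
with tempfile.TemporaryDirectory() as repo:
    aci = MiniACI(repo)
    aci.create("calc.py", "def add(a, b):\n    return a - b\n\nprint(add(2, 3))\n")
    hits = aci.search("def add")                    # navigate
    before = aci.run("calc.py")                     # execute
    aci.edit("calc.py", 2, 2, "    return a + b")   # edit
    after = aci.run("calc.py")                      # execute again
```

Each method returns a short text observation rather than raw state, which is the kind of concise feedback an LM agent could read before choosing its next action.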

With this specialized ACI, SWE-agent achieved state-of-the-art performance at the time of publication, with pass@1 rates of 12.5% on SWE-bench and 87.7% on HumanEvalFix, well above the previous best results from non-interactive LMs. [2, 5, 9]

Information

Website: arxiv.org
Authors: John Yang, Carlos E. Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, Ofir Press
Published date: 2024/05/06

More Items

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

2025

DeepSeek-AI, Aixin Liu +262

DeepSeek-V3.2 is an open large language model that balances high computational efficiency with superior reasoning and agent capabilities. Key innovations include DeepSeek Sparse Attention (DSA) for reduced complexity in long contexts, a scalable reinforcement learning framework achieving GPT-5-level performance, and a large-scale agentic task synthesis pipeline for improved generalization in tool-use scenarios.

deepseek LLM paper RL ai-agent

LightRAG

2024

Zirui Guo, Lianghao Xia +3

LightRAG is an open-source framework designed for simple and fast Retrieval-Augmented Generation (RAG), integrating knowledge graphs, vector search, and efficient LLM-based processing to enhance question-answering over large document collections.

RAG LLM NLP github ai-development+5

ReAct: Synergizing Reasoning and Acting in Language Models

2022

Shunyu Yao, Jeffrey Zhao +5

This paper introduces ReAct, an approach that integrates reasoning and acting in large language models (LLMs). ReAct enables LLMs to generate both reasoning traces and task-specific actions in an interleaved manner. This synergy allows reasoning to help induce, track, and update action plans, while actions interface with external sources like knowledge bases to gather more information, overcoming issues of hallucination and error propagation in prior methods.

paper LLM NLP ai-agent google+1