LogoAIAny
Icon for item

PentestGPT

PentestGPT is an open-source research prototype that uses large language models as autonomous agents to perform automated penetration testing. It supports Docker-first deployment, session persistence, multiple LLM backends (Anthropic/Claude/OpenAI/local), and a benchmark suite of vulnerability challenges.

Introduction

Overview

PentestGPT is an AI-powered, autonomous penetration testing agent and research prototype. The project demonstrates how modern LLMs can be orchestrated as intelligent agents to perform vulnerability discovery, exploit development, and CTF-style challenge solving. It was published alongside a USENIX Security 2024 paper that evaluates and contextualizes the approach.

Key Features
  • Autonomous agent pipeline: orchestrates LLM-driven reasoning and action steps for penetration testing tasks.
  • Docker-first design: pre-built Docker images with common security tools for reproducible testing environments.
  • Session persistence: save and resume testing sessions so long investigations can be continued later.
  • Multi-LLM support: integrations with Anthropic, Claude (OAuth), OpenAI, OpenRouter, and options for routing to local LLM servers (LM Studio, Ollama, text-generation-webui).
  • Benchmark suite: includes 100+ vulnerability challenges across categories like web, crypto, reversing, forensics, pwn, and privilege escalation to evaluate capabilities.
  • Extensible architecture: modular components for tool execution, background reasoning, and model routing.
Typical Usage
  1. Install and run via Docker (repository provides Makefile helpers and a Docker-first workflow).
  2. Configure an LLM provider or local LLM endpoint during first-time setup (Anthropic/OpenAI/Claude/local).
  3. Launch interactive or non-interactive pentest runs targeting an authorized host or benchmark challenge.
  4. Observe live walkthroughs and step-by-step actions as the agent searches, tests, and reports findings.
Technical Notes
  • The project routes different workloads to different models (e.g., "think" for reasoning-heavy tasks, "webSearch" for search-related tasks) and supports custom router configuration in CCR config files.
  • Local LLM support is provided by pointing PentestGPT at OpenAI-compatible endpoints exposed by tools such as LM Studio or Ollama.
  • Telemetry is anonymous by default and can be disabled; the maintainers emphasize that no sensitive command outputs or credentials are transmitted.
Ethics & Use Policy

PentestGPT is explicitly a research prototype for authorized security testing, evaluation, and education. The authors and repository disclaimers emphasize that it should only be used in legal, authorized contexts; misuse for unauthorized attacks is prohibited and unethical.

Citation & Origins

The tool is associated with a USENIX Security 2024 paper titled "PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing" and maintained in the GreyDGL GitHub organization. The repository contains demos, installation instructions, and a benchmark suite for reproducible research.

Information

  • Websitegithub.com
  • AuthorsGelei Deng, Yi Liu, Víctor Mayoral-Vilches, Peng Liu, Yuekang Li, Yuan Xu, Tianwei Zhang, Yang Liu, Martin Pinzger, Stefan Rass, GreyDGL (GitHub repo owner / maintainer)
  • Published date2023/02/27

Categories

More Items