Overview
PentestGPT is an AI-powered, autonomous penetration testing agent and research prototype. The project demonstrates how modern LLMs can be orchestrated as intelligent agents to perform vulnerability discovery, exploit development, and CTF-style challenge solving. It was published alongside a USENIX Security 2024 paper that evaluates and contextualizes the approach.
Key Features
- Autonomous agent pipeline: orchestrates LLM-driven reasoning and action steps for penetration testing tasks.
- Docker-first design: pre-built Docker images with common security tools for reproducible testing environments.
- Session persistence: save and resume testing sessions so long investigations can be continued later.
- Multi-LLM support: integrations with Anthropic, Claude (OAuth), OpenAI, OpenRouter, and options for routing to local LLM servers (LM Studio, Ollama, text-generation-webui).
- Benchmark suite: includes 100+ vulnerability challenges across categories like web, crypto, reversing, forensics, pwn, and privilege escalation to evaluate capabilities.
- Extensible architecture: modular components for tool execution, background reasoning, and model routing.
Typical Usage
- Install and run via Docker (repository provides Makefile helpers and a Docker-first workflow).
- Configure an LLM provider or local LLM endpoint during first-time setup (Anthropic/OpenAI/Claude/local).
- Launch interactive or non-interactive pentest runs targeting an authorized host or benchmark challenge.
- Observe live walkthroughs and step-by-step actions as the agent searches, tests, and reports findings.
Technical Notes
- The project routes different workloads to different models (e.g., "think" for reasoning-heavy tasks, "webSearch" for search-related tasks) and supports custom router configuration in CCR config files.
- Local LLM support is provided by pointing PentestGPT at OpenAI-compatible endpoints exposed by tools such as LM Studio or Ollama.
- Telemetry is anonymous by default and can be disabled; the maintainers emphasize that no sensitive command outputs or credentials are transmitted.
Ethics & Use Policy
PentestGPT is explicitly a research prototype for authorized security testing, evaluation, and education. The authors and repository disclaimers emphasize that it should only be used in legal, authorized contexts; misuse for unauthorized attacks is prohibited and unethical.
Citation & Origins
The tool is associated with a USENIX Security 2024 paper titled "PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing" and maintained in the GreyDGL GitHub organization. The repository contains demos, installation instructions, and a benchmark suite for reproducible research.
