LogoAIAny
Icon for item

garak

garak is an open-source command-line LLM vulnerability scanner and red-teaming assessment kit. It probes large language models for failures such as hallucination, data leakage, prompt injection, misinformation, toxicity, jailbreaks, and other weaknesses. garak supports many backends (Hugging Face, OpenAI, Replicate, AWS Bedrock, gguf/llama.cpp, etc.), provides a wide range of probes and detectors, and produces structured logs and JSONL reports. Licensed under Apache-2.0.

Introduction

garak — LLM vulnerability scanner (Generative AI Red-teaming & Assessment Kit)

garak is an open-source command-line toolkit designed to probe and evaluate large language models (LLMs) for undesirable behaviors and security weaknesses. The project aggregates a broad set of static, dynamic, and adaptive probes to explore failures such as hallucinations, training-data leakage, prompt injection, misinformation, toxicity generation, jailbreaks, and other emergent weaknesses. It was published as a GitHub project (moved to the NVIDIA organization) and is intended for researchers, red-teamers, and engineering teams who need to assess model robustness and safety.

Key features
  • Comprehensive probe library: multiple probe types (e.g., promptinject, encoding, dan, leakreplay, malwaregen, realtoxicityprompts, snowball, xss, and many others) that target different failure modes.
  • Detector/evaluator system: each probe can be paired with detectors that automatically flag failures; results are summarized per-probe with fail rates and logs.
  • Multi-backend support: works with Hugging Face (local & Inference API), OpenAI, Replicate, AWS Bedrock, litellm, gguf/llama.cpp (>= specific versions), REST endpoints and many others — making it usable across hosted, on-prem, and local models.
  • Extensible plugin architecture: probes, detectors, generators, harnesses and evaluators are implemented as plugins, making it straightforward to add custom tests.
  • CLI-first workflow: designed as a command-line tool with simple install and run patterns; outputs structured JSONL run reports and separate hit logs for vulnerabilities.
  • Logging & analysis: produces garak.log, detailed JSONL run reports and hit logs; includes example analysis scripts to inspect problematic prompts and probe hits.
  • Open-source license: Apache-2.0 license, encouraging adoption and contribution.
Typical usage

Install via PyPI or from GitHub for the latest code:

python -m pip install -U garak
# or
python -m pip install -U git+https://github.com/NVIDIA/garak.git@main

Run a scan (example probing an OpenAI chat model):

export OPENAI_API_KEY="sk-..."
python3 -m garak --target_type openai --target_name gpt-3.5-turbo --probes encoding

You can list probes, detectors and generators, and selectively run only specific tests. garak prints progress and records detailed results to a JSONL report file.

Supported targets & integrations

garak provides generator plugins for many model interfaces, including:

  • Hugging Face pipeline & Inference API
  • OpenAI Chat & Completion APIs
  • Replicate
  • AWS Bedrock
  • gguf/llama.cpp and local gguf models
  • REST endpoints (rest.RestGenerator) for arbitrary HTTP-based models
  • NIM endpoints and other cloud vendor integrations

Several probes assume different detector backends (e.g., toxicity detectors, pattern matchers), and garak ships with a variety of built-in detectors.

Development & extensibility

The codebase is organized into plugin categories: garak.probes, garak.detectors, garak.generators, garak.harnesses, and garak.evaluators. Developers can add plugins by inheriting from provided base classes and running local tests. The project includes documentation (docs.garak.ai, readthedocs) and an active Discord for support.

Logging, reporting & citation

Runs produce:

  • garak.log (debugging info),
  • JSONL run reports (one per run), and
  • a hit log listing attempts that triggered vulnerabilities.

If you use garak in research, the README includes a citation entry and references a related preprint.

Governance & license

The repository is hosted under the NVIDIA GitHub organization (original contributors include Leon Derczynski and others). The project is distributed under the Apache-2.0 license and welcomes community contributions via PRs and issues.

Where to find more

Official docs and project pages are linked from the repository: docs.garak.ai, garak.readthedocs.io, and the project homepage garak.ai. The README also links to slides and paper references for deeper reading.

Information

  • Websitegithub.com
  • AuthorsLeon Derczynski, Erick Galinkin, Jeffrey Martin, Subho Majumdar, Nanna Inie, NVIDIA
  • Published date2023/05/10

Categories