
PentestAgent is a novel LLM-driven penetration testing framework to automate intelligence gathering, vulnerability analysis, and exploitation stages, reducing manual intervention. For more information, read our paper at https://arxiv.org/abs/2411.05185


Pentest-Agent

Overview

PentestAgent is a novel LLM-driven penetration testing framework to automate intelligence gathering, vulnerability analysis, and exploitation stages, reducing manual intervention.

The PentestAgent framework consists of three modules, each corresponding to one of the penetration testing stages above:

  1. Reconnaissance Agent
  2. Planning Agent
  3. Execution Agent

For further information, please refer to our paper.

Installation

1. Download Source Code

git clone https://github.com/nbshenxm/pentest-agent.git
cd pentest-agent

2. Setup Environment Variables

Several environment variables must be set before running the agents. If you are not familiar with environment variables, set them in the .env file.

  • GITHUB_KEY: GitHub token for GitHub search
  • OPENAI_API_KEY: OpenAI API key for accessing OpenAI models
  • HUGGING_FACE_TOKEN: HuggingFace token for accessing HuggingFace models
  • INDEX_STORAGE_DIR: directory for storing index for RAG
  • PLANNING_OUTPUT_DIR: directory for storing vulnerability analysis results
  • LOG_DIR: directory for storing logs
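Before launching an agent, it can help to confirm that every variable listed above is actually set. The following is a minimal sketch, not part of the PentestAgent code base: the helper name `missing_env_vars` is hypothetical, and the check only inspects the process environment, so it assumes the values from the .env file have already been loaded by whatever mechanism the project uses.

```python
import os

# Variable names are taken from the list above.
REQUIRED_VARS = [
    "GITHUB_KEY",
    "OPENAI_API_KEY",
    "HUGGING_FACE_TOKEN",
    "INDEX_STORAGE_DIR",
    "PLANNING_OUTPUT_DIR",
    "LOG_DIR",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the current process environment; a plain dict
    can be passed instead, which makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running the script on its own prints which of the six variables still need a value.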

3. Install Dependencies

  • Python version: 3.12

  • Python libraries can be installed by running pip install -r requirements.txt

We recommend creating a virtual environment before installing the dependencies:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements.txt 

or

conda create -n venv python=3.12
conda activate venv
python -m pip install -r requirements.txt

  • CVEMap is needed to fetch CVE-related information. Follow their installation instructions.
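The prerequisites above (Python 3.12 and CVEMap) can be checked with a short script. This is only a sketch under stated assumptions: the helper names are hypothetical, and the executable name `cvemap` is an assumption about how CVEMap is installed on your system, so adjust it if your installation differs.

```python
import shutil
import sys


def python_version_ok(version_info=None, required=(3, 12)):
    """Return True if the major.minor version matches `required`."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) == required


def tool_on_path(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    print("Python 3.12:", "ok" if python_version_ok() else "version mismatch")
    # "cvemap" is an assumed executable name, not confirmed by this README.
    print("CVEMap:", "found" if tool_on_path("cvemap") else "not found on PATH")
```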

Run Agents

Warning: please run the system in an isolated environment (e.g., VM, container) to avoid unintended consequences from the execution.

Reconnaissance Agent

Given a target IP, the reconnaissance agent collects information about the target.

Entry point: pentest_agent/agents/recon_agent.py

Usage:

  1. Set the topic name for future reference
  2. Set the target IP
  3. Run the program and check the output in terminal
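Because the agent acts on whatever target IP it is given, it may be worth validating the address before setting it, in line with the warning about isolated environments. The sketch below is not taken from recon_agent.py; `check_target_ip` is a hypothetical helper that uses only the Python standard library to reject malformed addresses and to report whether the address lies in a private range.

```python
import ipaddress


def check_target_ip(target):
    """Parse `target` as an IPv4/IPv6 address.

    Returns (address, is_private). Raises ValueError if `target`
    is not a valid IP address.
    """
    address = ipaddress.ip_address(target)
    return address, address.is_private


if __name__ == "__main__":
    # Example address only; replace with the IP of your own lab target.
    address, is_private = check_target_ip("10.0.0.5")
    if not is_private:
        print(f"Warning: {address} is not a private address.")
    else:
        print(f"{address} is a private address.")
```

A private address does not by itself guarantee isolation, so treat this as a sanity check rather than a safeguard.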

Planning Agent

Given a product of interest, the planning agent will search for and download relevant vulnerabilities and corresponding exploits.

Entry point: pentest_agent/agents/planning_agent.py

Usage:

  1. Set the desired language model
  2. Set the product of interest
  3. Run the program and check the results in the output directory

Execution Agent

Given an exploit code repository, the execution agent uses the information collected during the reconnaissance phase to automatically execute and debug the exploit.

Entry point: pentest_agent/agents/execution_agent.py

Usage:

  1. Set the topic (the same topic used in reconnaissance) or set it to None if there is no previous reconnaissance
  2. Set the exploit code repository local path at line
  3. Run the program and monitor the terminal for execution steps

Optional:

  • Manually provide environmental information

Benchmark

Infrastructure

We used VulHub as the infrastructure for the benchmark. VulHub provides containers that reproduce various vulnerable environments.

Target Selection

For format consistency, we consider only vulnerabilities that have a CVE number. At the time we evaluated PentestAgent, VulHub contained 90 applications with 189 CVEs. For each application, we include the CVE with the highest CVSS score in our benchmark.
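The selection rule (one CVE per application, chosen by highest CVSS score) can be illustrated as follows. This is a sketch, not the script used to build the benchmark: the record layout and function name are hypothetical, the sample entries are placeholders rather than real CVEs or scores, and the source does not say how ties are broken, so this version simply keeps the first record seen.

```python
def select_highest_cvss(records):
    """Map each application to its record with the highest CVSS score.

    On a tie, the record that appears first in `records` is kept.
    """
    best = {}
    for record in records:
        app = record["application"]
        if app not in best or record["cvss"] > best[app]["cvss"]:
            best[app] = record
    return best


# Placeholder records for illustration only; not real CVEs or scores.
sample = [
    {"application": "app-a", "cve": "CVE-0000-0001", "cvss": 7.5},
    {"application": "app-a", "cve": "CVE-0000-0002", "cvss": 9.8},
    {"application": "app-b", "cve": "CVE-0000-0003", "cvss": 5.3},
]

selected = select_highest_cvss(sample)
for app, record in selected.items():
    print(app, record["cve"], record["cvss"])
```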

Our results

It has been a while since we performed our evaluation. We are working on adding new scenarios to the benchmark beyond VulHub, and on evaluating PentestAgent with a variety of advanced LLM backbones. We will publish our benchmark results once this work is finished.

Contribution

If you have suggestions for improvements or find a bug, feel free to open an issue. I'll try to respond as soon as possible.
