PentestAgent is a novel LLM-driven penetration testing framework to automate intelligence gathering, vulnerability analysis, and exploitation stages, reducing manual intervention.
The PentestAgent framework consists of several modules corresponding to aforementioned penetration testing stages:
- Reconnaissance Agent
- Planning Agent
- Execution Agent
For further information, please refer to our paper.
git clone https://github.com/nbshenxm/pentest-agent.git
cd pentest-agent
Several environment variables need to be filled in. If you are not familiar with environment variables, set them in the .env
file.
- GITHUB_KEY: GitHub Token for github search
- OPENAI_API_KEY: OpenAI API key for accessing OpenAI models
- HUGGING_FACE_TOKEN: HuggingFace token for accessing HuggingFace models
- INDEX_STORAGE_DIR: directory for storing index for RAG
- PLANNING_OUTPUT_DIR: directory for storing vulnerability analysis results
- LOG_DIR: directory for storing logs
-
Python version: 3.12
-
Python libraries can be installed by running
pip install -r requirements.txt
It is recommended to create a virtual environment before installing dependencies
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements.txt
or
conda create -n venv python=3.12
conda activate venv
python -m pip install -r requirements.txt
- CVEMap is needed to fetch CVE-related information. Follow their installation instructions.
Warning: please run the system in an isolated environment (e.g., VM, container) to avoid unintended consequences from the execution.
Given a target IP, the reconnaissance agent will collect information of the target.
Entry point: pentest_agent/agents/recon_agent.py
Usage:
- Set the topic name for future reference
- Set the target IP
- Run the program and check the output in terminal
Given a product of interest, the planning agent will search for and download relevant vulnerabilities and corresponding exploits.
Entry point: pentest_agent/agents/planning_agent.py
Usage:
- Set the desired language model
- Set the product of interest
- Run the program and check the results in the output directory
Given an exploit code repository, the execution agent can leverage the information collected during reconnaissance phase to automatically execute and debug the exploit.
Entry point: pentest_agent/agents/execution_agent.py
Usage:
- Set the topic (the same topic used in reconnaissance) or set it to None if there is no previous reconnaissance
- Set the exploit code repository local path at line
- Run the program and monitor the terminal for execution steps
Optional:
- Manually provide environmental information
We used vulhub as the infrastructure of the benchmark. VulHub provides containers that reproduce various vulnerable environments.
Out of format consistency consideration, we only consider the vulnerabilities with a CVE number. At the time when we evaluate PentestAgent, VulHub contains 90 applications with 189 CVEs. For each application, we include the CVE with the highest CVSS score in our benchmark.
It's been a while since we performed our evaluation. We are working on including some new scenarios in addition to the VulHub in the benchmark, as well as evaluating PentestAgent on a variety of advanced LLM backbones. We will publish our results on the benchmark these works are finished.
If you have any suggestions for improvements or found bugs, feel free to drop an issue. I'll try to respond ASAP.