[py-tx] Migrate setup.py to pyproject.toml for modern packaging #1746

Open
wants to merge 19 commits into
base: main

Contributor

b8zhong commented Feb 3, 2025

Summary

Migrate setup.py to pyproject.toml for modern packaging!

This removes the setup.py file from the main py-tx project in favor of the pyproject.toml format.
I used hatchling as the build backend, though I'm not sure it's the best choice.

Closes #1611!

Test Plan

Ran python3 -m pip install -e . and everything looks good, I think:

python3 -m pip install -e .
Obtaining file:///Users/vincen/projects/open-source/ThreatExchange/python-threatexchange
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Installing backend dependencies ... done
  Preparing editable metadata (pyproject.toml) ... done
Requirement already satisfied: dacite in /Users/vincen/miniforge3/lib/python3.12/site-packages (from threatexchange==1.2.4) (1.8.1)
Requirement already satisfied: faiss-cpu in /Users/vincen/miniforge3/lib/python3.12/site-packages (from threatexchange==1.2.4) (1.9.0.post1)
Requirement already satisfied: numpy in /Users/vincen/miniforge3/lib/python3.12/site-packages (from threatexchange==1.2.4) (2.2.1)
Requirement already satisfied: packaging in /Users/vincen/miniforge3/lib/python3.12/site-packages (from threatexchange==1.2.4) (24.2)
Collecting pdqhash (from threatexchange==1.2.4)
  Using cached pdqhash-0.2.7-cp312-cp312-macosx_10_13_universal2.whl.metadata (2.1 kB)
Requirement already satisfied: pillow in /Users/vincen/miniforge3/lib/python3.12/site-packages (from threatexchange==1.2.4) (11.0.0)
Requirement already satisfied: python-dateutil in /Users/vincen/miniforge3/lib/python3.12/site-packages (from threatexchange==1.2.4) (2.9.0.post0)
Collecting python-levenshtein (from threatexchange==1.2.4)
  Using cached python_Levenshtein-0.26.1-py3-none-any.whl.metadata (3.7 kB)
Requirement already satisfied: requests in /Users/vincen/miniforge3/lib/python3.12/site-packages (from threatexchange==1.2.4) (2.32.3)
Collecting types-python-dateutil (from threatexchange==1.2.4)
  Using cached types_python_dateutil-2.9.0.20241206-py3-none-any.whl.metadata (2.1 kB)
Requirement already satisfied: urllib3 in /Users/vincen/miniforge3/lib/python3.12/site-packages (from threatexchange==1.2.4) (2.2.3)
Requirement already satisfied: six>=1.5 in /Users/vincen/miniforge3/lib/python3.12/site-packages (from python-dateutil->threatexchange==1.2.4) (1.17.0)
Collecting Levenshtein==0.26.1 (from python-levenshtein->threatexchange==1.2.4)
  Using cached levenshtein-0.26.1-cp312-cp312-macosx_11_0_arm64.whl.metadata (3.2 kB)
Collecting rapidfuzz<4.0.0,>=3.9.0 (from Levenshtein==0.26.1->python-levenshtein->threatexchange==1.2.4)
  Using cached rapidfuzz-3.12.1-cp312-cp312-macosx_11_0_arm64.whl.metadata (11 kB)
Requirement already satisfied: charset_normalizer<4,>=2 in /Users/vincen/miniforge3/lib/python3.12/site-packages (from requests->threatexchange==1.2.4) (3.4.0)
Requirement already satisfied: idna<4,>=2.5 in /Users/vincen/miniforge3/lib/python3.12/site-packages (from requests->threatexchange==1.2.4) (3.10)
Requirement already satisfied: certifi>=2017.4.17 in /Users/vincen/miniforge3/lib/python3.12/site-packages (from requests->threatexchange==1.2.4) (2024.12.14)
Using cached pdqhash-0.2.7-cp312-cp312-macosx_10_13_universal2.whl (105 kB)
Using cached python_Levenshtein-0.26.1-py3-none-any.whl (9.4 kB)
Using cached levenshtein-0.26.1-cp312-cp312-macosx_11_0_arm64.whl (157 kB)
Using cached types_python_dateutil-2.9.0.20241206-py3-none-any.whl (14 kB)
Using cached rapidfuzz-3.12.1-cp312-cp312-macosx_11_0_arm64.whl (1.4 MB)
Building wheels for collected packages: threatexchange
  Building editable for threatexchange (pyproject.toml) ... done
  Created wheel for threatexchange: filename=threatexchange-1.2.4-py3-none-any.whl size=7686 sha256=3f6c86217fd62a4bfee6f43caae13390559516b47c8ed3dbc1662416c070d14a
  Stored in directory: /private/var/folders/x2/kh7kf2d55z98x2pzkqk3fy3m0000gn/T/pip-ephem-wheel-cache-p0ba8471/wheels/b9/68/6f/fbc3ee5372a15163dda2ce0f93dec440da5729b38b546fc23c
Successfully built threatexchange
Installing collected packages: pdqhash, types-python-dateutil, rapidfuzz, Levenshtein, python-levenshtein, threatexchange
Successfully installed Levenshtein-0.26.1 pdqhash-0.2.7 python-levenshtein-0.26.1 rapidfuzz-3.12.1 threatexchange-1.2.4 types-python-dateutil-2.9.0.20241206

b8zhong requested a review from Dcallies as a code owner February 3, 2025 02:19
b8zhong force-pushed the pypackage-reformatting branch 2 times, most recently from 37c5d75 to ae1a6ed, February 3, 2025 02:41
b8zhong marked this pull request as draft February 3, 2025 04:12
Contributor Author

b8zhong commented Feb 3, 2025

I'm aware it's not passing; working on it.

b8zhong force-pushed the pypackage-reformatting branch from 86cffb0 to d427439, February 6, 2025 14:30
b8zhong marked this pull request as ready for review February 6, 2025 15:50
Contributor Author

b8zhong commented Feb 6, 2025

@Dcallies This is ready for a look now, linting error aside. I'd rather not touch that, because fixing it usually makes mypy complain even more.

Then we can version bump I'm guessing?

Contributor Author

b8zhong commented Feb 6, 2025

Err... attempting to fix the types caused more things to break, so I'm going to skip that.

@Dcallies
Contributor

Sorry, been traveling! Looking now.

Contributor

Dcallies left a comment

Hmm, the rollout of this might be tricky. I believe you can dry-run the package publishing steps, but it might be up to me to test them against the test PyPI instance.

I have some blocking questions inline, and my top-level one is mostly about how we can test that packaging still works. I have the credentials for the PyPI instances and can test against them if we need to.

Thanks for giving this a try!

python-threatexchange/pyproject.toml

[tool.hatch.version]
path = "version.txt"
pattern = "^(?P<version>[\\d.]+)$"
Contributor

blocking: I don't think this will correctly capture our version strings, which are in \d+\.\d+\.\d+ form.

Contributor Author

Yep, changed to "^(?P<version>\\d+\\.\\d+\\.\\d+)$"
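
To make the difference between the two patterns concrete, here is a minimal sketch using Python's `re` module. It assumes the version file holds a single line followed by a newline and that the pattern is searched in multiline mode; that search mode is an assumption about the build backend, not something confirmed in this thread. The helper name `extract_version` and the sample strings are made up for illustration.

```python
import re
from typing import Optional

# Patterns from this thread, written as Python raw strings
# (the TOML versions double each backslash).
LOOSE = r"^(?P<version>[\d.]+)$"
STRICT = r"^(?P<version>\d+\.\d+\.\d+)$"


def extract_version(pattern: str, contents: str) -> Optional[str]:
    """Return the captured version, or None when the pattern does not match.

    re.MULTILINE is an assumption here, so that `$` can match just
    before a trailing newline in the file contents.
    """
    match = re.search(pattern, contents, flags=re.MULTILINE)
    return match.group("version") if match else None


if __name__ == "__main__":
    # Hypothetical file contents, not taken from the repository.
    for sample in ("1.2.4\n", "1.2\n", "1..4\n"):
        print(repr(sample), extract_version(LOOSE, sample), extract_version(STRICT, sample))
```

Under these assumptions both patterns accept a string like 1.2.4; the stricter one additionally rejects malformed strings such as 1.2 or 1..4 that the looser character class lets through.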

[project]
name = "threatexchange"
dynamic = ["version"]
description = "Python Library for Signal Exchange"
Contributor

The old code currently loads a long description (shown on PyPI) from DESCRIPTION.rst and README.md. Will we need that here to retain the same behavior on PyPI?

Contributor Author

b8zhong, Feb 11, 2025

On line 9, readme = "README.md" should work.

In the original setup.py, though, DESCRIPTION.rst wasn't used; the description that shows up right now is the short one, so I thought I'd keep the description that was there. I think it only shows up on the PyPI package index, so it's up to you which one to use.

[Screenshot: 2025-02-10, 10:54 PM]

Contributor Author

I ended up using the .rst one, as that was probably the original intention.
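
For context on the readme choice: when a pyproject.toml readme is given as a plain path, its content type is inferred from the file suffix, so pointing it at DESCRIPTION.rst instead of README.md changes how the long description is declared to the package index. A minimal sketch of that inference follows; `readme_content_type` is a hypothetical helper written for illustration and is not part of any packaging tool.

```python
from pathlib import PurePath

# Suffix-to-content-type mapping for a `readme = "<path>"` string, following
# the pyproject.toml metadata specification. Other suffixes are left out on
# purpose, since tools are not required to recognize them.
_README_CONTENT_TYPES = {
    ".md": "text/markdown",
    ".rst": "text/x-rst",
}


def readme_content_type(readme_path: str) -> str:
    """Infer the long-description content type from the readme file suffix."""
    suffix = PurePath(readme_path).suffix.lower()
    try:
        return _README_CONTENT_TYPES[suffix]
    except KeyError:
        raise ValueError(f"cannot infer content type for {readme_path!r}") from None
```

Either file should be usable as the long description; the sketch only shows that the declared content type follows the file that is chosen.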

extras_require = {}

for extension_dir in extensions_dir.iterdir():
    requirements = extension_dir / "requirements.txt"
Contributor

Did you go through these and include them? Or do we have options to make these partially dynamic (I don't know which is better).

Contributor Author

Yeah, I did:

extensions = [
    "vpdq",
    "py-tlsh",
    "pdfminer.six",
    "pytesseract",
]

Those are the four. I think it's easier to state them outright than to discover them dynamically.
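
If the static list is kept, one way to guard against drift is a small check that re-reads the per-extension requirements.txt files and compares them with the hardcoded entries. The sketch below is an illustration only: `collect_extras` and `missing_from_static` are hypothetical helpers, the directory layout is assumed from the setup.py fragment quoted above, and the comparison is by exact string, so version pins would need to match too.

```python
from pathlib import Path
from typing import Dict, List


def collect_extras(extensions_dir: Path) -> Dict[str, List[str]]:
    """Map each extension directory name to the requirements listed inside it."""
    extras: Dict[str, List[str]] = {}
    for extension_dir in sorted(extensions_dir.iterdir()):
        requirements = extension_dir / "requirements.txt"
        if not requirements.is_file():
            # Skip entries without a requirements file (including plain files).
            continue
        lines = requirements.read_text().splitlines()
        # Keep only real requirement lines: drop blanks and comments.
        extras[extension_dir.name] = [
            line.strip()
            for line in lines
            if line.strip() and not line.strip().startswith("#")
        ]
    return extras


def missing_from_static(extras: Dict[str, List[str]], static: List[str]) -> List[str]:
    """Return requirements found on disk that the static list does not mention."""
    found = {req for reqs in extras.values() for req in reqs}
    return sorted(found - set(static))
```

A test could call `missing_from_static(collect_extras(path_to_extensions), extensions)` and fail when the result is non-empty, which keeps the explicit list while still catching a new extension dependency that was never added to it.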

Contributor

What do you recommend we do with the requirements.txt in those underlying extensions? Should we delete them?

Contributor Author

Maybe leave them for now? I don't think anything else uses them, though.

Contributor

Dcallies left a comment

Okay, I think the next steps are on me to figure out how to test the package building, and make sure the produced package is still installable. Let me patch you and see if I can figure that out - will aim to do it tomorrow.
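
One low-risk check that can run before any upload is to open the built wheel and inspect its metadata, since a wheel is a zip archive with a METADATA file inside its .dist-info directory. The sketch below is only an illustration of that idea: `read_wheel_metadata` is a hypothetical helper, and the wheel path passed to it would be whatever the build step actually produces.

```python
import zipfile
from email.parser import Parser
from typing import Dict


def read_wheel_metadata(wheel_path: str) -> Dict[str, object]:
    """Return selected core-metadata fields from a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # A wheel stores its metadata at <name>-<version>.dist-info/METADATA.
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        raw = wheel.read(metadata_name).decode("utf-8")
    # METADATA uses an email-style header format, so the stdlib parser can read it.
    message = Parser().parsestr(raw)
    return {
        "name": message.get("Name", ""),
        "version": message.get("Version", ""),
        "description_content_type": message.get("Description-Content-Type", ""),
        # The long description may be carried in the message body.
        "has_description": bool(message.get_payload().strip()),
    }
```

Comparing the returned name, version, and description fields against what the old setup.py produced would show whether the migration kept the published metadata intact, without needing index credentials.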

Dcallies self-requested a review February 11, 2025 15:51
Contributor Author

b8zhong commented Feb 11, 2025

I did a quick Google search, and apparently TestPyPI is a thing. Maybe that helps.

Successfully merging this pull request may close these issues.

[py-tx] Modernize packaging
3 participants