Skip to content

Fix mypy type checking via creating a nox typecheck session #13476

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 23 commits into
base: main
Choose a base branch
from

Conversation

notatallshaw
Copy link
Member

@notatallshaw notatallshaw commented Jul 11, 2025

This moves type checking out of the pre-commit and into a dedicated nox -s typecheck, this was inspired by and succeeds #12866.

On top of the reasons outlined there there are two big issues with pip's type checking:

1. The additional dependencies in the .pre-commit.yaml are rarely updated and currently obsolete

This PR fixes that by making the nox -s typecheck a new type-check dependency groups, which inherits from the existing test dependency group, a new docs dependency group, and a couple of other dependency groups that define various parts of pip.

Taking the changes from #12866 and apply this increased the number of mypy errors from 0 to 48.

2. Vendored dependencies that use type stubs are not currently type checked

Pip vendors libraries like requests which do not include type hints, normally this is fixed by installing a stubs package, types-requests at type hint time, however this was not being picked up by mypy because it does not recognized pip._vendor.requests as requests.

This can be fixed by doing all the imports of vendored libraries that don't include type hints like this (fortunately there are only a few libraries that need this):

if TYPE_CHECKING:
    import requests
else:
    from pip._vendor import requests

Doing this increased the number of mypy errors from 48 to 71.

The rest of the PR is simply about fixing all the type check errors, I tried to be very conservative, using # type: ignore and cast rather than making complicated refactors, removing this can be done by follow up PRs if someone is so inclined.

A side benefit is pre-commit runs much much faster, allowing you to enable it locally and it not be painful.

@notatallshaw notatallshaw marked this pull request as draft July 11, 2025 05:47
@notatallshaw
Copy link
Member Author

Ah, there's a number of issues to work through, it seems I broke a few tests, and vendoring uses an old version of toml that can not correctly parse mixed type tables. I'll have to return to this next week.

@notatallshaw notatallshaw changed the title Fix mypy type checking via nox typecheck session Fix mypy type checking via creating a nox typecheck session Jul 11, 2025
@notatallshaw notatallshaw force-pushed the make-mypy-nox-session branch from 4253245 to c5208d2 Compare July 17, 2025 13:20
@notatallshaw notatallshaw added the skip news Does not need a NEWS file entry (eg: trivial changes) label Jul 17, 2025
@notatallshaw notatallshaw force-pushed the make-mypy-nox-session branch from 752ba0f to 32e4afa Compare July 17, 2025 13:38
@notatallshaw notatallshaw marked this pull request as ready for review July 17, 2025 13:54
@notatallshaw
Copy link
Member Author

This is ready for review, this is almost all type hint fixes, there are almost no logical changes, the purpose of this PR is to fix mypy's type checking, follow up PRs may remove the # type: ignores and casts on a case by case basis.

The exceptions are:

  • A few extra asserts, these would be runtime errors anyway, but the assert catches them early and allow the type checker to know that the type is restricted
  • "Content-Length": stats.st_size, -> "Content-Length": str(stats.st_size),, this is technically a bug fix, headers should only be strings, but requests probably implicitly converts it anyway rather than be strict.

Copy link
Contributor

@pelson pelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hope you don't mind the drive-by review. I think it is great to have the codebase type checked, and it is a big step forwards 👍

@@ -81,6 +81,48 @@ test-common-wheels = [
"pytest-subket >= 0.8.1",
]

docs = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do dependency groups for docs in this PR? Seems like it could be a separate one...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation is to make everything available as a dependency group for type hinting so that all dependencies are exposed in the pyproject.toml and therefore more "noticeable" by future maintainers / tools.

I have no problem splitting the docs group into a separate PR, it just seemed a small enough change to include here (I initially didn't split them out in my previous PR #13475 because I didn't properly test that it worked, but on making this PR I realized that as long as you've installed pip it does).


# Libraries that are not required for pip to run,
# but are used in some optional features.
all-optional = ["keyring"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems very curious to me to be using dependency-groups to specify optional extras...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So historically pip has a couple of dependencies it has where if it is present in the environment it will behave differently, notable keyring and truststore (though this is now full vendored). I don't the history of not providing them as an optional dependency, but probably to do with bootstrapping?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pip has historically had no dependencies at all. That's an explicit and deliberate design choice - pip is not a package that you install "normally"1, so it needs to be usable in a situation where no pre-existing package installer is available. Working out how to install optional dependencies while bootstrapping pip is non-trivial (and not something I'd want to support), while once pip is installed, pip install keyring is just as easy as (if not easier than) doing something like pip install --force-reinstall pip[keyring]. Given that we currently don't have any command for listing available extras, it's not even more discoverable.

So rather than have to deal with the support issues of all of this, we simply note that if the user has keyring installed, pip will make use of it.

I don't object to having an all-optional dependency group for developers to install anything that affects the behaviour of pip (although I question the usefulness of it, from my own perspective as a developer), but I don't think we should try to navigate the complexities of making extras work as an end user interface for pip.

Footnotes

  1. Even if you can sort of do so, by using --python or the pip zipapp to install pip. Those still need special care, and aren't what I'd call entirely normal situations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't object to having an all-optional dependency group for developers to install anything that affects the behaviour of pip (although I question the usefulness of it, from my own perspective as a developer), but I don't think we should try to navigate the complexities of making extras work as an end user interface for pip.

The usefulness here to me is for two reasons:

  1. It highlights these situations directly in the pyproject.toml
  2. It allows install all packages that might require type hints to be specified in a very visible place (not hidden in the noxfile or pre-commit config)

But this is not a sticking point for me, I'm willing to put it in the noxfile directly.

# Vendored libraries with type stubs
from requests.exceptions import InvalidProxyURL
else:
from pip._vendor.requests.exceptions import InvalidProxyURL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistically, I find the moving of the real imports to be uncomfortable. An alternative pattern:

import typing


from pip._vendor.requests.exceptions import InvalidProxyURL

if typing.TYPE_CHECKING:
    from requests.exceptions import InvalidProxyURL as _InvalidProxyURL
    InvalidProxyURL = _InvalidProxyURL

Alternatively, perhaps we should just vendor typing-requests too...

This latter option will be the least error-prone, as I can see that it will be easy for developers to accidentally use the real requests instead of the vendored one (e.g. in exceptions.py we import requests only, since we are inside a TYPE_CHECKING block).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, perhaps we should just vendor typing-requests too...

I guess this is a drive-by review of your drive-by review, but in general I’m -1 on vendoring things that we only require for type checking. That feels like runtime bloat for something that is only needed at development time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a drive-by review of your drive-by review

😂 - and it is a very welcome comment 👍

That feels like runtime bloat for something that is only needed at development time

Yes, fully accept. The proposal in this PR currently adds maintenance bloat, which is arguably a more precious resource for the project, but I fully see your perspective.

Given the fact that this is a developer-time requirement, then I think there is a reasonable middle-ground where you could install & vendor the typestub packages at test time, and not actually commit the vendored type libraries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philfreo I'm pretty sure your proposed pattern doesn't work for type checkers, it throws an error that InvalidProxyURL is of type pip._vendor.requests.exceptions.InvalidProxyURL can not assign type requests.exceptions.InvalidProxyURL.

I went though several rounds of how to provide the type stub hints to the the type checker without creating any run time differences and landed on the current approach as fairly simple, the most readable, and works in all situations. But I am happy to find out there's a simpler or more readable version.

Copy link
Member Author

@notatallshaw notatallshaw Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a high level the options here are:

  1. Leave as is
  2. Configure mypy not to type check against these vendored libraries
  3. Implement some pattern to get the type checker to recognize the type stubs
  4. Vendor type stubs

And in my opinion here are the downsides:

  1. Because we vendor these libraries the type checker assumes it can pull types from these libraries, so we've historically type checking against the wrong types
  2. We won't be doing any type checking for these libraries, configuration will not apply to other type checkers
  3. Adds some additional complexity where we are importing these libraries
  4. Affects run time distribution of pip

I do not want 1 and 4, because I don't want to be running mypy producing incorrect type checking, and I don't want to affect the pip distribution with type hint stuff.

I prefer 3 because I think I've found a pattern that is fairly readable and maintainable, so if 3 is rejected I would look to implement 2.

@@ -80,6 +86,7 @@ def _ensure_api_header(response: Response) -> None:
):
return

assert response.request.method is not None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Runtime behaviour change. I guess this is because the type hints suggest that method can be none. If that is the case, do we need to adapt the code to handle that case... ?

(e.g. _NotAPIContent accepts None as a method)

@@ -80,7 +80,7 @@ def _looks_like_bpo_44860() -> bool:

See <https://bugs.python.org/issue44860>.
"""
from distutils.command.install import INSTALL_SCHEMES
from distutils.command.install import INSTALL_SCHEMES # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest that pyproject.toml is used to declare that distutils imports should be ignored, rather than peppering this into the codebase.

@@ -44,13 +45,13 @@ def request(
headers = {"Content-Type": "text/xml"}
response = self._session.post(
url,
data=request_body,
data=cast(bytes, request_body),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the input type just become bytes then?

@@ -99,7 +99,7 @@ def make_mock_server(**kwargs: Any) -> _MockServer:

mock = Mock()
app = _mock_wsgi_adapter(mock)
server = _make_server("localhost", 0, app=app, **kwargs)
server = cast(_MockServer, _make_server("localhost", 0, app=app, **kwargs))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it not possible to do:

server: _MockServer = _make_server(...)

here? I appreciate that _make_server doens't have type hints yet, but when it gets them, it would be good that we don't trample on them by doing dangerous casts.

"Version": version,
}
)
metadata: dict[str, HeaderValue] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a behaviour change. Technically, I think you could have used typing.Mapping instead of dict, and kept the insensitive dict usage.

I don't have an opinion on whether this is a good change or not, but wonder if it makes sense to make a separate PR to have the merit of it assessed separately?

@@ -71,13 +72,13 @@ def require(self, name: str) -> None:
)
def test_get_distribution(ws: _MockWorkingSet, req_name: str) -> None:
"""Ensure get_distribution() finds all kinds of distributions."""
dist = Environment(ws).get_distribution(req_name)
dist = Environment(cast(WorkingSet, ws)).get_distribution(req_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way to avoid this cast, as well as the import ordering issue, is to define _MockWorkingSet in at TYPE_CHECKING block:

if typing.TYPE_CHECKING:
    from pkg_resources import WorkingSet as _MockWorkingSet
else:
    class _MockWorkingSet(list[mock.Mock]):
        def require(self, name: str) -> None:
            pass

@@ -320,7 +325,7 @@ def _send(sent_req: MockRequest, **kwargs: Any) -> MockResponse:
resp.status_code = 401
resp.connection = connection

auth.handle_401(resp)
auth.handle_401(cast("Response", resp))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid this cast, perhaps adapt https://github.com/pypa/pip/blob/cc1ad642092a4928fd71e894089bfe68bfe21325/tests/lib/requests_mocks.py to define MockRequest in a TYPE_CHECKING block:

if TYPE_CHECKING:
    from requests.models import Response as MockResponse
else:
    class MockResponse:
        request: MockRequest
        connection: MockConnection
        url: str

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the current implementation over "action at a distance" effects that make the type checker think something different is happening than what is actually happening.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pradyunsg Are you referring to @pelson comment or my change?

Currently mypy is checking the wrong types, so something needs to be fixed.

I've not yet looked at @pelson's specific suggestion here.

Comment on lines 60 to 61
if not TYPE_CHECKING:
from distutils.dist import Distribution
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the use of if TYPE_CHECKING anywhere other than the top level of the module to be a code smell -- since it makes it trickier to reason about what is available at runtime vs type checking time, by adding an additional location to remember the "TYPE_CHECKING" is true thing.

Can we avoid this check entirely here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is a use of if not TYPE_CHECKING not if TYPE_CHECKING and is a consequence of where the import itself is happening. The if TYPE_CHECKING is at the top of the module.

The if not TYPE_CHECKING was the only way I found to make both mypy and pyright happy, without it pyright complains Import "distutils.dist" could not be resolved from source, adding a type ignroe comment mypy complains error: Unused "type: ignore" comment. But on this basis I've removed the if not TYPE_CHECKING and ignoring the pyright error as it doesn't affect anyone not running pyright.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip news Does not need a NEWS file entry (eg: trivial changes)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants