-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mount pip_cache_path
in Docker container
#3556
Conversation
This allow us to have permanent pip cache between different builds / projects to work faster locally and under slow internet connections.
I actually dont like having pip cache in build. |
@safwanrahman we were discussing this in the chat to "fix the problem" of working under intermitent/slow internet connections. This would be only for development and not a setting for production actually. I think the examples you propose (the "why is not a good idea") are valid. So, maybe we should limit this line https://github.com/rtfd/readthedocs.org/pull/3299/files#diff-d61f6aff9736141f30b581471d5d55ceR444 only when |
@humitos Yeah. I think putting it in development mode is totally a great idea! |
@@ -449,7 +449,7 @@ def checkout_path(self, version=LATEST): | |||
@property | |||
def pip_cache_path(self): | |||
"""Path to pip cache.""" | |||
if getattr(settings, 'GLOBAL_PIP_CACHE', False): | |||
if getattr(settings, 'GLOBAL_PIP_CACHE', False) and settings.DEBUG: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a default GLOBAL_PIP_CACHE
in dev.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. By default this feature is deactivated. It could be something like:
GLOBAL_PIP_CACHE = os.path.join(SITE_ROOT, '.cache', 'pip')
what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say leave deactivated so our development environments are close to production use.
@@ -728,6 +728,10 @@ def create_container(self): | |||
'bind': self.project.doc_path, | |||
'mode': 'rw' | |||
}, | |||
self.project.pip_cache_path: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slightly redundant, as the project pip cache path is under doc_path
already.
Also, normally to share a pip cache, we'd set CACHE_DIR
env variable, or --cache-dir
argument on pip. To use underlying storage like this, but then repeated remount this path to different locations through docker might cause problems if pip stores and metadata that includes the full path. I don't know enough about pip internals to answer this.
It seems like the most correct option would be to explicitly set CACHE_DIR
on the pip commands if using a global cache, and to mount the global cache conditionally on the binds
argument here if GLOBAL_PIP_CACHE
is True
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slightly redundant, as the project pip cache path is under doc_path already.
Mmm... Yes. This is redundant when the GLOBAL_PIP_CACHE
is not set. So, maybe we should add this path only when it's set (as you said, conditionally).
Also, normally to share a pip cache, we'd set CACHE_DIR env variable, or --cache-dir argument on pip
We are using the --cache-dir
in the command at python_environments.py
file, and that dir points to the project.pip_cache_path
I prefer to use the argument option instead of the env variable (since it's less explicit and more complicated to debug later)
I don't know enough about pip internals to answer this
As far as I know, pip doesn't save metadata, but creates a hash of the package downloaded and use it as filename --but use the real one for wheels:
├── http
│ ├── 0
│ │ ├── 0
│ │ │ ├── 3
│ │ │ │ └── f
│ │ │ │ └── 8
│ │ │ │ └── 003f82596103cb4b5d71b69aeb8119f406b1bc78110a190848c4fb24
│ │ │ ├── 6
│ │ │ │ └── 6
│ │ │ │ └── b
│ │ │ │ └── 0066b726b642ce2657f32a7c7b9f63ae698a2a16695cda32f18eb806
└── wheels
├── 02
│ └── 2e
│ └── 0d
│ └── 30ef9919ce3a07990f191c164d07b083761535fb721a1af3cf
│ ├── ip_associations_python_novaclient_ext-0.2-cp27-none-any.whl
│ └── ip_associations_python_novaclient_ext-0.2-cp36-none-any.whl
├── 06
│ └── 3d
│ └── 74
│ └── f64c3ce3266df2598031818ecc86bc25c8ccede0723973f119
│ └── anyjson-0.3.3-cp27-none-any.whl
I'd say that mounting conditionally under the binds depending on this setting is the missing piece here and the rest is safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds good. I don't think there is any metadata dependent on the path either, so this is most likely safe I'm sure.
Refactor, by creating a new method, where the binds are created to be easier to read.
I just refactor the creation of bindings by creating a new method in that class which will mount the global cache conditionally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
r+
@ericholscher Can I merge this Pull request? |
This allow us to have permanent pip cache between different builds /
projects to work faster locally and under slow internet connections.
Closes #3553