Use conda packages cache as we do with pip #3261

Closed
humitos opened this issue Nov 14, 2017 · 7 comments
Labels
Improvement Minor improvement to code Needed: design decision A core team decision is required

Comments

@humitos
Member

humitos commented Nov 14, 2017

At the moment our conda command uses these settings,

docs@cec884de0fdf:~$ conda info

SNIP

       root environment : /home/docs/.conda  (writable)
    default environment : /home/docs/.conda
       envs directories : /home/docs/.conda/envs
          package cache : /home/docs/.conda/pkgs

SNIP

For pip we use the --cache-dir option, so all the packages are saved in the project's directory and we avoid re-downloading them.

For conda we do nothing equivalent. I found that we can use the CONDA_PKGS_DIRS environment variable for this (I haven't found a command-line option yet), so:

docs@cec884de0fdf:~$ CONDA_PKGS_DIRS=/tmp conda info

SNIP

       root environment : /home/docs/.conda  (writable)
    default environment : /home/docs/.conda
       envs directories : /home/docs/.conda/envs
          package cache : /tmp

SNIP

We could use this env variable to point to something like ./user_builds/<project-slug>/.cache/conda
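
As a rough sketch of that idea (not the actual Read the Docs implementation), the build code could prepare the environment for the conda subprocess like this. The helper name `conda_env_for_project`, the `builds_root` parameter, and the `my-project` slug are hypothetical and only serve the illustration:

```python
import os
from pathlib import Path


def conda_env_for_project(project_slug, builds_root="user_builds", base_env=None):
    """Return a copy of the environment with a per-project conda package cache.

    CONDA_PKGS_DIRS is the variable shown in the `conda info` output above;
    the directory layout mirrors the path suggested in this issue.
    """
    # Copy so the caller's (or the process's) environment is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    cache_dir = Path(builds_root) / project_slug / ".cache" / "conda"
    env["CONDA_PKGS_DIRS"] = str(cache_dir)
    return env


# Hypothetical project slug, with a minimal base environment for the demo.
env = conda_env_for_project("my-project", base_env={"PATH": "/usr/bin"})
print(env["CONDA_PKGS_DIRS"])
```

The returned dictionary could then be passed as `env=` to `subprocess.run` when invoking `conda env create`, so only that command sees the overridden cache location.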

@humitos humitos added Improvement Minor improvement to code Needed: design decision A core team decision is required Needed: more information A reply from issue author is required labels Nov 14, 2017
@agjohnson agjohnson added this to the New build features milestone Nov 16, 2017
@agjohnson agjohnson removed the Needed: more information A reply from issue author is required label Mar 30, 2018
@shoyer

shoyer commented Jun 1, 2018

My project's builds on RTD have been timing out recently. The biggest contributor (likely ~300 seconds) appears to be time spent downloading packages as part of conda env create.

So if we could set up caching, that would make a big difference for us.

@jorisvandenbossche

I want to second this. I have been having similar problems recently (#4071), and it looks like the download + install of the conda packages takes between 500 and 800 seconds (once even up to 1500 seconds), of which I suspect the download is a significant part.
Building the docs once everything is installed does not take that long, but we have some quite heavy dependencies (e.g. the full gdal and deps stack). So caching the conda packages would also help a lot.

@stsewd
Member

stsewd commented Jul 23, 2018

Just to clarify, if this is implemented you will still have the timeout problem in your first recent builds. Here is how:

  1. The build is triggered
  2. You get the timeout (but RTD caches the downloaded conda packages)
  3. The second time, there is probably no timeout
  4. After some time, RTD clears the cache to save disk space
  5. Back to step 1
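
The cycle above can be modelled with a toy simulation. This is only an illustrative sketch: the `PackageCache` class is hypothetical, the 300-second download figure is taken from the estimate earlier in this thread, and the 200-second install figure is made up.

```python
class PackageCache:
    """Toy model of a per-project package cache that is cleared periodically."""

    def __init__(self, download_seconds, install_seconds):
        self.download_seconds = download_seconds
        self.install_seconds = install_seconds
        self.populated = False

    def build(self):
        """Return the simulated dependency setup time for one build."""
        elapsed = self.install_seconds
        if not self.populated:
            # Cold cache: packages must be downloaded, then they stay cached
            # (even if the build itself later hits the timeout).
            elapsed += self.download_seconds
            self.populated = True
        return elapsed

    def clear(self):
        """Simulate the periodic cleanup that frees disk space."""
        self.populated = False


cache = PackageCache(download_seconds=300, install_seconds=200)
times = [cache.build(), cache.build()]  # cold build, then warm build
cache.clear()                           # cleanup happens
times.append(cache.build())             # cold again
print(times)  # [500, 200, 500]
```

Only the builds that run while the cache is warm skip the download cost, which is why the first build after each cleanup can still time out.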

@humitos
Member Author

humitos commented Jul 23, 2018

I once thought that a global cache for conda/pip would be great, but then I realized that it has two main problems:

  • the package cache needs to be replicated on each builder server
  • if a malicious package (for whatever reason) is saved in the cache, everyone else will end up using it

Then we talked about devpi (which is great!) and opened an issue about using it in development on intermittent internet connections: #3553

That said, devpi could also be used to solve this shared-cache problem because:

  • we would need to store only the packages that were used at least once
  • we could use management commands to keep it clean
  • we could share all those packages within our intranet without duplicating the storage (by pointing the index URL at our devpi server)
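
To illustrate the last point, here is a minimal sketch of building a pip command that goes through such a server. The host name `devpi.example.internal` is a hypothetical placeholder, the URL path assumes devpi's default `root/pypi` mirror index, and `pip_install_command` is an illustrative helper, not existing code:

```python
def pip_install_command(requirements_file, index_url, cache_dir):
    """Build a pip command that resolves packages through a caching index."""
    return [
        "python", "-m", "pip", "install",
        "--index-url", index_url,   # point pip at the devpi server instead of PyPI
        "--cache-dir", cache_dir,   # keep the per-project pip cache as well
        "-r", requirements_file,
    ]


# Hypothetical intranet devpi server; the host is an assumption.
DEVPI_INDEX = "http://devpi.example.internal:3141/root/pypi/+simple/"
cmd = pip_install_command("requirements.txt", DEVPI_INDEX, ".cache/pip")
print(" ".join(cmd))
```

With this arrangement each package would be fetched from upstream once by the devpi server and then served to every builder from the intranet.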

Anyway, these are just ideas and unfortunately, they are not on our current roadmap.

@shoyer

shoyer commented Jul 25, 2018

Just to clarify, if this is implemented you will still have the timeout problem in your first recent builds.

This is true -- but it would still be better than our current state of affairs!

@humitos
Member Author

humitos commented Jul 31, 2018

Unfortunately, I didn't find a proxy for conda channels similar to devpi for PyPI.

I found conda-mirror (https://github.com/Valassis-Digital-Media/conda-mirror), which creates a mirror of an upstream channel. It's not exactly what we need, but it seems useful to link it here. There are possible solutions based on a custom mirror, but they are too complex.

@stsewd
Member

stsewd commented Apr 29, 2020

We no longer cache things, and we have bigger builders for projects using conda.

@stsewd stsewd closed this as completed Apr 29, 2020