Update data-pipeline to pull data from archived location #13
Just as a note, they cannot be served from Zenodo.
ah ok! the download seemed to work without a login or anything, but maybe it's against the ToS?
No, I mean we can't just point the servers to the Zenodo deposit. This is just cold storage until we place them in the right location.
Yes, that seems right! Sorry to be confusing.
I'm sure we could do this. The data pipeline fetches specific datasets one at a time rather than grabbing one or two big files. (CalEnviroScreen is a bad example because we also have that archived separately, so we could just point to that...) So to do what I think you're suggesting @titaniumbones, I think we'd either have to rearrange the Zenodo repository into many different smaller .zip files, or reconfigure the data-pipeline code to grab one or two big files. The latter might be a lot of work? Instead, we might just update the documentation here to say: download the five zip files from Zenodo (maybe not…). I hope this makes sense. Just going off my understanding of the issue and my read of the documentation.
This sounds good, at least for now. And maybe somewhere add a little bash script that just does all the steps (it's not harder than writing them in the markdown!). Since we have the data, I'd mark this as a "later" task.
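A minimal sketch of what such a script could look like. The Zenodo record ID and archive names below are placeholders, not the real deposit's values, and would need to be substituted; the actual download and unzip commands are left commented out so the script is a dry run by default:

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the Zenodo download URL for each archive.
# ZENODO_RECORD and ARCHIVES are PLACEHOLDERS -- replace with the
# real record ID and the five zip file names from the deposit.
set -euo pipefail

ZENODO_RECORD="${ZENODO_RECORD:-1234567}"   # placeholder record ID
ARCHIVES=(part1.zip part2.zip)              # placeholder file names
DEST="${DEST:-./data}"

mkdir -p "$DEST"
for f in "${ARCHIVES[@]}"; do
  url="https://zenodo.org/records/${ZENODO_RECORD}/files/${f}?download=1"
  echo "would fetch: $url"
  # curl -L -o "$DEST/$f" "$url"    # uncomment to actually download
  # unzip -o "$DEST/$f" -d "$DEST"  # uncomment to unpack in place
done
```

Running it with `ZENODO_RECORD` and `ARCHIVES` set to the real values, and the two commented lines enabled, would reproduce the manual steps in one go.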
The current version of data-pipeline pulls data sources from US servers (on AWS). @willf has captured these and uploaded them to Zenodo. We should…