One large package or many small packages for different datasets? #13

Open
natgeo-wong opened this issue Dec 30, 2020 · 8 comments

Comments

@natgeo-wong
Member

natgeo-wong commented Dec 30, 2020

Need a second opinion ...

Should there be many (around 3-4 or more) small packages for dealing with reanalysis datasets, one for each major dataset (one for ERA5, one for NCEP, one for MERRA, one for ERA-Interim, etc.), or

1 large package that allows one to download and process all the datasets together? The large package is going to be tricky because NCEP's data is arranged quite differently (especially the pressure levels) compared to the ECMWF reanalysis datasets.

The same would go for the different satellite datasets. I'm thinking about one package for GPM, one for TRMM, and then moving on to MODIS and other satellites.

@natgeo-wong changed the title from "Reanalysis Packages" to "One large package or many small packages for different datasets?" on Dec 30, 2020
@Datseris
Member

My vote: 1 large package.

@Alexander-Barth
Member

I would also tend to use 1 large package, unless this would lead to a package with a very large number of dependencies (or tricky-to-install dependencies, like a Python module).

@juliohm
Member

juliohm commented Jan 6, 2021

Another vote for 1 large package.

I'd also appreciate it if you could clarify how your idea relates to CDSAPI.jl, for example, which is a package that already contains a bunch of climate datasets.

@Balinus
Member

Balinus commented Jan 22, 2021

I'd also prefer 1 package, as this would help people discover unknown datasets (unknown to the user, I mean).

@natgeo-wong
Member Author

Alright! Sorry for the slow response, trying to catch up with everything and it was holiday season!

Will slowly be working on this in the coming months. Need to pick up where I left off.

@gaelforget
Member

Just linking #14 to this related discussion thread

@juliohm
Member

juliohm commented Aug 4, 2021

My opinion about this issue changed over time. I think it depends on the scope of the "large package". If there is a common theme like a package to load "raster data", then it may make sense to group data sources. In any case, it is reasonable to start with specific packages that do their job well and then consider later if it makes sense to merge them into a common API.

@natgeo-wong
Member Author

Yeah, that was my thought. I'm thinking of doing several different satellite dataset packages (mostly those that I might use), and then maybe eventually combining them in ClimateSatellite.jl by calling them and reexporting them.
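
As a rough illustration of the "call them and reexport them" idea, here is a minimal, self-contained sketch. Everything in it is hypothetical: `GPMData`, `TRMMData`, `ClimateSatelliteSketch`, and the two functions are placeholder names standing in for separate packages, written here as local modules so the snippet runs on its own. Nothing below reflects an actual API.

```julia
# Hypothetical stand-in for a small, dataset-specific package (e.g. one for GPM).
module GPMData
export gpm_describe
# Placeholder function; a real package would download or read data here.
gpm_describe(date) = "GPM placeholder for $date"
end

# Hypothetical stand-in for a second small package (e.g. one for TRMM).
module TRMMData
export trmm_describe
trmm_describe(date) = "TRMM placeholder for $date"
end

# Hypothetical umbrella module: it loads the small modules and exports
# their public names again, so users only need to load the umbrella.
module ClimateSatelliteSketch
using ..GPMData, ..TRMMData
export gpm_describe, trmm_describe
end

using .ClimateSatelliteSketch

# Both functions are reachable through the umbrella module alone.
println(gpm_describe("2020-12-30"))
println(trmm_describe("2020-12-30"))
```

With real, separately registered packages, the Reexport.jl package offers `@reexport using SomePackage` as a shorthand for the manual `using` plus `export` lines, so the umbrella would not need to list every name by hand.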

6 participants