Addressing a Problem?
When opening a large catalog (e.g. regional climate models at hourly frequency), using a ProjectCatalog can be prohibitively slow, because check_valid needs to touch every file in the catalog.
Potential Solution
If check_valid were an option for Project Catalogs, this would be fixed.
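To illustrate why making the check optional would help, here is a minimal sketch. It is not xscen's actual implementation: `load_rows` and the `path` key are hypothetical names, and it assumes the validity check amounts to one filesystem lookup per catalog row.

```python
import os
from typing import Iterable


def load_rows(rows: Iterable[dict], check_valid: bool = False) -> list[dict]:
    """Return catalog rows, optionally dropping rows whose file is missing.

    Hypothetical helper for illustration only; not part of xscen.
    """
    rows = list(rows)
    if not check_valid:
        # No filesystem access: the cost depends only on the catalog size.
        return rows
    # One filesystem call per row; this is the part that scales with the
    # number of files and can be slow on network storage.
    return [row for row in rows if os.path.exists(row["path"])]
```

With `check_valid=False`, loading cost depends only on the size of the catalog table rather than on the number of files it points to.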
Additional context
Trying to open a catalog with 7128 rows of ~500 MB netCDF files (~2 GB uncompressed) took longer than 5 minutes, which is unnecessarily slow. The catalog file itself is only 2.8 MB.
Contribution
I would be willing/able to open a Pull Request to contribute this feature.
I agree with the PR and the idea, but I'm not sure I understand why you are putting the raw MRCC data into a ProjectCatalog?
I think our design idea was that you first search in a DataCatalog and then only put datasets you have created into the ProjectCatalog. Is there another issue that made you generate 7128 netCDFs within your project, or made it necessary to use a ProjectCatalog that includes raw data?
I was subsetting the DataCatalog and saving the result as a new catalog using ProjectCatalog, since opening and searching the MRCC5 catalog can take a while. Maybe there's a better way to do that?