We use a naive catch-all get-fits utility function that downloads the entire FITS file from the archive. Some functions don't need the whole file, for example getting the source catalog. Those should instead download just the CAT header from the archive, which would reduce memory usage and endpoint latency.
We also removed temp files to stabilize datalab for beta users, but lost the ability to cache them. It would be nice to reimplement caching with a smarter system that keeps track of the tmp space left, such as an LFU queue for the FITS files.
Set up a cleaning routine to remove files after operations
Write a util function that downloads specific header data using fsspec in astropy
Rewrite /source-catalog/ to only download the CAT header
Retain downloaded files in tmp space, managed by an LFU queue that removes the least used file when we get near disk capacity
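To make the last task concrete, here is a rough sketch of LFU eviction for files in tmp space. It is only an illustration: the class name `LFUFileCache`, its methods, and the `max_bytes` budget are hypothetical, and the bookkeeping lives in plain in-memory dicts rather than anything shared between workers.

```python
from pathlib import Path


class LFUFileCache:
    """Track cached files and evict the least frequently used ones first."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes  # hypothetical budget for the tmp space
        self._uses = {}   # path -> number of cache hits
        self._sizes = {}  # path -> size in bytes

    @property
    def total_bytes(self):
        return sum(self._sizes.values())

    def add(self, path):
        """Register a newly downloaded file, evicting others if over budget."""
        path = Path(path)
        self._sizes[path] = path.stat().st_size
        self._uses.setdefault(path, 0)
        self._evict(keep=path)

    def touch(self, path):
        """Record a cache hit; return False if the file is not cached."""
        path = Path(path)
        if path not in self._uses or not path.exists():
            self._forget(path)
            return False
        self._uses[path] += 1
        return True

    def _forget(self, path):
        self._uses.pop(path, None)
        self._sizes.pop(path, None)

    def _evict(self, keep):
        # Delete least-used files until we fit, never the file just added.
        while self.total_bytes > self.max_bytes:
            candidates = [p for p in self._uses if p != keep]
            if not candidates:
                break
            victim = min(candidates, key=lambda p: self._uses[p])
            victim.unlink(missing_ok=True)
            self._forget(victim)
```

The download path would call `add()` after saving a file and `touch()` on every cache hit. A real version would probably compare against free disk space rather than a fixed byte budget.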
I think we should use fsspec for /source-catalog/, since it only wants the smaller CAT and header. For the operations that use the image data, fsspec won't save us much, because downloading just the data from the file is about the same size as downloading the whole file. Also, the last task I added, about retaining files for a short time to help things like raw_image, wouldn't work if we used fsspec for everything.
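For reference, the fsspec path for /source-catalog/ could look roughly like this. It is a sketch, not the datalab implementation: `read_catalog` is a hypothetical helper, it assumes the catalog lives in an extension named `CAT`, and it assumes astropy is recent enough to support `use_fsspec` and that fsspec (plus aiohttp for HTTP URLs) is installed.

```python
def read_catalog(url, ext="CAT"):
    """Return (header, data) for one named extension of a remote FITS file."""
    # Imported lazily so this module still loads where astropy is absent.
    from astropy.io import fits

    # use_fsspec=True lets astropy request byte ranges on demand instead of
    # downloading the whole file before opening it.
    with fits.open(url, use_fsspec=True) as hdul:
        hdu = hdul[ext]  # assumption: the extension is named "CAT"
        header = hdu.header.copy()
        # Copy while the file is still open so the data is actually read.
        data = hdu.data.copy() if hdu.data is not None else None
    return header, data
```

Looking up an extension by name still reads the headers of the HDUs before it, but not their data, which is where most of the bytes are.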
So I think what we want is something that checks whether the file is local. If not, it either downloads the file locally or uses fsspec, depending on whether it needs the image data. When it downloads locally, we leave the file there and either add a service that deletes temp files older than 1 hour (easy way), or keep track of downloaded files and their sizes in redis and delete files as we get closer to the disk limit, basically implementing an LFU queue (hard way).
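A minimal sketch of the check and the easy-way cleanup, assuming temp files sit directly in one directory and that modification time is a good enough proxy for age. The names `choose_fetch` and `remove_stale_files` are made up for illustration.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 60 * 60  # the 1 hour limit from the easy way


def choose_fetch(local_path, needs_image_data):
    """Return 'local', 'download', or 'partial' for a requested file."""
    if Path(local_path).exists():
        return "local"
    # Full download only when the image data is needed; otherwise fsspec.
    return "download" if needs_image_data else "partial"


def remove_stale_files(tmp_dir, max_age=MAX_AGE_SECONDS, now=None):
    """Delete files in tmp_dir last modified more than max_age seconds ago."""
    now = time.time() if now is None else now
    removed = []
    for path in Path(tmp_dir).iterdir():
        if path.is_file() and now - path.stat().st_mtime > max_age:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

`remove_stale_files` could run from a periodic task. It does not look at disk usage at all, which is the gap the LFU approach would close.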