Improve download and file management #72

Open · 1 of 4 tasks

LTDakin opened this issue Mar 25, 2025 · 1 comment
LTDakin (Contributor) commented Mar 25, 2025

We use a naive catch-all get-fits utility function that downloads the entire FITS file from the archive. Some functions don't need the whole file, for example getting the source catalog. Instead, we could download just the CAT header from the archive, reducing memory usage and endpoint latency.

Also, we removed temp files to stabilize datalab for beta users, but lost the ability to cache them. It would be nice to reimplement caching with a smarter system that keeps track of the tmp space still available, e.g. an LFU queue for the FITS files.

  • Set up a cleaning routine to remove files after operations
  • Write a util function that downloads specific header data using fsspec in astropy (see the sketch below)
  • Rewrite /source-catalog/ to only download the CAT header
  • Retain downloaded files in tmp space, managed by an LFU queue that removes the least-used file when we near disk capacity
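
A minimal sketch of that util function, assuming astropy >= 5.2 (the first release where fits.open accepts use_fsspec) with fsspec and aiohttp installed; the name get_fits_extension and the default extension name are illustrative, taken from this issue rather than from existing datalab code:

```python
# Sketch only: assumes astropy >= 5.2 plus fsspec/aiohttp for HTTP reads.
from astropy.io import fits

def get_fits_extension(archive_url: str, ext_name: str = "CAT"):
    """Fetch one FITS extension via ranged HTTP reads instead of
    downloading the whole file from the archive."""
    # use_fsspec=True makes astropy issue byte-range requests, so only
    # the headers and the requested HDU's data get pulled over the wire.
    with fits.open(archive_url, use_fsspec=True) as hdul:
        hdu = hdul[ext_name]
        return hdu.header, hdu.data
```

/source-catalog/ could then call this and serialize the returned table data without ever touching the image HDUs.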
LTDakin self-assigned this Mar 31, 2025

jnation3406 (Contributor) commented
I think we should use fsspec for /source-catalog/ since it only wants the smaller CAT and header, but for the data operations that actually use the image data, fsspec won't save us much, since downloading the whole file is similar in size to downloading just the data from it. Also, the last task I added, about retaining files for a short time to help operations like raw_image, wouldn't work if we fsspec everything.

So I think what we want is something that checks whether the file is already local; if not, it either downloads the file locally or goes through fsspec, depending on whether the image data is needed. And when it downloads locally, we leave the file there but add a service that deletes temp files older than 1 hour (easy way), or we keep track of downloaded files and their sizes in Redis and delete files as we approach the disk limit, basically implementing an LFU queue (hard way).
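
A rough sketch of the hard way, purely illustrative: open_fits, TMP_DIR, DISK_LIMIT_BYTES, needs_image_data, and the Redis key layout are all assumptions, not existing datalab code, and the LFU behaviour is approximated with a Redis sorted set of per-file access counts:

```python
# Illustrative sketch: assumes redis-py and the get_fits_extension idea
# sketched above; every name, path, and limit here is hypothetical.
import os
import urllib.request

import redis
from astropy.io import fits

TMP_DIR = "/tmp/datalab"
DISK_LIMIT_BYTES = 10 * 1024**3  # start evicting before the real limit
r = redis.Redis()

def open_fits(archive_url: str, needs_image_data: bool):
    """Check for a local copy first; otherwise download the file or fall
    back to fsspec, depending on whether the image data is needed."""
    local_path = os.path.join(TMP_DIR, os.path.basename(archive_url))
    if os.path.exists(local_path):
        _record_use(local_path)
        return fits.open(local_path)
    if not needs_image_data:
        # Catalog/header-only callers: ranged reads, nothing kept on disk.
        return fits.open(archive_url, use_fsspec=True)
    _evict_until_room()
    os.makedirs(TMP_DIR, exist_ok=True)
    urllib.request.urlretrieve(archive_url, local_path)
    _record_use(local_path)
    return fits.open(local_path)

def _record_use(path: str) -> None:
    # ZINCRBY keeps a per-file access frequency; HSET remembers sizes.
    r.zincrby("fits:freq", 1, path)
    r.hset("fits:size", path, os.path.getsize(path))

def _evict_until_room() -> None:
    # Drop the least-frequently-used files while we are near the limit.
    while sum(int(v) for v in r.hvals("fits:size")) >= DISK_LIMIT_BYTES:
        victims = r.zrange("fits:freq", 0, 0)  # lowest access count first
        if not victims:
            break
        path = victims[0].decode()
        if os.path.exists(path):
            os.remove(path)
        r.zrem("fits:freq", path)
        r.hdel("fits:size", path)
```

The easy way is simpler still: a periodic task that unlinks anything in TMP_DIR with an mtime older than an hour, at the cost of evicting files that are still hot.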
