diff --git a/scripts/ellipse/modeling.py b/scripts/ellipse/modeling.py
index 3f1fc8b0..42ff08bf 100644
--- a/scripts/ellipse/modeling.py
+++ b/scripts/ellipse/modeling.py
@@ -43,7 +43,7 @@
 **Analysis:** Setting up the AnalysisEllipse object for the model fit.
 **Run Times:** Estimating the computational cost of the ellipse model fit.
 **Model-Fit:** Running the non-linear search to fit the ellipse to data.
-**Output Folder:** Description of the results written to hard-disk during and after the fit.
+**Output Folder Layout:** Description of the structure of the `output` folder where results are written.
 **Result:** Inspecting the result object and maximum likelihood ellipse parameters.
 **Multiple Ellipses:** Fitting many ellipses of increasing size to trace the full galaxy.
 **Final Fit:** Combining all ellipses into a single final fit.
@@ -343,32 +343,39 @@
 result = search.fit(model=model, analysis=analysis)
 
 """
-__Output Folder__
-
-Now this is running you should checkout the `autogalaxy_workspace/output` folder. This is where the results of the
-search are written to hard-disk (in the `start_here` folder), where all outputs are human readable (e.g. as .json,
-.csv or text files).
-
-As the fit progresses, results are written to the `output` folder on the fly using the highest likelihood model found
-by the non-linear search so far. This means you can inspect the results of the model-fit as it runs, without having to
-wait for the non-linear search to terminate.
-
-The `output` folder includes:
-
- - `model.info`: Summarizes the model, its parameters and their priors discussed in the next tutorial.
-
- - `model.results`: Summarizes the highest likelihood model inferred so far including errors.
-
- - `images`: Visualization of the highest likelihood model-fit to the dataset, (e.g. a fit subplot showing the
-   galaxies, model data and residuals).
-
- - `files`: A folder containing .fits files of the dataset, the model as a human-readable .json file,
-   a `.csv` table of every non-linear search sample and other files containing information about the model-fit.
-
- - search.summary: A file providing summary statistics on the performance of the non-linear search.
-
- - `search_internal`: Internal files of the non-linear search (in this case Nautilus) used for resuming the fit and
-   visualizing the search.
+__Output Folder Layout__
+
+Now the fit is running you should checkout the `autogalaxy_workspace/output` folder. This is where results are
+written to hard-disk in human-readable formats — `.json`, `.csv`, `.fits`, `.png` and plain text.
+
+As the fit progresses, results are written on the fly using the highest likelihood model found by the
+non-linear search so far. This means you can inspect the model-fit as it runs, without waiting for the
+non-linear search to terminate.
+
+Each completed fit lives at a path like::
+
+    output/imaging//modeling//
+        files/                       <- JSON + CSV: loadable Python objects
+            ellipse.json             <- max log likelihood Ellipse(s)
+            model.json               <- fitted af.Collection model
+            samples.csv              <- full Nautilus samples
+            samples_summary.json     <- max log likelihood parameter values + errors
+            samples_info.json        <- metadata about the samples
+            search.json              <- non-linear search configuration
+            settings.json            <- search settings
+            covariance.csv           <- parameter covariance matrix
+        image/                       <- FITS + PNG: ellipse-fit products
+            dataset.fits             <- data and noise-map
+            fit.fits                 <- ellipse traces over the image, residuals along each ellipse
+            dataset.png, fit.png     <- visualisations
+        model.info                   <- human-readable model summary
+        model.results                <- human-readable fit summary
+        search.summary               <- search run summary
+        search_internal/             <- internal files used to resume / visualise the search
+        metadata                     <- run metadata
+
+The `` is a 32-character identifier derived from the model, search and dataset, so re-running the
+same configuration resumes from the existing fit automatically.
 
 __Result__
 
diff --git a/scripts/guides/results/start_here.py b/scripts/guides/results/start_here.py
index f2d3fd26..dfe485aa 100644
--- a/scripts/guides/results/start_here.py
+++ b/scripts/guides/results/start_here.py
@@ -17,10 +17,11 @@
 `Model`, ...) via Python generators, so you can iterate over hundreds of fits without holding them all in memory at
 once. This is the right tool when you want to analyse a sample of galaxies together.
 
-Both sections appear below in that order. The aggregator section first runs a quick model-fit so a results
-directory exists, then walks through the deeper API (samples, fits, galaxies, units, pixelization). Where each
-aggregator section reaches a result that the simple-loading API also exposes, this is noted — both routes return
-the same PyAutoFit / PyAutoGalaxy objects, just sourced from disk in different ways.
+Both sections appear below in that order. To keep them runnable from a fresh checkout, this script first
+performs a quick model-fit so a results directory exists for both halves to read from. The aggregator
+section then walks through the deeper API (samples, fits, galaxies, units, pixelization). Where each
+aggregator section reaches a result that the simple-loading API also exposes, this is noted — both routes
+return the same PyAutoFit / PyAutoGalaxy objects, just sourced from disk in different ways.
 
 __Output Folder Layout__
 
@@ -50,6 +51,9 @@
 
 __Contents__
 
+**Model Fit:** Run a quick fit once so both halves of this guide have a real result on disk to read from.
+**Info:** Print the result in a readable format.
+
 **Simple Loading (one fit at a time):**
 
 **Galaxies:** Load the maximum log likelihood `Galaxies` from `galaxies.json`.
@@ -59,8 +63,6 @@
 
 **Aggregator (many fits, generator-based):**
 
-**Model Fit:** Run a quick fit so a results directory exists for the aggregator examples.
-**Info:** Print the result in a readable format.
 **Loading From Hard-disk:** Use `Aggregator.from_directory(...)` to scrape `output/`.
 **Generators:** How Python generators give the aggregator its memory efficiency.
 **Database File:** Loading from a `.sqlite` database for very large samples.
@@ -89,28 +91,98 @@
 
 """
 ==============================================================================
- SIMPLE LOADING
+ MODEL FIT
 ==============================================================================
 
-The first half of this guide loads a single fit directly from `output/`. This is the fastest way to inspect one
-fit — every file under `files/` and `image/` is a Python object away.
+Both halves of this guide — simple loading and the aggregator — need a real fit on disk to read from. We
+perform that fit once here. The rest of the file then uses the resulting `search` object (via
+`search.paths.output_path`) and the in-memory `result`.
 
-__Result Path__
+The model and search match `aggregator/samples.py` and `_quick_fit.py`, so re-running this tutorial resumes
+from the cached fit instead of redoing the search.
+
+__Quick Fit Auto-Trigger__
+
+If a previous fit has not been run yet, the shared helper ``_quick_fit.py`` is invoked to produce one.
+The helper writes a capped Nautilus fit to ``output/results_folder/`` so this tutorial has results to
+work with. When that folder already exists the helper exits immediately, so re-running this tutorial is
+cheap.
+"""
+results_path = Path("output") / "results_folder"
+if not results_path.exists():
+    import subprocess
+    import sys
+
+    subprocess.run(
+        [sys.executable, "scripts/guides/results/_quick_fit.py"],
+        check=True,
+    )
 
-Set the path to the folder of the fit you want to load. Replace the values below with the path to your own fit.
+"""
+__Dataset and Model__
 
-The `` placeholder is a 32-character identifier specific to the fit; each load below is guarded with
-`.exists()` so this script runs cleanly even before you replace it.
+Set up the same dataset and model as `_quick_fit.py`, then call `search.fit(...)`. Because the search has
+already run, this resumes from the checkpoint and returns the in-memory `Result` object instantly.
 """
-result_path = (
-    Path("output")
-    / "imaging"
-    / "features"
-    / "simple"
-    / "start_here"
-    / ""  # The 32-character identifier for the specific fit.
+dataset_name = "simple"
+dataset_path = Path("dataset") / "imaging" / dataset_name
+
+dataset = ag.Imaging.from_fits(
+    data_path=dataset_path / "data.fits",
+    psf_path=dataset_path / "psf.fits",
+    noise_map_path=dataset_path / "noise_map.fits",
+    pixel_scales=0.1,
+)
+
+mask = ag.Mask2D.circular(
+    shape_native=dataset.shape_native, pixel_scales=dataset.pixel_scales, radius=3.0
+)
+
+dataset = dataset.apply_mask(mask=mask)
+
+bulge = af.Model(ag.lp_linear.Sersic)
+disk = af.Model(ag.lp_linear.Exponential)
+bulge.centre = disk.centre
+
+galaxy = af.Model(ag.Galaxy, redshift=0.5, bulge=bulge, disk=disk)
+
+model = af.Collection(galaxies=af.Collection(galaxy=galaxy))
+
+search = af.Nautilus(
+    path_prefix=Path("results_folder"),
+    name="results",
+    unique_tag=dataset_name,
+    n_batch=50,
+    n_live=100,
+    n_like_max=300,
 )
 
+analysis = ag.AnalysisImaging(dataset=dataset, use_jax=True)
+
+result = search.fit(model=model, analysis=analysis)
+
+"""
+__Info__
+
+As seen throughout the workspace, the `info` attribute shows the result in a readable format.
+"""
+print(result.info)
+
+"""
+==============================================================================
+ SIMPLE LOADING
+==============================================================================
+
+The first half of this guide loads a single fit directly from `output/`. This is the fastest way to inspect
+one fit — every file under `files/` and `image/` is a Python object away.
+
+__Result Path__
+
+Point at the fit's output folder. Because the fit ran above, ``search.paths.output_path`` already points at
+the right location — there is no need to construct the path manually or know the unique hash.
+"""
+result_path = search.paths.output_path
+
 files_path = result_path / "files"
 image_path = result_path / "image"
 
@@ -144,7 +216,7 @@
 """
 if (files_path / "samples.csv").exists() and (files_path / "model.json").exists():
     model = from_json(file_path=files_path / "model.json")
-    samples = af.SamplesNest.from_csv(file_path=files_path / "samples.csv", model=model)
+    samples = af.SamplesNest.from_table(filename=files_path / "samples.csv", model=model)
     print(samples.max_log_likelihood())
 
 """
@@ -174,80 +246,27 @@
  AGGREGATOR
 ==============================================================================
 
-Everything above loaded one fit at a time, by pointing at a specific output directory. To analyse many fits — for
-example a sample of hundreds of galaxies — use the **aggregator** instead. It scrapes a directory of fits and yields
-the same objects (`Galaxies`, `Samples`, `Model`, ...) via Python generators, keeping memory use low.
-
-The remainder of this file is the aggregator entry point. After reading it, the sibling files in `aggregator/`
-provide more detailed examples for analysing different aspects of the results.
-
-To make the examples below runnable from a fresh checkout, we first perform a quick model-fit so the aggregator
-has a results directory to scrape. Anything `from_json(...)` reaches in the simple-loading section above can also
-be reached through the aggregator below — both APIs return the same Python objects.
-
-If you are not familiar with the modeling API, see `autogalaxy_workspace/*/examples/modeling/` first.
-"""
-dataset_name = "simple"
-dataset_path = Path("dataset") / "imaging" / dataset_name
-
-"""
-__Dataset Auto-Simulation__
-
-If the dataset does not already exist on your system, it will be created by running the corresponding
-simulator script. This ensures that all example scripts can be run without manually simulating data first.
-"""
-if not dataset_path.exists():
-    import subprocess
-    import sys
-
-    subprocess.run(
-        [sys.executable, "scripts/imaging/simulator.py"],
-        check=True,
-    )
-
-
-dataset = ag.Imaging.from_fits(
-    data_path=dataset_path / "data.fits",
-    psf_path=dataset_path / "psf.fits",
-    noise_map_path=dataset_path / "noise_map.fits",
-    pixel_scales=0.1,
-)
-
-mask = ag.Mask2D.circular(
-    shape_native=dataset.shape_native, pixel_scales=dataset.pixel_scales, radius=3.0
-)
-
-dataset = dataset.apply_mask(mask=mask)
+Simple loading is the right tool when you have a single fit and you want to pull a specific object off
+disk. The **aggregator** is a different tool — it scrapes a directory of fits and yields the same objects
+(`Galaxies`, `Samples`, `Model`, ...) via **Python generators**, so memory use stays bounded no matter how
+many fits the directory contains. That generator-based design is its core feature, and it's what lets the
+aggregator scale from one fit to hundreds.
 
-bulge = af.Model(ag.lp_linear.Sersic)
-disk = af.Model(ag.lp_linear.Exponential)
-bulge.centre = disk.centre
+The two routes are complementary, not a hierarchy:
 
-galaxy = af.Model(ag.Galaxy, redshift=0.5, bulge=bulge, disk=disk)
+ - **Simple loading** is the most direct way to inspect one fit you already have a path to — one Python
+   object per call, all in memory.
+ - **Aggregator** is the right tool when you want to *iterate* over fits — even a single fit — and rely
+   on lazy evaluation, query filtering across many runs, or a `.sqlite` database back-end for very large
+   samples. It is also the API used by the workflow tools (`csv_maker.py`, `png_maker.py`, `fits_maker.py`)
+   that build scientific summaries of large fit samples.
 
-model = af.Collection(galaxies=af.Collection(galaxy=galaxy))
+Anything reached via `from_json(...)` in the simple-loading section above can also be reached through the
+aggregator below — both APIs return the same PyAutoFit / PyAutoGalaxy objects. After reading this section,
+the sibling files in `aggregator/` provide deeper examples for samples, fits, queries and database use.
 
-search = af.Nautilus(
-    path_prefix=Path("results_folder"),
-    name="results",
-    unique_tag=dataset_name,
-    n_batch=50,
-    n_live=100,
-    n_like_max=300,
-)
-
-analysis = ag.AnalysisImaging(dataset=dataset, use_jax=True)
-
-result = search.fit(model=model, analysis=analysis)
-
-"""
-__Info__
-
-As seen throughout the workspace, the `info` attribute shows the result in a readable format.
-"""
-print(result.info)
+If you are not familiar with the modeling API, see `autogalaxy_workspace/*/examples/modeling/` first.
 
-"""
 __Loading From Hard-disk__
 
 When performing fits which output results to hard-disk, a `files` folder is created containing .json / .csv files of
diff --git a/scripts/imaging/modeling.py b/scripts/imaging/modeling.py
index 46cb405a..28b92b11 100644
--- a/scripts/imaging/modeling.py
+++ b/scripts/imaging/modeling.py
@@ -46,7 +46,7 @@
 **VRAM Use:** Estimating GPU VRAM usage for JAX-accelerated fitting.
 **Run Times:** Discussion of computational run times and how to estimate them.
 **Model-Fit:** Running the model-fit and monitoring output.
-**Output Folder:** Description of the results written to the output folder.
+**Output Folder Layout:** Description of the structure of the `output` folder where results are written.
 **Result:** Inspecting the result object, maximum likelihood model and posteriors.
 **Features:** Links to advanced modeling features in the workspace.
 **Data Preparation:** Links to data preparation resources.
@@ -374,32 +374,41 @@
 result = search.fit(model=model, analysis=analysis)
 
 """
-__Output Folder__
-
-Now this is running you should checkout the `autogalaxy_workspace/output` folder. This is where the results of the
-search are written to hard-disk (in the `start_here` folder), where all outputs are human readable (e.g. as .json,
-.csv or text files).
-
-As the fit progresses, results are written to the `output` folder on the fly using the highest likelihood model found
-by the non-linear search so far. This means you can inspect the results of the model-fit as it runs, without having to
-wait for the non-linear search to terminate.
-
-The `output` folder includes:
-
- - `model.info`: Summarizes the model, its parameters and their priors discussed in the next tutorial.
-
- - `model.results`: Summarizes the highest likelihood model inferred so far including errors.
-
- - `images`: Visualization of the highest likelihood model-fit to the dataset, (e.g. a fit subplot showing the
-   galaxies, model data and residuals).
-
- - `files`: A folder containing .fits files of the dataset, the model as a human-readable .json file,
-   a `.csv` table of every non-linear search sample and other files containing information about the model-fit.
-
- - search.summary: A file providing summary statistics on the performance of the non-linear search.
-
- - `search_internal`: Internal files of the non-linear search (in this case Nautilus) used for resuming the fit and
-   visualizing the search.
+__Output Folder Layout__
+
+Now the fit is running you should checkout the `autogalaxy_workspace/output` folder. This is where results are
+written to hard-disk in human-readable formats — `.json`, `.csv`, `.fits`, `.png` and plain text.
+
+As the fit progresses, results are written on the fly using the highest likelihood model found by the
+non-linear search so far. This means you can inspect the model-fit as it runs, without waiting for the
+non-linear search to terminate.
+
+Each completed fit lives at a path like::
+
+    output/imaging//modeling//
+        files/                           <- JSON + CSV: loadable Python objects
+            galaxies.json                <- max log likelihood Galaxies
+            model.json                   <- fitted af.Collection model
+            samples.csv                  <- full Nautilus samples
+            samples_summary.json         <- max log likelihood parameter values + errors
+            samples_info.json            <- metadata about the samples
+            search.json                  <- non-linear search configuration
+            settings.json                <- search settings
+            covariance.csv               <- parameter covariance matrix
+        image/                           <- FITS + PNG: imaging products
+            dataset.fits                 <- data, noise-map and PSF
+            fit.fits                     <- model image, residuals, chi-squared map
+            model_galaxy_images.fits     <- per-galaxy model images
+            galaxy_images.fits           <- per-galaxy images
+            dataset.png, fit.png         <- visualisations
+        model.info                       <- human-readable model summary
+        model.results                    <- human-readable fit summary
+        search.summary                   <- search run summary
+        search_internal/                 <- internal files used to resume / visualise the search
+        metadata                         <- run metadata
+
+The `` is a 32-character identifier derived from the model, search and dataset, so re-running the
+same configuration resumes from the existing fit automatically.
 
 __Result__
 
diff --git a/scripts/interferometer/modeling.py b/scripts/interferometer/modeling.py
index 729194db..c8a94063 100644
--- a/scripts/interferometer/modeling.py
+++ b/scripts/interferometer/modeling.py
@@ -38,7 +38,7 @@
 **VRAM Use:** Estimating GPU VRAM requirements for the model fit.
 **Run Times:** Estimating the computational cost of the model fit.
 **Model-Fit:** Running the non-linear search to fit the model to data.
-**Output Folder:** Description of the results written to hard-disk during and after the fit.
+**Output Folder Layout:** Description of the structure of the `output` folder where results are written.
 **Result:** Inspecting the result object and maximum likelihood model.
 **Features:** Overview of advanced interferometer modeling features like pixelizations.
 **Data Preparation:** Pointers to data preparation scripts for your own data.
@@ -283,32 +283,42 @@
 result = search.fit(model=model, analysis=analysis)
 
 """
-__Output Folder__
-
-Now this is running you should checkout the `autogalaxy_workspace/output` folder. This is where the results of the
-search are written to hard-disk (in the `start_here` folder), where all outputs are human readable (e.g. as .json,
-.csv or text files).
-
-As the fit progresses, results are written to the `output` folder on the fly using the highest likelihood model found
-by the non-linear search so far. This means you can inspect the results of the model-fit as it runs, without having to
-wait for the non-linear search to terminate.
-
-The `output` folder includes:
-
- - `model.info`: Summarizes the model, its parameters and their priors discussed in the next tutorial.
-
- - `model.results`: Summarizes the highest likelihood model inferred so far including errors.
-
- - `images`: Visualization of the highest likelihood model-fit to the dataset, (e.g. a fit subplot showing the
-   galaxies, model data and residuals).
-
- - `files`: A folder containing .fits files of the dataset, the model as a human-readable .json file,
-   a `.csv` table of every non-linear search sample and other files containing information about the model-fit.
-
- - search.summary: A file providing summary statistics on the performance of the non-linear search.
-
- - `search_internal`: Internal files of the non-linear search (in this case Nautilus) used for resuming the fit and
-   visualizing the search.
+__Output Folder Layout__
+
+Now the fit is running you should checkout the `autogalaxy_workspace/output` folder. This is where results are
+written to hard-disk in human-readable formats — `.json`, `.csv`, `.fits`, `.png` and plain text.
+
+As the fit progresses, results are written on the fly using the highest likelihood model found by the
+non-linear search so far. This means you can inspect the model-fit as it runs, without waiting for the
+non-linear search to terminate.
+
+Each completed fit lives at a path like::
+
+    output/interferometer//modeling//
+        files/                           <- JSON + CSV: loadable Python objects
+            galaxies.json                <- max log likelihood Galaxies
+            model.json                   <- fitted af.Collection model
+            samples.csv                  <- full Nautilus samples
+            samples_summary.json         <- max log likelihood parameter values + errors
+            samples_info.json            <- metadata about the samples
+            search.json                  <- non-linear search configuration
+            settings.json                <- search settings
+            covariance.csv               <- parameter covariance matrix
+        image/                           <- FITS + PNG: visibility + image-plane products
+            dataset.fits                 <- visibilities, noise-map and uv-coverage
+            fit.fits                     <- model visibilities, residuals, chi-squared
+            dirty_images.fits            <- dirty images of data, model and residuals
+            model_galaxy_images.fits     <- per-galaxy model images
+            galaxy_images.fits           <- per-galaxy images
+            dataset.png, fit.png         <- visualisations
+        model.info                       <- human-readable model summary
+        model.results                    <- human-readable fit summary
+        search.summary                   <- search run summary
+        search_internal/                 <- internal files used to resume / visualise the search
+        metadata                         <- run metadata
+
+The `` is a 32-character identifier derived from the model, search and dataset, so re-running the
+same configuration resumes from the existing fit automatically.
 
 __Result__