Merged
61 changes: 34 additions & 27 deletions scripts/ellipse/modeling.py
@@ -43,7 +43,7 @@
**Analysis:** Setting up the AnalysisEllipse object for the model fit.
**Run Times:** Estimating the computational cost of the ellipse model fit.
**Model-Fit:** Running the non-linear search to fit the ellipse to data.
**Output Folder:** Description of the results written to hard-disk during and after the fit.
**Output Folder Layout:** Description of the structure of the `output` folder where results are written.
**Result:** Inspecting the result object and maximum likelihood ellipse parameters.
**Multiple Ellipses:** Fitting many ellipses of increasing size to trace the full galaxy.
**Final Fit:** Combining all ellipses into a single final fit.
@@ -343,32 +343,39 @@
result = search.fit(model=model, analysis=analysis)

"""
__Output Folder__

Now the fit is running you should check out the `autogalaxy_workspace/output` folder. This is where the results of the
search are written to hard-disk (in the `start_here` folder), where all outputs are human readable (e.g. as .json,
.csv or text files).

As the fit progresses, results are written to the `output` folder on the fly using the highest likelihood model found
by the non-linear search so far. This means you can inspect the results of the model-fit as it runs, without having to
wait for the non-linear search to terminate.

The `output` folder includes:

- `model.info`: Summarizes the model, its parameters and their priors discussed in the next tutorial.

- `model.results`: Summarizes the highest likelihood model inferred so far including errors.

- `images`: Visualization of the highest likelihood model-fit to the dataset, (e.g. a fit subplot showing the
galaxies, model data and residuals).

- `files`: A folder containing .fits files of the dataset, the model as a human-readable .json file,
a `.csv` table of every non-linear search sample and other files containing information about the model-fit.

- `search.summary`: A file providing summary statistics on the performance of the non-linear search.

- `search_internal`: Internal files of the non-linear search (in this case Nautilus) used for resuming the fit and
visualizing the search.
__Output Folder Layout__

Now the fit is running you should check out the `autogalaxy_workspace/output` folder. This is where results are
written to hard-disk in human-readable formats — `.json`, `.csv`, `.fits`, `.png` and plain text.

As the fit progresses, results are written on the fly using the highest likelihood model found by the
non-linear search so far. This means you can inspect the model-fit as it runs, without waiting for the
non-linear search to terminate.

Each completed fit lives at a path like::

    output/imaging/<dataset_name>/modeling/<unique_hash>/
        files/                    <- JSON + CSV: loadable Python objects
            ellipse.json          <- max log likelihood Ellipse(s)
            model.json            <- fitted af.Collection model
            samples.csv           <- full Nautilus samples
            samples_summary.json  <- max log likelihood parameter values + errors
            samples_info.json     <- metadata about the samples
            search.json           <- non-linear search configuration
            settings.json         <- search settings
            covariance.csv        <- parameter covariance matrix
        image/                    <- FITS + PNG: ellipse-fit products
            dataset.fits          <- data and noise-map
            fit.fits              <- ellipse traces over the image, residuals along each ellipse
            dataset.png, fit.png  <- visualisations
        model.info                <- human-readable model summary
        model.results             <- human-readable fit summary
        search.summary            <- search run summary
        search_internal/          <- internal files used to resume / visualise the search
        metadata                  <- run metadata

The `<unique_hash>` is a 32-character identifier derived from the model, search and dataset, so re-running the
same configuration resumes from the existing fit automatically.
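Because the `<unique_hash>` folder name is not predictable in advance, one convenient pattern is to discover completed fits by searching for a file every finished fit writes, such as `model.info`. A minimal stdlib sketch, assuming only the layout described above (the miniature folder tree it builds is purely illustrative):

```python
from pathlib import Path
import tempfile


def completed_fits(output_path):
    """Yield the folder of every fit that has written a `model.info` file."""
    return sorted(p.parent for p in Path(output_path).rglob("model.info"))


# Build a miniature copy of the layout above to demonstrate the search.
root = Path(tempfile.mkdtemp())
fit_folder = root / "imaging" / "simple" / "modeling" / ("a" * 32)
fit_folder.mkdir(parents=True)
(fit_folder / "model.info").write_text("model summary")

# Returns the single fit folder created above, however deep it is nested.
print(completed_fits(root))
```

This is the same directory walk the aggregator performs for you when it scrapes `output/`, so hand-rolling it is only worthwhile for quick one-off inspection.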

__Result__

199 changes: 109 additions & 90 deletions scripts/guides/results/start_here.py
@@ -17,10 +17,11 @@
`Model`, ...) via Python generators, so you can iterate over hundreds of fits without holding them all in
memory at once. This is the right tool when you want to analyse a sample of galaxies together.

Both sections appear below in that order. The aggregator section first runs a quick model-fit so a results
directory exists, then walks through the deeper API (samples, fits, galaxies, units, pixelization). Where each
aggregator section reaches a result that the simple-loading API also exposes, this is noted — both routes return
the same PyAutoFit / PyAutoGalaxy objects, just sourced from disk in different ways.
Both sections appear below in that order. To keep them runnable from a fresh checkout, this script first
performs a quick model-fit so a results directory exists for both halves to read from. The aggregator
section then walks through the deeper API (samples, fits, galaxies, units, pixelization). Where each
aggregator section reaches a result that the simple-loading API also exposes, this is noted — both routes
return the same PyAutoFit / PyAutoGalaxy objects, just sourced from disk in different ways.

__Output Folder Layout__

@@ -50,6 +51,9 @@

__Contents__

**Model Fit:** Run a quick fit once so both halves of this guide have a real result on disk to read from.
**Info:** Print the result in a readable format.

**Simple Loading (one fit at a time):**

**Galaxies:** Load the maximum log likelihood `Galaxies` from `galaxies.json`.
@@ -59,8 +63,6 @@

**Aggregator (many fits, generator-based):**

**Model Fit:** Run a quick fit so a results directory exists for the aggregator examples.
**Info:** Print the result in a readable format.
**Loading From Hard-disk:** Use `Aggregator.from_directory(...)` to scrape `output/`.
**Generators:** How Python generators give the aggregator its memory efficiency.
**Database File:** Loading from a `.sqlite` database for very large samples.
@@ -89,28 +91,98 @@

"""
==============================================================================
SIMPLE LOADING
MODEL FIT
==============================================================================

The first half of this guide loads a single fit directly from `output/`. This is the fastest way to inspect one
fit — every file under `files/` and `image/` is a Python object away.
Both halves of this guide — simple loading and the aggregator — need a real fit on disk to read from. We
perform that fit once here. The rest of the file then uses the resulting `search` object (via
`search.paths.output_path`) and the in-memory `result`.

__Result Path__
The model and search match `aggregator/samples.py` and `_quick_fit.py`, so re-running this tutorial resumes
from the cached fit instead of redoing the search.

__Quick Fit Auto-Trigger__

If a previous fit has not been run yet, the shared helper `_quick_fit.py` is invoked to produce one.
The helper writes a capped Nautilus fit to `output/results_folder/` so this tutorial has results to
work with. When that folder already exists the helper exits immediately, so re-running this tutorial is
cheap.
"""
results_path = Path("output") / "results_folder"
if not results_path.exists():
    import subprocess
    import sys

    subprocess.run(
        [sys.executable, "scripts/guides/results/_quick_fit.py"],
        check=True,
    )

Set the path to the folder of the fit you want to load. Replace the values below with the path to your own fit.
"""
__Dataset and Model__

The `<unique_hash>` placeholder is a 32-character identifier specific to the fit; each load below is guarded with
`.exists()` so this script runs cleanly even before you replace it.
Set up the same dataset and model as `_quick_fit.py`, then call `search.fit(...)`. Because the search has
already run, this resumes from the checkpoint and returns the in-memory `Result` object instantly.
"""
result_path = (
    Path("output")
    / "imaging"
    / "features"
    / "simple"
    / "start_here"
    / "<unique_hash>"  # The 32-character identifier for the specific fit.
)
dataset_name = "simple"
dataset_path = Path("dataset") / "imaging" / dataset_name

dataset = ag.Imaging.from_fits(
data_path=dataset_path / "data.fits",
psf_path=dataset_path / "psf.fits",
noise_map_path=dataset_path / "noise_map.fits",
pixel_scales=0.1,
)

mask = ag.Mask2D.circular(
shape_native=dataset.shape_native, pixel_scales=dataset.pixel_scales, radius=3.0
)

dataset = dataset.apply_mask(mask=mask)

bulge = af.Model(ag.lp_linear.Sersic)
disk = af.Model(ag.lp_linear.Exponential)
bulge.centre = disk.centre

galaxy = af.Model(ag.Galaxy, redshift=0.5, bulge=bulge, disk=disk)

model = af.Collection(galaxies=af.Collection(galaxy=galaxy))

search = af.Nautilus(
path_prefix=Path("results_folder"),
name="results",
unique_tag=dataset_name,
n_batch=50,
n_live=100,
n_like_max=300,
)

analysis = ag.AnalysisImaging(dataset=dataset, use_jax=True)

result = search.fit(model=model, analysis=analysis)

"""
__Info__

As seen throughout the workspace, the `info` attribute shows the result in a readable format.
"""
print(result.info)

"""
==============================================================================
SIMPLE LOADING
==============================================================================

The first half of this guide loads a single fit directly from `output/`. This is the fastest way to inspect
one fit — every file under `files/` and `image/` is a Python object away.

__Result Path__

Point at the fit's output folder. Because the fit ran above, `search.paths.output_path` already points at
the right location — there is no need to construct the path manually or know the unique hash.
"""
result_path = search.paths.output_path

files_path = result_path / "files"
image_path = result_path / "image"

@@ -144,7 +216,7 @@
"""
if (files_path / "samples.csv").exists() and (files_path / "model.json").exists():
model = from_json(file_path=files_path / "model.json")
samples = af.SamplesNest.from_csv(file_path=files_path / "samples.csv", model=model)
samples = af.SamplesNest.from_table(filename=files_path / "samples.csv", model=model)
print(samples.max_log_likelihood())

"""
@@ -174,80 +246,27 @@
AGGREGATOR
==============================================================================

Everything above loaded one fit at a time, by pointing at a specific output directory. To analyse many fits — for
example a sample of hundreds of galaxies — use the **aggregator** instead. It scrapes a directory of fits and yields
the same objects (`Galaxies`, `Samples`, `Model`, ...) via Python generators, keeping memory use low.

The remainder of this file is the aggregator entry point. After reading it, the sibling files in `aggregator/`
provide more detailed examples for analysing different aspects of the results.

To make the examples below runnable from a fresh checkout, we first perform a quick model-fit so the aggregator
has a results directory to scrape. Anything `from_json(...)` reaches in the simple-loading section above can also
be reached through the aggregator below — both APIs return the same Python objects.

If you are not familiar with the modeling API, see `autogalaxy_workspace/*/examples/modeling/` first.
"""
dataset_name = "simple"
dataset_path = Path("dataset") / "imaging" / dataset_name

"""
__Dataset Auto-Simulation__

If the dataset does not already exist on your system, it will be created by running the corresponding
simulator script. This ensures that all example scripts can be run without manually simulating data first.
"""
if not dataset_path.exists():
    import subprocess
    import sys

    subprocess.run(
        [sys.executable, "scripts/imaging/simulator.py"],
        check=True,
    )


dataset = ag.Imaging.from_fits(
data_path=dataset_path / "data.fits",
psf_path=dataset_path / "psf.fits",
noise_map_path=dataset_path / "noise_map.fits",
pixel_scales=0.1,
)

mask = ag.Mask2D.circular(
shape_native=dataset.shape_native, pixel_scales=dataset.pixel_scales, radius=3.0
)

dataset = dataset.apply_mask(mask=mask)
Simple loading is the right tool when you have a single fit and you want to pull a specific object off
disk. The **aggregator** is a different tool — it scrapes a directory of fits and yields the same objects
(`Galaxies`, `Samples`, `Model`, ...) via **Python generators**, so memory use stays bounded no matter how
many fits the directory contains. That generator-based design is its core feature, and it's what lets the
aggregator scale from one fit to hundreds.
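The memory behaviour described here is the standard Python generator pattern. A self-contained sketch of that pattern, with `load_result` as a hypothetical stand-in for the aggregator's real deserialisation of a `files/` folder:

```python
def load_result(path):
    # Hypothetical stand-in for reading `files/samples.csv` etc. from one fit.
    return {"path": path, "max_log_likelihood": 0.0}


def results_generator(fit_paths):
    # A generator: each result is built only when the loop requests it and
    # can be freed before the next one is built, so memory use stays flat
    # no matter how many fit folders are listed.
    for path in fit_paths:
        yield load_result(path)


paths = [f"output/fit_{i}" for i in range(1000)]
gen = results_generator(paths)  # nothing loaded yet
first = next(gen)               # exactly one result in memory
print(first["path"])            # -> output/fit_0
```

Iterating over the aggregator's `values(...)` generators behaves the same way: a `for` loop touches one fit at a time, which is why a sample of hundreds of galaxies never has to fit in memory at once.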

bulge = af.Model(ag.lp_linear.Sersic)
disk = af.Model(ag.lp_linear.Exponential)
bulge.centre = disk.centre
The two routes are complementary, not a hierarchy:

galaxy = af.Model(ag.Galaxy, redshift=0.5, bulge=bulge, disk=disk)
- **Simple loading** is the most direct way to inspect one fit you already have a path to — one Python
object per call, all in memory.
- **Aggregator** is the right tool when you want to *iterate* over fits — even a single fit — and rely
on lazy evaluation, query filtering across many runs, or a `.sqlite` database back-end for very large
samples. It is also the API used by the workflow tools (`csv_maker.py`, `png_maker.py`, `fits_maker.py`)
that build scientific summaries of large fit samples.

model = af.Collection(galaxies=af.Collection(galaxy=galaxy))
Anything reached via `from_json(...)` in the simple-loading section above can also be reached through the
aggregator below — both APIs return the same PyAutoFit / PyAutoGalaxy objects. After reading this section,
the sibling files in `aggregator/` provide deeper examples for samples, fits, queries and database use.

search = af.Nautilus(
path_prefix=Path("results_folder"),
name="results",
unique_tag=dataset_name,
n_batch=50,
n_live=100,
n_like_max=300,
)

analysis = ag.AnalysisImaging(dataset=dataset, use_jax=True)

result = search.fit(model=model, analysis=analysis)

"""
__Info__

As seen throughout the workspace, the `info` attribute shows the result in a readable format.
"""
print(result.info)
If you are not familiar with the modeling API, see `autogalaxy_workspace/*/examples/modeling/` first.

"""
__Loading From Hard-disk__

When performing fits which output results to hard-disk, a `files` folder is created containing .json / .csv files of