
docs: comprehensive output folder layout in modeling tutorials and use search.paths in results/start_here #124

@Jammy2211


Overview

The modeling.py tutorials currently show a brief flat-bullet description of the output folder. A comprehensive directory-tree layout — showing the files/, image/, and root-level human-readable artefacts — is more useful to users skimming the script for the first time. This task replaces the existing block in every modeling-style tutorial across autolens_workspace and autogalaxy_workspace, adapted per package and data type.

In parallel, guides/results/start_here.py currently constructs a result path using a hardcoded <unique_hash> placeholder that users must replace by hand. The script already runs a real model-fit further down for the aggregator examples. This task restructures the file so that fit happens first, and the simple-loading section reaches the same disk artefacts via search.paths.output_path — eliminating the manual placeholder from active code.

Plan

  • Replace the simple __Output Folder__ block in every modeling.py (imaging, interferometer, point_source, cluster, group, ellipse) with a full directory-tree layout, adapted per library (autolens → tracer/source-plane; autogalaxy → galaxies; ellipse → ellipse fit products) and per data type (imaging vs interferometer vs point_source).
  • Restructure guides/results/start_here.py (autolens + autogalaxy) so the model-fit happens before the simple-loading examples and the loading uses search.paths.output_path.
  • Remove <unique_hash> from active code paths; keep it only in the descriptive header tree (it remains a useful documentation token).
  • Smoke-test all touched scripts with PYAUTO_TEST_MODE=2, then regenerate notebooks via /generate_and_merge.

Detailed implementation plan

Affected Repositories

  • Jammy2211/autolens_workspace (primary)
  • Jammy2211/autogalaxy_workspace

Work Classification

Workspace (no library code touched).

Branch Survey

Repository              Current Branch   Dirty?
./autolens_workspace    main             yes (unrelated dataset regenerations from smoke-test-optimization task)
./autogalaxy_workspace  main             yes (unrelated dataset regenerations from smoke-test-optimization task)

worktree_check_conflict returned 0 — no conflicting active task holds either repo. The new worktree is created from origin/main, so the dirty state in the canonical checkouts does not affect this task.

Suggested branch: feature/output-folder-layout-tutorials
Worktree root: ~/Code/PyAutoLabs-wt/output-folder-layout-tutorials/ (created later by /start_workspace)

Implementation Steps

  1. autolens modeling.py replacements. Replace the __Output Folder__ block in each of:

    • scripts/imaging/modeling.py (line 479)
    • scripts/interferometer/modeling.py (line 424)
    • scripts/point_source/modeling.py (line 410)
    • scripts/cluster/modeling.py (line 581)
    • scripts/group/modeling.py (line 653)

    New block uses the directory-tree format adapted to each data type. For imaging: dataset.fits with data/noise-map/PSF, plus tracer.fits, source_plane_images.fits, model_galaxy_images.fits, galaxy_images.fits. For interferometer: visibilities/uv-plane equivalents. For point_source: dataset.json with positions/fluxes. For cluster/group: imaging layout with cluster/group-specific products.

  2. autogalaxy modeling.py replacements. Same treatment for:

    • scripts/imaging/modeling.py (line 377)
    • scripts/interferometer/modeling.py
    • scripts/ellipse/modeling.py

    Variant uses galaxies.json, galaxy_images.fits, model_galaxy_images.fits (no tracer or source-plane). Ellipse variant lists ellipse-fit specific outputs.

  3. autolens results/start_here.py restructure. In scripts/guides/results/start_here.py:

    • Move the __Result Path__ and simple-loading sections (currently lines 101-176) to after the aggregator section's search.fit(...) call (line 256).
    • Replace result_path = Path("output") / "imaging" / "simple" / "modeling" / "<unique_hash>" with result_path = search.paths.output_path.
    • Adopt the same _quick_fit.py invocation pattern used by aggregator/samples.py so re-runs are cheap.
    • Update the narrative to introduce simple loading as a follow-on demonstration ("the same objects result.* returns in memory are also available on disk via search.paths.output_path").
  4. autogalaxy results/start_here.py restructure. Same treatment for autogalaxy_workspace/scripts/guides/results/start_here.py. The header tree is already correct in this file; only the <unique_hash> Path block needs the same restructure.

  5. Validation. Run each touched script under PYAUTO_TEST_MODE=2 PYAUTO_SMALL_DATASETS=1 PYAUTO_FAST_PLOTS=1 to confirm the rewrites still execute end to end. Run scripts/check_sizes.sh after the bulk pass to guard against accidental file truncation per workspace CLAUDE.md.

  6. Notebook regeneration. After scripts are merged, run /generate_and_merge for both workspaces to regenerate the corresponding .ipynb files in notebooks/.
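
The validation step above could be scripted roughly as follows. This is a minimal sketch, not the workspace's own tooling: the three environment flags come from step 5, while the helper names (`smoke_env`, `run_script`, `smoke_test`) and the injectable `runner` argument are hypothetical and exist only for illustration.

```python
import os
import subprocess
import sys
from pathlib import Path

# Environment flags named in the validation step of this plan.
SMOKE_ENV = {
    "PYAUTO_TEST_MODE": "2",
    "PYAUTO_SMALL_DATASETS": "1",
    "PYAUTO_FAST_PLOTS": "1",
}


def smoke_env(base=None):
    """Return a copy of the environment with the smoke-test flags applied."""
    env = dict(os.environ if base is None else base)
    env.update(SMOKE_ENV)
    return env


def run_script(script, cwd, runner=subprocess.run):
    """Run one script with the smoke-test flags and return its exit code."""
    completed = runner([sys.executable, str(script)], cwd=str(cwd), env=smoke_env())
    return completed.returncode


def smoke_test(workspace, scripts, runner=subprocess.run):
    """Run each script from the workspace root; return those that failed."""
    # Resolve first so script paths stay valid after the child changes cwd.
    workspace = Path(workspace).resolve()
    return [
        script
        for script in scripts
        if run_script(workspace / script, cwd=workspace, runner=runner) != 0
    ]
```

Calling `smoke_test("autolens_workspace", ["scripts/imaging/modeling.py"])` would return an empty list when the script exits cleanly. The size check via scripts/check_sizes.sh is not covered by this sketch and would still be run separately.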

Key Files

  • autolens_workspace/scripts/{imaging,interferometer,point_source,cluster,group}/modeling.py — replace __Output Folder__ block.
  • autogalaxy_workspace/scripts/{imaging,interferometer,ellipse}/modeling.py — replace __Output Folder__ block.
  • autolens_workspace/scripts/guides/results/start_here.py — restructure simple-loading; remove <unique_hash> from active code.
  • autogalaxy_workspace/scripts/guides/results/start_here.py — same restructure.
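
Because the line numbers quoted in the implementation steps can drift as the scripts change, the blocks to replace could be located by searching for the `__Output Folder__` marker instead. The sketch below assumes only that the marker appears verbatim in each script; `find_marker_lines` and `survey` are hypothetical helper names, not part of either workspace.

```python
from pathlib import Path

MARKER = "__Output Folder__"


def find_marker_lines(path, marker=MARKER):
    """Return the 1-indexed line numbers in `path` that contain `marker`."""
    text = Path(path).read_text(encoding="utf-8")
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if marker in line
    ]


def survey(workspace, pattern="scripts/**/modeling.py"):
    """Map each matching script (relative to `workspace`) to its marker lines."""
    workspace = Path(workspace)
    hits = {}
    for script in sorted(workspace.glob(pattern)):
        lines = find_marker_lines(script)
        if lines:
            hits[script.relative_to(workspace).as_posix()] = lines
    return hits
```

Widening `pattern` to `"scripts/**/*.py"` would also catch any non-modeling scripts that carry the same block, which the original prompt asks to search for.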

Decisions / Trade-offs

  • Layout block is customized per package and data type (not a single shared block) because the autolens variant references tracer/source-plane and the autogalaxy variant references galaxies-only.
  • <unique_hash> is removed from active Path(...) constructions but kept in the descriptive header directory tree (the user-pasted layout preserved it; documentation only, not executable).
  • Restructure simple-loading to follow the fit, rather than precede it, so search.paths.output_path is available without inventing a placeholder. Cleanest for the reader.
  • The user clarified that "aggregator/start_here.py" in the prompt refers to results/start_here.py — so no separate file restructure is needed.
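
The guarded-loading pattern behind the restructure can be illustrated without either library. In the workspace, `result_path` would be `search.paths.output_path` and the load would go through `from_json`, as in the original prompt; the sketch below substitutes the standard-library `json` module and a hypothetical helper, `load_json_artefact`, purely so that it runs on its own.

```python
import json
from pathlib import Path


def load_json_artefact(result_path, name):
    """Load files/<name> under `result_path`, or return None if it is absent."""
    file_path = Path(result_path) / "files" / name
    # Guard on existence so the script still runs when the fit has not
    # written this artefact yet.
    if not file_path.exists():
        return None
    with file_path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

The point of the pattern is that the path is taken from the search that produced the fit, so no hash has to be typed or pasted by hand, and a missing file is reported as `None` rather than raising.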

Original Prompt


This:

Output Folder Layout

Each completed fit lives at a path like::

output/imaging/<dataset_name>/modeling/<unique_hash>/
    files/                     <- JSON + CSV: loadable Python objects
        tracer.json            <- max log likelihood Tracer
        model.json             <- fitted af.Collection model
        samples.csv            <- full Nautilus samples
        samples_summary.json   <- max log likelihood parameter values + errors
        samples_info.json      <- metadata about the samples
        search.json            <- non-linear search configuration
        settings.json          <- search settings
        cosmology.json         <- cosmology used for the fit
        covariance.csv         <- parameter covariance matrix
    image/                     <- FITS: imaging products
        dataset.fits           <- data, noise-map and PSF
        fit.fits               <- model image, residuals, chi-squared map
        tracer.fits            <- tracer image-plane images per galaxy
        source_plane_images.fits  <- source plane reconstructions
        model_galaxy_images.fits  <- per-galaxy model images
        galaxy_images.fits        <- per-galaxy images
        dataset.png, fit.png, tracer.png   <- visualisations
    model.info                 <- human-readable model summary
    model.results              <- human-readable fit summary
    search.summary             <- search run summary
    metadata                   <- run metadata

Is far superior to this in modeling.py files (and others, do a search):

Output Folder

Now that this is running, you should check out the `autolens_workspace/output` folder. This is where the results of the
search are written to hard-disk (in the `start_here` folder), where all outputs are human readable (e.g. as .json,
.csv or text files).

As the fit progresses, results are written to the `output` folder on the fly using the highest likelihood model found
by the non-linear search so far. This means you can inspect the results of the model-fit as it runs, without having to
wait for the non-linear search to terminate.

The `output` folder includes:

  • `model.info`: Summarizes the lens model, its parameters and their priors discussed in the next tutorial.

  • `model.results`: Summarizes the highest likelihood lens model inferred so far including errors.

  • `image`: Visualization of the highest likelihood model-fit to the dataset (e.g. a fit subplot showing the lens
    and source galaxies, model data and residuals) in .png and .fits formats.

  • `files`: A folder containing human-readable .json files describing the model, search and other aspects of the fit and
    a `.csv` table of every non-linear search sample.

  • `search.summary`: A file providing summary statistics on the performance of the non-linear search.

  • `search_internal`: Internal files of the non-linear search (in this case Nautilus) used for resuming the fit and
    visualizing the search.

Update the modeling.py files to include the full thing you listed!

However, in results/start_here.py, you use this to load results:

result_path = (
    Path("output")
    / "imaging"
    / "simple"
    / "modeling"
    / "<unique_hash>"  # The 32-character identifier for the specific fit.
)

instead of the search path:

result_path = search.paths.output_path # Points at the fit's unique output folder.

if (result_path / "files" / "tracer.json").exists():
    tracer = from_json(file_path=result_path / "files" / "tracer.json")

tracer_fits = al.Array2D.from_fits(
    file_path=result_path / "image" / "tracer.fits", hdu=0, pixel_scales=0.1
)

We do not want <unique_hash> to appear anywhere in these examples. Can you make it so
results/start_here.py runs the same analysis and search as aggregator/start_here.py, and loads the results using
the search path in the same way?
