From 11669059b0e9388b7577ddf3cb7c03477916da91 Mon Sep 17 00:00:00 2001
From: Jammy2211
Date: Thu, 28 May 2026 10:45:08 +0100
Subject: [PATCH] feat: first-class af.Nautilus search profiling + A100 HPC submit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replaces the raw nautilus.Sampler wrappers in searches/ with a first-class
af.Nautilus profile that exercises the full PyAutoFit lifecycle:
visualization, samples I/O, search.summary, latent variables.

- Sweep matrix: (sampler × dataset_class × model × instrument × hardware ×
  precision). Sampler registry in _samplers.py is ready for Dynesty/Emcee/
  BlackJAX additions as one-function changes.
- Per-model n_live matches the SLaM canonical phases (200 for mge /
  point-source / parametric; 150 for pixelization / Delaunay / datacube).
- Datacube uses af.FactorGraphModel to combine N AnalysisInterferometer
  factors, mirroring autolens_workspace/scripts/multi/modeling.py.
- _metrics.attach_viz_timer wraps every visualize-family hook so the JSON
  splits total_wall_s into sampler_wall_s + viz_wall_s.
- force_pickle_overwrite=True + unique path_prefix per cell defeat the
  .completed-file resume that would otherwise return cached results across
  repeated sweep iterations.
- sweep.py: resume-by-default with --force override.
- aggregate.py: walks the 4-level (sampler/ds/model/instrument) tree and
  emits comparison.{json,png} per cell.
- hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64: SLURM submit for the HST
  MGE fp64 cell on A100, modeled on the existing likelihood-profiling
  submits in z_projects/profiling/hpc/batch_gpu/.
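
A minimal sketch of the timing split described in the _metrics.attach_viz_timer
bullet, for readers who want the idea without opening the diff. This is not the
code in _metrics.py: the function names `attach_viz_timer_sketch` and
`split_wall_time` are hypothetical, and the assumption that a "visualize-family
hook" is any callable attribute whose name starts with `visualize` is mine.
Only the JSON field names (total_wall_s, sampler_wall_s, viz_wall_s) come from
the message above.

```python
import time
from functools import wraps


def attach_viz_timer_sketch(analysis, timings):
    """Wrap every callable attribute named visualize* so its wall time
    accumulates into timings["viz_wall_s"]. Hypothetical helper."""
    timings.setdefault("viz_wall_s", 0.0)
    state = {"depth": 0}  # guards against double counting nested hooks

    def make_timed(fn):
        @wraps(fn)
        def timed(*args, **kwargs):
            state["depth"] += 1
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                state["depth"] -= 1
                # Only the outermost hook adds its elapsed time.
                if state["depth"] == 0:
                    timings["viz_wall_s"] += time.perf_counter() - start

        return timed

    for name in dir(analysis):
        if not name.startswith("visualize"):
            continue
        hook = getattr(analysis, name)
        if callable(hook):
            # Shadow the bound method with a timed wrapper on the instance.
            setattr(analysis, name, make_timed(hook))
    return analysis


def split_wall_time(total_wall_s, timings):
    """Derive sampler time as total minus visualization time."""
    viz_wall_s = timings.get("viz_wall_s", 0.0)
    return {
        "total_wall_s": total_wall_s,
        "viz_wall_s": viz_wall_s,
        "sampler_wall_s": total_wall_s - viz_wall_s,
    }
```

Wrapping on the instance rather than the class keeps the timer local to one
sweep cell, which is one plausible reason to structure it this way.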

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 hpc/batch_gpu/error/.gitignore                |   2 +
 hpc/batch_gpu/output/.gitignore               |   2 +
 .../submit_imaging_mge_a100_hst_fp64          |  49 ++
 searches/README.md                            | 185 ++++--
 searches/_metrics.py                          | 236 +++++---
 searches/_runner.py                           | 269 +++++++++
 searches/_samplers.py                         |  96 +++
 searches/_setup.py                            | 557 ++++++++++++++++--
 searches/aggregate.py                         | 256 ++++++++
 searches/nautilus/README.md                   |  46 --
 searches/nautilus/datacube/delaunay.py        |  28 +
 searches/nautilus/imaging/delaunay.py         |  24 +
 searches/nautilus/imaging/mge.py              |  24 +
 searches/nautilus/imaging/pixelization.py     |  24 +
 searches/nautilus/interferometer/delaunay.py  |  19 +
 searches/nautilus/interferometer/mge.py       |  24 +
 .../nautilus/interferometer/pixelization.py   |  19 +
 searches/nautilus/jax.py                      | 227 -------
 searches/nautilus/point_source/image_plane.py |  23 +
 .../nautilus/point_source/source_plane.py     |  23 +
 searches/nautilus/simple.py                   | 198 -------
 searches/sweep.py                             | 359 +++++++++++
 22 files changed, 2039 insertions(+), 651 deletions(-)
 create mode 100644 hpc/batch_gpu/error/.gitignore
 create mode 100644 hpc/batch_gpu/output/.gitignore
 create mode 100755 hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64
 create mode 100644 searches/_runner.py
 create mode 100644 searches/_samplers.py
 create mode 100644 searches/aggregate.py
 delete mode 100644 searches/nautilus/README.md
 create mode 100644 searches/nautilus/datacube/delaunay.py
 create mode 100644 searches/nautilus/imaging/delaunay.py
 create mode 100644 searches/nautilus/imaging/mge.py
 create mode 100644 searches/nautilus/imaging/pixelization.py
 create mode 100644 searches/nautilus/interferometer/delaunay.py
 create mode 100644 searches/nautilus/interferometer/mge.py
 create mode 100644 searches/nautilus/interferometer/pixelization.py
 delete mode 100644 searches/nautilus/jax.py
 create mode 100644 searches/nautilus/point_source/image_plane.py
 create mode 100644 searches/nautilus/point_source/source_plane.py
 delete mode 100644 searches/nautilus/simple.py
 create mode 100644 searches/sweep.py

diff --git a/hpc/batch_gpu/error/.gitignore b/hpc/batch_gpu/error/.gitignore
new file mode 100644
index 0000000..c1aefd2
--- /dev/null
+++ b/hpc/batch_gpu/error/.gitignore
@@ -0,0 +1,2 @@
+.gitignore
+!.gitignore
diff --git a/hpc/batch_gpu/output/.gitignore b/hpc/batch_gpu/output/.gitignore
new file mode 100644
index 0000000..c1aefd2
--- /dev/null
+++ b/hpc/batch_gpu/output/.gitignore
@@ -0,0 +1,2 @@
+.gitignore
+!.gitignore
diff --git a/hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64 b/hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64
new file mode 100755
index 0000000..5f0f3ee
--- /dev/null
+++ b/hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64
@@ -0,0 +1,49 @@
+#!/bin/bash -l
+#
+# A100 first-class search profiling: searches/nautilus/imaging/mge × hst × fp64.
+#
+# Drives af.Nautilus end-to-end (visualization, samples I/O, search.summary)
+# on the HST imaging MGE model from the autolens_profiling/searches package.
+# Mirrors the resource budget of the sibling likelihood profiling submit
+# (z_projects/profiling/hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64) but
+# allocates more wall time because a first-class fit runs the full Nautilus
+# convergence loop, not a one-shot likelihood evaluation.
+
+#SBATCH -J search_nautilus_imaging_mge_hst_fp64
+#SBATCH --partition=gpu
+#SBATCH --gres=gpu:1
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=4
+#SBATCH --mem=64gb
+#SBATCH --time=2:00:00
+#SBATCH -o output/output.%A.out
+#SBATCH -e error/error.%A.err
+#SBATCH --mail-type=END,FAIL
+#SBATCH --mail-user=james.w.nightingale@durham.ac.uk
+
+export AP_ROOT=/mnt/ral/jnightin/autolens_profiling
+source /mnt/ral/jnightin/PyAutoNSS/PyAutoNSS/bin/activate
+
+export JAX_PLATFORM_NAME=cuda
+export JAX_PLATFORMS=cuda,cpu
+export XLA_PYTHON_CLIENT_PREALLOCATE=false
+export JAX_ENABLE_X64=True
+export NUMBA_CACHE_DIR=/tmp/numba_cache
+export MPLCONFIGDIR=/tmp/matplotlib
+
+nvidia-smi
+
+echo "=========================================="
+date
+echo "Cell: searches/nautilus/imaging/mge"
+echo "Instrument: hst"
+echo "Precision: fp64"
+
+cd $AP_ROOT
+python3 searches/nautilus/imaging/mge.py \
+    --instrument hst \
+    --config-name hpc_a100_fp64 \
+    --output-dir $AP_ROOT/results/searches/nautilus/imaging/mge/hst
+
+echo "Finished."
+date
diff --git a/searches/README.md b/searches/README.md
index 6ab7d3c..26285d4 100644
--- a/searches/README.md
+++ b/searches/README.md
@@ -1,72 +1,159 @@
-# searches
+# `searches/` — first-class search profiling
 
-Sampler / search profiling for the PyAutoLens HST MGE lens-modelling likelihood. Each subfolder drives a single sampler family directly against the real likelihood — bypassing `af.NonLinearSearch` — so the per-sampler convergence characteristics (wall time, likelihood evaluations, posterior ESS, evals/time to ML) can be compared on identical footing.
+This section profiles **first-class PyAutoFit search objects** end-to-end:
+`af.Nautilus` today, with the registry shape ready for `af.DynestyStatic`,
+`af.BlackJAXNUTS`, `af.Emcee`, etc. Unlike `likelihood_runtime/` (which
+profiles `analysis.log_likelihood_function` in isolation), every cell here
+runs `search.fit(model=model, analysis=analysis)` — so visualization,
+samples I/O, `samples_info.json`, latent variables, and every other piece
+of PyAutoFit machinery is exercised and measured.
 
-## Why bypass `af.NonLinearSearch`?
+## Design
 
-`af.NonLinearSearch` adds caching, multi-process forking, output formatting, and result hierarchies that are valuable for production fits but obscure the underlying sampler's cost. The scripts in this section call the sampler library directly and instrument every likelihood evaluation through a shared `MLTracker`. The result is a clean apples-to-apples comparison of:
+| Dimension      | Values                                                                    |
+|----------------|---------------------------------------------------------------------------|
+| Sampler        | `nautilus` (more to come via `_samplers.SAMPLER_BUILDERS`)                |
+| Dataset class  | `imaging`, `interferometer`, `point_source`, `datacube`                   |
+| Model type     | `mge`, `pixelization`, `delaunay`, `image_plane`, `source_plane`          |
+| Instrument     | per-dataset-class (HST/Euclid/JWST/AO; SMA/ALMA/ALMA-high/JVLA; simple)   |
+| Hardware       | `local_cpu`, `local_gpu`, `hpc_a100` (external dispatch)                  |
+| Precision      | `fp64`, `mp` (mixed precision via `al.Settings(use_mixed_precision=...)`) |
 
-- Wall time and likelihood-evaluation count to **Nautilus's default convergence** (`n_eff=10000`, `f_live=0.01`).
-- Per-evaluation likelihood cost (NumPy baseline vs JAX-JIT'd path).
-- Evals-to-ML and time-to-ML — the eval index and wall time at which the running max log L first came within 1 nat of the final maximum.
-
-## Shared helpers
-
-| File | Role |
-|------|------|
-| [`_setup.py`](./_setup.py) | Builds the HST imaging dataset, the MGE + Isothermal + ExternalShear lens model with an MGE source bulge, and the `AnalysisImaging` object. The dataset, mask, and model mirror the reference setup in [`likelihood/imaging/mge.py`](../likelihood/imaging/mge.py) so likelihood values are directly comparable across the two sections. |
-| [`_metrics.py`](./_metrics.py) | `MLTracker` — records the log-likelihood and wall time of every evaluation, computes evals-to-ML and time-to-ML headline numbers. Also offers `MLTracker.from_log_l_history` for samplers that JIT their likelihood and only expose log-L per dead/live point post hoc. |
-
-## Supported samplers
-
-| Sampler | Folder | Status | Notes |
-|---------|--------|--------|-------|
-| Nautilus | [`nautilus/`](./nautilus/README.md) | ✓ profiled | Both NumPy (`simple.py`) and JAX-JIT (`jax.py`) variants. |
-| Dynesty | _planned_ | not yet mirrored | Static nested sampling; reference scripts at `autolens_workspace_developer/searches_minimal/dynesty_simple.py`. |
-| Emcee | _planned_ | not yet mirrored | Affine-invariant ensemble MCMC. |
-| BlackJAX (NUTS, SMC) | _planned_ | not yet mirrored | Pure-JAX HMC family. Gradient pathology surfaced in upstream `sweep_findings.md`; HMC viability depends on first fixing NaN-gradient hot spots. |
-| NumPyro (ESS) | _planned_ | not yet mirrored | Ensemble slice sampler under JAX. |
-| PocoMC | _planned_ | not yet mirrored | Preconditioned Monte Carlo. |
-| NSS (simple, jit, grad) | _planned_ | not yet mirrored | Nested slice sampler; `nss_jit.py` shows VRAM ceiling on consumer GPUs (see `sweep_findings.md`). |
-| LBFGS | _planned_ | not yet mirrored | Not a sampler; serves as the maximum-likelihood reference point. |
-
-Each row above corresponds to one or more scripts under `autolens_workspace_developer/searches_minimal/`; the mirror migration here under their own follow-up prompts.
-
-## Versioned artifacts
-
-Each script writes a JSON + PNG pair to:
+Layout:
 
 ```
-results/searches//