diff --git a/hpc/batch_gpu/error/.gitignore b/hpc/batch_gpu/error/.gitignore
new file mode 100644
index 0000000..c1aefd2
--- /dev/null
+++ b/hpc/batch_gpu/error/.gitignore
@@ -0,0 +1,2 @@
+.gitignore
+!.gitignore
diff --git a/hpc/batch_gpu/output/.gitignore b/hpc/batch_gpu/output/.gitignore
new file mode 100644
index 0000000..c1aefd2
--- /dev/null
+++ b/hpc/batch_gpu/output/.gitignore
@@ -0,0 +1,2 @@
+.gitignore
+!.gitignore
diff --git a/hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64 b/hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64
new file mode 100755
index 0000000..5f0f3ee
--- /dev/null
+++ b/hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64
@@ -0,0 +1,49 @@
+#!/bin/bash -l
+#
+# A100 first-class search profiling: searches/nautilus/imaging/mge × hst × fp64.
+#
+# Drives af.Nautilus end-to-end (visualization, samples I/O, search.summary)
+# on the HST imaging MGE model from the autolens_profiling/searches package.
+# Mirrors the resource budget of the sibling likelihood profiling submit
+# (z_projects/profiling/hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64) but
+# allocates more wall time because a first-class fit runs the full Nautilus
+# convergence loop, not a one-shot likelihood evaluation.
+
+#SBATCH -J search_nautilus_imaging_mge_hst_fp64
+#SBATCH --partition=gpu
+#SBATCH --gres=gpu:1
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=4
+#SBATCH --mem=64gb
+#SBATCH --time=2:00:00
+#SBATCH -o output/output.%A.out
+#SBATCH -e error/error.%A.err
+#SBATCH --mail-type=END,FAIL
+#SBATCH --mail-user=james.w.nightingale@durham.ac.uk
+
+export AP_ROOT=/mnt/ral/jnightin/autolens_profiling
+source /mnt/ral/jnightin/PyAutoNSS/PyAutoNSS/bin/activate
+
+export JAX_PLATFORM_NAME=cuda
+export JAX_PLATFORMS=cuda,cpu
+export XLA_PYTHON_CLIENT_PREALLOCATE=false
+export JAX_ENABLE_X64=True
+export NUMBA_CACHE_DIR=/tmp/numba_cache
+export MPLCONFIGDIR=/tmp/matplotlib
+
+nvidia-smi
+
+echo "=========================================="
+date
+echo "Cell: searches/nautilus/imaging/mge"
+echo "Instrument: hst"
+echo "Precision: fp64"
+
+cd $AP_ROOT
+python3 searches/nautilus/imaging/mge.py \
+    --instrument hst \
+    --config-name hpc_a100_fp64 \
+    --output-dir $AP_ROOT/results/searches/nautilus/imaging/mge/hst
+
+echo "Finished."
+date
diff --git a/searches/README.md b/searches/README.md
index 6ab7d3c..26285d4 100644
--- a/searches/README.md
+++ b/searches/README.md
@@ -1,72 +1,159 @@
-# searches
+# `searches/` — first-class search profiling

-Sampler / search profiling for the PyAutoLens HST MGE lens-modelling likelihood. Each subfolder drives a single sampler family directly against the real likelihood — bypassing `af.NonLinearSearch` — so the per-sampler convergence characteristics (wall time, likelihood evaluations, posterior ESS, evals/time to ML) can be compared on identical footing.
+This section profiles **first-class PyAutoFit search objects** end-to-end:
+`af.Nautilus` today, with the registry shape ready for `af.DynestyStatic`,
+`af.BlackJAXNUTS`, `af.Emcee`, etc. Unlike `likelihood_runtime/` (which
+profiles `analysis.log_likelihood_function` in isolation), every cell here
+runs `search.fit(model=model, analysis=analysis)` — so visualization,
+samples I/O, `samples_info.json`, latent variables, and every other piece
+of PyAutoFit machinery is exercised and measured.

-## Why bypass `af.NonLinearSearch`?
+## Design

-`af.NonLinearSearch` adds caching, multi-process forking, output formatting, and result hierarchies that are valuable for production fits but obscure the underlying sampler's cost. The scripts in this section call the sampler library directly and instrument every likelihood evaluation through a shared `MLTracker`. The result is a clean apples-to-apples comparison of:
+| Dimension      | Values                                                                    |
+|----------------|---------------------------------------------------------------------------|
+| Sampler        | `nautilus` (more to come via `_samplers.SAMPLER_BUILDERS`)                |
+| Dataset class  | `imaging`, `interferometer`, `point_source`, `datacube`                   |
+| Model type     | `mge`, `pixelization`, `delaunay`, `image_plane`, `source_plane`          |
+| Instrument     | per-dataset-class (HST/Euclid/JWST/AO; SMA/ALMA/ALMA-high/JVLA; simple)   |
+| Hardware       | `local_cpu`, `local_gpu`, `hpc_a100` (external dispatch)                  |
+| Precision      | `fp64`, `mp` (mixed precision via `al.Settings(use_mixed_precision=...)`) |

-- Wall time and likelihood-evaluation count to **Nautilus's default convergence** (`n_eff=10000`, `f_live=0.01`).
-- Per-evaluation likelihood cost (NumPy baseline vs JAX-JIT'd path).
-- Evals-to-ML and time-to-ML — the eval index and wall time at which the running max log L first came within 1 nat of the final maximum.
-
-## Shared helpers
-
-| File | Role |
-|------|------|
-| [`_setup.py`](./_setup.py) | Builds the HST imaging dataset, the MGE + Isothermal + ExternalShear lens model with an MGE source bulge, and the `AnalysisImaging` object. The dataset, mask, and model mirror the reference setup in [`likelihood/imaging/mge.py`](../likelihood/imaging/mge.py) so likelihood values are directly comparable across the two sections. |
-| [`_metrics.py`](./_metrics.py) | `MLTracker` — records the log-likelihood and wall time of every evaluation, computes evals-to-ML and time-to-ML headline numbers. Also offers `MLTracker.from_log_l_history` for samplers that JIT their likelihood and only expose log-L per dead/live point post hoc. |
-
-## Supported samplers
-
-| Sampler | Folder | Status | Notes |
-|---------|--------|--------|-------|
-| Nautilus | [`nautilus/`](./nautilus/README.md) | ✓ profiled | Both NumPy (`simple.py`) and JAX-JIT (`jax.py`) variants. |
-| Dynesty | _planned_ | not yet mirrored | Static nested sampling; reference scripts at `autolens_workspace_developer/searches_minimal/dynesty_simple.py`. |
-| Emcee | _planned_ | not yet mirrored | Affine-invariant ensemble MCMC. |
-| BlackJAX (NUTS, SMC) | _planned_ | not yet mirrored | Pure-JAX HMC family. Gradient pathology surfaced in upstream `sweep_findings.md`; HMC viability depends on first fixing NaN-gradient hot spots. |
-| NumPyro (ESS) | _planned_ | not yet mirrored | Ensemble slice sampler under JAX. |
-| PocoMC | _planned_ | not yet mirrored | Preconditioned Monte Carlo. |
-| NSS (simple, jit, grad) | _planned_ | not yet mirrored | Nested slice sampler; `nss_jit.py` shows VRAM ceiling on consumer GPUs (see `sweep_findings.md`). |
-| LBFGS | _planned_ | not yet mirrored | Not a sampler; serves as the maximum-likelihood reference point. |
-
-Each row above corresponds to one or more scripts under `autolens_workspace_developer/searches_minimal/`; the mirror migration here under their own follow-up prompts.
-
-## Versioned artifacts
-
-Each script writes a JSON + PNG pair to:
+Layout:

 ```
-results/searches//