feat: first-class af.Nautilus search profiling + A100 submit by Jammy2211 · Pull Request #29 · PyAutoLabs/autolens_profiling

Jammy2211 · 2026-05-28T09:45:34Z

Summary

Replaces raw nautilus.Sampler wrappers in searches/ with first-class af.Nautilus — every cell runs search.fit(model, analysis) end-to-end, so visualization, samples I/O, search.summary, and latent variables are profiled.
Sweep matrix: (sampler × dataset_class × model × instrument × hardware × precision) with a sampler registry (_samplers.py) ready for Dynesty/Emcee/BlackJAX as one-function additions.
_metrics.attach_viz_timer splits total_wall_s into sampler_wall_s + viz_wall_s by wrapping every analysis visualize hook (visualize, visualize_combined, visualize_before_fit, visualize_before_fit_combined) plus search.plot_results.
n_live matches the SLaM canonical phases — 200 for MGE/parametric/point-source (source_lp[1]), 150 for pixelization/Delaunay (source_pix[1]).
Datacube uses af.FactorGraphModel to combine N AnalysisInterferometer factors — mirrors autolens_workspace/scripts/multi/modeling.py. _DATACUBE_N_CHANNELS=4 by default.
sweep.py resumes by default (skips cells whose JSON exists); --force re-runs.
aggregate.py walks the 4-level (sampler/ds/model/instrument) tree and emits comparison.{json,png}.
hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64 — first SLURM submit script in autolens_profiling/, modelled on the existing z_projects/profiling/hpc/batch_gpu/ submits.
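
The sampler/visualization split described above can be illustrated with a small wrapper. This is a minimal sketch, not the actual _metrics.attach_viz_timer: only the four hook names come from this PR, while VizTimer, attach_viz_timer_sketch and split_wall_time are hypothetical names, and sampler_wall_s is assumed to be the remainder of the total after subtracting visualization time.

```python
import time
from functools import wraps

# Hook names are taken from the PR description; everything else is illustrative.
VIZ_HOOKS = (
    "visualize",
    "visualize_combined",
    "visualize_before_fit",
    "visualize_before_fit_combined",
)


class VizTimer:
    """Accumulates wall time spent inside wrapped visualization hooks."""

    def __init__(self):
        self.viz_wall_s = 0.0

    def wrap(self, func):
        @wraps(func)
        def timed(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Count the time even if the hook raises.
                self.viz_wall_s += time.perf_counter() - start

        return timed


def attach_viz_timer_sketch(analysis):
    """Wrap every visualize-family hook present on `analysis` (hypothetical helper)."""
    timer = VizTimer()
    for name in VIZ_HOOKS:
        hook = getattr(analysis, name, None)
        if callable(hook):
            # Shadow the bound method with a timed version on this instance only.
            setattr(analysis, name, timer.wrap(hook))
    return timer


def split_wall_time(total_wall_s, viz_wall_s):
    """Assumed split: sampler time is whatever the visualize hooks did not use."""
    return {
        "total_wall_s": total_wall_s,
        "viz_wall_s": viz_wall_s,
        "sampler_wall_s": total_wall_s - viz_wall_s,
    }
```

One caveat of this simple form: if one wrapped hook calls another wrapped hook, the inner time is counted twice, so a real implementation may need a re-entrancy guard.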

Profiling-specific design notes

number_of_cores=1 everywhere — measures per-eval end-to-end cost, not pool throughput.
force_x1_cpu=True and use_jax_vmap=True on JAX rows — mandatory because nautilus.Sampler forking corrupts JAX state.
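force_pickle_overwrite=True + unique path_prefix per cell were intended to defeat the .completed-file resume that would silently return cached results across sweep iterations; #32 later showed that force_pickle_overwrite does not bypass the .completed gate, so sweep.py now wipes each cell's search state instead.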
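
To make the "one cell per matrix entry, one path per cell" idea concrete, here is a minimal sketch of enumerating the sweep matrix and deriving a unique output path for each cell. The six axis names come from the Summary; the Cell class, enumerate_cells, the `hardware_precision` config naming and the example axis values `cpu` and `fp32` are assumptions for illustration only.

```python
from dataclasses import dataclass
from itertools import product
from pathlib import PurePosixPath

AXES = ("sampler", "dataset_class", "model", "instrument", "hardware", "precision")


@dataclass(frozen=True)
class Cell:
    sampler: str
    dataset_class: str
    model: str
    instrument: str
    hardware: str
    precision: str

    @property
    def path_prefix(self):
        # Four-level tree: sampler/dataset_class/model/instrument.
        return PurePosixPath(
            "searches", self.sampler, self.dataset_class, self.model, self.instrument
        )

    @property
    def config_name(self):
        # Assumed naming; the PR does not state how the config directory is named.
        return f"{self.hardware}_{self.precision}"


def enumerate_cells(axes):
    """Return one Cell per combination of the six sweep axes."""
    return [Cell(*combo) for combo in product(*(axes[name] for name in AXES))]


axes = {
    "sampler": ["nautilus"],
    "dataset_class": ["imaging"],
    "model": ["mge"],
    "instrument": ["hst"],
    "hardware": ["a100", "cpu"],    # "cpu" is an assumed value
    "precision": ["fp64", "fp32"],  # "fp32" is an assumed value
}
for cell in enumerate_cells(axes):
    print(cell.path_prefix / cell.config_name)
```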

Out of scope

HPC sync tool for autolens_profiling (would mirror z_projects/profiling/hpc/sync).
number_of_cores > 1 pool-scaling sweep.
Production-realistic adapt-image regeneration mid-search (uses truth-derived lensed_source.fits).

Test plan

All 9 leaf scripts pass AUTOLENS_PROFILING_SMOKE=1.
sweep.py --dry-run --only nautilus/imaging/mge --instrument hst dispatches the matrix correctly.
aggregate.py discovers cells + handles empty tree gracefully.
Module imports of _samplers, _metrics, _setup, _runner all clean.
Real A100 run via hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64 (queued after merge).
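
For readers unfamiliar with the sweep flags exercised in the test plan, the following is a minimal sketch of how resume-by-default, --force, --dry-run, --only and --instrument could interact. The flag names appear in this PR; their exact semantics, the per-cell JSON location, and the helper names build_parser, select_cells and run_sweep are assumptions rather than the real sweep.py.

```python
import argparse
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a resumable sweep CLI")
    parser.add_argument("--force", action="store_true",
                        help="re-run cells whose result JSON already exists")
    parser.add_argument("--dry-run", action="store_true",
                        help="select cells without running them")
    parser.add_argument("--only", default=None,
                        help="restrict to a sampler/dataset_class/model prefix")
    parser.add_argument("--instrument", default=None,
                        help="restrict to one instrument")
    return parser


def select_cells(cells, results_dir, args):
    """Filter 'sampler/ds/model/instrument' strings down to the cells to run."""
    selected = []
    for cell in cells:
        if args.only and not cell.startswith(args.only + "/"):
            continue
        if args.instrument and cell.rsplit("/", 1)[-1] != args.instrument:
            continue
        # Assumed layout: one JSON per cell, mirroring the cell path.
        result_json = Path(results_dir) / f"{cell}.json"
        if result_json.exists() and not args.force:
            continue  # resume by default: skip finished cells
        selected.append(cell)
    return selected


def run_sweep(cells, results_dir, argv, run_cell):
    """Parse flags, pick cells, and run them unless --dry-run was given."""
    args = build_parser().parse_args(argv)
    todo = select_cells(cells, results_dir, args)
    if not args.dry_run:
        for cell in todo:
            run_cell(cell)
    return todo
```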

🤖 Generated with Claude Code

Replaces the raw nautilus.Sampler wrappers in searches/ with a first-class af.Nautilus profile that exercises the full PyAutoFit lifecycle: visualization, samples I/O, search.summary, latent variables.

- Sweep matrix: (sampler × dataset_class × model × instrument × hardware × precision). Sampler registry in _samplers.py is ready for Dynesty/Emcee/BlackJAX additions as one-function changes.
- Per-model n_live matches the SLaM canonical phases (200 for mge / point-source / parametric; 150 for pixelization / Delaunay / datacube).
- Datacube uses af.FactorGraphModel to combine N AnalysisInterferometer factors, mirroring autolens_workspace/scripts/multi/modeling.py.
- _metrics.attach_viz_timer wraps every visualize-family hook so the JSON splits total_wall_s into sampler_wall_s + viz_wall_s.
- force_pickle_overwrite=True + unique path_prefix per cell defeat the .completed-file resume that would otherwise return cached results across repeated sweep iterations.
- sweep.py: resume-by-default with --force override.
- aggregate.py: walks the 4-level (sampler/ds/model/instrument) tree and emits comparison.{json,png} per cell.
- hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64: SLURM submit for the HST MGE fp64 cell on A100, modeled on the existing likelihood-profiling submits in z_projects/profiling/hpc/batch_gpu/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…me (#32)

Discovered while debugging the 3.5× run-2 vs run-1 speedup on the A100 HST MGE submit (job 322549 = 3m43s vs 322548 = 11m40s). Run 2 didn't actually re-sample: it loaded the cached samples.csv + Nautilus pickle left by run 1 and reported the same total_samples=65500 with a meaningless time_per_eval_ms=2.82.

The resume gate is `.completed` (PyAutoFit/abstract_search.py:520-529), not `force_pickle_overwrite` as the previous comment claimed. `force_pickle_overwrite=True` only re-writes output pickles on an existing resume; it does not bypass the gate.

For production (SLaM-style chained phases) the resume default is correct. For profiling it produces phantom speedups whenever a prior run completed sampling, even one that crashed in post-fit, as the latent-crash in PR #29 showed.

- sweep.py: --keep-completed flag (default off). When off, removes output/searches/<sampler>/<ds>/<model>/<instrument>/<config>/ before each cell run, wiping .completed + Nautilus pickle + samples.csv.
- _samplers.py: correct the docstring claim about force_pickle_overwrite.
- README.md: rewrite the "force_pickle_overwrite defeats .completed" paragraph; document the sweep-level wipe as the actual mechanism.

The honest A100 number from run 1's actual sampling window is ~6.6 ms/eval (432 s for 65500 evals between Visualization warm-up complete and the first Fit Running update), not the 2.82 ms in run 2's JSON.

Co-authored-by: Jammy2211 <JNightingale2211@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
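
The sweep-level wipe described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in sweep.py: the directory layout follows the path quoted in the commit message, while the function name wipe_cell_state and its signature are hypothetical.

```python
import shutil
from pathlib import Path


def wipe_cell_state(output_root, sampler, ds, model, instrument, config,
                    keep_completed=False):
    """Delete one cell's search output so the next fit cannot resume from it.

    Returns True if a directory was removed, False otherwise.
    """
    cell_dir = (
        Path(output_root) / "searches" / sampler / ds / model / instrument / config
    )
    if keep_completed or not cell_dir.exists():
        # Either the caller asked to keep prior state, or there is nothing to wipe.
        return False
    # Removes the .completed marker, the sampler pickle and samples.csv in one go.
    shutil.rmtree(cell_dir)
    return True
```

Deleting the whole cell directory, rather than only the `.completed` marker, avoids relying on which individual files the resume logic reads.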

Jammy2211 merged commit e5b1118 into main May 28, 2026
1 check failed

Jammy2211 deleted the feature/first-class-search-profiling branch May 28, 2026 09:47

This was referenced May 28, 2026

fix: skip latents on A100 search submits #30

Merged

fix: sweep.py wipes search state by default to defeat .completed resume #32

Merged

Jammy2211 mentioned this pull request May 28, 2026

fix: use result.max_log_likelihood_instance for best_fit summary #34

Merged

Jammy2211 merged 1 commit into main from feature/first-class-search-profiling
