feat: first-class af.Nautilus search profiling + A100 submit #29
Merged
Replaces the raw nautilus.Sampler wrappers in searches/ with a first-class
af.Nautilus profile that exercises the full PyAutoFit lifecycle:
visualization, samples I/O, search.summary, latent variables.
- Sweep matrix: (sampler × dataset_class × model × instrument × hardware ×
precision). Sampler registry in _samplers.py is ready for Dynesty/Emcee/
BlackJAX additions as one-function changes.
- Per-model n_live matches the SLaM canonical phases (200 for mge /
point-source / parametric; 150 for pixelization / Delaunay / datacube).
- Datacube uses af.FactorGraphModel to combine N AnalysisInterferometer
factors, mirroring autolens_workspace/scripts/multi/modeling.py.
- _metrics.attach_viz_timer wraps every visualize-family hook so the JSON
splits total_wall_s into sampler_wall_s + viz_wall_s.
- force_pickle_overwrite=True + unique path_prefix per cell defeat the
.completed-file resume that would otherwise return cached results
across repeated sweep iterations.
- sweep.py: resume-by-default with --force override.
- aggregate.py: walks the 4-level (sampler/ds/model/instrument) tree and
emits comparison.{json,png} per cell.
- hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64: SLURM submit for the
HST MGE fp64 cell on A100, modeled on the existing likelihood-profiling
submits in z_projects/profiling/hpc/batch_gpu/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
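The wall-time split described for `_metrics.attach_viz_timer` can be sketched as below. This is a minimal illustration, not the code in `_metrics.py`: the hook names come from the PR summary, while `VizTimer`, `attach_viz_timer_sketch`, and `split_wall_time` are hypothetical names invented for this example. It assumes the hooks are ordinary methods on the analysis object.

```python
import functools
import time

# Hook names taken from the PR summary; everything else is hypothetical.
VIZ_HOOKS = (
    "visualize",
    "visualize_combined",
    "visualize_before_fit",
    "visualize_before_fit_combined",
)


class VizTimer:
    """Accumulates wall time spent inside wrapped visualization hooks."""

    def __init__(self):
        self.viz_wall_s = 0.0
        self._depth = 0  # guards against double counting nested hooks

    def wrap(self, fn):
        @functools.wraps(fn)
        def timed(*args, **kwargs):
            if self._depth > 0:
                # Already inside a timed hook: the outer call covers this time.
                return fn(*args, **kwargs)
            self._depth += 1
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                self.viz_wall_s += time.perf_counter() - start
                self._depth -= 1

        return timed


def attach_viz_timer_sketch(analysis):
    """Replace each visualize-family method on `analysis` with a timed wrapper."""
    timer = VizTimer()
    for name in VIZ_HOOKS:
        hook = getattr(analysis, name, None)
        if callable(hook):
            setattr(analysis, name, timer.wrap(hook))
    return timer


def split_wall_time(total_wall_s, viz_wall_s):
    """Sampler time is whatever part of the total was not spent in viz hooks."""
    return {
        "total_wall_s": total_wall_s,
        "viz_wall_s": viz_wall_s,
        "sampler_wall_s": total_wall_s - viz_wall_s,
    }
```

The depth counter matters if one hook calls another (for example a combined hook delegating to the per-analysis one); without it the inner call would be counted twice.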
This was referenced May 28, 2026
Jammy2211 added a commit that referenced this pull request May 28, 2026
…me (#32)

Discovered while debugging the 3.5× run-2 vs run-1 speedup on the A100 HST MGE submit (job 322549 = 3m43s vs 322548 = 11m40s). Run 2 didn't actually re-sample: it loaded the cached samples.csv + Nautilus pickle left by run 1 and reported the same total_samples=65500 with a meaningless time_per_eval_ms=2.82.

The resume gate is `.completed` (PyAutoFit/abstract_search.py:520-529), not `force_pickle_overwrite` as the previous comment claimed. `force_pickle_overwrite=True` only re-writes output pickles on an existing resume; it does not bypass the gate.

For production (SLaM-style chained phases) the resume default is correct. For profiling it produces phantom speedups whenever a prior run completed sampling, even one that crashed in post-fit, as the latent-crash in PR #29 showed.

- sweep.py: --keep-completed flag (default off). When off, removes output/searches/<sampler>/<ds>/<model>/<instrument>/<config>/ before each cell run, wiping .completed + Nautilus pickle + samples.csv.
- _samplers.py: correct the docstring claim about force_pickle_overwrite.
- README.md: rewrite the "force_pickle_overwrite defeats .completed" paragraph; document the sweep-level wipe as the actual mechanism.

The honest A100 number from run 1's actual sampling window is ~6.6 ms/eval (432 s for 65500 evals between Visualization warm-up complete and the first Fit Running update), not the 2.82 ms in run 2's JSON.

Co-authored-by: Jammy2211 <JNightingale2211@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
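The sweep-level wipe described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the code in `sweep.py`: the directory layout follows the `output/searches/<sampler>/<ds>/<model>/<instrument>/<config>/` pattern quoted above, and `cell_output_dir` and `prepare_cell` are hypothetical helper names.

```python
import shutil
from pathlib import Path


def cell_output_dir(output_root, sampler, dataset_class, model, instrument, config):
    """Build the per-cell output path following the layout quoted in the commit."""
    return (
        Path(output_root) / "searches" / sampler / dataset_class / model / instrument / config
    )


def prepare_cell(cell_dir, keep_completed=False):
    """Remove a cell's output directory unless the caller asked to keep it.

    Returns True if a directory was removed. Deleting the whole directory
    takes the `.completed` marker, the sampler pickle, and samples.csv with
    it, so the next fit has nothing to resume from.
    """
    cell_dir = Path(cell_dir)
    if keep_completed or not cell_dir.exists():
        return False
    shutil.rmtree(cell_dir)
    return True
```

Removing the directory rather than only the `.completed` file is the more conservative choice for profiling, because a leftover pickle or samples.csv could still shorten the next run.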
Summary
- Replaces the raw `nautilus.Sampler` wrappers in `searches/` with first-class `af.Nautilus`: every cell runs `search.fit(model, analysis)` end-to-end, so visualization, samples I/O, `search.summary`, and latent variables are profiled.
- Sweep matrix: `(sampler × dataset_class × model × instrument × hardware × precision)` with a sampler registry (`_samplers.py`) ready for Dynesty/Emcee/BlackJAX as one-function additions.
- `_metrics.attach_viz_timer` splits `total_wall_s` into `sampler_wall_s + viz_wall_s` by wrapping every analysis visualize hook (`visualize`, `visualize_combined`, `visualize_before_fit`, `visualize_before_fit_combined`) plus `search.plot_results`.
- Per-model `n_live` matches the SLaM canonical phases: 200 for MGE/parametric/point-source (`source_lp[1]`), 150 for pixelization/Delaunay (`source_pix[1]`).
- Datacube uses `af.FactorGraphModel` to combine N `AnalysisInterferometer` factors, mirroring `autolens_workspace/scripts/multi/modeling.py`. `_DATACUBE_N_CHANNELS=4` by default.
- `sweep.py` resumes by default (skips cells whose JSON exists); `--force` re-runs.
- `aggregate.py` walks the 4-level `(sampler/ds/model/instrument)` tree and emits `comparison.{json,png}`.
- `hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64`: first SLURM submit script in `autolens_profiling/`, modelled on the existing `z_projects/profiling/hpc/batch_gpu/` submits.
Profiling-specific design notes
- `number_of_cores=1` everywhere: measures per-eval end-to-end cost, not pool throughput.
- `force_x1_cpu=True` and `use_jax_vmap=True` on JAX rows: mandatory because `nautilus.Sampler` forking corrupts JAX state.
- `force_pickle_overwrite=True` + unique `path_prefix` per cell defeat the `.completed`-file resume that would silently return cached results across sweep iterations.
Out of scope
- `sync` tool for `autolens_profiling` (would mirror `z_projects/profiling/hpc/sync`).
- `number_of_cores > 1` pool-scaling sweep.
- […] `lensed_source.fits`).
Test plan
- […] `AUTOLENS_PROFILING_SMOKE=1`.
- `sweep.py --dry-run --only nautilus/imaging/mge --instrument hst` dispatches the matrix correctly.
- `aggregate.py` discovers cells + handles empty tree gracefully.
- `_samplers`, `_metrics`, `_setup`, `_runner` all clean.
- `hpc/batch_gpu/submit_imaging_mge_a100_hst_fp64` (queued after merge).
🤖 Generated with Claude Code
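The resume-by-default behaviour of `sweep.py` noted in the summary (skip cells whose JSON exists, re-run with `--force`) might look roughly like the sketch below. Only the `--force` and `--dry-run` flag names come from the PR text. The `result.json` file name, the `results_root` layout, and the helpers `result_path`, `cells_to_run`, and `build_parser` are assumptions made for this example.

```python
import argparse
from pathlib import Path


def result_path(results_root, cell):
    """Hypothetical location of a cell's result JSON; `cell` is a slash-joined id."""
    return Path(results_root) / cell / "result.json"


def cells_to_run(cells, results_root, force=False):
    """Return the cells that still need running.

    With force=False (the default), a cell is skipped when its result JSON
    already exists. With force=True every cell is returned.
    """
    if force:
        return list(cells)
    return [c for c in cells if not result_path(results_root, c).exists()]


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a resumable sweep driver")
    parser.add_argument(
        "--force", action="store_true", help="re-run cells even if a result JSON exists"
    )
    parser.add_argument(
        "--dry-run", action="store_true", help="list the cells that would run, then exit"
    )
    return parser


def main(argv, cells, results_root):
    args = build_parser().parse_args(argv)
    todo = cells_to_run(cells, results_root, force=args.force)
    if args.dry_run:
        for cell in todo:
            print(cell)
    return todo
```

Note that this JSON-level skip is separate from the `.completed` resume inside the search itself: the first decides whether a cell is dispatched at all, the second decides whether a dispatched fit re-samples.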