fix OOB read in psf_precision_value_from causing NaN sparse-CPU log_evidence by Jammy2211 · Pull Request #296 · PyAutoLabs/PyAutoArray

Jammy2211 · 2026-05-01T16:29:51Z

Summary

psf_precision_value_from (used by the sparse-operator CPU inversion path) walks the PSF kernel over value_native (the noise map) without bounds-checking the read. @numba.jit() does not bounds-check by default, so for mask pixels within kernel_shift of the noise-map array boundary, the read returns uninitialized memory: arbitrarily large or non-finite values that poison the entire psf_precision_operator and the downstream curvature matrix.
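To make the failure mode concrete, here is a minimal pure-Python sketch that lists which kernel reads fall off the array for a given pixel. The helper name kernel_read_positions is hypothetical, and the index arithmetic assumes kernel_shift is the negative half-width of the kernel, consistent with the value_native[ip0_y + k0_y + kernel_shift_y, ...] indexing quoted in the commit message.

```python
def kernel_read_positions(ip_y, ip_x, kernel_shape, array_shape):
    """Split the kernel-walk read positions into in-bounds and off-array.

    Hypothetical helper for illustration; not part of PyAutoArray.
    Assumes kernel_shift = -(kernel_size // 2) along each axis.
    """
    k_y, k_x = kernel_shape
    n_y, n_x = array_shape
    shift_y, shift_x = -(k_y // 2), -(k_x // 2)

    in_bounds, off_array = [], []
    for k0_y in range(k_y):
        for k0_x in range(k_x):
            y = ip_y + k0_y + shift_y
            x = ip_x + k0_x + shift_x
            if 0 <= y < n_y and 0 <= x < n_x:
                in_bounds.append((y, x))
            else:
                off_array.append((y, x))
    return in_bounds, off_array


# Corner pixel of a 4x4 noise map with a 3x3 kernel.
inside, outside = kernel_read_positions(0, 0, (3, 3), (4, 4))
print(len(inside), len(outside))  # 4 5
```

For the corner pixel, five of the nine reads land outside the array, so an unguarded walk would touch memory that does not belong to the noise map.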

Symptom

On the HST 28×28 RectangularAdaptDensity model profiled in autolens_workspace_developer/jax_profiling/imaging/pixelization_sparse_cpu.py:

1064 inf entries in inversion.curvature_matrix.
log_det_curvature_reg_matrix_term = NaN → log_evidence = NaN.
NNLS solver returns all-zeros for s → chi_squared = 462622 (vs correct 22306) → log_likelihood = -191493 (vs correct +28664).
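A quick way to reproduce the first diagnostic on any matrix is to count its non-finite entries with NumPy. This is only a sketch: count_non_finite is a hypothetical helper, and the small matrix below is a toy stand-in, not the real inversion.curvature_matrix.

```python
import numpy as np


def count_non_finite(matrix):
    """Count inf and NaN entries in a matrix (hypothetical diagnostic helper)."""
    matrix = np.asarray(matrix, dtype=float)
    return {
        "inf": int(np.isinf(matrix).sum()),
        "nan": int(np.isnan(matrix).sum()),
    }


# Toy symmetric matrix with two poisoned off-diagonal entries.
curvature = np.eye(3)
curvature[0, 1] = curvature[1, 0] = np.inf

counts = count_non_finite(curvature)
print(counts)  # {'inf': 2, 'nan': 0}
```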

Why existing tests didn't catch it

test__psf_precision_operator_sparse_from (test_inversion_imaging_util.py:87) uses interior-only pixels [[1,1],[1,2],[2,1],[2,2]] in a 4×4 noise map — the bounds check is a no-op there.
Integration tests in test_factory.py use a 3×3 no-blur PSF whose only non-zero entry is the center — off-diagonal kernel reads contribute zero regardless of OOB.
The bug only fires when (a) the PSF has non-trivial off-center weight and (b) at least one mask pixel sits within kernel_shift of the noise-map array boundary. This is the standard HST configuration, but none of the test fixtures satisfy both conditions.
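The two trigger conditions can be written as a small predicate and checked against the fixtures described above. This is a sketch under assumptions: bug_can_fire is a hypothetical helper, the blurring kernel is an arbitrary illustrative choice, and the corner pixel coordinates are invented for the example.

```python
import numpy as np


def bug_can_fire(pixels, array_shape, kernel):
    """Return True when both trigger conditions from the PR description hold.

    Hypothetical helper for illustration; not part of PyAutoArray.
    """
    kernel = np.asarray(kernel, dtype=float)
    half_y, half_x = kernel.shape[0] // 2, kernel.shape[1] // 2

    # (a) the kernel has non-zero weight away from its central entry
    off_center = kernel.copy()
    off_center[half_y, half_x] = 0.0
    has_off_center_weight = bool(np.any(off_center != 0.0))

    # (b) at least one pixel's kernel footprint crosses the array edge
    n_y, n_x = array_shape
    near_edge = any(
        y - half_y < 0 or y + half_y >= n_y or x - half_x < 0 or x + half_x >= n_x
        for y, x in pixels
    )
    return has_off_center_weight and near_edge


no_blur = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
blur = np.array([[0, 1, 0], [1, 2, 1], [0, 1, 0]]) / 6.0  # illustrative PSF

interior = [(1, 1), (1, 2), (2, 1), (2, 2)]
corners = [(0, 0), (0, 3), (3, 0), (3, 3)]

print(bug_can_fire(interior, (4, 4), blur))    # False: interior-only pixels
print(bug_can_fire(corners, (4, 4), no_blur))  # False: center-only PSF
print(bug_can_fire(corners, (4, 4), blur))     # True: edge pixels + real blur
```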

Fix

Add an explicit bounds check around the value_native[…] read in psf_precision_value_from. Off-array positions are skipped, matching the existing if value > 0.0 filter for masked-but-zeroed interior pixels.
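The shape of the guarded kernel walk is sketched below in plain Python. This is not the library's code: psf_precision_value_sketch is a hypothetical, simplified stand-in, and the accumulated quantity (a product of two kernel weights divided by the squared noise value) is an assumption made only so the example produces a number. The part that mirrors the fix is the bounds check before the value_native read.

```python
import numpy as np


def psf_precision_value_sketch(value_native, kernel_native, ip0, ip1):
    """Simplified, hypothetical kernel walk with an explicit bounds check."""
    n_y, n_x = value_native.shape
    k_y, k_x = kernel_native.shape
    shift_y, shift_x = -(k_y // 2), -(k_x // 2)  # assumed sign convention
    off_y, off_x = ip0[0] - ip1[0], ip0[1] - ip1[1]

    total = 0.0
    for k0_y in range(k_y):
        for k0_x in range(k_x):
            y = ip0[0] + k0_y + shift_y
            x = ip0[1] + k0_x + shift_x

            # The fix: skip positions that fall off the noise-map array.
            if y < 0 or y >= n_y or x < 0 or x >= n_x:
                continue

            value = value_native[y, x]
            if value > 0.0:  # existing filter for zeroed pixels
                k1_y, k1_x = k0_y + off_y, k0_x + off_x
                if 0 <= k1_y < k_y and 0 <= k1_x < k_x:
                    # Assumed form of the contribution, for illustration only.
                    total += (
                        kernel_native[k0_y, k0_x]
                        * kernel_native[k1_y, k1_x]
                        / value**2
                    )
    return total


noise = np.ones((4, 4))
kernel = np.full((3, 3), 1.0 / 9.0)

corner_value = psf_precision_value_sketch(noise, kernel, (0, 0), (0, 0))
interior_value = psf_precision_value_sketch(noise, kernel, (1, 1), (1, 1))
print(np.isfinite(corner_value), corner_value < interior_value)  # True True
```

Skipping off-array positions gives the same result as reading a zero there, because the value > 0.0 filter would discard a zero anyway. That is why the check changes nothing for interior pixels.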

Verification

All 44 tests in test_autoarray/inversion/inversion/ pass.
New regression test test__psf_precision_operator_sparse_from__edge_pixels (corner pixels + non-trivial 3×3 PSF; reference checked against pure-numpy bounds-checked re-implementation) passes after fix; would have failed on main.
Existing test__psf_precision_operator_sparse_from continues to pass with byte-identical numbers (the bounds check is a no-op for interior pixels).
End-to-end on autolens_workspace_developer/jax_profiling/imaging/pixelization_sparse_cpu.py: sparse and non-sparse log_evidence now agree to rtol=1.18e-09 (26232.0685428 vs 26232.0685738); JAX expected: 26232.0685738.
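The quoted rtol follows directly from the two log_evidence values. The short check below assumes the first number in the parentheses is the sparse result and the second is the non-sparse (reference) result.

```python
import math

sparse_log_evidence = 26232.0685428      # assumed: sparse path
non_sparse_log_evidence = 26232.0685738  # assumed: non-sparse reference

rtol = abs(sparse_log_evidence - non_sparse_log_evidence) / abs(
    non_sparse_log_evidence
)
print(f"{rtol:.2e}")  # 1.18e-09

# The values agree at a tolerance just above that, and not just below it.
print(math.isclose(sparse_log_evidence, non_sparse_log_evidence, rel_tol=1.2e-9))  # True
print(math.isclose(sparse_log_evidence, non_sparse_log_evidence, rel_tol=1.0e-9))  # False
```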

Files Changed

autoarray/inversion/inversion/imaging_numba/inversion_imaging_numba_util.py — bounds-check the kernel read in psf_precision_value_from.
test_autoarray/inversion/inversion/imaging/test_inversion_imaging_util.py — add test__psf_precision_operator_sparse_from__edge_pixels.

Note for future maintainers

A quick read of the rest of inversion_imaging_numba_util.py shows no other unguarded kernel walks on value_native — every other function iterates on precomputed sparse-pair indices. So this single-function bounds-check is sufficient.

Test Plan

pytest test_autoarray/inversion/inversion/ — 44 passed
New test__psf_precision_operator_sparse_from__edge_pixels — passed
End-to-end pixelization_sparse_cpu.py — sparse path now matches non-sparse to rtol=1.18e-09

🤖 Generated with Claude Code

…vidence

The kernel walk in psf_precision_value_from (used by the sparse-operator CPU inversion path) reads value_native[ip0_y + k0_y + kernel_shift_y, ...] without bounds checking. For mask pixels within `kernel_shift` of the noise-map array boundary, that index lands off the array; @numba.jit() does not bounds-check, so the read returns uninitialized memory.

For the HST 28x28 RectangularAdaptDensity pixelization profiled in autolens_workspace_developer/jax_profiling/imaging/pixelization_sparse_cpu.py, this produced 1064 inf entries in the curvature matrix, a NaN log_det, and ultimately NaN log_evidence. log_likelihood was -191493 (vs the correct +28664) because the inf-poisoned F+H made the NNLS solver return all-zeros for s, giving a wildly wrong chi-squared.

The fix adds an explicit bounds check around the kernel read. Off-array positions are skipped, which matches the function's existing semantics for masked-but-zeroed interior pixels (`if value > 0.0: ...` already filters those). After the fix, sparse and non-sparse paths agree on the HST workspace_developer model to rtol=1.18e-09.

Existing tests are unaffected: the closest util test (test__psf_precision_operator_sparse_from) uses interior pixels in a 4x4 noise map so the bounds check is a no-op there. The integration tests in test_factory.py use a 3x3 no-blur PSF (only center=1) so off-diagonal kernel reads contribute zero regardless. The new test__psf_precision_operator_sparse_from__edge_pixels exercises the fixed path with corner pixels and a non-trivial 3x3 PSF, validating against a pure-numpy reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Jammy2211 added the pending-release label (PR queued for the next release build) May 1, 2026

Jammy2211 merged commit ae43ae1 into main May 1, 2026
4 checks passed

Jammy2211 deleted the feature/sparse-cpu-oob-fix branch May 1, 2026 16:31

Jammy2211 merged 1 commit into main from feature/sparse-cpu-oob-fix
