Add GPU implementation of preconditioner - v2 #31

LucasBoTang · 2025-11-15T09:11:43Z

Summary

This PR introduces a CUDA-based implementation of the preconditioner module, including Ruiz, Pock–Chambolle, and objective-bound scaling.

Main Changes

Replaced preconditioner.c with preconditioner.cu
Modified initialize_solver_state in solver.cu for GPU preconditioner integration

Implementation Details

The matrix is stored in CSR format, with an additional row ID array (one row index per nonzero) so that row-wise scaling (A[i,j] *= E[i]) can read the row index directly instead of searching the row pointers.
Added an auxiliary array recording the mapping of each A element to its corresponding position in Aᵀ, enabling synchronized scaling of A and Aᵀ without atomics or additional CSR/CSC conversions.
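
For illustration, the two auxiliary arrays described above could be built and used roughly as follows. This is a host-side C sketch of the indexing logic only, not the CUDA code in this PR: the function names (build_row_ids, build_transpose_map, scale_rows_synced) and array names (row_id, a_to_at, t_row_ptr, t_col_ind, t_val) are hypothetical, and it assumes zero-based CSR with 32-bit indices. In a kernel, each iteration of the final loop would plausibly map to one thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Expand CSR row pointers into one row index per nonzero (hypothetical helper). */
void build_row_ids(int num_rows, const int *row_ptr, int *row_id)
{
    for (int i = 0; i < num_rows; ++i)
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            row_id[k] = i;
}

/* Build the CSR pattern of A^T by counting sort and record, for each nonzero k
 * of A, its slot a_to_at[k] in the value array of A^T. Returns 0 on success. */
int build_transpose_map(int num_rows, int num_cols,
                        const int *row_ptr, const int *col_ind,
                        int *t_row_ptr, int *t_col_ind, int *a_to_at)
{
    int nnz = row_ptr[num_rows];
    int *next = malloc((size_t)(num_cols > 0 ? num_cols : 1) * sizeof *next);
    if (next == NULL)
        return -1;

    /* Count the nonzeros in each column of A, i.e. each row of A^T. */
    for (int j = 0; j <= num_cols; ++j)
        t_row_ptr[j] = 0;
    for (int k = 0; k < nnz; ++k) {
        assert(col_ind[k] >= 0 && col_ind[k] < num_cols);
        t_row_ptr[col_ind[k] + 1]++;
    }

    /* A prefix sum turns the counts into row offsets of A^T. */
    for (int j = 0; j < num_cols; ++j) {
        t_row_ptr[j + 1] += t_row_ptr[j];
        next[j] = t_row_ptr[j];
    }

    /* Walk A row by row so each row of A^T ends up sorted by column index. */
    for (int i = 0; i < num_rows; ++i) {
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            int pos = next[col_ind[k]]++;
            t_col_ind[pos] = i;
            a_to_at[k] = pos;
        }
    }

    free(next);
    return 0;
}

/* Apply A[i,j] *= E[i] to both copies. a_to_at is a permutation, so every
 * slot of t_val is written exactly once and no atomics are needed. */
void scale_rows_synced(int nnz, const int *row_id, const int *a_to_at,
                       const double *E, double *val, double *t_val)
{
    for (int k = 0; k < nnz; ++k) {
        double s = E[row_id[k]];
        val[k] *= s;
        t_val[a_to_at[k]] *= s;
    }
}
```

Because the map is computed once from the sparsity pattern, every later scaling pass can update A and Aᵀ together without rebuilding the transpose.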

Next Step

Benchmark GPU vs CPU preconditioner runtime before merging

Note

Currently, reduce_bound_norm_sq_atomic relies on atomicAdd(double*) for the bound-norm reduction, which requires compute capability 6.0 or higher (CMAKE_CUDA_ARCHITECTURES ≥ 60).

Would it be preferable to:

Switch to a portable single-block shared-memory reduction (no atomics), or
Redesign the reduction kernel, or
Keep the current implementation and require sm_60+?

LucasBoTang · 2025-11-16T02:46:14Z

This update fixes two issues in the GPU preconditioner:

Corrected objective/bound rescaling: The previous GPU code applied the wrong scaling to the bounds and objective, which led to very long PDHG runs. All scaling is now applied correctly on the GPU.
Improved reduce_bound_norm_sq_kernel: Replaced atomic accumulation with a shared-memory block reduction. This removes the atomic overhead and makes the result consistent and fast.
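
As a rough illustration of the reduction pattern, here is a sequential C emulation of a single-block tree reduction over squared values. It is a sketch under assumptions, not the kernel from this PR: the name sum_sq_block_reduce and the BLOCK_SIZE of 256 are hypothetical, and any special handling of infinite bounds is left out. In a CUDA kernel the partial array would live in __shared__ memory and each pass of the second loop would be separated by __syncthreads().

```c
#include <assert.h>

#define BLOCK_SIZE 256 /* emulated threads per block; must be a power of two */

/* Sum of squares of x[0..n-1], computed in the order a block reduction would use. */
double sum_sq_block_reduce(const double *x, int n)
{
    double partial[BLOCK_SIZE]; /* stands in for the __shared__ buffer */
    assert(n >= 0);

    /* Phase 1: "thread" t accumulates a strided partial sum of squares. */
    for (int t = 0; t < BLOCK_SIZE; ++t) {
        double acc = 0.0;
        for (int i = t; i < n; i += BLOCK_SIZE)
            acc += x[i] * x[i];
        partial[t] = acc;
    }

    /* Phase 2: tree reduction. Each pass of the outer loop halves the number
     * of active slots and corresponds to one barrier-separated kernel step. */
    for (int stride = BLOCK_SIZE / 2; stride > 0; stride /= 2)
        for (int t = 0; t < stride; ++t)
            partial[t] += partial[t + stride];

    return partial[0];
}
```

The additions always happen in the same order, so the floating-point result is repeatable from run to run. With atomicAdd the accumulation order depends on thread scheduling and the rounding can differ between runs.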

LucasBoTang added 3 commits November 14, 2025 18:33

New feat: GPU precondition

2a31c99

New feat: synchronized A/At scaling

5a993cd

Todo: infeasible is stucked

c2c7252

LucasBoTang requested review from ZedongPeng and jinwen-yang November 15, 2025 09:11

LucasBoTang added 2 commits November 15, 2025 16:45

New feat: preconditioner prints

63b2526

Bug fixed: objective & bound rescale

3f17100

LucasBoTang added 2 commits November 15, 2025 21:46

Merge branch 'main' into preconditioner-v2

e2149c3

New feat: record precondition time

1989b6b
