Skip to content

Ggm mixed#79

Open
MaartenMarsman wants to merge 59 commits intomainfrom
ggm_mixed
Open

Ggm mixed#79
MaartenMarsman wants to merge 59 commits intomainfrom
ggm_mixed

Conversation

@MaartenMarsman
Copy link
Collaborator

This is the mixed MRF.

- Add Blume-Capel variable support across all phases (A-G)
- Add pseudolikelihood consistency note (mixedGM vs bgms verified)
- Add PR and commit strategy (§16, 17 planned commits)
- Fix test numbering: unit T1-T12, integration T13-T19,
  edge-case T20-T27, regression R1-R3
- Split §3.3 edge-selection table by PL mode
- Fix §3.1/3.2 parameter label (threshold → main effect)
- Add BC note to §2.1 conditional OMRF formula
- Cross-reference phase checkpoints with T-numbers
- Remove legacy workflow section (superseded by commit strategy)
- Add reuse inventory entries for variable_helpers BC functions
- Add BC items to risk register
- Add ::: bindings for internal functions in test-bgm-spec.R
- Exclude .lintr and .editorconfig from tarball via .Rbuildignore
Implement log_conditional_omrf(s) and log_conditional_ggm() in
mixed_mrf_likelihoods.cpp. OMRF branches on ordinal vs Blume-Capel
using compute_denom_ordinal/blume_capel from variable_helpers.
GGM uses cached Kyy decomposition and conditional mean.

14 test assertions validated against pure-R reference implementations.
Cross-validation against mixedGM deferred to B.5 estimation tests.
Six scalar MH update functions in mixed_mrf_metropolis.cpp:
- update_main_effect: ordinal thresholds and BC alpha/beta
- update_continuous_mean: muy with conditional_mean_ save/restore
- update_Kxx: symmetric discrete interactions
- update_Kyy_offdiag: permute-based Cholesky (ported from mixedGM)
- update_Kyy_diag: log-scale Cholesky with Jacobian
- update_Kxy: cross interactions with conditional_mean_ save/restore

Static helpers: log_beta_prior, permutation builders,
get_constants/constrained_diagonal for Cholesky block updates.

do_one_metropolis_step wired as 5-step sweep:
main effects -> muy -> Kxx -> Kyy -> Kxy, edge-gated.

Robbins-Monro adaptation deferred to Phase F.
- Fix get_constants: include diagonal L(q-1,q-1)² in c3
  (was missing, causing wrong constrained diagonal)
- Add test_mixed_mrf_sampler() Rcpp export for recovery tests
- Add test-mixed-mrf-sampling.R (conditional PL recovery test)
- Add Phase B+ (10 sub-steps) for replacing O(q³) permute-based
  Kyy updates with GGM-style rank-1 Cholesky infrastructure
- Add B+.10 rank-2 quadratic-form shortcut (O(nq) vs O(npq+nq²))
- Design decision: keep Cholesky over Woodbury-only for PD safety
- Add tests T28-T30, commits 6a-6d to planned schedule
- Update risk register, reuse inventory, and table of contents
Add log_marginal_omrf(s) using Theta = Kxx + 2 Kxy Kyy^{-1} Kxy'.
Dispatch marginal likelihood in all six Metropolis update functions:
  - update_main_effect: marginal vs conditional per-node
  - update_continuous_mean: add sum_s marginal OMRF terms
  - update_Kxx: recompute Theta before/after proposal
  - update_Kxy: all p marginal OMRF nodes + save/restore Theta
  - update_Kyy_offdiag: temporarily install proposed Kyy for Theta
  - update_Kyy_diag: same pattern as offdiag
- Test helper conditionally computes marg_omrf_ll in marginal mode
- R reference ref_log_marginal_omrf validated against C++ (ordinal + BC)
- Zero-param equality: marginal == conditional when Kxy = 0
- Divergence test: marginal != conditional with nonzero Kxy
- Simulation recovery: bgms vs mixedGM (r >= 0.90, p=3, q=2, n=1000)
- Wenchuan empirical: bgms vs mixedGM (r >= 0.90, p=4, q=2)
Implement spike-and-slab reversible-jump MH for all three edge types:
- update_edge_indicator_Kxx: discrete-discrete birth/death
- update_edge_indicator_Kyy: Cholesky-reparameterized continuous-continuous
- update_edge_indicator_Kxy: cross-type birth/death

Add update_edge_indicators() dispatcher with shuffled edge ordering,
initialize_graph() for prior-based initial graph draw, and Step 6 in
do_one_metropolis_step(). Test helper updated to activate edge selection
after warmup and return indicator samples.

Bug fix: Kyy birth proposal uses C[3]*epsilon (scale only), matching
the GGM model, not C[2]+C[3]*epsilon (which mismatched the death
reversal density).

Plan updated: Phase D marked complete, D.7 added for pairwise prior
standardization (OMRF-style scaling factors for Kxx edges).
R-side infrastructure for MixedMRFModel:
- bgm_spec: auto-detect mixed data, build_spec_mixed_mrf(), split
  discrete/continuous, validate mixed variable types
- run_sampler: run_sampler_mixed_mrf() dispatches to sample_mixed_mrf()
- build_output: build_output_mixed_mrf() maps flat C++ vector back to
  (p+q)×(p+q) matrices in original column order
- build_arguments: build_arguments_mixed_mrf()
- bgm(): add pseudolikelihood parameter
- validate_variable_types: allow_mixed parameter

C++ entry point:
- sample_mixed.cpp: Rcpp wrapper calling MixedMRFModel

Tests:
- 8 end-to-end tests in test-bgm.R (reproducibility, dimensions,
  symmetry, naming, precision positivity, marginal PL)
- Remove test_mixed_mrf.cpp scaffolding and all dependent test files
  (cholesky, skeleton, likelihoods, marginal, sampling, edge-selection)
- Remove friend declarations from mixed_mrf_model.h

3965 tests pass in 25.6s, 0 failures.
@codecov
Copy link

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 90.47806% with 243 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.84%. Comparing base (a7d2aad) to head (7d67b9b).

Files with missing lines Patch % Lines
src/models/mixed/mixed_mrf_metropolis.cpp 76.04% 97 Missing ⚠️
src/models/mixed/mixed_mrf_model.cpp 90.66% 41 Missing ⚠️
src/models/mixed/mixed_mrf_gradient.cpp 89.37% 29 Missing ⚠️
R/simulate_predict.R 93.54% 25 Missing ⚠️
R/bgm_spec.R 92.41% 16 Missing ⚠️
src/models/mixed/mixed_mrf_likelihoods.cpp 84.90% 8 Missing ⚠️
R/extractor_functions.R 75.00% 7 Missing ⚠️
R/validate_model.R 78.78% 7 Missing ⚠️
R/bgms-methods.R 86.20% 4 Missing ⚠️
src/models/ggm/ggm_model.cpp 92.00% 2 Missing ⚠️
... and 5 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #79      +/-   ##
==========================================
+ Coverage   89.59%   89.84%   +0.25%     
==========================================
  Files          57       65       +8     
  Lines        7936    10322    +2386     
==========================================
+ Hits         7110     9274    +2164     
- Misses        826     1048     +222     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Wire iteration index through all 6 MH update functions and add
per-parameter Robbins-Monro adaptation (weight = iter^{-0.75},
target = 0.44, clamp [0.001, 2.0]) during warmup:
- update_main_effect, update_continuous_mean, update_Kxx
- update_Kyy_offdiag, update_Kyy_diag, update_Kxy

init_metropolis_adaptation stores total_warmup from WarmupSchedule.
No changes needed to WarmupSchedule or ChainRunner — the existing
Metropolis-only pipeline already handles MH models correctly.

Also: add @param pseudolikelihood to bgm() roxygen (fixes codoc
mismatch from Phase E).

3965 tests pass, 0 failures.
MixedMRFModel::do_one_metropolis_step() called update_edge_indicators()
at step 6, but ChainRunner already calls it after each sampler step.
This caused indicator proposals to happen twice per iteration when edge
selection was active. Remove the call from do_one_metropolis_step() to
match the OMRF pattern where ChainRunner is the sole owner of
edge-indicator scheduling.
- Diagonal factor: use 4 (not 2) for dTheta_ss/dKxy_sj
- Missing bias term: add 2*mu_yj * sum(x_s - E_s) contribution
- Missing cross-contributions: variable s's conditional contributes
  to Kxy_t gradient for t != s
- Add test_gradient.cpp (Rcpp export) and numerical gradient test
  script confirming all 4 test cases pass (max error ~1.5e-7)
- HybridNUTSSampler: NUTS for (mux, Kxx, muy, Kxy), MH for Kyy
- Wire sampler selection: R 'nuts' remapped to 'hybrid-nuts' for mixed
- Pass sampler config (type, target_accept, tree depth, leapfrogs)
  from R through to C++ SamplerConfig
- Update chain_runner factory and learn_sd flag for hybrid-nuts
…rflow guards

- Delete test_gradient.cpp and its Rcpp export (misguided test infrastructure)
- Wire treedepth/divergent/energy diagnostics into mixed MRF output
- Recognize 'hybrid-nuts' sampler type for NUTS diagnostic collection
- Pass nuts_max_depth through to mixed MRF arguments
- Fix ordinal gradient bound: use max(main_param) + C_s*rest, clamped at 0
- Fix BC gradient: use uninitialized bc_bound (computed inside helper)
- Tighten fast-block overflow threshold by max(|main_param|) in both
  compute_logZ_and_probs_ordinal and compute_logZ_and_probs_blume_capel
- Change mixed MRF progress interval from 50 to 1
- Phase H: stochastic block model edge prior (future)
- Phase I: missing data imputation (future)
- Phase J: performance profiling
- Phase K: code deduplication audit
- Phase L: model output R code inspection
…guards

- Move get_bgms_fixtures() (16 entries), get_bgmcompare_fixtures() (13),
  and get_extractor_fixtures() (9) into helper-fixtures.R as single
  source of truth.
- Remove duplicate local definitions from test-methods.R,
  test-simulate-predict-regression.R, and test-extractor-functions.R.
- Add mixed-mrf skip guards to ordinal simulate/predict loops and
  dedicated mixed-mrf simulate and predict test blocks.
- Add fixture-coverage guard tests (required labels + counts).
- Delete dead-weight test-scaffolding-fixtures.R.
…vent segfault

Removed std::thread wrapper around parallelFor that caused segfaults on
Linux and Windows CI. TBB expects parallelFor on the calling thread;
spawning it from std::thread corrupted TBB scheduler state.

Now follows the same pattern as MCMCChainRunner: parallelFor called
directly from main thread, workers use begin==0 guard so only the
main-thread chunk calls pm.update(0) (R API). Other chunks check
pm.shouldExit() for cooperative cancellation.
@MaartenMarsman MaartenMarsman marked this pull request as ready for review March 10, 2026 06:59
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds mixed Markov Random Field (MRF) support spanning discrete + continuous variables, including sampling, prediction, and updated extractor APIs/documentation.

Changes:

  • Introduces MixedMRFModel + mixed sampler entry points (including a hybrid NUTS sampler path) and mixed conditional prediction.
  • Renames threshold extraction to extract_main_effects() (deprecating extract_category_thresholds()), and updates tests/docs accordingly.
  • Refactors parameter storage/adaptation plumbing (storage dimension vs NUTS dimension) and improves numerical stability/documentation in C++ utilities.

Reviewed changes

Copilot reviewed 89 out of 154 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
tests/testthat/test-extractor-functions.R Switches to shared fixture helper; adds extract_main_effects() tests and deprecation-warning assertions.
tests/testthat/test-build-arguments.R Updates build_arguments() expectations to include model_type.
tests/testthat/test-bgm.R Adjusts GGM expectations (quadratic diagonal) and adds mixed MRF end-to-end tests.
tests/testthat/test-bgm-spec.R Tests internal bgm spec helpers via bgms::: bindings.
src/utils/variable_helpers.h Converts key numerical helper comments to Doxygen-style docs.
src/utils/variable_helpers.cpp Tightens FAST/SAFE thresholds to avoid intermediate overflow in probability computations.
src/utils/print_mutex.h Replaces ad-hoc comments with a Doxygen doc block for print mutex usage.
src/utils/common_helpers.h Documents enums + string-to-enum converters.
src/sample_omrf.cpp Replaces Doxygen blocks with .cpp-appropriate comments.
src/sample_mixed.cpp Adds Rcpp entry point for mixed MRF sampling using unified MCMC runner.
src/priors/sbm_edge_prior.cpp Converts large Doxygen comments to .cpp comments.
src/mrf_prediction.cpp Adds mixed conditional prediction entry point.
src/models/omrf/omrf_model.h Annotates many model fields with Doxygen ///< comments.
src/models/omrf/omrf_model.cpp Uses macro-based exp to reduce overflow risk in log_beta_prior.
src/models/mixed/mixed_mrf_model.h Adds new MixedMRFModel (discrete + continuous) and sampling interfaces.
src/models/mixed/mixed_mrf_likelihoods.cpp Implements mixed model pseudo-likelihood components.
src/models/ggm/ggm_model.h Removes local helper decls and uses shared Cholesky helper header.
src/models/ggm/ggm_model.cpp Switches to shared Cholesky helpers + explog macros for stability/consistency.
src/models/base_model.h Introduces storage_dimension() and get_storage_vectorized_parameters() to separate storage vs NUTS dims.
src/mcmc/samplers/metropolis_adaptation.h Adds field documentation.
src/mcmc/samplers/hybrid_nuts_sampler.h Adds hybrid NUTS sampler (NUTS block + MH for Kyy) for mixed models.
src/mcmc/samplers/hmc_adaptation.h Documents adaptation accumulator fields.
src/mcmc/execution/sampler_config.h Converts config comments to /// doc comments.
src/mcmc/execution/chain_runner.cpp Adds hybrid-nuts sampler, stores storage-dimension samples, reserves using storage dim.
src/mcmc/execution/chain_result.h Documents chain result fields.
src/mcmc/algorithms/nuts.cpp Converts Doxygen blocks to .cpp comments.
src/math/cholupdate.cpp Updates include path to new shared math/cholupdate.h.
src/math/cholesky_helpers.h Adds shared Cholesky helper utilities (log-det, Schur complement element).
src/bgmCompare_interface.cpp Converts Doxygen blocks to .cpp comments.
src/bgmCompare/bgmCompare_output.h Documents output struct fields.
src/bgmCompare/bgmCompare_helper.cpp Converts Doxygen blocks to .cpp comments.
src/RcppExports.cpp Registers new mixed MRF / prediction exported symbols.
man/simulate_mrf.Rd Reflows docs and refines Blume-Capel equation formatting.
man/simulate.bgms.Rd Documents mixed-model simulate output + reflows examples.
man/predict.bgms.Rd Documents mixed-model predict output and reflows long lines.
man/predict.bgmCompare.Rd Reflows value/seealso sections.
man/mrfSampler.Rd Reflows documentation (wrapping edits).
man/extract_sbm.Rd Adds extract_main_effects() to “Other extractors”.
man/extract_rhat.Rd Adds extract_main_effects() to “Other extractors”.
man/extract_posterior_inclusion_probabilities.Rd Adds extract_main_effects() to “Other extractors”.
man/extract_pairwise_thresholds.Rd Updates deprecation target to extract_main_effects().
man/extract_pairwise_interactions.Rd Updates seealso and “Other extractors” to include extract_main_effects().
man/extract_main_effects.Rd Adds new Rd for extract_main_effects().
man/extract_indicators.Rd Adds extract_main_effects() to “Other extractors”.
man/extract_indicator_priors.Rd Adds extract_main_effects() to “Other extractors”.
man/extract_group_params.Rd Updates seealso and “Other extractors” links.
man/extract_ess.Rd Adds extract_main_effects() to “Other extractors”.
man/extract_category_thresholds.Rd Reworks content as deprecated and redirects readers to extract_main_effects().
man/extract_arguments.Rd Adds extract_main_effects() to “Other extractors”.
man/coef.bgms.Rd Updates coef.bgms docs for new “main effects” meaning across model types.
man/bgms-package.Rd Updates package title/description to include mixed models.
man/bgm.Rd Documents mixed variable_type and new pseudolikelihood arg; clarifies outputs.
Readme.Rmd Updates high-level description to include mixed models.
README.md Updates rendered README to include mixed models.
R/validate_model.R Extends variable-type validation to allow mixed designs with allow_mixed.
R/run_sampler.R Adds mixed MRF sampler dispatcher + refactors between-cluster default handling.
R/extractor_functions.R Adds extract_main_effects() and deprecates extract_category_thresholds(); improves pairwise naming for mixed.
R/bgms-package.R Updates package-level roxygen to include mixed models.
R/bgms-methods.R Updates print/summary/coef semantics for GGM quadratic diagonal + mixed model labeling.
R/bgm_spec.R Adds mixed_mrf model_type, mixed spec builder, and mixed build_arguments.
R/bgm.R Adds pseudolikelihood argument and passes through to spec builder.
R/RcppExports.R Adds R wrappers for new mixed C++ exports.
NEWS.md Adds release note describing mixed MRF support.
NAMESPACE Exports extract_main_effects() and registers new S3 methods.
DESCRIPTION Updates package description to include mixed MRF support.
.github/workflows/weekly-compliance.yaml Adds scheduled compliance/validation workflow.
.github/workflows/nightly-validation.yaml Adds scheduled full test + validation workflow.
.Rbuildignore Adds ignores for contributing/config files.
dev/numerical_analyses/bgm_blumecapel_probs_PL.R Removes developer-only numerical analysis script.
dev/generate_legacy_fixtures.R Removes developer-only legacy fixture generation script.
dev/benchmark_memoization.R Removes developer-only benchmarking script.
dev/README.md Removes dev directory README.

@MaartenMarsman MaartenMarsman requested a review from Copilot March 10, 2026 16:09
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 96 out of 168 changed files in this pull request and generated 6 comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants