
Feature tracker for v2.0 #109

Open · 35 tasks

nicholasjclark opened this issue Mar 31, 2025 · 0 comments

nicholasjclark commented Mar 31, 2025

Switch to full support for all families and distributional modifications handled by {brms}

  • Generate the entire Stan code and design matrices using {brms}
  • Add new vars to be returned by variables() and friends, and update aliases to match {brms} names
  • Any ordering of the data will need to happen before any calls to brm(); need to ensure that the current version of flat_ys exactly matches the Y that is auto-generated by {brms} so that any latent processes are also in the right order
  • Deprecate share_obs_params and make this behaviour the default; users can instead use distributional regression to modify these parameters (see the sketch after this list)
  • Latent dynamics will only be allowed for the mean $\mu$ (i.e. cannot allow $\sigma_{obs}$ to be modelled as a RW(), for example)
  • Will take some thought about how to generalize Dunn-Smyth residuals
  • Need to handle multiple observations of the same latent linear predictors (basically do-able already in {brms}, see example below); look into compatibility among link functions, i.e. cloglog for binomial() and log() for poisson()
  • Will have to handle NAs in responses so that all data generated by {brms} is the correct dimension, but otherwise should do as little as possible when modifying the observation call to {brms}
  • Careful consideration will need to be paid to ensure that a mu parameter is explicitly created in the model block so that it can be added to appropriately; models with no special terms may often use the _glm() functions
  • Will need tests to confirm the breadth of {brms} possibilities are still handled at the observation level; i.e. car(), mm(), me(), cs(), se(), cens(), trials() etc...
  • Only allow missing predictors (i.e. mi()) on the latent scale, but no other specials such as se() or trials() (i.e. nothing in formula_ad)
  • Consider adding cap() as an additional response information function to work in nmix() i.e. count | cap(50) ~ ...; or perhaps make use of trials() for this purpose?
  • Include a modified posterior_linpred function to generate linear predictor values when dealing with latent process models and loading coefficients; ensure as well that log_lik() works appropriately
  • Follow the instructions in 'Define custom brms families' to add support for nmix(), tweedie() and occ(); need to ensure that type = 'latent_N' and type = 'occupancy' are supported
  • Expand nmix() to include Poisson-Poisson and Royle-Nichols models (#78)
  • Build on the summary function from {brms}, and remove any information about smooth function p-values
  • Ensure there are not too many parameters being monitored; compute posterior predictions after the fact where possible
  • Need to add any new parameters to be included in default_prior() etc... so that priors can be changed along with other priors already available in the {brms} model
  • Ensure compatibility for the flexible ways of modifying priors, i.e. it would be nice if all gp() length scale priors could be changed at once
  • Look into providing support for brm_multiple()
  • Work out how to support monotonic splines by adding a method for smooth2random(), or whether they should simply be dropped
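
A brief sketch of the distributional-regression replacement for share_obs_params mentioned above: in {brms}, observation parameters such as sigma can be given their own formulas, so users can share or vary them explicitly. The sim_data object below is hypothetical and the call is illustrative only, not run.

library(brms)

# Sketch only: sigma gets its own linear predictor, so observation parameters
# can be shared or varied per series instead of toggling share_obs_params
bf_dist <- bf(
  y ~ s(x) + series,     # formula for the mean
  sigma ~ 0 + series     # separate observation sigma for each series
)

# Any new parameters would need to appear here so their priors can be modified
default_prior(bf_dist, data = sim_data, family = gaussian())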

Improved support for factor-based correlations

  • Write a Stan function that can add correlated effects at any level, and that can also accept information in data2 (or some other interface) to use informed factor loadings following methods in Heaps et al. to ensure identification constraints. For example, this should be possible for N-mixture and occupancy models by adding to trend_mus
  • Needs to be possible to use this for correlated temporal dynamics such as RW(cor = TRUE)
  • Should be tracked so that covariance matrix can be computed at any time; needs to return loadings in a usable form as well as returning the correlated estimates of the correct dimension
  • Probably best that the Gaussian draws are standard normal and can then be multiplied by any sigmas after being returned (see the sketch below)
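
A minimal R sketch of the last two points, with purely illustrative names and dimensions: factor draws are kept standard normal, scaled only after being returned, and the covariance implied by the loadings can be computed at any time.

n_series <- 4; n_factors <- 2; n_time <- 50

# Illustrative loading matrix; identification constraints following
# Heaps et al. would be imposed in the Stan code itself
L <- matrix(rnorm(n_series * n_factors), n_series, n_factors)

# Standard-normal factor draws, multiplied by sigma only after being returned
z <- matrix(rnorm(n_time * n_factors), n_time, n_factors)
sigma <- 0.7
correlated_states <- sigma * tcrossprod(z, L)   # n_time x n_series

# Covariance / correlation among series implied by the loadings
Sigma <- sigma^2 * tcrossprod(L)
cov2cor(Sigma)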

Added flexibility for latent processes

  • Allow for both hierarchical and shared process parameters (i.e. sharing AR(1) coefficients across series) (Allow hierarchical or no pooling of trend components #35); allow this to be controlled from the relevant functions such as AR()
  • Add exponential smoothing latent processes and ensure they can be supported for State-Space models
  • Allow higher-order lags for AR() and VAR(), as well as lags that are discontinuous (i.e. it should be possible to call AR(p = c(1, 12, 24))); see the simulation sketch after this list
  • Ensure a new C++ function can be used to work with higher-order lags, and ensure latent state estimates are returned up to the relevant lag for forecasting purposes
  • Look into how {brms} predicts for autocor terms and think about expanding to VARs and other autocorrelation processes
  • Possibly allow for order to be learned (particularly for VARs) by following examples from Heaps et al
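
The small simulation below sketches what a discontinuous-lag call such as AR(p = c(1, 12, 24)) is meant to represent: a latent process with nonzero autoregressive coefficients only at those lags. The interface itself is a proposal; nothing here uses existing {mvgam} functions.

set.seed(123)
lags  <- c(1, 12, 24)
coefs <- c(0.5, 0.3, 0.1)
n <- 200

# Simulate forward, conditioning only on the selected lags
y <- numeric(n + max(lags))
for (t in (max(lags) + 1):length(y)) {
  y[t] <- sum(coefs * y[t - lags]) + rnorm(1, sd = 0.2)
}
y <- tail(y, n)
plot(y, type = 'l', main = 'Latent process with lags 1, 12 and 24')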

Better interfaces for JSDMs and detection error hierarchical models

  • trend_map should be created internally for nmix() and occ() models (see the sketch below)
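
For illustration, the trend_map built internally might look like the following, mapping replicate observation series for each site to a single latent abundance or occupancy process (series names are made up).

trend_map <- data.frame(
  series = factor(c('site1_rep1', 'site1_rep2',
                    'site2_rep1', 'site2_rep2')),
  trend = c(1, 1, 2, 2)
)
trend_map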

More downstream functionality to improve user interface

  • Add a function to decompose link-scale prediction variance into contributions from predictor terms and dynamics (see the sketch after this list)
  • Re-use the model_file when running lfo_cv() (#103)
  • Look into auto-creating a lightweight {gam}-style object that can be used by {gratia}; may be able to create my own simple class, give this class a predict() function by leveraging prepare_predictions_sm() and data_sm(), and then telling {gratia} how to draw this class; would be nice to handle effects for distributional parameters as well
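
A rough sketch of the variance-decomposition idea in the first point of this list, assuming draws of each additive linear-predictor component could be extracted; that extraction step is exactly the new functionality that would be needed, so everything below is hypothetical.

# components: named list of (draws x observations) matrices, one per term
decompose_linpred_var <- function(components) {
  total <- Reduce(`+`, components)
  vapply(components, function(comp) {
    # average share of per-observation posterior variance attributable to this
    # component (shares need not sum to one unless components are independent)
    mean(apply(comp, 2, var) / apply(total, 2, var))
  }, numeric(1))
}

# Hypothetical usage with simulated smooth and trend contributions
set.seed(1)
smooth_term <- matrix(rnorm(2000, sd = 0.5), nrow = 100)
trend_term  <- matrix(rnorm(2000, sd = 1), nrow = 100)
decompose_linpred_var(list(smooth = smooth_term, trend = trend_term))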

A few examples to show some ways that {brms} can already be used to extend {mvgam}

library(mvgam)           # Fit, interrogate and forecast DGAMs
library(brms)            # Bayesian regression models in Stan
library(marginaleffects) # Compute and plot model predictions
library(dplyr)           # Tidy and flexible data manipulation
library(ggplot2)         # Flexible plotting
library(gratia)          # Graceful plotting of smooth terms
library(viridis)         # Plotting colours
library(patchwork)       # Combining ggplot objects

# Get Portal data ready
portal_ts <- read.csv('https://raw.githubusercontent.com/nicholasjclark/EFI_seminar/main/data/portal_data.csv', 
                          as.is = TRUE)
portal_ts %>%
  dplyr::mutate(series = as.factor(species)) -> portal_ts

# Get lags ready; note that i:n - i + 1 parses as (i:n) - i + 1,
# so column i of the returned matrix holds x lagged by i - 1 steps
lagard <- function(x, n_lag = 6) {
  n <- length(x)
  X <- matrix(NA, n, n_lag)
  for (i in 1:n_lag) {
    X[i:n, i] <- x[i:n - i + 1]
  }
  X
}

mintemp <- do.call(rbind, lapply(seq_along(levels(portal_ts$series)), function(x){
  portal_ts %>%
    dplyr::filter(series == levels(portal_ts$series)[x]) %>%
    dplyr::arrange(time) %>%
    dplyr::select(mintemp, time) %>%
    dplyr::pull(mintemp) -> tempdat
  
  lag_mat <- lagard(tempdat, 6)
  tail(lag_mat, NROW(lag_mat) - 5)
}))

portal_ts %>%
  dplyr::arrange(series, time) %>%
  dplyr::filter(time > 5) -> portal_ts

data_all <- list(
  lag = matrix(0:5, nrow(portal_ts), 6, byrow = TRUE),
  captures = portal_ts$captures,
  ndvi_ma12 = portal_ts$ndvi_ma12,
  time = portal_ts$time,
  series = portal_ts$series)
data_all$mintemp <- mintemp

# Use just PP for this example
pp_inds <- which(data_all$series == 'PP')
data_pp <- lapply(data_all, function(x){
  if(is.matrix(x)){
    x[pp_inds, ]
  } else {
    x[pp_inds]
  }
})

# A brms distributed lag model using cbind()
brms_dat <- data.frame(
  captures = data_pp$captures,
  time = data_pp$time,
  lag0 = data_pp$lag[,1],
  lag1 = data_pp$lag[,2],
  lag2 = data_pp$lag[,3],
  lag3 = data_pp$lag[,4],
  mintemp0 = data_pp$mintemp[,1],
  mintemp1 = data_pp$mintemp[,2],
  mintemp2 = data_pp$mintemp[,3],
  mintemp3 = data_pp$mintemp[,4]
)

mod1 <- brm(
  captures ~ t2(
    cbind(lag0, lag1, lag2, lag3),
    cbind(mintemp0, mintemp1, mintemp2, mintemp3),
    k = 4
  ),
  data = brms_dat,
  family = poisson(),
  backend = 'cmdstanr',
  cores = 4
)

# Visualising lagged effects
patchwork::wrap_plots(
  plot_predictions(mod1, 
                   condition = 'mintemp0',
                   type = 'link'),
  plot_predictions(mod1, 
                   condition = 'mintemp1',
                   type = 'link'),
  plot_predictions(mod1, 
                   condition = 'mintemp2',
                   type = 'link'),
  plot_predictions(mod1, 
                   condition = 'mintemp3',
                   type = 'link'),
  ncol = 2
) &
  theme_classic(base_size = 12)


# Now an example that integrates multiple observation types
brms_dat$ycaptures <- rbinom(
  NROW(brms_dat),
  prob = plogis(-0.1 + brms_dat$mintemp0 * 0.25),
  size = 1
)

# Set up the model to already handle latent predictors, 
# where we could use a general predictor ._Z_. to represent the
# latent states (i.e. 'trend' in current mvgam models)
brms_dat$._Z_. <- as.numeric(NA)

# Observation formulae and response distributions
bf1 <- bf(
  captures ~ mintemp0 + mi(._Z_.),
  family = poisson()
)

bf2 <- bf(
  ycaptures ~ mintemp0 + mi(._Z_.),
  family = bernoulli()
)

# Latent variable, estimated as Gaussian white noise but could add AR dynamics
bf3 <- bf(._Z_. | mi() ~ 0)

mod2 <- brm(
  bf1 + bf2 + bf3 + set_rescor(FALSE),
  data = brms_dat,
  # Set the coefficients for the latent state
  # to equal 1
  prior = c(prior_string("constant(1)", 
                         resp = "captures", 
                         coef = "mi._Z_."),
            prior_string("constant(1)", 
                         resp = "ycaptures", 
                         coef = "mi._Z_.")),
  backend = 'cmdstanr',
  cores = 4
)

# Should then generally be able to add latent state models at
# the "target += normal_lpdf(Yl_Z | mu_Z, sigma_Z);" line;
# will also need to ensure NA's are handled so that latent
# states continue to move in time
stancode(mod2)
standata(mod2)$N_captures
standata(mod2)$N_Z
summary(mod2)
predict(mod2)