Introduce Model Adapter Layer to unify PyMC and OLS interfaces #629
Hi @drbenvincent! First, thanks for this clear and neat work. Your proposed solution not only resolves the issue I mentioned, but this refactor will without a doubt bring the library to a better state, making it easier for people to contribute thanks to this layer of abstraction. To answer your questions:

1. **Public vs internal:** I don't have strong feelings about it. It might be worth keeping the adapters internal in the documentation at first, as exposing them could clutter meaningful information for the everyday user.
2. **Lazy vs eager:** Again, no strong opinion. On one hand, I believe it would be clearer to have them instantiated in `__init__` (eager), but I can't see a reason why lazy creation would be an issue.
3. **Experiment-specific requirements:** That is a complicated one to tackle. Originally, that's why I proposed an experiment-based adapter; I wasn't sure we could unify the logic across all experiments. We want to avoid trading if/else logic based on model types for if/else logic based on experiment types. In my opinion, as the library scales, the needs of experiments will become more specific, so an experiment-specific adapter could be a valuable option. I believe the best approach would be to create a default adapter for general cases, and then, for specific needs like `InstrumentalVariableRegression`, create child classes (e.g., `SklearnAdapterIVG` or `PyMCAdapterIVG`) that inherit from the base and only override the required methods, as sketched below.
4. **Plotting:** From a developer's POV, I would recommend keeping them separate. It would be easier to maintain and verify, especially while "vibe coding", ensuring that modifications to one backend don't unintentionally break the other. For instance, if OLS needs an update, we should know exactly which files are affected.

I'd be happy to propose a draft PR later this week to start exploring this. In the meantime, where do you think these new files should be located within the library structure? Thanks again for diving so deep into this and finding a solution!
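A rough sketch of that inheritance idea (class names and the extra `instruments` argument are illustrative, not settled API):

```python
class SklearnAdapter:
    """Default adapter covering the general case (sketch)."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y, coords=None):
        self.model.fit(X, y)  # coords accepted for interface parity, ignored


class SklearnAdapterIVG(SklearnAdapter):
    """Overrides only what InstrumentalVariableRegression needs."""

    def fit(self, X, y, coords=None, instruments=None):
        # IV models take extra data; everything else is inherited
        self.model.fit(X, y, instruments)
```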
Based on some refactoring discussion kicked off by @JeanVanDyk, I'm adding an idea I generated in discussion with Claude. I think this shows promise, but more thought would need to go into unified result types.
Tagging a few people who may have interest/thoughts: @NathanielF, @juanitorduz, @williambdean, @cetagostini.
## Summary
This proposal addresses the growing complexity of model-type dispatching (`if isinstance(self.model, PyMCModel) ... else ...`) throughout the experiment classes by introducing a Model Adapter Layer that provides a unified interface for both PyMC (Bayesian) and scikit-learn (OLS) models.

## Problem Statement
### Current Architecture

CausalPy experiment classes (e.g., `InterruptedTimeSeries`, `DifferenceInDifferences`, `SyntheticControl`) accept either a `PyMCModel` or a scikit-learn `RegressorMixin` via the `model=` parameter. Internally, these classes use `isinstance()` checks to dispatch to the appropriate code paths.

### The Problem
The two model types have fundamentally different interfaces:

| Aspect | `PyMCModel` | sklearn `RegressorMixin` |
|---|---|---|
| `fit()` signature | `fit(X, y, coords)` | `fit(X, y)` |
| `predict()` return type | `az.InferenceData` | `np.ndarray` |
| Prediction shape | `xr.DataArray` (2D with `treated_units` dim) | `np.ndarray` (typically 1D) |
| Posterior samples | yes (`chain`, `draw` dims) | no |
| `score()` return | `pd.Series` with HDI | `float` |

This mismatch requires extensive branching logic throughout the codebase.
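For concreteness, the recurring dispatch looks roughly like this (a paraphrase of the pattern, not the exact CausalPy source; the posterior-predictive variable name `"mu"` is an assumption):

```python
import numpy as np

from causalpy.pymc_models import PyMCModel  # the Bayesian base class


def fit_and_predict(model, X, y, coords=None):
    """Paraphrase of the per-experiment dispatch (not the exact source).

    The two branches return different types, which is exactly the problem.
    """
    if isinstance(model, PyMCModel):
        model.fit(X, y, coords=coords)       # Bayesian fit takes coords
        idata = model.predict(X)             # returns az.InferenceData
        # "mu" is an assumed variable name; carries chain/draw dims
        return idata.posterior_predictive["mu"]
    else:
        model.fit(X, y)                      # plain sklearn signature
        return np.asarray(model.predict(X))  # typically a 1D array
```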
### Extent of the Problem
A survey of the codebase reveals model-type dispatching in every experiment class:

**`InterruptedTimeSeries`** (~10+ branch points)
- `__init__`: model fitting (lines 203-215)
- `__init__`: scoring (lines 218-223)
- `__init__`: post-period prediction (lines 230-233)
- `__init__`: impact calculation (lines 236-246)
- `_split_post_period()`: slicing predictions/impacts (lines 330-407)
- `effect_summary()`: computing statistics (lines 496-569)
- `_comparison_period_summary()`: period comparison (lines 619-744)
- `analyze_persistence()`: persistence analysis (lines 1241-1336)
- `_bayesian_plot()` / `_ols_plot()` methods
- `get_plot_data_bayesian()` / `get_plot_data_ols()` methods

**`DifferenceInDifferences`** (~5+ branch points)
- `__init__`: model fitting (lines 137-153)
- `__init__`: causal impact extraction (lines 220-244)
- `_bayesian_plot()` / `_ols_plot()` methods

**`PanelRegression`** (~5+ branch points)
- `__init__`: model fitting (lines 220-231)
- `get_plot_data_bayesian()` / `get_plot_data_ols()`
- `plot_unit_effects()`: coefficient extraction
- `plot_trajectories()`: prediction handling
- `plot_residuals()`: data extraction

**Other experiment classes**
Similar patterns exist in `SyntheticControl`, `RegressionDiscontinuity`, `RegressionKink`, and `PrePostNEGD`.

### Consequences

- Backend-specific logic is duplicated and scattered across experiment classes rather than centralized.
- Every new experiment or backend multiplies the number of branch points to write and test.
- Changes to one backend risk unintentionally breaking the other.
## Proposed Solution: Model Adapter Layer
### Architecture Overview

Introduce an adapter layer that wraps both model types to provide a unified interface.
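A minimal sketch of what that unified interface could look like; `ModelAdapter` is the name used in this proposal, the exact signatures are illustrative, and `PredictionResult` is defined in the next section:

```python
from __future__ import annotations

from typing import Any, Protocol


class ModelAdapter(Protocol):
    """One interface that both backends implement (sketch)."""

    is_bayesian: bool  # lets the few remaining backend-aware spots branch cheaply

    def fit(self, X, y, coords: dict | None = None) -> None: ...

    def predict(self, X) -> PredictionResult: ...

    def score(self, X, y) -> Any: ...
```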
### Unified Result Types
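One plausible shape for these containers (a sketch; the field names are mine rather than settled API, in line with the caveat above that the result types need more thought):

```python
from __future__ import annotations

from dataclasses import dataclass

import numpy as np


@dataclass
class PredictionResult:
    """Backend-agnostic prediction container (sketch).

    `mean` is always present; `samples` is populated only by the Bayesian
    backend (posterior draws with chain/draw flattened into one dim).
    """

    mean: np.ndarray                   # point prediction, shape (n_obs,)
    samples: np.ndarray | None = None  # posterior draws, shape (n_samples, n_obs)


@dataclass
class ImpactResult:
    """Causal impact (observed minus counterfactual), same convention."""

    mean: np.ndarray
    samples: np.ndarray | None = None
```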
### Adapter Implementations
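Sketches of the two concrete adapters, reusing `PredictionResult` from above. The posterior-predictive variable name `"mu"` and the dim handling are assumptions about the PyMC models, so treat them as placeholders:

```python
from __future__ import annotations

import numpy as np


class PyMCAdapter:
    """Wraps a CausalPy PyMCModel behind the unified interface (sketch)."""

    is_bayesian = True

    def __init__(self, model):
        self.model = model

    def fit(self, X, y, coords=None):
        self.model.fit(X, y, coords=coords)

    def predict(self, X) -> PredictionResult:
        idata = self.model.predict(X)
        # "mu" is illustrative; flatten chain/draw into one sample dim
        draws = (
            idata.posterior_predictive["mu"]
            .stack(sample=("chain", "draw"))
            .transpose("sample", ...)
            .to_numpy()
        )
        return PredictionResult(mean=draws.mean(axis=0), samples=draws)


class SklearnAdapter:
    """Wraps a scikit-learn RegressorMixin behind the same interface."""

    is_bayesian = False

    def __init__(self, model):
        self.model = model

    def fit(self, X, y, coords=None):
        self.model.fit(X, y)  # coords accepted for interface parity, ignored

    def predict(self, X) -> PredictionResult:
        return PredictionResult(mean=np.asarray(self.model.predict(X)))
```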
### Factory Function
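A sketch of the factory, reusing the adapter classes above:

```python
from sklearn.base import RegressorMixin

from causalpy.pymc_models import PyMCModel


def create_adapter(model) -> ModelAdapter:
    """The one isinstance check left in the codebase (sketch)."""
    if isinstance(model, PyMCModel):
        return PyMCAdapter(model)
    if isinstance(model, RegressorMixin):
        return SklearnAdapter(model)
    raise ValueError(f"Unsupported model type: {type(model).__name__}")
```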
### Simplified Experiment Code

With adapters, experiment classes become cleaner.
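For instance, the model-handling portion of an experiment's `__init__` could collapse to something like this (the helper and variable names are hypothetical):

```python
def fit_pre_post(adapter, X_pre, y_pre, X_post, y_post, coords=None):
    """What the branching __init__ bodies could reduce to (sketch)."""
    adapter.fit(X_pre, y_pre, coords=coords)
    pre_pred = adapter.predict(X_pre)    # PredictionResult from either backend
    post_pred = adapter.predict(X_post)
    # Impact needs no isinstance checks: observed minus predicted mean
    post_impact = y_post - post_pred.mean
    return pre_pred, post_pred, post_impact
```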
## Advantages

### 1. No Breaking API Changes

Users continue to pass `model=cp.pymc_models.LinearRegression()` or `model=LinearRegression()`. The adapter layer is purely internal.

### 2. Eliminates Pervasive Branching

Instead of 10+ `isinstance` checks per experiment class, there's one check in `create_adapter()`.

### 3. Centralized Model Logic
All PyMC-specific and sklearn-specific behavior lives in the adapter classes, not scattered across experiment classes.
### 4. Easier Extension
Adding a new backend (e.g., Nutpie, NumPyro, statsmodels) requires only implementing a new adapter class, not modifying every experiment.
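As a thought experiment only (statsmodels support is not being proposed here), a third backend would be one small new class reusing `PredictionResult` from the sketches above:

```python
import numpy as np
import statsmodels.api as sm


class StatsmodelsAdapter:
    """Hypothetical third backend; no experiment class would change."""

    is_bayesian = False

    def fit(self, X, y, coords=None):
        # statsmodels binds data at construction, so build the model here
        self.results_ = sm.OLS(y, X).fit()

    def predict(self, X) -> PredictionResult:
        return PredictionResult(mean=np.asarray(self.results_.predict(X)))
```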
### 5. Better Testability

Adapters can be unit-tested in isolation, and experiment classes can be tested against a mock adapter without running either backend.
### 6. Improved Documentation
Backend-specific behavior is documented in adapter classes, not buried in experiment implementation details.
### 7. Type Safety

The unified result types (`PredictionResult`, `ImpactResult`) provide clear contracts that can be type-checked.

## Disadvantages / Considerations
### 1. Increased Abstraction

Adds a layer of indirection. Developers need to understand the adapter pattern to contribute.

### 2. Migration Effort

Significant refactor touching all experiment classes. Estimated effort: 2-3 days for core implementation, plus testing.

### 3. Result Type Overhead

The unified result types add some overhead compared to raw numpy arrays or InferenceData. This should be negligible in practice since MCMC sampling dominates runtime.

### 4. Potential Edge Cases

Some experiment classes may have unique requirements that don't fit the unified interface cleanly. These may need adapter extensions or experiment-specific handling.

### 5. Learning Curve

New contributors need to understand the adapter pattern, though this is offset by cleaner experiment code.
## Implementation Plan

### Phase 1: Core Infrastructure

- `ModelAdapter` protocol and result dataclasses
- `PyMCAdapter` and `SklearnAdapter`

### Phase 2: Migrate Experiment Classes

- `InterruptedTimeSeries` as proof of concept

### Phase 3: Cleanup

- Remove remaining `isinstance` checks
- Unify `_bayesian_plot` / `_ols_plot` where possible (using the `is_bayesian` flag)

## Relationship to Issue #624
This proposal supersedes the approach suggested in #624, which proposed:

- Backend-specific experiment modules (e.g., `interrupted_time_series_pymc.py`, `interrupted_time_series_ols.py`)
- A `backend="pymc"` parameter instead of passing model objects

The adapter approach is preferred because it keeps the existing `model=` API and avoids duplicating experiment logic across backend-specific files (see the Advantages above).
## Questions for Discussion

1. Should the adapters be public API or internal implementation details?
2. Should adapters be created lazily or eagerly (in `__init__`)?
3. How should experiment-specific requirements be handled (e.g., `InstrumentalVariableRegression` has a unique `fit()` signature)?
4. Should the backend-specific plotting methods be unified (`_bayesian_plot` / `_ols_plot`)?