Skip to content

Conversation

@timbo112711
Copy link
Contributor

@timbo112711 timbo112711 commented Oct 26, 2025

Description

Currently to implement Time-Slice CV one would need to reference the example notebook and copy / pasta the code blocks. This PR aims to wrap the code inside the current notebook into a TimeSliceCrossValidator class. This new class will be used to streamline how users implement this methodology.

The class functions similar to the scikit-learn / generator approach for returning the spilts.

A simple flow would look the this,

# Initialize cross-validator
cv = TimeSliceCrossValidator(
    n_init=163,
    forecast_horizon=12,
    date_column="date",
    step_size=1,
)

# We can check how many splits we will have
# As a reference, the number of splits is computed as:
# n_iterations = y.size - n_init - forecast_horizon + 1
n_splits = cv.get_n_splits(X, y)

# Run the CV!
results = cv.run(
    X,
    y,
    yaml_path="<your-path-to-saved-model>",
)

Related Issue

Checklist


📚 Documentation preview 📚: https://pymc-marketing--2037.org.readthedocs.build/en/2037/

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@github-actions github-actions bot added docs Improvements or additions to documentation MMM labels Oct 26, 2025
@timbo112711
Copy link
Contributor Author

timbo112711 commented Oct 26, 2025

  • add tests
  • fix bug in CRSP method

@codecov
Copy link

codecov bot commented Oct 26, 2025

Codecov Report

❌ Patch coverage is 11.65919% with 197 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.45%. Comparing base (162e1a7) to head (f6c00f6).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
pymc_marketing/mmm/time_slice_cross_validation.py 11.65% 197 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (162e1a7) and HEAD (f6c00f6). Click for more details.

HEAD has 5 uploads less than BASE
Flag BASE (162e1a7) HEAD (f6c00f6)
23 18
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2037       +/-   ##
===========================================
- Coverage   92.86%   62.45%   -30.42%     
===========================================
  Files          68       69        +1     
  Lines        9213     9596      +383     
===========================================
- Hits         8556     5993     -2563     
- Misses        657     3603     +2946     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions github-actions bot added the tests label Oct 27, 2025
@juanitorduz
Copy link
Collaborator

Thanks @timbo112711 ! I will look inot this one carefully this week :)

sampler_config: dict | None = None,
yaml_path: str | None = None,
mmm: Any | None = None,
):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing hint.

@cetagostini
Copy link
Contributor

Hey, really nice implementation and critical feature we are missing!

A few requests on my side:

  1. I see some test failings, lets try to get those work.
  2. Then let's move the plots to the plot suite, instead of plot_param_stability better to use the suite in the class as property (plot.param_stability). Probably once you add the plots in the plot suite, you need to make sure they are agnostic to several dims (You can follow how other plots works), and plus add an small validator in the plots to see if the idata came from MMM or a cross validation, this relate to my next comment.
  3. You can make the posteriors a single xarray object, instead of a list. In my opinion you have two options, an inference data object with each model as a variable (user can even decide based on the number of splits the name of each variable, we can have default names as in model builder) or a single xarray with model dimension. Any of those would be better and will integrate better with the plot suite API.

This things align a lot more with the vision we are following right now with the class! Could you take a look? Otherwise, I can help you! @timbo112711

@juanitorduz juanitorduz requested a review from Copilot October 27, 2025 13:32
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new TimeSliceCrossValidator class to standardize time-slice cross-validation for Media Mix Models (MMM). Previously, users had to copy code from example notebooks; now they can use this scikit-learn-style class with methods like split(), get_n_splits(), and run().

Key Changes:

  • New TimeSliceCrossValidator class with configurable n_init, forecast_horizon, date_column, and step_size parameters
  • Support for YAML-based model configuration or programmatic MMM instance passing
  • Comprehensive test suite covering basic functionality, step_size variations, and edge cases

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.

File Description
pymc_marketing/mmm/time_slice_cross_validation.py Implements the core TimeSliceCrossValidator class with cross-validation logic and visualization methods
tests/mmm/test_time_slice_cross_validator.py Comprehensive test suite validating initialization, splitting logic, model fitting, and edge cases

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@timbo112711
Copy link
Contributor Author

Hey, really nice implementation and critical feature we are missing!

A few requests on my side:

  1. I see some test failings, lets try to get those work.
  2. Then let's move the plots to the plot suite, instead of plot_param_stability better to use the suite in the class as property (plot.param_stability). Probably once you add the plots in the plot suite, you need to make sure they are agnostic to several dims (You can follow how other plots works), and plus add an small validator in the plots to see if the idata came from MMM or a cross validation, this relate to my next comment.
  3. You can make the posteriors a single xarray object, instead of a list. In my opinion you have two options, an inference data object with each model as a variable (user can even decide based on the number of splits the name of each variable, we can have default names as in model builder) or a single xarray with model dimension. Any of those would be better and will integrate better with the plot suite API.

This things align a lot more with the vision we are following right now with the class! Could you take a look? Otherwise, I can help you! @timbo112711

@cetagostini thanks for the review 🙌🏻! These make sense and I'll start working towards each of these. Will reach out if I need anything!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs Improvements or additions to documentation MMM tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Time-Slice CV Class

3 participants