Cannot save/load model #1457

Open
kb-open opened this issue Feb 3, 2025 · 17 comments

@kb-open

kb-open commented Feb 3, 2025

I'm using pymc-marketing version 0.11.0. When I try to save the model using pickle, I get the following error:
PicklingError: Can't pickle <function create_dim_handler.<locals>.func at 0x00000202218598A0>: it's not found as pymc_marketing.prior.create_dim_handler.<locals>.func

I've tried joblib and pickle. Both result in this error.

@wd60622
Contributor

wd60622 commented Feb 3, 2025

Is there an issue with the save and load methods? Those are the intended I/O path.

As an alternative, you would have to use cloudpickle, because the model contains locally defined functions, which the standard pickle module cannot serialize.
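
For context, here is a minimal sketch of why the standard pickle module rejects a function defined inside another function. `make_handler` is a hypothetical stand-in, not the library's `create_dim_handler`, and the cloudpickle step only runs if that package happens to be installed.

```python
import pickle


def make_handler(scale):
    """Hypothetical factory that returns a function defined in a local scope."""

    def func(x):
        return x * scale

    return func


def can_std_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # Which exception is raised depends on the Python version and on
        # whether the C or the pure-Python pickler is in use.
        return False


handler = make_handler(3)

# Standard pickle stores functions by qualified name, and a name containing
# "<locals>" cannot be looked up again at load time.
std_ok = can_std_pickle(handler)

# cloudpickle serializes the function by value instead (optional dependency).
try:
    import cloudpickle

    restored = cloudpickle.loads(cloudpickle.dumps(handler))
    cloud_result = restored(2)
except ImportError:
    cloud_result = None

print("standard pickle works:", std_ok)
print("cloudpickle round trip result:", cloud_result)
```

This only illustrates the general mechanism; the model's own save and load methods remain the supported route.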

@kb-open
Author

kb-open commented Feb 3, 2025

I think we can close the issue. I tried with the default save and load methods, and they worked fine.

@kb-open kb-open closed this as completed Feb 3, 2025
@wd60622
Contributor

wd60622 commented Feb 3, 2025

Sounds good @kb-open

If you run into any issues, feel free to open another issue.
FYI, the load_from_idata classmethod can also be used for additional I/O flexibility.

@kb-open kb-open changed the title from "Cannot pickle model" to "Cannot save/load model" Feb 8, 2025
@kb-open
Author

kb-open commented Feb 8, 2025

The load method does not work for causal MMM. For example, I get the following error when trying to load the causal_mm model object as per the official documentation here: https://www.pymc-marketing.io/en/stable/notebooks/mmm/mmm_causal_identification.html.

DifferentModelError                       Traceback (most recent call last)
File ~\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\model_builder.py:617, in ModelBuilder.load(cls, fname)
    616 try:
--> 617     return cls.load_from_idata(idata)
    618 except DifferentModelError as e:

File ~\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\model_builder.py:572, in ModelBuilder.load_from_idata(cls, idata)
    566     msg = (
    567         "The model id in the InferenceData does not match the model id. "
    568         "There was no error loading the inference data, but the model may "
    569         "be different. "
    570         "Investigate if the model structure or configuration has changed."
    571     )
--> 572     raise DifferentModelError(msg)
    574 return model

DifferentModelError: The model id in the InferenceData does not match the model id. There was no error loading the inference data, but the model may be different. Investigate if the model structure or configuration has changed.

The above exception was the direct cause of the following exception:

DifferentModelError                       Traceback (most recent call last)
Cell In[55], line 1
----> 1 model_bayesian = MMM.load('model/model_bayesian.nc')

File ~\anaconda3\envs\mmm\Lib\site-packages\pymc_marketing\model_builder.py:624, in ModelBuilder.load(cls, fname)
    618 except DifferentModelError as e:
    619     error_msg = (
    620         f"The file '{fname}' does not contain "
    621         "an InferenceData of the same model "
    622         f"or configuration as '{cls._model_type}'"
    623     )
--> 624     raise DifferentModelError(error_msg) from e

DifferentModelError: The file 'model/model_bayesian.nc' does not contain an InferenceData of the same model or configuration as 'MMM'
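
The error message says the id stored in the file does not match the id of the freshly constructed model. As a rough sketch of how such a check can fail, the example below derives an id by hashing the configuration. The hashing scheme and the `model_id` helper are assumptions for illustration, not pymc-marketing's actual implementation; the column names, model type, and version string are taken from this thread.

```python
import hashlib
import json


def model_id(config: dict, model_type: str, version: str) -> str:
    """Hypothetical fingerprint: hash the configuration plus type and version."""
    payload = json.dumps(config, sort_keys=True) + model_type + version
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Configuration at the time the model was saved (illustrative subset).
saved_config = {
    "channel_columns": ["x1", "x2"],
    "control_columns": ["holiday_signal"],
}
saved_id = model_id(saved_config, "MMM", "0.0.2")

# Rebuilding with identical settings reproduces the same id.
same_id = model_id(dict(saved_config), "MMM", "0.0.2")

# Any change to the configuration yields a different id, which is the
# situation an id comparison on load would flag.
changed_config = {
    **saved_config,
    "control_columns": ["holiday_signal", "competitor_offers"],
}
changed_id = model_id(changed_config, "MMM", "0.0.2")

ids_match = saved_id == same_id
ids_differ = saved_id != changed_id
print(saved_id, same_id, changed_id)
```

If the real id is computed along these lines, a configuration that is modified between construction and saving would be enough to trigger the mismatch, but that is a hypothesis to check rather than a confirmed cause.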

@kb-open kb-open reopened this Feb 8, 2025
@kb-open
Author

kb-open commented Feb 8, 2025

My code:

from pymc_marketing.mmm import MMM

model_bayesian.save('model/model_bayesian.nc')
model_bayesian = MMM.load('model/model_bayesian.nc')

model_bayesian is the same as causal_mm from the notebook; only the name is different.

@wd60622
Contributor

wd60622 commented Feb 8, 2025

So you are running the notebook? Or do you have a different configuration than the notebook?

@kb-open
Author

kb-open commented Feb 8, 2025

Same configuration but I tried to save and load.

@wd60622
Contributor

wd60622 commented Feb 8, 2025

> Same configuration but I tried to save and load.

Which version(s) are you using?

@kb-open
Author

kb-open commented Feb 8, 2025

0.11.0

@kb-open
Author

kb-open commented Feb 8, 2025

Here is a little more information, in case it helps solve the problem.

As soon as I add the dag and outcome variables to the model configuration, the issue occurs. That is, the issue occurs only with the causal model. As soon as I remove these variables (while keeping everything else exactly the same), the issue disappears.

One thing I notice is that, with the causal model, the saturation_beta variable no longer exists in the default config, and saturation_alpha appears instead. Maybe this is a clue for debugging the issue, @wd60622.

@wd60622
Contributor

wd60622 commented Feb 9, 2025

Thanks for the context. What are the values you are passing to dag?

@kb-open
Author

kb-open commented Feb 9, 2025

causal_dag = """digraph {x1 -> y;
x2 -> y;
x1 -> x2;
holiday_signal -> y;
holiday_signal -> x1;
holiday_signal -> x2;
competitor_offers -> x2;
competitor_offers -> y;
market_growth -> y;}"""
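
To make the graph easier to inspect, here is a small sketch that pulls the edges out of the DOT string above with a regular expression. `parse_edges` is a hypothetical helper, not part of pymc-marketing, and it only handles simple `a -> b` statements like the ones in this graph.

```python
import re

causal_dag = """digraph {x1 -> y;
x2 -> y;
x1 -> x2;
holiday_signal -> y;
holiday_signal -> x1;
holiday_signal -> x2;
competitor_offers -> x2;
competitor_offers -> y;
market_growth -> y;}"""


def parse_edges(dot: str) -> list[tuple[str, str]]:
    """Return (source, target) pairs for simple 'a -> b' statements."""
    return re.findall(r"(\w+)\s*->\s*(\w+)", dot)


edges = parse_edges(causal_dag)

# Every node mentioned in the graph, and the direct parents of the outcome.
nodes = sorted({name for edge in edges for name in edge})
parents_of_y = sorted(src for src, dst in edges if dst == "y")

print(nodes)
print(parents_of_y)
```

Listing the nodes this way makes it straightforward to compare the graph against the channel and control columns the model ends up with.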

@wd60622
Contributor

wd60622 commented Feb 12, 2025

Can you load the .nc file directly with arviz and share the attrs of the InferenceData and the contents of the fit_data Dataset group?

@kb-open
Author

kb-open commented Feb 12, 2025

Code used:

import arviz as az

idata = az.from_netcdf('model/model_bayesian.nc')
print("Attributes of InferenceData:")
print(idata.attrs)

if "fit_data" in idata.groups():
    print("\nValues of 'fit_data' Dataset group:")
    print(idata.fit_data)
else:
    print("\n'fit_data' group not found in the InferenceData object.")

Output:

Attributes of InferenceData:
{'id': 'bcbce0522a5869f2', 'model_type': 'MMM', 'version': '0.0.2', 'sampler_config': '{}', 'model_config': '{"intercept": {"dist": "HalfNormal", "kwargs": {"sigma": 0.5}}, "likelihood": {"dist": "Normal", "kwargs": {"sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 2}}}, "dims": ["date"]}, "gamma_control": {"dist": "Normal", "kwargs": {"mu": 0, "sigma": 1}, "dims": ["control"]}, "gamma_fourier": {"dist": "Laplace", "kwargs": {"mu": 0, "b": 1}, "dims": ["fourier_mode"]}, "intercept_tvp_config": {"m": 200, "L": 729.25, "eta_lam": 1.0, "ls_mu": 100.0, "ls_sigma": 10.0, "cov_func": null}, "media_tvp_config": {"m": 200, "L": 729.25, "eta_lam": 1.0, "ls_mu": 5.0, "ls_sigma": 10.0, "cov_func": null}, "adstock_alpha": {"dist": "Beta", "kwargs": {"alpha": 1, "beta": 3}, "dims": ["channel"]}, "saturation_alpha": {"dist": "Gamma", "kwargs": {"mu": 2, "sigma": 1}, "dims": ["channel"]}, "saturation_lam": {"dist": "HalfNormal", "kwargs": {"sigma": 1}, "dims": ["channel"]}}', 'date_column': '"date_str"', 'adstock': '{"lookup_name": "geometric", "prefix": "adstock", "priors": {"alpha": {"dist": "Beta", "kwargs": {"alpha": 1, "beta": 3}, "dims": ["channel"]}}, "l_max": 12, "normalize": true, "mode": "After"}', 'saturation': '{"lookup_name": "michaelis_menten", "prefix": "saturation", "priors": {"alpha": {"dist": "Gamma", "kwargs": {"mu": 2, "sigma": 1}, "dims": ["channel"]}, "lam": {"dist": "HalfNormal", "kwargs": {"sigma": 1}, "dims": ["channel"]}}}', 'adstock_first': 'true', 'control_columns': '["holiday_signal"]', 'channel_columns': '["x1", "x2"]', 'validate_data': 'true', 'yearly_seasonality': 'null', 'time_varying_intercept': 'true', 'time_varying_media': 'true', 'dag': '"digraph {x1 -> y;\\n                         x2 -> y;\\n                         x1 -> x2;\\n                         holiday_signal -> y;\\n                         holiday_signal -> x1;\\n                         holiday_signal -> x2;\\n                         competitor_offers -> x2;\\n                         competitor_offers -> y;\\n                         market_growth -> y;}"', 'treatment_nodes': '["x1", "x2"]', 'outcome_node': '"y"'}

Values of 'fit_data' Dataset group:
<xarray.Dataset> Size: 76kB
Dimensions:            (date: 729)
Coordinates:
  * date               (date) datetime64[ns] 6kB 2022-01-01 ... 2023-12-30
Data variables:
    holiday_signal     (date) float64 6kB ...
    competitor_offers  (date) float64 6kB ...
    x1                 (date) float64 6kB ...
    x2                 (date) float64 6kB ...
    market_growth      (date) float64 6kB ...
    t                  (date) float64 6kB ...
    date_str           (date) <U10 29kB ...
    y                  (date) float64 6kB ...
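
The attrs printed above are JSON-encoded strings, which makes them awkward to compare by eye. A possible way to decode them and spot differences against the arguments used to build the model is sketched below. `decode_attrs` and `diff_keys` are hypothetical helpers, the `attrs` dict is a small subset copied from the output above, and the `expected` values are made up purely for illustration.

```python
import json

# A few attrs copied from the output above; most values are JSON-encoded strings.
attrs = {
    "model_type": "MMM",
    "control_columns": '["holiday_signal"]',
    "channel_columns": '["x1", "x2"]',
    "treatment_nodes": '["x1", "x2"]',
    "outcome_node": '"y"',
    "adstock_first": "true",
    "yearly_seasonality": "null",
}


def decode_attrs(raw: dict) -> dict:
    """Decode JSON-encoded attr values, leaving plain strings untouched."""
    decoded = {}
    for key, value in raw.items():
        try:
            decoded[key] = json.loads(value)
        except (TypeError, json.JSONDecodeError):
            decoded[key] = value  # e.g. 'MMM' is not valid JSON, keep as-is
    return decoded


def diff_keys(stored: dict, expected: dict) -> list:
    """Return the keys whose stored value differs from the expected one."""
    return sorted(key for key in expected if stored.get(key) != expected[key])


decoded = decode_attrs(attrs)

# Hypothetical constructor arguments, for illustration only.
expected = {
    "channel_columns": ["x1", "x2"],
    "control_columns": ["holiday_signal", "competitor_offers"],
}
mismatched = diff_keys(decoded, expected)
print(mismatched)
```

With the real file, `attrs` would be `idata.attrs` and `expected` would hold the values actually passed to the model constructor.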

@kb-open
Author

kb-open commented Feb 16, 2025

Tagging @wd60622 just in case my comment above got missed, since there has been no update.

@wd60622
Contributor

wd60622 commented Feb 16, 2025

I am unable to reproduce this. Can you put together a small reproducible example?

@kb-open
Author

kb-open commented Feb 17, 2025

example.zip
Please find attached @wd60622


2 participants