generate synthetic dataset given BG/NBD Model parameters? #717
Hi, I am still trying to wrap my head around this great library.
Replies: 1 comment 6 replies
Hi @seanreed1111,

Good question!

Simple example

For a simple PyMC example:

    import numpy as np
    import pymc as pm

    y = np.array([0, 1, 2, 3, 4])

    # Can take random samples of y
    with pm.Model() as model:
        mu = pm.Normal("mu")
        sigma = pm.HalfNormal("sigma")
        pm.Normal("y", mu=mu, sigma=sigma, observed=y)

    # Cannot take random samples of y
    with pm.Model() as model:
        mu = pm.Normal("mu")
        sigma = pm.HalfNormal("sigma")
        pm.Potential("y", pm.logp(pm.Normal.dist(mu=mu, sigma=sigma), y))

This would even work if y depends on other variables, as it tends to in regression.

CLV Support

For the models in pymc-marketing/clv, the BG/NBD (BetaGeo) model uses a Potential, so it cannot be sampled. However, the other models should support sampling, because they are built with PyMC distributions or custom distributions.

@ColtAllen will have some more context as well.
Hey @seanreed1111,
Are you interested in generating RFM data for modeling, or raw transaction data? (See the Quickstart for examples.) Simulating RFM data is not supported yet for BG/NBD, but can be done for the Pareto/NBD model. Simulating raw transactions is not supported at all natively, but I did hack something together in the legacy lifetimes library last year for this very situation: https://github.com/ColtAllen/marketing-case-study/blob/main/case-study.ipynb
First three cells in that notebook should cover your request. If this is something you think you'll be doing on a regular basis, please create an issue and we'll prioritize it accordingly.
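In the meantime, the BG/NBD assumptions are simple enough to simulate by hand. Below is a rough sketch using only the Python standard library; it is not the code from the notebook above and not a pymc-marketing API. It assumes the usual BG/NBD story: each customer's purchase rate is Gamma(r, alpha) with alpha as a rate, the dropout probability is Beta(a, b), times between purchases are exponential, and a customer may drop out after each repeat purchase. The function names, the parameter values, and the shared observation window `T` are all made up for illustration.

```python
import random


def simulate_bgnbd_customer(r, alpha, a, b, T, rng):
    """Simulate one customer's repeat purchases over an observation window T."""
    # Purchase rate: Gamma with shape r and rate alpha (so scale = 1 / alpha).
    lam = rng.gammavariate(r, 1.0 / alpha)
    # Probability of dropping out right after any repeat purchase.
    p = rng.betavariate(a, b)

    purchase_times = []
    t = 0.0  # the first purchase happens at t = 0 and is not a repeat
    while True:
        t += rng.expovariate(lam)  # exponential waiting time with rate lam
        if t > T:
            break  # next purchase would fall outside the window
        purchase_times.append(t)
        if rng.random() < p:
            break  # customer becomes inactive after this purchase

    return {
        "frequency": len(purchase_times),  # number of repeat purchases
        "recency": purchase_times[-1] if purchase_times else 0.0,
        "T": T,
        "purchase_times": purchase_times,  # raw transaction times
    }


def simulate_bgnbd(n_customers, r, alpha, a, b, T, seed=0):
    """Simulate a cohort that all made their first purchase at t = 0."""
    rng = random.Random(seed)
    return [simulate_bgnbd_customer(r, alpha, a, b, T, rng) for _ in range(n_customers)]


# Illustrative parameter values only, not estimates from any dataset.
customers = simulate_bgnbd(1000, r=0.25, alpha=4.0, a=0.8, b=2.5, T=39.0)
```

Each record carries both the raw purchase times and the frequency/recency/T summary, so the same run can feed either use case. Giving every customer the same `T` is a simplification; staggered first-purchase dates would need a per-customer `T`.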