Add functions for Empirical Bayes #39

Open
ddimmery wants to merge 6 commits into master from eb-mle

Conversation

@ddimmery
Collaborator

This is for generic outcome distributions, not just the Beta-Binomial. It computes the mean and variance online (now the variance of the mean, i.e. the squared standard error) like MLELearner, but applies the positive-part James-Stein estimator we have been using.

Drew Dimmery added 5 commits April 24, 2017 22:20
This adds a positive-part James-Stein estimator that does not depend on
the distribution of the outcome.
Rename from a dumb name to JamesSteinLearner. Also fix the stochastic
bandit, which wasn't working before.
This provides functions for generating samples from the (normal)
posteriors of MLELearners and James-Stein learners.
MLELearner was calculating the SD, but we want to calculate the SE.
@ddimmery ddimmery requested a review from eytan May 31, 2017 18:08
  if nᵢ == 1
      learner.oldMs[a] = r
-     learner.Ss[a] = 0.0
+     learner.Ss[a] = learner.σ₀
Collaborator

Are these bug fixes?

Collaborator Author

It's not clear that I should really be doing this. We're already abusing the learner.σ₀ notation a bit, since we aren't "really" using a prior as the notation would suggest. This just ensures that the standard error is never exactly zero. For a Bernoulli DGP, we may have an estimated standard deviation of zero for a while until we finally observe a success. With this change, it behaves roughly like an Agresti-Coull estimate.
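The degenerate-variance motivation can be sketched in Python (illustrative only, not repository code; the PR itself instead seeds Ss with σ₀ at the first observation, whereas this sketch floors the final standard error):

```python
def floored_se(rewards, sigma0=0.0):
    """Standard error of the mean, floored at sigma0 so it never
    collapses to exactly zero. With all-identical rewards (e.g. only
    Bernoulli failures so far), the naive SE is exactly 0, which would
    make the learner overconfident in that arm."""
    n = len(rewards)
    mean = sum(rewards) / n
    # sample variance of the data (0 when we only have one observation)
    var = sum((r - mean) ** 2 for r in rewards) / (n - 1) if n > 1 else 0.0
    se = (var / n) ** 0.5  # standard error of the mean
    return max(se, sigma0)
```

With four failures in a row, the naive SE is 0.0 and the floor takes over; once outcomes vary, the ordinary SE is returned unchanged.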

Collaborator

Hmm. This seems like we should potentially be using something like NaN instead. Or at least not calling this MLE anymore.

  learner.oldMs[a] = learner.newMs[a]
  learner.μs[a] = learner.newMs[a]
- learner.σs[a] = sqrt(learner.Ss[a] / (nᵢ - 1))
+ learner.σs[a] = sqrt(learner.Ss[a] / (nᵢ - 1) / nᵢ)
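The difference between those two σs lines (sample SD of the data versus standard error of the mean) can be sketched in Python with Welford's online algorithm, which is the same running-sum-of-squares scheme the Ss field implements; class and method names here are illustrative, not from the repository:

```python
import math

class OnlineMean:
    """Welford's online algorithm for a running mean and variance."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.S = 0.0  # running sum of squared deviations from the mean

    def update(self, r):
        self.n += 1
        delta = r - self.mean
        self.mean += delta / self.n
        self.S += delta * (r - self.mean)

    def sd(self):
        # sample standard deviation of the data: sqrt(S / (n - 1))
        return math.sqrt(self.S / (self.n - 1))

    def se(self):
        # standard error of the mean: sqrt(S / (n - 1) / n) = sd / sqrt(n)
        return math.sqrt(self.S / (self.n - 1) / self.n)
```

The pre-fix code effectively returned sd() where the bandit logic needed se(), so uncertainty did not shrink as observations accumulated.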
Collaborator

Was this a bug as well?

Collaborator Author

Yes, and it was a really fun one to track down. Doing anything with MLELearner seems like it's been broken for a while (learner.σs was the estimated standard deviation of the data, not the standard error of the mean).

Collaborator

Wait, I think there's some confusion: this field was always supposed to be the standard deviation, not the standard error. So I think other places in the code are the problem rather than this line.

learner.oldMs[a] = learner.newMs[a]
learner.ys[a] = learner.newMs[a]
learner.ss[a] = learner.Ss[a] / (nᵢ - 1) / nᵢ
y̅ = mean(learner.ys)
Collaborator

Probably not a problem, but want to confirm that these steps are changing the inferred means for all means, not just the observed arm. I don't think we assume invariance anywhere, but just confirming.

Collaborator Author

Yeah, that's correct. Every data point changes the predictions (slightly, due to the shrinkage to an updated global mean) for all data points.
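That shrinkage step can be sketched in Python (a minimal positive-part James-Stein estimator mirroring the φs/μs update in the diff; function and variable names are illustrative, not from the repository):

```python
def james_stein_means(ys, ss):
    """Positive-part James-Stein shrinkage of per-arm means toward
    the grand mean. ys are the observed arm means; ss are their
    squared standard errors. Requires more than three arms (K > 3)."""
    K = len(ys)
    ybar = sum(ys) / K
    # between-arm dispersion, scaled by K - 3 as in the diff above
    denom = sum((y - ybar) ** 2 for y in ys) / (K - 3)
    # per-arm shrinkage factor, clipped at 1 ("positive part")
    return [ybar + (1.0 - min(1.0, s / denom)) * (y - ybar)
            for y, s in zip(ys, ss)]
```

Because every arm's estimate is pulled toward ybar, updating a single arm shifts the grand mean and hence perturbs all K estimates, as noted above. Noiseless arms (ss near 0) are left at their observed means; very noisy arms are shrunk fully to ybar.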

learner.ss[a] = learner.Ss[a] / (nᵢ - 1) / nᵢ
y̅ = mean(learner.ys)
φs = min(1.0, learner.ss ./ (sumabs2(learner.ys - y̅) ./ (learner.K - 3)))
learner.μs[:] = y̅ + (1 - φs) .* (learner.ys - y̅)
Collaborator

Very minor, but these computations allocate memory. If we update this to a newer version of Julia, using the following should be allocation-free:

learner.μs .= y̅ .+ (1 .- φs) .* (learner.ys .- y̅)

Collaborator Author

Ah, I've already been running on 0.5.1, didn't know this existed.

y̅ = mean(learner.ys)
φs = min(1.0, learner.ss ./ (sumabs2(learner.ys - y̅) ./ (learner.K - 3)))
learner.μs[:] = y̅ + (1 - φs) .* (learner.ys - y̅)
learner.σs[:] = sqrt(
Collaborator

Same allocation concern.

Collaborator Author

👍

@johnmyleswhite
Collaborator

Should we chat for five minutes over Messenger or VC to figure out what we should do here? I think the main problem here is a lack of documentation in this code about the intent of various fields.

This doesn't address the concern over learner.σs containing the std dev
rather than the std error, but it fixes a problem with the underlying
calculation.