Open
Many of the algorithms get stuck on a single arm, either because they expect the learner’s counts to be updated from zero, or because they always select the first arm with the maximum value (which is simply the first arm when an MLELearner object is initialized with the same fixed mean for all arms). This change fixes both issues: it makes sure every arm is played at least once (in cases where this is the expected behavior) and selects randomly among the best arms when several are tied. With this change, all algorithms should, in expectation, sample from all arms with equal probability during the first batch.
Also make plots a little easier to read.
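For readers who want the idea in concrete form, here is a minimal sketch of the two behaviors described above: playing each unplayed arm once, and breaking ties among the best arms at random. It is written in Python for illustration only and is not the code from this pull request. The names `random_argmax` and `choose_arm`, and the plain lists `means` and `counts`, are hypothetical stand-ins for whatever the learner exposes.

```python
import random


def random_argmax(values, rng=random):
    """Return the index of a maximal value, breaking ties uniformly at random."""
    best = max(values)
    candidates = [i for i, v in enumerate(values) if v == best]
    return rng.choice(candidates)


def choose_arm(means, counts, rng=random):
    """Hypothetical selection rule: play unplayed arms first, then go greedy.

    `means` and `counts` are assumed per-arm estimates and play counts;
    they are illustrative and not an API from the project.
    """
    unplayed = [i for i, n in enumerate(counts) if n == 0]
    if unplayed:
        # Every arm gets played at least once before any greedy choice.
        return rng.choice(unplayed)
    # All arms have been played: pick a best arm, ties broken at random.
    return random_argmax(means, rng)
```

With identical initial means, a plain "first maximum" rule would always return arm 0, whereas `random_argmax` spreads the choice across all tied arms.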
Collaborator
I think we should require this information to be provided by the learner. MLELearner did provide this information and so would StreamStats objects.
Owner
Author
The issue is that nobs only gets updated when the learner gets updated, which only happens at the end of the batch.
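One way to picture the problem is with a rough sketch like the following, which keeps batch-local counts next to the learner's frozen ones. This is an assumption-laden illustration in Python, not the strategy proposed in this thread. `select_batch`, `learner_counts`, and `pending` are hypothetical names, and the "least-played arm" rule is only a placeholder policy.

```python
import random


def select_batch(learner_counts, batch_size, rng=random):
    """Choose arms for one batch while the learner's counts stay frozen.

    `learner_counts` stands in for the learner's per-arm observation
    counts, which (per the comment above) only change after the batch.
    """
    pending = [0] * len(learner_counts)  # plays chosen so far in this batch
    chosen = []
    for _ in range(batch_size):
        # Combine frozen learner counts with the in-batch tallies.
        effective = [n + p for n, p in zip(learner_counts, pending)]
        fewest = min(effective)
        candidates = [i for i, n in enumerate(effective) if n == fewest]
        arm = rng.choice(candidates)
        pending[arm] += 1
        chosen.append(arm)
    return chosen
```

Without the `pending` tally, every step in the batch would see the same counts and could keep returning the same arm.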
Collaborator
Made some comments in-line. Let me know if you want help implementing the strategy I proposed.