-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Transmit BayesDB stochasticity to crosscat engine #382
Conversation
CrosscatMetamodel now takes a crosscat engine class constructor instead of an instance, so that the crosscat stochasticity can be controlled by the bdb RNG services. CrosscatMetamodel can still take an instance, for backward compatibility.
…0160311-alxempirical-i-want-to-be-surprised-and-delighted-but-i-also-want-to-be-in-total-control
Please let me know if I need to give more detail about why this is a valuable change. |
Sorry, just haven't gotten a round tuit yet. From what you said this sounds like a good idea, but I haven't taken a look at the actual changes. |
@@ -240,34 +239,8 @@ class CrosscatMetamodel(metamodel.IBayesDBMetamodel): | |||
|
|||
""" | |||
|
|||
def __init__(self, crosscat, subsample=None, ccargs=None, | |||
cckwargs=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think cckwargs was so we could pass e.g. the subsample keyword to the underlying metamodel, even if we wanted to do multiprocessing. Can you help me understand how that works in the revised flow?
No worries. Thanks for the update. |
`crosscat` can be a crosscat engine instance or constructor. Prefer the | ||
latter case, as that way bayeslite stochasticity is passed to the | ||
constructor `seed` argument when engines are constructed for crosscat | ||
queries (which at the time of writing happens every time bayeslite makes a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the provided constructor will be used seems like info a caller would want to have. Please keep that?
CrosscatMetamodel now takes a callable which should return a crosscat engine. See, e.g., crosscat.MultiprocessingEngine.MultiprocessingEngineFactoryFromPool. Arguments to the engine constructor can be included in a lambda if need be. I am agnostic about whether to use ccargs/cckwargs vs an opaque lambda here. One is simpler to write, the other is simpler to interrogate during debugging. |
for ccargs/cckwargs vs. lambda, I'm fine with lambda, but would suggest adding an example of that use in nearby documentation. |
self._subsample = subsample | ||
def __init__(self, crosscat, subsample=None, ccargs=None, | ||
cckwargs=None): | ||
if isinstance(crosscat, EngineTemplate): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No more dynamic decisions on the basis of isinstance
please, unless actually operating on a data structure that is sensibly a union type (e.g., a json tree).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already resolved in later commit.
|
||
>>> import bayeslite.metamodels.crosscat as cc | ||
>>> from crosscat import MultiprocessingEngine as mpe | ||
>>> import bayeslite.metamodels.crosscat as cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dup import
>>> from bayeslite import bayesdb_register_metamodel as rg | ||
>>> from bayeslite import bayesdb_open as bp | ||
>>> from bayeslite import bayesdb_register_metamodel as register_metamodel | ||
>>> from bayeslite import bayesdb_open | ||
>>> pool = mpe.Pool(4) | ||
>>> thelambda = lambda s: mpe.MultiprocessingEngine(seed=s, pool=pool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much more readable; thank you!
thelambda -> engine_factory perhaps?
See #392 instead. |
At the moment, the seed passed to a BayesDB instance essentially goes unused. A metamodel is seeded during instantiation, and that's it. That makes it more difficult than it should be to run repeatable tests in a REPL. This PR corrects that by allowing the metamodel to create a new engine for each query. Since the only state in an engine is an RNG, the overhead of this change is minimal.
This change needs to take place in concert with probcomp/crosscat#88
The new interface expects
bayeslite.metamodels.crosscat.CrosscatMetamodel
to be instantiated with a callable which takes aseed
argument. For backwards compatibility, it is still possible to pass acrosscat.EngineTemplate.EngineTemplate
instance (or instances of subclasses likeLocalEngine
orMultiprocessingEngine
.