Note on use of tpcp.Algorithm classes #13

Open
AKuederle opened this issue Oct 25, 2024 · 3 comments

Comments

@AKuederle
Member

I saw that at least one of the algorithms implemented here (see the line

self.selected_coords = ["x", "y", "z"] # Default coordinates
) has a bunch of loading in the __init__. This is discouraged in the context of tpcp, sklearn, and similar frameworks: parameters can be modified after the init (using set_params), and creating an instance is assumed to be "cheap", so it is often done under the hood (e.g. via .clone).

All logic should be put where it is first needed (i.e. in the detect or train method).
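
A minimal sketch of that convention, written as a plain Python class rather than a real tpcp.Algorithm subclass. The class name, the window_size parameter, the mean_per_coord_ result, and the dict-of-lists input are all hypothetical and only serve the illustration; the point is that the init stores parameters and nothing else.

```python
class SketchAlgorithm:
    """Plain-Python stand-in for a tpcp/sklearn-style algorithm (not the real base class)."""

    def __init__(self, selected_coords=("x", "y", "z"), window_size=10):
        # Only store the parameters. No loading, validation, or derived values here.
        self.selected_coords = selected_coords
        self.window_size = window_size

    def detect(self, data):
        # All logic runs here, using whatever the parameters are right now.
        n = min(self.window_size, len(data[self.selected_coords[0]]))
        self.mean_per_coord_ = {
            coord: sum(data[coord][:n]) / n for coord in self.selected_coords
        }
        return self


data = {"x": [1.0, 3.0, 5.0], "y": [0.0, 0.0, 0.0], "z": [2.0, 2.0, 2.0]}
algo = SketchAlgorithm(selected_coords=("x",), window_size=2).detect(data)
```

Because nothing is precomputed, changing algo.window_size and calling detect again picks up the new value without re-creating the instance.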

@AKuederle
Member Author

Here as well:

@Jahneel
Collaborator

Jahneel commented Feb 16, 2025

I tried to address the issue you raised in fa2b888 and e14b7c2.
These commits move the computation-heavy parts out of the initialization and place them where they are first used or in separate functions. Is this a correct fix for the issue you raised, @AKuederle?

@AKuederle
Member Author

There are still computations in the respective inits. The parameters of an algorithm can (and will) be changed, e.g. in a grid search, without rerunning the init. As a result, dependent computations that only exist in the init become out of sync.
Further, you always need to rerun all dependent computations. Assume that parameters can change between executions. You must therefore not rely on pre-loaded values that were potentially calculated based on other parameters.
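
To make the out-of-sync problem concrete, here is a small sketch with two hypothetical plain-Python classes (not taken from the repository, and not using the real tpcp base class). The hand-written set_params below only mimics the idea of changing a parameter without rerunning the init.

```python
class _Params:
    def set_params(self, **params):
        # Simplified stand-in for set_params: overwrite attributes, skip __init__.
        for name, value in params.items():
            setattr(self, name, value)
        return self


class StaleInit(_Params):
    def __init__(self, factor=2):
        self.factor = factor
        # Derived value computed once, at init time only.
        self._threshold = factor * 10

    def detect(self, value):
        return value > self._threshold


class FreshEachRun(_Params):
    def __init__(self, factor=2):
        self.factor = factor

    def detect(self, value):
        # Derived value recomputed from the current parameter on every call.
        threshold = self.factor * 10
        return value > threshold


stale = StaleInit(factor=2).set_params(factor=5)
fresh = FreshEachRun(factor=2).set_params(factor=5)
stale_result = stale.detect(30)  # still compares against the old threshold of 20
fresh_result = fresh.detect(30)  # compares against the updated threshold of 50
```

The two instances hold the same parameters after set_params, yet they disagree, because StaleInit still carries a value derived from the old factor.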

If you want to cache expensive computations, use proper caching that tracks inputs.
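
One way to read "caching that tracks inputs", sketched with functools.lru_cache from the standard library. This is a swapped-in illustration, not necessarily the caching approach intended here; expensive_template, CachedAlgorithm, and their parameters are hypothetical names. The cache key is the function's arguments, so a changed parameter automatically leads to a recomputation.

```python
from functools import lru_cache


@lru_cache(maxsize=32)
def expensive_template(n_samples: int, scale: float) -> tuple:
    # Stand-in for a costly computation; the result depends only on the arguments.
    return tuple(scale * i for i in range(n_samples))


class CachedAlgorithm:
    def __init__(self, n_samples=3, scale=2.0):
        # Parameters only; the template is not built here.
        self.n_samples = n_samples
        self.scale = scale

    def detect(self, data):
        # Cache lookup is keyed on the current parameter values.
        template = expensive_template(self.n_samples, self.scale)
        self.score_ = sum(t * d for t, d in zip(template, data))
        return self


algo = CachedAlgorithm()
algo.detect([1, 1, 1])  # first call computes the template
algo.detect([1, 1, 1])  # same parameters, served from the cache
algo.scale = 1.0
algo.detect([1, 1, 1])  # changed parameter, template is recomputed
```

lru_cache requires hashable arguments and keeps results in memory only; for large arrays or on-disk caching a different tool would be needed, but the principle of keying the cache on all inputs stays the same.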

Btw., you also don't need to initialize result variables and other attributes in the init. You can just declare the type hints in the class body and then set the attributes when needed.
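
A short sketch of that pattern, again as a plain Python class with hypothetical names (HintedAlgorithm, threshold, peaks_). The annotation in the class body documents the attribute for readers and type checkers without assigning anything.

```python
from typing import List


class HintedAlgorithm:
    # Annotations only: no value is assigned, so no attribute exists yet.
    threshold: float
    peaks_: List[int]

    def __init__(self, threshold: float = 0.5):
        # The init sets parameters and nothing else.
        self.threshold = threshold

    def detect(self, data):
        # The result attribute is created here, when it is first needed.
        self.peaks_ = [i for i, v in enumerate(data) if v > self.threshold]
        return self


algo = HintedAlgorithm(threshold=0.5)
has_result_before = hasattr(algo, "peaks_")
algo.detect([0.1, 0.9, 0.7])
```

Accessing peaks_ before detect raises an AttributeError, which is arguably clearer than a placeholder None that could be mistaken for a real result.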
