Pre-processing code #49

Open
ericmjl opened this issue Feb 9, 2021 · 0 comments
ericmjl commented Feb 9, 2021

Hey @ericmjl, yep we’re getting there 😁! Docs are on the horizon.

The plan for tomorrow is:

  1. Docs coverage (as close to 100% as sanity allows)
  2. Feature Pre-processing

Re: pre-processing, I was thinking of a setup where users map feature names to transforms in a dictionary/config. The transforms could be a few standard functions (one-hot, mean, etc.), fitted sklearn scalers for normalising across graphs in a dataset (with some helper functions for creating them from a list of graphs), or unfitted sklearn scalers for single-graph normalisation. A config object seems nice and consistent, but a dictionary might be better for users who create their own features, as a config object would be a bit inflexible there.

Would be super keen to hear any thoughts/suggestions on this.

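from functools import partial
from sklearn.preprocessing import StandardScaler

# one_hot, SS_ELEMENTS and process_graph are names from the proposed API
pre_processing_dict = {
    "molecular_weight": StandardScaler,
    "secondary_structure": partial(one_hot, vocab=SS_ELEMENTS),
}

G = process_graph(G, pre_processing_dict)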
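
To make the proposal a bit more concrete, here is a rough sketch of how `process_graph` might dispatch on each dictionary entry. It is not the actual implementation, and several things are assumptions made purely for illustration: the graph is a plain dict of node ids to attribute dicts rather than a networkx graph, `SimpleStandardScaler` is a stdlib stand-in for an sklearn scaler that works on a flat list of floats (a real sklearn scaler expects 2D arrays, so values would need reshaping), and the bodies of `one_hot`, `SS_ELEMENTS` and `process_graph` are hypothetical.

```python
from functools import partial
from statistics import mean, pstdev

# Hypothetical vocabulary of secondary structure labels.
SS_ELEMENTS = ["H", "E", "C"]


def one_hot(value, vocab):
    """Encode a single value as a one-hot list over vocab."""
    return [1 if value == item else 0 for item in vocab]


class SimpleStandardScaler:
    """Stdlib stand-in for an sklearn-style scaler (flat list of floats)."""

    def fit(self, values):
        self.mean_ = mean(values)
        # Fall back to 1.0 so constant features do not divide by zero.
        self.scale_ = pstdev(values) or 1.0
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.scale_ for v in values]


def process_graph(nodes, pre_processing_dict):
    """Return a copy of nodes with each listed feature transformed.

    Each dictionary value may be:
      * a scaler class (unfitted): fitted on this graph only,
      * a fitted scaler instance: applied as-is (dataset-level normalisation),
      * a plain callable: applied to each node value independently.
    """
    processed = {n: dict(attrs) for n, attrs in nodes.items()}
    for feature, transform in pre_processing_dict.items():
        ids = [n for n in processed if feature in processed[n]]
        if not ids:
            continue
        values = [processed[n][feature] for n in ids]
        if isinstance(transform, type):
            new_values = transform().fit(values).transform(values)
        elif hasattr(transform, "transform"):
            new_values = transform.transform(values)
        else:
            new_values = [transform(v) for v in values]
        for n, v in zip(ids, new_values):
            processed[n][feature] = v
    return processed


example_nodes = {
    "a": {"molecular_weight": 100.0, "secondary_structure": "H"},
    "b": {"molecular_weight": 200.0, "secondary_structure": "E"},
}
example_out = process_graph(
    example_nodes,
    {
        "molecular_weight": SimpleStandardScaler,
        "secondary_structure": partial(one_hot, vocab=SS_ELEMENTS),
    },
)
```

Passing a class versus a fitted instance is what separates single-graph from dataset-level normalisation in this sketch. User-defined features only need a callable, which is the flexibility argument for a dictionary over a config object.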

After this, I think the only outstanding task is a conversion toolkit to support various frameworks. After that it’s cleaning up & polishing, and then I think a V2.0 release is in order :D
EDIT: and tests! Will crack on with them this week.

Originally posted by @a-r-j in #45 (comment)
