[FEATURE REQUEST] Automatic creation of Columns #13

Open
hhsecond opened this issue Mar 6, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@hhsecond
Member

hhsecond commented Mar 6, 2020

Describe the feature
To make the APIs easier to use, it might be a good idea to infer the type of the data on first assignment and create the hangar column if it doesn't exist. A few thoughts that make this a difficult decision:

  • The implicit creation of columns might make the user think that they could later add data with a different shape or a different dtype.
  • Inferring whether a column should be variable_shaped might not be possible.
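
Both concerns can be illustrated with a minimal sketch. It assumes samples expose `.dtype` and `.shape` the way NumPy arrays do; `ColumnSpec`, `infer_column_spec`, `check_sample` and `FakeArray` are hypothetical names for illustration and are not part of hangar or stockroom.

```python
from typing import NamedTuple, Tuple


class ColumnSpec(NamedTuple):
    """Hypothetical record of what one sample reveals; not a hangar type."""
    dtype: str
    shape: Tuple[int, ...]
    variable_shape: bool


def infer_column_spec(first_sample) -> ColumnSpec:
    # Works on anything exposing .dtype and .shape, as NumPy arrays do.
    return ColumnSpec(
        dtype=str(first_sample.dtype),
        shape=tuple(first_sample.shape),
        # One sample cannot show whether later samples will differ in shape,
        # so the only guess available here is a fixed-shape column.
        variable_shape=False,
    )


def check_sample(spec: ColumnSpec, sample) -> None:
    """Reject a later sample that disagrees with the inferred spec."""
    if str(sample.dtype) != spec.dtype:
        raise TypeError(f"expected dtype {spec.dtype}, got {sample.dtype}")
    if not spec.variable_shape and tuple(sample.shape) != spec.shape:
        raise ValueError(f"expected shape {spec.shape}, got {tuple(sample.shape)}")


class FakeArray(NamedTuple):
    """Stand-in for an array so the sketch runs without dependencies."""
    dtype: str
    shape: Tuple[int, ...]


spec = infer_column_spec(FakeArray("uint8", (28, 28)))
check_sample(spec, FakeArray("uint8", (28, 28)))      # accepted
try:
    check_sample(spec, FakeArray("uint8", (32, 32)))  # second shape differs
except ValueError as err:
    print(err)
```

The second assignment fails even though nothing in the dictionary-style call told the user a fixed shape had been chosen, which is the surprise the first bullet describes.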

Thoughts?

@hhsecond hhsecond added the enhancement New feature or request label Mar 6, 2020
@rlizzo
Member

rlizzo commented Mar 6, 2020

The variable_shape argument would be difficult to infer, but I think the bigger issue would be figuring out which column layout to use from data samples alone.

Also, does stockroom allow users to specify kwargs for the backend / backend_options parameters? Hangar's heuristics are really dumb... automatic methods to optimize this are part of hangar enterprise, but even with a reasonable set of options for a dataset, the final choice requires you to know a bit about the user's environment and where they fall on the compression tradeoff (time vs space) scale. Neither hangar nor stockroom can handle this in full right now...

If you were able to infer basic info though, you'd need to consider how a user would correct a column definition if the heuristics were wrong.

@hhsecond
Member Author

hhsecond commented Mar 7, 2020

No, if the user is expert enough to configure backend_options, they can rely on hangar for that.

So the idea here is that the user always has the choice to use the hangar CLI to create the columns (and we'll make sure the user understands this through the documentation), but if they just need the stock to act as a dictionary and not worry about anything else, that's when heuristics are going to help. Does that sound reasonable to you?

About inferring the layout, I think the currently available layouts are inferrable. Isn't that true? About the time series layout (and any future layouts), I am not sure.
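
As a rough sketch of what layout inference could look like, assume there are only two layouts: a flat one (one sample per key) and a nested one (a mapping of subsamples per key). That two-layout assumption, the layout names, and the `infer_layout` helper are all illustrative and not taken from hangar or stockroom.

```python
from collections.abc import Mapping


def infer_layout(first_value) -> str:
    """Guess a layout name from the first value assigned to a key (hypothetical)."""
    if isinstance(first_value, Mapping):
        if not first_value:
            # An empty mapping carries no sample to inspect.
            raise ValueError("cannot infer a layout from an empty mapping")
        return "nested"
    return "flat"


print(infer_layout([1, 2, 3]))             # flat
print(infer_layout({"a": [1], "b": [2]}))  # nested
```

A check like this only looks at the container type of the first value, so it would not distinguish a layout such as time series, whose difference lies in how samples relate to each other rather than in how one value is packaged.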
