
[RFC] Data processing pipelines #56

Open
@rth

Description

In line with the code reorganization in issue #33, I was wondering what your opinion is on data pipelines. I could be wrong, but it looks like the current OrdinaryKriging etc. could be separated into several independent steps:

  • (optional) anisotropy correction coordinate transformation
  • (optional) geographic coordinate transformation
  • actual kriging (not sure whether the drift could be separated into its own step)

One could then potentially create, for instance, AnisotropyTransformer, CoordinateTransformer, and KrigingEstimator classes (or some other names) and get the results by constructing a pipeline with sklearn.pipeline.Pipeline or sklearn.pipeline.make_pipeline. The advantage of this being that the different transformers

  • could then be simpler (with fewer input parameters), and users would only use the elementary bricks they need
  • can be reused for different kriging types
  • can be more easily customized by subclassing individual steps
  • can be tested independently
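To make the idea concrete, here is a minimal sketch of what such a pipeline could look like. AnisotropyTransformer is a hypothetical class from this proposal (not an existing PyKrige API), and a KNeighborsRegressor stands in for the not-yet-written KrigingEstimator:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline


class AnisotropyTransformer(BaseEstimator, TransformerMixin):
    """Rotate and rescale 2D coordinates to correct for anisotropy.

    Hypothetical elementary brick; names and parameters are illustrative.
    """

    def __init__(self, scaling=1.0, angle=0.0):
        self.scaling = scaling
        self.angle = angle  # rotation angle in degrees, counter-clockwise

    def fit(self, X, y=None):
        # Stateless: the parameters are user-supplied, nothing to estimate.
        return self

    def transform(self, X):
        theta = np.radians(self.angle)
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta), np.cos(theta)]])
        X_rot = np.asarray(X) @ rot.T
        X_rot[:, 1] *= self.scaling  # stretch the minor axis
        return X_rot


# Compose the elementary steps; a KNN regressor is a placeholder
# for the final kriging estimator.
pipe = make_pipeline(
    AnisotropyTransformer(scaling=2.0, angle=30.0),
    KNeighborsRegressor(n_neighbors=3),
)

rng = np.random.RandomState(0)
X = rng.uniform(size=(50, 2))
y = X[:, 0] + 0.1 * X[:, 1]
pipe.fit(X, y)
pred = pipe.predict(X[:5])
```

Each step can then be unit-tested on its own, and swapping, say, the anisotropy correction for a custom subclass does not touch the rest of the pipeline.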

I'm not sure whether that would be useful. Among other things, it could depend on how many more options we end up adding. For instance, the UniversalKriging class currently has 16 input parameters, which is already quite a bit. If we end up adding, say, a local variogram search radius, a kriging search radius, and a few others, splitting the processing into several steps might be a way to simplify the interface for users.
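One reason the pipeline split helps with the parameter explosion is sklearn's per-step parameter namespacing: each class keeps a short parameter list, and the pipeline addresses them with a `stepname__param` prefix. A small self-contained illustration (using stock sklearn components, not PyKrige classes):

```python
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

pipe = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=3))

# Parameters stay grouped by step instead of piling up in one __init__:
pipe.set_params(kneighborsregressor__n_neighbors=5)
n = pipe.get_params()["kneighborsregressor__n_neighbors"]
```

This same mechanism makes the pipeline directly usable with GridSearchCV, so search radii and variogram options could be tuned per step rather than added as yet more constructor arguments.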

Just a thought... What do you think?
