Batch job options/configuration settings #276
The JSON object in the body of /jobs (and other similar endpoints) is intentionally open (i.e. By the way, the client parameter |
That and how we handle it in our VITO backend is of course our business. But we've also put that "feature" in the Python client (because we need it).
Also: I'm not pushing to include this in the 1.0.0 API. If there is consensus to include it in a future version, that's fine too. |
@soxofaan I just added a clarification to the client library guidelines that the This behavior was actually foreseen in the API and was meant to be used as follows: So for example:
It just lacked the actual wording that additional was meant to be merged with the request body. If you want to stick with job_options on the back-end side, I'd propose to just use the Python client as follows: That would make at least the clients compatible. |
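To illustrate the merging described above, here is a minimal sketch in plain Python. It is not the code of any openEO client: the function name `build_job_body`, the clash check, and the `{"process": {"process_graph": ...}}` body layout are assumptions made for the example. It shows `additional` being merged into the top level of the request body, and how passing `job_options` through `additional` yields the namespaced form.

```python
import json


def build_job_body(process_graph, title=None, additional=None):
    """Build a POST /jobs body, merging `additional` into the top level.

    Hypothetical helper for illustration only.
    """
    body = {"process": {"process_graph": process_graph}}
    if title is not None:
        body["title"] = title
    # Merge extra, backend-specific fields at the top level of the body.
    for key, value in (additional or {}).items():
        if key in body:
            # Assumption: refuse to silently overwrite a standard field.
            raise ValueError(f"additional field {key!r} clashes with a standard field")
        body[key] = value
    return body


# Options as top-level fields of the request body.
top_level = build_job_body({}, title="demo", additional={"driver-memory": "2g"})

# The same option namespaced under "job_options", still passed via `additional`.
namespaced = build_job_body(
    {}, title="demo", additional={"job_options": {"driver-memory": "2g"}}
)

print(json.dumps(top_level, indent=2))
print(json.dumps(namespaced, indent=2))
```

Both variants go through the same merge step; the only difference is whether the backend expects the option names directly or inside a grouping object.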
The page build hasn't finished yet: The addition is:
And yes, that came from a time where we experimented with custom fields (i.e. publishing resources to the public with a custom |
Interesting, thanks. I think we'll stick with "namespacing" these job configuration things under a top-level "job_options" field.
Reopening this with the idea to allow JSON-schema based exploration of additional options through the /jobs and /services endpoints (we could also add two additional endpoints, if you'd like), similar to how secondary service parameters are defined. Please note that the primary use case is still top-level options, but it's still possible to expose the job options with small caveats in the rendered UIs. Example for the VITO job options (I'm not sure about the exact options allowed) exposed in {
"jobs": [...],
"links": [...],
"create_parameters": [
{
"name": "job_options",
"schema": {
"type": "object",
"properties": {
"driver-memory": {
"type": "string",
"description": "...",
"default": "2g"
},
"executor-memory": {
"type": "string",
"description": "...",
"default": "2g"
},
"large-scale": {
"type": "boolean",
"default": false
}
}
},
"description": "...",
"optional": true,
"default": {}
}
]
} |
… define the parameters that can be submitted during creation #276
That means that each backend can choose the actual field name of these job options ("job_options" in this example). |
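As a rough sketch of how a client might use such a response, the snippet below collects the advertised defaults per creation parameter. The `create_parameters` structure follows the example in this thread; the function name `default_options` and the rule of preferring per-property defaults over the parameter-level default are assumptions for illustration.

```python
def default_options(response):
    """Collect default values for each advertised creation parameter."""
    defaults = {}
    for param in response.get("create_parameters", []):
        schema = param.get("schema", {})
        properties = schema.get("properties", {})
        if schema.get("type") == "object" and properties:
            # Grouped options: gather the defaults of the nested properties.
            defaults[param["name"]] = {
                name: prop["default"]
                for name, prop in properties.items()
                if "default" in prop
            }
        elif "default" in param:
            # Plain parameter: use the parameter-level default.
            defaults[param["name"]] = param["default"]
    return defaults


# Response shaped like the example above (descriptions omitted).
response = {
    "jobs": [],
    "links": [],
    "create_parameters": [
        {
            "name": "job_options",
            "schema": {
                "type": "object",
                "properties": {
                    "driver-memory": {"type": "string", "default": "2g"},
                    "executor-memory": {"type": "string", "default": "2g"},
                    "large-scale": {"type": "boolean", "default": False},
                },
            },
            "optional": True,
            "default": {},
        }
    ],
}

options = default_options(response)
print(options)
```

Because the parameter name is read from the response, the same code works whether a backend calls the group "job_options" or something else.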
Your example is not ideal, because in the API the job and service options should be top-level, but you've chosen to group them in one parameter as an object. So yes, this is free to choose and the place where you could standardize this is the openEO Platform API/federation contract. job_options would not be standardized as part of the openEO API.
Indeed, that's because it feels much cleaner like that. |
Honestly, I don't see a difference between both approaches. |
Another example: if we are talking about "large area" processing in the aggregator, maybe some options are intended for the aggregator, some others for the VITO backend, some others for the EODC backend, etc. In such a context it just feels more maintainable to have some kind of grouping or tree structure to cleanly separate options.
I don't quite get what your aim is? The top-level approach has been in the API spec since 0.3 or so and your way of grouping is actually supported, but the grouping itself is not standardized. So are we good or what is your expectation? The aggregator surely can expose them in multiple groups, too (e.g. eodc_job_options, vito_job_options) and merge what is specified the same way. On the other hand, grouping them in the aggregator may also feel weird from a user perspective... But the aggregator thing is also a completely new story to be discussed separately, I think. |
I understand that anything is possible and that a backend can pick any approach to their liking, for example, in the python client, |
That's the issue if implementors choose not to follow the spec and even integrate proprietary stuff into only one of the clients... The "additional" thing has been there for years, and a decision VITO has taken later should not force a change in the spec, sorry. The JS client, for example, also has an additional flag for the top-level fields, and the GEE driver uses the top-level fields to make resources public.
We committed a change in the geotrellis backend to also support top-level job_options. This is backward compatible, and once it's rolled out we can also adjust the Python client to prefer that style. PS: our proposal for specifying job_options predates the clarifications made to the spec: |
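One possible reading of "backward compatible" here is that a backend accepts options both as top-level fields and inside a "job_options" object. The sketch below shows that idea only; it is not the geotrellis backend's code. The set of standard field names, the function name `extract_job_options`, and the rule that nested values win over top-level ones are all assumptions.

```python
# Assumption: fields treated as standard job metadata rather than options.
STANDARD_FIELDS = {"process", "title", "description", "plan", "budget"}


def extract_job_options(body):
    """Return backend-specific options from a POST /jobs request body."""
    # Any non-standard top-level field is treated as an option.
    options = {
        key: value
        for key, value in body.items()
        if key not in STANDARD_FIELDS and key != "job_options"
    }
    # Options grouped under "job_options" are still accepted (older style);
    # in this sketch they take precedence over top-level values.
    options.update(body.get("job_options", {}))
    return options


body = {
    "title": "demo",
    "process": {"process_graph": {}},
    "driver-memory": "4g",
    "job_options": {"executor-memory": "2g"},
}
merged = extract_job_options(body)
print(merged)
```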
@jdries Sorry, I acknowledge that there was no bad intention on your side. I'm also always very happy that VITO is very active in pointing out issues and giving feedback. The primary reason for this misunderstanding is that ReDoc earlier rendered the "additional properties for batch job" part nicely, which indicated that additional properties should go top-level (through the
This is not true. The spec fully allows job_options in the PR and GEE still has an implementation that uses the top-level fields. Anyway, great that we are coming to a conclusion here so that we can push the open PRs and issues to harmonize this in openEO. Thanks! |
* Jobs and services: Added `create_parameters` property in responses to define the parameters that can be submitted during creation #276
* Add processing options to a separate endpoint #276
* Options should be optional
* Move to an extension
* Add UDP extension
* Apply suggestions from code review
  Co-authored-by: Stefaan Lippens <[email protected]>
* Renamed according to PR review
* Clarify where UDPs can come from
* Apply suggestions from code review
* Add example
* Update extensions/processing-parameters/README.md
  Co-authored-by: Stefaan Lippens <[email protected]>

Co-authored-by: Stefaan Lippens <[email protected]>
Currently in the VITO backend we support (backend-specific) batch job options, for example to fine-tune Spark driver and executor memory limits. These are provided in the POST /jobs request in a "job_options" field, e.g. something like this:
This is also supported in the Python client (e.g. see https://github.com/Open-EO/openeo-python-client/blob/1171465c69f85f3c225719a6664c7a3865076ff1/openeo/rest/connection.py#L463-L471)
The actual options and usage of this are of course highly backend-specific, but it's probably a good idea to reserve this field "job_options" in the official spec.
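For readers who want a concrete picture, a request body with such a "job_options" field might look roughly like the one built below. This is an illustrative sketch, not the original example from the issue: the option names are taken from the schema example in this thread, while the title, the empty process graph, and the memory values are placeholders.

```python
import json

# Hypothetical POST /jobs body with backend-specific options grouped
# under a "job_options" field.
job_request = {
    "title": "demo job",  # placeholder
    "process": {"process_graph": {}},  # placeholder, empty process graph
    "job_options": {
        "driver-memory": "2g",
        "executor-memory": "2g",
    },
}

payload = json.dumps(job_request, indent=2)
print(payload)
```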