Batch job options/configuration settings #276

soxofaan · 2020-04-23T13:55:00Z

Currently in VITO backend we support (backend specific) batch job options for example to finetune spark driver and executor memory limits. These are provided in the POST /jobs request in a "job_options" field, e.g. something like this:

POST /jobs
{
    "process": { .... },
    "job_options": {"driver-memory": "10g", "executor-memory": "13g"},
     ....

This is also supported in the python client (e.g. see https://github.com/Open-EO/openeo-python-client/blob/1171465c69f85f3c225719a6664c7a3865076ff1/openeo/rest/connection.py#L463-L471)

The actual options and usage of this is of course highly backend specific, but it's probably a good idea to reserve this field "job_options" in official spec.

The text was updated successfully, but these errors were encountered:

m-mohr · 2020-04-23T13:58:45Z

The JSON object in the body of /jobs (and other similar endpoints) is intentionally open (i.e. additionalProperties is implicitly set to true) for such additions. Whether you add job_options or put for example driver-memory on the top-level is pretty back-end specific and thus I don't see any to do for the API spec. I'm open to discussing this once multiple back-ends and clients implement this, although I wouldn't want to call this "job_options". Additionally, you'd need an endpoint to expose what options are available, see /services / /service_types and the configuration property for a similar construct.

By the way, the client parameter additional was meant to be merged with the request body object. Not sure whether that's documented though.

soxofaan · 2020-04-23T14:06:00Z

That and how we handle it in our VITO backend is of course our business.

But we've also put that "feature" it in the python client (because we need it).
I was just wondering if there were any issues with implementing backend specific extensions in a general client.

soxofaan · 2020-04-23T14:08:20Z

Also: I'm not pushing to include this in the 1.0.0 API. If there is consensus to include it in a future version, that's fine too.

m-mohr · 2020-04-23T14:18:50Z

@soxofaan I just added a clarification to the client library guidelines that the additional parameter should be merged with the request body: https://openeo.org/documentation/1.0/developers/clients/library-guidelines.html#parameters-3 (seems the page build hasn't finished yet)

This behavior was actually foreseen in the API and was meant to be used as follows:

So for example:
createJob(process, title, description, plan, budget, {"driver-memory": "10g", "executor-memory": "13g"})
would lead to the following request:

POST /jobs
{
    "process": { .... },
    "driver-memory": "10g",
    "executor-memory": "13g",
     ....

It just lacked the actual wording that additional was meant to be merged with the request body.

If you want to stick with job_options on the back-end side, I'd propose to just use the Python client as follows:
createJob(process, title, description, plan, budget, {"job_options": {"driver-memory": "10g", "executor-memory": "13g"}})

That would make at least the clients compatible.

m-mohr · 2020-04-23T14:21:45Z

The page build hasn't finished yet:

The addition is:

additional in createJob and createService (also below in updateJob and updateService): May hold additional key-value-pairs as object. The object should be merged with the body of the request.

And yes, that came from a time where we experimented with custom fields (i.e. publishing resources to the public with a custom publish: true/false flag).

soxofaan · 2020-04-23T14:26:08Z

interesting, thanks

I think we'll stick with "namespacing" these job configuration things under a toplevel "job_options" field

m-mohr · 2022-11-15T11:02:42Z

Reopening this with the idea to allow JSON-schema based exploration of additional options through the /jobs and /services endpoints (we could also add two additional endpoints, if you'd like), similar to how secondary service parameters are defined.

Please note that the primary use case is still top-level options, but it's still possible to expose the job options with small caveats in the rendered UIs.

Example for the VITO job options (I'm not sure about the exact options allowed) exposed in GET /jobs:

{
  "jobs": [...],
  "links": [...],
  "create_parameters": [
    {
      "name": "job_options",
      "schema": {
        "type": "object",
        "properties": {
          "driver-memory": {
            "type": "string",
            "description": "...",
            "default": "2g"
          },
          "executor-memory": {
            "type": "string",
            "description": "...",
            "default": "2g"
          },
          "large-scale": {
            "type": "boolean",
            "default": false
          }
        }
      },
      "description": "...",
      "optional": true,
      "default": {}
    }
  ]
}

m-mohr · 2022-11-15T11:02:58Z

Thoughts, @jdries and @soxofaan ? PR is open at #471

… define the parameters that can be submitted suring creation #276

soxofaan · 2022-11-15T15:20:03Z

  "create_parameters": [
    {
      "name": "job_options",

That means that each backend can choose the actual field name of these job options ("job_options" in this example).
Would it be possible to enforce a choice here, e.g. at API level, or at level of federation extension?

m-mohr · 2022-11-15T17:47:01Z

Your example is not ideal, because in the API the job and service options should be top-level, but you've chosen to do group them in one parameter as an object. So yes, this is free to choose and the place where you could standardize this is the openEO Platform API/federation contract. job_options would not be standardized as part of the openEO API.

soxofaan · 2022-11-21T11:06:15Z

in the API the job and service options should be top-level, but you've chosen to do group them in one parameter as an object.

Indeed, that's because it feels much cleaner like that.
For example see openEOPlatform/documentation#53 for a sample of job options we currently support (note that there are more experimental/ad-hoc options and flags not documented there). It doesn't feel very clean or maintainable to put all these at the top level of the request

m-mohr · 2022-11-21T11:08:04Z

Honestly, I don't see a difference between both approaches.

soxofaan · 2022-11-21T11:10:59Z

Another example: if we are talking about "large area" processing in the aggregator, maybe some options are intended towards the aggregator, some other towards the VITO backend, some others towards the EODC backend, etc. In such a context it feels just more maintainable to have some kind of grouping or tree structure to cleanly separate options.

m-mohr · 2022-11-21T11:18:03Z

I don't quite get what your aim is? The top-level approach has been in the API spec since 0.3 or so and your way of grouping is actually supported, but the grouping itself is not standardized. So are we good or what is your expectation?

The aggregator surely can expose them in multiple groups, too (e.g. eodc_job_options, vito_job_options) and merge what is specified the same way. On the other hand, grouping them in the aggregator may also feel weird from a user perspective... But the aggregator thing is also a completely new story to be discussed separately, I think.

soxofaan · 2022-11-24T17:01:39Z

I understand that anything is possible and that a backend can pick any approach to their liking,
but my concern is how to handle that freedom properly in clients and whether differences between backends will be perceived as inconsistent or annoying.

for example, in the python client, Connection.create_job() has an argument additional (a dictionary of job options) which will be grouped under a top-level field job_options in the request. I'm not sure how clean it will be to also allow the non-grouped variant in a non-breaking way.

m-mohr · 2022-11-24T19:08:13Z

That's the issue if implementors choose to not follow the spec and even integrate proprietary stuff into only one of the clients... The "additional" thing has been there since years and now a decision VITO has taken later should not enforce a change in the spec, sorry.

The JS client for example also has an additional flag for the top level fields and the GEE driver uses the top level fields to make resources public.

Open-EO/openeo-api#276

jdries · 2023-02-08T08:18:46Z

We committed a change in the geotrellis backend to also support top-level job_options. This is backward compatible, and once it's rolled out we can also adjust the Python client to prefer that style.

PS: our proposal for specifying job_options predates the clarifications made to the spec:
Open-EO/openeo-geopyspark-driver@627b380
Meaning, we never intentionally 'choose' not to follow the spec, in fact the spec choose not to follow the only implementation at that time. You even said in this issue that you were open to discussing it when other backends would also require these extra properties, but that never happened. We were even so kind to point out the lack of clear specification in the first place. So case closed for us, but just don't assume that we're somehow implementing changes that intentionally go against the spec.

m-mohr · 2023-02-08T09:00:58Z

@jdries Sorry, I acknowledge that this wasn't bad intention from your side. I'm also always very happy that VITO is very active in pointing out issues and giving feedback. The primary reason for this misunderstanding is that ReDoc earlier showed the "additional properties for batch job" thing in the rendered version nicely and that indicated additional properties should go top-level (through the additionalProperties: true in the openAPI document). At some point the renderer lost this ability and now it doesn't explicitly show this anymore. This is generally an issue as ReDoc doesn't show everything that is formally specified in OpenAPI. Thus the rendered version can never be the only source of truth, unfortunately.

in fact the spec choose not to follow the only implementation at that time.

This is not true. The spec fully allows job_options in the PR and GEE still has an implementation that uses the top-level fields.

Anyway, great that we are coming to a conclusion here so that we can push the open PRs and issues to harmonize this in openEO. Thanks!

* Jobs and services: Added `create_parameters` property in responses to define the parameters that can be submitted suring creation #276 * Add processing options to a separate endpoint #276 * Options should be optional * Move to an extension * Add UDP extension * Apply suggestions from code review Co-authored-by: Stefaan Lippens <[email protected]> * Renamed according to PR review * Clarify where UDPs can come from * Apply suggestions from code review * Add example * Update extensions/processing-parameters/README.md Co-authored-by: Stefaan Lippens <[email protected]> --------- Co-authored-by: Stefaan Lippens <[email protected]>

m-mohr closed this as completed Apr 23, 2020

m-mohr added this to the future milestone Apr 23, 2020

This was referenced Jun 25, 2022

Support adding additional parameters to jobs, services etc. Open-EO/openeo-web-editor#266

Open

Support adding additional parameters to jobs, services etc. Open-EO/openeo-r-client#125

Closed

m-mohr reopened this Nov 15, 2022

m-mohr modified the milestones: future, 1.2.0 Nov 15, 2022

m-mohr added job management incl. /result service management labels Nov 15, 2022

m-mohr self-assigned this Nov 15, 2022

m-mohr added a commit that referenced this issue Nov 15, 2022

Jobs and services: Added create_parameters property in responses to…

0de6500

… define the parameters that can be submitted suring creation #276

m-mohr mentioned this issue Nov 15, 2022

Processing Parameters Extension #276 #545 #471

Merged

m-mohr linked a pull request Nov 15, 2022 that will close this issue

Processing Parameters Extension #276 #545 #471

Merged

m-mohr added a commit that referenced this issue Nov 16, 2022

Add processing options to a separate endpoint #276

e3265f7

m-mohr modified the milestones: 1.2.0, 1.3.0 Dec 21, 2022

jdries added a commit to Open-EO/openeo-python-driver that referenced this issue Feb 7, 2023

support retrieving job_options inline if not specified separately

b00132f

Open-EO/openeo-api#276

jdries mentioned this issue Feb 7, 2023

support retrieving job_options inline if not specified separately Open-EO/openeo-python-driver#165

Merged

m-mohr modified the milestones: 1.3.0, 1.2.0 Feb 10, 2023

m-mohr added work in progress and removed work in progress labels Apr 5, 2023

m-mohr modified the milestones: 1.2.0, 1.3.0 May 2, 2023

jdries assigned soxofaan Aug 1, 2023

soxofaan removed their assignment Aug 2, 2023

soxofaan mentioned this issue Sep 4, 2024

Extra endpoint that shows default job options Open-EO/openeo-python-driver#307

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Batch job options/configuration settings #276

Batch job options/configuration settings #276

soxofaan commented Apr 23, 2020

m-mohr commented Apr 23, 2020 •

edited

Loading

soxofaan commented Apr 23, 2020

soxofaan commented Apr 23, 2020

m-mohr commented Apr 23, 2020 •

edited

Loading

m-mohr commented Apr 23, 2020 •

edited

Loading

soxofaan commented Apr 23, 2020

m-mohr commented Nov 15, 2022 •

edited

Loading

m-mohr commented Nov 15, 2022 •

edited

Loading

soxofaan commented Nov 15, 2022

m-mohr commented Nov 15, 2022 •

edited

Loading

soxofaan commented Nov 21, 2022

m-mohr commented Nov 21, 2022

soxofaan commented Nov 21, 2022

m-mohr commented Nov 21, 2022 •

edited

Loading

soxofaan commented Nov 24, 2022

m-mohr commented Nov 24, 2022 •

edited

Loading

jdries commented Feb 8, 2023

m-mohr commented Feb 8, 2023 •

edited

Loading

Batch job options/configuration settings #276

Batch job options/configuration settings #276

Comments

soxofaan commented Apr 23, 2020

m-mohr commented Apr 23, 2020 • edited Loading

soxofaan commented Apr 23, 2020

soxofaan commented Apr 23, 2020

m-mohr commented Apr 23, 2020 • edited Loading

m-mohr commented Apr 23, 2020 • edited Loading

soxofaan commented Apr 23, 2020

m-mohr commented Nov 15, 2022 • edited Loading

m-mohr commented Nov 15, 2022 • edited Loading

soxofaan commented Nov 15, 2022

m-mohr commented Nov 15, 2022 • edited Loading

soxofaan commented Nov 21, 2022

m-mohr commented Nov 21, 2022

soxofaan commented Nov 21, 2022

m-mohr commented Nov 21, 2022 • edited Loading

soxofaan commented Nov 24, 2022

m-mohr commented Nov 24, 2022 • edited Loading

jdries commented Feb 8, 2023

m-mohr commented Feb 8, 2023 • edited Loading

m-mohr commented Apr 23, 2020 •

edited

Loading

m-mohr commented Apr 23, 2020 •

edited

Loading

m-mohr commented Apr 23, 2020 •

edited

Loading

m-mohr commented Nov 15, 2022 •

edited

Loading

m-mohr commented Nov 15, 2022 •

edited

Loading

m-mohr commented Nov 15, 2022 •

edited

Loading

m-mohr commented Nov 21, 2022 •

edited

Loading

m-mohr commented Nov 24, 2022 •

edited

Loading

m-mohr commented Feb 8, 2023 •

edited

Loading