resource limiting: complete rework to handle context cancelation and native rate-limiting on a template (#196)
This introduces a complete rework of resource usage by implementing
contexts, so that acquiring a lock can be cancelled if the lock takes
too long to obtain, or if the instance is shutting down.
It also introduces a native resource on all resolutions, based on the
template name: it makes it possible to reduce the concurrent execution
of a given template, or to completely stop the execution of a template
if it behaves incorrectly.
Signed-off-by: Romain Beuque <[email protected]>
README.md (+42 -6)
@@ -356,6 +356,7 @@ Note that the operators `IN` and `NOTIN` expect a list of acceptable values in t
 - `dependencies`: a list of step names on which this step waits before running
 - `custom_states`: a list of personalised allowed states for this step (can be assigned to the step's state using `conditions`)
 - `retry_pattern`: (`seconds`, `minutes`, `hours`) define on what temporal order of magnitude the re-runs of this step should be spread (default = `seconds`)
+- `resources`: a list of resources that will be used during the step execution, to control and limit the concurrent execution of the step (more information in [the resources section](#resources)).
+Resources are a way to restrict the concurrency factor of certain operations, to control the throughput and avoid dangerous behavior, e.g. flooding the targets.
+
+High level view:
+
+- For each action to execute, a list of target `resources` is determined. (see later)
+- In the µTask configuration, numerical limits can be set for each _resource_ label. This acts as a semaphore, allowing a certain number of concurrent slots for the given _resource_ label. If no limit is set for a resource label, the previously mentioned target resources have no effect. Limits are declared in the `resource_limits` property.
+
+The target _resources_ for a step can be defined in its YAML definition, using the `resources` property.
+
+```yaml
+steps:
+  foobar:
+    description: A dummy step, that should not execute in parallel
+    resources: ["myLimitedResource"]
+    action:
+      type: echo
+      configuration:
+        output:
+          foobar: fuzz
+```
+
+Alternatively, some target resources are determined automatically by µTask Engine:
+
+- When a task is run, the resource `template:my-template-name` is used automatically.
+- When a step is run, the plugin in charge of the execution automatically generates a list of resources. This includes generic resources such as `socket`, `url:www.example.org`, `fork`...
+  allowing the µTask administrator to set up generic limits such as `"socket": 48` or `"url:www.example.org": 1`.
+
+Each builtin plugin declares resources which can be discovered using the _README_ of the plugin (example for [_http_ plugin](./pkg/plugins/builtin/script/README.md#Resources)).
+
+Declared `resource_limits` must be non-negative integers. When a step is executed, if the maximum number of concurrent executions is reached, the µTask Engine will wait for a slot to be released. If the resource is limited to the `0` value, then the step will not be executed and is set to the `TO_RETRY` state; it will be run once the instance allows the execution of its resources. The default time that µTask Engine will wait for a resource to become available is `1 minute`, but it can be configured using the `resource_acquire_timeout` property.
+
 ### Task templates validation

 A JSON-schema file to validate the syntax of task templates is available in `hack/template-schema.json`.
engine/engine.go (+42 -14)
@@ -36,8 +36,8 @@ var (
 	eng Engine

 	// Used for stopping the current Engine
-	stopRunningSteps chan struct{}
-	gracePeriodEnd   chan struct{}
+	shutdownCtx    context.Context
+	gracePeriodEnd chan struct{}
 )

 // Engine is the heart of utask: it is the active process
@@ -97,13 +97,11 @@ func Init(ctx context.Context, wg *sync.WaitGroup, store *configstore.Store) err
 	}

 	// channels for handling graceful shutdown
-	stopRunningSteps = make(chan struct{})
+	shutdownCtx = ctx
 	gracePeriodEnd = make(chan struct{})
 	eng.wg = wg
 	go func() {
-		<-ctx.Done()
-		// Stop running new steps
-		close(stopRunningSteps)
+		<-shutdownCtx.Done()

 		// Wait for the grace period to end
 		time.Sleep(3 * time.Second)
@@ -215,14 +213,43 @@ func (e Engine) launchResolution(publicID string, async bool, sm *semaphore.Weig

 	res.Values.SetConfig(e.config)

-	// all ready, run remaining steps
-
-	utask.AcquireExecutionSlot()
-
+	// check if all resources are available before starting the resolution
+	// first, check if we have a custom semaphore, for example, a semaphore that limits the concurrent execution of tasks recovery from a crashed instance.
+	// This semaphore needs to go first, because it will always be smaller than the global execution pool.
 	if sm != nil {
-		sm.Acquire(context.Background(), 1)
+		if err := sm.Acquire(shutdownCtx, 1); err != nil {
+			debugLogger.Debugf("Engine: launchResolution() %s acquire resource: instance is shutting down", res.PublicID)
+			return nil, errors.New("instance is shutting down")
+		}
+	}
+	// second, check if we have a resource limit on the current template
+	// template could be completely deactivated as a "dead resource". If it's the case, we need to exit because
+	// the limit won't change until next instance's reboot.