Bug description
When Tutor is asked to run a K8S job (through the K8sTaskRunner runner), it first checks that other jobs have completed. The problem is that it checks the state of ALL jobs in the namespace, not only those related to Tutor. If a job unrelated to Tutor is running in the namespace, the Tutor K8S job is delayed until that job completes. This is a problem in any namespace where additional jobs run independently of Tutor.
A possible solution is to use label selectors on the jobs, so that Tutor only waits for the completion of jobs related to Tutor.
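To illustrate the idea, here is a minimal sketch of the difference between checking every job in the namespace and checking only labelled ones. It is not Tutor's actual implementation: the `active_jobs` and `matches` helpers are hypothetical, the `app.kubernetes.io/managed-by: tutor` label is an assumption about how Tutor-managed jobs could be identified, and the input is assumed to have the shape of `kubectl get jobs -o json` output, with a non-zero `status.active` taken to mean "still running".

```python
from typing import Any, Dict, List, Optional

# Assumption: Tutor-managed jobs carry this label. Check the labels in your
# own rendered manifests before relying on it.
TUTOR_SELECTOR = {"app.kubernetes.io/managed-by": "tutor"}


def matches(labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """Equality-based selector: every key/value pair must be present."""
    return all(labels.get(key) == value for key, value in selector.items())


def active_jobs(
    job_list: Dict[str, Any], selector: Optional[Dict[str, str]] = None
) -> List[str]:
    """Return the names of jobs that still have active pods.

    `job_list` is a dict shaped like `kubectl get jobs -o json` output.
    With no selector, every job in the namespace is considered.
    """
    names = []
    for job in job_list.get("items", []):
        metadata = job.get("metadata", {})
        if selector and not matches(metadata.get("labels") or {}, selector):
            continue  # Not a job we care about: ignore it.
        if job.get("status", {}).get("active", 0) > 0:
            names.append(metadata.get("name", ""))
    return names


# Sample data: one unrelated job that never finishes, one finished Tutor job.
# The name "lms-job" is made up for the example.
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "infinite-job", "labels": {}},
            "status": {"active": 1},
        },
        {
            "metadata": {
                "name": "lms-job",
                "labels": {"app.kubernetes.io/managed-by": "tutor"},
            },
            "status": {"succeeded": 1},
        },
    ]
}

blocking_without_selector = active_jobs(SAMPLE)
blocking_with_selector = active_jobs(SAMPLE, TUTOR_SELECTOR)
```

Without a selector, the unrelated `infinite-job` is reported as still active and would block the next job; with the selector it is ignored. With kubectl, the equivalent filter would be the `--selector` (`-l`) option of `kubectl get jobs`.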
How to reproduce
I used Minikube to test this behavior (K8S version 1.31, with the same kubectl version).
Create a new namespace in the cluster.
Install Tutor, create a new Tutor environment and run tutor config save.
Change the kubectl context to point to the namespace you just created in the cluster.
Create an infinite-running job in the namespace. This can be something like:
apiVersion: batch/v1
kind: Job
metadata:
  name: infinite-job
spec:
  template:
    spec:
      restartPolicy: Never # Can also be "OnFailure"
      containers:
        - name: infinite-loop
          image: busybox
          command: ["sh", "-c", "while true; do echo 'Running...'; sleep 60; done"]
  backoffLimit: 0
Initialize Tutor in K8S by running tutor k8s launch -I.
Check that the initialization of the services does not start, because the previously created job is still running.
Delete the infinite-running job. Once this is done, the K8S initialization runs with no issues.