Fail gracefully #194

Open
Veldhoen opened this issue Dec 20, 2024 · 6 comments
@Veldhoen
Member

Veldhoen commented Dec 20, 2024

Currently, the ASR task in Airflow fails every so often when the Whisper service gets evicted or OOMKilled. The triggerer then asks for the status of a task ID that does not exist in the newly started Whisper service pod.
It would be nice if we could prevent this somehow, so that the same resources don't have to be processed for ASR again.
We should think about a good way to go about this. One proposal that we might discuss:

On receiving SIGTERM, stay alive until the task has finished AND the Airflow trigger has been informed (that is, the API has sent a Status.DONE: status.HTTP_200_OK response to an @api.get("/tasks/{task_id}") request).
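
A minimal sketch of the bookkeeping this proposal would need, using only the Python standard library. The names `ShutdownGate`, `mark_finished`, `mark_acknowledged`, and `install_sigterm_handler` are hypothetical and not part of the worker; the sketch also assumes the process can still answer the status request after SIGTERM arrives, which depends on the web framework. Note that it can only help with evictions: an OOMKilled container receives SIGKILL, which cannot be caught.

```python
import signal
import threading
import time


class ShutdownGate:
    """Tracks whether the worker may exit after a SIGTERM (hypothetical helper)."""

    def __init__(self) -> None:
        self.terminating = threading.Event()
        self._finished = threading.Event()
        self._acknowledged = threading.Event()

    def on_sigterm(self, signum, frame) -> None:
        # Only record the request; the caller decides when to actually exit.
        self.terminating.set()

    def mark_finished(self) -> None:
        # Call when the ASR task has completed.
        self._finished.set()

    def mark_acknowledged(self) -> None:
        # Call after the DONE status has been returned to the triggerer.
        self._acknowledged.set()

    def wait_until_safe(self, timeout: float) -> bool:
        """Block until both conditions hold; return False if the timeout expires."""
        deadline = time.monotonic() + timeout
        for event in (self._finished, self._acknowledged):
            remaining = max(0.0, deadline - time.monotonic())
            if not event.wait(remaining):
                return False
        return True


def install_sigterm_handler(gate: ShutdownGate) -> None:
    # Must be called from the main thread.
    signal.signal(signal.SIGTERM, gate.on_sigterm)
```

The timeout passed to `wait_until_safe` would have to stay below the pod's termination grace period, since the process is killed once that period ends.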

@greenw0lf
Collaborator

@Veldhoen should the proposed solution be implemented? Or is this still up for debate?

@Veldhoen
Member Author

I think it's open for debate so if you have ideas about it: shoot :)

@gb-beng
Contributor

gb-beng commented Jan 16, 2025

Related: #139

greenw0lf self-assigned this Jan 21, 2025
@greenw0lf
Collaborator

UPDATE: FastAPI's shutdown event blocks all incoming requests, so we cannot wait for a GET request for the current task to arrive before shutting down the app. The alternative, for now, is to let the current task finish and then add a check to the Airflow triggerer that looks in S3 to see whether a given program has already been transcribed.
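
A rough sketch of what such a check could look like on the triggerer side. It assumes a boto3-style S3 client (only `list_objects_v2` is used) and that each program's transcript is stored under a known key; the function names, bucket, and key layout are assumptions for illustration, not the project's actual code.

```python
from typing import Any


def already_transcribed(s3_client: Any, bucket: str, key: str) -> bool:
    """Return True if an object with exactly this key exists in the bucket."""
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    # Keys come back in lexicographic order, so an exact match for `key`
    # sorts before any longer key that merely shares the prefix.
    return any(obj["Key"] == key for obj in response.get("Contents", []))


def resolve_missing_task(s3_client: Any, bucket: str, key: str) -> str:
    """Decide what to do when the worker no longer knows the task ID."""
    # "done": the transcript is already in S3, so no reprocessing is needed.
    # "resubmit": nothing was written, so the task has to run again.
    return "done" if already_transcribed(s3_client, bucket, key) else "resubmit"
```

With a real client this would be called as `resolve_missing_task(boto3.client("s3"), bucket, key)` whenever the status request for a task ID comes back as unknown.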

@greenw0lf
Collaborator

This was also discussed with Sara and, for now, we don't see a solution to this issue that could be implemented in the worker rather than in Airflow.
