
Added custom timeout to ci #7641

Open
wants to merge 3 commits into
base: main

Conversation

@kitsiosk kitsiosk commented Jul 2, 2025

Change Summary

Added custom timeout for build_mingw job of the CI-mingw workflow based on historical data.

More details

Over the last 7456 successful runs, the build_mingw job had a maximum runtime of 13 minutes (mean = 4 minutes, std = 2 minutes).

However, some runs fail only after reaching the 6-hour limit that GitHub imposes. In other words, these jobs seem to get stuck, possibly for external or random reasons.

One such example is this job run, which failed after 6 hours. More stuck jobs have been observed over the last six months, the first on 2025-01-01 and the last on 2025-04-09; more recent occurrences are also possible because our dataset has a cutoff date around late May. With the proposed changes, a total of 45 hours would have been saved retrospectively over the last six months, clearing the queue for other workflows, speeding up the project's CI, and saving resources in general.

The idea is to set a timeout to stop jobs that run much longer than their historical maximum, because such jobs are probably stuck and will simply fail with a timeout at 6 hours.

Our PR proposes to set the timeout to max + 3*std = 13 + 3*2 = 19 minutes, where max and std (standard deviation) are derived from the history of 7456 successful runs. This should provide sufficient margin if the workflow gets naturally slower in the future, but if you would prefer a lower or higher threshold, we would be happy to adjust it.
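For reference, the threshold arithmetic can be sketched in a few lines of Python. This is only an illustration using the rounded figures quoted above (max = 13 minutes, std = 2 minutes). The function name `proposed_timeout_minutes` and the multiplier parameter `k` are hypothetical and not part of the PR or of any tooling we used.

```python
import math


def proposed_timeout_minutes(max_minutes: float, std_minutes: float, k: float = 3.0) -> int:
    """Return max + k * std, rounded up to a whole minute.

    Hypothetical helper: it only mirrors the max + 3*std rule described
    in this PR, applied to summary statistics of successful runs.
    """
    return math.ceil(max_minutes + k * std_minutes)


# Rounded statistics quoted in the PR description (in minutes).
timeout = proposed_timeout_minutes(max_minutes=13, std_minutes=2)
print(timeout)  # 19
```

The resulting value is what would go into the job's `timeout-minutes` setting in the workflow file. Passing a larger `k` would give a more conservative threshold.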

Context

Hi,

We are a team of researchers from the University of Zurich, and we are currently working on energy optimizations in GitHub Actions workflows.

Thanks for your time on this.

Feel free to let us know (here or at the email address below) if you have any questions.

Best regards,
Konstantinos Kitsios
[email protected]

@danmar
Owner

danmar commented Jul 3, 2025

It sounds good to me.

@kitsiosk
Author

kitsiosk commented Jul 3, 2025

The CI failures seem unrelated to the changes; they are in a different YAML file than the one I changed. Is this something intermittent?

@danmar
Owner

danmar commented Jul 4, 2025

Is this something intermittent?

I guess so. I saw similar intermittent problems before.

@chrchr-github
Collaborator

Seems related to #7433 somehow

@kitsiosk
Author

kitsiosk commented Jul 7, 2025

Seems related to #7433 somehow

Indeed, it's the same library but not the same error.

Should I try commenting out the changes to see if it still fails?

@chrchr-github
Collaborator

Should I try commenting out the changes to see if it still fails?

That might be helpful as a sanity check.

@danmar
Owner

danmar commented Jul 9, 2025

Indeed, it's the same library but not the same error.

I saw another PR with the same QPair problem a week ago, and that failure was resolved when I reran the CI.

@chrchr-github
Collaborator

Indeed, it's the same library but not the same error.

I saw another PR with the same QPair problem a week ago, and that failure was resolved when I reran the CI.

We're up to 8 reruns now, and the failure persists... Elsewhere the CI runs fine.

@kitsiosk
Author

kitsiosk commented Jul 9, 2025

Should I try commenting out the changes to see if it still fails?

That might be helpful as a sanity check.

Hi @chrchr-github, just did this, let's see what happens. I think you need to approve the workflows to run.

3 participants