[zuul] Add logging config and tests to functional-autoscaling-tests-osp18 #273

Open: elfiesmelfie wants to merge 2 commits into master from ci/combine-autoscaling-logging

elfiesmelfie
Collaborator

@elfiesmelfie elfiesmelfie commented May 30, 2025

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/0abffecca8bd45d6b5c9a785d549b469

✔️ openstack-k8s-operators-content-provider SUCCESS in 3h 27m 31s
functional-autoscaling-tests-osp18 TIMED_OUT in 3h 13m 18s
✔️ functional-logging-tests-osp18 SUCCESS in 1h 15m 05s
functional-graphing-tests-osp18 FAILURE in 1h 40m 29s (non-voting)

@elfiesmelfie elfiesmelfie force-pushed the ci/combine-autoscaling-logging branch from ed8ff1a to bf6e661 Compare June 3, 2025 17:29

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/e2cc7665f2784dadb74303bc395ee688

✔️ openstack-k8s-operators-content-provider SUCCESS in 3h 28m 57s
functional-autoscaling-tests-osp18 TIMED_OUT in 3h 13m 22s
✔️ functional-logging-tests-osp18 SUCCESS in 1h 15m 53s
functional-graphing-tests-osp18 FAILURE in 1h 17m 13s (non-voting)

@elfiesmelfie elfiesmelfie force-pushed the ci/combine-autoscaling-logging branch from bf6e661 to 820ee28 Compare June 4, 2025 18:43

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/442ed5cd2f1a4ce8a358b04f681ec814

functional-autoscaling-tests-osp18 TIMED_OUT in 1h 13m 11s
✔️ functional-logging-tests-osp18 SUCCESS in 1h 15m 55s

@elfiesmelfie elfiesmelfie force-pushed the ci/combine-autoscaling-logging branch from 820ee28 to 61b60db Compare June 5, 2025 18:28
@vyzigold
Contributor

vyzigold commented Jun 5, 2025

Found this while doing my round as CI watcher. Interesting. I wonder what exactly is causing the job to take so long. I took a quick look at the must-gather, and it looks like logging is enabled and, as far as I can tell, it might even work. But autoscaling and metricstorage are disabled now. Looking at https://github.com/openstack-k8s-operators/telemetry-operator/blob/main/ci/vars-logging.yml and https://github.com/openstack-k8s-operators/telemetry-operator/blob/main/ci/vars-autoscaling.yml, both define the cifmw_edpm_prepare_kustomizations variable, which basically causes autoscaling to never get enabled. The kustomizations will either need to be merged manually, or maybe there is some ci-framework magic that could take 2 kustomizations and apply both of them.
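
To illustrate the override described above, here is a minimal Python sketch of what happens when two vars files define the same key and the loader keeps only the last definition (Ansible replaces rather than merges variables by default), next to a manual merge that keeps both. Everything except the variable name is an assumption made for illustration: the list shape of cifmw_edpm_prepare_kustomizations, the entry names, the load order, and both helper functions are hypothetical and do not come from ci-framework or from the two vars files.

```python
# Hypothetical sketch: models vars files as plain dicts, not real ci-framework code.

KEY = "cifmw_edpm_prepare_kustomizations"

# Assumed shapes: each file sets KEY to a list of kustomization entries.
# The entry contents are placeholders, not the real file contents.
vars_autoscaling = {KEY: [{"name": "enable-autoscaling (hypothetical)"}]}
vars_logging = {KEY: [{"name": "enable-logging (hypothetical)"}]}


def load_vars_last_wins(*var_files):
    """Later files replace earlier values for the same key (no merging)."""
    merged = {}
    for file_vars in var_files:
        merged.update(file_vars)
    return merged


def load_vars_concat_lists(key, *var_files):
    """Same as above, but concatenate the list stored under `key` across files."""
    merged = load_vars_last_wins(*var_files)
    merged[key] = [item for file_vars in var_files for item in file_vars.get(key, [])]
    return merged


# Assumed load order: autoscaling first, logging second.
overridden = load_vars_last_wins(vars_autoscaling, vars_logging)
combined = load_vars_concat_lists(KEY, vars_autoscaling, vars_logging)

print([entry["name"] for entry in overridden[KEY]])  # only the logging entry survives
print([entry["name"] for entry in combined[KEY]])    # both entries are kept
```

Whether the real kustomizations can simply be concatenated, or need a true structural merge, depends on their contents, which this sketch does not model.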

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/fd047a41935540b685aacead8486df86

functional-autoscaling-tests-osp18 FAILURE in 4h 33m 48s
✔️ functional-logging-tests-osp18 SUCCESS in 1h 12m 06s

@elfiesmelfie
Collaborator Author

recheck

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/982813f0ba1a48de8460662a75ba5e7f

✔️ functional-autoscaling-tests-osp18 SUCCESS in 1h 58m 01s
functional-logging-tests-osp18 FAILURE in 1h 04m 09s

@elfiesmelfie
Collaborator Author

> Found this while doing my round as CI watcher. Interesting. I wonder what exactly is causing the job to take so long. I took a quick look at the must-gather, and it looks like logging is enabled and, as far as I can tell, it might even work. But autoscaling and metricstorage are disabled now. Looking at https://github.com/openstack-k8s-operators/telemetry-operator/blob/main/ci/vars-logging.yml and https://github.com/openstack-k8s-operators/telemetry-operator/blob/main/ci/vars-autoscaling.yml, both define the cifmw_edpm_prepare_kustomizations variable, which basically causes autoscaling to never get enabled. The kustomizations will either need to be merged manually, or maybe there is some ci-framework magic that could take 2 kustomizations and apply both of them.

The main issue is that the logging vars have changed, but the logging job has not been updated to deploy metrics storage or autoscaling. This will eventually be addressed in openstack-k8s-operators/telemetry-operator#698.
This PR was the spike part of the work, where I hack things together with whatever workarounds are needed, in order to identify the effort required to do it properly.

@elfiesmelfie elfiesmelfie marked this pull request as draft June 9, 2025 17:51
@elfiesmelfie elfiesmelfie force-pushed the ci/combine-autoscaling-logging branch from 61b60db to ef08365 Compare June 13, 2025 12:32

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/2b0bd6d62ca944ab9d13a9948904e7b5

functional-autoscaling-tests-osp18 FAILURE in 1h 32m 48s
✔️ functional-logging-tests-osp18 SUCCESS in 1h 13m 24s

@elfiesmelfie elfiesmelfie force-pushed the ci/combine-autoscaling-logging branch from ef08365 to 3b38791 Compare June 13, 2025 17:13

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/9b45b588cd9749ff90411c522a15e46c

functional-autoscaling-tests-osp18 RETRY_LIMIT in 3m 18s
✔️ functional-logging-tests-osp18 SUCCESS in 1h 08m 04s

@elfiesmelfie
Collaborator Author

recheck

@elfiesmelfie elfiesmelfie force-pushed the ci/combine-autoscaling-logging branch 2 times, most recently from 01d408d to d366758 Compare June 18, 2025 14:42
@elfiesmelfie elfiesmelfie marked this pull request as ready for review June 18, 2025 14:43

This change depends on a change that failed to merge.

Change openstack-k8s-operators/telemetry-operator#698 is needed.

@elfiesmelfie
Collaborator Author

recheck

@elfiesmelfie elfiesmelfie force-pushed the ci/combine-autoscaling-logging branch from d366758 to e8b7dc7 Compare June 20, 2025 12:48
@elfiesmelfie elfiesmelfie enabled auto-merge (squash) June 20, 2025 14:16
@elfiesmelfie elfiesmelfie force-pushed the ci/combine-autoscaling-logging branch from e8b7dc7 to 0b63833 Compare June 23, 2025 15:06
@elfiesmelfie elfiesmelfie force-pushed the ci/combine-autoscaling-logging branch from 0b63833 to 71a6c56 Compare June 26, 2025 18:39

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/19451bc59e4a4a12ac2a371fb3c8d5a1

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 53m 53s
functional-autoscaling-tests-osp18 POST_FAILURE in 2h 38m 13s
✔️ functional-logging-tests-osp18 SUCCESS in 1h 08m 15s
functional-graphing-tests-osp18 FAILURE in 1h 08m 53s (non-voting)

2 participants