
Conversation


@BacLuc BacLuc commented Jul 31, 2025

For a long time I have been unhappy with the way we set the variables in our helm deployment for ecamp3.
e.g.:

```
--set imageTag=${{ github.sha }} \
--set frontend.image.repository='docker.io/${{ vars.DOCKER_HUB_USERNAME }}/ecamp3-frontend' \
--set print.image.repository='docker.io/${{ vars.DOCKER_HUB_USERNAME }}/ecamp3-print' \
--set api.image.repository='docker.io/${{ vars.DOCKER_HUB_USERNAME }}/ecamp3-api' \
--set apiCache.image.repository='docker.io/${{ vars.DOCKER_HUB_USERNAME }}/ecamp3-varnish' \
--set postgresql.dbBackupRestoreImage.repository='docker.io/${{ vars.DOCKER_HUB_USERNAME }}/ecamp3-db-backup-restore' \
--set termsOfServiceLinkTemplate='https://ecamp3.ch/{lang}/tos' \
--set newsLink='https://ecamp3.ch/blog' \
--set helpLink='https://ecamp3.ch/faq' \
--set domain=${{ env.domain }} \
```

The problems:

  • It is difficult to read
  • It is error prone during refactoring (renaming or moving a key)
  • It is currently duplicated, and the staging workflow can more or less only be tested in staging
  • You can make a typo in the var/secret name or in the YAML path of the value
  • If you don't do proper null handling, you accidentally overwrite the default value
  • If you forget the quotes and have a multi-line secret, you get an error (see the sketch below)
  • If you forget the backslash at the end of a line, you get an error
  • If you leave a backslash on the last line, you get an error
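
For illustration, the quoting pitfall looks roughly like this (a hypothetical example, not taken from our workflows):

```bash
# a multi-line secret, e.g. a PEM key
SECRET=$'first line
second line'

# unquoted: word splitting breaks the value apart; helm sees someKey=first
# and "line", "second", "line" as stray extra arguments and fails
helm upgrade my-release ./chart --set someKey=$SECRET

# quoted: the whole multi-line value is passed as a single argument
helm upgrade my-release ./chart --set someKey="$SECRET"
```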

The arguments against helmfile and against "changing something":

  • It works, and we don't change it much
  • With helmfile we add yet another layer of abstraction

My vision:
We have a deploy script for every resource we deploy (currently everything with helm, later with helmfile where needed)
which can be run locally during development.
The same script is executed for all environments, so it can already be tested in development and in the feature branch deployment.
The script has a diff mode that only shows what would change, and so do the CI jobs.
We don't run the tests for the staging and prod deployment; they already ran on devel.
To add a new configurable value, you only need to add it to the helmfile; you don't have to repeat the YAML path in the CI script.
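
A minimal sketch of what such a script could look like (file layout and argument names are assumptions, not the final implementation):

```bash
#!/usr/bin/env bash
# deploy.sh — hypothetical sketch: one entry point for dev, feature branches, staging and prod.
set -euo pipefail

SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
ENVIRONMENT=${1:?usage: deploy.sh <environment> [diff|apply]}
MODE=${2:-diff}

# All values and YAML paths live in the helmfile;
# the CI job only chooses the environment and the mode.
helmfile --file "$SCRIPT_DIR/helmfile.yaml" --environment "$ENVIRONMENT" "$MODE"
```

A CI job would then call something like `./deploy.sh staging diff` in the pull request check and `./deploy.sh prod apply` on merge.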

Default values:
The default values live, if possible:

  1. In the application code
  2. In the Containerfile

If the dev values differ from prod, then the default value for prod is in the values file template.

For dev, feature branch deployments and staging, the values that differ have to be overridden.
(It would be possible to have multiple sets of default values, but I don't think we need that.)
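
A rough sketch of how this could look in a helmfile (the release and file names are assumptions for illustration):

```yaml
# helmfile.yaml — hypothetical sketch: prod defaults in the values template,
# small override files for the other environments.
environments:
  prod: {}
  staging:
    values:
      - values.staging.yaml
  dev:
    values:
      - values.dev.yaml

releases:
  - name: ecamp3-logging
    namespace: logging
    chart: fluent/fluent-operator
    values:
      - values.yaml.gotmpl   # prod defaults, rendered per environment
```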

This is just the example for ecamp3-logging.

@BacLuc BacLuc added the Meeting Discuss Am nächsten Core-Meeting besprechen label Jul 31, 2025
@BacLuc BacLuc requested a review from a team July 31, 2025 19:09
@BacLuc BacLuc had a problem deploying to ecamp3-logging-dev July 31, 2025 19:10 — with GitHub Actions Error
@BacLuc BacLuc force-pushed the proposal-helmfile branch from 156664a to 16d5d30 July 31, 2025 19:32
@manuelmeister (Member) commented:

Core Meeting Decision

Concept approved!

@manuelmeister manuelmeister removed the Meeting Discuss Am nächsten Core-Meeting besprechen label Aug 11, 2025

@carlobeltrame carlobeltrame left a comment

As discussed in the meeting, if this works well it'd be good to apply the same mechanism to all our deployments for consistency.

```
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
```
You need the helmfile in addition to kubectl and helm.
@carlobeltrame (Member):

Instructions or a link on how to install the helmfile command might be useful. But maybe the wiki is a better place for that.
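
For reference, one way to install it (a sketch; check the helmfile releases page for current instructions):

```bash
# Homebrew (macOS or Linux)
brew install helmfile

# helmfile's diff/apply subcommands need the helm-diff plugin;
# `helmfile init` checks for and installs missing dependencies.
helmfile init
```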


```
fi

envsubst < $SCRIPT_DIR/values.yaml > $SCRIPT_DIR/values.out.yaml
echo "ELASTIC_NODE_STORAGE_SIZE can not be automatically enlarged, see: https://github.com/kubernetes/enhancements/pull/4651 and https://github.com/kubernetes/kubernetes/issues/68737"
```
@carlobeltrame (Member):

So now we always output this warning, but what can be done about it?

@BacLuc (Contributor, Author):

It is just a warning that you will hopefully find if you accidentally changed the size and the statefulset does not come up again.
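
For context, the usual manual workaround looks roughly like this (a sketch; it assumes the storage class allows volume expansion, and the PVC and statefulset names are placeholders):

```bash
# enlarge the PVC directly; the statefulset's volumeClaimTemplates can't be changed in place
kubectl patch pvc data-elastic-0 --patch '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'

# recreate the statefulset object without deleting its pods or PVCs
kubectl delete statefulset elastic --cascade=orphan

# then re-deploy with the new ELASTIC_NODE_STORAGE_SIZE so the rendered manifest matches
```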

```diff
@@ -2,8 +2,8 @@ fluent-operator:
   containerRuntime: containerd
   operator:
     annotations:
-      trigger-recreate: ${RANDOM_STRING}
+      trigger-recreate: {{ exec "uuidgen" (list) }}
     container:
```
@carlobeltrame (Member):

I don't understand any of the changes in this file. Care to explain?

@BacLuc (Contributor, Author):

The operator is implemented well enough to survive in dev and to recreate the fluentd and fluentbit instances when we deploy.
In prod it starts to crashloop after a while.
If we then deploy, the fluentbits and fluentd don't get recreated.
This is still a hack to recreate the operator's deployment after each deploy.
Now everything is in one place; before, we created a random string in shell and then injected it with envsubst.
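
Roughly, the old approach looked like this (a reconstruction for illustration, not the exact previous code):

```bash
# before: the deploy script generated the random string and injected it via envsubst;
# values.yaml contained the line  trigger-recreate: ${RANDOM_STRING}
RANDOM_STRING=$(uuidgen)
export RANDOM_STRING
envsubst < values.yaml > values.out.yaml
```

With helmfile, the values template calls `uuidgen` itself through the built-in `exec` template function, so the shell indirection disappears.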

@BacLuc BacLuc requested a review from a team August 17, 2025 08:33