High CPU utilization causing Kubernetes pod scaling with ddtrace > 2.3.0 #9447
Comments
Thank you for reporting this, @hemantgir. Could you share all relevant environment variables set in the app environment? This will help us understand what bits of Datadog functionality are enabled and disabled in this case.
Thank you for your response. Please find the list of environment variables below: `DD_DBM_PROPAGATION_MODE: disabled`
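For anyone gathering the same information, here is a minimal sketch (the script name and `kubectl` invocation are illustrative, not from the original report) that dumps every Datadog-related environment variable visible to the application process:

```python
# Minimal sketch: print the Datadog-related environment variables seen by the app.
# Run it inside the pod, e.g. `kubectl exec <pod> -- python dump_dd_env.py`
# (file name and invocation are illustrative).
import os

for name in sorted(os.environ):
    if name.startswith(("DD_", "DATADOG_")):
        print(f"{name}={os.environ[name]}")
```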
Did you ever figure this out?
Hi there, I am impacted by this issue as well - Python service running on Kubernetes. I tried 2.14.2, 2.10.0, and 2.9.2; all of these versions caused the initial CPU spike. Any updates on this? It pretty much blocks us from upgrading ddtrace any further.
Accidentally closed this issue and I don't have permission to reopen it.
What Python version do you use?
We are using Python 3.10.14. We were seeing very minor CPU spikes until we upgraded from 2.7.2 to 2.14.2, 2.10.0, or 2.9.2, after which the spike was much bigger and lasted much longer. Versions 2.8.0 and 2.8.1 brought it back to 2.7.2 levels.
Also seeing this after going from 2.7.4 to 2.21.0.
In case it's connected, I filed this bug the other day: #12370
Summary of problem
We have noticed that upgrading ddtrace to any version above 2.3.0 results in a significant increase in CPU utilization, which leads to the maximum number of replicas being deployed.
For instance, our Kubernetes application is configured with an auto-scaling limit of 36 maximum replicas. Prior to the upgrade, our stage environment would typically use only 6-8 pods while idle. However, post-upgrade, we are reaching the upper limit of 36 replicas.
This unexpected behavior suggests that there may be a spike in resource usage introduced in versions above 2.3.0. We would like to understand the cause of this increased resource consumption and seek a solution to optimize it.
Additionally, we updated `datadog_lambda==5.83.0` to be compatible with `ddtrace==2.3.0`.

Maybe a red herring, but we also noticed that calls to `POST /telemetry/proxy/api/v2/apmtelemetry` increase on versions above 2.3.0 (one way to check whether telemetry is involved is sketched below).
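A minimal sketch of one isolation test, assuming the `DD_INSTRUMENTATION_TELEMETRY_ENABLED` setting is honored by the installed ddtrace version (verify against its documentation): disable instrumentation telemetry before ddtrace is imported and check whether the `apmtelemetry` traffic and the CPU spike go away together.

```python
# Isolation-test sketch: turn off ddtrace instrumentation telemetry.
# Assumption: DD_INSTRUMENTATION_TELEMETRY_ENABLED is respected by the installed
# ddtrace version. In Kubernetes this would usually be set on the container spec
# rather than in code; it is shown here only to make the ordering explicit.
import os

os.environ.setdefault("DD_INSTRUMENTATION_TELEMETRY_ENABLED", "false")

import ddtrace  # noqa: E402  # must be imported only after the variable is set
```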

Datadog screenshots (Kubernetes pods are in an idle state):

On ddtrace 2.7.5:
[screenshot] `sum:kubernetes_state.deployment.replicas_available{env:... ,service:...}`
[screenshot] APM: `POST /telemetry/proxy/api/v2/apmtelemetry`

On ddtrace 2.3.0:
[screenshot] `sum:kubernetes_state.deployment.replicas_available{env:... ,service:...}`
[screenshot] APM: `POST /telemetry/proxy/api/v2/apmtelemetry`
Which version of dd-trace-py are you using?
We originally bumped to 2.7.5, but have now downgraded to 2.3.0. We have also tried the latest 2.8.5.
Which version of pip are you using?
pip 24.0
Spike with:
Any version above ddtrace 2.3.0
pip freeze
How can we reproduce your problem?
I'm not sure how you can replicate the issue on your end. We are using Datadog tooling, and we have metrics that continuously monitor the service and report results whether it is idle or handling traffic. A minimal idle-reproduction sketch follows.
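The following is an illustrative sketch, not taken from the actual deployment (`ddtrace.patch_all()` is the long-standing auto-instrumentation entry point, though newer releases prefer `import ddtrace.auto`): a process that patches ddtrace but does no work of its own, so any CPU it accumulates while sleeping comes from ddtrace background threads.

```python
# Idle-reproduction sketch: patch ddtrace, sleep, and report how much CPU the
# process consumed while doing nothing. Assumes ddtrace is installed and the
# Datadog agent is reachable at its default address.
import resource
import time

import ddtrace

ddtrace.patch_all()  # enable auto-instrumentation as a typical service would

start = resource.getrusage(resource.RUSAGE_SELF)
time.sleep(300)  # stay idle for five minutes
end = resource.getrusage(resource.RUSAGE_SELF)

print("user CPU seconds while idle:", end.ru_utime - start.ru_utime)
print("system CPU seconds while idle:", end.ru_stime - start.ru_stime)
```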
What is the result that you get?
High CPU utilization causing Kubernetes pod scaling up to the maximum replicas, even when idle, on ddtrace > 2.3.0.
What is the result that you expected?
CPU utilization and Kubernetes pod scaling only as much as required, on ddtrace > 2.3.0.