Default otel reporter timeout of 900s is too long. #1670
Comments
I agree it is probably a bit too much, but reducing it should not be the solution to services freezing at shutdown. At shutdown, exporters/processors should be given a grace period to shut down cleanly, and if they don't, the pipeline should force them to shut down and exit. We should have a different issue for that. |
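To make the grace-period idea above concrete, here is a minimal sketch and not the SDK's actual shutdown path. `shutdown_with_grace_period` and its parameters are hypothetical names used only for illustration: the shutdown callable runs in a daemon thread, the caller waits at most `grace_seconds`, and a stuck shutdown cannot keep the interpreter alive.

```python
import threading
import time


def shutdown_with_grace_period(shutdown_fn, grace_seconds):
    """Run shutdown_fn with a deadline (hypothetical helper, not an SDK API).

    Returns True if shutdown_fn finished within grace_seconds, False if the
    caller should stop waiting and let the process exit anyway.
    """
    # daemon=True so a stuck shutdown cannot block interpreter exit.
    worker = threading.Thread(target=shutdown_fn, daemon=True)
    worker.start()
    worker.join(timeout=grace_seconds)
    return not worker.is_alive()


if __name__ == "__main__":
    # A shutdown that returns promptly finishes inside the grace period.
    print(shutdown_with_grace_period(lambda: None, 1.0))
    # A shutdown stuck retrying is abandoned once the grace period elapses.
    print(shutdown_with_grace_period(lambda: time.sleep(5), 0.1))
```

The daemon thread is a deliberate choice here: the pipeline gives up on the exporter rather than interrupting it, which is the simplest way to guarantee the process can exit.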
@owais |
I think these are two separate issues. |
I think we can address both. Java has a default timeout of 10 seconds; maybe we can do the same? |
Related #346 |
SGTM |
@lzchen OTLP Python also has a default timeout of 10 seconds for Export. Lines 183 to 187 in 24edd3d
The 900 seconds is there to deal with transient errors via an exponential back-off retry strategy. This might still be too long, but I just want to make sure we are not confusing one with the other. Lines 247 to 298 in 24edd3d |
@lonewolf3739 |
900 seconds is too high for a maximum backoff time. Generally it is either 32 or 64 seconds, and sometimes it can be a little higher depending on the use case. I couldn't find the maximum value recommended by OTLP, but I assume it wouldn't be as high as 900 seconds. We can probably raise an issue on the spec repo. |
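For a rough sense of scale, the sketch below compares how long a process could spend sleeping between retries under a 900-second cap versus a 64-second cap. It is an illustration only and not the exporter's code: `backoff_delays` and `total_wait_before_giving_up` are hypothetical helpers, and it assumes delays double starting from 1 second and that the exporter gives up as soon as the delay reaches the cap.

```python
def backoff_delays(max_value, base=2, factor=1):
    """Yield factor * base**n, ending with max_value once the cap is reached.

    Hypothetical helper for illustration; the doubling schedule is an
    assumption, not taken from the exporter source.
    """
    n = 0
    while True:
        delay = factor * base ** n
        if delay >= max_value:
            yield max_value
            return
        yield delay
        n += 1


def total_wait_before_giving_up(max_value):
    """Sum the sleeps taken before the delay reaches the cap.

    Assumes (for illustration) that reaching the cap means "give up",
    so the capped delay itself is never slept.
    """
    return sum(d for d in backoff_delays(max_value) if d < max_value)


if __name__ == "__main__":
    for cap in (900, 64):
        print(cap, list(backoff_delays(cap)), total_wait_before_giving_up(cap))
```

Under these assumptions a 900-second cap allows roughly 17 minutes of accumulated sleeping (1023 seconds), while a 64-second cap allows about one minute (63 seconds), not counting the time spent in each export attempt itself.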
Thanks for digging into this @lonewolf3739. Agreed, we should clarify with the spec or at least with other SIG maintainers. And we should probably have separate tickets for possibly reducing the exponential backoff max time and for adding a grace period to exporters on shutdown. |
This issue was marked stale due to lack of activity. It will be closed in 30 days. |
I brought this up in a spec SIG meeting some time back, and they agreed this is too long but wouldn't recommend any value for now, as that requires some work. So as of now it is up to the language implementations to decide. |
@lonewolf3739 |
Until the spec is updated, it is up to each language SIG to decide what to do. |
@lonewolf3739 |
Yes, I think that is a very reasonable value. |
Assigning self to address this and related issues. |
Please let me know when this is planned to be changed; I'm ready to help with this change. |
Since this timeout can apply while a process is closing or exiting, the process can spend a significant amount of time just retrying before it finally ends.
For example, it's easy to misconfigure the target service e.g.:
which can cause processes to easily hang at exit (and print nothing by default).