Several ZGW services communicate with each other. The most common (and potentially problematic) example is open-zaak or objecten-api calling open-notificaties.
I have now seen several times that things start to break down when there is a misconfiguration or a moderately large load: response times increase, REST calls fail, and database usage maxes out.
Recently, I experienced a misconfiguration of open-zaak in which the authentication for open-notificaties was incorrect, resulting in an HTTP 403 for every notification sent. Open-zaak kept trying to send notifications at about 10K calls per minute. In turn, this led to database usage sitting at 100% continuously (I'm not sure why; probably there is some sort of database check for each call?). All systems started to degrade in performance and functionality.
I suggest that all ZGW components introduce some kind of back pressure, and ideally a circuit breaker, to prevent components from being overloaded. A circuit breaker will also help with alerting on problems before the impact becomes so large that end users start to complain.
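To make the suggestion concrete, here is a minimal sketch of a circuit breaker in Python. It is not taken from any ZGW component: the class name `CircuitBreaker`, the `failure_threshold` and `reset_timeout` parameters, and the `send_notification` stand-in are all hypothetical, and the sketch is single-threaded for brevity. The idea is that after a number of consecutive failures the caller stops sending requests for a while, instead of hammering the failing service.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Hypothetical, non-thread-safe circuit breaker (illustration only)."""

    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold  # consecutive failures before tripping
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # timeout elapsed: trial calls are allowed
        return "open"

    def call(self, func: Callable[[], T]) -> T:
        if self.state == "open":
            # Fail fast without touching the downstream service.
            raise CircuitOpenError("circuit open; call skipped")
        try:
            result = func()
        except Exception:
            self._failures += 1
            # Trip on reaching the threshold, or re-trip if a trial call failed.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result


if __name__ == "__main__":
    def send_notification() -> None:
        # Hypothetical stand-in for the HTTP call to the notifications API.
        raise PermissionError("HTTP 403 Forbidden")

    breaker = CircuitBreaker(failure_threshold=3, reset_timeout=60.0)
    for attempt in range(5):
        try:
            breaker.call(send_notification)
        except CircuitOpenError as exc:
            print(f"attempt {attempt}: skipped ({exc})")
        except PermissionError as exc:
            print(f"attempt {attempt}: failed ({exc})")
```

A state change to "open" is also a natural point to emit a log line or metric, which is where the alerting benefit would come from. A production version would additionally need locking and probably a limit on concurrent trial calls in the half-open state.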
Added value
better monitoring capabilities
preventing cascading failures after a component fails
Additional context
No response
Theme
Objecten API