
API usage notifications issues with missing alerts at lower thresholds #5069

Open
zachaysan opened this issue Jan 31, 2025 · 0 comments
Labels
bug Something isn't working


We currently have intermittent failures to send API usage notifications to users. In theory, a threshold can be skipped if a user generates a large amount of traffic within a single 12-hour window (which is how often the task runs), but far too many users skip a threshold for that to be the whole explanation. An investigation of their API usage through the charts that connect to InfluxDB has shown us many examples of failed delivery.
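
To make the legitimate skip case concrete, here is a minimal sketch. It assumes the task notifies only on the highest threshold crossed at each run; that behaviour, the threshold values, and the names `THRESHOLDS`, `highest_crossed` and `simulate` are all hypothetical and are not taken from the real task code.

```python
from typing import List, Optional

# Hypothetical thresholds for illustration only; the real values live in the codebase.
THRESHOLDS = [75, 90, 100]


def highest_crossed(percent_usage: float) -> Optional[int]:
    """Return the largest threshold at or below the current usage, if any."""
    crossed = [t for t in THRESHOLDS if percent_usage >= t]
    return max(crossed) if crossed else None


def simulate(usage_per_run: List[float]) -> List[int]:
    """Return the thresholds notified across successive task runs.

    Assumption: each run notifies only the highest threshold crossed,
    and each threshold is notified at most once.
    """
    notified: List[int] = []
    for usage in usage_per_run:
        threshold = highest_crossed(usage)
        if threshold is not None and threshold not in notified:
            notified.append(threshold)
    return notified


# Usage sampled once per 12-hour run.
print(simulate([40, 80, 95]))  # gradual growth: [75, 90]
print(simulate([40, 95]))      # jump within one window: [90], so 75 is skipped
```

Under these assumptions a skipped 75% notification is only expected when usage jumps past two thresholds between consecutive runs, which is why a large number of skips points to a delivery failure rather than traffic spikes.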

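For an incomplete list of organisations with failures, consider the following query, which can be run on Metabase. It returns every notification row belonging to an organisation that has no notification recorded at 75% usage: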

SELECT
  "public"."organisations_organisationapiusagenotification"."id" AS "id",
  "public"."organisations_organisationapiusagenotification"."percent_usage" AS "percent_usage",
  "public"."organisations_organisationapiusagenotification"."notified_at" AS "notified_at",
  "public"."organisations_organisationapiusagenotification"."created_at" AS "created_at",
  "public"."organisations_organisationapiusagenotification"."updated_at" AS "updated_at",
  "public"."organisations_organisationapiusagenotification"."organisation_id" AS "organisation_id"
FROM
  "public"."organisations_organisationapiusagenotification"
WHERE
  organisation_id NOT IN (
    SELECT organisation_id
    FROM "public"."organisations_organisationapiusagenotification"
    WHERE percent_usage = 75
  )
ORDER BY
  "public"."organisations_organisationapiusagenotification"."organisation_id" DESC

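If the rows are exported from Metabase, the same check can be sketched in Python so that other thresholds can be inspected without editing the SQL. This is only an illustration: the function name `organisations_missing_threshold` and the sample rows are made up, and the input is assumed to be `(organisation_id, percent_usage)` pairs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def organisations_missing_threshold(
    rows: Iterable[Tuple[int, int]], threshold: int = 75
) -> Dict[int, List[int]]:
    """Map each organisation that lacks `threshold` to the thresholds it does have.

    Only organisations with at least one notification above `threshold`
    are reported, since those are the ones that skipped it.
    """
    seen = defaultdict(set)
    for organisation_id, percent_usage in rows:
        seen[organisation_id].add(percent_usage)
    return {
        organisation_id: sorted(values)
        for organisation_id, values in seen.items()
        if threshold not in values and any(v > threshold for v in values)
    }


# Made-up sample rows: (organisation_id, percent_usage)
sample_rows = [(1, 75), (1, 90), (2, 90), (2, 100), (3, 100)]
print(organisations_missing_threshold(sample_rows))
```

Like the query, this cannot tell a legitimate skip from a failed delivery on its own; that still requires comparing against the usage data in InfluxDB.
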
Locations of the Code

See the handle_api_usage_notifications task in the organisations/tasks.py module and the handle_api_usage_notification_for_organisation function in the organisations/task_helpers.py module. Both modules have been reviewed carefully by hand, and logs have been pulled from AWS, but the offending code has yet to be found.

We have also gone through the task status and verified that the task runs are not timing out.

The Goal

The goal of this ticket is to figure out why the lower thresholds are not firing correctly. Because API usage notifications drive many systems, including automatically billing customers for API usage overages, it is important to get to the bottom of this.
