Commit 38c8a97

armenzg authored and priscilawebdev committed
fix(deletions): Delete seer matched group hash metadata first (#102612)
When deleting a GroupHash row, all GroupHashMetadata rows pointing to it via `seer_matched_grouphash` need updating (see code): https://github.com/getsentry/sentry/blob/698262018e6009759d8562e2da63be749df7c32d/src/sentry/models/grouphashmetadata.py#L115-L118 Before #101720, we would only delete GroupHash rows and that would time out because we would stomp queries longer than 30 seconds. In #101720 we added the deletion of the GroupHashMetadata rows but we should have also added the updating. The new code will have these three stages: ``` GroupHashMetadata.objects.filter(seer_matched_grouphash_id__in=hash_ids).update(seer_matched_grouphash=None) GroupHashMetadata.objects.filter(grouphash_id__in=hash_ids).delete() GroupHash.objects.filter(id__in=hash_ids).delete() ``` Fixes [SENTRY-5ABJ](https://sentry.sentry.io/issues/6930113529/). For posterity, this is the top of the stack trace: ``` OperationalError canceling statement due to user request SQL: UPDATE "sentry_grouphashmetadata" SET "seer_matched_grouphash_id" = NULL WHERE "sentry_grouphashmetadata"."seer_matched_grouphash_id" IN (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s) ```
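The ordering of the three stages can be illustrated outside of Django. The following is a minimal sketch using Python's built-in `sqlite3` module, not the Sentry implementation: the table names, the `build_db` helper, and the `null_references_first` parameter are all hypothetical, and the foreign keys here are enforced by SQLite itself rather than handled by an ORM. It shows why a metadata row that belongs to a surviving hash, but references a hash being deleted, has to be nulled out before the hashes can go.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create a toy schema (hypothetical, simplified) with three hashes."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE grouphash (id INTEGER PRIMARY KEY);
        CREATE TABLE grouphashmetadata (
            id INTEGER PRIMARY KEY,
            grouphash_id INTEGER NOT NULL REFERENCES grouphash(id),
            seer_matched_grouphash_id INTEGER REFERENCES grouphash(id)
        );
        """
    )
    conn.executemany("INSERT INTO grouphash (id) VALUES (?)", [(1,), (2,), (3,)])
    # Metadata 2 points at hash 1; metadata 3 (a surviving hash) points at hash 2.
    conn.executemany(
        "INSERT INTO grouphashmetadata VALUES (?, ?, ?)",
        [(1, 1, None), (2, 2, 1), (3, 3, 2)],
    )
    conn.commit()
    return conn


def delete_group_hashes(conn, hash_ids, null_references_first=True):
    """Run the three stages in order for one chunk of hash ids."""
    marks = ", ".join("?" for _ in hash_ids)
    if null_references_first:
        # Stage 1: clear references to the hashes that are about to be deleted.
        conn.execute(
            "UPDATE grouphashmetadata SET seer_matched_grouphash_id = NULL "
            f"WHERE seer_matched_grouphash_id IN ({marks})",
            hash_ids,
        )
    # Stage 2: delete the metadata rows owned by those hashes.
    conn.execute(f"DELETE FROM grouphashmetadata WHERE grouphash_id IN ({marks})", hash_ids)
    # Stage 3: delete the hashes themselves.
    conn.execute(f"DELETE FROM grouphash WHERE id IN ({marks})", hash_ids)
    conn.commit()


if __name__ == "__main__":
    db = build_db()
    delete_group_hashes(db, [1, 2])
    print(db.execute("SELECT * FROM grouphashmetadata").fetchall())
```

In this toy schema, skipping stage 1 makes stage 3 fail with `sqlite3.IntegrityError`, because the metadata row of hash 3 still references hash 2. The production failure described above was a statement timeout rather than a constraint error, so the sketch only demonstrates the dependency between the stages, not the performance behaviour.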
1 parent adac080 commit 38c8a97

2 files changed: 16 additions and 14 deletions

src/sentry/deletions/defaults/group.py

Lines changed: 10 additions & 14 deletions
@@ -258,20 +258,16 @@ def delete_group_hashes(
             logger.warning("Error scheduling task to delete hashes from seer")
         finally:
             hash_ids = [gh[0] for gh in hashes_chunk]
-            # If we delete the grouphash metadata rows first we will not need to update the references to the other grouphashes.
-            # If we try to delete the group hashes first, then it will require the updating of the columns first.
-            #
-            # To understand this, let's say we have the following relationships:
-            # gh A -> ghm A -> no reference to another grouphash
-            # gh B -> ghm B -> gh C
-            # gh C -> ghm C -> gh A
-            #
-            # Deleting group hashes A, B & C (since they all point to the same group) will require:
-            # * Updating columns ghmB & ghmC to point to None
-            # * Deleting the group hash metadata rows
-            # * Deleting the group hashes
-            #
-            # If we delete the metadata first, we will not need to update the columns before deleting them.
+            # GroupHashMetadata rows can reference GroupHash rows via seer_matched_grouphash_id.
+            # Before deleting these GroupHash rows, we need to either:
+            # 1. Update seer_matched_grouphash to None first (to avoid foreign key constraint errors), OR
+            # 2. Delete the GroupHashMetadata rows entirely (they'll be deleted anyway)
+            # If we update the columns first, the deletion of the grouphash metadata rows will have less work to do,
+            # thus, improving the performance of the deletion.
+            if options.get("deletions.group-hashes-metadata.update-seer-matched-grouphash-ids"):
+                GroupHashMetadata.objects.filter(seer_matched_grouphash_id__in=hash_ids).update(
+                    seer_matched_grouphash=None
+                )
             GroupHashMetadata.objects.filter(grouphash_id__in=hash_ids).delete()
             GroupHash.objects.filter(id__in=hash_ids).delete()
 

src/sentry/options/defaults.py

Lines changed: 6 additions & 0 deletions
@@ -342,6 +342,12 @@
     type=Int,
     flags=FLAG_AUTOMATOR_MODIFIABLE,
 )
+register(
+    "deletions.group-hashes-metadata.update-seer-matched-grouphash-ids",
+    default=False,
+    type=Bool,
+    flags=FLAG_AUTOMATOR_MODIFIABLE,
+)
 
 register(
     "deletions.group-history.use-bulk-deletion",
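
Because the new option is registered with `default=False`, the extra update stage stays off until the option is flipped. The sketch below illustrates that gating with a toy in-memory option store. The `register`, `get`, and `set_option` functions and the `planned_stages` helper are hypothetical stand-ins written for this example and are not Sentry's options API; only the option key is taken from the diff.

```python
from typing import Any

# Hypothetical in-memory option store, for illustration only.
_DEFAULTS: dict[str, Any] = {}
_OVERRIDES: dict[str, Any] = {}


def register(key: str, default: Any) -> None:
    """Declare an option together with its default value."""
    _DEFAULTS[key] = default


def get(key: str) -> Any:
    """Return the override if one is set, otherwise the registered default."""
    if key not in _DEFAULTS:
        raise KeyError(f"unknown option: {key}")
    return _OVERRIDES.get(key, _DEFAULTS[key])


def set_option(key: str, value: Any) -> None:
    """Override a registered option at runtime."""
    if key not in _DEFAULTS:
        raise KeyError(f"unknown option: {key}")
    _OVERRIDES[key] = value


OPTION = "deletions.group-hashes-metadata.update-seer-matched-grouphash-ids"
register(OPTION, default=False)


def planned_stages() -> list[str]:
    """List the deletion stages that would run for one chunk of hash ids."""
    stages = []
    if get(OPTION):
        # Only runs once the option has been turned on.
        stages.append("set seer_matched_grouphash to None")
    stages.append("delete GroupHashMetadata rows")
    stages.append("delete GroupHash rows")
    return stages


if __name__ == "__main__":
    print(planned_stages())
    set_option(OPTION, True)
    print(planned_stages())
```

Gating the new stage this way means the change can ship with the old two-stage behaviour intact and be enabled separately from the deploy.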
