Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(AP-6075): Add publisher & consumer level message deduplication #236

Open
wants to merge 37 commits into
base: main
Choose a base branch
from

Conversation

Drodevbar
Copy link
Collaborator

@Drodevbar Drodevbar commented Jan 10, 2025

This PR adds publisher-level and consumer-level message deduplication mechanisms.

Changelog:

  1. add redis-message-deduplication-store package as ioredis-based deduplication storage (utilizes redis-sempaphore under the hood)
  2. adjust SQS & SNS publishers and SNS publisher manager to support publisher-level deduplication
  3. adjust SQS consumer to support consumer-level deduplication

@Drodevbar Drodevbar force-pushed the feat/AP-6075-producer-level-dedup branch 2 times, most recently from 3e77c00 to e49d745 Compare January 10, 2025 14:13
@Drodevbar Drodevbar force-pushed the feat/AP-6075-producer-level-dedup branch from e49d745 to 68266a6 Compare January 10, 2025 14:21
@Drodevbar Drodevbar force-pushed the feat/AP-6075-producer-level-dedup branch from 275b919 to 470cd49 Compare January 13, 2025 14:49
@Drodevbar Drodevbar force-pushed the feat/AP-6075-producer-level-dedup branch from 470cd49 to 972e853 Compare January 13, 2025 14:53
@Drodevbar Drodevbar force-pushed the feat/AP-6075-producer-level-dedup branch from 017adad to c770cd8 Compare January 13, 2025 20:16
@Drodevbar Drodevbar changed the title feat(AP-6075): [WIP] Add producer level message deduplication feat(AP-6075): Add producer level message deduplication Jan 14, 2025
@Drodevbar Drodevbar marked this pull request as ready for review January 14, 2025 10:05
@Drodevbar Drodevbar requested a review from kibertoad January 14, 2025 10:06
@Drodevbar Drodevbar requested a review from eduard-fa January 14, 2025 11:54
@Drodevbar Drodevbar force-pushed the feat/AP-6075-producer-level-dedup branch from 688bbcc to 8aa1a12 Compare January 15, 2025 13:00
@Drodevbar Drodevbar force-pushed the feat/AP-6075-producer-level-dedup branch from 8aa1a12 to 1344903 Compare January 15, 2025 13:09
@Drodevbar Drodevbar changed the title feat(AP-6075): Add producer level message deduplication feat(AP-6075): Add publisher level message deduplication Jan 16, 2025
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
messageTypeToConfigMap: Record<string, TConfig>
}

export enum ConsumerMessageDeduplicationKeyStatus {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if message fails with "Error, do not retry"? (e. g. payload cannot be parsed or does not pass validation).

Are we / should we mark the message as PROCESSED?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've addressed 2., 3. and 4. scenarios from the below list:

  1. message validation fails and we remove the message from the queue
  2. retryLater and message should be retried
  3. retryLater and message should not be retried (due to exceeding the retries)
  4. error during processing the message

In cases 2., 3. and 4. I'm releasing the lock before changing the message status to allow further processing. In the case of 1, at this stage, the lock has not yet been acquired, so we don't need to do anything specific.

Hope it makes sense 🙌

@Drodevbar Drodevbar requested a review from kibertoad January 29, 2025 18:28
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@@ -248,6 +274,11 @@ export abstract class AbstractQueueService<
? // @ts-ignore
message[this.messageTypeField]
: undefined
const messageDeduplicationId =
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to reduce probability of errors, if (consumer)publisherDeduplication is enabled, and deduplicationId field is not present, I'd throw an error

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kibertoad Even if publisher/consumer-level deduplication is enabled, it's still possible to skip it by omitting the deduplication ID. This is by design as our publishers support multiple message types, so if dedup is enabled on the publisher level, not all message types will most likely support it.

Previously, this was being configured per message type on the consumer/publisher/manager level, but I've adjusted it to pass it as message metadata as discussed there with Krzysztof and Eduards: #236 (comment)

Let me know if it makes sense.

Drodevbar and others added 2 commits February 4, 2025 10:21
Co-authored-by: Igor Savin <[email protected]>
Co-authored-by: Igor Savin <[email protected]>
@Drodevbar Drodevbar requested a review from kibertoad February 4, 2025 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants