Deduplicate members on re-import across all importers #131

Merged
SiteRelEnby merged 2 commits into main from import-dedup on Jun 11, 2026
@SiteRelEnby

Contributor

What

Adds member deduplication to every importer (PluralKit, SimplyPlural, Tupperbox, PluralSpace, Prism, and the Sheaf native re-import). Re-importing the same export no longer doubles the member list: each incoming member is matched against the system's existing roster before anything is written.

Matching is by PluralKit ID where present (exact, so PK round-trips cleanly) and otherwise by the name blind-index, scoped by is_custom_front so a member and a custom front that happen to share a name never merge. A new conflict_strategy option chooses the behaviour on a match: skip (default, leave the existing member untouched and add nothing), update (overwrite the existing member's importable fields from the export), or create (the old append-everything behaviour, kept as an explicit escape hatch).
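The matching rule and the three strategies could be sketched roughly as follows. This is an illustration only, not the code in sheaf/services/import_dedup.py: the `Member` dataclass, `blind_index`, `find_match`, and `resolve` are hypothetical names, the HMAC-based blind index is an assumed construction, and the sketch assumes that an incoming member carrying a PluralKit ID is matched by that ID alone, without falling back to the name.

```python
from __future__ import annotations

import hashlib
import hmac
from dataclasses import dataclass
from enum import Enum


class ConflictStrategy(str, Enum):
    SKIP = "skip"      # default: leave the existing member untouched
    UPDATE = "update"  # overwrite importable fields from the export
    CREATE = "create"  # old behaviour: always append


@dataclass
class Member:
    # Hypothetical, minimal shape; the real model has many more fields.
    name_index: str
    is_custom_front: bool = False
    pluralkit_id: str | None = None


def blind_index(name: str, key: bytes) -> str:
    """Assumed construction: keyed hash of a normalised name."""
    normalised = name.strip().casefold()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def find_match(incoming: Member, existing: list[Member]) -> Member | None:
    """Return the existing member that the incoming one duplicates, if any."""
    if incoming.pluralkit_id:
        # Exact PluralKit ID match (assumption: no name fallback here).
        for member in existing:
            if member.pluralkit_id == incoming.pluralkit_id:
                return member
        return None
    for member in existing:
        # Name match is scoped by is_custom_front, so a member and a
        # custom front that share a name never merge.
        if (
            member.name_index == incoming.name_index
            and member.is_custom_front == incoming.is_custom_front
        ):
            return member
    return None


def resolve(
    incoming: Member, existing: list[Member], strategy: ConflictStrategy
) -> tuple[str, Member | None]:
    """Decide what to do with one incoming member: create, skip, or update."""
    if strategy is ConflictStrategy.CREATE:
        return ("create", None)
    match = find_match(incoming, existing)
    if match is None:
        return ("create", None)
    return (strategy.value, match)
```

Keeping the decision separate from the write means an import can be planned in full before anything touches the database.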

The tier member cap now counts only the members an import would actually create, so re-importing into a near-full system no longer trips the cap on members that already exist. New members_skipped / members_updated counts surface on the import detail page (the counts grid renders them automatically). Each import flow gains an "If a member already exists" selector.
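As a rough illustration of the cap change, the check can be run against only the planned "create" actions rather than the full size of the export. The function and exception names below are hypothetical, and the action strings are an assumed representation of the per-member decision; only the members_skipped / members_updated count names come from this PR.

```python
from __future__ import annotations

from collections import Counter


class MemberCapExceeded(Exception):
    """Hypothetical error raised when an import would exceed the tier cap."""


def summarise(actions: list[str]) -> dict[str, int]:
    """Tally planned per-member actions into import counts."""
    tally = Counter(actions)
    return {
        "members_created": tally["create"],
        "members_skipped": tally["skip"],
        "members_updated": tally["update"],
    }


def enforce_member_cap(
    current_members: int, actions: list[str], cap: int | None
) -> None:
    """Raise if the members this import would actually create exceed the cap."""
    to_create = sum(1 for action in actions if action == "create")
    if cap is not None and current_members + to_create > cap:
        raise MemberCapExceeded(
            f"import would create {to_create} members, "
            f"exceeding the cap of {cap} (currently {current_members})"
        )
```

With a cap of 50 and 49 existing members, re-importing those 49 plus one new member plans 49 skips and 1 create, so the check passes where counting every member in the export would not.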

Scope

Deduplication is member-scoped. Fronts, groups, journals, messages, polls, and reminders are still appended on re-import, so re-importing those sections over existing data can still duplicate them. Custom-field values are the one dependent section made idempotent here, because the Sheaf native importer also dedupes field definitions and the (field, member) pair has a uniqueness constraint. Broadening dedup to the other dependent sections is tracked as a follow-up.
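One way to picture why custom-field values can be made idempotent is an upsert keyed on the (field, member) pair. The sketch below uses an in-memory dict in place of the database table, and the function name is hypothetical; whether the real importer overwrites or keeps an existing value on re-import is not stated here, so the overwrite is an assumption.

```python
from __future__ import annotations


def upsert_field_value(
    store: dict[tuple[str, str], str],
    field_id: str,
    member_id: str,
    value: str,
) -> bool:
    """Write a custom-field value; return True only if a new row was created.

    The (field_id, member_id) key stands in for the uniqueness constraint,
    so importing the same value twice can never produce a second row.
    """
    key = (field_id, member_id)
    created = key not in store
    store[key] = value  # assumption: re-import overwrites the stored value
    return created
```

Sections without such a natural key (fronts, journals, messages) have nothing equivalent to upsert against, which is consistent with them still being appended.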

Notes for review

The update strategy overwrites privacy from the source, so a PluralKit re-import with update would reset a member you had made public back to PluralKit's default. This is consistent with "refresh from source" but is a mild footgun; happy to exclude privacy from the update set if that reads better.
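If privacy were excluded from the update set, the change could be as small as an exclusion list on the field copy. The field names and the `apply_update` helper below are hypothetical and only illustrate the shape of that option, not the current implementation.

```python
from __future__ import annotations

# Hypothetical set of fields an importer is allowed to overwrite.
IMPORTABLE_FIELDS = ("display_name", "pronouns", "description", "privacy")


def apply_update(
    existing: dict,
    incoming: dict,
    exclude: frozenset[str] = frozenset(),
) -> list[str]:
    """Copy importable fields from incoming onto existing.

    Fields named in `exclude` are left as they are. Returns the list of
    fields that actually changed.
    """
    changed = []
    for field in IMPORTABLE_FIELDS:
        if field in exclude or field not in incoming:
            continue
        if existing.get(field) != incoming[field]:
            existing[field] = incoming[field]
            changed.append(field)
    return changed
```

Calling `apply_update(member, export_row, exclude=frozenset({"privacy"}))` would refresh the other fields while keeping a privacy setting changed locally.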

Validation

Shared logic lives in sheaf/services/import_dedup.py with a 13-test unit suite. Added re-import skip/update/create integration tests plus a regression test that the PluralKit member HID lands in pluralkit_id. Full importer suite plus the export/import parity round-trip run green (109 passed); backend ruff clean; frontend type-check, lint, and build clean.

Every importer (PluralKit, SimplyPlural, Tupperbox, PluralSpace, Prism, and the Sheaf native re-import) now matches each incoming member against the system's existing roster before writing, so re-importing the same export no longer doubles the member list.

Matching is by PluralKit ID where present (exact) and otherwise by the name blind-index, scoped by is_custom_front so a member and a custom front sharing a name never collide. A new conflict_strategy option picks the behaviour on a match: skip (default, leave the existing member alone), update (overwrite the existing member's importable fields), or create (the old append-everything behaviour, kept as an escape hatch).

The tier member cap now counts only the members an import would actually create, so re-importing into a near-full system no longer trips the cap on members that already exist. New members_skipped / members_updated counts surface on the import detail page.

Deduplication is member-scoped: fronts, groups, journals, messages, polls, and reminders are still appended on re-import. Shared logic lives in sheaf/services/import_dedup.py with unit coverage, plus re-import skip/update/create integration tests and a regression test that the PluralKit member HID lands in pluralkit_id.
@SiteRelEnby SiteRelEnby enabled auto-merge June 11, 2026 05:16
@SiteRelEnby SiteRelEnby disabled auto-merge June 11, 2026 05:48
@SiteRelEnby SiteRelEnby merged commit 0fa1ed6 into main Jun 11, 2026
2 checks passed
@SiteRelEnby SiteRelEnby deleted the import-dedup branch June 11, 2026 05:48