
SimplyPlural importer: real-world export robustness + chat import #134

Merged
SiteRelEnby merged 2 commits into main from sp-import-format-robustness on Jun 12, 2026

Conversation

@SiteRelEnby

Contributor

Summary

Hardens the SimplyPlural importer against the real-world export variants that tidy test fixtures never produce (the shapes that were failing or silently dropping data on live imports), and adds opt-in SP chat-message import. Cross-referenced throughout against the Prism app's SP importer.

Changes

Robustness:

  • Collections read as either a JSON array or a map keyed by id (a map-keyed collection previously crashed the whole job).
  • Full timestamp normaliser: integer/float epoch millis, numeric strings, zone-less ISO strings, and Firebase {_seconds, _nanoseconds} objects. The old int-only converter silently dropped (or crashed on) the other shapes, which lost entire front histories.
  • Variant collection keys: frontStatuses/customFronts, frontHistory with a fronters fallback, system profile under settings or users[0].
  • Avatars built from avatarUuid + owner id (still policy-gated through the existing avatar sanitiser), and 8-hex ARGB colours.
  • Wrong-typed name/description/colour fields are coerced away instead of crashing.
  • The import detail page now also logs what the export contained alongside what imported, so a partial import is diagnosable at a glance.
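
As a rough illustration of the first two bullets, a minimal sketch of a collection reader and a timestamp normaliser might look like the following. The function names (`iter_collection`, `normalise_timestamp`), the `id` field name, and the choice to treat zone-less ISO strings as UTC are assumptions made for this example, not the code in this PR.

```python
from datetime import datetime, timezone
from typing import Any, Iterator, Optional


def iter_collection(raw: Any) -> Iterator[dict]:
    """Yield records from a collection exported as a list or as a map keyed by id."""
    if isinstance(raw, list):
        for item in raw:
            if isinstance(item, dict):
                yield item
    elif isinstance(raw, dict):
        for key, item in raw.items():
            if isinstance(item, dict):
                # Assumption: use the map key as "id" unless the record carries its own.
                yield {"id": key, **item}
    # Any other shape (None, a string, a number) yields nothing instead of raising.


def _from_millis(millis: float) -> Optional[datetime]:
    """Convert epoch milliseconds to an aware UTC datetime, or None if out of range."""
    try:
        return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    except (OverflowError, OSError, ValueError):
        return None


def normalise_timestamp(value: Any) -> Optional[datetime]:
    """Return an aware UTC datetime, or None when the shape is not recognised."""
    if isinstance(value, bool):  # bool is an int subclass, never a timestamp
        return None
    if isinstance(value, (int, float)):
        return _from_millis(float(value))
    if isinstance(value, dict):  # Firebase {_seconds, _nanoseconds}
        secs = value.get("_seconds")
        nanos = value.get("_nanoseconds", 0)
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in (secs, nanos)
        )
        return _from_millis(secs * 1000 + nanos / 1_000_000) if numeric else None
    if isinstance(value, str):
        text = value.strip()
        try:
            return _from_millis(float(text))  # numeric string of epoch millis
        except ValueError:
            pass
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        try:
            parsed = datetime.fromisoformat(text)
        except ValueError:
            return None
        if parsed.tzinfo is None:
            # Assumption: a zone-less ISO string is interpreted as UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    return None
```

Returning `None` for an unrecognised shape, rather than raising, lets a caller count bad timestamps and emit one aggregated warning instead of failing the whole job.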

Chat import (opt-in via a new messages option, off by default since chat can be large):

  • SP multi-channel chat collapses onto the Sheaf system board, each message prefixed by its channel name when more than one channel is present. Authors resolve to imported members, reply threads are rebuilt, and SP <###@id###> mention tokens are rewritten to @name.
  • Reads both export shapes (the messages channel map and the flat chatMessages array) and the sender/timestamp/body field aliases.
  • Legacy bodies still encrypted in SP's old, undocumented format are detected (16-byte base64 IV + non-plaintext base64 content) and skipped with a warning advising a fresh export. No decryption is attempted: the format isn't published and no client decrypts it.
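
The mention rewrite, channel prefix, and encryption heuristic described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the importer's code: the function names, the `[channel] ` prefix format, leaving unresolved mention tokens untouched, and treating "non-plaintext" as "does not decode as UTF-8" are all choices made for this example.

```python
import base64
import binascii
import re
from typing import Any, Mapping, Optional

# SP mention tokens look like <###@id###>; the id character class is an assumption.
MENTION_TOKEN = re.compile(r"<###@([^#<>]+)###>")


def rewrite_mentions(body: str, names_by_id: Mapping[str, str]) -> str:
    """Replace <###@id###> with @name; unknown ids are left as-is (assumption)."""
    def _sub(match: "re.Match[str]") -> str:
        name = names_by_id.get(match.group(1))
        return f"@{name}" if name else match.group(0)
    return MENTION_TOKEN.sub(_sub, body)


def with_channel_prefix(body: str, channel_name: str, channel_count: int) -> str:
    """Prefix the channel name only when the export has more than one channel."""
    if channel_count > 1 and channel_name:
        return f"[{channel_name}] {body}"  # hypothetical prefix format
    return body


def _b64decode_strict(text: str) -> Optional[bytes]:
    """Decode strict base64, returning None for anything that is not valid base64."""
    try:
        return base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return None


def looks_encrypted(iv: Any, body: Any) -> bool:
    """Heuristic: a 16-byte base64 IV plus base64 content that is not plain text."""
    if not isinstance(iv, str) or not isinstance(body, str):
        return False
    iv_bytes = _b64decode_strict(iv)
    if iv_bytes is None or len(iv_bytes) != 16:
        return False
    body_bytes = _b64decode_strict(body)
    if not body_bytes:
        return False
    try:
        body_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return True  # decoded payload is binary, so treat it as ciphertext
    return False  # decodes cleanly as text, so treat it as plaintext
```

A message flagged by `looks_encrypted` would be counted and skipped, with the warning carrying only the count and never the body.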

Testing

  • ruff check sheaf/ passes
  • cd web && npm run lint && npx tsc --noEmit passes (no frontend changes)
  • Existing tests pass (SP/TB runner, import API, SP gap bundle, export/import parity)
  • New tests added (map collections, timestamp shapes, variant keys, avatar/ARGB, malformed types, chat import in both shapes, encryption detection, reply chains, mentions, channel prefix)
  • Tested manually (covered by the automated suite)

Security / privacy impact

Article 9 data. The chat-message encryption handling is detect-and-skip only: no decryption, and no message content is ever quoted into the import event log (which admins can view). Imported avatar URLs route through the existing sanitize_external_avatar_url policy gate, so the avatarUuid construction can't bypass the hotlink/scheme rules. Wrong-typed values are dropped rather than interpolated into error text, so a malformed field can't leak member content into a log.
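
To make the last point concrete, here is a minimal sketch of dropping a wrong-typed value while logging only the field name and the type. `coerce_str` and its warning format are hypothetical names for this example, not the importer's actual helper.

```python
from typing import Any, List, Optional


def coerce_str(value: Any, field: str, warnings: List[str]) -> Optional[str]:
    """Return value if it is a string; otherwise drop it and log only its type."""
    if value is None or isinstance(value, str):
        return value
    # Only the field name and the type name reach the log, never the value itself,
    # so a malformed field cannot carry member content into admin-visible events.
    warnings.append(f"{field}: expected string, got {type(value).__name__}; dropped")
    return None
```

For example, `coerce_str({"note": "private"}, "description", warnings)` returns `None` and records `description: expected string, got dict; dropped`.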

Screenshots

No UI changes.

Cross-referenced with prism's sp_parser.dart. Handles the SP export variants that tidy fixtures miss but live (Mongo/Firebase) exports hit: collections as array OR map-keyed-by-id; a full timestamp normaliser (int/float millis, numeric string, zone-less ISO, Firebase {_seconds,_nanoseconds}) replacing the int-only converter that silently dropped or crashed on the other shapes; variant collection keys (frontStatuses||customFronts, frontHistory->fronters fallback, settings||users[0] system profile); avatar construction from avatarUuid + owner id; ARGB (8-hex, alpha-first) colours. Also folds in the defensive string coercion, the aggregated bad-timestamp warning, and the parse-stage input-counts event from the earlier logging pass. Adds tests for each shape; all SP/TB runner, import-API, and SP-gap-bundle tests pass.
Opt-in SP chat import (off by default - chat can be large). Collapses SP's multi-channel chat onto the Sheaf system board with a channel-name prefix when multi-channel, resolves authors to imported members, rebuilds reply threads, and rewrites SP mention tokens to @name. Reads both export shapes (messages channel-map and the flat chatMessages array) and the sender/timestamp/body field aliases.

Legacy message bodies encrypted in SP's undocumented format are DETECTED (16-byte base64 iv + non-plaintext base64 content, heuristic ported from prism) and skipped with a warning - we don't decrypt (no published format, no client does), and no chat content is quoted into the log. New messages option + messages_imported/skipped/encrypted_skipped counters. CHANGELOG covers the full Tier 1 + Tier 2 SP work. Tests for both shapes, encryption detection, reply chains, mentions, and channel prefixing; all SP/TB, import-API, SP-gap, and parity suites pass.
SiteRelEnby enabled auto-merge June 11, 2026 23:50
SiteRelEnby merged commit 3894dac into main Jun 12, 2026
4 checks passed
SiteRelEnby deleted the sp-import-format-robustness branch June 12, 2026 00:03