Skip to content

Comments

Add full-document-key opt to mongo connector.#366

Merged
adiom-mark merged 1 commit intomainfrom
mongofdk
Feb 11, 2026
Merged

Add full-document-key opt to mongo connector.#366
adiom-mark merged 1 commit intomainfrom
mongofdk

Conversation

@adiom-mark
Copy link
Collaborator

@adiom-mark adiom-mark commented Feb 6, 2026

Summary by CodeRabbit

  • New Features

    • Added --full-document-key CLI flag for the MongoDB connector.
    • MongoDB connector can derive composite document IDs from full document keys and include full document data in change-stream updates when enabled.
  • Bug Fixes / Improvements

    • CosmosDB change-stream updates now include the _id field in produced identifiers for deletes/updates.
    • Reduced redundant update/write operations and improved change-stream processing reliability.

@coderabbitai
Copy link

coderabbitai bot commented Feb 6, 2026

📝 Walkthrough

Walkthrough

Adds a typed MongoUpdate and a FullDocumentKey option/CLI flag; refactors Mongo change-stream decoding and write-path ID filter construction (including util.KeepLastUpdate); sets change stream fullDocument lookup; and adds Name: "_id" to Cosmos BsonValue IDs.

Changes

Cohort / File(s) Summary
MongoDB connector core
connectors/mongo/conn.go
Introduce MongoUpdate type; add ConnectorSettings.FullDocumentKey; change convertChangeStreamEventToUpdate to (MongoUpdate, fullDocumentKey bool); decode streams into MongoUpdate; use ClusterTime for timestamps; call SetFullDocument("updateLookup"); import connectors/util and remove xxhash.
MongoDB write path
connectors/mongo/conn.go
Use util.KeepLastUpdate to filter/dedupe updates; build per-id idFilter supporting composite keys when FullDocumentKey is enabled; replace per-item indexing with ReplaceOneModel / DeleteOneModel using constructed filters.
CLI flags
internal/app/options/connectorflags.go
Add --full-document-key flag and bind it to settings.FullDocumentKey on ConnectorSettings.
Cosmos connector IDs & API type
connectors/cosmos/conn.go, gen/adiom/v1
Set Name: "_id" on BsonValue entries produced for update/delete IDs; add Name string field to adiomv1.BsonValue usages/type.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through streams where clustered times glint,
I stitch document keys, composite or mint,
FullDocument or _id, I sort and I keep,
I prune older hops so new changes leap,
A tiny rabbit coder—quick, tidy, and spry.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding a full-document-key option to the MongoDB connector, which is reflected in the FullDocumentKey flag addition and related refactoring.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch mongofdk

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@connectors/mongo/conn.go`:
- Around line 920-928: The ID filter construction loop incorrectly uses
update.GetData() for each ID element, so change the BSON value to use
idPart.GetData() instead: in the loop that builds idFilter (iterating updates,
idPart), replace the value passed to bson.RawValue from update.GetData() to
idPart.GetData() so each bson.E uses the individual ID part bytes; ensure the
Type remains bsontype.Type(idPart.GetType()) and the variable names idFilter,
idPart, and update are preserved.
🧹 Nitpick comments (3)
connectors/mongo/conn.go (3)

750-754: Unchecked type assertions on change.NS map values could panic.

Lines 752–753 assert change.NS["db"] and change.NS["coll"] to string without a comma-ok check. If a malformed change event omits these fields (or they decode as a non-string type), this panics the goroutine.

Practically low risk since MongoDB always provides these fields, but a guard or structured NS type would be safer.

🛡️ Optional: defensive extraction
-		db := change.NS["db"].(string)
-		col := change.NS["coll"].(string)
+		db, _ := change.NS["db"].(string)
+		col, _ := change.NS["coll"].(string)
+		if db == "" || col == "" {
+			slog.Error("Change stream event missing namespace fields", "ns", change.NS)
+			continue
+		}

918-939: Empty models slice will cause BulkWrite to error.

If all updates are filtered to have no matching ID parts (unlikely but possible with misconfiguration), or if updates is empty after KeepLastUpdate, models stays nil. The MongoDB driver's BulkWrite returns an error on an empty write models slice. Consider guarding.

🛡️ Suggested guard
+	if len(models) == 0 {
+		return connect.NewResponse(&adiomv1.WriteUpdatesResponse{}), nil
+	}
 	_, err := col.BulkWrite(ctx, models, options.BulkWrite().SetOrdered(false))

722-722: SetFullDocument("updateLookup") — use the typed constant for safety.

This line would benefit from using moptions.UpdateLookup instead of the raw string "updateLookup". The MongoDB Go driver exposes options.UpdateLookup as a typed constant, which guards against typos and is the recommended approach. The raw string works functionally, but the constant is safer.

@adiom-mark adiom-mark force-pushed the mongofdk branch 2 times, most recently from b8e0126 to 98c718b Compare February 6, 2026 18:25
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@connectors/mongo/conn.go`:
- Around line 918-942: The loop building idFilter can produce an empty/nil
filter which would match all documents; add a defensive check after building
idFilter: if len(idFilter) == 0 then log a warning (using the connector's
logger, e.g., c.logger.Warnf or similar) and skip creating any mongo model for
that update (continue), so neither mongo.NewReplaceOneModel nor
mongo.NewDeleteOneModel is called with an empty filter; reference
util.KeepLastUpdate, c.settings.FullDocumentKey, idFilter,
mongo.NewReplaceOneModel and mongo.NewDeleteOneModel to locate the change.
🧹 Nitpick comments (2)
connectors/mongo/conn.go (2)

750-753: Type assertions on change.NS could panic on malformed events.

Lines 752–753 perform unchecked type assertions (change.NS["db"].(string), change.NS["coll"].(string)). If a change stream event lacks these fields or they aren't strings, this will panic.

Consider using the two-value form (v, ok := ...) or defining NS as a typed struct with BSON tags (e.g., struct { Db string \bson:"db"`; Coll string `bson:"coll"` }) in MongoUpdate` to handle this at decode time.


722-722: Consider using the moptions.UpdateLookup constant instead of the string literal.

Line 722 uses SetFullDocument("updateLookup") while the Cosmos connector (e.g., line 215, 220) uses moptions.UpdateLookup. Using the typed constant is safer against typos and follows the pattern elsewhere in the codebase.

-	opts := moptions.ChangeStream().SetStartAfter(bson.Raw(r.Msg.GetCursor())).SetFullDocument("updateLookup")
+	opts := moptions.ChangeStream().SetStartAfter(bson.Raw(r.Msg.GetCursor())).SetFullDocument(moptions.UpdateLookup)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
connectors/mongo/conn.go (1)

940-950: ⚠️ Potential issue | 🟡 Minor

BulkWrite will error on an empty models slice.

If all updates have unrecognized UpdateType values (the switch has no default case), models remains nil. col.BulkWrite with an empty slice returns a driver error. Consider adding a guard or a default case.

🛡️ Suggested guard
 	}
+	if len(models) == 0 {
+		return connect.NewResponse(&adiomv1.WriteUpdatesResponse{}), nil
+	}
 	_, err := col.BulkWrite(ctx, models, options.BulkWrite().SetOrdered(false))
🤖 Fix all issues with AI agents
In `@connectors/mongo/conn.go`:
- Around line 752-753: The code uses unsafe type assertions
change.NS["db"].(string) and change.NS["coll"].(string) which can panic; update
the handler around change.NS to perform safe type checks (value, ok :=
change.NS["db"]; db, ok2 := value.(string)) and similarly for "coll", logging an
error and skipping the change or returning early if either key is missing or not
a string. Locate this in the change stream processing where variables db and col
are set and replace the bare assertions with the safe-ok pattern, ensuring you
handle the fallback path (log the unexpected shape and continue) instead of
letting the goroutine crash.
🧹 Nitpick comments (2)
connectors/mongo/conn.go (2)

686-692: Consider using bson.Raw for FullDocument to avoid a round-trip through bson.M.

FullDocument bson.M deserializes the document into a map and then re-serializes it at lines 641 and 658. This round-trip loses BSON key ordering and potentially incurs unnecessary allocations. Using bson.Raw would let you pass the bytes straight through without re-marshaling.

This is a minor optimization and not blocking.


638-649: No nil check on FullDocument for insert events.

The update/replace branch (line 652) defensively handles a nil FullDocument, but the insert branch does not. While MongoDB guarantees fullDocument on inserts, a nil check would be consistent and guard against corrupted events producing an empty document insert.

Comment on lines +752 to +753
db := change.NS["db"].(string)
col := change.NS["coll"].(string)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unsafe type assertions on change.NS will panic on unexpected data.

change.NS["db"].(string) and change.NS["coll"].(string) are bare type assertions. If the key is missing or the value isn't a string, this panics and kills the stream goroutine. While MongoDB guarantees these for standard CRUD events, defensive assertions are cheap insurance.

🛡️ Suggested defensive approach
-		db := change.NS["db"].(string)
-		col := change.NS["coll"].(string)
+		db, _ := change.NS["db"].(string)
+		col, _ := change.NS["coll"].(string)
+		if db == "" || col == "" {
+			slog.Error(fmt.Sprintf("Skipping change event with invalid namespace: %v", change.NS))
+			continue
+		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
db := change.NS["db"].(string)
col := change.NS["coll"].(string)
db, _ := change.NS["db"].(string)
col, _ := change.NS["coll"].(string)
if db == "" || col == "" {
slog.Error(fmt.Sprintf("Skipping change event with invalid namespace: %v", change.NS))
continue
}
🤖 Prompt for AI Agents
In `@connectors/mongo/conn.go` around lines 752 - 753, The code uses unsafe type
assertions change.NS["db"].(string) and change.NS["coll"].(string) which can
panic; update the handler around change.NS to perform safe type checks (value,
ok := change.NS["db"]; db, ok2 := value.(string)) and similarly for "coll",
logging an error and skipping the change or returning early if either key is
missing or not a string. Locate this in the change stream processing where
variables db and col are set and replace the bare assertions with the safe-ok
pattern, ensuring you handle the fallback path (log the unexpected shape and
continue) instead of letting the goroutine crash.

@adiom-mark adiom-mark merged commit 2b06e15 into main Feb 11, 2026
2 checks passed
@adiom-mark adiom-mark deleted the mongofdk branch February 11, 2026 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant