Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Destination MSSQL: standard inserts uses new typing interface #55849

Open
wants to merge 13 commits into
base: edgao/unknown_types_test
Choose a base branch
from

Conversation

edgao
Copy link
Contributor

@edgao edgao commented Mar 19, 2025

  • Pass DestinationRecordRaw everywhere (instead of DestinationRecordAirbyteValue)
  • add airbyteMetaField: Meta.AirbyteMetaFields? to EnrichedAirbyteValue (we thought we didn't need it b/c we separated airbyteMetaFields / declaredFields; turns out it's still useful)
  • allow connectors to grab airbyteMeta specifically from EnrichedDestinationRecordAirbyteValue, instead of hiding it inside airbyteMetaFields
    • ... we should maybe rename airbyteMetaFields to something less confusing
  • fix the DataCleaner regex - now it correctly matches test20241212abcd_1
  • update MSSQL to use the EnrichedValue stuff

Copy link

vercel bot commented Mar 19, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 22, 2025 0:12am

}
} else {
if (field.name == COLUMN_NAME_AB_META) {
// don't populate _airbyte_meta yet - we might run into errors in the other fields
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't new, it matches the previous behavior (COLUMN_NAME_AB_META -> airbyteMetaStatementIndex = statementIndex). Just adding a comment b/c it's nonobvious.

setAdditionalProperty("sync_id", stream.syncId)
}
val enrichedRecord =
plainRecord.asDestinationRecordRaw().asEnrichedDestinationRecordAirbyteValue()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, the two steps feels wrong. If the point is to get everyone on the same interface (a single type that can be marshaled into a bunch of stuff) then oughtn't we be giving everyone the "RecordRaw"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I misread what we were doing in s3datalake. how does 6525221 look?

I do think passing RecordRaw is correct though? B/c it's useful for destinations that just want the plain JSON blob - blob destination non-flattening csv/jsonl, warehouses in raw tables mode, etc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, pass the thing that can be marshaled to lots of stuff, and have a layer between whatever the framework uses and that.

@edgao edgao requested a review from a team as a code owner March 21, 2025 20:05
@edgao edgao changed the base branch from master to edgao/unknown_types_test March 21, 2025 20:32
@@ -302,7 +302,7 @@ abstract class IntegrationTest(
fun updateConfig(config: String): String = configUpdater.update(config)

companion object {
val randomizedNamespaceRegex = Regex("test(\\d{8})[A-Za-z]{4}")
val randomizedNamespaceRegex = Regex("test(\\d{8})[A-Za-z]{4}.*")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, can you explain this change please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this was what I mean by

fix the DataCleaner regex - now it correctly matches test20241212abcd_1

in the pr description. I.e. previously the regex only recognized test20241212abcd, so we weren't deleting any namespaces that ended with a _1 / _2 suffix 🤦

(... in related news, I deleted about 1K databases from AWS glue today)

Copy link
Contributor

@frifriSF59 frifriSF59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good to me. just one small question

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants