This repository was archived by the owner on Aug 31, 2021. It is now read-only.

How do I serialize a dynamoDB column of string set datatype? #69

Open
Peh-QinCheng opened this issue Jun 24, 2020 · 1 comment

Comments

@Peh-QinCheng

Hi guys, thanks for creating this project, it has been of great help to me and I have enjoyed using it so far.

I have a column in my table that is of the string set datatype. It is currently inferred as an Array[String], which gets persisted as a list of strings when written back to DynamoDB. I have tried coercing it to Set[String], but it is still written back to DynamoDB as a list of strings. What datatype should I coerce it to in order to write the column as a string set?

Expected

  "names": {
    "SS": [
      "dummy-name"
    ]
  }

Actual

  "names": {
    "L": [
      {
        "S": "dummy-name"
      }
    ]
  }

@jacobfi
Contributor

jacobfi commented Jul 10, 2020

Hello!
Thank you for using our library.
The problem is that Spark does not have a Set type, so the best option is to read a string set as an array. Once that happens, we lose track of the fact that it used to be a Set, and when writing it becomes a List (due to the array->List conversion).
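
To make the round trip concrete, here is a minimal sketch in plain Python that models the attribute values as dictionaries in DynamoDB JSON form. The helpers `read_as_array` and `write_array` are hypothetical names used for illustration only; they are not the library's API, just a stand-in for the read and write conversions described above.

```python
def read_as_array(attribute_value):
    """Flatten a DynamoDB string set (SS) or list of strings (L) into a plain list.

    This mimics reading into an array column: both shapes end up looking the same.
    """
    if "SS" in attribute_value:
        return sorted(attribute_value["SS"])
    if "L" in attribute_value:
        return [item["S"] for item in attribute_value["L"]]
    raise ValueError("only SS and L of strings are handled in this sketch")


def write_array(values):
    """Write an array the only way it can be written without extra information: as L."""
    return {"L": [{"S": value} for value in values]}


original = {"SS": ["dummy-name"]}
round_tripped = write_array(read_as_array(original))
print(round_tripped)  # {'L': [{'S': 'dummy-name'}]}
```

After `read_as_array`, nothing in the plain list says whether it came from `SS` or `L`, which is why the write side falls back to `L`.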

I can imagine a few solutions:

  1. Maintain some kind of metadata in Spark about the field's origin type in Dynamo, and use this when writing back into Dynamo
  2. Add an option to write arrays as Set instead of List, perhaps on a per-column basis

I would prefer solution 1. We will consider building it if we have time. PRs are welcome :)
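
Both solutions come down to carrying a per-column hint to the write side. A rough sketch of that idea in plain Python follows, again modeling items as DynamoDB JSON dictionaries. The names `to_attribute_value`, `write_item`, and `origin_types` are assumptions made up for this example and do not exist in the library; in solution 1 the hint would be recorded when reading, in solution 2 it would be supplied by the user as an option.

```python
def to_attribute_value(values, origin_type="L"):
    """Convert a list of strings to a DynamoDB attribute value of the requested type."""
    if origin_type == "SS":
        unique = sorted(set(values))  # set members must be unique
        if not unique:
            raise ValueError("DynamoDB sets cannot be empty")
        return {"SS": unique}
    if origin_type == "L":
        return {"L": [{"S": value} for value in values]}
    raise ValueError(f"unsupported origin type: {origin_type}")


def write_item(row, origin_types):
    """Build an item, defaulting to L for any column without a recorded hint."""
    return {
        column: to_attribute_value(values, origin_types.get(column, "L"))
        for column, values in row.items()
    }


# Hypothetical per-column hints: "names" was a string set in DynamoDB.
origin_types = {"names": "SS"}
row = {"names": ["dummy-name"], "tags": ["a", "b"]}
item = write_item(row, origin_types)
print(item["names"])  # {'SS': ['dummy-name']}
```

Columns without a hint keep the current behaviour and are written as `L`, so existing tables would not be affected by either approach.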
