3 changes: 2 additions & 1 deletion build.sbt
@@ -572,12 +572,13 @@ lazy val `kyo-schema` =
.withoutSuffixFor(JVMPlatform)
.crossType(CrossType.Full)
.dependsOn(`kyo-data` % "test->test;compile->compile")
.dependsOn(`kyo-core` % "test->compile")
.in(file("kyo-schema"))
.withKyoTest
.settings(`kyo-settings`)
.jvmSettings(mimaCheck(false))
.nativeSettings(`native-settings`)
.jsSettings(`js-settings`)
.jsSettings(`js-settings`, Test / scalaJSLinkerConfig ~= (_.withModuleKind(ModuleKind.CommonJSModule)))
.wasmSettings(`wasm-settings`)

lazy val `kyo-core` =
42 changes: 38 additions & 4 deletions kyo-schema/README.md
@@ -31,13 +31,13 @@ Schema[User].focus(_.address.city).update(alice)(_.toUpperCase)

Everything flows from `Schema[A]`, the central type that captures a type's structure at compile time. It's the single source of truth that powers serialization, validation, navigation, and conversion.

The serialization format is chosen at the call site, not baked into the type. `Json.encode(value)` and `Protobuf.encode(value)` summon the `Schema[A]` from implicit scope; a schema you reshaped or enriched only takes effect when you encode through that instance with `s.encode[Json](value)`.
The serialization format is chosen at the call site, not baked into the type. `Json.encode(value)`, `Ion.encode(value)`, and `Protobuf.encode(value)` summon the `Schema[A]` from implicit scope; a schema you reshaped or enriched only takes effect when you encode through that instance with `s.encode[Json](value)`.

These are the top-level entry points:

| Entry point | Purpose |
|-------------|---------|
| `Json` / `Yaml` / `Protobuf` | Serialize to JSON strings, YAML documents, or Protocol Buffers bytes |
| `Json` / `Ion` / `Yaml` / `Protobuf` | Serialize to JSON strings, Ion text, YAML documents, or Protocol Buffers bytes |
| `Focus` | Type-safe lens for reading, writing, and updating fields at any depth |
| `Compare` | Read-only field-by-field comparison of two values |
| `Modify` | Batched field mutations applied as a single unit |
@@ -171,6 +171,40 @@ Json.decode[User](untrustedInput, maxDepth = 64, maxCollectionSize = 10000)

Exceeding either limit returns `Result.Failure(LimitExceededException)`. `LimitExceededException` is a subtype of `DecodeException`, so the same pattern-match handles malformed input and limit breaches.

### Ion

`Ion.encode` converts a value to Amazon Ion text. Case classes become structs, collections become lists, `Map[String, V]` becomes a struct, and `Span[Byte]` becomes an Ion blob:

```scala
val ion: String = Ion.encode(alice)
// {id:1,name:"Alice",email:"alice@example.com",password:"secret",address:{city:"Portland",zip:"97201"}}

Ion.decode[User](ion)
// Result.Success(alice)

Ion.encode(Span.from("hello".getBytes("UTF-8")))
// {{aGVsbG8=}}
```
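
The blob in the last line is the Base64 encoding of the raw bytes wrapped in `{{ }}`, which is how Ion text represents blobs. As a minimal sketch using only the Java standard library (not the encoder's actual implementation), the same payload can be reproduced by hand:

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

// Base64-encode the UTF-8 bytes of "hello", as an Ion text blob carries them.
val payload: String =
    Base64.getEncoder.encodeToString("hello".getBytes(StandardCharsets.UTF_8))

// Wrap the payload in Ion's blob delimiters.
val blob: String = s"{{$payload}}"

// Decoding the payload recovers the original text.
val roundTrip: String =
    new String(Base64.getDecoder.decode(payload), StandardCharsets.UTF_8)
```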

The reader accepts the Ion text features most useful for schema-shaped data: unquoted or quoted field names, comments, annotations, typed nulls, blobs, long strings, and symbol values decoded as strings:

```scala
Ion.decode[User](
"""user::{
| id: 1,
| name: "Alice",
| email: "alice@example.com",
| password: "secret",
| address: {city: Portland, zip: "97201"},
|}""".stripMargin
)
// Result.Success(alice)
```

Ion type annotations are accepted as input syntax and ignored as metadata during schema decoding. They are not preserved by `Ion.decode` or emitted by `Ion.encode`.

`Ion.decode` and `Ion.decodeBytes` accept the same `maxDepth` and `maxCollectionSize` safety limits as `Json.decode`.
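
A sketch of handling untrusted Ion input with explicit limits follows. It assumes `User` from the earlier examples, a hypothetical `untrustedIon` string, and that the Ion reader reports limit breaches with `LimitExceededException` the same way `Json.decode` does:

```scala
// `untrustedIon` is a hypothetical String holding Ion text from an external source.
val outcome: String =
    Ion.decode[User](untrustedIon, maxDepth = 64, maxCollectionSize = 10000) match
        case Result.Success(user) =>
            s"decoded ${user.name}"
        case Result.Failure(e: LimitExceededException) =>
            // Assumed to behave like Json.decode: nesting too deep or a collection too large.
            "rejected: input exceeds decode limits"
        case Result.Failure(e) =>
            // Any other DecodeException: malformed Ion or a schema mismatch.
            s"invalid Ion: ${e.getMessage}"
        case other =>
            s"unexpected result: $other"
```

Because `LimitExceededException` is a subtype of `DecodeException`, the more specific case has to come before the general `Result.Failure(e)` case.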

### YAML

`Yaml.decode` parses one YAML document into a typed value and returns `Result[DecodeException, A]`. For document streams, use `Yaml.decodeAll`, or pass `Yaml.DocumentIndex(n)` to target one zero-based document without decoding the whole stream. Use `Yaml.ReaderConfig` when you need document selection, stream-fragment merging, decode limits, or YAML 1.1 scalar resolution for legacy systems.
@@ -1011,7 +1045,7 @@ The `Structure.Type` tree ships with a small set of operations for runtime inspe

## Custom Formats

`Json` and `Protobuf` are the built-in formats, but the serialization pipeline itself is format-agnostic. A schema describes a value as a sequence of typed events (`objectStart`, `field`, `int`, `arrayStart`, ...) and a matching sequence on the way back. A format is the code that turns those events into bytes and back.
`Json`, `Ion`, `Yaml`, and `Protobuf` are the built-in formats, but the serialization pipeline itself is format-agnostic. A schema describes a value as a sequence of typed events (`objectStart`, `field`, `int`, `arrayStart`, ...) and a matching sequence on the way back. A format is the code that turns those events into bytes and back.

### The Codec trait

@@ -1077,7 +1111,7 @@ Schema[User].encode(alice)(using Lines) // Span[Byte] in the Lines format
Schema[User].decode(bytes)(using Lines) // Result[DecodeException, User]
```

For a complete example, read `JsonWriter` and `JsonReader` (or their Protobuf counterparts) in the same package: they implement the full contract.
For a complete example, read `JsonWriter` and `JsonReader`, `IonWriter` and `IonReader`, or their Protobuf counterparts in the same package: they implement the full contract.
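
The built-in codecs can be passed through the same generic entry points. The sketch below mirrors the `Lines` calls and assumes `User` and `alice` from the earlier examples, and that the `given Ion` from the `Ion` companion is picked up by `summon[Ion]`:

```scala
// Select Ion explicitly as the contextual codec, mirroring the Lines calls.
val ionBytes = Schema[User].encode(alice)(using summon[Ion]) // Ion text as UTF-8 bytes
Schema[User].decode(ionBytes)(using summon[Ion])             // Result[DecodeException, User]
```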

When writing a custom schema for an opaque or wrapper type, you can also construct a `Schema` instance directly using the public factories `Schema.init` (for plain schemas) and `Schema.initFocused` (when you need to track the focused type member). Both take inlined `writeFn` and `readFn` lambdas, plus an optional `getterFn`/`setterFn` pair for lens support. Abstract members must be supplied (including `fieldParse`, `matchField`, `lastFieldName`, and `captureValue`); optional overrides like `fieldBytes`, `initFields`, `clearFields`, `droppedFieldsMask`, and `release` are where real codecs recover allocation-sensitive performance.

5 changes: 3 additions & 2 deletions kyo-schema/shared/src/main/scala/kyo/Codec.scala
@@ -9,7 +9,8 @@ import java.nio.charset.StandardCharsets
*
* - Pluggable: implement `newWriter` and `newReader` to support any binary or text format
* - Used by [[kyo.Schema]] encode/decode methods to select the target format at the call site
* - Built-in implementations: [[Json]] (JSON) and [[Protobuf]] (Protocol Buffers wire format)
* - Built-in implementations: [[Json]] (JSON), [[Ion]] (Amazon Ion text), [[Yaml]] (YAML), and [[Protobuf]] (Protocol Buffers wire
* format)
*
* @see
* [[Codec.Writer]] for the serialization side
@@ -101,7 +102,7 @@ object Codec:
* wrappers (e.g. the internal `SchemaSerializer.TransformAwareReader`) can signal that fields dropped by the schema should not
* trigger [[MissingFieldException]].
*
* Default returns `0L` no fields pre-satisfied. Overrides must return a mask with bit `i` set iff field index `i` is
* Default returns `0L`: no fields pre-satisfied. Overrides must return a mask with bit `i` set iff field index `i` is
* pre-satisfied by this reader. Field index `i` corresponds to the case class constructor position (0-based). Only the low-order
* `n` bits are relevant; bits beyond that are ignored by the caller.
*/
109 changes: 109 additions & 0 deletions kyo-schema/shared/src/main/scala/kyo/Ion.scala
@@ -0,0 +1,109 @@
package kyo

/** Amazon Ion text codec instance for codec-polymorphic schema APIs.
*
* The companion object provides high-level helpers for Ion text strings and UTF-8 bytes. An `Ion` instance is useful when code works
* through the generic [[Codec]] or [[Schema.encode]] / [[Schema.decode]] APIs and needs to select Ion as the contextual codec value.
*
* Decoding accepts schema-shaped Ion text and treats Ion annotations as metadata. Encoding emits plain Ion text for the schema value,
* without preserving or synthesizing annotations.
*/
final class Ion extends Codec:
/** Creates an Ion text writer. */
def newWriter(): Codec.Writer = kyo.internal.IonWriter()

/** Creates an Ion text reader over UTF-8 input bytes. */
def newReader(input: Span[Byte])(using Frame): Codec.Reader =
kyo.internal.IonReader(input)
end Ion

/** Primary entry point for Amazon Ion text serialization.
*
* Encoding uses Ion text. Case classes become structs, collections become lists, maps become structs, byte spans become blobs, and
* options or maybes become Ion nulls when absent. Decoding accepts Ion text features that are useful for schema-shaped values, including
* unquoted field names, comments, type annotations, typed nulls, blobs, symbols as strings, and long strings. Type annotations are
* treated as Ion metadata and are not preserved in the decoded Scala value.
*
* @see
* [[kyo.Schema]] for the type-driven serialization model
* @see
* [[kyo.Json]] for JSON serialization
*/
object Ion:
/** Default maximum nesting depth for structs and lists in Ion decoding. */
val DefaultMaxDepth: Int = Json.DefaultMaxDepth

/** Default maximum number of entries in any single collection or struct in Ion decoding. */
val DefaultMaxCollectionSize: Int = Json.DefaultMaxCollectionSize

given Ion = Ion()

/** Encodes a value of type A to an Ion text string.
*
* @param value
* the value to encode
* @return
* the Ion text representation
*/
inline def encode[A](value: A)(using schema: Schema[A], frame: Frame): String =
val w = summon[Ion].newWriter()
schema.writeTo(value, w)
w.resultString
end encode

/** Encodes a value of type A to raw UTF-8 Ion text bytes.
*
* @param value
* the value to encode
* @return
* the Ion text bytes
*/
inline def encodeBytes[A](value: A)(using schema: Schema[A], frame: Frame): Span[Byte] =
val w = summon[Ion].newWriter()
schema.writeTo(value, w)
w.result()
end encodeBytes

/** Decodes an Ion text string into a value of type A.
*
* @param input
* the Ion text string to decode
* @param maxDepth
* maximum nesting depth for structs and lists
* @param maxCollectionSize
* maximum number of entries in a single collection or struct
* @return
* the decoded value, or a DecodeException if the input is malformed or does not match the schema
*/
def decode[A](
input: String,
maxDepth: Int = DefaultMaxDepth,
maxCollectionSize: Int = DefaultMaxCollectionSize
)(using ion: Ion, schema: Schema[A], frame: Frame): Result[DecodeException, A] =
val reader = ion.newReader(Span.from(input.getBytes(java.nio.charset.StandardCharsets.UTF_8)))
reader.resetLimits(maxDepth, maxCollectionSize)
Result.catching[DecodeException](schema.readFrom(reader))
end decode

/** Decodes raw UTF-8 Ion text bytes into a value of type A.
*
* @param input
* the raw UTF-8 Ion text bytes
* @param maxDepth
* maximum nesting depth for structs and lists
* @param maxCollectionSize
* maximum number of entries in a single collection or struct
* @return
* the decoded value, or a DecodeException if the input is malformed or does not match the schema
*/
def decodeBytes[A](
input: Span[Byte],
maxDepth: Int = DefaultMaxDepth,
maxCollectionSize: Int = DefaultMaxCollectionSize
)(using ion: Ion, schema: Schema[A], frame: Frame): Result[DecodeException, A] =
val reader = ion.newReader(input)
reader.resetLimits(maxDepth, maxCollectionSize)
Result.catching[DecodeException](schema.readFrom(reader))
end decodeBytes

end Ion