handling for arbitrary derive cycles

JSON Schema can define types that have cycles. This is simple to handle in languages like JavaScript or Java, but more complex in Rust since those cycles must be explicitly broken with a `Box<T>`. Note that use of a `Vec` or `HashMap` also breaks the containment cycle but has implications for derive computation which we will discuss later. Note too that where one "breaks" a cycle may have multiple solutions, some that require more breaks than others. Note also that it may not be feasible to reconstruct the types e.g. if the JSON Schema were derived from Rust types because the information about `Box` indirections is explicitly discarded (and probably reasonably so, but one could imagine including hints; more on that later as well).

Currently we break trivial `A -> A` cycles such as:

```rust
struct A {
    a: Option<Box<A>>, // this needs to be boxed
}
```

We can do this without a bunch of graph traversal and it solved a proximate problem.

The more general case requires us to decompose the type graph into strongly connected subgraphs that form a DAG (e.g. with algorithms proposed by Tarjan, Dijkstra or Kosaraju). In this case, the edges are defined by structure or newtype containment either directly or via an `Option` type. Within each strongly connected subgraph we then would determine where to "break" the cycles by inserting `Box`es. The general case of this requires exponential time to compute. While the number of nodes (types) in a cycle is likely to be small, we still may elect for a heuristic, the simplest of which would be to cut all edges. There's very little harm in cutting more than is absolutely required--the serialization isn't affected for example--the only consequence is to the legibility and ergonomics of the generated types.

For JSON Schema generated from rust types, it could be helpful to annotate boxed types with an extension. This could act as a heuristic when slicing a strongly connected component i.e. we use these extensions to see if they properly break the containment cycle and do something else if they don't.

---

The derive macros we apply to types have a similar problem. Consider, for example, the following type:

```rust
struct A {
    value: u32,
}
```

For this struct we could `#[derive(Eq, PartialEq)]`, but if we change the `u32` to an `f32` we could not! A `Vec<T>` is `Eq` only if `T: Eq` and a `HashSet<T>` isn't `Ord` regardless of the traits implemented by `T`.

From the list of desirable traits to implement such as `Hash`, `Ord`, and `Eq`, the ones we can apply to a type depend on the types to which it refers. And those references may form a cycle. As above, we must compute the strongly connected components. Above the edges were containment; here the edges are all references (i.e. a `Vec` is an edge here but not above`). Within each strongly connected component we must take the intersection of all supportable traits.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

handling for arbitrary derive cycles #46

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

handling for arbitrary derive cycles #46

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions