Faster validation with tree-based constraint checking #16
This PR implements a new, more efficient design for constraint validation.
Previously, each cache entry was stored with a list of tracked calls that needed to be validated. When looking for a cache hit, all entries with the same key hash were iterated and their constraints checked. This could lead to $O(n^2)$ runtime when performing cache lookups in cases where many entries shared the same key hash.
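To illustrate where the quadratic behavior comes from, here is a minimal sketch of a flat, per-key-hash cache. It is not comemo's former implementation: `FlatCache`, `Entry`, and `Input` are hypothetical names, and a tracked call is simplified to a string key whose return value hash is looked up in a map.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a tracked input: maps each tracked call
// (simplified to a string) to the hash of the value it currently returns.
type Input = HashMap<&'static str, u64>;

// One cache entry: the tracked calls observed while computing `output`,
// each paired with the hash of the value the call returned at that time.
struct Entry<T> {
    constraint: Vec<(&'static str, u64)>,
    output: T,
}

// Flat design (assumed for illustration): entries are bucketed by key hash only.
struct FlatCache<T> {
    buckets: HashMap<u64, Vec<Entry<T>>>,
}

impl<T> FlatCache<T> {
    fn new() -> Self {
        Self { buckets: HashMap::new() }
    }

    fn insert(&mut self, key_hash: u64, constraint: Vec<(&'static str, u64)>, output: T) {
        self.buckets.entry(key_hash).or_default().push(Entry { constraint, output });
    }

    // Returns the cached output together with the number of tracked calls
    // that had to be checked before a valid entry was found.
    fn get(&self, key_hash: u64, input: &Input) -> Option<(&T, usize)> {
        let mut checked = 0;
        // Every entry in the bucket may have to be visited.
        for entry in self.buckets.get(&key_hash)? {
            let mut valid = true;
            for &(call, ret) in &entry.constraint {
                checked += 1;
                if input.get(call) != Some(&ret) {
                    valid = false;
                    break;
                }
            }
            if valid {
                return Some((&entry.output, checked));
            }
        }
        None
    }
}
```

With n entries in one bucket, a single lookup can touch all n of them, so n lookups that each add one more entry cost on the order of $n^2$ constraint checks in total.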
The new design makes use of the fact that tracked functions are typically called in a deterministic order. Instead of iterating over all cache entries, it walks through a call tree to find a matching cache entry while validating the minimum possible number of tracked calls.
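The idea of walking a call tree can be sketched as follows. This is a simplified illustration under stated assumptions, not the data structure from this PR: `CallTree`, `Node`, and `Input` are hypothetical names, a tracked call is again reduced to a string key, and each node stores the next call to perform with one child per observed return hash.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a tracked input: maps each tracked call
// (simplified to a string) to the hash of the value it currently returns.
type Input = HashMap<&'static str, u64>;

struct Node<T> {
    // The tracked call that every entry passing through this node performed next.
    call: Option<&'static str>,
    // One child per return hash that was observed for `call`.
    children: HashMap<u64, Node<T>>,
    // The cached output, if an entry ends at this node.
    output: Option<T>,
}

impl<T> Node<T> {
    fn new() -> Self {
        Self { call: None, children: HashMap::new(), output: None }
    }
}

struct CallTree<T> {
    root: Node<T>,
}

impl<T> CallTree<T> {
    fn new() -> Self {
        Self { root: Node::new() }
    }

    // Records an entry from its sequence of (tracked call, return hash) pairs.
    fn insert(&mut self, calls: &[(&'static str, u64)], output: T) {
        let mut node = &mut self.root;
        for &(call, ret) in calls {
            // After identical results so far, the next call must be the same one.
            let expected = *node.call.get_or_insert(call);
            assert_eq!(expected, call, "tracked calls must occur in a deterministic order");
            node = node.children.entry(ret).or_insert_with(Node::new);
        }
        node.output = Some(output);
    }

    // Walks the tree, performing each tracked call once against `input`.
    // Returns the output and the number of calls that were validated.
    fn get(&self, input: &Input) -> Option<(&T, usize)> {
        let mut node = &self.root;
        let mut validated = 0;
        while let Some(call) = node.call {
            let ret = *input.get(call)?;
            validated += 1;
            // A result that was never observed means there is no matching entry.
            node = node.children.get(&ret)?;
        }
        node.output.as_ref().map(|out| (out, validated))
    }
}
```

In this sketch a lookup validates at most one call per tree level, regardless of how many entries share the tree, because entries with a common prefix of calls and results share the corresponding nodes.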
The speedup this brings varies a lot depending on how the cache is populated by the consumer, but in Typst this can bring double-digit speedups in incremental compiles and huge speedups in previously pathological cases that ran into quadratic cache validation runtime.
To make sure the new implementation is robust, I added fuzz tests for both the tree data structure and the memoization itself.
(Breaking change) I opted to drop support for mutable methods with return values and mixes of mutable and immutable methods in tracked blocks. Making sure this feature still works correctly would have been significant additional work that would be somewhat in vain as Typst does not use the feature anymore and I'm not aware of anyone else using it.
(Breaking change) The new design requires memoized functions to adhere to a new definition of determinism which I call reorderably deterministic. It is explained in the docs of the `#[memoize]` attribute. In practice, it's typically fulfilled by deterministic functions. If it's not fulfilled, comemo will panic in debug mode. Meanwhile, in release mode, memoized functions will still yield correct results, but caching may prove ineffective.
(Breaking change) The `Validate` trait was removed. The new design for manual constraint handling and validation is centered around the newly public `Constraint` type. Note that manual constraint handling is not relevant for average usage, but it's useful in rare cases, for example in Typst's introspection system to detect layout convergence.