Faster validation with tree-based constraint checking #16

Merged
merged 3 commits into from
Jul 31, 2025

@laurmaedje laurmaedje commented Jul 30, 2025

This PR implements a new, more efficient design for constraint validation.

Previously, each cache entry was stored with a list of tracked calls that needed to be validated. When looking for a cache hit, all entries with the same key hash were iterated and their constraints checked. This could lead to $O(n^2)$ runtime when performing cache lookups in cases where many entries shared the same key hash.
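
To illustrate where the quadratic behavior comes from, here is a minimal sketch of a flat cache in the spirit of the old design. It is not comemo's actual code: `FlatCache`, `Entry`, `Call`, and the `env` callback (which stands in for performing a tracked call and hashing its result) are all hypothetical names. Every lookup may validate the constraints of every entry in the bucket.

```rust
use std::collections::HashMap;

/// One recorded tracked call: a (hypothetical) method id and the hash of
/// the value it returned when the entry was created.
struct Call {
    method: u32,
    ret: u64,
}

/// A cache entry: the calls that must still hold, plus the cached output.
struct Entry<T> {
    constraint: Vec<Call>,
    output: T,
}

/// Flat design (assumed shape): all entries with the same key hash sit in
/// one list.
struct FlatCache<T> {
    buckets: HashMap<u64, Vec<Entry<T>>>,
}

impl<T: Clone> FlatCache<T> {
    fn new() -> Self {
        Self { buckets: HashMap::new() }
    }

    fn insert(&mut self, key: u64, constraint: Vec<Call>, output: T) {
        self.buckets.entry(key).or_default().push(Entry { constraint, output });
    }

    /// Scans every entry under `key` and validates its calls against `env`.
    /// Returns the hit (if any) and how many tracked calls were validated.
    fn lookup(&self, key: u64, env: &dyn Fn(u32) -> u64) -> (Option<T>, usize) {
        let mut checks = 0;
        for entry in self.buckets.get(&key).into_iter().flatten() {
            let mut valid = true;
            for call in &entry.constraint {
                checks += 1;
                if env(call.method) != call.ret {
                    valid = false;
                    break;
                }
            }
            if valid {
                return (Some(entry.output.clone()), checks);
            }
        }
        (None, checks)
    }
}
```

With `n` entries under one key, a single lookup can validate on the order of `n` calls, so `n` lookups that each miss and then insert a new entry add up to roughly `n^2` validations.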

The new design makes use of the fact that tracked functions are typically called in a deterministic order. Instead of iterating over all cache entries, it walks through a call tree to find a matching cache entry while validating the minimum possible number of tracked calls.
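
The tree walk can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the data structure used in this PR: `CallTree`, `Node`, and the `env` callback are hypothetical, and each tracked call is reduced to a method id plus a hash of its return value. Each node records which call to perform next, and its children are keyed by the hash of that call's result, so a lookup performs each call on the path once and follows a single branch.

```rust
use std::collections::HashMap;

/// A node in the (hypothetical) call tree.
struct Node<T> {
    /// Cached output if a recorded call sequence ends at this node.
    output: Option<T>,
    /// The tracked call that every longer sequence performs next.
    method: Option<u32>,
    /// Subtrees keyed by the hash of that call's return value.
    children: HashMap<u64, Node<T>>,
}

impl<T> Node<T> {
    fn new() -> Self {
        Self { output: None, method: None, children: HashMap::new() }
    }
}

struct CallTree<T> {
    root: Node<T>,
}

impl<T: Clone> CallTree<T> {
    fn new() -> Self {
        Self { root: Node::new() }
    }

    /// Records one execution as a sequence of (method, return hash) pairs.
    /// Returns false if the sequence disagrees with what the tree already
    /// holds, i.e. the same prefix of results led to a different next step.
    fn insert(&mut self, calls: &[(u32, u64)], output: T) -> bool {
        let mut node = &mut self.root;
        for &(method, ret) in calls {
            if node.output.is_some() || *node.method.get_or_insert(method) != method {
                return false;
            }
            node = node.children.entry(ret).or_insert_with(Node::new);
        }
        if node.method.is_some() {
            return false;
        }
        node.output = Some(output);
        true
    }

    /// Walks from the root, performing one tracked call per level via `env`.
    /// Returns the hit (if any) and how many tracked calls were validated.
    fn lookup(&self, env: &dyn Fn(u32) -> u64) -> (Option<T>, usize) {
        let mut node = &self.root;
        let mut checks = 0;
        loop {
            if let Some(output) = &node.output {
                return (Some(output.clone()), checks);
            }
            let method = match node.method {
                Some(method) => method,
                None => return (None, checks),
            };
            checks += 1;
            match node.children.get(&env(method)) {
                Some(child) => node = child,
                None => return (None, checks),
            }
        }
    }
}
```

In this sketch, the cost of a lookup is bounded by the length of one recorded call sequence rather than by the number of cached entries. The `false` return of `insert` loosely corresponds to the case where calls do not occur in a consistent order; how comemo itself handles that case is described in the PR text, not in this sketch.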

The speedup this brings varies a lot depending on how the cache is populated by the consumer, but in Typst this can bring double-digit speedups in incremental compiles and huge speedups in previously pathological cases that ran into quadratic cache validation runtime.

To make sure the new implementation is robust, I added fuzz tests for both the tree data structure and the memoization itself.

(Breaking change) I opted to drop support for mutable methods with return values and for mixes of mutable and immutable methods in tracked blocks. Making sure this feature still works correctly would have required significant additional work with little benefit, since Typst no longer uses the feature and I'm not aware of anyone else who does.

(Breaking change) The new design requires memoized functions to adhere to a new definition of determinism which I call reorderably deterministic. It is explained in the docs of the #[memoize] attribute. In practice, it's typically fulfilled by deterministic functions. If it's not fulfilled, comemo will panic in debug mode. Meanwhile, in release mode, memoized functions will still yield correct results, but caching may prove ineffective.

(Breaking change) The Validate trait was removed. The new design for manual constraint handling and validation is centered around the newly public Constraint type. Manual constraint handling is not relevant for typical usage, but it is useful in rare cases, for example in Typst's introspection system to detect layout convergence.

@laurmaedje laurmaedje merged commit ec8f9b3 into main Jul 31, 2025
2 checks passed
@laurmaedje laurmaedje deleted the faster-validation branch July 31, 2025 08:16
@laurmaedje laurmaedje restored the faster-validation branch July 31, 2025 08:16
@laurmaedje laurmaedje deleted the faster-validation branch July 31, 2025 21:37