This work proposes and implements a method for comparing text trees. The method builds on the Zhang-Shasha tree edit distance algorithm and uses the semantic distance between two texts as the cost of updating a node's label. BERT-like models estimate the semantic similarity of the text labels. We also propose several heuristics to improve the performance of the algorithm.
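To make the idea concrete, here is a minimal sketch of an ordered tree edit distance in which the cost of relabeling a node depends on how different the two texts are. It is not the code from this repository. It rests on several assumptions: `label_distance` is a hypothetical placeholder (token-level Jaccard distance) standing in for a BERT-like semantic distance, insertions and deletions cost 1, and the distance is computed with the plain memoized forest recurrence that Zhang-Shasha evaluates more efficiently, not with Zhang-Shasha's keyroot-based dynamic program itself. The names `node`, `forest_distance`, and `tree_distance` are illustrative.

```python
from functools import lru_cache

# Assumed unit costs for structural edits (not taken from the repository).
INSERT_COST = 1.0
DELETE_COST = 1.0


def node(label, *children):
    """Build a tree as a hashable nested tuple: (label, (child, child, ...))."""
    return (label, tuple(children))


def label_distance(a, b):
    """Placeholder for a semantic distance in [0, 1].

    Uses 1 - Jaccard overlap of lowercase tokens. A real setup would
    instead compare embeddings from a BERT-like model.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def size(forest):
    """Number of nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Edit distance between two ordered forests (rightmost-root recurrence)."""
    if not f1 and not f2:
        return 0.0
    if not f1:
        return size(f2) * INSERT_COST
    if not f2:
        return size(f1) * DELETE_COST

    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    rest1, rest2 = f1[:-1], f2[:-1]

    # Delete the rightmost root of f1; its children take its place.
    delete = forest_distance(rest1 + kids1, f2) + DELETE_COST
    # Insert the rightmost root of f2; its children take its place.
    insert = forest_distance(f1, rest2 + kids2) + INSERT_COST
    # Map the two rightmost roots onto each other and pay the label cost.
    update = (
        forest_distance(kids1, kids2)
        + forest_distance(rest1, rest2)
        + label_distance(label1, label2)
    )
    return min(delete, insert, update)


def tree_distance(t1, t2):
    """Edit distance between two trees built with `node`."""
    return forest_distance((t1,), (t2,))
```

For example, `tree_distance(node("intro", node("methods"), node("results")), node("intro", node("methods")))` costs one deletion, while relabeling `"deep learning"` to `"deep networks"` costs less than a delete plus an insert because the labels share a token. Swapping `label_distance` for an embedding-based distance keeps the rest of the sketch unchanged.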
intsystems/text-tree-distance
About
Repository for research project "Text tree edit distance: a language model-based metric for text hierarchy comparison".