Repository for research project "Text tree edit distance: a language model-based metric for text hierarchy comparison".

intsystems/text-tree-distance

Text tree edit distance: a language model-based metric for text hierarchy comparison.

This work proposes and implements a method for comparing text trees, that is, hierarchies whose nodes carry text labels. The method builds on the Zhang-Shasha tree edit distance algorithm and sets the cost of updating a node's label to the semantic distance between the old and new texts. BERT-like models estimate the semantic similarity of the labels. We also propose several heuristics to improve the performance of the algorithm.
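The idea can be illustrated with a minimal, self-contained sketch of Zhang-Shasha tree edit distance with a pluggable label-update cost. It is not the repository's implementation: the names `Node`, `tree_edit_distance`, and `jaccard_distance` are hypothetical, insertion and deletion are assumed to cost 1, and a token-level Jaccard distance stands in for the BERT-based semantic distance so that the example runs without any model.

```python
class Node:
    """A tree node with a text label (hypothetical helper for this sketch)."""
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []


def _annotate(root):
    """Return post-order labels, leftmost-leaf indices, and keyroots."""
    labels, lmd = [], []

    def visit(node):
        first = None
        for child in node.children:
            idx = visit(child)
            if first is None:
                first = lmd[idx]  # leftmost leaf of the first child
        labels.append(node.label)
        i = len(labels) - 1
        lmd.append(i if first is None else first)
        return i

    visit(root)
    last = {}
    for i, leaf in enumerate(lmd):
        last[leaf] = i  # highest node sharing this leftmost leaf
    return labels, lmd, sorted(last.values())


def jaccard_distance(a, b):
    """Placeholder for a semantic distance in [0, 1] (assumption, not BERT)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return 0.0 if not union else 1.0 - len(ta & tb) / len(union)


def tree_edit_distance(a, b, update_cost=jaccard_distance,
                       insert_cost=1.0, delete_cost=1.0):
    """Zhang-Shasha distance between ordered labeled trees a and b."""
    la, lmd_a, kr_a = _annotate(a)
    lb, lmd_b, kr_b = _annotate(b)
    td = [[0.0] * len(lb) for _ in la]  # subtree-to-subtree distances
    for i in kr_a:
        for j in kr_b:
            li, lj = lmd_a[i], lmd_b[j]
            rows, cols = i - li + 2, j - lj + 2
            fd = [[0.0] * cols for _ in range(rows)]  # forest distances
            for x in range(1, rows):
                fd[x][0] = fd[x - 1][0] + delete_cost
            for y in range(1, cols):
                fd[0][y] = fd[0][y - 1] + insert_cost
            for x in range(1, rows):
                for y in range(1, cols):
                    ni, nj = li + x - 1, lj + y - 1
                    delete = fd[x - 1][y] + delete_cost
                    insert = fd[x][y - 1] + insert_cost
                    if lmd_a[ni] == li and lmd_b[nj] == lj:
                        # Both forests are whole subtrees: relabel the roots.
                        update = fd[x - 1][y - 1] + update_cost(la[ni], lb[nj])
                        fd[x][y] = min(delete, insert, update)
                        td[ni][nj] = fd[x][y]
                    else:
                        # Reuse the subtree distance computed earlier.
                        p, q = lmd_a[ni] - li, lmd_b[nj] - lj
                        fd[x][y] = min(delete, insert, fd[p][q] + td[ni][nj])
    return td[-1][-1]


t1 = Node("animals", [Node("cats"), Node("dogs")])
t2 = Node("animals", [Node("cats"), Node("birds")])
print(tree_edit_distance(t1, t2))
```

To move closer to the approach described above, `update_cost` could be replaced by any function that maps two strings to a distance, for example one minus the cosine similarity of their embeddings from a BERT-like encoder. Keeping the update cost at or below the combined cost of a deletion and an insertion ensures that relabeling is preferred for semantically close labels.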
